How to extract a subtree from a large Newick phylogenetic tree.md

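Introduction

  • I wanted to extract a subtree from a large phylogenetic tree in Newick format, but none of the several programs I tried did the job to my satisfaction.
  • Then I found ete3.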

Installation

  • Install Anaconda or Miniconda
# Install Miniconda (macOS x86_64 installer shown; skip this step if you already have Anaconda/Miniconda)
curl -L 'http://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh' -o Miniconda3-latest-MacOSX-x86_64.sh
bash Miniconda3-latest-MacOSX-x86_64.sh -b -p ~/anaconda_ete/
export PATH=~/anaconda_ete/bin:$PATH;
  • Install ete3
conda install -c etetoolkit ete3 ete_toolchain
  • Check that the installation works
ete3 build check
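
As a complementary check, it can help to confirm that the Python interpreter currently on PATH can actually see the ete3 package. The sketch below uses only the standard library and assumes Python 3.8 or newer; the helper name package_status is made up for illustration and is not part of ete3.

```python
import importlib.util
from importlib import metadata


def package_status(name):
    """Return (importable, version) for a package in the current interpreter."""
    # find_spec returns None when the top-level module cannot be located.
    importable = importlib.util.find_spec(name) is not None
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        # No installed distribution with this name was found.
        version = None
    return importable, version


importable, version = package_status("ete3")
print(f"ete3 importable: {importable}, version: {version}")
```

If the first value is False, the likely cause is that a different Python than the one in ~/anaconda_ete/bin is being picked up.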

Extracting the subtree

  • The subtree is extracted from the large tree below. The tree file is named "plants.tre", and its Newick content is:
((Amborella:0.22394516,((((Aquilegia:0.23014819,(((((((((Arabidopsis:0.06787152,Brassica:0.08531193):0.23571511,Carica:0.13440803):0.03290433,(Gossypium:0.08495118,Theobroma:0.04200667):0.08491018):0.01036229,Citrus:0.15002114):0.00938732,((Manihot:0.0794082,Ricinus:0.09425664):0.03219833,Populus:0.11770477):0.04468638):0.0181717,((Betula:0.06238255,Quercus:0.07252902):0.04719275,((Cannabis:0.16605319,(Fragaria:0.11128062,Malus:0.09411765):0.05408855):0.02282758,(Cucumis:0.2236804,(Glycine:0.07861757,Medicago:0.10795512):0.11913914):0.01534745):0.00582486):0.01247254):0.00821191,Eucalyptus:0.22923685):0.01579662,Vitis:0.12611049):0.01086697,((Camellia:0.11965395,(((Coffea:0.16294648,(Ipomoea:0.14516135,Solanum:0.16349517):0.03370946):0.01247767,((Mimulus:0.11124423,Striga:0.15488643):0.0174298,Sesamum:0.07892533):0.1018001):0.03282718,((Helianthus:0.10541166,Lactuca:0.09207387):0.13797544,Panax:0.12097508):0.01978585):0.01557621):0.01460421,Silene:0.33373615):0.01530354):0.05239645):0.04532908,(Aristolochia:0.2454895,(Liriodendron:0.11406019,Persea:0.13187068):0.02394243):0.01648871):0.01530276,((Dioscorea:0.18220109,Phalaenopsis:0.24290788):0.02133135,((Musa:0.15964715,(Oryza:0.08591361,Sorghum:0.09485301):0.23187058):0.01709166,Phoenix:0.12190251):0.02181441):0.08189202):0.0536551,Nuphar:0.22604966):0.03246267):0.14424643,(((Picea:0.02820848,Pinus:0.03621121):0.13031718,Zamia:0.13395319):0.10258004,Selaginella:0.81374637):0);
  • Taxa to keep in the subtree:
Musa
Oryza
Sorghum
Phoenix
Nuphar
Picea
Pinus
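
Before pruning, it can be worth confirming that every name in the list really occurs as a leaf in the tree, since a misspelled name cannot be matched. The following is a minimal sketch in plain Python, not an ete3 feature: the function names leaf_names and missing_taxa are hypothetical, the regular expression assumes simple unquoted labels, and the tree string is only one clade of the large tree, to keep the example short.

```python
import re

# Excerpt (one clade) of the large tree, used here only to keep the example short.
newick = (
    "(((Picea:0.02820848,Pinus:0.03621121):0.13031718,"
    "Zamia:0.13395319):0.10258004,Selaginella:0.81374637);"
)


def leaf_names(newick_text):
    """Return leaf labels in order of appearance.

    A leaf label directly follows "(" or ","; labels that follow ")" belong
    to internal nodes and are skipped. Assumes simple, unquoted labels.
    """
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick_text)


def missing_taxa(newick_text, wanted):
    """Return the wanted names that do not occur as leaves, in input order."""
    present = set(leaf_names(newick_text))
    return [name for name in wanted if name not in present]


print(leaf_names(newick))
print(missing_taxa(newick, ["Picea", "Pinus", "Musa"]))
```

Here Musa is reported as missing only because the excerpt does not contain it. Run against the full tree, the list of missing names should be empty for the seven taxa above.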
  • The commands are as follows (running them in IPython is more convenient):
from ete3 import Tree
# If the tree is saved in a file, you can load it with:
# t = Tree("plants.tre")
t=Tree("((Amborella:0.22394516,((((Aquilegia:0.23014819,(((((((((Arabidopsis:0.06787152,Brassica:0.08531193):0.23571511,Carica:0.13440803):0.03290433,(Gossypium:0.08495118,Theobroma:0.04200667):0.08491018):0.01036229,Citrus:0.15002114):0.00938732,((Manihot:0.0794082,Ricinus:0.09425664):0.03219833,Populus:0.11770477):0.04468638):0.0181717,((Betula:0.06238255,Quercus:0.07252902):0.04719275,((Cannabis:0.16605319,(Fragaria:0.11128062,Malus:0.09411765):0.05408855):0.02282758,(Cucumis:0.2236804,(Glycine:0.07861757,Medicago:0.10795512):0.11913914):0.01534745):0.00582486):0.01247254):0.00821191,Eucalyptus:0.22923685):0.01579662,Vitis:0.12611049):0.01086697,((Camellia:0.11965395,(((Coffea:0.16294648,(Ipomoea:0.14516135,Solanum:0.16349517):0.03370946):0.01247767,((Mimulus:0.11124423,Striga:0.15488643):0.0174298,Sesamum:0.07892533):0.1018001):0.03282718,((Helianthus:0.10541166,Lactuca:0.09207387):0.13797544,Panax:0.12097508):0.01978585):0.01557621):0.01460421,Silene:0.33373615):0.01530354):0.05239645):0.04532908,(Aristolochia:0.2454895,(Liriodendron:0.11406019,Persea:0.13187068):0.02394243):0.01648871):0.01530276,((Dioscorea:0.18220109,Phalaenopsis:0.24290788):0.02133135,((Musa:0.15964715,(Oryza:0.08591361,Sorghum:0.09485301):0.23187058):0.01709166,Phoenix:0.12190251):0.02181441):0.08189202):0.0536551,Nuphar:0.22604966):0.03246267):0.14424643,(((Picea:0.02820848,Pinus:0.03621121):0.13031718,Zamia:0.13395319):0.10258004,Selaginella:0.81374637):0);")
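subtree_taxa = ["Musa", "Oryza", "Sorghum", "Phoenix", "Nuphar", "Picea", "Pinus"]
# Prune in place; preserve_branch_length=True keeps the original distances among the remaining nodes
t.prune(subtree_taxa, preserve_branch_length=True)
# Print the pruned tree as a Newick string (print also works outside IPython)
print(t.write())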
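
For larger jobs, the same steps can be wrapped so that the tree and the taxon list are read from files and the result is saved to disk. This is a sketch under stated assumptions rather than a fixed recipe: the function names read_taxa and extract_subtree and the file names taxa.txt and subtree.tre are placeholders, and the taxon file is assumed to hold one name per line.

```python
def read_taxa(path):
    """Read one taxon name per line, skipping blank lines and '#' comments."""
    taxa = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            name = line.strip()
            if name and not name.startswith("#"):
                taxa.append(name)
    return taxa


def extract_subtree(tree_path, taxa, out_path):
    """Prune the tree stored in tree_path down to taxa and save it as Newick."""
    # Imported here so that read_taxa stays usable without ete3 installed.
    from ete3 import Tree

    tree = Tree(tree_path)  # Tree() accepts a file path or a Newick string
    tree.prune(taxa, preserve_branch_length=True)
    tree.write(outfile=out_path)  # default Newick format, as with t.write()


# Example call (file names are placeholders):
# extract_subtree("plants.tre", read_taxa("taxa.txt"), "subtree.tre")
```

Keeping the taxon list in a separate file makes it easy to rerun the extraction with a different set of names without editing the code.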

ChangeLog:

  • Author: Dr. Shi (石博士)
  • Date: 2020-09-19