Batch-downloading genome sequences from NCBI

# Get the assembly summary as a tab-separated text file.
curl -O ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/assembly_summary.txt

# Filter for complete genomes.
awk -F "\t" '$12=="Complete Genome" && $11=="latest"{print $20}' assembly_summary.txt > ftpdirpaths
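
The filter above relies on three column positions: version status in column 11, assembly level in column 12, and the FTP directory in column 20. A minimal sketch of the same filter on a tiny synthetic table, so the logic can be checked without downloading anything. The `make_row` helper, the `x` placeholder values, and the `example.org` paths are made up for illustration and are not part of the NCBI file.

```shell
# Build a three-row, 20-column tab-separated sample (hypothetical values).
tmpdir=$(mktemp -d)
sample="$tmpdir/sample_summary.txt"

# make_row STATUS LEVEL PATH: print one row with "x" in every other column.
make_row() {
  awk -v status="$1" -v level="$2" -v path="$3" 'BEGIN{
    OFS = "\t"
    for (i = 1; i <= 20; i++) f[i] = "x"
    f[11] = status; f[12] = level; f[20] = path
    line = f[1]
    for (i = 2; i <= 20; i++) line = line OFS f[i]
    print line
  }'
}

{
  make_row latest   "Complete Genome" "ftp://example.org/genomes/A"
  make_row latest   "Scaffold"        "ftp://example.org/genomes/B"
  make_row replaced "Complete Genome" "ftp://example.org/genomes/C"
} > "$sample"

# Same filter as above: only the first row passes both conditions.
kept=$(awk -F "\t" '$12=="Complete Genome" && $11=="latest"{print $20}' "$sample")
echo "$kept"
```

Only the row that is both `latest` and `Complete Genome` is printed. Changing the string compared against `$12` (for example to `Scaffold`) would select a different assembly level.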

# Build the URLs of the FASTA files (*_genomic.fna.gz); other file types in each directory can be selected by changing filesuffix.
awk 'BEGIN{FS=OFS="/";filesuffix="genomic.fna.gz"}{ftpdir=$0;asm=$10;file=asm"_"filesuffix;print ftpdir,file}' ftpdirpaths > ftpfilepaths
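
To see what the path construction above does, here is a small sketch run on a single directory path. The path is used only as an example of the layout (scheme, host, `genomes/all`, accession split into three-digit pieces, then the assembly directory). With `/` as the separator, the assembly directory name is field 10, and it is reused as the file name prefix. The `$NF` variant is an alternative suggested here, not part of the original recipe.

```shell
# One example directory path in the layout used by the ftp_path column.
path="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/005/845/GCF_000005845.2_ASM584v2"

# Same awk program as above, fed a single line.
url=$(printf '%s\n' "$path" | awk 'BEGIN{FS=OFS="/";filesuffix="genomic.fna.gz"}{ftpdir=$0;asm=$10;file=asm"_"filesuffix;print ftpdir,file}')
echo "$url"

# Alternative: take the last field instead of a fixed position,
# which does not depend on the directory depth.
url_nf=$(printf '%s\n' "$path" | awk -F/ '{print $0 "/" $NF "_genomic.fna.gz"}')
echo "$url_nf"
```

Both commands print the directory followed by `/GCF_000005845.2_ASM584v2_genomic.fna.gz`. Using `$NF` may be the safer choice if the directory depth ever differs from the one assumed by `$10`.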

# Download everything in parallel
mkdir -p all
cat ftpfilepaths | parallel -j 20 --verbose --progress "cd all && curl -O {}"
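
With many parallel transfers, some files may end up truncated. A possible follow-up check, not part of the original recipe, is to test every archive with `gzip -t` and list the ones that fail. The sketch below demonstrates the loop on two throwaway files in a temporary directory; the file names are hypothetical.

```shell
# Create one valid and one invalid .fna.gz file for the demonstration.
tmpdir=$(mktemp -d)
printf '>seq1\nACGT\n' | gzip -c > "$tmpdir/good_genomic.fna.gz"
printf 'not gzip data' > "$tmpdir/bad_genomic.fna.gz"

# gzip -t exits non-zero when an archive is corrupt or incomplete.
bad=""
for f in "$tmpdir"/*.fna.gz; do
  if ! gzip -t "$f" 2>/dev/null; then
    bad="$bad$(basename "$f") "
  fi
done
echo "corrupt: $bad"
```

Pointing the loop at `all/*.fna.gz` instead of the temporary directory would check the real downloads; any file it reports can then be fetched again.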

Download All The Bacterial Genomes From NCBI

https://www.ncbi.nlm.nih.gov/genome/browse/#!/overview/

https://www.biostars.org/p/61081/

GNU Parallel

http://www.gnu.org/software/parallel/

Reference links:

https://www.biostars.org/p/275452/

