UCSC, ENSEMBL, and NCBI genome versions: how they correspond and where to download them

1. Version correspondence

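NCBI assemblies include GRCh36, GRCh37, and GRCh38 (the 36 build is officially named NCBI36); UCSC uses hg18, hg19, and hg38; ENSEMBL publishes numbered releases. They correspond as follows: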
GRCh36 (hg18): ENSEMBL release_52.

GRCh37 (hg19): ENSEMBL release_59/61/64/68/69/75.

GRCh38 (hg38): ENSEMBL release_76/77/78/80/81/82.
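
The mapping above can be captured in a small lookup helper, which is handy when a pipeline receives a UCSC name but needs the NCBI name. This is only a sketch: the function name `ucsc_to_grc` is made up for illustration, and the names it returns follow the spelling used in this article.

```shell
#!/bin/sh
# Map a UCSC assembly name to the NCBI name listed in this article.
# ucsc_to_grc is a hypothetical helper name, not part of any tool.
ucsc_to_grc() {
    case "$1" in
        hg18) echo "GRCh36" ;;
        hg19) echo "GRCh37" ;;
        hg38) echo "GRCh38" ;;
        *)    echo "unknown assembly: $1" >&2; return 1 ;;
    esac
}

ucsc_to_grc hg19
```

Unknown names produce a message on stderr and a non-zero exit status, so a calling script can stop early.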

2. Downloading the reference genome

Genome FASTA files for every version can be downloaded from the Illumina iGenomes page: https://support.illumina.com.cn/sequencing/sequencing_software/igenome.html?langsel=/cn/


UCSC download address:
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/
To download chromosome by chromosome, a script such as the following can be used:

# Download each hg19 chromosome (1-22, X, Y, M) as a separate gzipped FASTA file
for i in $(seq 1 22) X Y M; do
    echo "$i"
    wget "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr${i}.fa.gz"
done
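
A possible variation is to generate the list of URLs first and only then hand it to wget, which makes it easy to check what will be fetched. The sketch below assumes the same directory layout as the loop above; the helper name `chrom_urls` and its assembly parameter are made up for illustration.

```shell
#!/bin/sh
# Print one download URL per chromosome for a UCSC assembly (default: hg19).
# chrom_urls is a hypothetical helper; the URL layout is copied from the loop above.
chrom_urls() {
    assembly="${1:-hg19}"
    base="http://hgdownload.cse.ucsc.edu/goldenPath/${assembly}/chromosomes"
    for i in $(seq 1 22) X Y M; do
        echo "${base}/chr${i}.fa.gz"
    done
}

# Dry run: list the URLs without downloading anything.
chrom_urls hg19

# To download, pipe the list to wget (-i - reads URLs from stdin, -c resumes partial files):
# chrom_urls hg19 | wget -c -i -
```

Separating the listing from the download also means an interrupted run can be restarted with the same command.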

NCBI download address: ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/BUILD.37.3/

3. Downloading GTF annotation files

NCBI:
ftp://ftp.ncbi.nlm.nih.gov/genomes/Homo_sapiens/ARCHIVE/

ENSEMBL: ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/Homo_sapiens.GRCh37.75.gtf.gz
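
The ENSEMBL address encodes both the release number and the assembly name, so the URL for another release from the correspondence list can be built the same way. The sketch below assumes the FTP directory is named `release-<N>` (with a hyphen) and that other releases follow the same file naming as release 75; the helper name `ensembl_gtf_url` is made up for illustration, and the resulting URL should be checked before it is relied on.

```shell
#!/bin/sh
# Build the ENSEMBL FTP URL of the human GTF for a given release and assembly.
# ensembl_gtf_url is a hypothetical helper; the "release-<N>" layout is an assumption.
ensembl_gtf_url() {
    release="$1"
    assembly="$2"
    echo "ftp://ftp.ensembl.org/pub/release-${release}/gtf/homo_sapiens/Homo_sapiens.${assembly}.${release}.gtf.gz"
}

ensembl_gtf_url 75 GRCh37

# To download:
# wget "$(ensembl_gtf_url 75 GRCh37)"
```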

For UCSC, you have to choose a series of parameters yourself in the Table Browser:

http://genome.ucsc.edu/cgi-bin/hgTables

clade: Mammal
genome: Human
assembly: Feb. 2009 (GRCh37/hg19)
group: Genes and Gene Predictions
track: UCSC Genes
table: knownGene
region: Select "genome" for the entire genome.
output format: GTF - gene transfer format
output file: hg19_ucsc.gtf
Click 'get output'.
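
Once the export has been saved, a quick sanity check is to count the records per feature type in the third GTF column. This is a minimal sketch that assumes the file is tab-separated with nine columns, as the GTF format specifies; the helper name `gtf_feature_counts` is made up for illustration.

```shell
#!/bin/sh
# Count GTF records per feature type (column 3), skipping comment lines.
# gtf_feature_counts is a hypothetical helper name.
gtf_feature_counts() {
    awk -F '\t' '!/^#/ && NF >= 9 { n[$3]++ } END { for (f in n) print f "\t" n[f] }' "$1" | sort
}

# Run the check only if the export from the steps above is present.
if [ -f hg19_ucsc.gtf ]; then
    gtf_feature_counts hg19_ucsc.gtf
fi
```

An empty result or a file with fewer than nine columns per line usually means a different output format was selected in the Table Browser.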
