Data Sources
Data sources available for gene annotation.
RefSeq
Files
ncbi refseq RefSeqGene:
LRG_RefSeqGene
refseqgene.<n>.genomic.gbff.gz
ncbi refseq mRNA_Prot:
human.<n>.rna.gbff.gz
ncbi gene:
gene2ensembl.gz
gene2refseq.gz
Description
LRG_RefSeqGene is a tab-delimited file reporting, for each gene, the accession.version of the genomic RefSeq (RSG) that is the standard reference. It also reports the accession.version of the associated RNA and protein RefSeqs.
#tax_id GeneID Symbol RSG LRG RNA t Protein p Category
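As a minimal sketch of how this file could be read, the Python snippet below parses the "#"-prefixed header and returns one dictionary per row. It assumes the real file separates columns with tabs (the header is shown above with spaces). The function name read_lrg_refseqgene is hypothetical, and the data row is an invented placeholder, not a real record.

```python
import csv
import io

# Placeholder content: the header matches the column list above, but the
# data row is invented for illustration and is not a real record.
SAMPLE = (
    "#tax_id\tGeneID\tSymbol\tRSG\tLRG\tRNA\tt\tProtein\tp\tCategory\n"
    "9606\t1\tGENE1\tNG_000001.1\tLRG_1\tNM_000001.1\tt1\tNP_000001.1\tp1\t"
    "reference standard\n"
)


def read_lrg_refseqgene(handle):
    """Return the rows of an LRG_RefSeqGene-style file as a list of dicts."""
    # The first line is the header; drop the leading '#' before splitting.
    header = handle.readline().rstrip("\r\n").lstrip("#").split("\t")
    return list(csv.DictReader(handle, fieldnames=header, delimiter="\t"))


rows = read_lrg_refseqgene(io.StringIO(SAMPLE))
first = rows[0]
print(first["Symbol"], first["RSG"], first["RNA"], first["Protein"])
```

To read the downloaded file instead of the in-memory sample, the same function should accept a handle opened in text mode, for example open("LRG_RefSeqGene", newline="").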
refseqgene.<n>.genomic.gbff reports annotations for each RSG in GenBank format.
human.<n>.rna.gbff reports annotations for each RNA and protein RefSeq in GenBank format.
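GenBank flat files hold many records, each starting with a LOCUS line and ending with a line containing only "//". The sketch below splits such a file into records using only the standard library. The helper names iter_genbank_records and locus_name are hypothetical, and the two sample records are invented placeholders. For full feature parsing, a dedicated GenBank parser would be the more robust choice.

```python
import io

# Invented placeholder records, shaped like GenBank flat-file entries.
SAMPLE = (
    "LOCUS       NG_000001   100 bp    DNA     linear   PRI 20-OCT-2020\n"
    "DEFINITION  Placeholder genomic record.\n"
    "//\n"
    "LOCUS       NM_000001    50 bp    mRNA    linear   PRI 20-OCT-2020\n"
    "DEFINITION  Placeholder transcript record.\n"
    "//\n"
)


def iter_genbank_records(handle):
    """Yield each record as a list of lines, without the '//' terminator."""
    record = []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.strip() == "//":
            if record:
                yield record
            record = []
        else:
            record.append(line)
    # Yield a trailing record only if it holds something other than blanks.
    if any(item.strip() for item in record):
        yield record


def locus_name(record):
    """Return the second whitespace-separated token of the LOCUS line."""
    for line in record:
        if line.startswith("LOCUS"):
            return line.split()[1]
    return None


names = [locus_name(rec) for rec in iter_genbank_records(io.StringIO(SAMPLE))]
print(names)
```

For the compressed downloads, gzip.open(path, "rt") from the standard library returns a text handle that can be passed to iter_genbank_records in place of the StringIO object.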
gene2ensembl is a tab-delimited file matching NCBI to Ensembl annotations.
#tax_id GeneID Ensembl_gene_identifier RNA_nucleotide_accession.version Ensembl_rna_identifier protein_accession.version Ensembl_protein_identifier
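One possible use of this file is a lookup from RefSeq transcript accessions to Ensembl transcript identifiers. The sketch below builds that lookup for a single taxon. It assumes tab-separated columns named as in the header above and human rows identified by tax_id 9606. The function name refseq_to_ensembl_rna is hypothetical, and all sample identifiers are invented placeholders.

```python
import csv
import io

# Invented placeholder rows; the header matches the column list above.
SAMPLE = (
    "#tax_id\tGeneID\tEnsembl_gene_identifier\t"
    "RNA_nucleotide_accession.version\tEnsembl_rna_identifier\t"
    "protein_accession.version\tEnsembl_protein_identifier\n"
    "9606\t1\tENSG00000000001\tNM_000001.1\tENST00000000001.1\t"
    "NP_000001.1\tENSP00000000001.1\n"
    "9606\t1\tENSG00000000001\tNM_000002.1\tENST00000000002.1\t"
    "NP_000002.1\tENSP00000000002.1\n"
    "10090\t2\tENSMUSG00000000001\tNM_000003.1\tENSMUST00000000001.1\t"
    "NP_000003.1\tENSMUSP00000000001.1\n"
)


def refseq_to_ensembl_rna(handle, tax_id="9606"):
    """Map RefSeq RNA accession.version to Ensembl RNA id for one taxon."""
    header = handle.readline().rstrip("\r\n").lstrip("#").split("\t")
    mapping = {}
    for row in csv.DictReader(handle, fieldnames=header, delimiter="\t"):
        # Keep only rows for the requested taxon.
        if row["tax_id"] == tax_id:
            rna = row["RNA_nucleotide_accession.version"]
            mapping[rna] = row["Ensembl_rna_identifier"]
    return mapping


mapping = refseq_to_ensembl_rna(io.StringIO(SAMPLE))
print(mapping)
```

Filtering on tax_id while streaming keeps memory use proportional to the number of rows retained rather than to the size of the whole file.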
gene2refseq is a tab-delimited file reporting genomic/RNA/protein sets of matching RefSeqs.
#tax_id GeneID status RNA_nucleotide_accession.version RNA_nucleotide_gi protein_accession.version protein_gi genomic_nucleotide_accession.version genomic_nucleotide_gi start_position_on_the_genomic_accession end_position_on_the_genomic_accession orientation assembly mature_peptide_accession.version mature_peptide_gi Symbol
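The sketch below shows one way to collect the matching RNA/protein pairs per gene from this file. It assumes tab-separated columns named as in the header above, missing values encoded as "-", and human rows identified by tax_id 9606. The function name transcript_protein_pairs is hypothetical, and the sample rows are invented placeholders.

```python
import csv
import io
from collections import defaultdict

COLUMNS = [
    "tax_id", "GeneID", "status",
    "RNA_nucleotide_accession.version", "RNA_nucleotide_gi",
    "protein_accession.version", "protein_gi",
    "genomic_nucleotide_accession.version", "genomic_nucleotide_gi",
    "start_position_on_the_genomic_accession",
    "end_position_on_the_genomic_accession",
    "orientation", "assembly",
    "mature_peptide_accession.version", "mature_peptide_gi", "Symbol",
]

# Invented placeholder rows: a coding transcript, a non-coding transcript,
# and a genomic-only row with no RNA accession.
SAMPLE_ROWS = [
    ["9606", "1", "REVIEWED", "NM_000001.1", "-", "NP_000001.1", "-",
     "NC_000001.1", "-", "100", "200", "+", "-", "-", "-", "GENE1"],
    ["9606", "1", "REVIEWED", "NR_000001.1", "-", "-", "-",
     "NC_000001.1", "-", "100", "200", "+", "-", "-", "-", "GENE1"],
    ["9606", "2", "REVIEWED", "-", "-", "-", "-",
     "NC_000001.1", "-", "300", "400", "-", "-", "-", "-", "GENE2"],
]
SAMPLE = "#" + "\t".join(COLUMNS) + "\n" + "".join(
    "\t".join(row) + "\n" for row in SAMPLE_ROWS
)


def transcript_protein_pairs(handle, tax_id="9606"):
    """Group (RNA, protein) accession pairs by GeneID for one taxon."""
    header = handle.readline().rstrip("\r\n").lstrip("#").split("\t")
    pairs = defaultdict(set)
    for row in csv.DictReader(handle, fieldnames=header, delimiter="\t"):
        rna = row["RNA_nucleotide_accession.version"]
        # Skip other taxa and rows that carry no RNA accession.
        if row["tax_id"] != tax_id or rna == "-":
            continue
        protein = row["protein_accession.version"]
        pairs[row["GeneID"]].add((rna, None if protein == "-" else protein))
    return dict(pairs)


pairs = transcript_protein_pairs(io.StringIO(SAMPLE))
print(pairs)
```

Storing the pairs in a set removes duplicates that arise when the same transcript is listed against several genomic accessions or assemblies.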
Version
Current versions, accessed on 2020-10-22.
LRG_RefSeqGene: v20201020
refseqgene.<n>.genomic.gbff.gz: v20201020
human.<n>.rna.gbff.gz: v20201020
gene2ensembl.gz: v20201022
gene2refseq.gz: v20201022