###################################################################
README for ftp://ncbi.nlm.nih.gov/refseq/release/release-catalog
Last updated: February 13, 2004
###################################################################
_________________________________________________________________________
National Center for Biotechnology Information (NCBI)
National Library of Medicine
National Institutes of Health
8600 Rockville Pike
Bethesda, MD 20894, USA
tel: (301) 496-2475
fax: (301) 480-9241
e-mail: info@ncbi.nlm.nih.gov
_________________________________________________________________________

This directory includes files documenting the contents of the RefSeq
release both as an accession list and file list, and records that were
included in the previous release but are not included in the current
release.

Files included are:
    RefSeq-release#.catalog
    release#.files.installed
    release#.removed-records

    where '#' is the release number

Subdirectories:
    archive - previous release catalogs are available here

==========================================
RefSeq-release#.catalog
==========================================
Content: Tab-delimited listing of all accessions included in the
         current RefSeq release.

Columns:
 1. taxonomy ID
 2. species name
 3. accession.version
 4. gi
 5. refseq release directory accession is included in
      complete + other directories
      '|' delimited
 6. refseq status
      na - not available; status codes are not applied to most
           genomic records
      INFERRED
      PREDICTED
      PROVISIONAL
      VALIDATED
      REVIEWED
      MODEL
      UNKNOWN - status code not provided; however usually is
                provided for this type of record
 7. length

==========================================
release#.files.installed
==========================================
Complete listing of sequence files installed for the current release.

File name format indicates the directory node, molecule type, and format type.
Multiple files may be provided for any given molecule and format type, and
file names include a numerical increment. Files with the same numerical
increment are related by content: they are all derived from the same
ASN.1 file.

Name format:
      complete10.bna.gz
      |-------|--|---|--|
          1    2   3   4

      1. directory location
      2. numerical increment
      3. format type
      4. compression

Note that for some molecule and format types, a number increment is
skipped. This is not an error. The RefSeq release processing first
produces a set of split ASN.1 files, which are used to export the records
by molecule and format type. If an ASN.1 file does not include any records
for a given molecule type, such as genomic sequence data, then the
corresponding 'genomic' fasta and flatfile records will not be found.

For example:
      complete10.bna.gz
      complete10.genomic.fna.gz
      complete10.genomic.gbff.gz
      complete10.protein.faa.gz
      complete10.protein.gpff.gz
      complete10.rna.fna.gz
      complete10.rna.gbff.gz

If complete10.bna.gz includes genomic, RNA, and protein data, then the
full set of files is provided.

In contrast, if complete24.bna.gz includes only genomic and protein data,
then the corresponding rna files are not provided:
      complete24.bna.gz
      complete24.genomic.fna.gz
      complete24.genomic.gbff.gz
      complete24.protein.faa.gz
      complete24.protein.gpff.gz

==========================================
release#.removed-records
==========================================
Content: Tab-delimited report of records that were included in the
         previous release but are not included in the current release.

Columns:
 1. taxonomy ID
 2. species name
 3. accession.version
 4. gi
 5. refseq release directory accession is included in
      complete + other directories
      '|' delimited
 6. refseq status
      na - not available; status codes are not applied to most
           genomic records
      INFERRED
      PREDICTED
      PROVISIONAL
      VALIDATED
      REVIEWED
      MODEL
      UNKNOWN - status code not provided; however usually is
                provided for this type of record
 7. length
 8. removed status
      dead protein: protein was removed when the genomic record was
           reloaded and the protein was not found on the nucleotide
           update. This is an implied permanent suppress.
      temporarily suppressed: record was temporarily removed and may
           be restored at a later date.
      permanently suppressed: record was permanently removed. It is
           possible to restore this type of record; however, at the
           time of removal that action is not anticipated.
      replaced by accession: the accession in column 3 has become a
           secondary accession of the accession cited in column 8.
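
The file-name convention and the eight-column removed-records layout
described above can be read with a short script. The following Python is
a minimal sketch only, not an NCBI tool. The names
parse_release_file_name, parse_removed_record, and RemovedRecord, and the
field labels, are hypothetical and chosen for illustration. The sketch
assumes that directory names contain only letters and underscores, that
the molecule type is one of genomic, rna, or protein (as in the examples
above), and that every removed-records line has exactly eight tab-separated
fields.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed pattern: <directory><increment>.[<molecule>.]<format>.<compression>
# e.g. complete10.bna.gz or complete10.genomic.fna.gz
_FILE_NAME = re.compile(
    r"(?P<directory>[A-Za-z_]+)"
    r"(?P<increment>\d+)\."
    r"(?:(?P<molecule>genomic|rna|protein)\.)?"
    r"(?P<format>[A-Za-z0-9]+)\."
    r"(?P<compression>gz)"
)


def parse_release_file_name(name: str) -> dict:
    """Split an installed file name into its documented parts."""
    match = _FILE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    return {
        "directory": match.group("directory"),
        "increment": int(match.group("increment")),
        # None for the ASN.1 (.bna) file, which carries no molecule type.
        "molecule": match.group("molecule"),
        "format": match.group("format"),
        "compression": match.group("compression"),
    }


class RemovedRecord(NamedTuple):
    """One row of release#.removed-records (field labels are illustrative)."""
    tax_id: int
    species: str
    accession_version: str
    gi: str
    directories: List[str]
    status: str
    length: int
    removed_status: str


def parse_removed_record(line: str) -> RemovedRecord:
    """Parse one tab-delimited line; column 5 is split on '|'."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, found {len(fields)}")
    tax_id, species, acc, gi, dirs, status, length, removed = fields
    return RemovedRecord(
        tax_id=int(tax_id),
        species=species,
        accession_version=acc,
        gi=gi,
        directories=dirs.split("|"),
        status=status,
        length=int(length),
        removed_status=removed,  # kept verbatim, not interpreted
    )


if __name__ == "__main__":
    print(parse_release_file_name("complete10.genomic.fna.gz"))
    # The record below uses made-up values, for illustration only.
    sample = "1\texample species\tXX_000000.1\t1\tcomplete\tna\t100\tdead protein"
    print(parse_removed_record(sample))
```

The first seven fields of RemovedRecord match the columns of
RefSeq-release#.catalog, so the same splitting logic could be reused for
the catalog by dropping the eighth field.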