-----------------------------------------------
Trypanosoma brucei README
===============================================
Table of Contents
-----------------------------------------------
1. Trypanosoma brucei genome sequencing project
2. File format
3. Data Access
4. Contacts
===============================================
--------------------------------------------------------------------------------------

1. Trypanosoma brucei genome sequencing project
===============================================

T. brucei, the protozoan responsible for African Sleeping Sickness,
possesses a two-unit genome: a nuclear genome and a mitochondrial
(kinetoplast) genome, with a total estimated size of 35 Mb per haploid
genome. The nuclear genome is split into three classes of chromosomes
according to their size on pulsed-field gel electrophoresis: 11 megabase
chromosomes (0.9-5.7 Mb), intermediate chromosomes (300-900 kb) and
minichromosomes (50-100 kb).

The 11 megabase chromosomes are being sequenced by The Wellcome Trust
Sanger Institute and TIGR as follows:

   1      Sanger
   2-8    TIGR
   9-11   Sanger

IMPORTANT NOTE: The file gene_association.GeneDB_Tbrucei only contains
gene associations for the chromosomes sequenced at the Wellcome Trust
Sanger Institute.

2. File format
==============

The file format follows the guidelines provided by the Gene Ontology
Consortium (www.geneontology.org). Project-specific usage of a column is
marked "Here:" in the column descriptions.

1. DB
   Database from which the annotated entry has been taken.
   Here: GeneDB_Tbrucei is used to signify that the data will eventually
   be housed in the T. brucei database at GeneDB
   (http://www.genedb.org/genedb/tryp/)
2. DB_Object_ID
   A unique identifier in the DB for the item being annotated.
   Here: the systematic gene name, which is also used to access gene
   pages. To access a gene page, append the systematic id to the
   following URL:
   http://www.genedb.org/genedb/Search?organism=tryp&name=
3. DB_Object_Symbol
   Here: the systematic gene name is currently used to identify gene
   pages, so DB_Object_Symbol and DB_Object_ID are equivalent.
4. NOT
   Rarely used; indicates that a GO assignment is not appropriate
   despite some evidence to the contrary. See the main GO documentation.
5. GOid
   The GO identifier for the term attributed to the DB_Object_ID.
   Example: GO:0005625
6. DB:Reference
   Reference cited to support the attribution.
   Here: this is pre-publication data. After publication, this column
   will contain a PubMed id for a reference supporting the attribution.
7. Evidence
   See the main GO documentation.
8. With
   Here: currently not applicable, always empty.
9. Aspect
   One of the three ontologies: P (biological process), F (molecular
   function) or C (cellular component).
   Example: P
10. DB_Object_Name
   Here: currently not applicable; contains the same information as
   column 2. Will soon contain details of the closest Drosophila
   homologue for reference purposes.
11. Synonym
   Here: currently not applicable, always empty.
12. DB_Object_Type
   The kind of entity being annotated.
   Here: always 'gene'
13. Taxon_ID
   Identifier for the species being annotated.
   Here: taxon:5833
14. Date
   Date on which the analysis was performed.

3. Data Access
==============

All sequences analysed are available from http://www.sanger.ac.uk

4. Contacts
===========

Matt Berriman, mb4@sanger.ac.uk
Christiane Hertz-Fowler, chf@sanger.ac.uk
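
As an illustration of the column layout described in section 2, the
following Python sketch reads association lines into dictionaries and
builds a gene page URL from the systematic id. It is not part of the
project's tooling. It assumes the file is tab-delimited with one
annotation per line and that lines starting with "!" are comments (the
usual Gene Ontology convention, not stated in this README). The function
names, the column key DB_Reference, and every value in SAMPLE other than
GeneDB_Tbrucei, GO:0005625, 'gene' and taxon:5833 are made-up
placeholders.

```python
import io
from urllib.parse import quote

# Column names in the order listed in section 2 of this README.
COLUMNS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "NOT", "GOid",
    "DB_Reference", "Evidence", "With", "Aspect", "DB_Object_Name",
    "Synonym", "DB_Object_Type", "Taxon_ID", "Date",
]

# Base URL given in the DB_Object_ID description.
GENE_PAGE_URL = "http://www.genedb.org/genedb/Search?organism=tryp&name="


def parse_associations(handle):
    """Yield one dict per annotation line.

    Assumption: fields are tab-separated and '!' starts a comment line.
    """
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("!"):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(
                "expected %d columns, found %d" % (len(COLUMNS), len(fields))
            )
        yield dict(zip(COLUMNS, fields))


def gene_page_url(systematic_id):
    """Append the systematic id to the gene page search URL."""
    return GENE_PAGE_URL + quote(systematic_id)


# Placeholder record: the gene id, reference, evidence code and date are
# invented for illustration and do not come from the real data file.
SAMPLE = "!example comment line\n" + "\t".join([
    "GeneDB_Tbrucei", "EXAMPLE_GENE_1", "EXAMPLE_GENE_1", "", "GO:0005625",
    "EXAMPLE_REF", "IEA", "", "C", "EXAMPLE_GENE_1",
    "", "gene", "taxon:5833", "20030101",
]) + "\n"

if __name__ == "__main__":
    for row in parse_associations(io.StringIO(SAMPLE)):
        print(row["DB_Object_ID"], row["GOid"], row["Aspect"])
        print(gene_page_url(row["DB_Object_ID"]))
```

To read the real file, pass an open file handle for
gene_association.GeneDB_Tbrucei to parse_associations instead of the
in-memory sample.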