!version: $Revision: 1.50 $
!date: $Date: 2003/10/03 14:43:11 $
!
!Gene Ontology
!what_is_new
!
!editors: Michael Ashburner (FlyBase), Midori Harris (GO), Judith Blake (MGD)
!Leonore Reiser (TAIR), Karen Christie (SGD) and colleagues
!with software by Suzanna Lewis (FlyBase Berkeley).
!

GO - WHAT'S NEW.

This file documents major changes to the structure or content of GO files. The most recent changes are at the top of the file.

** October 3, 2003: MultiFun gene classification system mapped to GO terms
=======================================================================

A mapping of MultiFun (a classification system for the cellular/physiological roles of E. coli gene products) to GO terms has been added to the go/external2go directory.

[mah and jl]
!

** May 22, 2003: Microbial Structure Ontology
==========================================

A web page is available for the Microbial Structure Ontology project:

http://www.geneontology.org/doc/microbial_structure_ontology/

The project aims to develop a controlled vocabulary to describe the 'anatomy' of fungi and other microbes, called the microbial structure ontology (MSO). The ontology is orthogonal to those developed by the GO Consortium and those listed at the Open Biological Ontology site (OBO). Thus, the MSO may be used alone or in conjunction with the ontologies at GO and OBO to make robust functional annotations. Researchers are encouraged to help develop and to use the MSO in their work.

[mah]
!

** February 5, 2003: Changes to molecular function terms
=====================================================

The word 'activity' is to be appended to all GO molecular function term names on March 1 2003. Versions of the amended flat files can be found here:

http://www.ebi.ac.uk/~jane/newfunction.ontology
http://www.ebi.ac.uk/~jane/newGO.defs

For further information please contact Jane Lomax (jane@ebi.ac.uk).

[jl]
!

** December 9, 2002: ZFIN annotations available
============================================

The first set of GO annotations for the zebrafish (Danio rerio) has been provided by ZFIN. The file is available from the usual GO FTP, HTTP, and CVS resources.

FTP: ftp://ftp.geneontology.org/pub/go/gene-associations/gene_association.zfin
HTTP: http://www.geneontology.org/cgi-bin/GO/downloadGOGA.pl/gene_association.zfin

[mah]
!

** December 3, 2002: Users Meeting archive page
============================================

A new web page stores links to information (e.g. programs, abstracts) about past GO Users Meetings. See

http://www.geneontology.org/doc/GO>Users_Meetings.html

[mah]
!

** October 24, 2002: Update on annotations: newly annotated species
================================================================

Annotation files have been released over the past few months for several species; see the go/gene-associations/ directory for the following new files:

Glossina morsitans (tsetse fly)   gene_association.GeneDB_tsetse
Oryza sativa (rice)               gene_association.gramene_oryza
Trypanosoma brucei                gene_association.GeneDB_Tbrucei and
                                  gene_association.tigr_Tbrucei_chr2
Vibrio cholerae                   gene_association.tigr_vibrio

[mah]
!

** August 27, 2002: MIPS Functional Catalogue mapped to GO terms
=============================================================

A mapping of MIPS Functional Catalogue entries to GO terms has been added to the go/external2go directory.

[mah]
!

** June 21 2002: *All* GOA data from EBI published.
===============================================

GOA (GO Annotation@EBI) is a project run by the European Bioinformatics Institute that aims to provide assignments of gene products to the Gene Ontology (GO) resource. The project announces the release of all GO annotations that exist in SWISS-PROT and TrEMBL as well as a third release of annotation for the SWISS-PROT/TrEMBL/Ensembl non-redundant human proteome set.
This release represents a considerable contribution to the GO Consortium annotation effort, providing over 2 million GO associations across 481422 SWISS-PROT and TrEMBL entries covering 43239 species.

The data can be obtained via:

EBI FTP: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/
EBI SRS: http://srs.ebi.ac.uk. Search the GOA data library, where both GOA SPTr and GOA Human association files have been merged.
GO FTP: ftp://ftp.geneontology.org/pub/go/gene-associations/

For further information read: http://www.ebi.ac.uk/GOA or contact goa@ebi.ac.uk.

Enjoy!

[ec] and [db]
!

** June 5 2002: New evidence code inferred by curator (IC).
==================================================

GO has added a new evidence code, "inferred by curator," for those cases where an annotation is not supported by any evidence, but can be reasonably inferred by a curator from other GO annotations, for which evidence is available.

An example would be when there is evidence (be it direct assay, sequence similarity or even from electronic annotation) that a particular gene product has the function "transcription factor". There is no evidence whatsoever that this gene product has the cellular location "nucleus", but this would be a perfectly reasonable inference for a curator to make. This inference would be linked to the annotation "transcription factor" in two ways: (i) both annotations would share the same reference, and (ii) the inferred annotation would include one or more "from" statements pointing to the GO term(s) used by the curator for the inference.

gene_product: jubi
reference: Ashburner et al. 2006 J. irreprod. res. 107:11989-11990
molecular_function: general RNA polymerase II transcription factor ; GO:0016251 | inferred from sequence similarity
cellular_location: nucleus ; GO:0005634 | inferred by curator from GO:0016251

The abbreviation for "inferred by curator" will be IC.
There will be cases in existing annotations where inferences have been made by curators who will have used one of the already existing evidence codes. It is doubtful that these will be retrofitted.

[jl for ma]
!

** April 26 2002: February meeting minutes available.
==================================================

The minutes from the February meeting of the GO Consortium, organized by SGD and sponsored by O'Reilly and Associates at The Westin La Paloma in Tucson, AZ, are now available from the CVS repository and from the ftp site (ftp://ftp.geneontology.org/pub/go/minutes/).

[krc for rc]
!

** March 20 2002: SourceForge curator requests tracker system now active.
======================================================================

The SourceForge system for tracking curator requests is now in use and can be found at:

https://sourceforge.net/tracker/?atid=440764&group_id=36855&func=browse/

There are also some instructions for using the system which are in CVS (go/doc/GO.curator_requests.html) and at the url:

http://www.geneontology.org/doc/GO.curator_requests.html

[jl]
!

** March 6 2002: XML and MySQL repository to be located only at Berkeley.
======================================================================

As agreed at the October meeting of the GO Consortium, the latest versions and archives of the monthly releases of the MySQL and XML releases will now be kept only at UC Berkeley for the best access to the most up to date files. The links on the GO home page now point only to the MySQL and XML repository at Berkeley. The copies at Stanford will be removed from the CVS repository and from the ftp site; specifically, this file will be removed:

ftp://ftp.geneontology.org/pub/go-xml/go.xml

[krc for sl and cm]
!

** January 4 2002: GO and RESID now reciprocally cross-linked.
===========================================================

As announced on September 20 2001 John Garavelli and I have been working to cross-link his RESID database of modified amino acids to GO. This has now been completed and updated to version 28.00 of the RESID database (available from ftp://ftp.ncifcrf.gov/pub/users/residues/). This release contains 311 records and these are cross-linked to GO in Xref lines. GO has created "biological_process" terms relevant to these modified residues and the cross-links from GO to RESID are stored in the definitions (go/doc/GO.defs) of these terms.

[ma & jg]
!

** January 3 2002: October meeting minutes available.
==================================================

The minutes from the October meeting of the GO Consortium, hosted by DictyBase at Northwestern University, are now available from the CVS repository and from the ftp site (ftp://ftp.geneontology.org/pub/go/minutes/).

[jb & ma]
!

** December 21 2001: Rat annotations released.
===========================================

The Rat Genome Database has released to gene-associations/ a file of 3774 annotations of Rat genes (/gene-associations/gene_association.rgd).

[ma for RGD]
!

** December 17 2001: RNAi-based GO annotations for C. elegans
==========================================================

WormBase's RNAi-to-GO mappings have been implemented for C. elegans and made available in the Current Annotations table. The file is: gene_association.wb. From here on there should be reasonably frequent updates, as both the GO terms are expanded to cover new C. elegans processes and the RNAi data are expanded.

[mh for WB]
!

** December 17 2001: Redesigned GO home page
=========================================

We have changed the appearance and organization of the Gene Ontology home page (www.geneontology.org).
New features include:

- Search GO terms and annotations directly from the home page
- Links to more GO browsers
- Job openings on separate page
- Contact information on separate page (mailing lists and specific individuals at each member organization)
- GO Consortium member organizations and people on separate page

[mh,krc]
!

** November 6 2001: Corrections to GO.
===================================

As part of the checking of GO terms against the GO word dictionary (see go/doc/GO.word_dictionary) I have rationalised the use of the double hyphen (--) in enzyme names. This is often used by the Enzyme Commission, but it makes searching in GO easier if a single hyphen is used in these names and also in chemical names that might appear elsewhere in GO. Without this rationalisation some names would be present in GO in two forms, one with '--' and one with '-'.

I have not yet finished checking against the dictionary but will report in this file any other systematic changes that I make.

[ma]
!

** November 1 2001: GOA data from EBI published.
=============================================

GOA (GO Annotation@EBI) is a project run by the European Bioinformatics Institute that aims to provide assignments of gene products to the Gene Ontology (GO) resource. In the GOA project, GO terms will be applied to a non-redundant set of proteins described in the SWISS-PROT, TrEMBL and Ensembl databases that collectively provide complete proteomes for Homo sapiens and other organisms.

The first set of GOA data has now been made public. The data can be obtained via FTP from EBI or from GO.

GO:
ftp://ftp.geneontology.org/pub/go/doc/goa.README
ftp://ftp.geneontology.org/pub/go/gene-associations/gene_association.goa

EBI:
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/HUMAN/README
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/HUMAN/gene_association.goa

[mh for EBI]
!

** October 19 2001: MetaCyc links to GO.
=====================================

Database cross-links between GO biological_process terms and the MetaCyc database of Peter Karp and colleagues have now been made. We thank Peter Karp for his help in doing this.

abbreviation: MetaCyc
definition: The Metabolic Encyclopedia of metabolic and other pathways.
example: MetaCyc:GLUTDEG-PWY
generic_url: http://ecocyc.org/ecocyc/metacyc.html?
url_syntax: http://malibu.ai.sri.com:1555//META/NEW-IMAGE?type=PATHWAY&object=?
url_syntax_example: http://malibu.ai.sri.com:1555//META/NEW-IMAGE?type=PATHWAY&object=GLUTDEG-PWY

[ma]
!

** September 20 2001: Additions to GO ontologies.
==============================================

The children of the process term "protein modification" have been extensively revised and extended so as to accommodate the various classes of protein modification listed in the RESID database:

http://home.earthlink.net/~jsgaravelli/RESIDInfo.HTML

In the definitions file, references of the form RESID:AA are to the records of this database.

[ma]
!

** September 18 2001: Additions to GO ontologies.
==============================================

1. With the great help of Dr. Paul Kellam (London) and Dr. Bill Gillis (Delaware) the three GO ontologies have been updated so as to be of use for the annotation of viral gene products.

2. The biological_process and molecular_function ontologies have been updated to include information from The University of Minnesota Biocatalysis/Biodegradation Database [http://umbbd.ahc.umn.edu/index.html]. Functions (enzymes) present in both GO and in the UM-BBD have been given a database cross-reference of the form UM-BBD_enzymeID:eX (where X is an integer); pathways (in biological_process) have been given a database cross-reference of the form UM-BBD_pathwayID:X, where X is a string. Many thanks to Dr. Lynda Ellis for her help.
The url syntax for hyperlinking to UM-BBD for enzymes is:

http://umbbd.ahc.umn.edu/servlets/pageservlet?ptype=ep&enzymeID=eX

The url syntax for hyperlinking to UM-BBD for pathways is:

http://umbbd.ahc.umn.edu/phe/X_map.html

replacing 'X' by the integer or string as appropriate.

[ma]
!

** April 27 2001: New email address for private correspondence with GO.
====================================================================

For commercial companies using GO.
----------------------------------

We know that several companies are using GO. At the moment the only way for these to suggest changes to GO is to email a public list, or to privately contact a GO worker whom they happen to know. We realise that this may be difficult for some companies as, by so doing, they may disclose their current work to the world.

We have, therefore, established a contact for companies that wish to communicate suggestions to GO in confidence. This is by email to this address:

go-in@ebi.ac.uk

Mail to this address will only be read by Midori Harris, the new full-time GO Biology Editor now working at the EBI on the NIH grant, Michael Ashburner, Mike Cherry and Judy Blake. We will keep the identity of the company requesting a particular change confidential. If we need to consult with colleagues we will do so in such a way that is anonymous with respect to the companies requesting help or changes to GO.

We hope that this will encourage all users of GO, not only those in academia, to provide the GO Consortium with useful feedback.

Michael Ashburner & Midori Harris for the GO Consortium
!

** March 16 2001: Conversion to GO-EDIT output.
============================================

Until now all changes to the three GO ontology files have been made with generic editing tools (typically emacs) with, in recent months, some episodic checking for syntactical errors by scripts.
The Berkeley group, especially John Richter, has been working hard for some months to build an interactive GO editor, as a java app (GO-EDIT). This has now been tested by the curators and is ready for prime time. See the following url for details of the Editor, which is publicly available:

http://www.godatabase.org/dev/editor.html

There are two consequences of this change, one short term and one long term:

Short term:
-----------

The ordering of nodes in the ontologies will change radically, because GO-EDIT sorts by an alphabetical algorithm. This means that the first instantiation of these files on the GO CVS will generate a massive diff report. The relationships between nodes will not, of course, change.

The definitions file (doc/GO.defs) will also be handled by GO-EDIT from now.

Long term:
----------

For the moment GO-EDIT will write out the flat files GO.defs, molecular_function.ontology, biological_process.ontology and cellular_component.ontology, which will be committed to the GO CVS site in the normal way. However, since these files will now be much more rigorous with respect to their syntax, regular loading into the XML and mySQL versions will now occur.

In the longer term users of GO-EDIT will write directly back to the database, which will then become the authoritative version of GO. When that occurs we will notify users.

These changes will take effect from Friday, March 16, 2001, around 10:00 AM PST.

To avoid any possible confusion between versions the first version of each of the three ontologies will be numbered 2.1.

!

** March 7 2001: Revision of %enzyme.
==================================

There has been continual dissatisfaction within the GO group and others in the way that enzyme functions have been handled in $molecular_function. The children of the node %enzyme have been wholly revised, so as to follow much more closely the Enzyme Commission system; the EC hierarchy now acts as a scaffold for the children of this node.
Enzymes without an EC number, and functions that are "paraphyletic" - that is, cut across different EC numbers - are at the end of this section.

[ma]
!

** February 17 2001: G-protein coupled receptors.
==============================================

The high level organisation of the G-protein coupled receptors (GO:0004930) has been revised, with the addition of some missing functions, in response to user comments.

[ma]
!

** November 1 2000: Major revision of stress response and responses to stimuli.
============================================================================

We have made a major revision to the children of %stress response in $biological_process. This was stimulated by the needs of the Arabidopsis database (TAIR). Responses to external stimuli (whether stress or not) are now children of %response to external stimulus ; GO:0009605.

At the same time we have replaced %sensory perception ; GO:0007600 by a new parent term %perception of external stimulus ; GO:0009581 and made very considerable revisions to their children.

No terms have been removed, although some have been slightly re-worded to make them clearer.

[ma;lr;mh]
!

** October 26 2000: Plant cell components added.
=============================================

A number of terms have been added to $cellular_component for plants. To accommodate plant biology, some aspects of the ontology structure have been changed: cell wall ; GO:0005618 is no longer a child of extracellular ; GO:0005576, and intercellular junction ; GO:0005911 is no longer a child of plasma membrane ; GO:0005886.

[mah]
!

** October 2 2000: Transport completely revised.
=============================================

The children of %transport in $biological_process have now been completely revised to bring them into conformity with the recent revision of %transporter in $molecular_function. Only two pre-existing children have been made obsolete.

[ma]
!

** October 2 2000: Transport completely revised.
=============================================

We have today committed a complete revision of all of the children of %transport. In version 1.57 and earlier of $molecular_function there were some 300 children of this term; there are now over 800. All except three of the previous children remain, although the precise terms and their relationships may have been revised.

The new structure is, we hope, much more logical. Moreover, we have - as far as is possible - used terms and relationships of Saier et al's Transport Classification (see Microbio. Molec. Biol. Revs 64(2): 354-411 and http://www-biology.ucsd.edu/~msaier/transport/). Database cross-references to this have been added, with the prefix TC:. The problems are that (a) the basis of the TC is protein structure, and not function, and (b) there are differences in terms between the published and web versions of the database. We have also used many of the definitions from this valuable resource.

There may be a one to many relationship between a GO term and the TC.

We doubt that this revision is error free. In general, we have only added terms for Eukarya and E. coli. Please report any errors or problems to GO.

The revision of the children of %transport ; GO:0006810 in $biological_process is now underway.

[ma]
!

** September 26 2000: Terms for prokaryotes added to GO.
=====================================================

Thanks to the enormous help of Monica Riley and Gretta Serres of GenProtEC and of Michelle Gwinn of TIGR we have added terms to all three ontologies that are relevant for the annotation of bacterial genomes with respect to GO.

In practice, we are sure that genome annotators will require new terms, especially for genomes other than of enteric bacteria. If so, please contact GO. Please also inform GO of any errors.

[ma]
!

** September 21 2000: E. coli enzymes added.
=========================================

$molecular_function has been updated with about 360 new enzymes to allow it to be used for E.
coli. I thank Dr. Gretta Serres of the MBL, Woods Hole for providing us with a list of the E. coli enzymes from GenProtEC (http://genprotec.mbl.edu/start).

We hope to have updated $cellular_component with the E. coli multimeric proteins within a few days.

[ma]
!

** August 2 2000: Revised evidence codes.
======================================

The descriptions and codes for two evidence categories have been changed to improve clarity.

1. old: ASS, "author said so"
   new: TAS, traceable author statement

2. old: NA, not available
   new: NAS, non-traceable author statement

A full description of each evidence category, and how it is used, can be found in the GO.evidence document.

[mah]
!

** July 24 2000: New evidence line.
================================

A new "evidence" criterion for gene product annotation has been added:

inferred from electronic annotation [to ]

[ma]
!

** July 24 2000: Changes to the syntax of the gene_association files.
===================================================================

There are three changes to the syntax of the gene_association files.

1. The cardinality of the "ref" and "evidence" fields is now 1, not 1, >1 as before. Thus there will now be a separate row for each unique reference-evidence pair.

2. Many of the "evidence" statements allow links to other database objects. For example a gene product may be annotated by FlyBase as being:

"dimethyl gluctase, inferred from sequence similarity with SWP:P99999"

3. Database annotators can now negate an association between a gene product and a GOid. This is done by prefixing the GOid with the string NOT.

The syntax of the gene_association files now includes these database cross references as a new field. The new syntax is (fields are separated by tabs):

DB Gene_id Gene_symbol GOid ref evidence with aspect name(|name) synonym(|synonym)

These changes will be implemented for gene_association files:

*.fb > version 1.19
*.mgi > version 1.5
*.pombase > version 1.2
*.sgd > version 1.139

[ma]
!
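
The gene_association row syntax described in the July 24 2000 entry above can be read with a few lines of code. The following is a minimal Python sketch, not part of the GO distribution. It assumes exactly the ten tab-separated fields listed in that entry, and it assumes that a negated association carries the string NOT at the start of the GOid field (the entry does not say whether a separator follows NOT, so any surrounding white space is simply stripped). The class name, function names and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Number of tab-separated fields in the syntax given in the entry above:
# DB Gene_id Gene_symbol GOid ref evidence with aspect name(|name) synonym(|synonym)
N_FIELDS = 10


@dataclass
class Association:
    """One row of a gene_association file (hypothetical container)."""
    db: str
    gene_id: str
    gene_symbol: str
    go_id: str
    negated: bool        # True when the GOid field was prefixed with NOT
    ref: str
    evidence: str
    with_: str           # database cross reference supporting the evidence
    aspect: str
    names: List[str]
    synonyms: List[str]


def split_pipe(field: str) -> List[str]:
    """Split a name(|name) style field, dropping empty items."""
    return [item for item in field.split("|") if item]


def parse_association(row: str) -> Association:
    """Parse one tab-separated row into an Association."""
    cols = row.rstrip("\r\n").split("\t")
    if len(cols) != N_FIELDS:
        raise ValueError(f"expected {N_FIELDS} fields, got {len(cols)}")
    (db, gene_id, gene_symbol, go_field, ref,
     evidence, with_, aspect, names, synonyms) = cols
    # Assumption: NOT appears at the very start of the GOid field.
    negated = go_field.startswith("NOT")
    go_id = go_field[len("NOT"):].strip() if negated else go_field.strip()
    return Association(db, gene_id, gene_symbol, go_id, negated, ref,
                       evidence, with_, aspect,
                       split_pipe(names), split_pipe(synonyms))


if __name__ == "__main__":
    # Hypothetical row; none of these identifiers come from a real file.
    sample = "\t".join(["FB", "FBgn0000000", "jubi", "NOT GO:0005634",
                        "FB:FBrf0000000", "ISS", "SWP:P99999", "C",
                        "jubi|jubilee", ""])
    print(parse_association(sample))
```

Because the cardinality of "ref" and "evidence" is now 1, each row maps to exactly one Association; a gene product with several reference-evidence pairs simply appears on several rows.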

** July 24 2000: Schizosaccharomyces pombe gene associations available.
====================================================================

Valerie Wood of the Sanger Centre has annotated the S. pombe sequence with respect to GO biological_process, see:

http://www.sanger.ac.uk/Projects/S_pombe/FUNCAT/funcat.shtml

An S. pombe gene association file is available from:

go/POMBASE_GO_files/gene_association.pombase

[ma]
!

** July 22 2000: In-line references to MEDLINE deleted from ontologies.
====================================================================

In versions prior to:

cellular_component Revision: 1.69
molecular_function Revision: 1.123
biological_process Revision: 1.105

terms may have had an in-line bibliographic reference, e.g.

DNA-nonhomologous end-joining ; GO:0006303 ; MEDLINE:99027923

These have now all been removed and references for terms are now found only in the file: go/doc/GO.defs.

Note that this does not change the line syntax, which remains:

< | % term [; db cross ref]* [; synonym:text]* [ < | % term]*

[The reason that there is no change is that 'reference' was regarded as a type of 'db cross ref'; 'db cross ref', e.g. to ENZYME, remains valid in the ontologies.]

[ma]
!
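
Users who still hold ontology files older than the revisions listed in the July 22 2000 entry may want to bring them into line with the current files. The following is a minimal Python sketch of that clean-up, not the procedure the GO editors used. It assumes that an in-line reference always has the form "; MEDLINE:" followed by digits, as in the example in that entry; the function name is hypothetical.

```python
import re

# Assumption: an in-line reference is a ';'-separated field of the form
# MEDLINE:<digits>, as in the example in the entry above.
MEDLINE_FIELD = re.compile(r"\s*;\s*MEDLINE:\d+")


def strip_medline_refs(line: str) -> str:
    """Remove in-line MEDLINE references from one ontology line.

    Leading indentation, the '<' or '%' relationship symbol, the term,
    other database cross references and synonyms are left untouched.
    """
    return MEDLINE_FIELD.sub("", line)


if __name__ == "__main__":
    old = "DNA-nonhomologous end-joining ; GO:0006303 ; MEDLINE:99027923"
    print(strip_medline_refs(old))
```

Only the reference field is removed, so other 'db cross ref' fields, which remain valid in the ontologies, pass through unchanged.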