Gene Ontology Newsletter

Issue No. 2

August 2006

Paper Highlight: Using GO to Evaluate Interactions and Networks

Interaction networks derived from high-throughput (HTP) methods, such as proteome-wide purification of protein complexes using affinity-tagged proteins, have become important resources for predicting gene function and studying complex biological networks, but the specificity of these networks has been difficult to assess. A recent paper from Mike Tyers' lab (University of Toronto), whose data are incorporated into their BioGRID resource, compares the results of HTP datasets with those of individual experiments described in the literature. The paper (Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, Hon GC, Myers CL, Parsons A, Friesen H, Oughtred R, Tong A, Stark C, Ho Y, Botstein D, Andrews B, Boone C, Troyanskya OG, Ideker T, Dolinski K, Batada NN, Tyers M. [PMID:16762047 | doi:10.1186/jbiol36]) describes a comprehensive dataset of genetic and physical interactions for the budding yeast Saccharomyces cerevisiae. Using GO Biological Process annotations curated from the primary literature by the Saccharomyces Genome Database (SGD), the authors analyzed the extent of shared GO annotations between interaction pairs defined in the literature-curated (LC) dataset versus those defined in HTP datasets. The results indicate that including the LC dataset in the analysis significantly improves gene function prediction. The LC dataset assembled by Reguly et al. thus provides not only a valuable new resource for annotating a gene's role in biology, but also an important benchmark against which the accuracy of HTP datasets can be measured. The complete LC dataset is available from BioGRID and mirrored at SGD.
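The idea of comparing shared GO annotations between interaction pairs can be sketched in a few lines of Python. This is only a minimal illustration and not the statistic used by Reguly et al.: it assumes a simple measure (the fraction of pairs whose two genes share at least one Biological Process term), and the gene names, term labels, and the function name shared_annotation_fraction are all invented for the example.

```python
from typing import Dict, Iterable, Set, Tuple


def shared_annotation_fraction(
    pairs: Iterable[Tuple[str, str]],
    annotations: Dict[str, Set[str]],
) -> float:
    """Fraction of interaction pairs whose two genes share at least one term.

    Pairs in which either gene has no annotation are skipped, since they
    cannot be scored either way.
    """
    scored = 0
    shared = 0
    for gene_a, gene_b in pairs:
        terms_a = annotations.get(gene_a)
        terms_b = annotations.get(gene_b)
        if not terms_a or not terms_b:
            continue
        scored += 1
        if terms_a & terms_b:  # non-empty intersection: a shared term
            shared += 1
    return shared / scored if scored else 0.0


# Toy data, invented for illustration; "P1".."P3" stand in for GO term IDs.
annotations = {
    "geneA": {"P1", "P2"},
    "geneB": {"P2"},
    "geneC": {"P3"},
    "geneD": {"P1", "P3"},
}
lc_pairs = [("geneA", "geneB"), ("geneC", "geneD")]
htp_pairs = [("geneA", "geneC"), ("geneB", "geneD"), ("geneA", "geneD")]

lc_fraction = shared_annotation_fraction(lc_pairs, annotations)
htp_fraction = shared_annotation_fraction(htp_pairs, annotations)
```

A higher fraction for one dataset would suggest that its interacting pairs are more often involved in the same process. A real analysis would also need to account for the ontology structure and for how common each term is.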

Expansion of Central Nervous System Development Representation

Current genetic and molecular studies in many model organisms are aimed at understanding the formation and development of the nervous system. Until now, GO has had only a shallow representation of processes pertaining to the nervous system. In June 2006, curators from Mouse Genome Informatics and the Zebrafish Information Network met with researchers studying central nervous system development to improve the representation of these processes in GO. Emphasis was placed on three areas of active research: forebrain development, hindbrain development, and neural tube development. This collaboration resulted in the addition of over 500 terms that reflect the development of the forebrain, the hindbrain, and the neural tube from the perspective of biological process and anatomical structure.

Representing Interactions between Organisms

The collaboration between PAMGO (Plant-Associated Microbe Gene Ontology) and the GO Consortium continued with a jamboree to discuss new terms that capture processes such as the modification of host structures and microbe responses to host defenses. Terms representing the interactions of symbionts and hosts can be found under the interaction between organisms ; GO:0051704 node of the Biological Process ontology.

Annotation Camp 2006

The 3rd GO Annotation Camp was held at Stanford University on July 10–14, 2006. During the first two days, members of the GO Consortium met to discuss annotation standards, focusing on the consistent use of evidence codes. Discussion included clarifying the usage guidelines for cases in which more than one evidence code could support a GO annotation (e.g., a protein binding experiment may be annotated with the IDA [Inferred from Direct Assay] or IPI [Inferred from Physical Interaction] evidence code). In addition, suggestions were made to refine the use of certain codes or extend the use of others in light of new experimental approaches. All of the recommendations will be incorporated into revised evidence code documentation that will clarify the use of codes and should improve annotation consistency for both new and current groups.

The second part of the meeting was devoted to training scientists and curators from more than 20 organism databases and scientific communities to manually curate GO annotations from the primary literature. The training included presentations by GO Consortium members about the ontologies, the evidence codes, and the process of annotation. Participants also gained practical experience by annotating a representative set of ten publications in small groups, each guided by an experienced GO curator. This set of reference papers, annotated under the new guidelines, represents a valuable training resource that will be available on the GO web site.

Thanks to Incyte Genomics and the Department of Genetics at the School of Medicine, Stanford University, for supporting this meeting. If you are interested in attending next year's annotation camp, please contact the GO helpdesk.

Photo of GO group at the 2006 GO Annotation Camp

OBO-Edit Released

OBO-Edit, formerly known as DAG-Edit, has been rebuilt from the ground up, with improved speed and stability and a more intuitive interface. New features include support for the OBO 1.0 and 1.2 specifications, basic reasoning capabilities, cross-product editing, a full user's guide, and bug fixes. OBO-Edit can be downloaded from the OBO-Edit website.

Removing 'unknown' Terms

The 'unknown' terms in GO — biological process unknown ; GO:0000004, molecular function unknown ; GO:0005554 and cellular component unknown ; GO:0008372 — will be removed from GO on October 16, 2006. Current annotations to 'unknown' terms, which are made when curators have looked at all of the published literature about a given gene product and failed to establish its function, biological process or cellular component, will migrate to the top level terms — biological process ; GO:0008150, molecular function ; GO:0003674, and cellular component ; GO:0005575. Annotation to these terms will retain the evidence code ND [No Data].
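For groups that maintain their own annotation files, the migration described above amounts to a three-entry lookup. The sketch below is a minimal illustration under stated assumptions: the term IDs are those listed in this newsletter, but the tuple layout, the function name migrate_annotation, and the sample gene product are hypothetical and do not reflect any particular annotation file format.

```python
# Mapping from each 'unknown' term to its top-level term (IDs from this newsletter).
UNKNOWN_TO_ROOT = {
    "GO:0000004": "GO:0008150",  # biological process unknown -> biological process
    "GO:0005554": "GO:0003674",  # molecular function unknown -> molecular function
    "GO:0008372": "GO:0005575",  # cellular component unknown -> cellular component
}


def migrate_annotation(gene_product: str, go_id: str, evidence: str):
    """Return the annotation with any 'unknown' term replaced by its root term.

    The evidence code is passed through unchanged, so ND annotations stay ND.
    Annotations to any other term are returned as they were.
    """
    return (gene_product, UNKNOWN_TO_ROOT.get(go_id, go_id), evidence)


# Hypothetical example annotation, for illustration only.
migrated = migrate_annotation("geneX", "GO:0005554", "ND")
```

Because the evidence code is untouched, an annotation to a root term with ND carries the same meaning the 'unknown' annotation did: the literature was reviewed and nothing could be established.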

New File in OBO 1.2 Format

Beginning September 15, 2006, the GO Consortium will make available an additional ontology file, gene_ontology_edit.obo, in OBO 1.2 format. The main difference between the two files is that gene_ontology_edit.obo will specify the replacement terms for obsolete terms in dedicated tags, whereas gene_ontology.obo currently records them in the comment field. For more information see the OBO 1.2 format specification. The current gene_ontology.obo file in OBO 1.0 format will continue to be produced.
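Holding replacements in tags rather than free-text comments makes them straightforward to read by program. The following is a minimal sketch, not a full OBO parser. It assumes the replacement tag is named replaced_by and that obsolete terms carry is_obsolete: true, as we understand the OBO 1.2 specification (the tag names are not given in this newsletter), and it assumes stanzas are separated by blank lines. The sample stanzas and their GO IDs are invented for illustration.

```python
from typing import Dict, List


def replacement_terms(obo_text: str) -> Dict[str, List[str]]:
    """Map each obsolete term ID to the IDs given in its replaced_by tags."""
    replacements: Dict[str, List[str]] = {}
    for stanza in obo_text.strip().split("\n\n"):
        lines = [ln.strip() for ln in stanza.splitlines() if ln.strip()]
        if not lines or lines[0] != "[Term]":
            continue  # skip the file header and non-term stanzas
        tags: Dict[str, List[str]] = {}
        for ln in lines[1:]:
            tag, sep, value = ln.partition(":")
            if not sep:
                continue
            # Simplification: treat everything after "!" as a trailing comment.
            value = value.split("!", 1)[0].strip()
            tags.setdefault(tag.strip(), []).append(value)
        if "id" in tags and tags.get("is_obsolete") == ["true"]:
            replacements[tags["id"][0]] = tags.get("replaced_by", [])
    return replacements


# Invented sample; these IDs are placeholders, not real GO terms.
SAMPLE = """\
format-version: 1.2

[Term]
id: GO:9999901
name: hypothetical obsolete term
is_obsolete: true
replaced_by: GO:9999902

[Term]
id: GO:9999902
name: hypothetical replacement term
"""

result = replacement_terms(SAMPLE)
```

An obsolete term with no replacement tag maps to an empty list, so callers can tell "obsolete with no suggested replacement" apart from "not obsolete".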

Contact GO

To receive this newsletter and other announcements from the GO Consortium, please subscribe to the GO Friends mailing list.

Please contact the Gene Ontology Consortium with any comments or suggestions. Frequently asked questions will appear as tutorials or tips in upcoming newsletters.