GO Cardiovascular: Introduction to GO Annotation for the Cardiovascular Initiative

A Rough Guide to GO Annotation

In simple terms, a GO annotation is the manual or electronic association of a GO term, representing a biological process, cellular component, or molecular function, with a gene product. The evidence for the association is captured by recording the reference, and the nature of the evidence is classified using a code from the GO evidence code set. Every GO annotation created by the GO Consortium follows strict annotation guidelines; see the GO annotation guide for more information.
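
As a rough illustration, the components of an annotation described above can be modelled as a small record. The Python sketch below is not an official GO file format or API: the class name GOAnnotation and its field names are hypothetical, chosen only for this example, and the sample values are taken from the SMAD3 table further down this page.

```python
from dataclasses import dataclass


# Hypothetical record type for illustration only; the field names are
# assumptions, not an official GO data model.
@dataclass(frozen=True)
class GOAnnotation:
    gene_product: str   # the annotated gene product, e.g. a gene symbol
    go_id: str          # GO term identifier
    go_term: str        # GO term name
    aspect: str         # biological_process, molecular_function or cellular_component
    evidence_code: str  # code from the GO evidence code set
    reference: str      # source of the evidence, e.g. a PubMed identifier


# One row of the SMAD3 example table expressed as a record.
example = GOAnnotation(
    gene_product="SMAD3",
    go_id="GO:0005634",
    go_term="nucleus",
    aspect="cellular_component",
    evidence_code="IDA",
    reference="PMID:12446380",
)

print(example)
```

Keeping the evidence code and reference on every record means each association can be traced back to the evidence that supports it.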

The process of annotation (the association of GO terms with gene products) can be carried out either manually or automatically. Large-scale assignment of GO terms to gene products using automatic methods is a fast and efficient way of creating a large set of annotations. However, to keep these associations correct, many electronic GO annotations use high-level (i.e. rather general) GO terms and often provide only a minimal overview of the associated gene products' functions. Details of the electronic methods applied can be found in Camon et al., 2004 and on the GO References page. Manual, comprehensive GO annotation, created by assessing experimental evidence from the latest published literature, produces far more reliable and detailed annotation sets. Manual annotation is a slow, expensive activity, however, so annotation groups need to prioritize their annotation target sets carefully.
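
In practice, electronic and manual annotations can usually be told apart by their evidence code: electronic annotations carry the code IEA (Inferred from Electronic Annotation), while curators assign codes such as IDA, IMP, IPI or TAS. The sketch below separates the two groups under that assumption. The function name split_by_method is hypothetical, and the IEA row in the sample data is invented for illustration rather than taken from a real annotation file.

```python
# Evidence code used for electronic annotation.
ELECTRONIC_CODES = {"IEA"}


def split_by_method(annotations):
    """Split (go_id, evidence_code) pairs into manual and electronic lists."""
    manual, electronic = [], []
    for go_id, code in annotations:
        if code in ELECTRONIC_CODES:
            electronic.append((go_id, code))
        else:
            manual.append((go_id, code))
    return manual, electronic


# Sample data: the first row is a made-up electronic annotation (assumption);
# the other rows use GO IDs and codes from the SMAD3 example on this page.
sample = [
    ("GO:0005622", "IEA"),
    ("GO:0005634", "IDA"),
    ("GO:0046332", "IPI"),
]

manual, electronic = split_by_method(sample)
print("manual:", manual)
print("electronic:", electronic)
```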

The cardiovascular GO annotation initiative will focus on manual annotation of human genes known, or thought, to be involved in cardiovascular processes. Since November 2007, two annotators skilled in GO curation have been manually and comprehensively annotating the experimental literature associated with the list of cardiovascular-relevant genes. During the course of this project, 1500 well-known genes relevant to the cardiovascular system will be comprehensively annotated, with the close involvement of many cardiovascular groups.

The example below illustrates the improved information that focused GO annotation can supply:

Manual annotation of the human SMAD3 gene before the start of the cardiovascular GO annotation initiative
GO term Evidence code Reference

Biological Process (2 terms)

GO:0006366 : transcription from RNA polymerase II promoter TAS PMID:10823886
GO:0007179 : transforming growth factor beta receptor signaling pathway TAS PMID:8774881

Molecular Function (2 terms)

GO:0003700 : transcription factor activity TAS PMID:10823886
GO:0005515 : protein binding IPI PMID:14612439

Cellular Component (1 term)

GO:0005622 : intracellular IC PMID:14612439

Manual annotation of the human SMAD3 gene after GO annotation by cardiovascular GO annotation initiative curators (GO terms with multiple entries are shown only once)
GO term Evidence code Reference

Biological Process (14 terms)

GO:0006366 : transcription from RNA polymerase II promoter TAS PMID:10823886
GO:0000122 : negative regulation of transcription from RNA polymerase II promoter IDA PMID:8774881
GO:0001666 : response to hypoxia IMP PMID:12411310
GO:0006917 : induction of apoptosis IMP PMID:15334054
GO:0006919 : caspase activation IMP PMID:15107418
GO:0007050 : cell cycle arrest IMP PMID:14555988
GO:0007183 : SMAD protein complex assembly IDA PMID:10823886
GO:0017015 : regulation of transforming growth factor beta receptor signaling pathway IMP PMID:8774881
GO:0019049 : evasion of host defenses by virus IDA PMID:15334054
GO:0030308 : negative regulation of cell growth IDA PMID:8774881
GO:0032909 : regulation of transforming growth factor-beta2 production IMP PMID:12411310
GO:0042993 : positive regulation of transcription factor import into nucleus IDA PMID:15799969
GO:0045930 : negative regulation of mitotic cell cycle IMP PMID:14555988
GO:0045944 : positive regulation of transcription from RNA polymerase II promoter IDA PMID:8774881

Molecular Function (6 terms)

GO:0003700 : transcription factor activity TAS PMID:10823886
GO:0005515 : protein binding IPI PMID:14612439
GO:0015460 : transport accessory protein activity IDA PMID:15799969
GO:0042803 : protein homodimerization activity IPI PMID:8774881
GO:0043565 : sequence-specific DNA binding IDA PMID:10823886
GO:0046332 : SMAD binding IPI PMID:8774881

Cellular Component (4 terms)

GO:0005622 : intracellular IC PMID:14612439
GO:0005634 : nucleus IDA PMID:12446380
GO:0005737 : cytoplasm IDA PMID:12446380
GO:0043235 : receptor complex IMP PMID:8774881
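
The gain between the two tables can also be checked programmatically. The sketch below parses rows written in the layout used above (GO ID, term name, evidence code, reference) and lists the terms that appear only after curation. It is a minimal example under stated assumptions: the regular expression and the function name parse_rows are hypothetical helpers written for this page's layout, not part of any GO tool, and only the Cellular Component rows are included to keep it short.

```python
import re

# Matches rows such as "GO:0005634 : nucleus IDA PMID:12446380".
# Assumes a 2-3 letter upper-case evidence code followed by a PMID.
ROW = re.compile(r"^(GO:\d{7}) : (.+) ([A-Z]{2,3}) (PMID:\d+)$")


def parse_rows(text):
    """Return (go_id, term, evidence_code, reference) tuples for matching lines."""
    rows = []
    for line in text.strip().splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groups())
    return rows


# Cellular Component rows copied from the two SMAD3 tables above.
before = """
GO:0005622 : intracellular IC PMID:14612439
"""

after = """
GO:0005622 : intracellular IC PMID:14612439
GO:0005634 : nucleus IDA PMID:12446380
GO:0005737 : cytoplasm IDA PMID:12446380
GO:0043235 : receptor complex IMP PMID:8774881
"""

before_ids = {row[0] for row in parse_rows(before)}
new_rows = [row for row in parse_rows(after) if row[0] not in before_ids]

for go_id, term, code, ref in new_rows:
    print(go_id, term, code, ref)
```

The same comparison can be repeated for the Biological Process and Molecular Function rows by pasting them into the two strings.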

Recommended reading

The following article provides a general introduction to GO for biologists:

Lomax J. Get ready to GO! A biologist's guide to the Gene Ontology. Brief. Bioinformatics. Sep 2005;6(3):298-304. [PMID:16212777]

The following article provides a general introduction to the UniProtKB-GOA database at EBI and explains electronic and manual annotation techniques:

Camon E, Magrane M, Barrell D, Lee V, Dimmer E, Maslen J, Binns D, Harte N, Lopez R, Apweiler R. The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res. Jan 2004;32(Database issue):D262-6. [PMID:14681408 | doi:10.1093/nar/gkh021]

The following article provides a summary of the aims of the Cardiovascular GO Annotation Initiative:

Lovering RC, Dimmer E, Khodiyar VK, Barrell DG, Scambler P, Hubank M, Apweiler R, Talmud PJ. Cardiovascular GO annotation initiative year 1 report: why cardiovascular GO? Proteomics. May 2008;8(10):1950-3. [PMID:18491309 | doi:10.1002/pmic.200800078]

The following article provides an overview of GO, how to use GO and some of the tools available for the analysis of high-throughput data:

Dimmer EC, Huntley RP, Barrell DG, Binns D, Draghici S, Camon EB, Hubank M, Talmud PJ, Apweiler R, Lovering RC. The Gene Ontology - Providing a Functional Role in Proteomic Studies. Proteomics. Jul 2008. [PMID:18634107 | doi:10.1002/pmic.200800002]