Guidelines for picking GO terms for Annotations

Annotation is the process of assigning GO terms to gene products. The annotation data in the GO database is contributed by members of the GO Consortium, and the Consortium actively encourages new groups to start contributing annotations. Annotations can be made from published literature, where a curator reads and interprets the experiments and results presented in a paper, or they can be inferred automatically using sequence information or keyword mapping. Details on how to make automatic inferences can be found on the Electronic Annotation page. The GO annotation guide describes the annotation process in more detail; other pages of interest include the GO annotation conventions, the standard operating procedures used by some Consortium members, and the GO annotation file format guide.

GO Term Usage Guide

'Response to' guidelines

  • High-level 'response to' terms should not be used directly for annotation. A list of high-level terms deemed unsuitable for annotation is available here.
  • Update guidelines: expression experiments should not be annotated to 'response to' terms.

Downstream Effects

  • To be added: guidelines from http://gocwiki.geneontology.org/index.php/Guidelines_from_Annotation_Camp

Use of Regulation terms

  • To be added: guidelines from http://gocwiki.geneontology.org/index.php/Guidelines_from_Annotation_Camp

Annotating gene products that interact with other organisms

The majority of gene products act within the organism that encoded them. However, sometimes gene products encoded by one organism can act on or in other organisms. For example, in obligate parasitic species (including viruses), almost all their gene products will be interacting with their host organism. Interactions may also be between organisms of the same species: for example, the proteins used by bacteria to adhere to one another to form a biofilm.

For annotating gene products involved in these multi-organism interactions, there are special terms in the biological process ontology, under multi-organism process, and in the cellular component ontology, under host. More specific information can be found in the biological process documentation on multi-organism processes and in the cellular component guidelines on host cell.

The species in the interaction are recorded in an annotation by using terms from this node and entering two taxon IDs in the Taxon column. The first taxon ID should be that of the species encoding the gene product, and the second should be the taxon of the other species in the interaction. Where the interaction is between organisms of the same species, both taxon IDs should be the same. The taxon column of the annotation file is described in more detail in the annotation file format guide.

An additional taxon ID should not be added in cases where the annotation is based on sequence or structural similarity.
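The taxon rules above can be sketched in a few lines of Python. This is only an illustration: the function name taxon_column and the set of evidence codes treated as similarity-based (ISS, ISA, ISO, ISM) are assumptions made for this example, not part of any GO tool, so check the annotation file format guide before relying on them.

```python
# Hypothetical helper: the name and the evidence-code set below are
# assumptions made for illustration only.
SIMILARITY_EVIDENCE_CODES = {"ISS", "ISA", "ISO", "ISM"}


def taxon_column(encoding_taxon, other_taxon=None, evidence_code=None):
    """Build the Taxon column value for a single annotation.

    encoding_taxon: taxon ID of the species encoding the gene product.
    other_taxon:    taxon ID of the other species in the interaction,
                    or None when there is no multi-organism interaction.
    evidence_code:  optional evidence code for the annotation.
    """
    first = f"taxon:{encoding_taxon}"
    # Only one taxon ID is written when there is no interaction partner
    # or when the annotation is based on sequence/structural similarity.
    if other_taxon is None or evidence_code in SIMILARITY_EVIDENCE_CODES:
        return first
    # The encoding species comes first, the interacting species second.
    return f"{first}|taxon:{other_taxon}"


print(taxon_column(382, 3880))                        # taxon:382|taxon:3880
print(taxon_column(7165, 9606, evidence_code="ISS"))  # taxon:7165
```

For an interaction within a single species, passing the same ID twice yields a value with both taxon IDs identical, as described above.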

Nomenclature Conventions

The terms 'symbiont' and 'host' may carry connotations of the nature of the interaction between two organisms, but in the Gene Ontology they are used solely to differentiate between organisms on the basis of their size. The word 'symbiont' refers to the smaller organism in a symbiotic interaction; the larger organism is called the 'host'. If the two organisms are the same size, the term name will contain 'other organism' instead. Note that parasites and pathogens are also referred to as 'symbionts', as symbiosis encompasses parasitism, commensalism and mutualism.
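As a minimal sketch of this size-based convention, the following Python function assigns the labels from two relative sizes. The function name and the use of plain numbers for organism size are assumptions made for illustration; in practice a curator judges relative size, not a program.

```python
def interaction_roles(size_a, size_b):
    """Return the GO role labels for organisms A and B.

    The labels depend only on relative size, not on whether the
    interaction is parasitic, commensal or mutualistic.
    """
    if size_a < size_b:
        return ("symbiont", "host")
    if size_a > size_b:
        return ("host", "symbiont")
    # Organisms of the same size are both referred to as 'other organism'.
    return ("other organism", "other organism")


# A bacterium (smaller) interacting with a plant (larger):
print(interaction_roles(1, 1000))  # ('symbiont', 'host')
```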

Requesting new terms in the multi-organism process node

Like the rest of GO, the multi-organism process node is not complete, and you will probably have to request some new terms when annotating your gene products. These should be submitted via the GO curator requests tracker in the usual way. Here are a few points to bear in mind when requesting new terms, and annotating using this node:

  • A term name should make the direction of the interaction clear. For example, induction of nodule morphogenesis in host would be used to annotate the symbiont gene product, while induction of nodule morphogenesis by symbiont would be used to annotate the host gene products. Both processes would be children of a common term, nodulation.
  • If your gene product affects a 'normal' host process, you should always request a new term in the multi-organism process node, rather than annotating directly to the term in the 'normal' ontology. For example, if your bacterial gene product regulates the ethylene-mediated signaling pathway in plants, rather than using dual taxon IDs to annotate to regulation of ethylene mediated signaling pathway ; GO:0010104, you should instead request a new term, regulation of ethylene mediated signaling pathway in host.
  • Where an organism subverts a 'normal' biological process, e.g. the transcription of viral DNA by host transcription machinery, host proteins should not be annotated to a 'symbiont' term like transcription of symbiont DNA. This is because it would be considered a pathological process, i.e. not 'normal' for the host.
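The naming pattern from the first two points can be illustrated with a short Python sketch. The function name and its arguments are hypothetical, and the output is only a suggested name for a term request; actual term names are agreed through the GO curator requests tracker.

```python
def directional_term_name(process, annotated_organism):
    """Suggest a term name that makes the direction of interaction clear.

    process:            name of the underlying process.
    annotated_organism: 'symbiont' or 'host', i.e. which organism
                        encodes the gene product being annotated.
    """
    if annotated_organism == "symbiont":
        # The symbiont gene product acts on a process in the host.
        return f"{process} in host"
    if annotated_organism == "host":
        # The host gene product takes part in a process driven by the symbiont.
        return f"{process} by symbiont"
    raise ValueError("annotated_organism must be 'symbiont' or 'host'")


print(directional_term_name("induction of nodule morphogenesis", "symbiont"))
print(directional_term_name("induction of nodule morphogenesis", "host"))
```

The two calls print the term names used in the nodulation example: one for the symbiont gene product and one for the host gene product.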

Example: Performing a process with another organism

Nod factor export proteins transfer nod factors out of the purple bacterium Sinorhizobium meliloti into the surrounding soil. Here they are detected by LysM nod factor receptor kinases in Medicago truncatula roots and initiate the process of nodulation.

Annotation of Nod factor export ATP-binding protein I from S. meliloti

suggest a new term induction of nodule morphogenesis in host

nodulation ; GO:0009877
[p] induction of nodule morphogenesis in host ; GO:00new01

Sinorhizobium meliloti taxonomy ID: 382
Medicago truncatula taxonomy ID: 3880

protein name: Nod factor export ATP-binding protein I
GO term: induction of nodule morphogenesis in host ; GO:00new01
taxon column: taxon:382|taxon:3880

Annotation of LysM receptor kinase LYK3 precursor from M. truncatula

suggest a new term induction of nodule morphogenesis by symbiont

nodulation ; GO:0009877
[p] induction of nodule morphogenesis by symbiont ; GO:00new02

Medicago truncatula taxonomy ID: 3880
Sinorhizobium meliloti taxonomy ID: 382

protein name: LysM receptor kinase LYK3 precursor
GO term: induction of nodule morphogenesis by symbiont ; GO:00new02
taxon column: taxon:3880|taxon:382

Example: Performing a process in more than one species

The protein cardiotoxin from the southern Indonesian spitting cobra Naja sputatrix kills mammalian cells by cytolysis when it enters the host cell cytoplasm.

Annotation of cardiotoxin precursor, from N. sputatrix

use the GO terms cytolysis of cells of another organism ; GO:0051715 and host cell cytoplasm ; GO:0030430

Naja sputatrix taxonomy ID: 33626
Mammalia taxonomy ID: 40674

protein name: cardiotoxin precursor
GO term: cytolysis of cells of another organism ; GO:0051715
taxon column: taxon:33626|taxon:40674

protein name: cardiotoxin precursor
GO term: host cell cytoplasm ; GO:0030430
taxon column: taxon:33626|taxon:40674

Example: Regulating a process in another organism

Mosquito saliva contains D7 proteins, which bind biogenic amines in order to suppress hemostasis in humans.

Annotation of D7 protein long form, from A. gambiae

suggest a new term negative regulation of hemostasis in host

evasion of host defense response ; GO:0030682
[i] negative regulation of hemostasis in host ; GO:00new03

Anopheles gambiae taxonomy ID: 7165
Homo sapiens taxonomy ID: 9606

protein name: D7 protein long form
GO term: negative regulation of hemostasis in host ; GO:00new03
taxon column: taxon:7165|taxon:9606

Annotation mailing list

All Consortium annotators should subscribe to the GO discussion mailing list, which provides a forum for the discussion of annotations and specific use questions. Subscription details and archived posts are available on the annotation mailing list information page.

Specific annotation queries can be submitted to the GO annotation tracker at SourceForge. For general queries about annotation not answered by this page, please email the GO helpdesk.

GO Annotation Resources

For more information on annotation, please see the following resources: