How to Submit GO Annotations?

This page provides details on the requirements for supplying GO annotations to the GO Consortium (GOC). For more general information on GO annotation, please see the GO annotation guide.

Who can submit GO annotations?

  • Groups that have GO annotations for a new genome or species
  • Users who find gaps or inaccuracies in the current set of GO annotations and want to send corrections or new annotations to the GOC

How can I contribute missing GO annotations or updates to GO annotations?

When users find errors or omissions in the GO annotation set, they are strongly encouraged to contact the GO Consortium with details of the data that requires improvement. In all cases, it is extremely useful to include in your feedback:
  • An identifier for the gene product/gene of interest
  • A citation (such as a PubMed identifier) for a publicly available reference that supports the need for annotation improvement
  • Some details on the specific annotations that should be reviewed or added.

How can I contribute a large set of GO annotations?

  • Single file (one-time) contribution
  • Some research communities do not have an established annotation group with the funding and time to commit to long-term maintenance of their GO annotation datasets. Such groups can still contribute annotations to the central GO Consortium repository on a single-submission basis. The external group is fully acknowledged as the creator of the annotations, on the understanding that it cannot commit to any future updates of the annotation set. Once the GO Consortium annotation group has accepted the annotations, the GO Consortium immediately becomes responsible for their future maintenance. This covers updates in response to user feedback, changes in the GO (for example, GO terms being made secondary or obsolete), changes to annotated sequence identifiers, and annotation format changes. Changes made by GO Consortium curators to submitted annotations will be attributed to the GO Consortium. It is vital that submitters contact the GO Consortium before submitting their annotation file, so that the Consortium can work with the submitting group to ensure the highest-quality data possible. Interested annotation efforts should contact the GO Consortium.

  • Ongoing GO annotation contributions/collaborations
  • Alternatively, annotation groups may choose to supply the GO Consortium with an annotation file on a regular basis. In this case, the annotation group remains responsible for maintaining and improving the annotation set, and for responding to any requests for annotation changes.

What are the requirements for submitting annotations?

  • The external curation group should contact the GO Consortium before annotation work begins. This allows the GO Consortium to allocate mentors/trainers and helps ensure that the data produced will satisfy all GO Consortium annotation and format requirements.
  • Please read the document on the minimum requirements for submitting GO annotations.
  • All annotation sets should be supplied in standard GO Consortium annotation format, such as GAF 2.0. Guidance on how to create an annotation file is available in this FAQ.
  • GO annotations should ideally be made to UniProtKB accessions (e.g. P12345) or NCBI accessions. Where alternative identifier types are used, they need to be stable, and a gp2protein file (see ID specification page) must be generated and submitted alongside the annotation file.
  • As described for the GAF 2.0 format, the contributing annotation group must supply a name that will be used to acknowledge their annotation set. This name will be visible in the 'assigned_by' field (column 15) of all annotation lines contributed by the group, and will be included in the list of annotation providers.
  • For an 'Ongoing Annotation Contribution/Collaboration', a primary contact person or email list must be supplied, so that any annotation requests can be fed back to the group and acted upon in a timely manner. This information should be submitted in a gene_association.conf file (example config file).
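
Before sending a file, it can help to run a quick local check against the format requirements listed above. The following is a minimal sketch, not an official GOC validator: the function names are hypothetical, and the set of mandatory columns reflects our reading of the GAF 2.0 layout (17 tab-separated columns, comment lines starting with '!', 'assigned_by' in column 15). It only tests column count and empty mandatory fields, not whether identifiers or GO terms are valid.

```python
# Column numbers (1-based) assumed to be mandatory in GAF 2.0.
REQUIRED_COLUMNS = {
    1: "DB",
    2: "DB Object ID",
    3: "DB Object Symbol",
    5: "GO ID",
    6: "DB:Reference",
    7: "Evidence Code",
    9: "Aspect",
    12: "DB Object Type",
    13: "Taxon",
    14: "Date",
    15: "Assigned By",
}
EXPECTED_COLUMN_COUNT = 17


def check_gaf_line(line):
    """Return a list of problems found in one GAF 2.0 annotation line."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != EXPECTED_COLUMN_COUNT:
        return [f"expected {EXPECTED_COLUMN_COUNT} columns, found {len(fields)}"]
    problems = []
    for number, name in REQUIRED_COLUMNS.items():
        if not fields[number - 1].strip():
            problems.append(f"column {number} ({name}) is empty")
    return problems


def check_gaf_text(text):
    """Check every annotation line, skipping blank lines and '!' comments.

    Returns a dict mapping 1-based line numbers to their list of problems.
    """
    report = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line or line.startswith("!"):
            continue
        problems = check_gaf_line(line)
        if problems:
            report[line_number] = problems
    return report
```

Calling check_gaf_text on the contents of a file returns an empty dict when no structural problems are found. A clean result here does not replace review by the GO Consortium.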

How do I create a Gene Association File (GAF)?

GO annotations are disseminated in the 17-column, tab-delimited GAF 2.0 format. However, the following information is sufficient to create a GAF. You can construct the file yourself using the documentation, or simply contact the GOC at go-helpdesk@mailman.stanford.edu for step-by-step assistance.
  • Stable IDs for the gene products or objects that are being annotated. If the gene product IDs are not in UniProtKB or NCBI, please submit the IDs to one of these databases.
  • GO ID that each gene product can be associated with
  • Evidence code that allows you to make the association
  • Reference (published paper or reference describing the methodology used to make the gene product to GO term association)
  • Taxon identifier for the gene products for which the associations are made
  • Finally, put these data together in an Excel or tab-delimited file and send it to the GOC.
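
The items above can be assembled into GAF 2.0 lines with a few lines of code. The sketch below is illustrative only: make_gaf_row and format_gaf are hypothetical helper names, and the example values (the reference, the date and the group name "ExampleGroup") are placeholders rather than a real annotation. It also assumes that, beyond the five items listed, a complete GAF 2.0 line needs an object symbol, an aspect (P, F or C), an object type, a date and the 'assigned_by' name, so these appear as extra parameters.

```python
from datetime import date

GAF_HEADER = "!gaf-version: 2.0"


def make_gaf_row(db, object_id, go_id, reference, evidence_code, taxon_id,
                 aspect, assigned_by, symbol=None, object_type="protein",
                 annotation_date=None):
    """Build one 17-column GAF 2.0 row; optional columns are left empty."""
    when = annotation_date or date.today()
    row = [""] * 17
    row[0] = db                         # column 1: source database
    row[1] = object_id                  # column 2: stable gene product ID
    row[2] = symbol or object_id        # column 3: symbol (falls back to the ID)
    row[4] = go_id                      # column 5: GO ID
    row[5] = reference                  # column 6: supporting reference
    row[6] = evidence_code              # column 7: evidence code
    row[8] = aspect                     # column 9: P, F or C
    row[11] = object_type               # column 12: type of annotated object
    row[12] = f"taxon:{taxon_id}"       # column 13: taxon identifier
    row[13] = when.strftime("%Y%m%d")   # column 14: date as YYYYMMDD
    row[14] = assigned_by               # column 15: group credited for the line
    return row


def format_gaf(rows):
    """Join rows into GAF text: a version header, then one tab-delimited line per row."""
    lines = [GAF_HEADER]
    lines.extend("\t".join(row) for row in rows)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = make_gaf_row(
        db="UniProtKB",
        object_id="P12345",
        go_id="GO:0005739",
        reference="PMID:0000000",
        evidence_code="IDA",
        taxon_id=9986,
        aspect="C",
        assigned_by="ExampleGroup",
        annotation_date=date(2024, 1, 15),
    )
    print(format_gaf([example]), end="")
```

Keeping each row as a fixed-length list makes it hard to shift columns by accident, which is a common problem when such files are edited by hand in a spreadsheet.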

Where can I find the current set of GO annotations available at the GOC?

  • GO annotations can be downloaded from the GOC's FTP site
  • GO annotations can be interactively viewed on the web via AmiGO, a web application supported by the GOC.