Annotations provided by The Institute for Genomic Research (TIGR)
(see note at bottom of README on current status of TIGR)
9712 Medical Center Dr., Rockville, MD

Methods of GO Annotation

Most of our GO annotations are derived from sequence similarity evidence. We use several resources for this: HMMs, BER pairwise matches, TMHMM transmembrane predictions, SignalP signal peptide predictions, PROSITE matches, and COGs.

The HMM dataset we use consists of Pfams and TIGRFAMS. We have manually assigned GO terms to both TIGRFAMS (available as a mapping file on the GO site) and Pfams. HMMs from these two sources exist at many levels of functional specificity. For example, some represent domains, some superfamilies, and some exact molecular functions. GO terms are assigned to these HMMs at the appropriate granularity.

The BER pairwise results are available to us as a file of alignments generated by the open source program BLAST_extend_repraze (BER, ber.sourceforge.net), which uses both the BLAST and Smith-Waterman algorithms.

We also use our Genome Properties system to predict the presence or absence of many biological systems (pathways, protein complexes, etc.) in each genome, and to store important traits and characteristics of the genome or organism in question.

In our annotation process, an annotator looks at all available evidence and then decides what the protein is most likely doing in the cell. The annotator then looks for GO terms to annotate to the protein, making sure to assign GO terms only at the specificity that the evidence supports. In general, annotators look first for the strongest types of evidence: HMMs specific for one function (TIGR equivalog HMMs) and high quality matches to experimentally characterized proteins. Pairwise alignment evidence can be used for annotation only when the matched protein is itself experimentally characterized.
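
As a rough illustration of the evidence priorities in this process, the Python sketch below maps a set of evidence items to the level of GO term specificity they could support. It is not part of Manatee or any TIGR tool. The names Evidence, hmm_level, match_characterized, high_quality, and annotation_specificity are hypothetical, as is the "unresolved" result for cases the sketch does not cover. In practice an annotator weighs all available evidence by hand, and evidence types such as TMHMM, SignalP, PROSITE, and COGs are left out here.

```python
from dataclasses import dataclass

# Hypothetical labels for HMM specificity levels (illustration only).
SPECIFIC_HMM_LEVELS = {"equivalog"}
GENERAL_HMM_LEVELS = {"subfamily", "superfamily", "domain"}


@dataclass
class Evidence:
    kind: str                          # "hmm" or "pairwise"
    hmm_level: str = ""                # for HMMs: "equivalog", "domain", ...
    match_characterized: bool = False  # pairwise: matched protein is experimentally characterized
    high_quality: bool = False         # pairwise: alignment judged high quality


def is_strong(ev: Evidence) -> bool:
    """Strong evidence: an equivalog HMM, or a high quality match
    to an experimentally characterized protein."""
    if ev.kind == "hmm":
        return ev.hmm_level in SPECIFIC_HMM_LEVELS
    if ev.kind == "pairwise":
        return ev.match_characterized and ev.high_quality
    return False


def is_general(ev: Evidence) -> bool:
    """General evidence: subfamily, superfamily, or domain level HMMs."""
    return ev.kind == "hmm" and ev.hmm_level in GENERAL_HMM_LEVELS


def annotation_specificity(evidence: list[Evidence]) -> str:
    """Return the GO term specificity that the evidence could support."""
    if any(is_strong(ev) for ev in evidence):
        return "specific"
    if any(is_general(ev) for ev in evidence):
        return "general"
    # Assumption of this sketch: everything else is left to the annotator.
    return "unresolved"


if __name__ == "__main__":
    hits = [
        Evidence("pairwise", match_characterized=False, high_quality=True),
        Evidence("hmm", hmm_level="superfamily"),
    ]
    print(annotation_specificity(hits))
```

In this usage example the pairwise match is ignored because the matched protein is not experimentally characterized, so only the superfamily level HMM counts and the result is a general annotation.
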
If such strong evidence for a specific function is found, specific annotations can be made. If not, more general forms of evidence may be used, resulting in more general GO annotations. For example, matches to subfamily, superfamily, or domain level HMMs support annotations of general functions or family memberships, which lead to more general GO term choices.

Manatee, our open source manual annotation tool (manatee.sourceforge.net), suggests GO terms from several sources: the HMMs that match the protein, other proteins that are very similar to it, EC numbers, and the Genome Properties system. Often the GO terms the annotator needs are already in the pool of suggested terms; if not, the annotator searches the ontologies to find them. Manatee has a built-in GO browser. We also use the GO AmiGO tool.

When one of our proteins has itself been experimentally characterized, we read the relevant literature and assign GO terms accordingly. However, the vast majority of the proteins in our genomes have not been experimentally characterized, hence our reliance on sequence similarity methods.

Current status of TIGR:

In the Fall of 2006 TIGR merged with other institutes under the umbrella of the Venter Foundation and became a division of the J. Craig Venter Institute (JCVI). In the Winter of 2007 the TIGR division was dissolved, and TIGR no longer exists as an organization. However, the GO annotations from the TIGR genomes live on and are being maintained (to a limited degree) by the former TIGR annotation team, which is now part of the JCVI.