GO MEETING - The Jackson Labs. Oct 7-8 1999.

PEOPLE

MGD
  Judy Blake
  David Hill
  Joel Richardson
  Martin Ringwold
  Janan Eppig
  Charlie Ray
  Ben King - Mouse sequencing
  Jeff Davies
  Richard Balderelli
  Allan Davies

SGD
  Andrew Kasarskis
  Mike Cherry
  Midori Harris

FB
  Heather Butler
  Michael Ashburner
  Suzanna Lewis

Astra Zeneca
  Michael Rebhan

AGENDA

1. Current CVS/annotation of GO
2. Putting sets together for common query interface
3. Publications
4. WWW pages
5. Other collaborations
6. Funding & resources
7. People

MINUTES

1. Progress

FB/Berkeley
  Nothing new on software; new versions imported into query tool.

FB/Cambridge
  Report on progress of attribution in FB. About 1700 done.
  Celera annotation plans were reported. It is hoped that they will use GO for functional inference.
  FB to get its reference CDS set of genes GO'd by November 7. (Ashburner/Heather).

SGD
  Midori annotating yeast genes with GO, done about 300 plus tRNAs.
  Also doing gene summaries of each gene in SGD. Have about 3000 to do.
  GO query tool for internal use on www for curators; better diff files.

MGD
  Alan and David Hill have been doing assignments.
  Detailed hand annotation with MLC and GXD - have to write detailed reports on genes and then add GO terms.
  At the same time do first pass "GO-FISH" - have mapped 3,000 genes with GO terms.
  Also mapped via EC numbers.
  Had not been using CVS but keeping a file of changes.
  Mapping SWP Keywords to GO terms - done to letter 'E'. 650 SWP Keywords that seem to be relevant to GO. 40-50% map directly to GO. David will [or could !] finish within a week !
  dph@informatics.jax.org - David Hill
  MGD now beginning to use CVS (Allen)

For CVS problems: mark@genome.stanford.edu
Use "update" rather than "checkout".

Agreed number series for new terms:
  SGD 0000001-0001500
  MGD 0001501-0003000
  FB  0008001-0009500

2. Putting sets together:

What we are using now:
  FB   tagged value format
  SGD  tabbed list
  MGD  Excel file

Evidence statements - MGD argue for "stated by author".
Following agreed as valid values:

  IMP  inferred from mutant phenotype
  IGI  inferred from genetic interaction {with }
  IPI  inferred from physical interaction {with }
       **note we changed this from protein interaction
  ISS  inferred from sequence similarity {with }
  IDA  inferred from direct assay
  ASS  author said so
  NA   not available

Evidence must not be null, even if the record is "not available".

We now want to agree on a tab delimited format - which SL can parse into XML.

MEOW Core
  database. [mandatory] cardinality 1 ; controlled: MGI, FB, SGD
  gene symbol. [mandatory] cardinality 1
  gene symbol synonym. cardinality 0, 1, >1 [white space allowed]
  gene name. cardinality 0, 1 [white space allowed]
  gene identifier. [mandatory] cardinality 1
  chromosome. cardinality 0, 1
  map position. cardinality 0, 1
  short gene description. cardinality 1
  db xref, NA, protein. cardinality 0, 1, >1

GO add-on
  GO id. [mandatory] cardinality 1, >1
  reference id. [mandatory] cardinality 1, >1 ; must be within domain of database identified in MEOW core
  evidence. [mandatory] cardinality 1, >1 ; controlled, see above
  aspect. cardinality 1 ; controlled F|P|C

Record layout:

  DB,Gene_id,Gene_symbol,GOid,ref(|refs),evidence(|evidence),aspect,name,synonym(|synonym)

  tab delimiter between fields (NOT commas)
  within field delimiter is |
  hard return at end of record
  ascii

Files:
  SGD_GO_files/gene_associations
  MGD_GO_files/gene_associations
  FB_GO_files/gene_associations

SGD & FB do a remove of old versions before committing new.

At this stage other data will not be dumped by contributing databases to GO.

2. Query/Editor tools/databases.

Private editorial tools
  Local editorial interface to modify GO (ie to replace CVS) - but changes to go to editor for commitment.
  Stanford work on editor tool.
  How do we compare for internal purposes between collab. d/bases ?

Public tools
  At local sites [responsibility of collab d/bases]
  Cross-genome
    Data base
    Servlet ?
    or other performance enhancement
    Improved query database

GO query tool must have comment to GO email button (at first to all of GO list, so that we can all see what is going on).

Each database should implement its own query tool for GO. - all

3. WWW

Mike has registered: www.geneontology.org & www.genename.org

We agree to use geneontology.org as prime address and to close down the existing ebi and fruitfly sites (these then point to geneontology.org).

Need a top page - Cherry
Suzanna to check that the Query applet can run from this new web site. - Suzi
Suzanna will activate URL hyperlinks from query report. - Suzi
  Needs url syntax for MGD (see MGD Tools for Developers on home page - or contact Joel) and for SGD (contact Mike Cherry).
Tree will show number of gene_associations per node.
The CVS can automatically update the text files and automatically write a new version and date at top of file - Cherry

ftp
  - three ontologies in both hierarchical and xml (rename "compartment" as "cellular component" in CVS repository). - Cherry
    will xml files be automatically updated by a script when ontologies are updated ? - yes, but need to look into mechanism - Suzi.
  - GO.bib
  - GO.doc .. MA to re-write as an html document. Add GXD as collaborator indep of MGD
  - GO.defs
  - ISMB paper
  - geneassociations.fly
  - geneassociations.mouse
  - geneassociations.yeast

GO query tool from Suzanna
email button for contacts; go to entire list - Cherry
Must change proofs of the SGD/FB/MGD NAR January issue papers, for new url.
MA to write general introduction for web page - Ashburner
MA to update GO.doc - Ashburner
Suzi to give collaborators urls for definitions. (OUP acknowledgement) - Suzi

4. Publications

Where - TIGS .. probably the best for this first paper.
Alternatives:
  Genome Research
  NAR
  Nature Genetics
  Bioinformatics
Talk to Roberts about paper for NAR Special Issue for 2001. Ashburner, but next year.
Botstein & Cherry to do a draft then to Alan Davies at MGD - Botstein/Cherry/Alan

4. Other collaborators.

C. elegans - Sternberg's NIH application for WormBase has been submitted - for summer 2000 funding.

Arabidopsis: TAIR (The Arabidopsis Information Resource) - Carnegie-Stanford (science)/NCGR (computing). Started Sept 1, all of old AtDB curators moved over to Carnegie. Chris Town of TIGR is on TAIR grant. MA worried that there could be more than one push - TIGR (NSF annotation grant); Mike Bevan at John Innes. Ashburner to follow up.

Monica Riley/Gretta Serres - functional assignments for E. coli.

Need to talk to TIGR about prokaryotes. Ashburner to follow up

Look at TRANSFAC classification.

Incyte collaboration, further discussions with Frank Russo. Ashburner/Suzi

Swiss-Prot. Ashburner

5. Grants.

Janan will lead on an NIH-NHGRI RO1 grant - Liza Brookes - for Feb 1 2000. - Janan

What should we ask for:
  curator for MGD
  curator for SGD
  curator for WormBase ? as supplement
  [curator for FB already on MRC grant]
  Core:
    GO manager/editor
    software support
    travel/kit

Funding cycle:
  FB   to 2003 NIH
       2002 MRC
  SGD  2001 NIH
  MGD  2001 NIH
  GXD  2000-2005 (NIH Institute of Child Health)
  GO   8/00-8/03 ?

Astra-Zeneca: Would Ken be willing to write two cheques, one to EBI and one to UCB since we are the only two who now need to draw on funds ? Contracts between EBI and Jax and EBI and Stanford are academic at the moment.

Should we set up a non-profit GO Inc ? Ashburner for action

6. Content

MA to finish Style Manual, work on with Andrew - Ashburner/Andrew
Need to look again at %enzyme - split by EC - what would we lose ? - use classification of substrates imposed on EC ? - Ashburner

7. Next meeting

Feb 24-26 2000 - Boston / Harvard. Talk to Bill. Ashburner
Talk to FCK re: a meeting in Les Treilles. Ashburner
Friends of GO - activate and update - add Mike Rebhan. - Ashburner
bionet.announce when new pages up and data into query tool.

FINAL REMARKS

Substantial progress has been made by all three database groups in implementing GO over the summer. This is very encouraging.
Although there have been some areas of GO content that have needed changing (and several that have needed adding, as expected), in general the three ontologies seem to be working rather well. A major message of this meeting is that we must get something substantial in the public view as soon as possible. To this end we have rationalised the web sites for GO and agreed an output format for gene associations to be sent to Suzanna to drive the Query Tool. We have also agreed on a paper about GO for TIGS to be done this year. We hope that the new web pages with a Query Tool with content can be up in a matter of weeks, though we know that until mid-November Suzanna and Ashburner are very busy with the fly annotation.
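
ILLUSTRATION (not part of the agreed text)

The gene association format agreed under item 2 could be checked with something like the Python sketch below. It assumes the nine fields appear in the order DB, Gene_id, Gene_symbol, GOid, ref, evidence, aspect, name, synonym, separated by tabs, with | inside multi-valued fields. The names GeneAssociation and parse_record, and the sample identifiers in the usage lines, are hypothetical. The minutes do not say how the "{with }" qualifier is written in the file, so the sketch accepts only the bare evidence codes.

```python
from dataclasses import dataclass
from typing import List

# Controlled values taken from the minutes.
DATABASES = {"MGI", "FB", "SGD"}
EVIDENCE_CODES = {"IMP", "IGI", "IPI", "ISS", "IDA", "ASS", "NA"}
ASPECTS = {"F", "P", "C"}
FIELD_NAMES = ["DB", "Gene_id", "Gene_symbol", "GOid", "ref",
               "evidence", "aspect", "name", "synonym"]


@dataclass
class GeneAssociation:  # hypothetical name, for illustration only
    db: str
    gene_id: str
    gene_symbol: str
    go_id: str
    refs: List[str]
    evidence: List[str]
    aspect: str
    name: str
    synonyms: List[str]


def split_multi(value: str) -> List[str]:
    """Split a multi-valued field on the within-field delimiter '|'."""
    return [part for part in value.split("|") if part]


def parse_record(line: str) -> GeneAssociation:
    """Parse one tab-delimited record; raise ValueError if it is invalid."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} tab-delimited fields, got {len(fields)}")
    db, gene_id, symbol, go_id, refs, evidence, aspect, name, synonyms = fields
    record = GeneAssociation(db, gene_id, symbol, go_id, split_multi(refs),
                             split_multi(evidence), aspect, name,
                             split_multi(synonyms))
    if record.db not in DATABASES:
        raise ValueError(f"unknown database: {record.db!r}")
    # Mandatory single-valued fields must not be empty.
    for label, value in (("Gene_id", gene_id), ("Gene_symbol", symbol),
                         ("GOid", go_id)):
        if not value:
            raise ValueError(f"{label} is mandatory")
    if not record.refs:
        raise ValueError("at least one reference id is mandatory")
    # Evidence must not be null; 'NA' is the explicit placeholder.
    if not record.evidence:
        raise ValueError("evidence must not be null (use NA)")
    unknown = [code for code in record.evidence if code not in EVIDENCE_CODES]
    if unknown:
        raise ValueError(f"unknown evidence code(s): {unknown}")
    if record.aspect not in ASPECTS:
        raise ValueError(f"aspect must be one of F, P, C, got {record.aspect!r}")
    return record


if __name__ == "__main__":
    # Hypothetical sample record; the identifiers are made up.
    sample = "SGD\tS0000001\tABC1\tGO:0000001\tref1\tIDA|IMP\tF\tsome name\tsyn1|syn2\n"
    print(parse_record(sample))
```

Rejecting a record with an empty evidence field, rather than silently filling in NA, follows the rule that evidence must not be null even when the value is "not available".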