<?xml version="1.0" encoding="iso-8859-1"?>
<ruleset>
<!-- <rule>
	<id>GO_AR:00000XX</id>
	<title>brief summary of rule</title>
	<contact>email address</contact>
	<description format="[html|text]">description of the check to be performed</description>
	<implementation_list>
		<implementation status="active">
			<script language="scriptLanguage" source="anyURI" />
			<input format="gaf1.0">
				anyURI
			</input>
			<input format="gaf2.0" />
			<output>[warn|warn_and_remove]</output>
			<when_performed>[pre_submit|on_submit|daily|etc.]</when_performed>
		</implementation>
		<implementation>
			<script language="SQL">
			paste SQL query here
			</script>
			<input schema="[GOLD|LEAD]">GO database</input>
		</implementation>
	</implementation_list>
	<status date="YYYY-MM-DD">[Implemented|Approved|Proposed|Deprecated]</status>
	<history>
		<status date="YYYY-MM-DD">[Proposed|Approved|Implemented|Deprecated]</status>
		<comment date="YYYY-MM-DD">XXX annotations retrieved by this query.</comment>
	</history>
	<comment date="YYYY-MM-DD" format="[html|text]">any old comment</comment>
</rule> -->
	<rule>
		<id>
			GO_AR:0000001
		</id>
		<title>Basic GAF checks</title>
		<contact>
			cherry@genome.stanford.edu
		</contact>
		<description format="html">
<![CDATA[
		<p>The following basic checks ensure that submitted gene association files conform to the GAF spec, and come from the original GAF check script.</p>
		<ul><li>Each line of the GAF file is checked for the correct number of columns, the cardinality of the columns, leading or trailing whitespace</li>
		<li>Col 1 and all DB abbreviations must be in <a href="http://www.geneontology.org/cgi-bin/xrefs.cgi">GO.xrf_abbs</a> (case may be incorrect)</li>
		<li>All GO IDs must be extant in current ontology</li>
		<li>Qualifier, evidence, aspect and DB object columns must be within the list of allowed values</li>
		<li>DB:Reference, Taxon and GO ID columns are checked for minimal form</li>
		<li>Date must be in YYYYMMDD format</li>
		<li>All IEAs over a year old are removed</li>
		<li>Taxa with a 'representative' group (e.g. MGI for Mus musculus, FlyBase for Drosophila) must be submitted by that group only</li>
		</ul>]]>
		</description>
		<implementation_list>
			<implementation status="active">
				<script language="perl" source="http://www.geneontology.org/software/utilities/filter-gene-association.pl" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
				<when_performed>on_submit</when_performed>
				<output>warn_and_remove</output>
			</implementation>
		<implementation status="active">
			<script language="java" source="org.geneontology.gold.rules.BasicChecksRule" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
		</implementation>
		</implementation_list>
		<status date="2005-10-19">
			Implemented
		</status>
	</rule>
	<rule>
		<id>
			GO_AR:0000002
		</id>
		<title>No 'NOT' annotations to 'protein binding ; GO:0005515'</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<description format="html"><![CDATA[<p>Even if an identifier is available in the 'with' column, a qualifier only informs on the GO term, it cannot instruct users to restrict the annotation to just the protein identified in the 'with', therefore an annotation applying <span class="term">protein binding ; GO:0005515</span> with the <span class="not">not</span> qualifier implies that the annotated protein cannot bind anything.</p>
<p>This is such a wide-reaching statement that few curators would want to make.</p>
<p>This rule <em>only</em> applies to GO:0005515; children of this term can be qualified with <span class="not">not</span>, as further information on the type of binding is then supplied in the GO term; e.g. <span class="not">not</span> + <span class="term">NFAT4 protein binding ; GO:0051529</span> would be fine, as the negative binding statement only applies to the NFAT4 protein.</p>
<p>For more information, see the <a href="http://wiki.geneontology.org/index.php/Binding_Guidelines">binding guidelines</a> on the GO wiki.</p>]]>
		</description>
		<implementation_list>
			<implementation status="active">
				<script language="perl" source="http://www.geneontology.org/software/utilities/filter-gene-association.pl" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
				<when_performed>on_submit</when_performed>
				<output>warn_and_remove</output>
			</implementation>
			<implementation>
				<script language="SQL">
SELECT gene_product.symbol,
       CONCAT(gpx.xref_dbname, ':', gpx.xref_key) AS gpxref,
       IF(association.is_not=1,"NOT","") AS 'not',
       term.acc,
       term.name,
       evidence.code,
       db.name AS assigned_by
FROM   association
       INNER JOIN gene_product
         ON association.gene_product_id = gene_product.id
       INNER JOIN dbxref AS gpx
         ON gene_product.dbxref_id = gpx.id
       INNER JOIN term
         ON association.term_id = term.id
       INNER JOIN evidence
         ON association.id = evidence.association_id
       INNER JOIN db
         ON association.source_db_id=db.id
WHERE  association.is_not='1'
       AND term.acc = 'GO:0005515'
				</script>
				<input schema="LEAD">
					GO database
				</input>
			</implementation>
			<implementation>
				<script language="regex">/^(.*?\t){3}not\tGO:0005515\t/i
				</script>
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		<status date="2011-01-01">Implemented</status>
		<history>
			<status date="2010-04-01">Approved</status>
		</history>
	</rule>
	<rule>
		<id>
			GO_AR:0000003
		</id>
		<title>Annotations to 'binding ; GO:0005488' and 'protein binding ; GO:0005515' should be made with IPI and an interactor in the 'with' field</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<description format="html">
<![CDATA[
<p>Annotations to <span class="term">binding : GO:0005488</span> or <span class="term">protein binding ; GO:0005515</span> with the <acronym title="Traceable Author Statement" class="evCode">TAS</acronym>, <acronym title="Non-traceable Author Statement" class="evCode">NAS</acronym>, <acronym title="Inferred by Curator" class="evCode">IC</acronym>, <acronym title="Inferred from Mutant Phenotype" class="evCode">IMP</acronym>, <acronym title="Inferred from Genetic Interaction" class="evCode">IGI</acronym> and <acronym title="Inferred by Direct Assay" class="evCode">IDA</acronym> evidence codes are not informative as they do not allow the interacting  partner to be specified. If the nature of the binding partner is known (protein or DNA for example), an appropriate child term of <span class="term">binding ; GO:0005488</span> should be chosen for the annotation. In the case of chemicals, ChEBI IDs can go in the 'with' column. Children of <span class="term">protein binding ; GO:0005515</span> where the type of protein is identified in the GO term name do not need further specification.
</p>
<p>For more information, see the <a href="http://wiki.geneontology.org/index.php/Binding_Guidelines">binding guidelines</a> on the GO wiki.</p>]]>
		</description>
		<implementation_list>
			<implementation status="active">
				<script language="perl" source="http://www.geneontology.org/software/utilities/filter-gene-association.pl" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
				<when_performed>on_submit</when_performed>
				<output>warn_and_remove</output>
			</implementation>
			<implementation>
				<script language="SQL">
SELECT gene_product.symbol,
       CONCAT(gpx.xref_dbname, ':', gpx.xref_key) AS gpxref,
       IF(association.is_not=1,"NOT","") AS 'not',
       term.acc,
       term.name,
       evidence.code,
       db.name AS assigned_by
FROM   association
       INNER JOIN gene_product
         ON association.gene_product_id = gene_product.id
       INNER JOIN dbxref AS gpx
         ON gene_product.dbxref_id = gpx.id
       INNER JOIN term
         ON association.term_id = term.id
       INNER JOIN evidence
         ON association.id = evidence.association_id
       INNER JOIN db
         ON association.source_db_id=db.id
WHERE  evidence.code IN ('NAS','TAS','IDA','IMP','IGC','IEP','ND','IC','RCA','EXP', 'IGI')
       AND (term.acc = 'GO:0005515' OR term.acc = 'GO:0005488')
				</script>
				<input schema="LEAD">
					GO database
				</input>
			</implementation>
			<!--
			<implementation>
				<script language="regex">/^(.*?\t){4}GO:0005(488|515)\t/i</script>
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
			 -->
			<!-- 
			  Replaced with a procedural check, the regex only identifies all annotations 
			  with GO:0005515 or GO:0005488. It does not check for the evidence code or 
			  content of the with field!
			 -->
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000003" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		<status date="2011-01-01">
			Implemented
		</status>
		<history>
			<status date="2010-04-01">
				Approved
			</status>
		</history>
	</rule>
	<rule>
		<id>
			GO_AR:0000004
		</id>
		<title>Reciprocal annotations for 'protein binding ; GO:0005515'</title>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<description format="html">
<![CDATA[
<p>When annotating to terms that are descendants of protein binding, and when the curator can supply the accession of the interacting protein accession, it is essential that reciprocal annotations are available - i.e. if you say protein A binds protein B, then you need to also have the second annotation that states that protein B binds protein A.</p>
<p>This will be a soft QC; a script will make these inferences and it is up to each MOD to evaluate and include the inferences in their GAF/DB.</p>
<p>For more information, see the <a href="http://wiki.geneontology.org/index.php/Binding_Guidelines">binding guidelines</a> on the GO wiki.</p>]]>
		</description>
		<status date="2010-04-01">
			Approved
		</status>
		<!-- 
		<implementation_list>
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000004" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		-->
	</rule>
	<rule>
		<id>
			GO_AR:0000005
		</id>
		<title>No ISS or ISS-related annotations to 'protein binding ; GO:0005515'</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<description format="html">
<![CDATA[
<p>
	If we take an example annotation:
</p>
<div class="annot">
	<p>
		gene product: protein A<br>GO term: protein binding ; GO:0005515<br>evidence: IPI<br>reference: PMID:123456<br>with/from: <b>with</b> protein A
	</p>
</div>
<p>
	this annotation line can be interpreted as: protein A was found to carry out the 'protein binding' activity in PMID:12345, and that this function was <span class="evCode">Inferred from the results of a Physicial Interaction (IPI)</span> assay, which involved protein X
</p>
<p>
	However if we would like to transfer this annotation to protein A's ortholog 'protein B', the <acronym title="Inferred from Sequence Similarity" class="evCode">ISS</acronym> annotation that would be created would be:
</p>
<div class="annot">
	<p>
		gene product: protein B<br>GO term: protein binding ; GO:0005515<br>evidence: ISS<br>reference: GO_REF:curator_judgement<br>with/from: <b>with</b> protein A
	</p>
</div>
<p>
	This is interpreted as 'it is inferred that protein B carries out protein binding activity due to its sequence similarity (curator determined) with protein A, which was experimentally shown to carry out 'protein binding'.
</p>
<p>
	Therefore the <span class="evCode">ISS</span> annotation will not display the the interacting protein X accession. Such an annotation display can be confusing, as the value in the 'with' column just provides further information on why the <span class="evCode">ISS</span>/<span class="evCode">IPI</span> or <acronym title="Inferred from Genetic Interaction" class="evCode">IGI</acronym> annotation was created. This means that an <span class="evCode">ISS</span> projection from <span class="term">protein binding</span> is not particularly useful as you are only really telling the user that you think an homologous protein binds a protein, based on overall sequence similarity.
</p>
<p>
	This rule only applies to GO:0005515, as descendant terms such as <span class="term">mitogen-activated protein kinase p38 binding ; GO:0048273</span> used as <span class="evCode">ISS</span> annotations are informative as the GO term name contains far more specific information as to the identity of the interactor.
</p>
<p>For more information, see the <a href="http://wiki.geneontology.org/index.php/Binding_Guidelines">binding guidelines</a> on the GO wiki.</p>]]>
		</description>
		<implementation_list>
			<implementation>
				<script language="SQL">
SELECT gene_product.symbol,
       CONCAT(gpx.xref_dbname, ':', gpx.xref_key) AS gpxref,
       IF(association.is_not=1,"NOT","") AS 'not',
       term.acc,
       term.name,
       evidence.code,
       db.name AS assigned_by
FROM   association
       INNER JOIN gene_product
         ON association.gene_product_id = gene_product.id
       INNER JOIN dbxref AS gpx
         ON gene_product.dbxref_id = gpx.id
       INNER JOIN term
         ON association.term_id = term.id
       INNER JOIN evidence
         ON association.id = evidence.association_id
       INNER JOIN db
         ON association.source_db_id=db.id
WHERE  evidence.code IN ('ISS','ISO','ISA','ISM')
       AND term.acc = 'GO:0005515'
				</script>
				<input schema="LEAD">
					GO database
				</input>
			</implementation>
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000005" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		<status date="2010-04-01">
			Approved
		</status>
	</rule>
	<rule>
		<id>
			GO_AR:0000006
		</id>
		<title>IEP usage is restricted to terms from the Biological Process ontology</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<description format="html">
<![CDATA[
<p>The <span class="evCode">IEP evidence code</span> is used where process involvement is inferred from the timing or location of expression of a gene, particularly when comparing a gene that is not yet characterized with the timing or location of expression of genes known to be involved in a particular process. This type of annotation is only suitable with terms from the Biological Process ontology.</p>
<p>For more information, see the <a href="http://wiki.geneontology.org/index.php/Binding_Guidelines">binding guidelines</a> on the GO wiki.</p>]]>
		</description>
		<implementation_list>
			<implementation status="active">
				<script language="perl" source="http://www.geneontology.org/software/utilities/filter-gene-association.pl" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
				<when_performed>on_submit</when_performed>
				<output>warn_and_remove</output>
			</implementation>
			<implementation>
				<script language="SQL">
SELECT gene_product.symbol,
       CONCAT(gpx.xref_dbname, ':', gpx.xref_key) AS gpxref,
       IF(association.is_not=1,"NOT","") AS 'not',
       term.acc,
       term.name,
       term.term_type,
       evidence.code,
       db.name AS assigned_by
FROM   association
       INNER JOIN gene_product
         ON association.gene_product_id = gene_product.id
       INNER JOIN dbxref AS gpx
         ON gene_product.dbxref_id = gpx.id
       INNER JOIN term
         ON association.term_id = term.id
       INNER JOIN evidence
         ON association.id = evidence.association_id
       INNER JOIN db
         ON association.source_db_id=db.id
WHERE  evidence.code = 'IEP'
       AND term.term_type != 'biological_process'
				</script>
				<input schema="LEAD">
					GO database
				</input>
			</implementation>
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000006" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		<status date="2011-01-01">
			Implemented
		</status>
		<history>
			<status date="2010-11-18">
				Approved
			</status>
			<comment date="2010-07-23">
				427 annotations retrieved by this query.
			</comment>
		</history>
	</rule>
	<rule>
		<id>
			GO_AR:0000007
		</id>
		<title>IPI should not be used with catalytic activity molecular function terms</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<description format="html">
<![CDATA[
<p>The <a href="http://www.geneontology.org/GO.evidence.shtml#ipi" class="evCode">IPI (Inferred from Physical Interaction) evidence code</a> is used where an annotation can be supported from interaction evidence between the gene product of interest and another molecule (see the <a href="http://www.geneontology.org/GO.evidence.shtml#ipi">evidence code documentation</a>). While the <span class="evCode">IPI evidence code</span> is frequently used to support annotations to terms that are children of <span class="term">binding ; GO:0005488</span>, it is thought unlikely by the Binding working group that enough information can be obtained from a binding interaction to support an annotation to a term that is a chid of <span class="term">catalytic activity ; GO:0003824</span>. Such <span class="evCode">IPI</span> annotations to child terms of <span class="term">catalytic activity ; GO:0003824</span> may need to be revisited and corrected.</p>
<p>For more information, see the <a href="http://wiki.geneontology.org/index.php/Annotations_to_Catalytic_activity_with_IPI">catalytic activity annotation guide</a> on the GO wiki.</p>]]>
		</description>
		<implementation_list>
			<implementation>
				<script language="SQL">
SELECT gene_product.symbol,
       CONCAT(gpx.xref_dbname, ':', gpx.xref_key) AS gpxref,
       IF(association.is_not=1,"NOT","") AS 'not',
       term.acc,
       term.name,
       evidence.code,
       db.name AS assigned_by
FROM   term
       INNER JOIN graph_path
         ON term.id = graph_path.term2_id
       INNER JOIN term AS term2
         ON graph_path.term1_id = term2.id
       INNER JOIN association
         ON graph_path.term2_id = association.term_id
       INNER JOIN evidence
         ON association.id = evidence.association_id
       INNER JOIN gene_product
         ON association.gene_product_id = gene_product.id
       INNER JOIN dbxref AS gpx
         ON gene_product.dbxref_id = gpx.id
       INNER JOIN db
         ON association.source_db_id=db.id
WHERE  term2.acc = 'GO:0003824'
       AND evidence.code = 'IPI'
				</script>
				<input schema="LEAD">
					GO database
				</input>
			</implementation>
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000007" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		<status date="2010-04-01">
			Proposed
		</status>
		<history>
			<comment date="2010-05-28">
				144 annotations retrieved by this query.
			</comment>
		</history>
	</rule>
	<rule>
		<id>
			GO_AR:0000008
		</id>
		<title>No annotations should be made to uninformative high level terms</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<description format="html">
<![CDATA[
	<p>Some terms are too high-level to provide useful information when used for annotation, regardless of the evidence code used.</p>
	<p>We provide and maintain the list of too high-level terms as two subsets in the ontology:</p>
	<ul>
		<li>gocheck_do_not_annotate "Term not to be used for direct annotation"</li>
		<li>gocheck_do_not_manually_annotate "Term not to be used for direct manual annotation"</li>
	</ul>
	<p>Both subsets denote high level terms, not to be used for any manual annotation.</p>
	<p>For inferred electronic annotations (IEAs), we allow the use of terms from the 
	gocheck_do_not_manually_annotate subset. These terms may still offer some general information, 
	but a human curator should always be able to find a more specific annotation.</p>
]]>
		</description>
		<implementation_list>
			<implementation>
				<script language="SQL">
SELECT gene_product.symbol,
       CONCAT(gpx.xref_dbname, ':', gpx.xref_key) AS gpxref,
       IF(association.is_not=1,"NOT","") AS 'not',
       term.acc,
       term.name,
       db.name AS assigned_by
FROM   association
       INNER JOIN gene_product
         ON association.gene_product_id = gene_product.id
       INNER JOIN term
         ON association.term_id = term.id
       INNER JOIN db
         ON association.source_db_id=db.id
       INNER JOIN dbxref AS gpx
         ON gene_product.dbxref_id = gpx.id
WHERE  term.acc IN ( 'GO:0050896', 'GO:0007610', 'GO:0051716', 'GO:0009628',
       'GO:0009607', 'GO:0042221', 'GO:0009719', 'GO:0009605', 'GO:0006950',
       'GO:0048585', 'GO:0048584', 'GO:0048583', 'GO:0001071', 'GO:0000988')
				</script>
				<input schema="LEAD">
					GO database
				</input>
			</implementation>
			<!-- 
			<implementation>
				<script language="regex">
/^(.*?\t){4}GO:00(00998|01071|06950|07610|0960[57]|09628|09719|42221|4858[345]|50896|51716)/
				</script>
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
			 -->
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000008" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		<status date="2010-04-01">
			Proposed
		</status>
	</rule>
	<rule>
		<id>GO_AR:0000009</id>
		<title>Annotation Intersection Alerts</title>
		<contact>val@sanger.ac.uk</contact>
		<description format="text">To be added</description>
		<implementation_list />
		<status date="2010-04-01">Proposed</status>
	</rule>
	<rule>
		<id>
			GO_AR:0000010
		</id>
		<title>PubMed reference formatting must be correct</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<description format="html">
<![CDATA[
			<p>References in the GAF (Column 6) should be of the format <span class="fmt">db_name:db_key|PMID:12345678</span>, e.g. <span class="fmt">SGD_REF:S000047763|PMID:2676709</span>. No other format is acceptable for PubMed references; the following examples are invalid:
			</p>
			<ul><li>PMID:PMID:14561399</li>
			<li>PMID:unpublished</li>
			<li>PMID:.</li>
			<li>PMID:0</li>
			</ul>
			<p>This is proposed as a HARD QC check: incorrectly formatted references will be removed.</p>
]]>
		</description>
		<implementation_list>
			<implementation>
				<script language="SQL">
SELECT gene_product.symbol,
       CONCAT(gpx.xref_dbname, ':', gpx.xref_key) AS gpxref,
       IF(association.is_not=1,"NOT","") AS 'not',
       term.acc,
       term.name,
       evidence.code,
       CONCAT(dbxref.xref_dbname, ':', dbxref.xref_key) AS evxref,
       db.name AS assigned_by
FROM   association
       INNER JOIN evidence
         ON association.id = evidence.association_id
       INNER JOIN gene_product
         ON association.gene_product_id = gene_product.id
       INNER JOIN term
         ON association.term_id = term.id
       INNER JOIN dbxref
         ON evidence.dbxref_id = dbxref.id
       INNER JOIN dbxref AS gpx
         ON gene_product.dbxref_id = gpx.id
       INNER JOIN db
         ON association.source_db_id=db.id
WHERE  dbxref.xref_dbname = 'PMID'
       AND dbxref.xref_key REGEXP '^[^0-9]'
					</script>
				<input schema="LEAD">
					GO database
				</input>
			</implementation>
			<implementation>
				<script language="regex">
/^(.*?\t){5}([^\t]\|)*PMID:(?!\d+)/
				</script>
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		<status date="2010-04-01">
			Proposed
		</status>
	</rule>
	<rule>
		<id>
			GO_AR:0000011
		</id>
		<title>ND annotations to root nodes only</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
		<description format="html">
<![CDATA[
	<p>The <a class="evCode" href="http://www.geneontology.org/GO.evidence.shtml#nd">No Data (ND) evidence code</a> should be used for annotations to the root nodes only and should be accompanied with <a href="http://www.geneontology.org/cgi-bin/references.cgi#GO_REF:0000015">GO_REF:0000015</a> or an internal reference. PMIDs <strong>cannot</strong> be used for annotations made with <span class="evCode">ND</span>.
	</p>
	<ul>
	<li>
		if you are using an internal reference, that reference ID should be listed as an external accession for <a href="http://www.geneontology.org/cgi-bin/references.cgi#GO_REF:0000015">GO_REF:0000015</a>. Please add (or email) your internal reference ID for GO_REF:0000015.
	</li>
	<li>
		All <span class="evCode">ND</span> annotations made with a reference other than GO_REF:0000015 (or an equivalent internal reference that is listed as external accession for GO_REF:0000015) should be filtered out of the GAF.
	</li>
	</ul>
	<p>
		The SQL code identifies all <span class="evCode">ND</span> annotations that do not use GO_REF:0000015 or one of the alternative internal references listed for it in the <a href="http://www.geneontology.org/cgi-bin/references.cgi">GO references file</a>.
	</p>
	]]>
		</description>
		<implementation_list>
			<implementation>
				<script language="SQL">
SELECT gene_product.symbol,
       CONCAT(gpx.xref_dbname, ':', gpx.xref_key),
       IF(association.is_not = 1, "NOT", "") AS 'not',
       term.acc,
       term.name,
       evidence.code,
       CONCAT(dbxref.xref_dbname, ':', dbxref.xref_key) AS evxref,
       db.name AS assigned_by
FROM   association
       INNER JOIN evidence
         ON association.id = evidence.association_id
       INNER JOIN gene_product
         ON association.gene_product_id = gene_product.id
       INNER JOIN term
         ON association.term_id = term.id
       INNER JOIN dbxref
         ON evidence.dbxref_id = dbxref.id
       INNER JOIN dbxref AS gpx
         ON gpx.id = gene_product.dbxref_id
       INNER JOIN db
         ON association.source_db_id = db.id
WHERE  ( evidence.code = 'ND'
         AND term.acc NOT IN ( 'GO:0005575', 'GO:0003674', 'GO:0008150' ) )
        OR ( NOT(evidence.code = 'ND')
             AND term.acc IN ( 'GO:0005575', 'GO:0003674', 'GO:0008150' ) )
        OR ( evidence.code = 'ND'
             AND ( CONCAT(dbxref.xref_dbname, ':', dbxref.xref_key) NOT IN (
                         'GO_REF:0000015', 'FB:FBrf0159398',
                         'ZFIN:ZDB-PUB-031118-1',
                         'dictyBase_REF:9851',
                         'MGI:MGI:2156816',
                         'SGD_REF:S000069584', 'CGD_REF:CAL0125086',
                         'RGD:1598407',
                         'TAIR:Communication:1345790',
                         'AspGD_REF:ASPL0000111607' ) ) )
					</script>
				<input schema="LEAD">
					GO database
				</input>
			</implementation>
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000011" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
		</implementation_list>
		<status date="2010-04-01">
			Proposed
		</status>
	</rule>
<rule>
	<id>GO_AR:0000013</id>
	<title>Taxon-appropriate annotation check</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
	<description format="html">
<![CDATA[
	<p>GO taxon constraints ensure that annotations are not made to inappropriate species or sets of species. See <a rel="external" href="http://www.biomedcentral.com/1471-2105/11/530">http://www.biomedcentral.com/1471-2105/11/530</a> for more details.
	</p>
]]>
	</description>
	<implementation_list>
		<implementation status="active">
			<script language="java" source="org.geneontology.gold.rules.AnnotationTaxonRule" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
		</implementation>
	</implementation_list>
	<status date="2011-04-12">Proposed</status>
</rule>

<rule>
	<id>GO_AR:0000014</id>
	<title>Valid GO term ID</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
	<description format="txt">This check ensures that the GO IDs used for annotations are valid IDs and are not obsolete.</description>
	<implementation_list>
		<implementation status="active">
			<script language="java" source="org.geneontology.gold.rules.GoClassReferenceAnnotationRule" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
		</implementation>
	</implementation_list>
	<status date="2011-04-12">Proposed</status>
</rule>

<rule>
	<id>GO_AR:0000015</id>
	<title>Dual species taxon check</title>
	<contact>go-discuss@lists.stanford.edu</contact>
	<description format="html">Dual species annotations are used to capture information about multi-organism interactions. The first taxon ID should be that of the species encoding the gene product, and the second should be the taxon of the other species in the interaction. Where the interaction is between organisms of the same species, both taxon IDs should be the same. These annotations should be used only in conjunction with terms that have the biological process term 'GO:0051704 : multi-organism process' or the cellular component term 'GO:0044215 : other organism' as an ancestor.</description>
	<implementation_list>
		<!-- 
		<implementation status="active">
			<script language="java" source="org.geneontology.rules.GO_AR_0000015" />
			<input format="gaf1.0" />
			<input format="gaf2.0" />
		</implementation>
		 -->
	</implementation_list>
<!--
## all multi-spp interactions:
SELECT DISTINCT
       term.acc, term.name,
       s1.ncbi_taxa_id, s1.genus, s1.species, s1.common_name,
       s2.ncbi_taxa_id, s2.genus, s2.species, s2.common_name
FROM   association_species_qualifier AS asq
       INNER JOIN association AS a
         ON (a.id=asq.association_id)
       INNER JOIN gene_product AS g
         ON (a.gene_product_id=g.id)
       INNER JOIN species AS s1
         ON (g.species_id=s1.id)
       INNER JOIN species AS s2
         ON (asq.species_id=s2.id)
       INNER JOIN term
         ON a.term_id=term.id
ORDER BY s1.ncbi_taxa_id, s2.ncbi_taxa_id, term.acc

## those with parentage under the multispp terms:
SELECT DISTINCT
       term.acc, term.name,
       s1.ncbi_taxa_id, s1.genus, s1.species, s1.common_name,
       s2.ncbi_taxa_id, s2.genus, s2.species, s2.common_name
FROM   association_species_qualifier AS asq
       INNER JOIN association AS a
         ON (a.id=asq.association_id)
       INNER JOIN gene_product AS g
         ON (a.gene_product_id=g.id)
       INNER JOIN species AS s1
         ON (g.species_id=s1.id)
       INNER JOIN species AS s2
         ON (asq.species_id=s2.id)
       INNER JOIN term
         ON a.term_id=term.id
       INNER JOIN graph_path
         ON term.id=term2_id
       INNER JOIN term as term2
         ON term2.id=term1_id
WHERE  term2.acc='GO:0044217' OR term2.acc='GO:0051704'
ORDER BY s1.ncbi_taxa_id, s2.ncbi_taxa_id, term.acc

-->
	<status date="2011-08-25">Proposed</status>
</rule>

<rule>
	<id>GO_AR:0000016</id>
	<title>IC annotations require a With/From GO ID</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
	<description format="html">
<![CDATA[
<p>All IC annotations should include a GO ID in the "With/From" column; for more information, see the <a href="http://www.geneontology.org/GO.evidence.shtml#ic">IC evidence code guidelines</a>.</p>]]>
</description>
	<implementation_list>
			<implementation status="active">
				<script language="perl" source="http://www.geneontology.org/software/utilities/filter-gene-association.pl" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
				<when_performed>on_submit</when_performed>
				<output>warn_and_remove</output>
			</implementation>
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000016" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
	</implementation_list>
	<status date="2012-02-01">Approved</status>
</rule>

<rule>
	<id>GO_AR:0000017</id>
	<title>IDA annotations must not have a With/From entry</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
	<description format="text">Use IDA only when no identifier can be placed in the "With/From" column. When there is an appropriate ID for the "With/From" column, use IPI.</description>
	<implementation_list>
			<implementation status="active">
				<script language="perl" source="http://www.geneontology.org/software/utilities/filter-gene-association.pl" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
				<when_performed>on_submit</when_performed>
				<output>warn_and_remove</output>
			</implementation>
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000017" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
	</implementation_list>
	<status date="2012-02-01">Approved</status>
</rule>

<rule>
	<id>GO_AR:0000018</id>
	<title>IPI annotations require a With/From entry</title>
		<contact>
			edimmer@ebi.ac.uk
		</contact>
		<contact>
			rama@genome.stanford.edu
		</contact>
	<description format="html">
<![CDATA[
<p>All IPI annotations should include a nucleotide/protein/chemical identifier in the "With/From" column (column 8). From the <a href="http://www.geneontology.org/GO.evidence.shtml#ipi">description of IPI in the GO evidence code guide</a>: "We strongly recommend making an entry in the with/from column when using this evidence code to include an identifier for the other protein or other macromolecule or other chemical involved in the interaction. When multiple entries are placed in the with/from field, they are separated by pipes. Consider using IDA when no identifier can be entered in the with/from column." All annotations made after January 1 2012 that break this rule will be removed.</p>
]]>
</description>
	<implementation_list>
			<implementation status="active">
				<script language="perl" source="http://www.geneontology.org/software/utilities/filter-gene-association.pl" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
				<when_performed>on_submit</when_performed>
				<output>warn_and_remove</output>
			</implementation>
			<implementation status="active">
				<script language="java" source="org.geneontology.rules.GO_AR_0000018" />
				<input format="gaf1.0" />
				<input format="gaf2.0" />
			</implementation>
	</implementation_list>
	<status date="2012-02-01">Approved</status>
</rule>

	<rule>
		<id>
			GO_AR:0000019
		</id>
		<title>Generic Reasoner Validation Check</title>
		<contact>
			hdietze@lbl.gov
		</contact>
		<description format="html">
<![CDATA[
		<p>The entire GAF is converted to OWL, combined with
                  the main GO ontology and auxhiliary constraint
                  ontologies. The resulting ontology is checked for
                  consistency and unsatisfiable classes over using a
                  complete DL reasoner such as HermiT. </p> ]]>
		</description>
		<implementation_list>
		  <implementation status="active">
		    <script language="java" source="org.geneontology.gold.rules.GenericReasonerValidationCheck" />
		  </implementation>
		</implementation_list>
		<status date="2011-04-09">
			Implemented
		</status>
	</rule>

	<rule>
		<id>
			GO_AR:0000020
		</id>
		<title>Automatic repair of annotations to merged or obsoleted terms</title>
		<contact>
			ramab@stanford.edu
		</contact>
		<description format="html">
<![CDATA[
		<p>
                  Ontology operations such as term merges and
                  obsoletions may be out of sync with annotation
                  releases. Each GO entry T in the GAF is checked to
                  see if it corresponds to a valid (non-obsolete) term
                  in the ontology. If not, metadata for other terms is
                  checked. If the term has been merged into a term S
                  (i.e. S has alt_id of T) then T is replaced by S in
                  the GAF line. 
      </p> ]]>
		</description>
		<implementation_list>
		</implementation_list>
		<status date="2013-02-06">
			Approved
		</status>
	</rule>

</ruleset>
