The OBO Flat File Format Specification, version 1.3 [DEPRECATED]

Chris Mungall, John Day-Richter
version 1.3 alpha, July 30, 2010

THIS IS A DEPRECATED DOCUMENT. Please refer to the 1.4 guide and specification.

Changes between 1.3 and 1.4

Note that this section is only informative if you were an early adopter of any obo format 1.3 features. If you are not in this category, you can disregard this document and refer to either the 1.2 format guide or the draft 1.4 guide.

Most of the new syntax introduced in 1.3 will also be available in 1.4. This includes:

ID expressions
header macros
equivalent_to, functional, inverse_functional, holds_over_chain

See the 1.4 doc for details.

Deprecation of 1.3 semantics

However, there are semantic changes between 1.3 and 1.4 that impact OBO to OWL translation. The semantics of OBO Format 1.3 were specified directly in First Order Logic / ISO Common Logic (see the now obsolete Obolog specification). This was to allow for a natural specification of relations using the strategy outlined in the Relation Ontology paper. OBO Format 1.3 formally allowed type level relations as distinct from instance level relations, and allowed these to be specified in terms of temporally indexed instance-level relations.

This strategy resulted in complications with the translation to OWL. OBO Format 1.4 abandons the semantics specified in the Obolog spec and the type level relations as specified in the RO paper. The semantics of OBO Format 1.4 will instead be specified directly in terms of OWL, and relations such as part_of will be binary instance level relations, explicitly or implicitly quantified.

Changes between 1.3 and 1.2

Any changes between OBO-Format 1.2 and 1.3 are either in highlighted text, or with a coloured bar at the left side of the page.

The remainder of this document is deprecated - please refer to the 1.4 guide

Abstract

The OBO flat file format is an ontology representation language. The concepts it models represent a subset of the concepts in the OWL description logic language, with several extensions for meta-data modelling and the modelling of concepts that are not supported in DL languages.

The format itself attempts to achieve the following goals:

Human readability
Ease of parsing
Extensibility
Minimal redundancy

OBO Format Syntactic Structure

The format is similar to the tag-value format of the GO definitions file, with a few modifications. One important difference is that unrecognized tags in any context do not necessarily generate fatal errors (although some parsers may decide to do so; see Parser Requirements below). This allows parsers to read files that contain information not used by a particular tool.

OBO Document Structure

An OBO document is structured as follows:

<header>

<stanza>

<stanza>

...

Blank lines are ignored.

The header is an unlabeled section at the beginning of the document containing tag-value pairs. The header ends when the first stanza is encountered.

A stanza is a labeled section of the document, indicating that an object of a particular type is being described. Stanzas consist of a stanza name in square brackets, and a series of tag-value pairs, structured as follows:

[<Stanza name>]
<tag-value pair>
<tag-value pair>
<tag-value pair>
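The document layout described above can be sketched in a few lines of Python. This is only an illustration, not a reference parser: the function name split_document is hypothetical, and the sketch assumes that ! comments have already been removed and that a stanza header occupies a line of its own.

```python
def split_document(text):
    """Split OBO text into (header_lines, stanzas).

    Each stanza is a (stanza_name, lines) pair.
    """
    header, stanzas = [], []
    current = header  # lines belong to the header until the first stanza
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored
        if line.startswith("[") and line.endswith("]"):
            current = []  # a stanza name in square brackets starts a new stanza
            stanzas.append((line[1:-1], current))
        else:
            current.append(line)
    return header, stanzas


doc = """format-version: 1.3

[Term]
id: GO:0005623
name: cell

[Typedef]
id: part_of
"""
header, stanzas = split_document(doc)
print(header)
print(stanzas)
```

Unrecognized stanza names are kept as-is, which leaves a caller free to round-trip them.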

Comments

An OBO file may contain any number of lines beginning with !, at any point in the file. These lines are ignored by parsers.

Further, any line may end with a ! comment. Parsers that encounter an unescaped ! will ignore the ! and all data until the end of the line. \<newline> sequences are not allowed in ! comments (see escape characters).
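As a rough illustration of the comment rule, the following Python sketch removes everything from the first unescaped ! to the end of the line. The helper name strip_comment is hypothetical, and the sketch assumes that a backslash escapes exactly one following character.

```python
def strip_comment(line):
    """Return the line with any trailing ! comment removed."""
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "!":
            return line[:i].rstrip()  # drop the ! and the rest of the line
        i += 1
    return line.rstrip()


print(strip_comment("is_a: GO:0005575 ! cellular_component"))
print(repr(strip_comment("! a whole-line comment")))
```

A whole-line comment reduces to an empty string, which a caller can then skip like a blank line.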

Tag-Value Pairs

Tag-value pairs consist of a tag name, an unescaped colon, the tag value, and a newline:

<tag>: <value> {<trailing modifiers>} ! <comment>

The tag name is always a string. The value is always a string, but the value string may require special parsing depending on the tag with which it is associated.

In general, tag-value pairs occur on a single line. Multi-line values are possible using escape characters (see escape characters).

In general, each stanza type expects a particular set of pre-defined tags. However, a stanza may contain any tag. If a parser does not recognize a tag name for a particular stanza, no error will be generated. This allows new experimental tags to be added without breaking existing parsers. See handling unrecognized tags for specifics.
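A minimal Python sketch of splitting a tag-value pair at the first unescaped colon follows. The name split_tag_value is hypothetical, and the sketch assumes that comments and trailing modifiers have already been removed from the line.

```python
def split_tag_value(line):
    """Split a line into (tag, value) at the first unescaped colon."""
    i = 0
    while i < len(line):
        if line[i] == "\\":
            i += 2  # an escaped character never acts as the separator
            continue
        if line[i] == ":":
            # Spaces after the separator are discarded; trailing
            # whitespace is also trimmed here as a convenience.
            return line[:i], line[i + 1:].strip()
        i += 1
    raise ValueError(f"no unescaped colon in line: {line!r}")


print(split_tag_value("id: GO:0005623"))
print(split_tag_value("experimental_tag: some value"))
```

Because the function does not check the tag name against a list of known tags, unrecognized tags pass through without error, in line with the behaviour described above.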

Trailing Modifiers

Any tag-value pair may be followed by a trailing modifier. Trailing modifiers have been introduced into the OBO 1.2 Specification to allow the graceful addition of new features to existing tags.

A trailing modifier has the following structure:

{<name>=<value>, <name>=<value>, <name>=<value>}

That is, trailing modifiers are lists of name-value pairs.

Parser implementations may choose to decode and/or round-trip these trailing modifiers. However, this is not required. A parser may choose to ignore or strip away trailing modifiers.

For this reason, trailing modifiers should only include information that is optional or experimental.

Trailing modifiers may also occur within dbxref definitions (see dbxref formatting).
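For parsers that do choose to decode trailing modifiers, a simple Python sketch might look like the following. The function name parse_trailing_modifier is hypothetical, and the sketch assumes that names and values contain no escaped commas, braces or equals signs.

```python
def parse_trailing_modifier(text):
    """Parse '{name=value, name=value}' into a dict of strings."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError(f"not a trailing modifier: {text!r}")
    pairs = {}
    for item in text[1:-1].split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty modifier such as '{}'
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {item!r}")
        pairs[name.strip()] = value.strip()
    return pairs


print(parse_trailing_modifier("{minCardinality=2, derived=true}"))
```

Values are kept as strings; interpreting them (for example as integers or booleans) is left to the code that knows the tag.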

Escape characters

Tag names and values may contain the following escape characters:

\n: newline
\W: single space
\t: tab
\:: colon
\,: comma
\": double quote
\\: backslash
\(: open parenthesis
\): close parenthesis
\[: open bracket
\]: close bracket
\{: open brace
\}: close brace
@: at (language tag)
\<newline>: <no value>

Escaped characters should only be used when a literal character is needed (that is, a character that the parser should not interpret as having a special meaning when parsing). Some tag values may contain unescaped colons, brackets, quotes, etc., that have meaning in decoding the tag value. Unescaped spaces between the separator colon and the start of the value tag are discarded.

OBO parser implementations may support only these escape characters, or they may assume that any character following a backslash is an escaped character. Parsers that choose the latter approach will translate \a and \? to "a" and "?" respectively.
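The lenient approach (any character following a backslash is treated as escaped) can be sketched in Python as follows. The name unescape and the ESCAPES table are illustrative only; a stricter parser would reject sequences outside the list above.

```python
# Escapes that map to a different character; all others map to themselves.
ESCAPES = {"n": "\n", "W": " ", "t": "\t"}


def unescape(value):
    """Decode OBO escape sequences using the lenient interpretation."""
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            if nxt == "\n":
                pass  # backslash-newline contributes no value
            else:
                out.append(ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(unescape(r"ratio 1\:2 \(approx\)"))
print(unescape(r"\a\?"))
```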

Identifier Syntax

Identifiers (IDs) in OBO should be strings consisting of an IDSpace concatenated to a LocalID via a : (colon) character. The ID should not contain any whitespace. The IDSpace should not itself contain any colon characters, and should ideally be registered on the GO xrefs page or with OBO.
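As an illustration, an ID can be split into its IDSpace and LocalID at the first colon. The Python helper below is a sketch with a hypothetical name (split_id); it returns None for the IDSpace when the ID has no colon, as is common for unprefixed relation IDs such as part_of.

```python
def split_id(identifier):
    """Return (idspace, local_id); idspace is None if there is no colon."""
    if any(ch.isspace() for ch in identifier):
        raise ValueError(f"ID must not contain whitespace: {identifier!r}")
    # The IDSpace contains no colon, so the first colon is the separator.
    idspace, sep, local_id = identifier.partition(":")
    if not sep:
        return None, identifier
    return idspace, local_id


print(split_id("GO:0005623"))
print(split_id("part_of"))
```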

Identifier Expressions

Starting with OBO 1.3, IDs may also be ID Expressions. An ID Expression acts just like an ID. However, the ID need not be declared in a Term stanza; the ID itself contains information defining itself, using OBO intersection_of semantics.

An ID Expression is any ID that conforms to the recursive pattern ID '^' Relation '(' ID ')' [ '^' Relation '(' ID ')' ... ]. It is treated as a genus-differentia description, with the first ID being the genus, and successive Relation(ID) parts constituting the differentia.

For example: GO:0005737^part_of(CL:0000023) can be used wherever one wants to say "cytoplasm of oocyte". This is treated as if it has the following definition:

[Term]
id: GO:0005737^part_of(CL:0000023)
intersection_of: GO:0005737 ! cytoplasm
intersection_of: part_of CL:0000023 ! oocyte

This is known as post-composition. We can refer to an unnamed entity (i.e. one with no ID in any ontology) by describing it via a logical expression. See the Obolog document for the formal semantics of these expressions.
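The expansion of an ID Expression into its implied definition can be sketched in Python. This is a simplified illustration: the function names are hypothetical, and nested expressions (an ID Expression inside the parentheses) are assumed not to occur.

```python
import re

# One differentia: '^' Relation '(' ID ')', with no nesting.
_DIFFERENTIA = re.compile(r"\^([^\s^()]+)\(([^\s^()]+)\)")


def parse_id_expression(expr):
    """Return (genus_id, [(relation, filler_id), ...])."""
    genus, caret, rest = expr.partition("^")
    if not caret or not genus:
        raise ValueError(f"not an ID expression: {expr!r}")
    rest = "^" + rest
    differentia, pos = [], 0
    while pos < len(rest):
        match = _DIFFERENTIA.match(rest, pos)
        if match is None:
            raise ValueError(f"malformed differentia in: {expr!r}")
        differentia.append((match.group(1), match.group(2)))
        pos = match.end()
    return genus, differentia


def to_stanza(expr):
    """Render the [Term] stanza that the expression is treated as."""
    genus, differentia = parse_id_expression(expr)
    lines = ["[Term]", f"id: {expr}", f"intersection_of: {genus}"]
    lines += [f"intersection_of: {rel} {filler}" for rel, filler in differentia]
    return "\n".join(lines)


print(to_stanza("GO:0005737^part_of(CL:0000023)"))
```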

Built-in OBO Semantics

Document Header Tags

Required tags

format-version: Gives the OBO specification version that this file uses. This is useful if tag semantics change from one OBO specification version to the next.

Optional tags

data-version

Gives the version of the current ontology.

version

Deprecated. Use data-version instead.

date

The current date in dd:MM:yyyy HH:mm format.
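For illustration, the dd:MM:yyyy HH:mm pattern corresponds to the following strptime/strftime format in Python. The helper names are hypothetical, and the example date is made up.

```python
from datetime import datetime

# dd:MM:yyyy HH:mm expressed as a strptime/strftime pattern.
OBO_DATE_FORMAT = "%d:%m:%Y %H:%M"


def parse_obo_date(value):
    """Parse the value of a date header tag."""
    return datetime.strptime(value.strip(), OBO_DATE_FORMAT)


def format_obo_date(moment):
    """Format a datetime for use in a date header tag."""
    return moment.strftime(OBO_DATE_FORMAT)


stamp = parse_obo_date("30:07:2010 14:05")
print(stamp.isoformat())
print(format_obo_date(stamp))
```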

saved-by

The username of the person to last save this file. The meaning of "username" is entirely up to the application that generated the file.

auto-generated-by

The program that generated the file.

subsetdef

A description of a term subset. The value for this tag should contain a subset name, a space, and a quote enclosed subset description, as follows:

subsetdef: GO_SLIM "GO Slim"

import

A URL pointing to another OBO document. The contents of the target document will be appended to this document at parse time. If the target document also contains import statements, they will be resolved. This tag replaces the typeref tag from earlier versions of the OBO spec.

synonymtypedef

A description of a user-defined synonym type. The value for this tag should contain a synonym type name, a space, a quote enclosed description, and an optional scope specifier, as follows:

synonymtypedef: UK_SPELLING "British spelling" EXACT

The scope specifier indicates the default scope for any synonym that has this type. See the synonym section of tags in a term stanza for more information on the scope specifier.
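A synonymtypedef value can be taken apart with a regular expression. The Python sketch below is illustrative only; the function name is hypothetical, and the description is returned without unescaping.

```python
import re

# name, quoted description, optional scope specifier
_SYNONYMTYPEDEF = re.compile(
    r'^(\S+)\s+"((?:[^"\\]|\\.)*)"(?:\s+(EXACT|BROAD|NARROW|RELATED))?$'
)


def parse_synonymtypedef(value):
    """Return (type_name, description, default_scope_or_None)."""
    match = _SYNONYMTYPEDEF.match(value.strip())
    if match is None:
        raise ValueError(f"malformed synonymtypedef value: {value!r}")
    return match.groups()


print(parse_synonymtypedef('UK_SPELLING "British spelling" EXACT'))
```

A subsetdef value has the same shape without the scope specifier, so the same pattern would accept it with the third element set to None.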

idspace

A mapping between a "local" ID space and a "global" ID space. The value for this tag should be a local idspace, a space, a URI, optionally followed by a quote-enclosed description, like this:

idspace: GO urn:lsid:bioontology.org:GO: "gene ontology terms"

default-relationship-id-prefix

Any relationship lacking an ID space will be prefixed with the value of this tag. For example:

default-relationship-id-prefix: OBO_REL

The above will make sure that all relations referred to in the current file come from the OBO relations ontology, unless otherwise specified.

The scope of this tag is within the current file only. See also id-mapping, below.

id-mapping

Maps a Term or Typedef ID to another Term or Typedef ID. The main reason for this tag is to increase interoperability between different OBO ontologies.

id-mapping: part_of OBO_REL:part_of

This maps all cases of the unqualified relationship part_of to the ID OBO_REL:part_of defined in the OBO relations ontology.

The scope of this tag is within the current file only. Note that the default-relationship-id-prefix tag takes precedence over this tag.
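One possible reading of how these two header tags interact is sketched below in Python. The function name and its parameters are hypothetical, and the precedence logic is an interpretation of the text above rather than a normative algorithm: an unprefixed relation ID first receives the default prefix if one is set, and the id-mapping table is consulted only otherwise.

```python
def resolve_relation_id(rel_id, default_prefix=None, id_mapping=None):
    """Resolve a relation ID using the two header tags (one interpretation)."""
    # default-relationship-id-prefix applies to IDs lacking an ID space
    # and is given precedence over id-mapping.
    if ":" not in rel_id and default_prefix:
        return f"{default_prefix}:{rel_id}"
    if id_mapping and rel_id in id_mapping:
        return id_mapping[rel_id]
    return rel_id


print(resolve_relation_id("part_of", default_prefix="OBO_REL"))
print(resolve_relation_id("part_of", id_mapping={"part_of": "OBO_REL:part_of"}))
```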

remark

General comments for this file. This tag is differentiated from a ! comment in that the contents of a remark tag are guaranteed to be preserved by a parser.

The following tags are new in OBO 1.3:

treat-xrefs-as-equivalent

The value for this tag should contain an ID Space, ideally one declared here.

Macro. Treats all xrefs coming from a particular ID-Space as being statements of exact equivalence. Normally, xrefs have no special meaning beyond "This xref is of relevance to the current entity".

Example:

treat-xrefs-as-equivalent: CL
.
.
.
[Term]
id: GO:0005623
name: cell
xref: CL:0000000

This declares CL:0000000 and GO:0005623 to be equivalent in what they reference.

treat-xrefs-as-genus-differentia

The value for this tag should contain an ID Space followed by a relation and then a class filler.

Macro. Treats all xrefs coming from a particular ID-Space as being genus-differentia definitions (cross products, logical definitions, intersection definitions). Normally, xrefs have no special meaning beyond "This xref is of relevance to the current entity".

Example:

treat-xrefs-as-genus-differentia: CL part_of NCBITaxon:7955
.
.
[Term]
id: ZFA:0000134
name: neuron
xref: ZFIN:ZDB-ANAT-010921-563
xref: CL:0000540

is treated as if it states:

[Term]
id: ZFA:0000134
name: neuron
xref: ZFIN:ZDB-ANAT-010921-563
intersection_of: CL:0000540
intersection_of: part_of NCBITaxon:7955
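The rewrite performed by this macro can be sketched in Python. The function name and the representation of a stanza as a list of (tag, value) pairs are assumptions made for illustration; the sketch also assumes at most one xref per stanza comes from the macro's ID Space.

```python
def expand_genus_differentia(tags, idspace, relation, filler):
    """Rewrite xrefs from `idspace` as a genus-differentia definition.

    `tags` is a list of (tag, value) pairs for a single [Term] stanza.
    """
    expanded = []
    for tag, value in tags:
        if tag == "xref" and value.split(":", 1)[0] == idspace:
            expanded.append(("intersection_of", value))  # the genus
            expanded.append(("intersection_of", f"{relation} {filler}"))
        else:
            expanded.append((tag, value))  # other tags pass through unchanged
    return expanded


term = [
    ("id", "ZFA:0000134"),
    ("name", "neuron"),
    ("xref", "ZFIN:ZDB-ANAT-010921-563"),
    ("xref", "CL:0000540"),
]
for tag, value in expand_genus_differentia(term, "CL", "part_of", "NCBITaxon:7955"):
    print(f"{tag}: {value}")
```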

treat-xrefs-as-relationship

The value for this tag should contain an ID Space followed by a relation ID.

Macro. Treats all xrefs coming from a particular ID-Space as being relationships. Normally, xrefs have no special meaning beyond "This xref is of relevance to the current entity".

Example:

treat-xrefs-as-relationship: MA homologous_to

This declares all xrefs to MA to be homology relationships.

treat-xrefs-as-is_a

The value for this tag should contain an ID Space

Macro. Treats all xrefs coming from a particular ID-Space as being is_a relationships. Normally, xrefs have no special meaning beyond "This xref is of relevance to the current entity".

Example:

treat-xrefs-as-is_a: CL

This declares all xrefs to CL to be is_a relations.

relax-unique-identifier-assumption-for-namespace

The value for this tag should be a namespace

By default, an obo namespace (note: not ID-space) partitions all the entities such that no two entities belonging to the same namespace may be equivalent. This header tag relaxes this assumption.

Note that this assumption does not hold by default *between* namespaces (it is OK for cellular_component and cell to use different identifiers to denote the type "cell").

It is unlikely that this tag will be used frequently. One scenario in which it may be useful is if a single ontology is created from multiple sources, with redundancy:

relax-unique-identifier-assumption-for-namespace: my_combined_ontology

relax-unique-label-assumption-for-namespace

The value for this tag should be a namespace

By default, an obo namespace (note: not ID-space) partitions all the entities such that no two entities belonging to the same namespace may have the same name tag (obsolete entities omitted). This header tag relaxes this assumption.

Note that this assumption does not hold by default *between* namespaces (it is OK for mouse_anatomy and fma to use the same names).

It is recommended that the unique label assumption should be maintained at all times. However, there may be times at an early stage of ontology development where this is relaxed.

relax-unique-label-assumption-for-namespace: my_combined_ontology

Stanzas

At present, all Term, Typedef and Instance stanzas always begin with an id tag. The value of the id tag announces the object to which the rest of the tags in the stanza refer. Normal, non-anonymous ids have global scope. An object has the same id in every file, and in every namespace. See ID Syntax.

The id tag may be optionally followed by an is_anonymous tag. If the value of is_anonymous is true, the object is anonymous. The id of an anonymous object is not fixed; if the ontology is parsed and then reserialized, the id may change. Anonymous ids have local scope; they are only valid in the file from which they were loaded. The same anonymous id in two different files refers to a different object in each file.

Any given stanza does not have to contain all the required tags. A file (or collection of files) may contain multiple stanzas that describe different aspects of an object. A required tag must be specified at least once for each object in a given set of files. This makes it possible for optional information to be stored in a separate file, and only loaded when necessary.

This means that parsers must wait until the end of the parse batch to check whether required information is missing. Multiple descriptions may produce parse errors if:

A stanza contains tags that contradict a previous stanza (i.e. one term description gives a different term name than another description)
A parser has processed all the files in a batch, but an object is still missing some required value (such as a term name).
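The two error conditions above suggest a two-phase approach: merge stanzas by id while detecting contradictions, then check for required tags only after the whole batch has been read. The Python sketch below illustrates this. The names merge_batch and SINGLE_VALUED, the stanza representation, and the choice of name as the required tag in the example are all assumptions for illustration.

```python
SINGLE_VALUED = {"name", "def", "comment"}  # at most one distinct value each


def merge_batch(stanzas, required=("name",)):
    """Merge stanzas describing the same object; return (objects, errors).

    Each stanza is a list of (tag, value) pairs that includes an id tag.
    """
    objects, errors = {}, []
    for tags in stanzas:
        obj_id = next(value for tag, value in tags if tag == "id")
        merged = objects.setdefault(obj_id, {})
        for tag, value in tags:
            if tag == "id":
                continue
            values = merged.setdefault(tag, [])
            if tag in SINGLE_VALUED and values and value not in values:
                errors.append(f"{obj_id}: conflicting {tag}")
            elif value not in values:
                values.append(value)
    # Required tags can only be checked once the whole batch is loaded.
    for obj_id, merged in objects.items():
        for tag in required:
            if tag not in merged:
                errors.append(f"{obj_id}: missing required tag {tag}")
    return objects, errors


batch = [
    [("id", "GO:1"), ("name", "cell")],
    [("id", "GO:1"), ("comment", "stored in a second file")],
    [("id", "GO:2"), ("comment", "never given a name")],
]
objects, errors = merge_batch(batch)
print(objects["GO:1"])
print(errors)
```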

There are currently five supported stanza types: [Term], [Typedef], [Instance], [Annotation] and [Formula]. Parsers/serializers will round-trip (successfully load and save) unrecognized stanzas.

Stanza Types

Term: Term stanzas constitute the nodes in an ontology graph. Each Term stanza should correspond to a particular type of entity (also known as a class or universal), which is known by a unique term or name.
Typedef: Typedef stanzas constitute the edge labels that may be used in an ontology graph. Also known as relations, relationship types, properties or predicates. The name "Typedef" is somewhat confusing but is retained for forwards compatibility.
Instance: Instance stanzas are used to represent the spatiotemporal particulars that instantiate types. Note that instances are typically not represented in ontologies. OBO allows them for completeness, to allow generalized data exchange and for compatibility with other languages.

Annotation: Annotation stanzas are used to represent statements, and to attach metadata to these statements (for users familiar with OWL terminology, these correspond to annotated axioms).
Formula -- DEPRECATED (1.4 provides syntax for specifying OWL macros): Stanza used to represent a logical or mathematical formula that stands independently of any one particular type or relation

Tags in a [Term] Stanza

Required tags

id: The unique id of the current term.

Optional tags

name

The term name. A term may have at most one name; if multiple term names are defined, it is a parse error. In 1.2, name was required. This has been relaxed in 1.3, which helps with OWL interoperability because labels are optional in OWL.

is_anonymous

Whether or not the current object has an anonymous id

alt_id

Defines an alternate id for this term. A term may have any number of alternate ids.

def

The definition of the current term. There must be zero or one instances of this tag per term description. More than one definition for a term generates a parse error. The value of this tag should be the quote enclosed definition text, followed by a dbxref list containing dbxrefs that describe the origin of this definition (see dbxref formatting for information on how dbxref lists are encoded). An example of this tag would look like this:

def: "The breakdown into simpler components of (+)-camphor, a bicyclic monoterpene ketone." [UM-BBD:pathway "", http://umbbd.ahc.umn.edu/cam/cam_map.html ""]

comment

A comment for this term. There must be zero or one instances of this tag per term description. More than one comment for a term generates a parse error.

subset

This tag indicates a term subset to which this term belongs. The value of this tag must be a subset name as defined in a subsetdef tag in the file header. If the value of this tag is not mentioned in a subsetdef tag, a parse error will be generated. A term may belong to any number of subsets.

synonym

This tag gives a synonym for this term, some xrefs to describe the origins of the synonym, and may indicate a synonym category or scope information.

The value consists of a quote enclosed synonym text, an optional scope identifier, an optional synonym type name, and an optional dbxref list, like this:

synonym: "The other white meat" EXACT MARKETING_SLOGAN [MEAT:00324, BACONBASE:03021]

The synonym scope may be one of four values: EXACT, BROAD, NARROW, RELATED. If no scope is given, the scope is assumed to be RELATED.

The synonym type must be the id of a synonym type defined by a synonymtypedef line in the header. If the synonym type has a default scope, that scope is used regardless of any scope declaration given by a synonym tag.

The dbxref list is formatted as specified in dbxref formatting. A term may have any number of synonyms.
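The structure of a synonym value, including the scope defaults described above, can be sketched in Python as follows. The function name, the returned dictionary keys and the type_scopes parameter (a mapping from synonym type to its default scope, as would be collected from synonymtypedef header lines) are assumptions for illustration; the dbxref list is returned as raw text.

```python
import re

_SYNONYM = re.compile(
    r'^"((?:[^"\\]|\\.)*)"'                     # quoted synonym text
    r"\s*(?:(EXACT|BROAD|NARROW|RELATED)\b)?"   # optional scope
    r"\s*([^\s\[]+)?"                           # optional synonym type
    r"\s*(\[.*\])?$"                            # optional dbxref list
)


def parse_synonym(value, type_scopes=None):
    """Parse a synonym tag value into its components."""
    match = _SYNONYM.match(value.strip())
    if match is None:
        raise ValueError(f"malformed synonym value: {value!r}")
    text, scope, syn_type, xrefs = match.groups()
    scope = scope or "RELATED"  # scope defaults to RELATED
    if syn_type and type_scopes and type_scopes.get(syn_type):
        # A default scope on the synonym type overrides the declared scope.
        scope = type_scopes[syn_type]
    return {"text": text, "scope": scope, "type": syn_type, "xrefs": xrefs}


print(parse_synonym(
    '"The other white meat" EXACT MARKETING_SLOGAN [MEAT:00324, BACONBASE:03021]'
))
```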

exact_synonym

Deprecated. An alias for the synonym tag with the scope modifier set to EXACT.

narrow_synonym

Deprecated. An alias for the synonym tag with the scope modifier set to NARROW.

broad_synonym

Deprecated. An alias for the synonym tag with the scope modifier set to BROAD.

xref

A dbxref that describes an analogous term in another vocabulary (see dbxref formatting for information about how the value of this tag must be formatted). A term may have any number of xrefs.

xref_analog

Deprecated. An alias for the xref tag.

xref_unk

Deprecated. An alias for the xref tag.

is_a

This tag describes a subclassing relationship between one term and another. The value is the id of the term of which this term is a subclass. A term may have any number of is_a relationships.

Parsers which support trailing modifiers may optionally parse the following trailing modifier tags for is_a:

namespace <any namespace id>
derived true OR false

The namespace modifier allows the is_a relationship to be assigned its own namespace (independent of the namespace of the superclass or subclass of this is_a relationship).

The derived modifier indicates that the is_a relationship was not explicitly defined by a human ontology designer, but was created automatically by a reasoner, and could be re-derived using the non-derived relationships in the ontology.

This tag previously supported the completes trailing modifier. This modifier is now deprecated. Use the intersection_of tag instead.

intersection_of

This tag indicates that this term is equivalent to the intersection of several other terms. The value is either a term id, or a relationship type id, a space, and a term id. For example:

intersection_of: GO:0051319 ! G2 phase
intersection_of: part_of GO:0000278 ! mitotic cell cycle

This means that the term is equivalent to any term that is both a subtype of 'G2 phase' and has a part_of relationship to 'mitotic cell cycle' (i.e. the G2 phase of the mitotic cell cycle). Note that whilst relationship tags specify necessary conditions, intersection_of tags specify necessary and sufficient conditions.

A collection of intersection_of tags appearing in a term is also known as a cross-product definition (this is the same as what OWL users know as a defined class, employing intersectionOf constructs).

It is strongly recommended that all intersection_of tags follow a genus-differentia pattern. In this pattern, one of the tags is directly to a term id (the genus) and the other tags are relation term pairs. For example:

[Term]
id: GO:0045495
name: pole plasm
intersection_of: GO:0005737 ! cytoplasm
intersection_of: part_of CL:0000023 ! oocyte

These definitions can be read as sentences, such as "a pole plasm is a cytoplasm that is part_of an oocyte".

If any intersection_of tags are specified for a term, at least two intersection_of tags need to be present or it is a parse error. The full intersection for the term is the set of all ids specified by all intersection_of tags for that term.
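A minimal Python sketch of collecting the intersection_of values for one term, separating the genus from the differentia and enforcing the two-tag minimum, is shown here. The function name is hypothetical, and the values are assumed to have had their ! comments stripped.

```python
def parse_intersection(values):
    """Split intersection_of values into (genus_ids, differentia).

    `values` holds every intersection_of value given for one term.
    """
    if len(values) == 1:
        raise ValueError("a single intersection_of tag is a parse error")
    genus, differentia = [], []
    for value in values:
        parts = value.split()
        if len(parts) == 1:
            genus.append(parts[0])  # a bare term id
        elif len(parts) == 2:
            differentia.append((parts[0], parts[1]))  # relation, term id
        else:
            raise ValueError(f"malformed intersection_of value: {value!r}")
    return genus, differentia


print(parse_intersection(["GO:0005737", "part_of CL:0000023"]))
```

Under the recommended genus-differentia pattern, the first returned list would contain exactly one id.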

As of OBO 1.3, this tag may be applied in Typedef stanzas

union_of

This tag indicates that this term represents the union of several other terms. The value is the id of one of the other terms of which this term is a union.

If any union_of tags are specified for a term, at least 2 union_of tags need to be present or it is a parse error. The full union for the term is the set of all ids specified by all union_of tags for that term.

This tag may not be applied to relationship types.

Parsers which support trailing modifiers may optionally parse the following trailing modifier tag for union_of:

namespace <any namespace id>

disjoint_from

This tag indicates that a term is disjoint from another, meaning that the two terms have no instances or subclasses in common. The value is the id of the term from which the current term is disjoint. This tag may not be applied to relationship types.

Parsers which support trailing modifiers may optionally parse the following trailing modifier tag for disjoint_from:

namespace <any namespace id>
derived true OR false

The namespace modifier allows the disjoint_from relationship to be assigned its own namespace.

The derived modifier indicates that the disjoint_from relationship was not explicitly defined by a human ontology designer, but was created automatically by a reasoner, and could be re-derived using the non-derived relationships in the ontology.

relationship

This tag describes a typed relationship between this term and another term or terms. The value of this tag should be the relationship type id, and then the id of the target term, plus, optionally, other target terms. The relationship type must be defined in a [Typedef] stanza, which must occur either in a document in the current parse batch or in a file imported via an import header tag. If the relationship type is undefined, a parse error will be generated. If the id of the target term cannot be resolved by the end of parsing the current batch of files, this tag describes a "dangling reference"; see the parser requirements section for information about how a parser may handle dangling references. If a relationship is specified for a term with an is_obsolete value of true, a parse error will be generated.

Parsers which support trailing modifiers may optionally parse the following trailing modifier tags for relationships:

not_necessary true OR false -- DEPRECATED
inverse_necessary true OR false -- DEPRECATED
namespace <any namespace id>
derived true OR false
cardinality any non-negative integer
maxCardinality any non-negative integer
minCardinality any non-negative integer

The not_necessary modifier allows a relationship to be marked as "not necessarily true". The inverse_necessary modifier allows the inverse of a relationship to be marked "necessarily true".

The namespace modifier allows the relationship to be assigned its own namespace (independent of the namespace of the parent, child, or type of the relationship).

The derived modifier indicates that the relationship was not explicitly defined by a human ontology designer, but was created automatically by a reasoner, and could be re-derived using the non-derived relationships in the ontology.

Cardinality qualifiers can be used to specify constraints on the number of relations of the specified type any given instance can have. For example, in the stanza declaring id: SO:0000634 ! polycistronic mRNA, we can say:

relationship: has_part SO:0000316 {minCardinality=2} ! CDS

This means that every instance of a transcript of this type has two or more CDS features that stand in a has_part relationship from the transcript.
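As an illustration of how the cardinality qualifiers might be checked against instance data, the Python sketch below tests a count of relations against decoded trailing modifiers. The function name is hypothetical, and the modifiers are assumed to be a dict of strings such as {"minCardinality": "2"}.

```python
def satisfies_cardinality(count, modifiers):
    """Check a relation count against cardinality trailing modifiers."""
    if "cardinality" in modifiers and count != int(modifiers["cardinality"]):
        return False  # an exact count is required
    if "minCardinality" in modifiers and count < int(modifiers["minCardinality"]):
        return False
    if "maxCardinality" in modifiers and count > int(modifiers["maxCardinality"]):
        return False
    return True


# A polycistronic mRNA instance with two CDS parts meets minCardinality=2.
print(satisfies_cardinality(2, {"minCardinality": "2"}))
print(satisfies_cardinality(1, {"minCardinality": "2"}))
```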

This tag previously supported the completes trailing modifier. This modifier is now deprecated. Use the intersection_of tag instead.

is_obsolete

Whether or not this term is obsolete. Allowable values are "true" and "false" (false is assumed if this tag is not present). Obsolete terms must have no relationships, and no defined is_a, inverse_of, disjoint_from, union_of, or intersection_of tags.

replaced_by

Gives a term which replaces an obsolete term. The value is the id of the replacement term. The value of this tag can safely be used to automatically reassign instances whose instance_of property points to an obsolete term.

The replaced_by tag may only be specified for obsolete terms. A single obsolete term may have more than one replaced_by tag. This tag can be used in conjunction with the consider tag.

consider

Gives a term which may be an appropriate substitute for an obsolete term, but needs to be looked at carefully by a human expert before the replacement is done.

This tag may only be specified for obsolete terms. A single obsolete term may have many consider tags. This tag can be used in conjunction with replaced_by.

use_term

Deprecated. Equivalent to consider.

builtin

Whether or not this term or relation is built in to the OBO format. Allowable values are "true" and "false" (false assumed as default). Rarely used. One example of where this is used is the OBO relations ontology, which provides a stanza for the is_a relation, even though this relation is axiomatic to the language.

Additional tags in 1.3:

created_by: Name of the creator of the term. May be a short username, initials or ID. Example: dph
creation_date: Date of creation of the term specified in ISO 8601 format. Example: 2009-04-13T01:32:36Z

Dbxref Formatting

Dbxref definitions take the following form:

<dbxref name> {optional-trailing-modifier}

<dbxref name> "<dbxref description>" {optional-trailing-modifier}

By convention, the dbxref name is a colon separated key-value pair, but this is not a requirement. If provided, the dbxref description is a string of zero or more characters describing the dbxref. An example of a dbxref would be:

GO:ma "Sprung whole from the head of Michael, like Athena"

Dbxref lists are used when a tag value must contain several dbxrefs. Dbxref lists take the following form:

[<dbxref definition>, <dbxref definition>, ...]

The brackets may contain zero or more comma separated dbxref definitions. An example of a dbxref list would be:

[GO:ma, GO:midori "Midori was drinking and came up with this", GO:john {namespace=johnsirrelevantdbxrefs}]

Note that the trailing modifiers (like all trailing modifiers) do not need to be decoded or round-tripped by parsers; trailing modifiers can always be optionally ignored. However, all parsers must be able to gracefully ignore trailing modifiers. It is important to recognize that lines which accept a dbxref list may have a trailing modifier for each dbxref in the list, and another trailing modifier for the line itself.
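Splitting a dbxref list into its individual definitions requires care, because commas may also appear inside quoted descriptions and inside per-dbxref trailing modifiers. The Python sketch below shows one way to do this; the function name is hypothetical, and each definition is returned as raw text without further decoding.

```python
def split_dbxref_list(text):
    """Split '[a, b "desc", c {k=v}]' into a list of dbxref definitions."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not a dbxref list: {text!r}")
    body = text[1:-1]
    items, buf = [], []
    in_quotes, depth = False, 0
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            buf.append(body[i:i + 2])  # keep escaped characters untouched
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "{":
            depth += 1
        elif not in_quotes and ch == "}":
            depth -= 1
        if ch == "," and not in_quotes and depth == 0:
            items.append("".join(buf).strip())  # separator between dbxrefs
            buf = []
        else:
            buf.append(ch)
        i += 1
    last = "".join(buf).strip()
    if last:
        items.append(last)
    return items


print(split_dbxref_list(
    '[GO:ma, GO:midori "Midori was drinking and came up with this", '
    "GO:john {namespace=johnsirrelevantdbxrefs}]"
))
```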

Tags in [Typedef] Stanza

[Typedef] stanzas support almost all the same tags as a [Term] stanza.

In OBO Format 1.2, the following tags were not allowed in a [Typedef] stanza. In 1.3 they are allowed.

union_of
intersection_of
disjoint_from

The following additional tags are only allowed in a [Typedef] stanza:

domain: The id of a term, or a special reserved identifier, which indicates the domain for this relationship type. If a property P has domain D, then any term T that has a relationship of type P to another term is a subclass of D. Note that this does not mean that the domain restricts which classes of terms can have a relationship of type P to another term. Rather, it means that any term that has a relationship of type P to another term is by definition a subclass of D.
range: The id of a term, or a special reserved identifier, which indicates acceptable range for this relationship type. If a property P has range R, then any term T that is the target of a relationship of type P is a subclass of R. Note that this does not mean that the range restricts which classes of terms can be the target of relationships of type P. Rather, it means that any term that is the target of a relationship of type P is by definition a subclass of R.
inverse_of: The id of another relationship type that is the inverse of this relationship type. If relation A is the inverse_of type B, and X has relationship A to Y, then it is implied that Y has relation B to X. In obof1.2 inverse_of applied at the instance level
inverse_of_at_instance_level -- DEPRECATED (relations are instance level in 1.4): The id of another relationship type that is the inverse of this relationship type. If relation A is the inverse_of type B, and instance X has relationship A to instance Y, then it is implied that instance Y has relation B to instance X. Note that this applies at the instance level. If a particular relationship tag has a true trailing qualifier then the inverse applies at the term level.
transitive_over: The id of another relationship type that this relationship type is transitive over. If P is transitive over Q, and the ontology has X P Y and Y Q Z then it follows that X P Z (term/type level).
is_cyclic: Whether or not a cycle can be made from this relationship type. If a relationship type is non-cyclic, it is illegal for an ontology to contain a cycle made from user-defined or implied relationships of this type. Allowed values: true or false
is_reflexive: Whether this relationship is reflexive. All reflexive relationships are also cyclic. Allowed values: true or false. Term/type level.
is_symmetric: Whether this relationship is symmetric. All symmetric relationships are also cyclic. Allowed values: true or false. Term/type level.
is_anti_symmetric: Whether this relationship is anti-symmetric. Allowed values: true or false. Term/type level.
is_transitive: Whether this relationship is transitive. Allowed values: true or false. Term/type level.
is_metadata_tag: Whether this relationship is a metadata tag. Properties that are marked as metadata tags are used to record object metadata. Object metadata is additional information about an object that is useful to track, but does not impact the definition of the object or how it should be treated by a reasoner. Metadata tags might be used to record special term synonyms or structured notes about a term, for example.
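The transitive_over rule in the list above (if X P Y and Y Q Z, then X P Z) can be illustrated with a small Python sketch that applies the rule repeatedly until no new relationships are found. The function name, the edge representation and the relation names in the example are hypothetical.

```python
def infer_transitive_over(edges, p, q):
    """Return the new (x, p, z) edges implied by 'p transitive_over q'.

    `edges` is a set of (subject, relation, object) triples.
    """
    known = set(edges)
    changed = True
    while changed:  # repeat until a pass adds nothing new
        changed = False
        for x, r1, y in list(known):
            if r1 != p:
                continue
            for y2, r2, z in list(known):
                if r2 == q and y2 == y and (x, p, z) not in known:
                    known.add((x, p, z))
                    changed = True
    return known - set(edges)


edges = {
    ("a", "regulates", "b"),
    ("b", "part_of", "c"),
    ("c", "part_of", "d"),
}
print(sorted(infer_transitive_over(edges, "regulates", "part_of")))
```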

Tags in an [Instance] Stanza

Required tags

id: The unique id of the current instance.
name: The instance name. Any instance may have only one name defined.
instance_of: The term id that gives the class of which this is an instance.

Optional tags

property_value: This tag binds a property to a value in this instance. The value of this tag is a relationship type id, a space, and a value specifier. The value specifier may have one of two forms; in the first form, it is just the id of some other instance, relationship type or term. In the second form, the value is given by a quoted string, a space, and datatype identifier. See IDs for more information on legal datatype identifiers.

[Instance]
id: john
name: John Day-Richter
instance_of: boy
property_value: married_to heather
property_value: shoe_size "8" xsd:positiveInteger
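The two value forms of property_value can be told apart mechanically. The following Python sketch is one possible way to do so; the function name parse_property_value and the returned dictionary keys are assumptions for this example, and trailing qualifiers and comments are deliberately not handled.

```python
import re

# relation id, then either a quoted string plus a datatype id, or a bare id
_PROPERTY_VALUE = re.compile(
    r'^(\S+)\s+(?:"((?:[^"\\]|\\.)*)"\s+(\S+)|(\S+))\s*$'
)

def parse_property_value(value):
    """Split the value of a property_value tag into its parts.

    Hypothetical helper: returns a dict with either a "target" key
    (first form) or "literal" and "datatype" keys (second form).
    """
    match = _PROPERTY_VALUE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed property_value: {value!r}")
    relation, literal, datatype, target = match.groups()
    if target is not None:
        return {"relation": relation, "target": target}
    # The literal is returned as written; escape sequences are not decoded.
    return {"relation": relation, "literal": literal, "datatype": datatype}
```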

The following optional tags are also allowable for instances. They have exactly the same syntax and semantics as defined in tags in a term stanza:

is_anonymous
namespace
alt_id
comment
xref
synonym
created_by
creation_date
is_obsolete
replaced_by
consider

The replaced_by and consider tags are also allowable for obsolete instances, but they must refer to another instance, rather than another term, to use as a replacement.

Tags in an [Annotation] Stanza

Required tags

subject: The entity (e.g. gene, protein, etc.) being annotated
object: The term ID to which the subject is annotated

Optional tags

is_negated: For NOT annotations
evidence: a code representing the type of evidence
source: xref(s) to support the evidence
secondary_taxon: The taxon of the second organism in the interaction (where appropriate)
relation: Indicates a specific relationship between the subject and object nodes
assigned_by: Identifies who made the annotation

Built-In Objects

By default, every OBO ontology contains the following objects:

[Typedef]
id: is_a
name: is_a
range: OBO:TERM_OR_TYPE
domain: OBO:TERM_OR_TYPE
definition: The basic subclassing relationship [OBO:defs]

[Typedef]
id: disjoint_from
name: disjoint_from
range: OBO:TERM
domain: OBO:TERM
definition: Indicates that two classes are disjoint [OBO:defs]

[Typedef]
id: instance_of
name: instance_of
range: OBO:TERM
domain: OBO:INSTANCE
definition: Indicates the type of an instance [OBO:defs]

[Typedef]
id: inverse_of
name: inverse_of
range: OBO:TYPE
domain: OBO:TYPE
definition: Indicates that one relationship type is the inverse of another [OBO:defs]

[Typedef]
id: union_of
name: union_of
range: OBO:TERM
domain: OBO:TERM
definition: Indicates that a term is the union of several others [OBO:defs]

[Typedef]
id: intersection_of
name: intersection_of
range: OBO:TERM
domain: OBO:TERM
definition: Indicates that a term is the intersection of several others [OBO:defs]

Parsers and Serializers

General Behavior

All parsers should be capable of failing gracefully and generating errors explaining the failure. Parsers may optionally be capable of generating warnings, if the file being read contains non-fatal errors.

Handling Unrecognized Tags

A parser may do one of several things when an unrecognized tag is found:

FAIL: report a fatal error and terminate parsing
WARN: report a warning, but continue parsing and ignore the unrecognized tag
WARN_AND_RECORD: report a warning, but record the unrecognized tag for later serialization
IGNORE: silently ignore the unrecognized tag
RECORD: record the unrecognized tag for later serialization (recommended)
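The five behaviours listed above could be modelled as a policy switch. The Python sketch below is only an illustration: the names UnknownTagPolicy and handle_unknown_tag, and the use of a plain list to hold recorded tags, are assumptions and do not describe any existing parser.

```python
import warnings
from enum import Enum

class UnknownTagPolicy(Enum):
    FAIL = "fail"
    WARN = "warn"
    WARN_AND_RECORD = "warn_and_record"
    IGNORE = "ignore"
    RECORD = "record"

def handle_unknown_tag(tag, value, policy, recorded):
    """Apply `policy` to an unrecognized tag.

    `recorded` is a list that collects (tag, value) pairs kept for
    later serialization. Hypothetical helper, for illustration only.
    """
    if policy is UnknownTagPolicy.FAIL:
        raise ValueError(f"unrecognized tag: {tag}")
    if policy in (UnknownTagPolicy.WARN, UnknownTagPolicy.WARN_AND_RECORD):
        warnings.warn(f"unrecognized tag: {tag}")
    if policy in (UnknownTagPolicy.WARN_AND_RECORD, UnknownTagPolicy.RECORD):
        recorded.append((tag, value))
    # IGNORE falls through without any action.
```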

Non-Roundtripping Header Tags

The following optional header tags need not survive round-tripping:

format-version
version
date
saved-by
auto-generated-by

They do not need to be round tripped, because the correct values will change when the file is saved.
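A round-trip check can therefore leave these tags out of the comparison. The following Python sketch assumes that a header is held as a list of (tag, value) pairs; that representation and the name headers_equivalent are assumptions for this example.

```python
# Header tags that need not survive round-tripping, as listed above.
NON_ROUNDTRIP_TAGS = {
    "format-version", "version", "date", "saved-by", "auto-generated-by",
}

def headers_equivalent(before, after):
    """Compare two headers, ignoring the non-roundtripping tags and tag order.

    Hypothetical helper; each header is a list of (tag, value) pairs.
    """
    def kept(header):
        return sorted((t, v) for t, v in header if t not in NON_ROUNDTRIP_TAGS)
    return kept(before) == kept(after)
```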

Dangling References

There are several options when a dangling reference is encountered:

FAIL: report a fatal error and terminate parsing
WARN_AND_IGNORE: report a warning and ignore the dangling reference
WARN_AND_READ: report a warning and read in the dangling reference, storing it in a form suitable for round-tripping
READ: silently read and store the dangling reference (recommended)
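Whichever option is chosen, the parser first has to detect the dangling references. A minimal Python sketch follows, assuming each stanza is a dictionary with an "id" string and list-valued reference tags; this data layout, the function name find_dangling and the default choice of reference tags are assumptions for illustration.

```python
def find_dangling(stanzas, reference_tags=("is_a", "instance_of")):
    """Return the sorted ids that are referenced but never defined.

    Hypothetical helper: `stanzas` is a list of dicts, each with an "id"
    key and optional list-valued tags named in `reference_tags`.
    """
    defined = {stanza["id"] for stanza in stanzas}
    dangling = set()
    for stanza in stanzas:
        for tag in reference_tags:
            for target in stanza.get(tag, []):
                if target not in defined:
                    dangling.add(target)
    return sorted(dangling)
```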

Serializer Conventions

Any parser should be able to read correctly formatted files in any layout. However, it is suggested that serializers obey the following conventions to ensure consistency, and to facilitate file comparison (for example in CVS).

General Conventions

Within a single file, all tags relating to a single entity should appear in the same stanza (thereby minimizing the total number of stanzas and keeping all tags regarding a single entity in the same place)
Any time an identifier is referenced (i.e. anywhere other than an id: tag), it should be accompanied by the corresponding name value in the comments. See this guide for examples.
In any case where the correct ordering of tags is ambiguous (for example, if there are two tags with the same name, or the ordering is not given in this document), tags should be ordered alphabetically, first on the tag name, then on the tag value.

Stanza Conventions

All new stanza declarations should be preceded by a blank line. [Typedef] stanzas should appear before [Term] stanzas, and [Instance] stanzas should appear after [Term] stanzas. All other stanza types should appear after [Instance] stanzas, in alphabetical order on the stanza name.

Header Tags

Header tags should appear in the following order:

format-version
data-version
date
saved-by
auto-generated-by
import
subsetdef
synonymtypedef
default-namespace
remark

Ordering Term and Typedef stanzas

[Term], [Typedef], and [Instance] stanzas should be serialized in alphabetical order on the value of their id tag.
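Taken together with the stanza conventions above, this ordering can be expressed as a sort key. The Python sketch below assumes a stanza is summarized as a (stanza type, id) tuple and that "alphabetical" means plain string comparison; both are assumptions for this example, as is the name stanza_sort_key.

```python
def stanza_sort_key(stanza):
    """Sort key: Typedef, Term, Instance, then other types by name; then by id.

    Hypothetical helper; `stanza` is a (stanza_type, id) tuple.
    """
    fixed_rank = {"Typedef": 0, "Term": 1, "Instance": 2}
    stanza_type, stanza_id = stanza
    rank = fixed_rank.get(stanza_type, 3)
    # Only the remaining stanza types are ordered on the stanza name.
    type_name = stanza_type if rank == 3 else ""
    return (rank, type_name, stanza_id)
```

A serializer could then write out sorted(stanzas, key=stanza_sort_key).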

Ordering Term and Typedef tags

Term tags should appear in the following order:

id
is_anonymous
name
namespace
alt_id
def
comment
subset
synonym
xref
is_a
intersection_of
union_of
disjoint_from
relationship
created_by
creation_date
is_obsolete
replaced_by
consider
is_metadata_tag

Typedef tags should appear in the following order:

id
is_anonymous
name
namespace
alt_id
def
comment
subset
synonym
xref
domain
range
is_anti_symmetric
is_cyclic
is_reflexive
is_symmetric
is_transitive
is_a
inverse_of
transitive_over
relationship
created_by
creation_date
is_obsolete
replaced_by
consider

Instance tags should appear in the following order:

id
is_anonymous
name
namespace
alt_id
comment
synonym
xref
instance_of
property_value
created_by
creation_date
is_obsolete
replaced_by
consider

Annotation tags should appear in the following order:

id
is_anonymous
name
namespace
alt_id
def
comment
subset
synonym
xref
is_a
created_by
creation_date
is_obsolete
replaced_by
consider
subject
relation
is_negated
object
source
assigned_by
evidence

If the same tag appears multiple times in a stanza, the tags should be ordered alphabetically on the tag value.
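As an illustration, the Term tag order above, combined with the rule for repeated tags, can be applied with a single sort. In this Python sketch the tags of a stanza are assumed to be (tag, value) pairs; the names TERM_TAG_ORDER and order_tags are hypothetical, and placing tags that are missing from the list at the end is a choice made for this example.

```python
TERM_TAG_ORDER = [
    "id", "is_anonymous", "name", "namespace", "alt_id", "def", "comment",
    "subset", "synonym", "xref", "is_a", "intersection_of", "union_of",
    "disjoint_from", "relationship", "created_by", "creation_date",
    "is_obsolete", "replaced_by", "consider", "is_metadata_tag",
]

def order_tags(tags, order=TERM_TAG_ORDER):
    """Return (tag, value) pairs in serialization order.

    Hypothetical helper. Tags follow `order`; repeated tags are sorted on
    their value; tags not in `order` go last, sorted on name then value.
    """
    rank = {tag: i for i, tag in enumerate(order)}

    def key(pair):
        tag, value = pair
        return (rank.get(tag, len(order)), tag, value)

    return sorted(tags, key=key)
```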

Dbxref lists

Values in dbxref lists should be ordered alphabetically on the dbxref name.

Changes in 1.3

Changes that break forwards compatibility

relationship: tag can take additional arguments. Note that this is only expected to be used in a small subset of ontologies at some point in the future. These relations are called "n-ary relations" -- DEPRECATED
All string values are treated as instances of rdf:text. String values that lack an '@' followed by a valid language tag are treated as if they have a trailing '@'. Strings that should genuinely contain the '@' character must escape it. This is technically a forwards-incompatible change in that it changes the semantics of strings. However, it is syntactically forwards compatible.
The semantics of inverse_of were not clear in 1.2. In 1.3, R inverse_of S holds iff a R b implies b S a, whether a and b are classes or instances. -- DEPRECATED semantics are identical to OWL in obof.1.4
Splitting a tag over multiple lines with the backslash character "\" was never used in practice and has been deprecated.
Unicode support. TODO.
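The rdf:text rule in the list above can be sketched in Python as follows. This is a rough illustration under two assumptions that this document does not state: that '@' is escaped with a preceding backslash, and that a "valid language tag" can be approximated by a simple pattern. The name split_language_tag is hypothetical.

```python
import re

# Rough approximation of a language tag such as "en" or "en-GB" (assumption).
_LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")

def split_language_tag(s):
    """Return (text, language) for a string value; language may be "".

    Hypothetical helper. Assumes '@' is escaped as a backslash followed by
    '@'; an escaped backslash directly before '@' is not handled.
    """
    i = len(s) - 1
    while i >= 0:  # scan right to left for the last unescaped '@'
        if s[i] == "@" and (i == 0 or s[i - 1] != "\\"):
            tag = s[i + 1:]
            if tag == "" or _LANGUAGE_TAG.fullmatch(tag):
                return s[:i].replace("\\@", "@"), tag
            break
        i -= 1
    # No valid language tag: behave as if the string had a trailing '@'.
    return s.replace("\\@", "@"), ""
```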

Forwards compatible changes

New Stanza Types

[Annotation]: This makes links between entities (e.g. a gene and a molecular function) first-class citizens, and allows us to attach metadata (evidence, provenance, audit info etc) to the link.
[Formula] -- DEPRECATED: [Formula] stanza and formula: tag added. This is an advanced feature that can be used to help formally define relations.

New Header Tags

Header Macros

treat-xrefs-as-equivalent
treat-xrefs-as-genus-differentia
treat-xrefs-as-relationship
treat-xrefs-as-is_a
relax-unique-identifier-assumption-for-namespace
relax-unique-label-assumption-for-namespace

Definitional Expressions

ID Definitional Expressions added. E.g. GO:0005737^part_of(CL:0000023) can be used wherever one wants to say "cytoplasm of oocyte". This is treated as if it has the following definition:

[Term]
id: GO:0005737^part_of(CL:0000023)
intersection_of: GO:0005737 ! cytoplasm
intersection_of: part_of CL:0000023 ! oocyte

This is known as post-composition. We can refer to an unnamed entity (i.e. one with no ID in any ontology) by describing it via a logical expression. See the Obolog document for the formal semantics of these expressions.
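The expansion of an ID expression into its implied definition can be sketched in Python. The sketch handles only the single, non-nested form shown in the example above (one genus id and one relation with one filler); the function name expand_id_expression is an assumption, and the name comments after "!" are omitted because they cannot be derived from the expression alone.

```python
import re

# genus ^ relation ( filler ), with no nesting (the only form handled here)
_ID_EXPRESSION = re.compile(r"^([^\^()\s]+)\^([^\^()\s]+)\(([^\^()\s]+)\)$")

def expand_id_expression(expr):
    """Return the [Term] stanza implied by a simple ID expression.

    Hypothetical helper, for illustration only.
    """
    match = _ID_EXPRESSION.match(expr)
    if match is None:
        raise ValueError(f"unsupported ID expression: {expr!r}")
    genus, relation, filler = match.groups()
    return "\n".join([
        "[Term]",
        f"id: {expr}",
        f"intersection_of: {genus}",
        f"intersection_of: {relation} {filler}",
    ])
```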

Relation Tags

Many of these are advanced features that can safely be ignored by parsers.

holds_over_chain

See Relation Composition. This is an extension of the transitive_over tag, introduced in 1.2

equivalent_to_chain

See Relation Composition. This is an extension of the transitive_over tag, introduced in 1.2

disjoint_over

For example: spatially_disconnected_from is disjoint_over part_of, in that two disconnected entities have no parts in common

relationship

Relations can now be related, but only by formal built-in predicates. This is an advanced feature to help formally define type-level relations in terms of their instance-level counterparts:

all_some_all_times -- DEPRECATED
all_only -- DEPRECATED
all_some -- DEPRECATED
all_some_tr -- DEPRECATED
all_some_reference_context -- DEPRECATED
homeomorphic_for

intersection_of

Previously, this tag could only be used in [Term] stanzas, to define types/classes/universals/patterns. Now it can be used to define relations. For example, we can define a temporal relation coincides_with as being true if both start and end boundaries are shared.

[Typedef]
id: coincides_with
intersection_of: has_same_start_as
intersection_of: has_same_end_as
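One informal way to read such a definition is to treat each relation as a set of (subject, object) pairs, so that the defined relation holds exactly for the pairs common to all of its intersection_of relations. The Python sketch below follows that reading; the set-of-pairs model and the name intersect_relations are assumptions for illustration and are not the formal semantics.

```python
def intersect_relations(*relations):
    """Return the pairs shared by every given relation.

    Hypothetical helper; each relation is an iterable of (subject, object)
    pairs, and at least one relation must be given.
    """
    return set.intersection(*(set(r) for r in relations))
```

For example, if has_same_start_as holds for (a, b) and (a, c), and has_same_end_as holds for (a, b) and (d, c), then under this reading coincides_with holds only for (a, b).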

union_of

Previously, this tag could only be used in [Term] stanzas, to define types/classes/universals/patterns. Now it can be used to define relations.

functional

The relation acts like a function, i.e. any entity relates to at most one other entity by this relation.

inverse_functional

Like functional, but in the opposite "direction", i.e. at most one entity relates to any given entity by this relation.
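Both properties can be checked against a set of asserted links. The Python sketch below treats a relation as a collection of (subject, object) pairs; that model and the function names is_functional and is_inverse_functional are assumptions for this example.

```python
def is_functional(pairs):
    """True if no subject is related to two different objects."""
    seen = {}
    for subject, obj in pairs:
        # setdefault stores the first object seen for each subject
        if seen.setdefault(subject, obj) != obj:
            return False
    return True

def is_inverse_functional(pairs):
    """True if no object is related from two different subjects."""
    return is_functional((obj, subject) for subject, obj in pairs)
```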

Tags for either relations or types/classes

equivalent_to: Used to specify exact equivalence between two instances, types or relations