Java Event Based Parser for OBO
Getting Started
The Java event-based parser is part of the OBO-Edit source code, in the org.geneontology.oboedit.dataadapter package.
OBO-Edit can load files in both OBO 1.0 and OBO 1.2 syntax.
We apologize in advance for the very sparse documentation in these source files.
The classes of interest are:
- GOBOParser
- DefaultGOBOParser
- GOBOParseEngine
The basic idea is that GOBOParseEngine reads and parses a collection of OBO files to generate events (like readID and readDefinition). Each GOBOParseEngine is associated with an implementation of GOBOParser. Each time GOBOParseEngine generates an event, the corresponding GOBOParser method is called. Thus, if GOBOParseEngine sees the line "name: kinase" in an OBO file, it will call readName("kinase", null) on its GOBOParser.
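The dispatch described above can be sketched in a few lines of self-contained Java. This is not the OBO-Edit code: the TagHandler interface and MiniEngine class are hypothetical stand-ins for GOBOParser and GOBOParseEngine, the callbacks take a single argument for brevity, and the term ID in the sample input is made up.

```java
import java.util.ArrayList;
import java.util.List;

public class EventDispatchSketch {

    /** Hypothetical callback interface standing in for GOBOParser. */
    interface TagHandler {
        void readID(String id);
        void readName(String name);
        void readDefinition(String definition);
    }

    /** Hypothetical engine standing in for GOBOParseEngine. */
    static class MiniEngine {
        private final TagHandler handler;

        MiniEngine(TagHandler handler) {
            this.handler = handler;
        }

        /** Splits the text into "tag: value" lines and fires one event per recognized tag. */
        void parse(String text) {
            for (String line : text.split("\\r?\\n")) {
                int colon = line.indexOf(':');
                if (colon < 0) {
                    continue; // stanza headers such as [Term], and blank lines
                }
                String tag = line.substring(0, colon).trim();
                String value = line.substring(colon + 1).trim();
                if (tag.equals("id")) {
                    handler.readID(value);
                } else if (tag.equals("name")) {
                    handler.readName(value);
                } else if (tag.equals("def")) {
                    handler.readDefinition(value);
                }
                // every other tag is ignored in this sketch
            }
        }
    }

    /** Runs the engine over the text and returns a log of the events fired. */
    public static List<String> collectEvents(String text) {
        final List<String> events = new ArrayList<String>();
        MiniEngine engine = new MiniEngine(new TagHandler() {
            public void readID(String id) { events.add("readID(" + id + ")"); }
            public void readName(String name) { events.add("readName(" + name + ")"); }
            public void readDefinition(String d) { events.add("readDefinition(" + d + ")"); }
        });
        engine.parse(text);
        return events;
    }

    public static void main(String[] args) {
        // EX:0000001 is a made-up identifier used only for illustration
        String obo = "[Term]\nid: EX:0000001\nname: kinase\n";
        System.out.println(collectEvents(obo));
    }
}
```

The engine knows nothing about what the handler does with each event, which is why the same engine could in principle feed either a data model builder or a much lighter consumer.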
DefaultGOBOParser is an implementation of GOBOParser that populates the OBO-Edit data models from an OBO file. If you want to use OBO-Edit's data models, you can use DefaultGOBOParser like so:
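    import java.util.Collection;
    import java.util.LinkedList;

    import org.geneontology.oboedit.dataadapter.DefaultGOBOParser;
    import org.geneontology.oboedit.dataadapter.GOBOParseEngine;
    // OBOSession must also be imported from the OBO-Edit data model classes

    public static OBOSession getSession(String path) {
        DefaultGOBOParser parser = new DefaultGOBOParser();
        GOBOParseEngine engine = new GOBOParseEngine(parser);

        // GOBOParseEngine can parse several files at once
        // and create one merged ontology,
        // so we need to provide a Collection to the setPaths() method
        Collection<String> paths = new LinkedList<String>();
        paths.add(path);
        engine.setPaths(paths);

        // parsing fires the events that DefaultGOBOParser uses to build the session
        engine.parse();
        OBOSession session = parser.getSession();
        return session;
    }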
If you're populating a database, or doing something else where it would just be a waste of memory to use the OBO-Edit data models, you can create your own implementation of GOBOParser and skip the data model generation step altogether.
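As a rough illustration of that approach, the sketch below forwards each term straight to a sink instead of building term objects. Everything in it is hypothetical: TermEvents stands in for GOBOParser (whose real method list is longer), RowSink stands in for whatever writes to your database, and the code assumes the id event for a term arrives before its name event.

```java
import java.util.ArrayList;
import java.util.List;

public class StreamingHandlerSketch {

    /** Hypothetical callback interface standing in for GOBOParser. */
    interface TermEvents {
        void readID(String id);
        void readName(String name);
    }

    /** Hypothetical sink; a real one might wrap a JDBC PreparedStatement. */
    interface RowSink {
        void insert(String id, String name);
    }

    /** Forwards each (id, name) pair to the sink and keeps no term objects. */
    static class RowWritingHandler implements TermEvents {
        private final RowSink sink;
        private String currentId; // the only state held between events

        RowWritingHandler(RowSink sink) {
            this.sink = sink;
        }

        public void readID(String id) {
            currentId = id;
        }

        public void readName(String name) {
            // assumes readID was already called for this term
            sink.insert(currentId, name);
        }
    }

    /** Replays events for the given {id, name} pairs and returns the rows written. */
    public static List<String> writeRows(String[][] terms) {
        final List<String> rows = new ArrayList<String>();
        RowWritingHandler handler = new RowWritingHandler(new RowSink() {
            public void insert(String id, String name) {
                rows.add(id + "\t" + name);
            }
        });
        for (String[] term : terms) {
            handler.readID(term[0]);
            handler.readName(term[1]);
        }
        return rows;
    }

    public static void main(String[] args) {
        // made-up identifiers used only for illustration
        String[][] terms = { {"EX:0000001", "kinase"}, {"EX:0000002", "phosphatase"} };
        for (String row : writeRows(terms)) {
            System.out.println(row);
        }
    }
}
```

Because the handler holds only the current ID, memory use stays flat regardless of how many terms the file contains.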
If you feel inspired to add some developer documentation to the OBO-Edit code (particularly the data model and data adapter interfaces), please do! We're always happy to integrate code contributions!
If you have any additional questions, please contact gohelp@geneontology.org (the GO helpdesk).