Ah, it goes on and on. After my last attempt at literate OWL programming, called omnsplit, I decided that there was a problem; that version splits the OWL file into individual statements, and puts them into files with the same name as the OWL class (or property, or whatever).

The problem is that, for an ontology like OBI, you get 1400 individual files; this is just inconvenient, as many applications don’t like this many files in a directory. Also, there is a naming constraint; you can only use characters that are legal in the file system, and this doesn’t include “:” if you want to be Windows (NTFS) compliant.

So, for my new system, I decided to generate an index file, which just points at locations in the ontology file. Initially, I was just going to index the main ontology file; in the end, I decided a partial copy was the way forward, since generating both the index and the indexed file together ensures that they will stay in sync.
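As a rough sketch of the idea (not the actual script), the copy and the index can be produced in a single pass, which is what keeps them in sync. Everything here is an assumption for illustration: the function name index_and_copy, the list of frame keywords, and the choice to record 1-based line ranges in the copy.

```python
import re

# Assumption: entities start with one of these Manchester syntax
# frame keywords at column 0. The real script may recognise more.
FRAME_START = re.compile(
    r"^(Class|ObjectProperty|DataProperty|AnnotationProperty|Individual):\s*(\S+)"
)


def index_and_copy(ontology_text):
    """Return (copy_text, index) built in one pass over the ontology.

    index maps (kind, name) to the 1-based (first, last) line numbers
    of that entity's frame in the copy, not in the original file.
    """
    copy_lines = []
    index = {}
    current = None
    for line in ontology_text.splitlines():
        if not line.strip():
            continue  # the copy is partial: blank lines are dropped
        match = FRAME_START.match(line)
        if match:
            current = (match.group(1), match.group(2))
            start = len(copy_lines) + 1
            index[current] = (start, start)
        copy_lines.append(line)
        if current is not None:
            # Extend the current frame to include the line just written.
            first, _ = index[current]
            index[current] = (first, len(copy_lines))
    return "\n".join(copy_lines) + "\n", index
```

Because the line numbers are counted against the copy as it is being written, the index cannot drift from the file it points into.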

It required a bit of nasty LaTeX hacking; the basic problem was getting around the restriction that macro names can normally contain only legal LaTeX macro characters (that is, letters). The system now works like this:

%% This is generated by python which also generates the
%% function_ont.spt file which is a copy of the ontology (with a
%% few newlines gone).

%% This just defines a new macro in what appears to be an
%% unnecessarily complex way.
\expandafter\def\csname OmnEntityHeaderheader\endcsname%

%% But the use of \expandafter and \csname means that you can
%% use any character you like, including underscores and numbers
%% in the macro name.
\expandafter\def\csname OmnEntityObjectPropertyhas_role\endcsname%

%% We can now define two commands in the style file. Again
%% we use \csname so that we are not bound to characters legal
%% in LaTeX macros.
\newcommand{\omnclass}[2]{\csname OmnEntityClass#1#2\endcsname}
\newcommand{\omnobjprop}[2]{\csname OmnEntityObjectProperty#1#2\endcsname}

%% now in our source, we can do things like this.

Using an index in this way also has another advantage. I’ve had to make a decision whether to go with rdfs:label or the entity name. I can now back out of this; I can just put both in the index file, without too much extra space, so that either would be referenceable within the LaTeX.
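Roughly, the generator can emit one definition per key for each entity. This is only a sketch under assumptions: the function name index_entries is made up, and the macro body (here a line range such as 4-5) is a placeholder, since what the real definitions expand to is not shown above.

```python
def index_entries(kind, name, label, body):
    """Return LaTeX index lines for one entity.

    One definition is keyed by the entity name and, when the
    rdfs:label differs, a second is keyed by the label. `body` is a
    hypothetical placeholder for whatever the macro should expand to.
    """
    keys = [name]
    if label and label != name:
        keys.append(label)
    return [
        "\\expandafter\\def\\csname OmnEntity%s%s\\endcsname{%s}" % (kind, key, body)
        for key in keys
    ]


for line in index_entries("ObjectProperty", "has_role", "has role", "4-5"):
    print(line)
```

Labels containing characters that are special to TeX (such as % or #) would need escaping first, which this sketch does not attempt.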

To me, this feels like the right solution. It’s relatively simple (with a bit of nasty LaTeX, which is nicely hidden), and it doesn’t depend on the file system. It needs a bit more work to bring it to completion, but not that much.

Sadly, bio-ontologies looms, so next week will be spent getting ready for that; perhaps I can finish this off on the way back. “Sadly” is perhaps a poor choice of words; I’m greatly looking forward to it, but I’ve kind of had the bit between my teeth with Python and LaTeX hacking for the last few weeks.
