I have been struggling for a while with OWL development environments. While Protege provides a nice GUI-based system, it has the limitation of many such systems: it allows you to do what the authors intended, but not all of the things that you might wish.

It is partly for this reason that I have been developing my own OWL Manchester syntax mode for Emacs (http://www.russet.org.uk/blog/2161); I lose a lot from Protege, but then I also gain the ability to manipulate large numbers of classes at once, as well as easy access to versioning. These things are useful.

Still, the environment is lacking in many ways; recently, while building an ontology for karyotypes (http://www.russet.org.uk/blog/2202), I wanted a more programmatic environment. A trivial example comes from the human chromosomes; there are 22 autosomes in all. These can easily be expressed in OWL with 22 classes (plus X and Y). The problem is that all of these classes are likely to be very similar, which produces a code duplication problem. Of course, this is not a new problem; OPPL, the Ontology Pre-Processor Language, was created at least in part for this purpose (10.1038/npre.2009.4006.1).

The main problem with OPPL, however, is that it is a Domain Specific Language; while this makes it well adapted to its task, it also means that it lacks many basic features of a “real” programming language. Another possibility is to use the OWL API (10.1007/978-3-540-39718-2_42) (I am actually on this paper, but I publicly acknowledge that this was a rather generous attribution from Sean Bechhofer; I did do some work on the API, but not much, and I suspect none of my work remains). However, a brief look at the OWL API tutorial shows a problem. This code creates two classes, makes one a subclass of the other, and then removes the axiom again.

// create(), df (an OWLDataFactory) and pizza_iri are defined elsewhere in the tutorial
OWLOntologyManager m = create();
OWLOntology o = m.createOntology(pizza_iri);
// class A and class B
OWLClass clsA = df.getOWLClass(IRI.create(pizza_iri + "#A"));
OWLClass clsB = df.getOWLClass(IRI.create(pizza_iri + "#B"));
// Now create the axiom
OWLAxiom axiom = df.getOWLSubClassOfAxiom(clsA, clsB);
// add the axiom to the ontology.
AddAxiom addAxiom = new AddAxiom(o, axiom);
// We now use the manager to apply the change
m.applyChange(addAxiom);
// remove the axiom from the ontology
RemoveAxiom removeAxiom = new RemoveAxiom(o, axiom);
m.applyChange(removeAxiom);

Aside from the intrinsic problems of Java (the compile, run, test cycle is rather clunky for this sort of work), this amount of code to achieve something straightforward makes this a little untenable. By comparison, the same subclass relationship in Manchester syntax takes three lines:

Class: piz:A
    SubClassOf:
        piz:B

However, while Java and the OWL API do not seem a good choice for manipulating OWL directly, rewriting everything from first principles would also be a bad idea.

One solution to this problem came to my attention recently, in the shape of Clojure; essentially, this is a Lisp implemented on the JVM. I will not describe the virtues or otherwise of Lisp in great detail; for some reason it is one of those languages that tends to generate fanaticism, and there are lots of descriptions of Lisp elsewhere. For my purposes, there were three advantages. The first was personal: being an Emacs hacker, I know Lisp reasonably well. The other two are more general. First, Clojure has good integration with Java and can manipulate Java objects, meaning I can make direct use of the OWL API. Second, Lisp has a good degree of syntactic plasticity, which is important as, after all, I am looking for a convenient representation.

Initially, I have aimed at producing a representation which is fairly similar to Manchester syntax (http://www.w3.org/TR/owl2-manchester-syntax/). My initial attempts used the various features of Clojure directly. Consider, for instance, the following two statements:

(owl/owlclass
 "Arm" {:subclass "Limb"})

(owl/owlclass
 "HumanArm" {:subclass ["Limb" "HumanBodyPart"]})

Lisp, in general, uses a prefix notation. There is no obvious and easy way around this; in this case, it actually fits rather well with Manchester syntax, which looks similar. The use of frame keywords such as SubClassOf: in Manchester syntax is also fortuitous, as Lisp keywords such as :subclass look much the same. However, this syntax is rather too difficult to write. Even in this simple example we have a statement terminator which looks like ]}) (closing a vector, a map and a list respectively). Lisps are often criticised for having too many parentheses; Clojure is unusual in using lots of different styles of bracket. In Emacs Lisp, I just keep hitting ) until I am finished. In Clojure, you have to close the brackets in the right order. All rather painful.

Fixing this turned out to be quite difficult, and needed a particularly nasty function I have called groupify. It is heavily recursive, which is, apparently, a poor idea in Clojure, as it lacks some recursion optimisations present in many Lisps; however, without mutable local variables, I could see no other option. The syntax now looks much simpler.

(owl/owlclass
 "Arm" :subclass "Limb")

(owl/owlclass
 "HumanArm" :subclass "Limb" "HumanBodyPart")

(owl/owlclass
 "Hand" :subclass (owl/some "isPartOf" "Arm"))

Both the :subclass and :equivalent frames support any number of class expressions; so far I have implemented only the some and only restrictions, but the rest are not hard. Currently, it is only possible to save the ontology in Manchester syntax, but fixing this is trivial; the OWL API is doing all of the work.
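
To give a flavour of the job groupify has to do, here is a minimal sketch that turns the flat argument list back into a map from frame keyword to values. It is not the actual implementation: the name groupify-sketch is made up, and it uses reduce rather than the explicit recursion described above, which is one way to avoid deep call stacks.

```clojure
;; Illustrative only: groupify-sketch is a hypothetical stand-in,
;; not the groupify function from the library.
(defn groupify-sketch
  "Group a flat sequence such as (:subclass \"Limb\" \"HumanBodyPart\")
  into a map from each frame keyword to a vector of the values after it."
  [args]
  (first
   (reduce (fn [[acc frame] x]
             (cond
               ;; A keyword starts (or resumes) a frame.
               (keyword? x) [(update acc x #(or % [])) x]
               ;; A value with no preceding keyword is an error.
               (nil? frame) (throw (IllegalArgumentException.
                                    (str "Value before any frame keyword: " x)))
               ;; Otherwise the value belongs to the current frame.
               :else [(update acc frame conj x) frame]))
           [{} nil]
           args)))

(groupify-sketch [:subclass "Limb" "HumanBodyPart"])
;; => {:subclass ["Limb" "HumanBodyPart"]}
```

A variadic function with a signature like [name & frames] could pass frames straight to a function of this shape and then treat every frame uniformly, whether it was given one value or several.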

Of course, this would not be much help if all I had managed to achieve was Manchester syntax with more parens. However, the big advantage of this becomes clearer with the next example:

(dorun
 (map
  (fn [x]
    (owl/owlclass
     (str "HumanChromosome" x)
     :subclass "HumanChromosome"))
  (concat '("X" "Y") (range 1 23))))

This creates a class for each human chromosome. In this case, I have hard-coded the list of classes, but I could be parsing a CSV or accessing a database. Or accessing an existing ontology; this could be very useful in avoiding maintenance of duplicate hierarchies.
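
As an illustration of the CSV case, the class names could be derived from text rather than from a literal list. This is only a sketch: the input string and the function name chromosome-class-names are invented for the example, and the owl/owlclass call is left in a comment so that the snippet runs without the library.

```clojure
(require '[clojure.string :as string])

;; Hypothetical input: one chromosome label per line, as might come
;; from reading a one-column CSV file with slurp.
(def chromosome-csv "1\n2\n3\nX\nY\n")

(defn chromosome-class-names
  "Turn one-label-per-line text into a vector of class names."
  [csv-text]
  (->> (string/split-lines csv-text)
       (map string/trim)
       (remove string/blank?)
       (mapv #(str "HumanChromosome" %))))

(chromosome-class-names chromosome-csv)
;; => ["HumanChromosome1" "HumanChromosome2" "HumanChromosome3"
;;     "HumanChromosomeX" "HumanChromosomeY"]

;; With the library loaded, the names could then be declared with:
;; (doseq [n (chromosome-class-names chromosome-csv)]
;;   (owl/owlclass n :subclass "HumanChromosome"))
```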

Still, as it stands, this is just an (under-functional) version of OPPL. To make this worthwhile, I need to build on the language features that Clojure brings. I want to be able to interact with a reasoner, performing tasks in batch. In particular, the next step is to hook into Clojure’s test framework; something I have sorely missed when ontology building as opposed to programming. My experiences so far with combining Clojure and the OWL API suggest this should not be too hard.

These would not be minor advances; in the same way that test-driven programming has had a significant impact on the way we code, having a good test framework for OWL would mean that we could define our use cases up-front, formally and programmatically, and then fiddle with the logical representation until they work. As with test-driven programming, the test cases would themselves start to form part of the documentation for the code. When combined with a literate framework (http://www.russet.org.uk/blog/1213), to link between the ontology, the test cases and the experimental data that we are attempting to represent and model, this would provide a strong environment indeed. It would be a good step in moving from the craft-based approach we are taking at the moment toward the pragmatic environment that I and others (http://robertdavidstevens.wordpress.com/2011/05/26/unicorns-in-my-ontology/) feel we need.
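
None of this exists yet, but a rough sketch shows the shape such a test might take with clojure.test. Everything ontology-related here is a stand-in: toy-hierarchy is a plain map rather than an OWL ontology, and the hypothetical superclasses function walks that map where a real version would ask a reasoner through the OWL API.

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; Stand-in for an ontology: class name -> set of asserted superclasses.
;; This is an assumption for illustration, not the library's data model.
(def toy-hierarchy
  {"HumanArm" #{"Limb" "HumanBodyPart"}
   "Limb"     #{"BodyPart"}})

(defn superclasses
  "All superclasses of cls reachable in hierarchy, direct or transitive."
  [hierarchy cls]
  (loop [todo (vec (get hierarchy cls))
         seen #{}]
    (if (empty? todo)
      seen
      (let [c (peek todo)]
        (if (seen c)
          (recur (pop todo) seen)
          (recur (into (pop todo) (get hierarchy c)) (conj seen c)))))))

;; The use case, stated up-front: every human arm should be a body part.
(deftest human-arm-is-a-body-part
  (is (contains? (superclasses toy-hierarchy "HumanArm") "BodyPart")))

(run-tests)
```

The test reads as a statement about the domain rather than about the code, which is what would let test cases like this double as documentation.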

My code is available on Google code at http://code.google.com/p/clojure-owl/, and will be developed further there.

9 Comments

  1. Alan Ruttenberg says:

    Hi Phil,

    I’ve also developed some lisp-based software for working with OWL and SemWeb. They are based on ABCL (armed-bear common lisp) which is also built on Java. I don’t have much documentation other than projects and tools built with it. Let me know if you are interested in comparing notes.

  2. Phillip Lord says:

    Would be good to see what you have done.

    My intentions for clojure-owl are clear in my mind; I want this as a tool for ontology development, so that I can develop higher levels of abstraction as I go, as well as integrating with documentation. I’m guessing your stuff is more for ontology integration and tooling?

  3. An Exercise in Irrelevance » Blog Archive » OWL Concepts as Lisp Atoms says:

    […] my initial work on developing a Clojure environment for OWL (http://www.russet.org.uk/blog/2214), I was focused on producing something similar to Manchester syntax […]

  4. Ignazio says:

    There is a newer (kinda) version of OPPL around, although it’s not seeing much use these days (http://oppl2.sourceforge.net).

    Curiously enough, I’m trying to provide the OWL API with a simpler interface – the clunkyness of many interfaces is really getting under my skin. I don’t think the Java syntax for it will be as terse as your example, but hey, it might be fun.

  5. Phillip Lord says:

    I never really intended this work to be an OPPL killer; at heart I wanted to provide an ontology editing environment, but one which is good for programmers. At the moment, I don’t think that there is anything in this niche.

    I really do want a REPL environment though; I have already found this useful, as I can build an ontology, add classes, take them away again and so on. It’s a good way to program but an unbeatable way to build ontologies.

    I should have another post in a few days time about some new features which do not just ape Manchester syntax; using clojure, anything that annoys me about Manchester syntax I can just program away.

  6. An Exercise in Irrelevance » Blog Archive » Disjoints in Clojure-owl says:

    […] ontologies, where I could work with a full programming language at to define the classes I wanted (http://www.russet.org.uk/blog/2214). After some initial work with functions taking strings, I have moved to an approach where classes […]

  7. An Exercise in Irrelevance » Blog Archive » Clojure OWL 0.2 says:

    […] a library written in Clojure, that I can use for building OWL ontologies programmatically (http://www.russet.org.uk/blog/2214). The basic idea behind this library is to give me something that looks like Manchester syntax […]

  8. An Exercise in Irrelevance » Blog Archive » Remembering the World as it used to be says:

    […] have been working on a Clojure library for developing OWL ontologies (http://www.russet.org.uk/blog/2214). There have been two significant advances with this library recently. First, I have changed its […]

  9. An Exercise in Irrelevance » Blog Archive » Support for Axiom Annotations says:

    […] the early development of Tawny-OWL and easy to use syntax has been a specific objective (http://www.russet.org.uk/blog/2214), as well as hiding some of the complexity of the OWL API. The intension has always been for […]
