Archive for the ‘Tech’ Category

Less than one month after the release of Tawny-OWL 1.2.0 (, I am pleased to announce the 1.3.0 release. This is a much smaller release than 1.2.0, but provides two useful changes.

First, I have now added support for axiom annotations as more extensively documented in my past post ( While I expect these are still a minority requirement, they are used heavily by some people, and so they need supporting.

Second, I have reworked the support for patterns. This has been through the addition of three functions. p allows writing classes with optionality. So, for instance, consider this code:

(p o/owl-class
   o partition-name
   :comment comment
   :super super)

comment and super are optional here; they can be nil. In this case, the p function removes the nil value from the owl-class call. More over if, as in this case, the entire frame is has only nil values, it will be removed altogether. p returns the results of this call as a clojure record contain the entity itself and the name that was used to create that entity. In a nice piece of serendipity, the support that I have added for annotations (, also allows direct use of this record latter in the pattern. So, for example, this form uses partition with is the return value from above.

 #(p o/owl-class o
     :comment comment
     :super partition)

The reason for this record, though, is that it makes it relatively easy to build patterns in both function and macro form. Macros are never trivial, but all this one does is turn a set of symbols into their string equivalents.

(defmacro defpartition
  "As value-partition but accepts symbols instead of string and
takes the ontology as a frame rather than first argument."
  [partition-name partition-values & options]
   (list* (name partition-name)
          `(tawny.util/quote-word ~@partition-values)

The practical upshot of this is that I can define a value partition using a macro like this:

(defpartition Hydrophobicity
  [Hydrophobic Hydrophillic]
  :comment "Part of the Hydrophobicity value partition"
  :super PhysicoChemicalProperty
  :domain AminoAcid)

As well as generating the OWL API objects, this also binds the relevant vars, both those visible here (Hydrophobicity) and those generated (hasHydrophobicity).

Take together these are both important additions to Tawny-OWL. We can provide better provenance with the axiom annotations, and with patterns we can lift the level of abstraction at which we build our ontology which is one of the original motivations for Tawny-OWL in the first place.

Tawny-OWL 1.3.0 is now available on Clojars.


Since the early development of Tawny-OWL and easy to use syntax has been a specific objective (, as well as hiding some of the complexity of the OWL API. The intension has always been for Tawny-OWL to be an ontology developer tool first and a programmatic library second and keeping this in mind has been part of the reason that I believe does fill these objectives.

Unfortunately, the other part of the reason is that Tawny-OWL does hides functionality that is available in the OWL API. Or, more strictly, does not uncover it. Tawny-OWL is implemented in Clojure and what is possible in Java is also possible in Java.

One of the key decisions was to hide the support that the OWL API provides for certain forms of annotation, in particular annotatons on axioms. Of course, Tawny-OWL allows you to add annotations to entities. This is used to enable labels and comments on any entity. But axiom annotations allow the description of the relationships between entities. So, for example, as well as attaching comments on two classes, it is also possible to attach a comment on the sub/superclass relatinship between the two.

The main reason that Tawny-OWL did not support these natively is that it takes an entity-centric view of OWL. So, if we consider this statement:

(defclass A
   :super B
   :label "A")

We are describing the entity A primarily. In fact, this statement translates into two axioms, which we can see in the OWL/XML representation which looks like this:

    <Class IRI="#A"/>
    <Class IRI="#B"/>
    <AnnotationProperty abbreviatedIRI="rdfs:label"/>
    <Literal xml:lang="en"

The defclass statement above returns the entity (actually, the var as it is a def form, but the var contains the entity), rather than the axioms. If I wished to return the axioms, as there are several, I would need a list, or more probably, a data structure so that I could extract the axiom I wanted. This would, however, complicate life considerably. For instance, B would now refer to this data structure, which would need unpicking for its use here. Worse, the OWL API works by mutation, so the axioms in B might now reflect only some of the axioms refering to B.

Of course, there is a way around this, which is to dip down into the OWL API, fetching the axioms this way. As far as I can tell, annotations need to be added at the time the axiom is created (it is probably possible to do it later as well). This example comes from my recasting of the OWL Primer ontology.

 (.getOWLSubClassOfAxiom (owl-data-factory)
  Man Person #{(owl-comment "States that every man is a person")}))

This works well, but the syntax is not nice, we need to do a direct call to the OWL API. We are not even using the add-subclass function. This did not bother my overly, as it was not something that I thought would be needed often.

Unfortunately, it is something that the Gene Ontology people do often, including, for instance, annotating labels with the source of knowledge for these labels. If I am to support them, I need an attractive syntax that fits with current Tawny-OWL syntax. After a couple of attempts, I decided on this:

(defclass A
  :super (annotate B
           (owl-comment "A is a kind of B")))

The axioms in Tawny-OWL are syntactically implicit, describing the :super relationship between A, so I cannot directly address these. But attaching an annotation to B in this way is unambiguous. Compare these two statements that it might otherwise be mistaken for; in one case, we annotate A with a comment (which is most common thing to do) or, we inline an annotation of B (which would probably be better not inline!).

(defclass A
  :super B
  :annotation (owl-comment "A is an interesting entity"))

(defclass A
  :super (owl-class B
               (owl-comment "B is an interesting entity")))

This also extends naturally to other axioms, including annotation labels.

(defclass A
  :annotation (annotate (label "A")
                 (owl-comment "According to me")))

The implementation of this took me several attempts, including some fairly painful and ultimately unsuccesful macros. In the end, I found a much simpler solution. annotate returns a clojure record which contains both the entity — (label "A") or B in these examples — and the annotation always an owl-comment here, but potentially anything. This record is passed through the Tawny-OWL function call stacks in place of the raw entity, until the appropriate axiom is created. I then unpick this object with two calls to protocol methods — as-entity and as-annotations like so.

    (as-iri named-entity)
    ^OWLAnnotation (as-entity annotation)
    (as-annotations annotation)

The protocol implementations are trivial.

(defrecord Annotated [entity annotations]
  (as-entity [this] entity)
  (as-annotations [this] annotations))

These allow me to avoid checking for an Annotated object when creating my axiom. In most cases, I will not have one of these, but a normal OWLObject. Or a Long, String or even a Keyword for property characteristics. So I extend the protocols to cover these cases also, with even more trivial implementations.

  (as-entity [entity] entity))

  (as-annotations [entity]

Finally, the annotate function broadcasts as do many other functions in Tawny-OWL, so it is possible to annotate several axioms at once. So, for example, here explicitly using a list.

(defclass A
   :super (annotate [B C D]
             (owl-comment "All of these are supers")))

Or, implicitly with a function existential restriction that itself uses broadcasting.

(defclass A
   :super (annotate (owl-some r B C D)
             (owl-comment "All of these are existentials")))

While the syntax is slightly more complex than most of Tawny-OWL, it is a considerable improvement on dropping down to the OWL API layer beneath; and, ultimately, this form of annotation is a more complex usage of OWL.


In this post, I describe the different technologies that I’ve tried for writing a new book, from markdown to LaTeX.

One of the questions that I have been asked about Tawny-OWL ( was whether there was a manual or not. Actually, the answer is, yes, there is and has been since well before the 1.0 release.

However, it’s not rich enough. It makes requires both a reasonably good background in Clojure and in Ontology development. Given that the overlaps between these two areas of knowledge is probably limited to myself and Jennifer Warrendar this is rather less than ideal. So, I’ve started a new manual in the form of a book called Take-Wing. Getting a nice environment has been an interesting journey.


The orginal version of the documentation was written in Markdown. There was not really a good reason for this, other than I wanted to try it and it is well supported. It’s not a bad format and is easy to type, although it lacks in extensibility (which is, of course, why it has been extended in so many different ways!).

However, one significant problem was that I have no good way of checking that the code samples in it work. I had been hit by this when I changed function names, for example, from owlclass to owl-class, or more perniciously when I moved one entity from a variable to a function. My general solution to this linked-buffer which translates documentation into code and vice-versa. The lack of an explicit delimiter for the start and end of code blocks makes this harder to implement for markdown.


My next thought was to move to asciidoc. I like asciidoc and have used it for years. I helped to add slidy support to it, so that I can use it for slides. Most of my teaching material uses asciidoc now, because I can get slides and lecture notes from the same source, with easy to integrate code snippets that I can run, albeit maintained in independent source files which can make things a little painful.

The first version of Take-Wing used asciidoc. Because I wanted to use linked-buffer to generate Clojure source that I could test, I needed to use multiple files with a master file — the Clojure cookbook uses much the same system, albeit for a slightly different reason.

However, writing a talk or even a lecture series is a very different thing from writing a book. The main problem was not asciidoc itself, but the support from Emacs. There is an asciidoc mode but, aside from needing patching, it is heavily focused on syntax highlighting rather than document structure. So, inserting cross-references and the link were painful.


My next idea was to try org-mode. So, I used pandoc to do the translation; this wasn’t a glowing success, but it achieved the basics anyway.

And Emacs specific solution doesn’t seem ideal, but then for Take-Wing I am likely to be the main author. And I have been using org for a while, particularly for building plans both for Tawny-OWL and also, ironically, for the outline for Take-Wing.

It has clear code demarcation blocks, so adding support for linked-buffers was straight-forward enough. One significant problem is that the org-mode export framework has recently been replaced, so I have to install my own version; actually, installing a new version is not problematic as this can be done with package.el, but because org-mode is in core emacs, deleting the old version is doesn’t happen automatically. I found several version conflicts resulting from loading old files. Still, I managed to get org working, producing both a PDF and nice web page.

Org-mode was pleasant enough to use. The folding features were nice and the syntax is reasonable for typing. However, I found the same problems as with asciidoc. While, org has advanced features for linking to the rest of the world, linking within documents is not so good, especially when using a master document. Basically, I couldn’t get over the envy of my PhD students both of who have written their thesis with AucTex


LaTeX itself is an old piece of software, but it does the job of document preparation like nothing else. And, the support in Emacs is fantastic. However, it’s achilles heel has been that is produces PDFs. I don’t like PDFs for a variety of reasons, but the main one is that, essentially, it’s a screenshot of a piece of paper. I’ve long gone natively digital: I write my papers, blog posts, emails and everything else on computer, and I am used to reading in this way as well. And I want the web for reading documentation.

In my time, I’ve tried lots of different technology for webifying LaTeX. The best I had found was Plastex; my post on LaTeXtowordpress ( is still one of my best read, and my colleague even used it to post her PhD thesis ( It’s a good tool, but it’s not using TeX underneath and so it doesn’t always do the right thing as a result.

For some reason, I managed to have missed tex4ht, though, even though it has been around for quite a while. My initial impressions are, though, that it is probably the best tool that I have seen even if it has been blessed with some of the worst documentation and a very strange command line. I’m hopefully that make4ht may help this. It’s author has been fantastic on StackExchange in answering my questions. So, I decided to give it a go, so I exported by org to latex, threw away my org and started again.

The first problem that I didn’t like it’s handling of the listings package. I have fixed this by dropping code straight through to HTML. Initially, I tried highlight to do syntax highlighting, but then dropped this for prism, as the former cannot do inline highlighting also. The default footnote handling is also very strange, but this is also easy to fix with a standard package. So, now, after a lot of fiddling, it all seems to be working.


It’s been quite a lot of effort and rather more fiddling than I would have hoped, but I now have an editing environment that I like, and this is important. Writing a document is hard, and the environment needs to support this and not get in the way.

I am hopeful that tex4ht will fulfil it’s initial promise. I have been using LaTeX less and less over the years, simply because of its lack of HTML support, despite the fact that it’s the best environment for writing in: so when I have been most happy as an author, I have been least happy as a reader.

All I have to do now is write the rest of the book; so far, I think it’s going well, and this is something that I shall cover in later blog posts.


About six months after the release of Tawny-OWL 1.1.0 (, I am pleased to announce the 1.2 release. There have been a number of changes in this release, some planned, some not.

The planned changes have been an extension of the rendering capabilities of Tawny-OWL; it is now possible to render an OWLEntity entirely into Clojure data structures. The main reason for this was to allow use of core.logic for building an effective syntactic query language over OWL. This is now complete, although more work needs to be done on tawny.query if this is to be exploited to it’s fullest.

The main unplanned release was caused by a change in the OWL API which had dramatic (and negative) consequences for Tawny-OWL performance. The details of this change have been documented elsewhere ( Fixing this actually resulted in small increases in rendering performance. This also lead me to some aggresive micro-optimisations of the Tawny-OWL code base, mostly meaning I can avoid using variadic function calls where necessary (these box arguments into a list, which can be quite slow). On small ontologies, Tawny-OWL will load several times faster now.

The pace of development on Tawny-OWL has slowed somewhat in recent times; partly, this is because it is now reaching a point of maturity. However, I have also been diverted by two other projects. I have put a reasonable amount of effort into Linked buffer which provides a rich literate environment, originally intended for Tawny-OWL, although it is not specific to it. And, secondly, I have started a manual for Tawny-OWL called Take Wing. This is developed using Linked Buffer. Not the first example of a literate program as a manual, I know, but probably the first example a literate ontology as a manual.

Despite the slowed pace, I have a roadmap for Tawny 1.3. I wish to support what I am calling transactional restrictions; these are a set of axioms which are either added all together or not at all. The use case for this comes from the Gene Ontology who use annotations on their annotations to describe evidence for particular statements. The meta annotations should not be added to the ontology, until the annotations are.

And secondly, I am reworking tawny.pattern — the existing value partition support has already been changed, and there are some new functions which make it easy to turn a functional pattern into a macro one; the latter produce the more attractive syntax.

No doubt there will be a few unplanned additions also!


Yesterday, I came across a very strange bug in my Tawny-OWL library (1303.0213). The bug was my fault, but fixing it was exacerbated by what has to go into my list of Clojure gotchas (

Consider this very simple piece of Clojure code. Is this a recursive function, i.e. one that calls itself? Intuitively the answer is yes, but I would argue that actually, you cannot tell from the information given. It could be, but is not necessarily so.

(defn countdown [n]
  (if (pos? n)
    (cons n
          (countdown (- n 1)))

Let’s run this code:

test-project.core> (countdown 10)
(10 9 8 7 6 5 4 3 2 1 LiftOff)

All is working as it should and the function is recursing, so there appears to be no problem. However, countdown is a big long to type, so lets create an alias called cdown and test that this works as expected.

test-project.core> (def cdown countdown)
test-project.core> (cdown 10)
(10 9 8 7 6 5 4 3 2 1 LiftOff)

All is good, cdown is doing just what we expect also, and it’s the same as countdown. However, as we as being lazy, I am easily bored, so now we make the countdown go faster by redefining the countdown function.

(defn countdown [n]
  (if (pos? n)
    (cons n
          (countdown (- n 2)))

test-project.core> (countdown 10)
(10 8 6 4 2 LiftOff)

Again, everything is working. But what happens when we call cdown

(cdown 10)

If you are new to clojure, you might assume that as we have changed the definition of countdown the definition of cdown would change as well, so we should get

test-project.core> (cdown 10)
(10 8 6 4 2 LiftOff)

If you are more experienced with Clojure (as I would claim myself to now be, although others may disagree), then you would not that the alias was to the function and not the var, so cdown now contains a pointer to the old definition, so it will behave as it did before.

test-project.core> (cdown 10)
(10 9 8 7 6 5 4 3 2 1 LiftOff)

As it happens, this is not true either. What we actually get it this:

test-project.core> (cdown 10)
(10 9 7 5 3 1 LiftOff)

A wierd combination of the two functions. What is happening is this: the initial function call to cdown calls our first function definition; 10 is postive, so we subtract 1 to get 9. And, now, here is the strange bit. The function is no longer recursive, because it calls countdown through the indirection of the clojure var; we could be more explicit and replace the “recursive” call with this instead which is what is actually happening in the compiled Java class.

((var-get (var countdown)) 10)

So, in short, in Clojure a recursive function is not necessarily recursive, even if it is when defined. If the var changes, then the function no longer calls itself.

There are solutions to this. I should have defined my cdown alias like so:

(def cdown #'countdown)

Now, when we defined countdown the original function gets lost and GCd rather than hanging around, a pale shadow of its former recursive self. This works but is rather counterintuitive; with a Lisp-1 you get used to refering to functions directly rather than with a quoted symbol, but here we have to use both a quote and a #.

A second solution is to use recur, which does not use var indirection. So consider this example. The first countdown2 function is recursive and will always be since it directly refers to itself, and this remains after redefinition.

(defn countdown2 [i]
  (when (pos? i)
    (println i)
    (recur (dec i))))

(def cdown2 countdown2)

(defn countdown2 [i]
  (when (pos? i)
    (println i)
    (recur (- i 2))))

;; prints 10, 9, 8...
(cdown2 10)

Of course, we cannot use this in the first example, because it is not tail recursive. And, if we want to be really picky, we could argue that countdown2 is not recursive either because it is tail-call optimised, so it doesn’t call itself. And this is not an implementation detail either because it’s the whole point of recur.

Another solution would be for Clojure to provide a recur* form (or something equivalent) which would work away from the tail position and which directly recursed, rather than using var indirection. It’s unlikely that Clojure will do this, however, because this form of recursion consumes stack and so is not a good thing to do.

It would also be possible to change the semantics of Clojure symbols so they automatically refered to vars rather than values as appropriate. So in all these cases, we would have vars. As far as I can tell at the moment, the same syntax y is refering to two different things — the value of the var y in the first two forms and the var itself in the third.

(def x y)
(def z {:x y})

(defn y[]

To replicate the current behaviour, we would just use var-get.

(def x (var-get y))

Without thinking about this too deeply, I cannot actually see a downside to this, except that it would involve more var indirection, so would probably be slower.

Finally, clojure could just remove var indirection. Actually, this is likely to happen in the form of build profiles, because it also has the advantage of making Clojure faster. As my example shows, however, it also introduces some subtle changes to the semantics of the language.

I still like Clojure. The more you use a language, I guess, the more oddities you find.


I first discovered the Visitors pattern from reading the original design patterns book, sometime in the last century when, having jumped into computer science, I was busy hoovering up everything I could about programming. After reading the chapter on visitors, it made no sense to me then and I had no real cause to go near them again until I started working on Tawny-OWL ( This library includes a renderer which takes a set of OWL API OWLObjects and turns them into Clojure data structures. Originally, I added this simply to generate documentation, but since then it’s grown to have rather a life of its own. In Java, this would be be implemented with a visitor. In Clojure, I implemented this with multimethods which is often used as a replacement for the visitor pattern but I am now far from convinced that this was a good decision, after the performance of my code collapsed massively. In this article, I describe the problems with multimethods, and some alternatives. For each of these, I use a small example, and bench mark them; you can follow my code on The results are, perhaps, a little surprising. This should be of relevance to anyone addressing the same problem in Clojure.

Visitors and the OWL API

The purpose of a Visitor is to crawl over a set of objects, identified by their type, typically representing an abstract syntax tree, although it can be used for other purposes. It enables the implementation of many different types of behaviour over the tree without having to extend each of the types. This works by having a Visitor which implements one method for each object type. It works well, unless you add a new type, in which case there are a lot of Visitors to re-write. This is one reason that they are found mainly around abstract syntax trees. They tend not to change very much which is a prerequesite for visitors.

The OWL API (1303.0213) is a library which represents the OWL language ( — a mechanism for representing an area of knowledge using logical statements, or an ontology ( It’s a fairly complex API and implements several visitor patterns; it’s main visitor — the OWLObjectVisitor has so many methods I ended up using Clojure and reflection to count them all — 98 is the answer!


Now, Tawny-OWL ( wraps the OWL API to provide an evaluative, R-like environment for building ontologies, built using Clojure, which provides the ability to abstract away from the actual axioms; you can build functions which build axioms, examples of which can be seen in the karyotype ontology ( (1305.3758).

While abstractions are generally nice, it also hides the underlying axioms; so I built tawny.render which takes OWL API objects and turns them into Clojure forms, originally as a mechanism for documentation, although since I wrote it, it has taken on a life of its own.

In Java, the renderer would just have used the Visitor pattern. In Tawny-OWL, I initially implemented this using multimethods. These have quite commonly been put forward as a replacement for the visitor pattern (, so it seemed a reasonable suggestion and it worked well for me, until a new version of the OWL API was released. Suddenly the performance of my test suite crashed catastrophically. Consider, these results: from 40 seconds to 4 minutes.

git clone
git checkout 5870a9

time lein with-profile 3.4.10 test
real    0m45.186s

time lein with-profile 3.5.0 test
real    4m26.070s

The OWL API test suites speed up slightly between the two versions, so it seemed unlikely to be an implementation mistake. I also noticed that most of the time was in the rendering tests — which include one substantial rendering task.

Thanks to git bisect, I tracked the problem down to a single commit 2bb6a0. If you look through the entire log (it’s long, and I did!) the surprising thing is this; there are precisely no implementation changes. All that this commit does is add a (large number) of interfaces to the existing model. Why on earth would this cause a massive drop in performance?

Visitors Revisited

There is one subtlety about the visitor pattern that is not obvious at first sight. If we have a visitor interface like so:

class Visitor{
    public void visit(A a);
    public void visit(B b);
    public void visit(C c);
    public void visit(D d);

At some point, we have to decide which method to actually call. With the visitor pattern it is the entity that accepts the visitor that makes this decision, because it is this entity that calls visit. Consider this code:

class Dimpl implements D{
    public accept( Visitor v ){

In Java, which method to dispatch to is made on the basis of the compile time, static type of the object. Now, the Dimpl class knows it’s own type at compile time, hence we dispatch to the correct Visitor method, even though the class which makes calls the accept method does not know the underlying type of the Dimpl object. Effectively, we pass off part of the computation to the compile time type system.

Clojure cannot do this. Multimethod calls have to check the runtime type of the object to be visited at runtime against the range of possibilities. By introducing a large number of interfaces, it appears the OWL API had increased the search space over which the multimethod isa? call had to crawl. I’ll try and show here that between the two versions of the OWL API, benchmarking isa? call has got significantly slower. This is not enough to explain in performance drop of my tests, although refactoring the multimethod to reduce the number of isa? does actually address the issue, so it really does seem to be the problem.

In the rest of this article, I will consider several different alternatives, benchmarking them and comparing them for performance. Microbenchmarking of this form is hard to do. In this article, I use a combination of the excellent criterium, clojure.core/time supplemented with some macro-benchmarking — running the Tawny-OWL tests. I consider that none of these tests are definitive, and will discuss their advantages and disadvantages as I go. All results are reported to 2 significant figures. Ultimately, my aim with this article is to improve the Tawny-OWL performance, but I hope that the results are clean enough that they are relevant to other environments also.

Set up

The rest of the code for this post is available at The code in this article is a point in time, and will not be updated if I change the visitor benchmark. This project can be cloned and tested using lein with-profile production run. I will not describe all of the plumbing that makes this work here.

First, however, we define a benchmark command. Using Criterium (http:/// directly is a little painful, so I define a small macro which allows me to switch between bench, quick-bench and time. My version of bench also prints out the form so I can see what is happening.

(def bench-form

(defmacro bench [form]
     (println "Benching:" (quote ~form))
     (~bench-form ~form)))

Establishing a base line

First we establish a base line; how long does it take to invoke a function that does nothing at all. While it varies from machine to machine, on my desktop it takes around 10ns.

(defn f [])

(visitor.benchmark.bench/bench (f)

Some objects to work with

I have choosen to use the OWL API and Tawny-OWL to generate my objects, although my belief is that this analysis should be good for any API. I already had a lot of the code for this analysis, the OWL API uses visitors and it is big. Finally, the change between 3.4.10 and 3.5.0 that I describe earlier is a useful test. We generate three objects to use.

(defontology ont)

(defclass AClass)
(defoproperty AProperty)
(def ASome (first  (owl-some AProperty AClass)))


To test multimethods first, I used the following code, which creates a two method multimethod operating on class. After this, I also tried adding many more methods to the multimethod and calling it again.

(defmulti multi class)

(defmethod multi org.semanticweb.owlapi.model.OWLClass [e])
(defmethod multi org.semanticweb.owlapi.model.OWLObjectProperty [e])

(defn multi-on-class []
  (bench (multi AClass)))

(defn multi-on-property []
  (bench (multi AProperty)))

Now using criterium.core/bench gives the following results, both before and after adding extra methods. The basic conclusion here is that, it makes no difference at all: the multimethod is unaffected by the change in number of method and, critically, there is no difference at all between OWL-API 3.4.10 and 3.5.0. Poor evidence for the theory that the multimethod is responsible for the performance drop.

Big Class
Big Property

Poking through the Clojure java code finds this possible issue — the method lookup appears to be cached. dispatchVal here is the value returned by the dispatch function.

private IFn findAndCacheBestMethod(Object dispatchVal)

Testing with clojure.core/time rather confirms this. Consider the results in this table. The first thing to notice is that change in unit — we have moved from nanoseconds to milliseconds. Just for comparision our base test of an empty function reports “Elapsed time: 0.026253 msecs” or around 26μs for a simple invocation — three orders slower than criterium suggests and 5 orders faster than our multimethod invocation. Now, none of these differences are entirely believable, but what is interesting is difference of between 5 and 20 fold between the two versions of the OWL API. Of course, clojure.core/time is potentially risky because of start up time and so forth but, because this should affect the test against 3.4.10 and 3.5.0 equally; it also will also remove the effect of caching in findAndCacheBestMethod single it only runs once.

Big Class
Big Property

Trying to pull apart this appart a little further, let’s test the isa? method more directly, which we test with following benchmarks one of which succeeds and one of which fails.

(bench (isa? (class AClass)
(bench (isa? (class AClass)

First, comparison with criterium.core/bench which shows a massive difference between success and failure (note the change in unit); the reason for this should be obvious — if isa? succeeds it short-circuits; if it fails, it must test all superclasses and the various hierarchies that Clojure adds.


Using the simpler, clojure.core/time shows the same thing as well as a difference between 3.4.10 and 3.5.0.


This distinction is important here. If it fails, then the isa? method is really very, very slow. 3.5.0 is slower, although not that much slower, at least according to clojure.core/time.

I am still not entirely satisified with these results. They do appear to show that multimethods can be pretty slow, and they do show a difference between 3.4.10 and 3.5.0. However, the various caching that multimethods perform should help — even though the OWL API has add many new interfaces, we are still only talking about tens of interfaces.

Implementing a Visitor with a Visitor

The most obvious way to implement a visitor pattern is to, er, implement the visitor. To achieve this in a vaguely friendly way in Clojure, we need to use reify — we cannot use proxy as it cannot overload on type which is the whole point of the visitor pattern.

We can achieve this simply with the following code:

(defn ontology-visitor []
  (reify org.semanticweb.owlapi.model.OWLObjectVisitor
    (^void visit [this
                  ^org.semanticweb.owlapi.model.OWLOntology o]
      (println "v ontology" o))))

(defontology ont)
(.visit (ontology-visitor) ont)

This works fine, but there is a problem. By default, the visitor pattern returns nothing — it works by side effect. In general, and specifically in the OWL API the visit method returns void. This is not at all useful as a replacement for multimethods in a recursive descent renderer. To work around this problem, I add some extra glue in terms of an atom which we change by side effect during visiting and then dereference to provide a return value.

(defn visit-entity [entity]
  (let [r (atom nil)]
     (reify org.semanticweb.owlapi.model.OWLObjectVisitor
       (^void visit [this
                     ^org.semanticweb.owlapi.model.OWLOntology o]
         (reset! r
                 (str "visit o" o)))

       (^void visit [this
                     ^org.semanticweb.owlapi.model.OWLClass c]
         (reset! r
                 (str "visit class" c))))

(visit-entity ont)

(defclass A)
(visit-entity A)

And how does it perform. Well, badly, is the answer: around 25μs for each method invocation (using bench) which is 1000x slower than our base line.

Of course, one obvious complaint with this implementation is that it involves creation of an atom, resetting that atom and so forth; actually, I have tested that as well, but for the purpose of this article, I test this simpler version.

(defn new-object-visitor []
  (reify org.semanticweb.owlapi.model.OWLObjectVisitor
    (^void visit [this
                  ^org.semanticweb.owlapi.model.OWLObjectProperty o])
    (^void visit [this
                  ^org.semanticweb.owlapi.model.OWLClass c])))

This gives a base line score of about 7ns. In short, the creation of a new object is not the problem. In the examples above, I use type hinting, but actually this makes no difference to performance. The fundamental problem here is that our reified object is dispatching on type using runtime checks and reflection.

A big Cond

Another possible option would be to replace the multimethod with a big cond form. I will not show the full code for this as it’s rather long, but an example of part of it looks like this:

(defn- as-form-int
  "Render one of the main OWLEntities as one of the main functions
in tawny."
  [c options]
      (fn [entity-type ^OWLEntity entity]
        (= entity-type (.getEntityType entity)))
    (as-form-int-OWLClass c options)
    (as-form-int-OWLObjectProperty c options)
    (as-form-int-OWLDataProperty c options)
    (as-form-int-OWLAnnotationProperty c options)
    (as-form-int-OWLNamedIndividual c options)
    (as-form-int-OWLDatatype c options)))

Here, we have use a cond to dispatch on the EntityType. The individual methods for the multimethod have been renamed using a rather clunky naming scheme. I do not have individual benchmarks for form of test, but looked at overall it does solve the problem. lein test runs in around 10s and there is no difference between the OWL API 3.4.10 and 3.5.0. In short, it solves the problem.

Unfortunately, the solution comes at a significant cost. Firstly and perhaps most trivially the code looks terrible. The solution also is very specific to the OWL API — without the getEntityType method this fails. In fact, getEntityType does not make all the distinctions we need; I also needed another couple of methods (getClassExpressionType and getDataRangeType) and still needed to fallback to instance? at points.

A different solution would be to just use instance? calls.

(defn small [e]
   (instance? org.semanticweb.owlapi.model.OWLClass e)
   (instance? org.semanticweb.owlapi.model.OWLObjectProperty e)

This behaves as expected under benchmarking. The code shown above with only two instance of calls bench marks at 5ns; however, when expanded to perform all the instance of calls I need, performance drops to 230ns.

I have also tried a few other tricks for optimising type based look up, including replacing the cond with a map lookup. Details are available in my test code. Suffice to say, I did not achieve a solution which was performant.


A final solution would be to use protocols. The idea here is that OWL Objects would know how to render themselves. I have not yet tested this within the context of the OWL API, and have only benchmarks. Consider this code:

(extend-type org.semanticweb.owlapi.model.OWLClass

A bench check of this shows highly performant invocation at around 10ns. The use of time is less happy, however, with a 0.2ms invocation (compared to 0.005ms baseline). I suspect that there is some caching happening at the protocol layer also. I would like to find the time to check protocols further, with a full implementation in the context of tawny.render. Unfortunately, this is quite a bit of (very boring!) code to rewrite.

Adapting the multimethod

The actual solution that I used was different to all of these, simply because it was the first acceptable solution that I found which solved the problem of very long test run. My approach to solving this problem was to address the isa? calls through memoization, using the fact that I know which classes the multimethod dispatches on.

The code for this just affects the dispatch method; we call isa? once and only once for each concrete class of object that we have. The full diff is available here.

(defn lookup
  "Returns the class of the OWLEntity of which this is a subclass."
    (fn fn-lookup [parent]
      (isa? c parent))
    (list org.semanticweb.owlapi.model.OWLClass

(def m-lookup (memoize lookup))

(defmulti multi (fn [c]
                   (class c))))

My results here are a little surprising. The speed of the dispatch seems to slow down using my benchmarking (data not shown). Fortunately, I tried running my test cases BEFORE I did this bench marking, because the results overall are excellent.

git  checkout 2fab56

time lein with-profile 3.4.10 test
real    0m7.875s
user    0m9.547s
sys     0m0.661s

time lein with-profile 3.5.0 test
real    0m7.950s
user    0m9.641s
sys     0m0.696s

As you can see, this memoized version achieves two things. Firstly, it is faster than the original by five fold. But critically, the difference between 3.4.10 and 3.5.0 has gone. For the 3.5.0 version of the OWL API this memoization is 30 fold faster.

Of course, this solution is not ideal. The multimethod is not really doing anything now. We have closed it rather than left it open. It is doing a completely pointless isa? check when all we need is an = check, and it is still doing linear time dispatch when it should be constant. Moreover, I wonder what why this form of memoization is working when the multimethod implementation itself is cached. None the less it works well enough that I am happy with it for the moment.


In this article, I have investigated the visitor pattern and its implementation in Clojure. Multimethods do work, and do produce a nice code base, but they do not perform that well. As my example shows, they are potentially also sensitive to changes in the class structure of dependencies. In short, the runtime decisions of multimethods are not really a good replacement for the compile decisions of the Visitor.

Benchmarking is very difficult under the best of circumstances. Criteriums use of multiple runs brings some statistics to bear on the task, but also produces very poor results for memoized or cached functions if the real-world task defeats this caching. In this article, I have chosen a combination of simple test cases, bench and time but also my test cases from tawny.owl, against two different versions of the OWL API. Clearly the latter is not simple and there are many other variables which could be producing these results. However, what these tests lack in cleanliness, they gain in that they are a real world problem. Although I have tested them in this environment as well as I am able, this lack of cleanliness leaves the possibility that I have just misunderstood the cause.

I would like to test protocols in this environment also. They may be a good solution and appear to run quickly, but I have only tested against benchmark; as we have seen benchmarks are not necessarily clean.

Ultimately, I feel that we need proper “dispatch on type” support in Clojure. Multimethods are much more generic even if they are generally used in this way. I would like to be able to dispatch on more than one argument, have type hints taken care of for me, and most importantly, I would like them to be highly performant. I have managed to achieve some of this by modifying the dispatch method, but I think that this is a more generic problem and should be solved properly. As it stands, I would argue that multimethods are not a good replacement for the visitor pattern.


Thanks to Keith Flanagan for a proof read. Mistakes are my own!