I first discovered the Visitors pattern from reading the original design patterns book, sometime in the last century when, having jumped into computer science, I was busy hoovering up everything I could about programming. After reading the chapter on visitors, it made no sense to me then and I had no real cause to go near them again until I started working on Tawny-OWL (n.d.a) This library includes a renderer which takes a set of OWL API OWLObjects and turns them into Clojure data structures. Originally, I added this simply to generate documentation, but since then it’s grown to have rather a life of its own. In Java, this would be be implemented with a visitor. In Clojure, I implemented this with multimethods which is often used as a replacement for the visitor pattern but I am now far from convinced that this was a good decision, after the performance of my code collapsed massively. In this article, I describe the problems with multimethods, and some alternatives. For each of these, I use a small example, and bench mark them; you can follow my code on https://github.com/phillord/visitor-benchmark. The results are, perhaps, a little surprising. This should be of relevance to anyone addressing the same problem in Clojure.

Visitors and the OWL API

The purpose of a Visitor is to crawl over a set of objects, identified by their type, typically representing an abstract syntax tree, although it can be used for other purposes. It enables the implementation of many different types of behaviour over the tree without having to extend each of the types. This works by having a Visitor which implements one method for each object type. It works well, unless you add a new type, in which case there are a lot of Visitors to re-write. This is one reason that they are found mainly around abstract syntax trees. They tend not to change very much which is a prerequesite for visitors.

The OWL API (n.d.b) is a library which represents the OWL language (n.d.c/) mechanism for representing an area of knowledge using logical statements, or an ontology (n.d.d) It’s a fairly complex API and implements several visitor patterns; it’s main visitor — the OWLObjectVisitor has so many methods I ended up using Clojure and reflection to count them all — 98 is the answer!

Tawny-OWL

Now, Tawny-OWL (n.d.a) wraps the OWL API to provide an evaluative, R-like environment for building ontologies, built using Clojure, which provides the ability to abstract away from the actual axioms; you can build functions which build axioms, examples of which can be seen in the karyotype ontology (https://github.com/jaydchan/tawny-karyotype) (n.d.e)

While abstractions are generally nice, it also hides the underlying axioms; so I built tawny.render which takes OWL API objects and turns them into Clojure forms, originally as a mechanism for documentation, although since I wrote it, it has taken on a life of its own.

In Java, the renderer would just have used the Visitor pattern. In Tawny-OWL, I initially implemented this using multimethods. These have quite commonly been put forward as a replacement for the visitor pattern (n.d.f) so it seemed a reasonable suggestion and it worked well for me, until a new version of the OWL API was released. Suddenly the performance of my test suite crashed catastrophically. Consider, these results: from 40 seconds to 4 minutes.

git clone git@github.com:phillord/tawny-owl.git
git checkout 5870a9

time lein with-profile 3.4.10 test
real    0m45.186s

time lein with-profile 3.5.0 test
real    4m26.070s

The OWL API test suites speed up slightly between the two versions, so it seemed unlikely to be an implementation mistake. I also noticed that most of the time was in the rendering tests — which include one substantial rendering task.

Thanks to git bisect, I tracked the problem down to a single commit 2bb6a0. If you look through the entire log (it’s long, and I did!) the surprising thing is this; there are precisely no implementation changes. All that this commit does is add a (large number) of interfaces to the existing model. Why on earth would this cause a massive drop in performance?

Visitors Revisited

There is one subtlety about the visitor pattern that is not obvious at first sight. If we have a visitor interface like so:

class Visitor{
    public void visit(A a);
    public void visit(B b);
    public void visit(C c);
    public void visit(D d);
}

At some point, we have to decide which method to actually call. With the visitor pattern it is the entity that accepts the visitor that makes this decision, because it is this entity that calls visit. Consider this code:

class Dimpl implements D{
    public accept( Visitor v ){
        v.visit(this);
    }
}

In Java, which method to dispatch to is made on the basis of the compile time, static type of the object. Now, the Dimpl class knows it’s own type at compile time, hence we dispatch to the correct Visitor method, even though the class which makes calls the accept method does not know the underlying type of the Dimpl object. Effectively, we pass off part of the computation to the compile time type system.

Clojure cannot do this. Multimethod calls have to check the runtime type of the object to be visited at runtime against the range of possibilities. By introducing a large number of interfaces, it appears the OWL API had increased the search space over which the multimethod isa? call had to crawl. I’ll try and show here that between the two versions of the OWL API, benchmarking isa? call has got significantly slower. This is not enough to explain in performance drop of my tests, although refactoring the multimethod to reduce the number of isa? does actually address the issue, so it really does seem to be the problem.

In the rest of this article, I will consider several different alternatives, benchmarking them and comparing them for performance. Microbenchmarking of this form is hard to do. In this article, I use a combination of the excellent criterium, clojure.core/time supplemented with some macro-benchmarking — running the Tawny-OWL tests. I consider that none of these tests are definitive, and will discuss their advantages and disadvantages as I go. All results are reported to 2 significant figures. Ultimately, my aim with this article is to improve the Tawny-OWL performance, but I hope that the results are clean enough that they are relevant to other environments also.

Set up

The rest of the code for this post is available at https://github.com/phillord/visitor-benchmark. The code in this article is a point in time, and will not be updated if I change the visitor benchmark. This project can be cloned and tested using lein with-profile production run. I will not describe all of the plumbing that makes this work here.

First, however, we define a benchmark command. Using Criterium (n.d.g) directly is a little painful, so I define a small macro which allows me to switch between bench, quick-bench and time. My version of bench also prints out the form so I can see what is happening.

(def bench-form
  ;;'clojure.core/time
  ;;'criterium.core/quick-bench
  'criterium.core/bench
  )

(defmacro bench [form]
  `(do
     (println "Benching:" (quote ~form))
     (~bench-form ~form)))

Establishing a base line

First we establish a base line; how long does it take to invoke a function that does nothing at all. While it varies from machine to machine, on my desktop it takes around 10ns.

(defn f [])

(visitor.benchmark.bench/bench (f)

Some objects to work with

I have choosen to use the OWL API and Tawny-OWL to generate my objects, although my belief is that this analysis should be good for any API. I already had a lot of the code for this analysis, the OWL API uses visitors and it is big. Finally, the change between 3.4.10 and 3.5.0 that I describe earlier is a useful test. We generate three objects to use.

(defontology ont)

(defclass AClass)
(defoproperty AProperty)
(def ASome (first  (owl-some AProperty AClass)))

Multimethods

To test multimethods first, I used the following code, which creates a two method multimethod operating on class. After this, I also tried adding many more methods to the multimethod and calling it again.

(defmulti multi class)

(defmethod multi org.semanticweb.owlapi.model.OWLClass [e])
(defmethod multi org.semanticweb.owlapi.model.OWLObjectProperty [e])

(defn multi-on-class []
  (bench (multi AClass)))

(defn multi-on-property []
  (bench (multi AProperty)))

Now using criterium.core/bench gives the following results, both before and after adding extra methods. The basic conclusion here is that, it makes no difference at all: the multimethod is unaffected by the change in number of method and, critically, there is no difference at all between OWL-API 3.4.10 and 3.5.0. Poor evidence for the theory that the multimethod is responsible for the performance drop.

OWL API Class Property Big Class Big Property
3.4.10 61ns 61ns 66ns 65ns
3.5.0 62ns 57ns 60ns 63ns

Poking through the Clojure java code finds this possible issue — the method lookup appears to be cached. dispatchVal here is the value returned by the dispatch function.

private IFn findAndCacheBestMethod(Object dispatchVal)

Testing with clojure.core/time rather confirms this. Consider the results in this table. The first thing to notice is that change in unit — we have moved from nanoseconds to milliseconds. Just for comparision our base test of an empty function reports “Elapsed time: 0.026253 msecs” or around 26μs for a simple invocation — three orders slower than criterium suggests and 5 orders faster than our multimethod invocation. Now, none of these differences are entirely believable, but what is interesting is difference of between 5 and 20 fold between the two versions of the OWL API. Of course, clojure.core/time is potentially risky because of start up time and so forth but, because this should affect the test against 3.4.10 and 3.5.0 equally; it also will also remove the effect of caching in findAndCacheBestMethod single it only runs once.

OWL API Class Property Big Class Big Property
3.4.10 220ms 140ms 80ms 88ms
3.5.0 860ms 1400ms 450ms 1400ms

Trying to pull apart this appart a little further, let’s test the isa? method more directly, which we test with following benchmarks one of which succeeds and one of which fails.

(bench (isa? (class AClass)
              org.semanticweb.owlapi.model.OWLClass))
(bench (isa? (class AClass)
              org.semanticweb.owlapi.model.OWLObjectProperty)))

First, comparison with criterium.core/bench which shows a massive difference between success and failure (note the change in unit); the reason for this should be obvious — if isa? succeeds it short-circuits; if it fails, it must test all superclasses and the various hierarchies that Clojure adds.

OWL API Succeeds Fails
3.4.10 76ns 40μs
3.5.0 76ns 33μs

Using the simpler, clojure.core/time shows the same thing as well as a difference between 3.4.10 and 3.5.0.

OWL API Succeeds Fails
3.4.10 30μs 1.4ms
3.5.0 40μs 3.8ms

This distinction is important here. If it fails, then the isa? method is really very, very slow. 3.5.0 is slower, although not that much slower, at least according to clojure.core/time.

I am still not entirely satisified with these results. They do appear to show that multimethods can be pretty slow, and they do show a difference between 3.4.10 and 3.5.0. However, the various caching that multimethods perform should help — even though the OWL API has add many new interfaces, we are still only talking about tens of interfaces.

Implementing a Visitor with a Visitor

The most obvious way to implement a visitor pattern is to, er, implement the visitor. To achieve this in a vaguely friendly way in Clojure, we need to use reify — we cannot use proxy as it cannot overload on type which is the whole point of the visitor pattern.

We can achieve this simply with the following code:

(defn ontology-visitor []
  (reify org.semanticweb.owlapi.model.OWLObjectVisitor
    (^void visit [this
                  ^org.semanticweb.owlapi.model.OWLOntology o]
      (println "v ontology" o))))

(defontology ont)
(.visit (ontology-visitor) ont)

This works fine, but there is a problem. By default, the visitor pattern returns nothing — it works by side effect. In general, and specifically in the OWL API the visit method returns void. This is not at all useful as a replacement for multimethods in a recursive descent renderer. To work around this problem, I add some extra glue in terms of an atom which we change by side effect during visiting and then dereference to provide a return value.

(defn visit-entity [entity]
  (let [r (atom nil)]
    (.visit
     (reify org.semanticweb.owlapi.model.OWLObjectVisitor
       (^void visit [this
                     ^org.semanticweb.owlapi.model.OWLOntology o]
         (reset! r
                 (str "visit o" o)))

       (^void visit [this
                     ^org.semanticweb.owlapi.model.OWLClass c]
         (reset! r
                 (str "visit class" c))))
     entity)
    @r))

(visit-entity ont)

(defclass A)
(visit-entity A)

And how does it perform. Well, badly, is the answer: around 25μs for each method invocation (using bench) which is 1000x slower than our base line.

Of course, one obvious complaint with this implementation is that it involves creation of an atom, resetting that atom and so forth; actually, I have tested that as well, but for the purpose of this article, I test this simpler version.

(defn new-object-visitor []
  (reify org.semanticweb.owlapi.model.OWLObjectVisitor
    (^void visit [this
                  ^org.semanticweb.owlapi.model.OWLObjectProperty o])
    (^void visit [this
                  ^org.semanticweb.owlapi.model.OWLClass c])))

This gives a base line score of about 7ns. In short, the creation of a new object is not the problem. In the examples above, I use type hinting, but actually this makes no difference to performance. The fundamental problem here is that our reified object is dispatching on type using runtime checks and reflection.

A big Cond

Another possible option would be to replace the multimethod with a big cond form. I will not show the full code for this as it’s rather long, but an example of part of it looks like this:

(defn- as-form-int
  "Render one of the main OWLEntities as one of the main functions
in tawny."
  [c options]
  (condp
      (fn [entity-type ^OWLEntity entity]
        (= entity-type (.getEntityType entity)))
      c
    EntityType/CLASS
    (as-form-int-OWLClass c options)
    EntityType/OBJECT_PROPERTY
    (as-form-int-OWLObjectProperty c options)
    EntityType/DATA_PROPERTY
    (as-form-int-OWLDataProperty c options)
    EntityType/ANNOTATION_PROPERTY
    (as-form-int-OWLAnnotationProperty c options)
    EntityType/NAMED_INDIVIDUAL
    (as-form-int-OWLNamedIndividual c options)
    EntityType/DATATYPE
    (as-form-int-OWLDatatype c options)))

Here, we have use a cond to dispatch on the EntityType. The individual methods for the multimethod have been renamed using a rather clunky naming scheme. I do not have individual benchmarks for form of test, but looked at overall it does solve the problem. lein test runs in around 10s and there is no difference between the OWL API 3.4.10 and 3.5.0. In short, it solves the problem.

Unfortunately, the solution comes at a significant cost. Firstly and perhaps most trivially the code looks terrible. The solution also is very specific to the OWL API — without the getEntityType method this fails. In fact, getEntityType does not make all the distinctions we need; I also needed another couple of methods (getClassExpressionType and getDataRangeType) and still needed to fallback to instance? at points.

A different solution would be to just use instance? calls.

(defn small [e]
  (cond
   (instance? org.semanticweb.owlapi.model.OWLClass e)
   true
   (instance? org.semanticweb.owlapi.model.OWLObjectProperty e)
   true))

This behaves as expected under benchmarking. The code shown above with only two instance of calls bench marks at 5ns; however, when expanded to perform all the instance of calls I need, performance drops to 230ns.

I have also tried a few other tricks for optimising type based look up, including replacing the cond with a map lookup. Details are available in my test code. Suffice to say, I did not achieve a solution which was performant.

Protocols

A final solution would be to use protocols. The idea here is that OWL Objects would know how to render themselves. I have not yet tested this within the context of the OWL API, and have only benchmarks. Consider this code:

(extend-type org.semanticweb.owlapi.model.OWLClass
  Render
  (render[this]))

A bench check of this shows highly performant invocation at around 10ns. The use of time is less happy, however, with a 0.2ms invocation (compared to 0.005ms baseline). I suspect that there is some caching happening at the protocol layer also. I would like to find the time to check protocols further, with a full implementation in the context of tawny.render. Unfortunately, this is quite a bit of (very boring!) code to rewrite.

Adapting the multimethod

The actual solution that I used was different to all of these, simply because it was the first acceptable solution that I found which solved the problem of very long test run. My approach to solving this problem was to address the isa? calls through memoization, using the fact that I know which classes the multimethod dispatches on.

The code for this just affects the dispatch method; we call isa? once and only once for each concrete class of object that we have. The full diff is available here.

(defn lookup
  "Returns the class of the OWLEntity of which this is a subclass."
  [c]
  (first
   (filter
    (fn fn-lookup [parent]
      (isa? c parent))
    (list org.semanticweb.owlapi.model.OWLClass
          org.semanticweb.owlapi.model.OWLObjectProperty
          org.semanticweb.owlapi.model.OWLNamedIndividual
          org.semanticweb.owlapi.model.OWLDataProperty
          org.semanticweb.owlapi.model.OWLAnnotationProperty
          org.semanticweb.owlapi.model.OWLDatatype))))

(def m-lookup (memoize lookup))

(defmulti multi (fn [c]
                  (m-lookup
                   (class c))))

My results here are a little surprising. The speed of the dispatch seems to slow down using my benchmarking (data not shown). Fortunately, I tried running my test cases BEFORE I did this bench marking, because the results overall are excellent.

git  checkout 2fab56

time lein with-profile 3.4.10 test
real    0m7.875s
user    0m9.547s
sys 0m0.661s

time lein with-profile 3.5.0 test
real    0m7.950s
user    0m9.641s
sys 0m0.696s

As you can see, this memoized version achieves two things. Firstly, it is faster than the original by five fold. But critically, the difference between 3.4.10 and 3.5.0 has gone. For the 3.5.0 version of the OWL API this memoization is 30 fold faster.

Of course, this solution is not ideal. The multimethod is not really doing anything now. We have closed it rather than left it open. It is doing a completely pointless isa? check when all we need is an = check, and it is still doing linear time dispatch when it should be constant. Moreover, I wonder what why this form of memoization is working when the multimethod implementation itself is cached. None the less it works well enough that I am happy with it for the moment.

Conclusions

In this article, I have investigated the visitor pattern and its implementation in Clojure. Multimethods do work, and do produce a nice code base, but they do not perform that well. As my example shows, they are potentially also sensitive to changes in the class structure of dependencies. In short, the runtime decisions of multimethods are not really a good replacement for the compile decisions of the Visitor.

Benchmarking is very difficult under the best of circumstances. Criteriums use of multiple runs brings some statistics to bear on the task, but also produces very poor results for memoized or cached functions if the real-world task defeats this caching. In this article, I have chosen a combination of simple test cases, bench and time but also my test cases from tawny.owl, against two different versions of the OWL API. Clearly the latter is not simple and there are many other variables which could be producing these results. However, what these tests lack in cleanliness, they gain in that they are a real world problem. Although I have tested them in this environment as well as I am able, this lack of cleanliness leaves the possibility that I have just misunderstood the cause.

I would like to test protocols in this environment also. They may be a good solution and appear to run quickly, but I have only tested against benchmark; as we have seen benchmarks are not necessarily clean.

Ultimately, I feel that we need proper “dispatch on type” support in Clojure. Multimethods are much more generic even if they are generally used in this way. I would like to be able to dispatch on more than one argument, have type hints taken care of for me, and most importantly, I would like them to be highly performant. I have managed to achieve some of this by modifying the dispatch method, but I think that this is a more generic problem and should be solved properly. As it stands, I would argue that multimethods are not a good replacement for the visitor pattern.

Acknowledgements

Thanks to Keith Flanagan for a proof read. Mistakes are my own!