Less than one month after the release of Tawny-OWL 1.2.0 (http://www.russet.org.uk/blog/3018), I am pleased to announce the 1.3.0 release. This is a much smaller release than 1.2.0, but provides two useful changes.

First, I have now added support for axiom annotations as more extensively documented in my past post (http://www.russet.org.uk/blog/3028). While I expect these are still a minority requirement, they are used heavily by some people, and so they need supporting.

Second, I have reworked the support for patterns. This has been through the addition of three functions. p allows writing classes with optionality. So, for instance, consider this code:

(p o/owl-class
   o partition-name
   :comment comment
   :super super)

comment and super are optional here; they can be nil. In this case, the p function removes the nil value from the owl-class call. More over if, as in this case, the entire frame is has only nil values, it will be removed altogether. p returns the results of this call as a clojure record contain the entity itself and the name that was used to create that entity. In a nice piece of serendipity, the support that I have added for annotations (http://www.russet.org.uk/blog/3028), also allows direct use of this record latter in the pattern. So, for example, this form uses partition with is the return value from above.

(map
 #(p o/owl-class o
     %
     :comment comment
     :super partition)
   values)

The reason for this record, though, is that it makes it relatively easy to build patterns in both function and macro form. Macros are never trivial, but all this one does is turn a set of symbols into their string equivalents.

(defmacro defpartition
  "As value-partition but accepts symbols instead of string and
takes the ontology as a frame rather than first argument."
  [partition-name partition-values & options]
  (tawny.pattern/pattern-generator
   'tawny.pattern/value-partition
   (list* (name partition-name)
          `(tawny.util/quote-word ~@partition-values)
          options)))

The practical upshot of this is that I can define a value partition using a macro like this:

(defpartition Hydrophobicity
  [Hydrophobic Hydrophillic]
  :comment "Part of the Hydrophobicity value partition"
  :super PhysicoChemicalProperty
  :domain AminoAcid)

As well as generating the OWL API objects, this also binds the relevant vars, both those visible here (Hydrophobicity) and those generated (hasHydrophobicity).

Take together these are both important additions to Tawny-OWL. We can provide better provenance with the axiom annotations, and with patterns we can lift the level of abstraction at which we build our ontology which is one of the original motivations for Tawny-OWL in the first place.

Tawny-OWL 1.3.0 is now available on Clojars.

Bibliography

Since the early development of Tawny-OWL and easy to use syntax has been a specific objective (http://www.russet.org.uk/blog/2214), as well as hiding some of the complexity of the OWL API. The intension has always been for Tawny-OWL to be an ontology developer tool first and a programmatic library second and keeping this in mind has been part of the reason that I believe does fill these objectives.

Unfortunately, the other part of the reason is that Tawny-OWL does hides functionality that is available in the OWL API. Or, more strictly, does not uncover it. Tawny-OWL is implemented in Clojure and what is possible in Java is also possible in Java.

One of the key decisions was to hide the support that the OWL API provides for certain forms of annotation, in particular annotatons on axioms. Of course, Tawny-OWL allows you to add annotations to entities. This is used to enable labels and comments on any entity. But axiom annotations allow the description of the relationships between entities. So, for example, as well as attaching comments on two classes, it is also possible to attach a comment on the sub/superclass relatinship between the two.

The main reason that Tawny-OWL did not support these natively is that it takes an entity-centric view of OWL. So, if we consider this statement:

(defclass A
   :super B
   :label "A")

We are describing the entity A primarily. In fact, this statement translates into two axioms, which we can see in the OWL/XML representation which looks like this:

<SubClassOf>
    <Class IRI="#A"/>
    <Class IRI="#B"/>
</SubClassOf>
<AnnotationAssertion>
    <AnnotationProperty abbreviatedIRI="rdfs:label"/>
    <IRI>#A</IRI>
    <Literal xml:lang="en"
       datatypeIRI="http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral">A</Literal>
</AnnotationAssertion>

The defclass statement above returns the entity (actually, the var as it is a def form, but the var contains the entity), rather than the axioms. If I wished to return the axioms, as there are several, I would need a list, or more probably, a data structure so that I could extract the axiom I wanted. This would, however, complicate life considerably. For instance, B would now refer to this data structure, which would need unpicking for its use here. Worse, the OWL API works by mutation, so the axioms in B might now reflect only some of the axioms refering to B.

Of course, there is a way around this, which is to dip down into the OWL API, fetching the axioms this way. As far as I can tell, annotations need to be added at the time the axiom is created (it is probably possible to do it later as well). This example comes from my recasting of the OWL Primer ontology.

(add-axiom
 (.getOWLSubClassOfAxiom (owl-data-factory)
  Man Person #{(owl-comment "States that every man is a person")}))

This works well, but the syntax is not nice, we need to do a direct call to the OWL API. We are not even using the add-subclass function. This did not bother my overly, as it was not something that I thought would be needed often.

Unfortunately, it is something that the Gene Ontology people do often, including, for instance, annotating labels with the source of knowledge for these labels. If I am to support them, I need an attractive syntax that fits with current Tawny-OWL syntax. After a couple of attempts, I decided on this:

(defclass A
  :super (annotate B
           (owl-comment "A is a kind of B")))

The axioms in Tawny-OWL are syntactically implicit, describing the :super relationship between A, so I cannot directly address these. But attaching an annotation to B in this way is unambiguous. Compare these two statements that it might otherwise be mistaken for; in one case, we annotate A with a comment (which is most common thing to do) or, we inline an annotation of B (which would probably be better not inline!).

(defclass A
  :super B
  :annotation (owl-comment "A is an interesting entity"))

(defclass A
  :super (owl-class B
            :annotation
               (owl-comment "B is an interesting entity")))

This also extends naturally to other axioms, including annotation labels.

(defclass A
  :annotation (annotate (label "A")
                 (owl-comment "According to me")))

The implementation of this took me several attempts, including some fairly painful and ultimately unsuccesful macros. In the end, I found a much simpler solution. annotate returns a clojure record which contains both the entity — (label "A") or B in these examples — and the annotation always an owl-comment here, but potentially anything. This record is passed through the Tawny-OWL function call stacks in place of the raw entity, until the appropriate axiom is created. I then unpick this object with two calls to protocol methods — as-entity and as-annotations like so.

(.getOWLAnnotationAssertionAxiom
    (owl-data-factory)
    (as-iri named-entity)
    ^OWLAnnotation (as-entity annotation)
    (as-annotations annotation)

The protocol implementations are trivial.

(defrecord Annotated [entity annotations]
  Entityable
  (as-entity [this] entity)
  Annotatable
  (as-annotations [this] annotations))

These allow me to avoid checking for an Annotated object when creating my axiom. In most cases, I will not have one of these, but a normal OWLObject. Or a Long, String or even a Keyword for property characteristics. So I extend the protocols to cover these cases also, with even more trivial implementations.

(extend-type
    Object
  Entityable
  (as-entity [entity] entity))

(extend-type
    Object
  Annotatable
  (as-annotations [entity]
    #{}))

Finally, the annotate function broadcasts as do many other functions in Tawny-OWL, so it is possible to annotate several axioms at once. So, for example, here explicitly using a list.

(defclass A
   :super (annotate [B C D]
             (owl-comment "All of these are supers")))

Or, implicitly with a function existential restriction that itself uses broadcasting.

(defclass A
   :super (annotate (owl-some r B C D)
             (owl-comment "All of these are existentials")))

While the syntax is slightly more complex than most of Tawny-OWL, it is a considerable improvement on dropping down to the OWL API layer beneath; and, ultimately, this form of annotation is a more complex usage of OWL.

Bibliography

In this post, I describe the different technologies that I’ve tried for writing a new book, from markdown to LaTeX.

One of the questions that I have been asked about Tawny-OWL (http://www.russet.org.uk/blog/2366) was whether there was a manual or not. Actually, the answer is, yes, there is and has been since well before the 1.0 release.

However, it’s not rich enough. It makes requires both a reasonably good background in Clojure and in Ontology development. Given that the overlaps between these two areas of knowledge is probably limited to myself and Jennifer Warrendar this is rather less than ideal. So, I’ve started a new manual in the form of a book called Take-Wing. Getting a nice environment has been an interesting journey.


Markdown

The orginal version of the documentation was written in Markdown. There was not really a good reason for this, other than I wanted to try it and it is well supported. It’s not a bad format and is easy to type, although it lacks in extensibility (which is, of course, why it has been extended in so many different ways!).

However, one significant problem was that I have no good way of checking that the code samples in it work. I had been hit by this when I changed function names, for example, from owlclass to owl-class, or more perniciously when I moved one entity from a variable to a function. My general solution to this linked-buffer which translates documentation into code and vice-versa. The lack of an explicit delimiter for the start and end of code blocks makes this harder to implement for markdown.


Asciidoc

My next thought was to move to asciidoc. I like asciidoc and have used it for years. I helped to add slidy support to it, so that I can use it for slides. Most of my teaching material uses asciidoc now, because I can get slides and lecture notes from the same source, with easy to integrate code snippets that I can run, albeit maintained in independent source files which can make things a little painful.

The first version of Take-Wing used asciidoc. Because I wanted to use linked-buffer to generate Clojure source that I could test, I needed to use multiple files with a master file — the Clojure cookbook uses much the same system, albeit for a slightly different reason.

However, writing a talk or even a lecture series is a very different thing from writing a book. The main problem was not asciidoc itself, but the support from Emacs. There is an asciidoc mode but, aside from needing patching, it is heavily focused on syntax highlighting rather than document structure. So, inserting cross-references and the link were painful.


Org-Mode

My next idea was to try org-mode. So, I used pandoc to do the translation; this wasn’t a glowing success, but it achieved the basics anyway.

And Emacs specific solution doesn’t seem ideal, but then for Take-Wing I am likely to be the main author. And I have been using org for a while, particularly for building plans both for Tawny-OWL and also, ironically, for the outline for Take-Wing.

It has clear code demarcation blocks, so adding support for linked-buffers was straight-forward enough. One significant problem is that the org-mode export framework has recently been replaced, so I have to install my own version; actually, installing a new version is not problematic as this can be done with package.el, but because org-mode is in core emacs, deleting the old version is doesn’t happen automatically. I found several version conflicts resulting from loading old files. Still, I managed to get org working, producing both a PDF and nice web page.

Org-mode was pleasant enough to use. The folding features were nice and the syntax is reasonable for typing. However, I found the same problems as with asciidoc. While, org has advanced features for linking to the rest of the world, linking within documents is not so good, especially when using a master document. Basically, I couldn’t get over the envy of my PhD students both of who have written their thesis with AucTex


LaTeX

LaTeX itself is an old piece of software, but it does the job of document preparation like nothing else. And, the support in Emacs is fantastic. However, it’s achilles heel has been that is produces PDFs. I don’t like PDFs for a variety of reasons, but the main one is that, essentially, it’s a screenshot of a piece of paper. I’ve long gone natively digital: I write my papers, blog posts, emails and everything else on computer, and I am used to reading in this way as well. And I want the web for reading documentation.

In my time, I’ve tried lots of different technology for webifying LaTeX. The best I had found was Plastex; my post on LaTeXtowordpress (http://www.russet.org.uk/blog/1740) is still one of my best read, and my colleague even used it to post her PhD thesis (http://themindwobbles.wordpress.com/2012/06/14/converting-a-latex-thesis-to-multiple-wordpress-posts/). It’s a good tool, but it’s not using TeX underneath and so it doesn’t always do the right thing as a result.

For some reason, I managed to have missed tex4ht, though, even though it has been around for quite a while. My initial impressions are, though, that it is probably the best tool that I have seen even if it has been blessed with some of the worst documentation and a very strange command line. I’m hopefully that make4ht may help this. It’s author has been fantastic on StackExchange in answering my questions. So, I decided to give it a go, so I exported by org to latex, threw away my org and started again.

The first problem that I didn’t like it’s handling of the listings package. I have fixed this by dropping code straight through to HTML. Initially, I tried highlight to do syntax highlighting, but then dropped this for prism, as the former cannot do inline highlighting also. The default footnote handling is also very strange, but this is also easy to fix with a standard package. So, now, after a lot of fiddling, it all seems to be working.


Conclusions

It’s been quite a lot of effort and rather more fiddling than I would have hoped, but I now have an editing environment that I like, and this is important. Writing a document is hard, and the environment needs to support this and not get in the way.

I am hopeful that tex4ht will fulfil it’s initial promise. I have been using LaTeX less and less over the years, simply because of its lack of HTML support, despite the fact that it’s the best environment for writing in: so when I have been most happy as an author, I have been least happy as a reader.

All I have to do now is write the rest of the book; so far, I think it’s going well, and this is something that I shall cover in later blog posts.

Bibliography

About six months after the release of Tawny-OWL 1.1.0 (http://www.russet.org.uk/blog/2989), I am pleased to announce the 1.2 release. There have been a number of changes in this release, some planned, some not.

The planned changes have been an extension of the rendering capabilities of Tawny-OWL; it is now possible to render an OWLEntity entirely into Clojure data structures. The main reason for this was to allow use of core.logic for building an effective syntactic query language over OWL. This is now complete, although more work needs to be done on tawny.query if this is to be exploited to it’s fullest.

The main unplanned release was caused by a change in the OWL API which had dramatic (and negative) consequences for Tawny-OWL performance. The details of this change have been documented elsewhere (http://www.russet.org.uk/blog/3007). Fixing this actually resulted in small increases in rendering performance. This also lead me to some aggresive micro-optimisations of the Tawny-OWL code base, mostly meaning I can avoid using variadic function calls where necessary (these box arguments into a list, which can be quite slow). On small ontologies, Tawny-OWL will load several times faster now.

The pace of development on Tawny-OWL has slowed somewhat in recent times; partly, this is because it is now reaching a point of maturity. However, I have also been diverted by two other projects. I have put a reasonable amount of effort into Linked buffer which provides a rich literate environment, originally intended for Tawny-OWL, although it is not specific to it. And, secondly, I have started a manual for Tawny-OWL called Take Wing. This is developed using Linked Buffer. Not the first example of a literate program as a manual, I know, but probably the first example a literate ontology as a manual.

Despite the slowed pace, I have a roadmap for Tawny 1.3. I wish to support what I am calling transactional restrictions; these are a set of axioms which are either added all together or not at all. The use case for this comes from the Gene Ontology who use annotations on their annotations to describe evidence for particular statements. The meta annotations should not be added to the ontology, until the annotations are.

And secondly, I am reworking tawny.pattern — the existing value partition support has already been changed, and there are some new functions which make it easy to turn a functional pattern into a macro one; the latter produce the more attractive syntax.

No doubt there will be a few unplanned additions also!

Bibliography

Yesterday, I came across a very strange bug in my Tawny-OWL library (1303.0213). The bug was my fault, but fixing it was exacerbated by what has to go into my list of Clojure gotchas (http://www.russet.org.uk/blog/2991).

Consider this very simple piece of Clojure code. Is this a recursive function, i.e. one that calls itself? Intuitively the answer is yes, but I would argue that actually, you cannot tell from the information given. It could be, but is not necessarily so.

(defn countdown [n]
  (if (pos? n)
    (cons n
          (countdown (- n 1)))
    '(LiftOff)))

Let’s run this code:

test-project.core> (countdown 10)
(10 9 8 7 6 5 4 3 2 1 LiftOff)

All is working as it should and the function is recursing, so there appears to be no problem. However, countdown is a big long to type, so lets create an alias called cdown and test that this works as expected.

test-project.core> (def cdown countdown)
#'test-project.core/cdown
test-project.core> (cdown 10)
(10 9 8 7 6 5 4 3 2 1 LiftOff)

All is good, cdown is doing just what we expect also, and it’s the same as countdown. However, as we as being lazy, I am easily bored, so now we make the countdown go faster by redefining the countdown function.

(defn countdown [n]
  (if (pos? n)
    (cons n
          (countdown (- n 2)))
    '(LiftOff)))

test-project.core> (countdown 10)
(10 8 6 4 2 LiftOff)

Again, everything is working. But what happens when we call cdown

(cdown 10)

If you are new to clojure, you might assume that as we have changed the definition of countdown the definition of cdown would change as well, so we should get

test-project.core> (cdown 10)
(10 8 6 4 2 LiftOff)

If you are more experienced with Clojure (as I would claim myself to now be, although others may disagree), then you would not that the alias was to the function and not the var, so cdown now contains a pointer to the old definition, so it will behave as it did before.

test-project.core> (cdown 10)
(10 9 8 7 6 5 4 3 2 1 LiftOff)

As it happens, this is not true either. What we actually get it this:

test-project.core> (cdown 10)
(10 9 7 5 3 1 LiftOff)

A wierd combination of the two functions. What is happening is this: the initial function call to cdown calls our first function definition; 10 is postive, so we subtract 1 to get 9. And, now, here is the strange bit. The function is no longer recursive, because it calls countdown through the indirection of the clojure var; we could be more explicit and replace the “recursive” call with this instead which is what is actually happening in the compiled Java class.

((var-get (var countdown)) 10)

So, in short, in Clojure a recursive function is not necessarily recursive, even if it is when defined. If the var changes, then the function no longer calls itself.

There are solutions to this. I should have defined my cdown alias like so:

(def cdown #'countdown)

Now, when we defined countdown the original function gets lost and GCd rather than hanging around, a pale shadow of its former recursive self. This works but is rather counterintuitive; with a Lisp-1 you get used to refering to functions directly rather than with a quoted symbol, but here we have to use both a quote and a #.

A second solution is to use recur, which does not use var indirection. So consider this example. The first countdown2 function is recursive and will always be since it directly refers to itself, and this remains after redefinition.

(defn countdown2 [i]
  (when (pos? i)
    (println i)
    (recur (dec i))))

(def cdown2 countdown2)

(defn countdown2 [i]
  (when (pos? i)
    (println i)
    (recur (- i 2))))

;; prints 10, 9, 8...
(cdown2 10)

Of course, we cannot use this in the first example, because it is not tail recursive. And, if we want to be really picky, we could argue that countdown2 is not recursive either because it is tail-call optimised, so it doesn’t call itself. And this is not an implementation detail either because it’s the whole point of recur.

Another solution would be for Clojure to provide a recur* form (or something equivalent) which would work away from the tail position and which directly recursed, rather than using var indirection. It’s unlikely that Clojure will do this, however, because this form of recursion consumes stack and so is not a good thing to do.

It would also be possible to change the semantics of Clojure symbols so they automatically refered to vars rather than values as appropriate. So in all these cases, we would have vars. As far as I can tell at the moment, the same syntax y is refering to two different things — the value of the var y in the first two forms and the var itself in the third.

(def x y)
(def z {:x y})

(defn y[]
  (y))

To replicate the current behaviour, we would just use var-get.

(def x (var-get y))

Without thinking about this too deeply, I cannot actually see a downside to this, except that it would involve more var indirection, so would probably be slower.

Finally, clojure could just remove var indirection. Actually, this is likely to happen in the form of build profiles, because it also has the advantage of making Clojure faster. As my example shows, however, it also introduces some subtle changes to the semantics of the language.

I still like Clojure. The more you use a language, I guess, the more oddities you find.

Bibliography

Godwin’s law says that sooner or later every argument on the internet ends up with someone being called a Nazi. Interestingly in the Open Access debate, this has turned out not to be true; in this case, it’s been communists instead.

Perhaps the best known example of this was the Jeffrey Beale rant, which uses the term “anti-corporatist” and “collectivizing”. A more direct communist angle comes from Moshe Vardi at ACM. He defends the idea that publishing is expensive because of all the fixed costs, something seen from the ACM before, and which I have commented on (http://www.russet.org.uk/blog/1924). He also makes the tired confusion that “free software” is the same thing as “anti-copyright”.

Now, personally, I do not particularly mind being called a communist — I do have left-leaning political views, but in the US “communist” is essentially used as a short-hand for being personally responsible for Stalinist purges.

For me OA is just a small part of the issue. I do want OA because I get irritated that I have to read papers at work because at home I can’t login, as well as the slightly more philanthropic ideal that scientific knowledge should be free.

But I also want the entire publication process to be easier (http://www.russet.org.uk/blog/2248). I want a single submission format, free from the stupidity of multiple formats. I want peer review to be open and acknowledged. I want the process to take less time, less effort and less money. I want colour figures, I want animation (http://www.russet.org.uk/blog/2189). OA does not necessarily provide this: both OUP (10.1093/bioinformatics/bts372) and PLOS (10.1371/journal.pone.0075541) were painful. However, arXiv for example, provides cheap, rapid and simple publication at $7 a paper. WordPress manages billions of articles as a loss leader.

Open-Access is only part of the issue. Scientific authoring is a hard, arduous and difficult process. Scientific publishing should not be. This is the issue.

Bibliography