## Archive for the ‘Ontology’ Category

### The Epistemology of Pizza

Over the years, a great deal has been written about the ontology of pizza. It’s a good example: it is easy to understand and works surprisingly well in a tutorial context. It also comes up surprisingly often in the public sphere, as it did last year on BBC News. The key point of that article is this: the pizza maker argues that you can’t have a marinara (tomato and garlic) with added mozzarella because a marinara is a pizza rossa, which can’t have mozzarella; a margherita (tomato and mozzarella) with garlic is fine though. Ha, those crazy Italians. I paraphrase, of course.

Of course, the right to comment first on this article rests with Robert Stevens, my colleague and world-acknowledged authority on the ontology of pizza, and indeed he has done so.

I would like to take a slightly different approach to the question though. How do we know what our ontology should be? I’ll start with an ontological reading of the article, look at some issues with it, and then consider how we might gather the knowledge to resolve them. As with many parts of ontology construction, there is not a perfect answer and it turns out to be more complex than it might appear at first sight.

## A Literate Approach

I start by building an ontology. As (both) regular readers of this journal might expect, I am going to use Tawny-OWL to do this, so the syntax is a little different from that of Robert’s post.

I’m starting from a slightly different place from Robert. One of his concerns was to fit within the context of the existing pizza ontology. Now, I used to be a firm believer in the idea that ontologies were all about a shared conceptualisation of the world, but over time, I have become less sure that this should always be a consideration. In this case, the purpose of building the ontology is to allow me to formally and explicitly describe something within the context of this blogpost, to clarify my own understanding. Sharing is irrelevant for this use case and I am going to build my ontology rapidly from scratch.

I am a big fan of literate ontologies and I want to do that with blogs as well. So the source code of this post (accessible from http://archive.org/download/phil-lord-journal/) is lenticular text and the whole article can be parsed as a valid ontology. Perhaps no change for the reader, but a comfort for me as the author to know that I’ve evaluated every statement in this post.

    (ns the-epistemology-of-pizza
      (:refer-clojure :only [])
      (:use [tawny owl english reasoner]))

    (reasoner-factory :hermit)

As with all Tawny-OWL ontologies, we start with a namespace declaration. If you are reading this and use Clojure a lot, note the :refer-clojure and use of tawny.english: none of or, some or not are the clojure.core functions! We select a default reasoner also.

    (defontology epistemology)

    (defclass PizzaTopping)
    (defoproperty hasTopping)

    (as-subclasses
     PizzaTopping
     :disjoint
     (defclass Mozzarella)
     (defclass Tomato)
     (defclass Garlic))

We need a set of primitive terms on which to base the ontology, which we define here. Again, this is not an ontology for sharing: I have not added labels or formal textual definitions, nor do I need to care about what the IRIs actually are. We only need three toppings and all we need to care about is that they are different.

    (defclass Pizza
      :super (some hasTopping (or Mozzarella Tomato)))

It turns out, I do not need to care about pizza bases either, so I am not going to talk about them explicitly. Rather unusually (and in a difference from the pizza ontology) I am going to insist that a Pizza have either Mozzarella or Tomato. I’ll come back to this later.

    (defclass PizzaRossa
      :super Pizza
      (only hasTopping (not Mozzarella)))

    (defclass PizzaBianca
      :super Pizza
      (only hasTopping (not Tomato)))

The definitions of PizzaRossa and PizzaBianca are a little surprising; they are defined negatively, but I quite like these definitions. They are sort of like the backward definitions of the fly geneticist — the “white” gene is responsible for the enzyme that causes red eyes. The gene is named after the mutant.

    (defclass Marinara
      :super PizzaRossa
      (some hasTopping Garlic Tomato))

The definition of Marinara is now straightforward enough, simply stating the ingredients, and that Marinara is a PizzaRossa.

    (clojure.core/assert
     (with-probe-entities
       [WeirdMarinara
        (owl-class "WeirdMarinara"
                   :super Marinara
                   (some hasTopping Mozzarella))]
       (clojure.core/not (coherent?))))

And so we get to the humorous crux of the story, which is that, indeed, you cannot have a Marinara with Mozzarella.
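For readers without a reasoner to hand, the same conclusion can be approximated in plain Clojure. This is only a sketch under a strong assumption: a pizza is treated as a closed set of topping keywords, which is not how OWL's open-world semantics work, and the predicate names (`pizza?`, `pizza-rossa?`, `marinara?`) are hypothetical stand-ins rather than anything Tawny-OWL provides.

```clojure
;; A pizza is modelled as a set of topping keywords (closed world).
;; These predicates are hypothetical stand-ins for the OWL classes above.

(defn pizza?
  "A pizza has at least one of mozzarella or tomato."
  [toppings]
  (boolean (some toppings [:mozzarella :tomato])))

(defn pizza-rossa?
  "A pizza rossa is a pizza with no mozzarella."
  [toppings]
  (and (pizza? toppings)
       (not (contains? toppings :mozzarella))))

(defn marinara?
  "A marinara is a pizza rossa with both garlic and tomato."
  [toppings]
  (and (pizza-rossa? toppings)
       (every? toppings [:garlic :tomato])))

(marinara? #{:garlic :tomato})              ; => true
(marinara? #{:garlic :tomato :mozzarella})  ; => false
```

The second call returns false for the same reason that the reasoner reports incoherence: adding mozzarella violates the pizza rossa condition that the marinara inherits.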

## The Epistemological Questions

But this leaves us with a number of problems. One is that in explaining the joke, we have rather killed it; one sad reality that all ontologists have to face is that ontology pushes us toward pedantry, making us humourless, crushing bores. A much deeper problem, though, is that what we have produced is a computational data structure, which we have queried, and the answer we got is about that computational data structure, when what we really want to know about is pizza.

How do we know whether what we have modelled is correct? Is this ontology a good ontology, an accurate reflection of reality? And what does this mean anyway? In short, how do we know what we know?

Let’s consider the issue from a set of different perspectives.

### The Software Engineer

As a software engineer, I’m rather fond of the ontology that I have produced, and in that sense it is a good ontology. I’ve already said that I quite like the “backward definitions”, and find them quite elegant. The ontology is also symmetrical: PizzaBianca and PizzaRossa are both defined in the same, quite equivalent way.

Now, of course, we might argue that these considerations do not tell us that we have a good ontology. But an elegant and symmetrical axiomatisation is useful; it’s easier to remember, there are no “special cases”, and it is easy to spot outliers. All of these things support maintainability of software in general and specifically in ontologies.

There are some issues with my ontology; we have not, for instance, followed Alan Rector’s normalisation pattern; the Margherita is neither white nor red in this schema.

Margherita presents a bigger problem, though, than not being normalised, which is simply that my knowledge of margherita tells me that it is normally considered a pizza rossa, while my ontology says it is not. My ontology is nice, but it is wrong.

### The Philosophical Approach

We could also consider this ontology from a more philosophical point-of-view. Of course, I am perhaps not the ideal person to do this, but I would note that our ontology has a clear, single inheritance hierarchy. Most of our classes have clear differentia (I’ve just stopped at the toppings, but you can’t define everything, because that way lie turtles). Even our definition of pizza could fit into a larger hierarchy: pizza without either mozzarella or tomato is either a focaccia or some other type of bread.

After that, I am rather stuck. It is hard to draw many more conclusions about pizza from first principles. From a realist point-of-view, we should model reality and universals, which sounds nice. But how do I determine what that reality is?

Time to phone a friend.

### The Expert Analysis Technique

One standard technique in ontology building is to consult with an expert. Indeed, that is often the main evidence and justification that is used to support an ontology, which is why many ontology papers have more authors than the human genome paper. So, let’s try it in this case. The BBC article quotes from quite a few experts.

“La marinara is a pizza rossa,” she states frostily. “A pizza rossa is made with tomato and without mozzarella. So you can’t have a marinara with mozzarella because there’s no such thing.”

Emanuela
— BBC

My ontology supports this because Marinara is a PizzaRossa so, indeed, cannot have Mozzarella.

“No, it’s not,” pipes up a customer who until now has been quietly consuming his pizza and beer on a stool behind me. “She’s right. A pizza rossa can’t have mozzarella.”

Customer
— BBC

Also!

“The pizzaiola is right. A marinara is not a marinara if you add mozzarella. But she was wrong to say she would make you a margherita with garlic because margherita with garlic doesn’t exist.”

Friend
— BBC

Currently, my ontology says nothing about this one way or the other. But, we can add a closed definition for Margherita easily, and now, indeed, a margherita with garlic cannot exist.

    (defclass Margherita
      :super Pizza
      (some-only hasTopping Mozzarella Tomato))

    (clojure.core/assert
     (with-probe-entities
       [WeirdMargherita
        (owl-class "WeirdMargherita"
                   :super Margherita
                   (some hasTopping Garlic))]
       (clojure.core/not (coherent?))))

I also tried extending my analysis further with a novel technique: I asked my friends on Facebook. One of them replied as follows.

my take on this is that we must distinguish between pizzas made by bakeries (pizza bianca and rossa) which tipically (but theres no firm rules) do not have mozzarella, and pizza from a pizzeria (restaurant) which almost always has mozzarella (bianca and rossa). a pizza bianca without mozzarella is a focaccia ( with rosemary, onions, potatoes, etc)

— Ulisse Pizzi

It’s a disaster! Almost none of this has been modelled in my ontology, although it supports my assertion that a pizza must have tomato or mozzarella. But worse, “there are no firm rules”; it’s enough to make a grown ontologist cry. But there are some deeper frustrations here. None of the experts have told me everything that I want to know. None have really answered what a pizza is.

We could ask some more experts. But what happens when they contradict each other? Do we take averages? And, more importantly, how do I know when to stop? I could carry on asking friends all day. And how do we avoid cherry-(tomato)-picking?

### From the Definitive sauce (erm, source)

Having tried ontology by Facebook, let’s try a research method with a much older and richer pedigree; we will look the answer up on Wikipedia instead. First port of call is the main Pizza article, which says many things, but none of them that useful in this case. Hunting around got me to Pizza al taglio (pizza by the slice), which says:

The simplest varieties include pizza Margherita (tomato sauce and cheese), pizza bianca (olive oil, rosemary and garlic),[4] and pizza rossa (tomato sauce only).

— Pizza al taglio

This is interesting because it brings in garlic and rosemary (which has not been mentioned before). In this version of the world, a margherita is not a pizza bianca. And there is a lack of symmetry between rossa (which is closed) and bianca (which is not).

Perhaps a better source of material would be the Italian Wikipedia and its definition of Pizza. This says:

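Pizza marchigiana […] Le varianti tradizionali sono quattro: bianca[12] con il rosmarino, bianca[12] alla cipolla, rossa semplice[13] e rossa[13] con la mozzarella. [In English: There are four traditional variants: bianca with rosemary, bianca with onion, plain rossa, and rossa with mozzarella.]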

Pizza marchigiana
— Italian Wikipedia

So, more types of pizza, including with rosemary, with onion and simple. The footnotes are more informative still.

[12] Per “bianca” si intende una pizza senza pomodoro. [13] Per “rossa” si intende una pizza con il pomodoro. [In English: [12] “bianca” means a pizza without tomato. [13] “rossa” means a pizza with tomato.]

Pizza
— Italian Wikipedia

So, pizza bianca must NOT have tomato, while rossa MUST have it. The definition of pizza rossa here is inconsistent with mine, but makes more sense to be honest — margherita finally becomes a pizza rossa. Strictly, this quote only really applies to “pizza marchigiana” — once we move out of Marche, all our definitions might be different!
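The difference between the two definitions of rossa is easy to see side by side. The following is a minimal sketch in plain Clojure, assuming again that a pizza is just a closed set of topping keywords; the function names are hypothetical and nothing here uses the reasoner.

```clojure
;; A pizza is a set of topping keywords; all names here are illustrative.

(defn rossa-first-attempt?
  "My original definition: a pizza (tomato or mozzarella) with no mozzarella."
  [toppings]
  (and (boolean (some toppings [:mozzarella :tomato]))
       (not (contains? toppings :mozzarella))))

(defn rossa-marchigiana?
  "Footnote [13]: a rossa is a pizza with tomato."
  [toppings]
  (contains? toppings :tomato))

(defn bianca-marchigiana?
  "Footnote [12]: a bianca is a pizza without tomato."
  [toppings]
  (not (contains? toppings :tomato)))

(def margherita #{:tomato :mozzarella})

(rossa-first-attempt? margherita)  ; => false
(rossa-marchigiana? margherita)    ; => true
(bianca-marchigiana? margherita)   ; => false
```

Under the footnote definitions the margherita is a rossa and not a bianca, which matches my own expectation; under my first attempt it is excluded from rossa by its mozzarella.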

Another useful thing that does come out of reading Wikipedia, however, is the information that Pizza Napoletana has regionally protected status in the EU, and that there is an official body for defining what is a pizza, namely the AVPN. Their document describing the pizza is called the disciplinare (also in English).

And, indeed, it has definitions, although unfortunately only of two pizzas, the Marinara and the Margherita. So, the answers we get from here are limited; but they look like this:

Marinara (tomato, oil, oregano, and garlic) Margherita (tomato, oil, mozzarella or fior di latte, grated cheese and basil)

Disciplinare
— Associazione Verace Pizza Napoletana

So, we have our answer? Well, it’s one I find surprising (grated hard cheese on margherita!). But, actually, the disciplinare is more specific. For instance, where they say tomato, not just any tomato will do; it has to be of a very specific type:

The following variations of fresh tomatoes can be used: “S.Marzano dell’Agro Sarnese-nocerino D.O.P”., “Pomodorini di Corbara (Corbarino)”, “Pomodorino del piennolo del Vesuvio” D.O.P.” (see attached appendices for suppliers and technical details)

Disciplinare
— Associazione Verace Pizza Napoletana

How can we encode this? It is really just too complicated, and is unlikely to be useful ontologically at any point; worse, we only have answers for two pizza types and there are many others. Looking further, there are also instructions for the flour, the proving, the stretching and much more besides. So, it turns out that the definitive answer is not that useful either: it is incomplete and overly complicated.

## Conclusions

The BBC article’s take on the whole process is this:

Pizza has taught me that logic can be subjective and that subjective logic can be cultural

— BBC

Of course, it’s not true; logic is not subjective at all, although there are many different forms of logic. But definitions can be. I have tried multiple different mechanisms of reaching a definition, and all of them have flaws:

• software engineering approach: maintainable and nice, but correct?
• philosophical approach: limited in its applicability
• find a friend (the Facebook approach): prone to cherry picking
• literature review (the Wikipedia approach): lacks interactivity, so may not answer the question
• a definitive source (the authoritarian approach): over specified, under covering

Definitions are difficult, and there will be no universal answer. Having a clearly defined use case, and a mechanism to test your ontology against that use case, remains key. So does having a clear awareness of the techniques that you are using to build your ontology and the flaws that exist with them.

It is also a good excuse to eat pizza, if you need one.


### Bio-Ontologies and ICBO

I’m winding my way back from a busy month with both Bio-Ontologies and ICBO, but in general I think the experience has been really positive, even if interspersing holiday and work travel has rather exhausted me. But both were in Europe and Bio-Ontologies was right next door, so I did not want to waste the opportunity.

I have a long history with Bio-Ontologies, having been a chair for many years and an informal helper before that. We steered it from an informal meeting to having a proper programme committee, proceedings and much of the structure that it has now. I bumped into Steven Leard at the meeting, and was rather shocked to realise that the first meeting I helped out at was 14 years ago.

Strangely, though, since my last time as a chair, five or six years ago, I have not been back once. For a few years, of course, this was quite deliberate; I was so fed up with travelling at that time of year that I really enjoyed the rest. But since then, it has been happenstance rather than a deliberate decision. So, it felt like a bit of a home-coming, even if I have seen many of the people at different conferences on different occasions. Mark Musen gave an interesting keynote: I was, at the time, rather unconvinced by his hypothesis that we don’t spend enough time arguing (I mean, ontologists, really?). A more nuanced reading of what he said, though, is that we should assess and re-assess our practices against the evidence of our experience. I cannot help but agree with this, and it has made me think again. More on that later, perhaps.

It was nice to go to Dublin, also, as it was my first time. Nice city, deeply integrated with its river. We had some nice food, in some good restaurants and cafes, and a blissful absence of Irish theme pubs. The conference venue was good also, even if it does look like a vacuum cleaner from outside.

ICBO was a different kettle of fish, though. At four days (many of the delegates go for the whole thing) it’s long, and I felt rather stretched by the end (I’m on the plane home now, after a very early start, which might be colouring my vision). This does give plenty of time for slightly longer and more detailed presentations; the workshops were small, intense and full of discussion. Likewise, the poster and demo sessions. I rather blitzed the conference with Tawny-OWL and Lentic. In total, I gave 1 tutorial; 1 paper; 1 demo; 1 flash update on the demo and 1 feedback session on the tutorial. People seemed genuinely sympathetic and a little sad when my cute Tawny-OWL logo went 404 during the flash update. For those who missed it, the logo is online, as is the logo for Lentic, which lacks in cuteness but is rather more dramatic.

I got some good feedback, and was surprised to win the best demo session (I mean, it was entirely text running in Emacs, and very laggy, running on my 5 year-old netbook). The second place was James Overton’s Robot. I am told between the two of us we got a very large percentage of the vote. I think this is an interesting result, because it strongly suggests to me that, for ICBO attendees, there is dissatisfaction over current tooling. Ontologies are being more programmatically developed and I cannot help but feel that this is the future.

I thought I had never been to Lisbon before, but on getting there I realised I had been, about 20 years ago; the story is long, and not that interesting, so I will skip describing it here. This time I had a better look and I will not forget again. Lisbon is a very nice city indeed; while its architectural elegance may not be quite up there with Rome (or even Milan), it’s certainly not far behind, and as a city built into and with its geography it is stunning.

In summary, an interesting month from an ontology perspective and one that I enjoyed very much. While I might have wanted something a little less hectic (especially as I interspersed my holidays), it has left me with the sense both that ontologies are a productive part of the bioinformatics environment and that there is more to come.

## Abstract

Bio-medical ontologies can contain a large number of concepts. Often many of these concepts are very similar to each other, and similar or identical to concepts found in other bio-medical databases. This presents both a challenge and opportunity: maintaining many similar concepts is tedious and fastidious work, which could be substantially reduced if the data could be derived from pre-existing knowledge sources. In this paper, we describe how we have achieved this for an ontology of the mitochondria using our novel ontology development environment, the Tawny-OWL library.

• Jennifer D. Warrender
• Phillip Lord

## Plain English Summary

Ontologies allow complex descriptions of the world in a way that is both precise and computationally amenable; that is, computers can be used to check and query these descriptions. The mitochondrion is a critical part of the cells of most organisms, being responsible for energy usage. We wished to build an ontology describing the current research on the mitochondrion.

The more traditional approach to this would have been to build the ontology from scratch; but many parts of the mitochondrion, including the genes and proteins, have already been described in other databases. Building from scratch on the basis of the data in these databases would be time-consuming, but also sensitive to change: if the database changes, our ontology would need updating too.

Instead we have used our new ontology development methodology to automatically extract this knowledge and build the ontology for us, providing what we describe as the scaffold for an ontology. In future, we will add more knowledge to this ontology, slowly building up the rich description of the mitochondrion that we are aiming for.

## Abstract

Ontology development relates to software development in that they both involve the production of formal computational knowledge. It is possible, therefore, that some of the techniques used in software engineering could also be used for ontologies; for example, in software engineering testing is a well-established process, and part of many different methodologies. The application of testing to ontologies, therefore, seems attractive. The Karyotype Ontology is developed using the novel Tawny-OWL library. This provides a fully programmatic environment for ontology development, which includes a complete test harness. In this paper, we describe how we have used this harness to build an extensive series of tests as well as used a commodity continuous integration system to link testing deeply into our development process; this environment is applicable to any OWL ontology whether written using Tawny-OWL or not. Moreover, we present a novel analysis of our tests, introducing a new classification of what our different tests are. For each class of test, we describe why we use these tests, also by comparison to software tests. We believe that this systematic comparison between ontology and software development will help us move to a more agile form of ontology development.

• Jennifer D. Warrender
• Phillip Lord

## Plain English Summary

Ontologies are a mechanism for representing parts of the world computationally. They allow you to describe the world in a complex way, and then query over it repeatably and consistently. However, ontologies are complex and are themselves hard to build consistently and repeatably. If the ontology is built incorrectly, then queries will give the wrong answers also.

Software is also complex and over the years, software engineers have developed many techniques for building software so that it, too, is correct. While these do not always succeed, they have allowed us to produce software that is vastly more complex than in years past. One important technique is automated testing. Here software can be run to ensure that it is behaving correctly automatically and often. To do this, we use one piece of software to test another.

We have borrowed the same technology for use with ontologies; while this has been done before, our use of commodity testing software has allowed us to scale up the tests significantly, and we describe this approach in this paper. However, while they have many similarities, ontologies are not software. The sort of tests that we need for ontologies may be different from those that we need for software. In this paper, we also describe the kinds of tests that we have used for the karyotype ontology, and which are probably relevant to other ontology development efforts too.

Overall, this should increase our understanding of how to build ontology tests and ontologies.


### On Function, Recursion and Evolution

I was entertained to see the recent publication of a new paper on the definition of function. I met one of the authors at a meeting a few years back in Durham, and had a very nice discussion about my own contribution to this definition, which I published previously.

I do not want to discuss the paper in full, which is a nice paper and worth a read. I do however want to comment more specifically about the parts that explicitly and implicitly address my own paper.

At the start of the paper, the authors discuss the criteria for their definition, which include this:

Avoidance of epiphenomenalism: Functions should be determined by current performance of its bearer, not mainly by causally inert historical facts like its (evolutionary or cultural) history or a mere ascription by its producers, users, or observers

I found this a fairly strange criterion; it’s not clear to me why historical facts are inert; especially in biology, the evolutionary history of an organism is surely one of the most important features. Originally, this criterion comes from another paper by Artiga, who says:

We want to find out what is the lung’s function, we would probably look at what lungs actually do in our body. We would see that they enable respiration, so we would conclude that this is their function. Why they came to be here seems completely irrelevant for function attribution.

Obviously, this means “most people’s” bodies rather than just one, given that lungs do (somewhat) different things in different people. But I do not think that why they came to be here is irrelevant, at least not if we wish to distinguish a function from a role. My fingers are currently engaged in typing, but few people would describe this as a function (although most would say that precise and controlled manipulation of the world is). Or, to take a more extreme example, after Robert Hoehndorf, the heart actually does produce loud thumping noises. Surely not a function?

I am also slightly disappointed that what I think is one of the key points of my own function paper has been missed from their list of criteria. In it, I say:

I consider whether these definitions are applicable; for a given set of entities how do we decide whether we have a function (of either subclass) or a role.

Given a definition, I should be able to produce at least one practical test that I can use to determine whether that definition holds; I think that this notion of applicability needs to be more widely considered.

Now, my actual definition of biological function was:

A biological function is a realizable entity that inheres in a continuant which is realized in an activity, and where the homologous structure(s) of individuals of closely related and the same species bear this same biological function.

The language has been chosen to mirror BFO since it was in this context that the paper was addressed; I think it could be simplified and made more readable, but I was constrained by the language of BFO. Now, the first criticism of my definition is on technical grounds, namely:

Lord claims that his definition is recursive rather than circular, despite the occurrence of the word “function” in the definiens.

My use of this form of definition was, of course, deliberate and partly provocative; perhaps, it is something that I should not have done, since it has muddied the water somewhat as this comment shows. In fact, it is very easy to work around this criticism by simply removing the recursion:

A biological function is a …. same species bear this same realizable entity.

The technical criticism has now gone. But I do not like the definition as much because “the same realizable entity” would in fact be a biological function. I think we avoid recursive definitions because they can be circular, but this is like avoiding recursive function calls because they may not terminate. And that is a shame, because, as with recursive function calls, I think this form of definition can be quite succinct. Consider:

A spouse is a person who is married to their spouse.

or:

A brother is a man with the same parents as their brother.

If we unwind the recursion, then we get

A brother is a man with the same parents as another man.

Again, we are hiding the reality that both men in this definition are brothers.

Of course, some recursive definitions might actually be circular, and that is less good. But if the applicability of a function is also considered then this issue goes away. I can determine if someone is a spouse or a brother given these definitions, so I see no problem.
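To make the applicability point concrete, here is a minimal sketch in Clojure of a practical test for the brother definition. The family data, the names and the helper functions are all invented for illustration; the point is only that the test terminates and gives an answer, even though the definition it checks is worded recursively.

```clojure
;; Hypothetical family data: each person has a sex and a set of parents.
(def people
  {:alan  {:sex :male   :parents #{:mary :john}}
   :brian {:sex :male   :parents #{:mary :john}}
   :carol {:sex :female :parents #{:mary :john}}
   :dave  {:sex :male   :parents #{:sue :tom}}})

(defn male? [p]
  (= :male (get-in people [p :sex])))

(defn brother-of?
  "True if a and b are different men with the same parents."
  [a b]
  (and (not= a b)
       (male? a)
       (male? b)
       (= (get-in people [a :parents])
          (get-in people [b :parents]))))

(defn brother?
  "Applicability test: is there some man in the data that a is a brother of?"
  [a]
  (boolean (some #(brother-of? a %) (keys people))))

(brother? :alan)   ; => true
(brother? :brian)  ; => true
(brother? :carol)  ; => false
(brother? :dave)   ; => false
```

Whenever the test pairs two men, both of them come out as brothers, which is exactly what the recursive wording says and what the unwound wording hides.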

A second criticism concerns my statement about roles and functions:

Hence he concludes that among the instances of realizables that are realizables for the same type of process can be both roles and functions depending on the species the realizable’s bearer belongs to. This presents a problem for the distinction between functions and roles.

I do not think that this is a problem at all, because I say quite clearly that we can distinguish between roles and functions, but that we do this for the individual role or function not at a class level:

My definition distinguishes between the two based on the nature of the relationship to the independent continuant in which they inhere. I suggest that it is very hard to make the distinction at the class level[…]. For an individual continuant bearing a realizable entity, this distinction appears to be much more straightforward.

In other words, “for walking on” is either a role or a function. But in human hands it is a role, while for chimps it is a function. I see no reason why the distinction at the level of the individual should be considered to be less relevant than at the class, nor why this should be problematic. Actually, it reduces the need for duplication between the role and function hierarchies; while tools like Tawny-OWL may ease the maintenance of duplication, avoiding it altogether still seems sensible.

The final criticism is, I think, the least worrisome. The authors say:

Had evolution stopped after the first species, according to Lord’s definition, there would not have been any biological function at all.

The slightly flippant but nonetheless entirely valid answer to this is, “but it didn’t”. We could equally argue against a definition of human as having two hands on the basis that they might have evolved a third.

More importantly, though, in most definitions of life the ability to adapt or evolve is part of the definition. Without this, we have a chemical process. So, without evolution, we have no life. Given this, we can rewrite the last statement as:

Had life stopped after the first species, there would not have been any biological function at all.

Which is an entirely true statement; that it drops so nicely out of my definition for biological function is a strength of my definition and not a weakness.

I feel that my definition is still a good one. Rereading my function paper now, the argument still seems coherent, and the examples clear. Although I put an entire section on applicability into the function paper, I do rather regret that I did not introduce it explicitly as a general criterion for all ontology definitions; that this criterion has been missed is surely my fault and not the readers’. Perhaps I should have spent more time on that than on my recursive definition, which was not critical to the paper.

At the same time, the fact that discussions on definitions are still going on, for a term that biologists have been using for many years, again leads me back to the conclusion that the definitions of such generic terms are not nearly as important as some make out. So long as they are useful, biologists will carry on describing things as functions if it fits their ad-hoc, informal definitions that have been developed over time within a community. I cannot help but think that this is a good thing.


### Manchester Syntax is a bit backward

Before commit eb2f0e04, I used to have this function in tawny.owl.

    (defbdontfn add-subclass
      {:doc "Adds one or more subclass to name in ontology."
       :arglists '([name & subclass] [ontology name & subclass])}
      [o name subclass]
      (add-axiom
       o
       (.getOWLSubClassOfAxiom
        (owl-data-factory)
        (ensure-class o name)
        (ensure-class o subclass))))

The idea is, as the name suggests, to add a subclass relationship to the ontology; on the face of it, everything looks fine. However, a closer look at the OWL API raises a question:

    getOWLSubClassOfAxiom(OWLClassExpression subClass,
                          OWLClassExpression superClass)

The subclass parameter in Clojure maps to the superClass parameter in Java. The subclass in Clojure is actually the superclass.

If we compare the property equivalent in Tawny, things seem more regular:

    (defbdontfn add-superproperty
      "Adds all items in superpropertylist to property as a superproperty."
      [o property superproperty]
      (add-axiom
       o
       (.getOWLSubObjectPropertyOfAxiom
        (owl-data-factory)
        (ensure-object-property o property)
        (ensure-object-property o superproperty))))

and the equivalent Java:

 getOWLSubObjectPropertyOfAxiom(OWLObjectPropertyExpression subProperty, OWLObjectPropertyExpression superProperty)

The names of the parameters are now the same way around in Clojure and Java. So, have I made a mistake in Tawny with subclass handling? Actually, no, because we get strangeness at a different point with properties; consider the object-property-handlers which map between frames and the functions which implement them:

 (def ^{:private true} object-property-handlers
   {:domain add-domain
    :range add-range
    :inverse add-inverse
    :subproperty add-superproperty
    :characteristic add-characteristics
    :subpropertychain add-subpropertychain
    :disjoint add-disjoint-property
    :equivalent add-equivalent-property
    :annotation add-annotation
    :label add-label
    :comment add-comment})

So, the :subproperty frame is implemented with the add-superproperty function. As might be expected, :subclass is implemented with add-subclass.

Even without this oddness, the problem can be seen when considering just the add-* functions. Consider, add-label:

 (defbmontfn add-label
   "Add labels to the named entities."
   [o named-entity label]
   (add-annotation o named-entity [(tawny.owl/label label)]))

The semantics of this are that the third argument, label, is added to the second, named-entity, as a label. It is slightly more complex than this: the b in defbmontfn means broadcast, so add-label is actually variadic and flattens its arguments, meaning that any number of labels can be added.

With add-subclass the semantics are reversed: the second argument becomes a subclass of the third (or, again, because of broadcasting, of the third and any subsequent arguments). So add-subclass is inconsistent here: all of the other add-* functions have the same semantics as add-label.
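
As a rough illustration of broadcasting and of the two argument conventions, here is a minimal sketch in plain Clojure. It assumes a toy "ontology" that is just a set of vectors in an atom; `broadcast`, `add-label*` and `add-subclass*` are hypothetical names for this example and are not how Tawny-OWL implements `defbmontfn`.

```clojure
(defn broadcast
  "Wraps f, a function of [o entity value], so that it accepts any number
  of values, possibly nested in collections, and applies f to each one."
  [f]
  (fn [o entity & values]
    (doseq [v (flatten values)]
      (f o entity v))
    o))

;; The value is added *to* the entity: the convention of most add-* functions.
(defn add-label* [o entity label]
  (swap! o conj [:label entity label]))

;; Old add-subclass convention: the entity becomes a subclass *of* the value.
(defn add-subclass* [o entity other]
  (swap! o conj [:sub-class-of entity other]))

(def add-label (broadcast add-label*))
(def add-subclass (broadcast add-subclass*))

;; Toy ontology: a set of axioms.
(def o (atom #{}))
(add-label o 'Margherita "margherita" ["pizza margherita"])
(add-subclass o 'Margherita 'Pizza)

(println (count @o))
```

Both calls have the same shape, yet in the first the later arguments are labels of Margherita, while in the second Margherita is a subclass of the later argument.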

So, clearly, both add-subclass and the :subproperty frame have problems, and are not consistent with the rest of the API. Two important parts of Tawny-OWL have been implemented backward. How did this happen?

## Investigating Manchester Syntax

We can investigate this further, by considering another inconsistency with Tawny. Considering the object-property-handlers above, we can see that while :subproperty is implemented with add-superproperty, :subpropertychain is implemented with add-subpropertychain.

The slot names in Tawny come (nearly) directly from Manchester syntax; so, let us compare Manchester syntax with the functional syntax for sub-properties and sub-property chains, using the OWL Primer. In Manchester syntax:

 ObjectProperty: hasFather
   SubPropertyOf: hasParent

In functional syntax:

 SubObjectPropertyOf( :hasFather :hasParent )

Compare this to the equivalent declaration for subproperty chain.

 ObjectProperty: hasGrandparent
   SubPropertyChain: hasParent o hasParent

Or in functional syntax:

 SubObjectPropertyOf( ObjectPropertyChain( :hasParent :hasParent ) :hasGrandparent )

The filler for SubPropertyChain: comes first in the functional syntax, while for SubPropertyOf: it comes second.

This suggests that the SubPropertyOf: and SubPropertyChain: frames are back-to-front from each other (that is, the values of the slots appear in different orders in the two syntaxes). So, with the former, SubPropertyOf:, I am stating that the entity (hasFather) is related to the filler (hasParent) and that the filler (hasParent) is the super property. With the latter, SubPropertyChain:, I am stating that the entity (hasGrandparent) is related to the filler (hasParent o hasParent) and that the filler (hasParent o hasParent) is the sub property.

So, the two appear to be inconsistent with each other. So, let’s consider a further analysis of the other slots. Consider, for example:

 A Annotations: rdfs:label B

which means B is an annotation of A.

 A EquivalentTo: B

means B is equivalent to A (or, in this case, that A is equivalent to B, as equivalence is symmetrical).

 A Domain: B

means B is a domain of A.

 A Type: B

means B is a type of A.

All of these are consistent with each other: the filler (B) has a relationship to the entity (A) which is defined by the slot (type), with the caveat that the EquivalentTo relationship is symmetric.

Now

 A SubClassOf: B
 A SubPropertyOf: B

are backward: the entity (A) has a relationship to the filler (B) defined by the slot (SubClassOf:, SubPropertyOf:) – it’s why the Of preposition has been added. It is not possible to add the same preposition to the other slots; although it is possible to add has to the beginning. So, for example, the natural language semantics of these statements preserves their OMN meaning:

 A HasAnnotation: B
 A HasType: B
 A HasKey: B

Of these, only the last is actually OMN. The only other slots with prepositions are EquivalentTo and SameAs; you could change these to has as well.

 A HasEquivalent: B
 A HasSame: B

This probably reduces the readability overall, but it does at least maintain the semantics. It is for this reason that I say SubClassOf: is backward; to be consistent, it should be Super:.

So

 A Super: B

means B is a superclass of A. Now, we could add the has preposition to the start, while preserving the natural language semantics.

 A HasSuper: B

Everything that I have said here is also true of SubPropertyOf:, which behaves in the same way as SubClassOf: (i.e. backward with respect to most slots).

Going back to the very early question, SubPropertyChain: (note, not SubPropertyChainOf:) is the same way around as most slots and the opposite way around from SubPropertyOf:

 A SubPropertyChain: B o B

could be replaced with

 A HasSubPropertyChain: B o B

In summary, for Manchester syntax SubClassOf: and SubPropertyOf: frames are backward with respect to all the other frames.
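
This summary can be written down as data. The sketch below is only an illustration of the argument made here, covering just the slots discussed in this post; the keywords `:filler-of-entity` and `:entity-of-filler` and the function `backward-slots` are names invented for the example.

```clojure
;; For "A Slot: B", which way does the natural language reading run?
;;   :filler-of-entity  "B is the <slot> of A"  (the common case)
;;   :entity-of-filler  "A is the <slot> of B"  (the backward case)
(def slot-direction
  {"Annotations"      :filler-of-entity
   "EquivalentTo"     :filler-of-entity   ; symmetric, so either reading works
   "Domain"           :filler-of-entity
   "Type"             :filler-of-entity
   "HasKey"           :filler-of-entity
   "SubPropertyChain" :filler-of-entity
   "SubClassOf"       :entity-of-filler
   "SubPropertyOf"    :entity-of-filler})

(defn backward-slots
  "Returns the slots, sorted by name, whose reading runs from entity to filler."
  [directions]
  (sort (for [[slot direction] directions
              :when (= direction :entity-of-filler)]
          slot)))

(println (backward-slots slot-direction))
```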

## The Implications for Tawny

Unfortunately, the situation in Tawny-OWL was slightly worse than for Manchester syntax. While writing an early version of the karyotype ontology by hand, I found typing too hard so removed the prepositions (:subclass and not :subclassof). Combined with the lack of CamelCase, this seemed a cleaner syntax. But it has exacerbated the issues described here.

Although I had become aware of this problem before the release of the first full version of Tawny, I decided that consistency with Manchester syntax was worth the hassle. My recent experiments with literate ontologies, however, have made me realise that I could not leave the situation as it is. One key feature of Tawny is that it (normally) forces declaration of entities before use, which avoids the simple spelling mistakes common when writing Manchester syntax by hand. However, only having access to a :subclass slot means that ontologies must be declared from the top of the inheritance hierarchy downward. For a literate ontology, this restriction seems unnecessary, and places an unfortunate emphasis on the upper ontology. I would also like to be able to build from the bottom up.

Neither having the semantics of add-subclass backward, nor implementing the :subproperty slot with add-superproperty, works well as it stands, and extending this to a :superclass slot would make the situation worse. In short, the only sensible fix was to diverge from OWL Manchester syntax and deprecate the use of :subclass and :subproperty. At the same time, I decided to remove some extra typing. Therefore, :subclass has become :super (shortening and reversing the natural language semantics, retaining the logical semantics), and the new slot :sub has been added. Likewise, :subproperty has become :super, and a new slot :sub has been introduced for properties also. As well as avoiding extra typing, removing the suffix has meant that I can leave :subclass and :subproperty in place but deprecated; the alternative of just reversing their semantics seemed unfortunate. Only the semantics of add-subclass have been broken, by being reversed.

The inconsistency with Manchester syntax is currently a little painful, especially as the :subclass slot has been around since the early days of Tawny. The advantage, however, is that I have a simple rule to remember: A :s B means “A has :s B” or, equivalently, “B is :s of A”. For this reason, and because it paves the way for richer literate ontologies, I feel that this is a good change.
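
The rule can be sketched in a few lines of plain Clojure. This is a toy model rather than Tawny-OWL code: `frame->axiom` is a hypothetical function, and the axiom is represented as a simple vector of the form `[:sub-class-of subclass superclass]`.

```clojure
(defn frame->axiom
  "Reads `A slot B` as \"A has slot B\" and returns a subclass axiom
  written as [:sub-class-of subclass superclass]."
  [a slot b]
  (case slot
    :super [:sub-class-of a b]    ; A has super B: A is a subclass of B
    :sub   [:sub-class-of b a]))  ; A has sub B: B is a subclass of A

;; Building from the bottom up and from the top down gives the same axiom.
(println (frame->axiom 'Margherita :super 'Pizza))
(println (frame->axiom 'Pizza :sub 'Margherita))
```

Under this reading, both slots state the same logical relationship; the only difference is which entity the statement is attached to, which is what allows an ontology to be written in either direction.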