<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>An Exercise in Irrelevance &#187; Ontology</title>
	<atom:link href="http://www.russet.org.uk/blog/category/all/professional/ontology/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.russet.org.uk/blog</link>
	<description>Ramblings from Phil Lord&#039;s life</description>
	<lastBuildDate>Thu, 02 Feb 2012 14:11:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>More on Pici</title>
		<link>http://www.russet.org.uk/blog/2012/01/more-on-pici-2/</link>
		<comments>http://www.russet.org.uk/blog/2012/01/more-on-pici-2/#comments</comments>
		<pubDate>Fri, 20 Jan 2012 22:47:53 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1960</guid>
		<description><![CDATA[I started to write this post a long time ago in October; unfortunately before I finished I got hit with the start of teaching. I considered just ditching the post, as it is now so out-of-date and I am not usually a zombie poster. However, in this case, I shall post as a) it helps [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1960">
<p><a name="preamble"></a> 
<p>I started to write this post a long time ago in October; unfortunately before I finished I got hit with the start of teaching. I considered just ditching the post, as it is now so out-of-date and I am not usually a zombie poster. However, in this case, I shall post as a) it helps my mind to move back toward research after so long away and b) it will be my first of 2012, so I can check my makefiles work!</p>
<p>A couple of follow-ups to my <a href="http://www.russet.org.uk/blog/2011/10/the-pici-principle-what-you-should-not-say/">previous post</a>.</p>
<p>Nicolas Le Novere commented via Twitter on even the highest-level assertion, that radioactivity is a dependent continuant.</p>
<blockquote><p>@phillord fluorescence and radioactivity are occurrent not continuant. Freeze time to check.</p>
<p>@phillord hence the unit of radioactivity: per second (Becquerel)</p>
<p align="right"> &#8212; Nicolas Le Novere </p>
</blockquote>
<p>In my original post, I suggested we needed <tt>Radiation</tt>, <tt>Radioactive</tt> or <tt>Radioactivity</tt>; in hindsight, perhaps I should have used <tt>Radioactive</tt> rather than <tt>Radioactivity</tt>, which might have circumvented this issue. However, I think it is worth considering this a little further.</p>
<p>I would nearly agree with Nicolas that radioactivity is a process; actually, I would say that radioactive decay is a process, while radioactivity is a property of this process. However, in my last post, I was looking at a model which was &#8220;BFO-like&#8221;, as OBI is based on BFO. For BFO, the fact that radioactivity is a rate, measured per second, does not mean that it is an occurrent, any more than velocity, which is also measured per second, is an occurrent. Actually, in BFO land, <tt>radioactivity</tt> would be a quality of the atoms which are decaying and not a measurement of the process. This is because, as Pierre Grenon says, properties of processes <a href="http://groups.google.com/group/bfo-discuss/msg/a605f86a934b80da">do not exist</a>.</p>
<p>In fact, if we look at this more closely still, BFO would also claim that radioactive decay is not, as it might appear, a <tt>Process</tt>, because processes are continuous. This is not true for radioactive decay, even for a bulk of radioactive material. An atom decays, then there is a pause, then another decays. This makes radioactive decay a <tt>processual entity</tt>, which can contain discontinuities.</p>
<p>I am not arguing that BFO&#8217;s treatment of processes is correct&#8201;&#8212;&#8201;in fact, I think it is nonsensical. However, it is this line of argument that I was using in my previous post.</p>
<p>David Sutherland rather takes me to task about whether realism does what I suggest.</p>
<blockquote><p>I agree completely, but what realist principle says you need to give something the most detailed classification you can come up with?</p>
<p align="right"> &#8212; David Sutherland </p>
</blockquote>
<p>It&#8217;s a good question, but I would turn it around. I don&#8217;t think that realism requires you to do this, although this quote from Barry Smith does rather distinguish between simplifications (i.e. not the most detailed classification you can come up with) and reality.</p>
<blockquote><p>I am beginning to suspect that for you everything is a simplification (model) — for me, functions are part of reality; they are not simplifications; I am not interested in simplifications.</p>
<p align="right"> <em>http://groups.google.com/group/bfo-discuss/msg/865e601864fbc2dc</em><br /> &#8212; Barry Smith </p>
</blockquote>
<p>The problem, though, is that realism elevates &#8220;reality&#8221; above all else. I think that this is wrong. Of course, in any scientific discipline, we should be aiming to model the experimental data that we have. But this is not all we need to do. As any statistician will tell you, models are compromises. It is very easy to build a model that perfectly represents the data that you have; you just build a model with as many variables as data points. The model will fit the data perfectly, but ultimately the model is useless, since it lacks explanatory power. We need use cases, we need simplifications and sometimes we will need multiple representations of the same thing; there are examples galore in my <a href="http://www.russet.org.uk/blog/2010/07/realism-and-science/">paper</a> <span class="kcite" kcite-id="ITEM-1">(doi:10.1371/journal.pone.0012258)</span>. In fact, Chris Mungall gives a good example when he talks about dispositions and their status as being real:</p>
<blockquote><p>In fact, I have a particular problem with dispositions being &#8220;real&#8221; &#8211; BFO asks me to believe there are an infinite number of real but unrealized and perhaps wildly improbable dispositions floating around me every second</p>
<p align="right"> &#8212; Chris Mungall </p>
</blockquote>
<p>And later he gives the solution.</p>
<blockquote><p>taking a hard-headed pragmatic approach &#8211; e.g. avoid weirdo classes that don&#8217;t correspond to a term a normal scientist would use; introduce distinctions that give you the desired results to queries and inferences)</p>
<p align="right"> &#8212; Chris Mungall </p>
</blockquote>
<p>In other words, reality is important. But we also need use cases, we need community norms, and we need applications. If ontologies do not fit with these, then they can be as &#8220;real&#8221; as you like, but they are still wrong.</p>
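<p>The earlier point about a model with as many variables as data points can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five measurements are made up, and the helper names <tt>interpolate</tt> and <tt>fit_line</tt> are hypothetical. The first model has five parameters for five points, so it reproduces the data exactly; the second is a plain straight line.</p>

```python
def interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through all (xs, ys) points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up measurements that roughly follow y = 2x, with some noise.
xs = [0, 1, 2, 3, 4]
ys = [0.0, 2.5, 3.5, 6.5, 7.5]

slope, intercept = fit_line(xs, ys)
print("five-parameter model at x=5:", round(interpolate(xs, ys, 5), 6))
print("two-parameter model at x=5: ", round(slope * 5 + intercept, 6))
```

<p>On these numbers the five-parameter model passes through every point but predicts about -5 at x=5, while the straight line, which fits none of the points exactly, predicts about 9.7. The perfect fit is the less useful model.</p>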


<p>Bibliography
      <div class="kcite-bibliography"></div>
</p>


<script type="text/javascript">
      var kcite_citation_data;
      if( kcite_citation_data == undefined ){
          kcite_citation_data = [];
      }
      kcite_citation_data[ 1960 ] = {"ITEM-1":{"source":"doi","identifier":"10.1371/journal.pone.0012258","resolved":true,"id":"ITEM-1","title":"Adding a Little Reality to Building Ontologies for Biology","author":[{"family":"Lord","given":"Phillip"},{"family":"Stevens","given":"Robert"}],"container-title":"PLoS ONE","issued":{"date-parts":[[2010,9,3]]},"page":"e12258-","volume":"5","issue":"9","DOI":"10.1371/journal.pone.0012258","type":"article-journal"}};
</script>


</div> <!-- kcite-section 1960 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2012/01/more-on-pici-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Pici Principle: What you should not say</title>
		<link>http://www.russet.org.uk/blog/2011/10/the-pici-principle-what-you-should-not-say/</link>
		<comments>http://www.russet.org.uk/blog/2011/10/the-pici-principle-what-you-should-not-say/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 14:12:32 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1945</guid>
		<description><![CDATA[I once had cause to refer, somewhat mischievously, to &#8220;a kind of pasta from Tuscany, which is almost identical to spaghetti, but slightly different&#8221;; this was on a mailing list that was used by many Italians. It provoked the expected response; an offended Tuscan responded &#8220;I don&#8217;t know what you are talking about; but if [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1945">
<p><a name="preamble"></a> 
<p><a name="pici"></a></p>
<p>I once had cause to refer, somewhat mischievously, to &#8220;a kind of pasta from Tuscany, which is almost identical to spaghetti, but slightly different&#8221;; this was on a mailing list that was used by many Italians. It provoked the expected response; an offended Tuscan responded &#8220;I don&#8217;t know what you are talking about; but if you mean pici&#8221;, which I did, &#8220;it&#8217;s nothing like spaghetti&#8221;.</p>
<p>Recently, on the OBI mailing list, there has been much discussion about labels, markers or tracers. Whatever you wish to call it, the basic idea is the same: a molecule which is easily detectable is used to trace something else. This can involve adding a small amount of a radioactive isotope (P<sup>32</sup>). This makes it possible to follow the molecule (which is otherwise hard) by tracing the radiation (which is generally easy).</p>
<p>So, how do we model this? As with many parts of ontology building, it turns out not to be straightforward; during this discussion, an <a href="http://sourceforge.net/mailarchive/message.php?msg_id=28115081">email</a> from <a href="http://www.oerc.ox.ac.uk/people/philippe-rocca-serra">Philippe Rocca-Serra</a> left me asking the question: are we being too specific? I will work through an example to show what I mean. Feel free to skip to the <a href="#punchline">punchline</a> if you choose.</p>
<p>Consider, for example, the following models; these are not directly taken from OBI, as I want to reduce the complexity for this article; rather they are in the general spirit of the models which raised these questions.</p>
<p>A label, or something that has been labelled, is clearly part of an experimental design. Being a label is not intrinsic to the entity; rather, it appears to be a role that the entity is playing in the experiment. So:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.4 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: Label
       <font color="#990000">SubClassOf:</font>
          Role</tt></pre>
</td>
</tr>
</table>
<p>There are, of course, labels of many sorts. The main types that I can think of are radioactive, fluorescent and what I call adherent. So, we might add the following, with a few subclasses of adherent as explanation.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.4 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: RadioactiveLabel
       <font color="#990000">SubClassOf:</font>
          Label

<b><font color="#0000FF">Class</font></b>: FluorescentLabel
       <font color="#990000">SubClassOf:</font>
          Label

<b><font color="#0000FF">Class</font></b>: AdherentLabel
       <font color="#990000">SubClassOf:</font>
          Label

<b><font color="#0000FF">Class</font></b>: BiotinylatedLabel
       <font color="#990000">SubClassOf:</font>
           AdherentLabel

<b><font color="#0000FF">Class</font></b>: AntigenicLabel
       <font color="#990000">SubClassOf:</font>
           AdherentLabel</tt></pre>
</td>
</tr>
</table>
<p>So far so good. However, for a label to be useful, it needs to be manufactured (often in a bespoke fashion, depending on the experiment being performed) and it needs to be detectable. So, we might add classes like so:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.4 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: LabellingProcess
       <font color="#990000">SubClassOf:</font>
           Process,
           has_output some Label

<b><font color="#0000FF">Class</font></b>: LabellingDetectionProcess
       <font color="#990000">SubClassOf:</font>
           Process,
           has_input some
                  (Sample and (contains some Label))</tt></pre>
</td>
</tr>
</table>
<p>Now we have three classes for every label type. We can deal with this by generating a cross-product, either at development time, or at the time of use if we are using OWL. However, we need something to tie together these classes. We need a concept to know that we need a <tt>RadioLabellingProcess</tt> to produce a <tt>RadioLabel</tt> which we detect in a <tt>RadioLabellingDetectionProcess</tt>. In short, we need a concept of <tt>Radiation</tt>, <tt>Radioactive</tt> or <tt>Radioactivity</tt>.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.4 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: RadioactiveEntity
    <font color="#990000">SubClassOf:</font>
        IndependentContinuant,
        bears some Radioactivity

<b><font color="#0000FF">Class</font></b>: RadioactiveLabel
    <font color="#990000">SubClassOf:</font>
        Role,
        RadioactiveEntity

<b><font color="#0000FF">Class</font></b>: RadiationDetector
    <font color="#990000">SubClassOf:</font>
       detects some Radioactivity

<b><font color="#0000FF">Class</font></b>: RadioactiveLabelProductionProcess
    <font color="#990000">SubClassOf:</font>
       has_input some RadioactiveEntity</tt></pre>
</td>
</tr>
</table>
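<p>The cross-product mentioned above, generated at development time, might look something like the following Python sketch. The list of label kinds, the name templates and the helper functions are all hypothetical, chosen for illustration; nothing here is taken from OBI, and the output is only the bare subclass skeleton.</p>

```python
from itertools import product

# Hypothetical label kinds and name templates, for illustration only.
KINDS = ["Radioactive", "Fluorescent", "Adherent"]
TEMPLATES = {
    "{kind}Label": "Label",
    "{kind}LabellingProcess": "LabellingProcess",
    "{kind}LabellingDetectionProcess": "LabellingDetectionProcess",
}


def cross_product(kinds, templates):
    """Return (class name, superclass) pairs for every kind/template pair."""
    return [
        (name.format(kind=kind), parent)
        for kind, (name, parent) in product(kinds, templates.items())
    ]


def to_manchester(name, parent):
    """Render one pair as a minimal Manchester-syntax class frame."""
    return f"Class: {name}\n    SubClassOf:\n        {parent}"


frames = [to_manchester(n, p) for n, p in cross_product(KINDS, TEMPLATES)]
print("\n\n".join(frames))
```

<p>Three kinds and three templates give nine class frames. Generating the names is the easy part; the classes still need a shared concept such as <tt>Radioactivity</tt> to tie them together.</p>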
<p>This is where the situation gets difficult. What kind of thing is <tt>Radioactivity</tt>? Taking the realist approach, we need to consider this carefully, determining what this universal is. So, starting from the top, it is fairly obvious that we have a <tt>Continuant</tt>. Next question: do we have a <tt>DependentContinuant</tt> or an <tt>IndependentContinuant</tt>? Again, this is fairly clear: radioactivity cannot exist without something to be radioactive, hence <tt>Radioactivity</tt> is a <tt>DependentContinuant</tt>.</p>
<p>There is a set of <tt>DependentContinuant</tt>s that <tt>Radioactivity</tt> could be. The concept <tt>Role</tt> does not fit well; a role is usually ascribed through socially or, in this case, experimentally determined behaviour. Perhaps <tt>Disposition</tt> would be better. However, this does not really fit either, as a <tt>Disposition</tt> is realised &#8220;under specific circumstances&#8221;. Now this is not true of radioactivity. Either something is radioactive or it is not, and if it is, then it is, to the best of our knowledge, radioactive under all circumstances. It appears, then, that <tt>Radioactivity</tt> is a <tt>Quality</tt>, because &#8220;it is exhibited if it inheres in an entity at all&#8221;.</p>
<p>If we follow the same logic with our other label types, initially, we come to the same conclusions. However, <tt>Fluorescence</tt> is not exhibited under all circumstances. It only happens when the label is illuminated with the right kind of light. So, <tt>Fluorescence</tt> appears to be a <tt>Disposition</tt>. Following a similar logic, this is also true of <tt>Adherent</tt>. So the best we can say about the property of the substance that makes it usable in labelling is that it is a <tt>RealizableEntity</tt>.</p>
<p>Having <tt>Radioactivity</tt> stand out in this way is a little unsatisfying. Let&#8217;s consider the logic again. One classic experimental form is the pulse-decay experiment. I can, for example, briefly feed a rat with, say, radioactive phosphorus. After this, I can trace the course of the phosphorus. Now during the course of this experiment, the rat becomes radioactive and then ceases to be radioactive again. But it is, notably, the same rat. So, perhaps, the statement that things are either radioactive or not is wrong. Perhaps it is not a <tt>Quality</tt> at all. The flaw in the logic is the assumption that because an atom is either radioactive or is not, anything made up from atoms must be so too. But an entity can have its atoms totally replaced and still be the same entity. In this case, what is true of a rat is also true of its DNA. We can replace the atoms in a sample of DNA with other ones and still have the same DNA. So, maybe, <tt>Radioactivity</tt> is a <tt>Quality</tt> at an atomic level of granularity, but is, after all, a <tt>Disposition</tt> at others.</p>
<p>Thinking further, however, maybe it is not a <tt>Quality</tt> at all. A mass of P<sup>32</sup> is always radioactive, but a single atom? Perhaps not, since it only displays this when it decays. So, perhaps, it is a <tt>Disposition</tt> after all. However, this makes no sense, because dispositions are displayed under &#8220;specific circumstances&#8221;. Now, to the best of our knowledge, radioactive decay is stochastic&#8201;&#8212;&#8201;it is so random that radioactivity is often used to generate randomness. We cannot specify the circumstances under which it happens; it just does. Moreover, after it displays the radioactivity, what has happened to the atom? Using the same argument as before, we could say that, like the rat, the atom still exists; it&#8217;s just that (some of) the elementary particles that make it up have changed. But this way, surely, madness lies, as &#8220;being phosphorus&#8221; would become some sort of dependent continuant, which the atom displays until its decay, while it happens to have the right number of protons. So, it probably makes more sense to say that the decay process represents the end of the existence of the phosphorus atom and the beginning of a new atom (and a radioactive particle). In which case, even our original decision that <tt>Radioactivity</tt> is a <tt>DependentContinuant</tt> is wrong. It&#8217;s not a <tt>DependentContinuant</tt> at all; it&#8217;s only a process which is over as soon as it begins.</p>
<p><a name="punchline"></a></p>
<p>So, what have we achieved? Well, I would argue, not a great deal, except for a lot of discussion. Moreover, we have ended up discussing very detailed issues about the physical properties of matter, when we started out discussing an ontology of biomedical investigations. This might be entertaining, or it might be very dull, depending on your point of view. But what we have failed to produce is a specific conclusion.</p>
<p>The problem here is <strong>realism</strong>. A realist ontology represents portions of reality, that is, classes of things that really have instances. We have to ask these questions to try and determine whether <tt>Radioactivity</tt> exists and what kind of thing it is. We can set realism against <strong>pragmatism</strong>. Previously, Robert Stevens has described the problems that realism causes by preventing the ontologist from modelling &#8220;<a href="http://robertdavidstevens.wordpress.com/2011/05/26/unicorns-in-my-ontology/">unicorns</a>&#8221;, such as Newtonian mechanics, or canonical anatomies. The unicorn principle says that if it is useful to model a concept in an ontology, then often we should. Here, I introduce what I call the &#8220;<a href="#pici">Pici</a> principle&#8221;&#8201;&#8212;&#8201;if it is not useful to model a concept, then we should not. To me, as a British native, pasta is pasta; it all tastes much the same. Generally, I do not need to be able to distinguish pici from spaghetti, unless I want to provoke a response from an over-excitable Tuscan. The sensible course is not to get involved in the discussion in the first place.</p>
<p>The same applies in this instance. There is a clear use case for the concept of <tt>Radioactivity</tt>; without it, we cannot say that a radio-label is radioactive, or that a fluorescence detector is not going to work detecting it. But to achieve this use case, we do not need to understand very deeply what <tt>Radioactivity</tt> is. Describing it as a <tt>DependentContinuant</tt> is enough, and it will fulfil the use cases. It will not enable us to ask questions about which kind of labels detect qualities and which detect dispositions. But in the absence of a use case, this is not an issue.</p>
<p>A chemist may care, and may want to classify radioactivity further. This is fine; as with pasta, we can safely leave these issues to someone else, in the knowledge that they are probably better qualified to give an answer anyway. So long as they decide that <tt>Radioactivity</tt> is a <tt>DependentContinuant</tt>, it does not matter to us what kind of <tt>DependentContinuant</tt>; we have said nothing incorrect. So, our ontology will integrate with theirs, without change to either. By being as vague as our use cases allow us, we have actually increased the ability of our ontology to integrate with others.</p>
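<p>The integration argument can be made concrete with a toy check in Python. This is a sketch under loud assumptions: the axioms are invented for illustration rather than read from any OBI or BFO file, <tt>ancestors</tt> and <tt>clashes</tt> are hypothetical helpers, and the sketch simply assumes that <tt>Quality</tt> and <tt>Disposition</tt> are declared disjoint. It compares what happens when a chemist&#8217;s more specific ontology is merged with a vague version of ours and with an over-specific one.</p>

```python
# Toy subclass axioms as (child, parent) pairs; invented for illustration.
OURS_VAGUE = {("Radioactivity", "DependentContinuant")}
OURS_SPECIFIC = {
    ("Radioactivity", "Disposition"),
    ("Disposition", "DependentContinuant"),
}
THEIRS = {
    ("Radioactivity", "Quality"),
    ("Quality", "DependentContinuant"),
}
# Assumption for this sketch: the two kinds are declared disjoint.
DISJOINT = {frozenset({"Quality", "Disposition"})}


def ancestors(cls, axioms):
    """All superclasses of cls reachable through the subclass axioms."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for child, parent in axioms:
            if child == current and parent not in found:
                found.add(parent)
                todo.append(parent)
    return found


def clashes(axioms):
    """Classes that end up below two classes declared disjoint."""
    classes = {child for child, _ in axioms}
    return {
        c for c in classes
        if any(pair.issubset(ancestors(c, axioms)) for pair in DISJOINT)
    }


print(clashes(OURS_VAGUE | THEIRS))     # set()
print(clashes(OURS_SPECIFIC | THEIRS))  # {'Radioactivity'}
```

<p>The vague version merges cleanly with the chemist&#8217;s choice, whichever kind of <tt>DependentContinuant</tt> they pick. The over-specific version only merges if we happened to guess the same answer they did.</p>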
<p>In short, the pici principle encapsulates the idea that deciding what we <strong>should not</strong> model in an ontology is as important as what we <strong>should</strong> model. And this decision comes from use cases, not reality.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1945 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2011/10/the-pici-principle-what-you-should-not-say/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The Status Quo farewell tour on realism</title>
		<link>http://www.russet.org.uk/blog/2010/09/the-status-quo-farewell-tour-on-realism/</link>
		<comments>http://www.russet.org.uk/blog/2010/09/the-status-quo-farewell-tour-on-realism/#comments</comments>
		<pubDate>Thu, 30 Sep 2010 09:17:12 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1836</guid>
		<description><![CDATA[I originally wrote this as a brief comment in reply to David Osumi-Sutherlands excellent post. But, the formatting got mixed up and is unfixable there, so I posted I am posting it here. Not only do I believe in mind-independent reality, I believe that science makes claims about mind independent reality that it is reasonable [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1836">
<p>I originally wrote this as a brief comment in reply to David Osumi-Sutherland&#8217;s excellent <a href="http://ontogeek.wordpress.com/2010/09/27/yes-really/">post</a>. But the formatting got mixed up and is unfixable there, so I am posting it here.</p>
<blockquote><p>Not only do I believe in mind-independent reality, I believe that science makes claims about mind independent reality that it is reasonable to believe are true. In my experience, most scientists (certainly most biologists) believe this too.</p>
<p align="right"> &#8212; David Sutherland </p>
</blockquote>
<p>I agree. However, this has little or no bearing on whether you are a realist or not. The assumption that it does is based on an etymological fallacy&#8201;&#8212;&#8201;&#8220;realism&#8221; chose a good name, that is all. Conceptualists, or people like myself who just don&#8217;t care about the philosophy, but who simply find that realism is resulting in bad ontologies, do not automatically believe that the world is a fluffy place, dreamed up in someone&#8217;s head. Believing in reality does not make you a realist.</p>
<blockquote><p>It strikes me that what Phil calls realism seems to be much more specific than this – he at least sometimes gives the impression that a realist position involves accepting the BFO + whatever Barry Smith proposes. My knowledge of philosophy is quite limited, but I’ve read enough to know that this is unwarranted – it seems to stem largely from ongoing arguments about nature of the OBO-Foundry .</p>
<p align="right"> &#8212; David Sutherland </p>
</blockquote>
<p>This is a criticism that others have made&#8201;&#8212;&#8201;Matthias Samwald pointed it out on my blog. Yes, you (both) are entirely right. In my <a href="http://www.russet.org.uk/blog/2010/07/realism-and-science/">paper</a>, I am explicit about this, saying &#8220;In short, for this paper, when we say “realism”, we largely mean <em>realism as practiced by BFO</em>. We do not claim, in this paper, to address all the philosophical perspectives that through time carried the name <em>realism</em>.&#8221; Even this is hard&#8201;&#8212;&#8201;as Gary Merrill&#8217;s excellent paper describes, what &#8220;realism as defined by Barry&#8221; means changes from paper to paper.</p>
<p>I&#8217;m not trying to address the philosophy&#8201;&#8212;&#8201;as I say, I don&#8217;t care. I am trying to address the problems being caused in ontology development now.</p>
<blockquote><p>The obvious response is: even if Kepler had got the maths right, is it really irrelevant to our acceptance or rejection of his theory whether he believed that force was exerted on the planets by creatures with wings sprouting from their backs? Even in the most mathematically abstracted areas of science, we can’t completely purge ontological claims.</p>
<p align="right"> &#8212; David Sutherland </p>
</blockquote>

<p>Gravity and Newton are not ontological claims. They are phenomenological. We do not stick to the earth because of gravity; rather, gravity is the name we give to the phenomenon. The cleverness and the understanding is that one simple piece of phenomenology (\(1/r^2\)) can explain a lot of others.</p>
<p>There&#8217;s a nice Feynman quote on this as well, as there appears to be on anything. &#8220;Feynman, inertia&#8221; and Google should provide.</p>
<p>The creatures with wings theory starts to break down when you realise that electrons exert a gravitational force on each other. You would have to ask, &#8220;could the angels really be bothered&#8221;, and &#8220;what are the angels made of&#8221;? But at the level of planets, I think it mostly works.</p>
<p>Of course, &#8220;gravity&#8221; sounds much more sciency than &#8220;wings of angels&#8221;, which means it&#8217;s better.</p>
<blockquote><p>Secondly, and more importantly, I care about whether it is reasonable to believe, on the basis of the scientific evidence we have, that instances of the classes I define exist.</p>
<p align="right"> &#8212; David Sutherland </p>
</blockquote>
<p>Fine. But unless you have a clear definition of what you mean, it&#8217;s useless. Apparently, according to realism instances of &#8220;Dog&#8221; exist but no instances of &#8220;Dog or Cat&#8221; exist. According to realism, zero, by definition, doesn&#8217;t exist. I don&#8217;t know what to make of this.</p>
<blockquote><p>I see no reason to expect logical consistency if classes lacking instances are allowed. – I hold the assumption that the real world that our scientific theories make statements about does not contradict itself</p>
<p align="right"> &#8212; David Sutherland </p>
</blockquote>
<p>Maybe. We won&#8217;t know till we have finished. In the meantime, contradictions occur. There is no point building a computational framework for representing our data that can&#8217;t cope with this. I think that this is not that significant an issue for reference ontologies which are, by definition, likely to be behind the times!</p>
<blockquote><p>Can we find ways to mark classes to make this disinction clear? Something along the lines of:</p>
<p>Class we have good reason to believe has no instances [REFS?] Class believed on theoretical grounds alone to have instances [REFS] Class for which there is experimental evidence for the existance of instances – evidence summary</p>
<p align="right"> &#8212; David Sutherland </p>
</blockquote>
<p>Again, very little to do with realism. Ironically, one of my main issues with realism is the introduction of lots of abstract and frankly incomprehensible concepts into ontologies. What is the evidence for the existence of instances of Generically Dependent Continuant? Or as Chris Mungall says on my blog:</p>
<blockquote><p>Instead we have to say ‘book content’ has_concretization of some (inheres_in some book). This gives me a headache and seems to just be making busy work for no practical reason. Also I feel lost with respect to what the intermediate unnamed entity is here.</p>
<p align="right"> &#8212; Chris Mungall </p>
</blockquote>
<p>But, yes, standards for reporting of evidence are a good thing. I think, we should also be investigating metrics for looking at the usage of ontology terms&#8201;&#8212;&#8201;those which are never used are also problematic. These are simple, pragmatic steps that we could take.</p>
<p>I think that we need to consider readability metrics for our English definitions, we need to consider metrics for complexity (&#8220;as simple as possible, but no simpler!&#8221;) for both our logical definitions and for the ontologies overall. All of these things are important steps, in determining whether the distinctions we choose to make are worthwhile. And I think that we need better tools to tie all of this together.</p>
<p>All of these things will come in time. But not while we waste time arguing about philosophy. Not if we push forward the idea that correctness comes from thinking hard about things, rather than testing them. And not if we give the impression that ontology building is about understanding large numbers of unsupportable, untestable and probably meaningless statements about the nature of reality. This is why I wrote my paper.</p>
<p>Oh dear, this was meant to be brief. I apologise to all three of my subscribers. I will stop. Honest.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1836 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/09/the-status-quo-farewell-tour-on-realism/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Why Not?</title>
		<link>http://www.russet.org.uk/blog/2010/09/why-not/</link>
		<comments>http://www.russet.org.uk/blog/2010/09/why-not/#comments</comments>
		<pubDate>Tue, 07 Sep 2010 17:18:08 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1823</guid>
		<description><![CDATA[My last post was an attempt to drag myself out of the realism debate; unfortunately, Chris Mungall replied, and he deserves an answer. Fortunately, his comment addressed an issue that I have been meaning to post for a while, which is the use of &#8220;not&#8221;, or &#8220;absent&#8221; in an ontology. I&#8217;ll make a brief aside [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1823">
<p>My last <a href="http://www.russet.org.uk/blog/2010/09/why-realism-is-wrong/">post</a> was an attempt to drag myself out of the realism debate; unfortunately, Chris Mungall <a href="http://www.russet.org.uk/blog/2010/09/why-realism-is-wrong/#comment-10021">replied</a>, and he deserves an answer. Fortunately, his comment addressed an issue that I have been meaning to post for a while, which is the use of &#8220;not&#8221;, or &#8220;absent&#8221; in an ontology. I&#8217;ll make a brief aside into realism, then describe the <a href="#pragmatic">pragmatic</a> design decisions that lie at the heart of the issue. Feel free to skip the realism bit.</p>
<hr /> 
<h2><a name="_the_realist_objection"></a>The realist objection</h2>
<blockquote><p>&#8220;I wasn’t aware of the realist objection to the not / complementOf construct.&#8221;</p>
<p align="right"> &#8212; Chris Mungall </p>
</blockquote>
<p>Obviously, the standard problem with realism is that it is ill-defined; it is, therefore, hard to determine exactly what it does mean. My reading that realism objects to &#8220;not&#8221; comes from the &#8220;Beyond Concepts&#8221; paper (<a href="http://www.russet.org.uk/blog/2010/07/realism-and-science/#smith2004beyond">*</a>).</p>
<p>This paper describes this form of class as &#8220;purely contingent&#8221;, that is, not a universal. Again, this is based on Barry&#8217;s misunderstanding of OWL set-theoretic semantics: the belief that classes are defined by their extensions, rather than by all possible extensions.</p>
<hr /> 
<h2><a name="pragmatic"></a>The pragmatic issue</h2>
<p>First, considering the logic.</p>
<blockquote><p>&#8220;Which seams reasonable since I don’t know how to define ‘absent wing’ in OWL in a way that would give me the correct inferences &#8220;</p>
<p align="right"> &#8212; Chris Mungall </p>
</blockquote>
<p>As is always the case, there are advantages and disadvantages. The problem, as you allude to, is not <tt>absent wing</tt> per se, but with the inference <tt>absent wing is-a wing</tt>, which is logically incorrect.</p>
<p>Now, of course, as a user I might validly say, I don&#8217;t care about the logical incorrectness. However, this incorrectness makes queries harder&#8201;&#8212;&#8201;to answer the question &#8220;how many wings do these three flies have&#8221;, you have to actually ask &#8220;how many wings, except for absent wings&#8230;&#8221;. So, therefore, &#8220;absent wing&#8221; is wrong.</p>
<p>Sadly, the world is not so simple, as using &#8220;absent wings&#8221; also makes things easier, because I can ask &#8220;how many of the flies have abnormal wings&#8221; straight-forwardly. In general, users would consider an <tt>absent wing</tt> to be abnormal. So, now the user has to remember to ask something like &#8220;how many of the flies have abnormal wings, or less than two wings&#8221;.</p>
<p>So, from a logical perspective, there are gains and losses either way. The bottom line is that &#8220;absent wings&#8221; have a nasty sting in the tail (to mix my metaphor), and it is an issue that you have to be aware of when building ontologies; there isn&#8217;t a universal answer.</p>
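<p>The trade-off can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three flies, the <tt>Wing</tt> record and the function names are all hypothetical, and plain Python data stands in for an ontology and a reasoner. It takes the modelling choice where <tt>absent wing is-a wing</tt>, so a fly with a missing wing still carries two wing records.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wing:
    kind: str  # "normal", "small" or "absent" (hypothetical labels)


# Assumed toy data: "absent wing" is-a wing, so every fly has two Wing records.
flies = {
    "fly1": [Wing("normal"), Wing("normal")],
    "fly2": [Wing("normal"), Wing("absent")],
    "fly3": [Wing("small"), Wing("absent")],
}


def count_wings_naive(fly_wings):
    """Count every Wing record, including the logically odd absent ones."""
    return sum(len(wings) for wings in fly_wings.values())


def count_wings_excluding_absent(fly_wings):
    """The query the user actually has to remember to write."""
    return sum(
        1 for wings in fly_wings.values() for wing in wings if wing.kind != "absent"
    )


def flies_with_abnormal_wings(fly_wings):
    """With absent wings as records, 'abnormal' is one simple test."""
    return sorted(
        name
        for name, wings in fly_wings.items()
        if any(wing.kind != "normal" for wing in wings)
    )
```

<p>The naive count is the harder query gone wrong, while the abnormality query is the one that got easier; dropping the absent records would swap which of the two needs the extra care.</p>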
<p>We can also see the same issue, from another perspective, which I have stolen from <a href="http://iospress.metapress.com/content/j3324564p5l33863/?p=c9a7d6a6826845258807084d2693dbad&amp;pi=0">Gary Merrill</a>. A wing is not an either/or property, it&#8217;s a continuum. At some level of granularity, I suspect that there is no fly mutant which has absolutely no wings at all (well, obviously this doesn&#8217;t include those which just prevent adult development). So, &#8220;small wing&#8221;, &#8220;really small wing&#8221;, &#8220;five cells more than having no wing&#8221; are all fine, cause no logical problems. But once those five cells disappear, then suddenly all the logic breaks down?</p>
<p>I am reminded of a mistaken argument I had once, when someone asserted that the drink-driving limit should be zero. This is, of course, daft as every human in existence has a measurable amount of alcohol in their blood; the mistake was trying to explain this when this measurable amount was quite high for both the listener and myself.</p>
<p>Unless we are dealing with maths, zero never means zero. But this does not make us all drink-drivers.</p>
<hr /> 
<h2><a name="_conclusion"></a>Conclusion</h2>
<blockquote><p>As it turns out we can dispense with realism entirely for this discussion. If we focus on modeling in a way that gives us useful answers then the realist objection is consistent with but superfluous with a pragmatic modeling approach.</p>
<p align="right"> &#8212; Chris Mungall </p>
</blockquote>
<p>Indeed. As always, it comes down to a question of having a set of clearly defined use-cases, to an understanding of the logic, and the consequences of our modelling decisions. My own feeling is that, in general, constructs such as &#8220;absent wing&#8221; are dangerous as they are likely to lead to bad ontologies; but this does not mean that they are wrong. I think that the pragmatic modelling approach would be to say, do not do this unless you understand why it is dangerous.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1823 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/09/why-not/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Realism and Science</title>
		<link>http://www.russet.org.uk/blog/2010/07/realism-and-science/</link>
		<comments>http://www.russet.org.uk/blog/2010/07/realism-and-science/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 15:28:42 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Papers]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1713</guid>
		<description><![CDATA[This post carries the text of a paper accepted for PLoS One (now published). I publish it here as a pre-print because of the recent discussion on OBO discuss about realism. I have converted this from the original latex, which isn&#8217;t perfect. Apologies for errors. The [PDF] is available here. Adding a little reality to [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1713">
<p>This post carries the text of a paper accepted for PLoS One (now <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0012258">published</a>). I publish it here
as a pre-print because of the recent discussion on OBO discuss about realism.
I have converted this from the original latex, which isn&#8217;t perfect. Apologies
for errors.</p>
<p>The <a href="http://homepages.cs.ncl.ac.uk/phillip.lord/download/publications/realism_and_science.pdf">[PDF]</a> 
is available here. </p>
<div>
<p><big class="xlarge"><b class="bfseries">Adding a little reality to
  building ontologies for biology</b><br /></big> Phillip Lord and Robert
  Stevens<br /> School of Computing Science<br /> Claremont Road<br />
  Newcastle University<br /> Newcastle-upon-Tyne, UK<br />
  <a href="mailto:phillip.lord@newcastle.ac.uk">phillip.lord@newcastle.ac.uk</a><br />
  School of Computer Science<br /> The University of Manchester<br /> Oxford
  Road<br /> Manchester, UK<br />
  <a href="mailto:robert.stevens@manchester.ac.uk">robert.stevens@manchester.ac.uk</a></p>
</div>
<h1 id="a0000000002">Abstract</h1>
<p><b class="bfseries">Background:</b> Many areas of biology are open to
mathematical and computational modelling. The application of discrete, logical
formalisms defines the field of biomedical ontologies. Ontologies have been
put to many uses in bioinformatics. The most widespread is for description of
entities about which data have been collected, allowing integration and
analysis across multiple resources. There are now over 60 ontologies in active
use, increasingly developed as large, international collaborations.</p>
<p>There are, however, many opinions on how ontologies should be authored;
that is, what is appropriate for representation. Recently, a common opinion
has been the &ldquo;realist&rdquo; approach that places restrictions upon the
style of modelling considered to be appropriate.</p>
<p><b class="bfseries">Methodology/Principal Findings:</b> Here, we use a
number of case studies for describing the results of biological experiments.
We investigate the ways in which these could be represented using both realist
and non-realist approaches; we consider the limitations and advantages of each
of these models.</p>
<p><b class="bfseries">Conclusions/Significance:</b> From our analysis, we
conclude that while realist principles may enable straight-forward modelling
for some topics, there are crucial aspects of science and the phenomena it
studies that do not fit into this approach; realism appears to be
over-simplistic which, perversely, results in overly complex ontological
models. We suggest that it is impossible to avoid compromise in modelling
ontology; a clearer understanding of these compromises will better enable
appropriate modelling, fulfilling the many needs for discrete mathematical
models within computational biology.</p>
<h1 id="a0000000003">Introduction</h1>
<p>Ontologies are now widely used for describing and enhancing biological
resources and biological data, largely following on from the success of the
Gene
Ontology&nbsp;<span class="cite">[<a href="#Ashburner2000">1</a>]</span>.
Ontologies have been used for many purposes, from schema integration to value
reconciliation to query
interfaces&nbsp;<span class="cite">[<a href="#handbook2">2</a>]</span>.
Ontologies have also become a cornerstone of computational biology and
bioinformatics. As computationally amenable artifacts they are, themselves, a
direct part of computational biology; many computational biologists are
involved in their production and maintenance. Many more use ontologies to
summarise their data, often by looking for
over-representation&nbsp;<span class="cite">[<a href="#Zeeberg2003">3</a>]</span>,
as the basis for drawing computational inferences about
data&nbsp;<span class="cite">[<a href="#Wolstencroft2006">4</a>]</span>,
or as the basis for determining semantic
similarity&nbsp;<span class="cite">[<a href="#Lord2003">5</a>]</span>.
Even those not making direct computational use of ontologies are likely to
come into contact with them, for example, when preparing annotation as part of
their data
release&nbsp;<span class="cite">[<a href="#Whetzel2006a">6</a>]</span>.</p>
<p>It is, therefore, of vital interest to computational biologists that
ontologies for use within biomedicine are fit for purpose. One effort that
aims to increase the quality of the ontologies available within biomedicine is
the &ldquo;OBO
Foundry&rdquo;&nbsp;<span class="cite">[<a href="#Smith2007">7</a>]</span>.
The main tool that it uses for this is &ldquo;an evolving set of shared
principles governing ontology development&rdquo;. The initial eleven
principles of the OBO
Foundry&nbsp;<span class="cite">[<a href="#OBOFoundry2006">8</a>]</span>
were largely concerned with what might be termed &lsquo;good engineering
practice&rsquo; (ontologies must, for example, be openly available, with a
common syntax, well documented, and used). These principles have later been
joined by a further
eleven&nbsp;<span class="cite">[<a href="#OBOFoundry2008">9</a>]</span>;
these include principles such as &ldquo;textual definitions will use the
genus-species form&rdquo;, &ldquo;Use of Basic Formal Ontology&rdquo; and, the
somewhat quixotic, &ldquo;terms [&hellip;] should correspond to instances in
reality&rdquo;. These stem not from engineering practice, but from a
perspective called <i class="itshape">realism</i>.</p>
<p>The many different uses for ontologies that we have described are reflected
in different understandings and methodologies about how and what to represent
in an ontology. Over the last few years, for many uses the paradigm has moved
from &ldquo;a conceptualization of the application domain&rdquo; toward
&ldquo;a description of the key entities in reality&rdquo;; it is this latter
approach that defines
realism&nbsp;<span class="cite">[<a href="#Johansson2006">10</a>]</span>.
This approach to ontology is typified by the Basic Formal Ontology (BFO); a
small upper-ontology for use within science in general and biomedical ontology
building in
particular&nbsp;<span class="cite">[<a href="#Grenon2004">11</a>]</span>.</p>
<p>There has been significant discussion regarding the possibility of
representing <em>only</em> &ldquo;real entities&rdquo; in computational
ontologies&nbsp;<span class="cite">[<a href="#smith2004beyond">12</a>]</span>.
Likewise, there has been significant discussion about the philosophy
surrounding realism and the role of ontology in its
representation&nbsp;<span class="cite">[<a href="#Johansson2006">10</a>]</span>.
While it is argued by some that it is possible to represent <em>only</em>
reality when making a domain description, there has, however, been little
discussion on whether it is necessarily desirable to do so.</p>
<p>In this paper, we consider the implications that realism has for the
choices that are open to the ontologist while they are modelling their domain
of interest. In particular, we consider the implications that this has for the
computational capabilities of any resultant ontology, in terms of its ability
to represent scientific knowledge in a computationally amenable form, as well
as the ability to perform automated inference or statistics over this
knowledge. We suggest that the application of realism results in ontologies
that are over-complex, awkward or limited; as such, realism falls far short of
its aim of increasing the fitness-for-purpose of ontologies. This approach,
therefore, is unlikely to fulfil the needs of computational biologists who
form a substantial part of both the user and developer community for
bio-ontologies.</p>
<h1 id="a0000000004">Methods</h1>
<p>In this paper, we take the approach of a number of worked exemplars; this
is a complementary approach to an in-depth consideration of the modelling
decisions for a particular area or particular ontology, which we have used
previously&nbsp;<span class="cite">[<a href="#Lord2009">13</a>]</span>,
as it allows broader conclusions about the general principles of ontology
development. For each section, as well as the main exemplars, a number of
related examples are briefly discussed, to reinforce that the issues raised
are, indeed, general.</p>
<p>The exemplars have been selected by several criteria. First, the main
exemplars are all taken from within biomedicine; this is also true for the
majority of the related examples. Second, we have chosen exemplars that
provide as wide a coverage of biology as possible. Third, for practical
reasons, we have chosen exemplars where the underlying science is relatively
basic to much of biology and is likely to be immediately clear to the reader
without significant explanation.</p>
<p>We have chosen exemplars requiring as little knowledge of specific
ontologies as possible. We refer to only three. The first is BFO (see
&ldquo;What is Realism?&rdquo;) which is a canonical example of a realist
ontology. BFO is described as a cross-domain, upper-ontology; as a result,
most terms fail the criteria given above; they are of poor biomedical
relevance, and are not basic science or immediately clear. We have, therefore,
also used PATO
(see <a href="http://obofoundry.org/wiki/index.php/PATO:Main_Page">http://obofoundry.org/wiki/index.php/PATO:Main_Page</a>);
this defines &ldquo;qualities&rdquo; that we might consider attributes of
other entities; so, the authors of this paper have a height, weight and shape,
all of which are considered to be qualities of the authors. Finally, we use
the relationship
ontology&nbsp;<span class="cite">[<a href="#Smith2005">14</a>]</span>;
this describes the relations between entities. So, for example, the height of
the author <em>inheres_in</em> the author.</p>
<p>As discussed in this and other
works&nbsp;<span class="cite">[<a href="#Russell1946">15</a>, <a href="#Merrill2010">16</a>]</span>,
&ldquo;realism&rdquo; is itself poorly defined. Where this lack of definition
makes the consequences of realism hard to determine, we have taken the
practical course, of showing the consequences as they play out in practice; to
an extent, therefore, these three ontologies are not only exemplars for
realism, but define it, as it is currently practiced. In short, for this
paper, when we say &ldquo;realism&rdquo;, we largely mean &ldquo;realism as
practiced by BFO&rdquo;. We do not claim, in this paper, to address all the
philosophical perspectives that have, through time, carried the name
&ldquo;realism&rdquo;.</p>
<h1 id="a0000000005">Results</h1>
<h2 id="a0000000006">What is Realism?</h2>
<p>Building ontologies based on reality is obviously appealing to most
scientists; after all the study of <em>reality</em> to determine its behaviour
and laws is the goal of scientists. A brief consideration, however, shows that
this notion cannot define a methodology for the building of ontologies.</p>
<p>Within the context of science &ldquo;reality&rdquo; would normally be taken
to mean our experimental or observational data; but the statement that science
(ontologies) should be based on experimental or observational data is a truism
and, as such, has no explanatory power. The &ldquo;real&rdquo; in realism
refers, in fact, to the belief that the categories that we can use to divide
entities are, themselves, real.</p>
<p>This distinction stems from an old argument from philosophy; realism
against conceptualism. Again, both sides of the argument agree that the world
we can perceive, and as scientists, experiment on, is mind-independent. The
conceptualist, however, argues that the categories that they
term <em>concepts</em> are a product of social agreement. Conversely, the
realist argues that these categories that they term <em>universals</em> are
themselves real, that is mind independent in their own right, like the
entities they describe.</p>
<p>This distinction may seem fairly confusing; as
Russell&nbsp;<span class="cite">[<a href="#Russell1946">15</a>]</span>
says &ldquo;if I have failed to make Aristotle&rsquo;s theory of universals
clear, that is (I maintain) because it is not clear&rdquo;. In fact, there is
a third possibility that is a more empirical view&mdash;that is, if categories
(or other models) help in describing and predicting experimental data, then
they are useful regardless of whether they are real or
otherwise&nbsp;<span class="cite">[<a href="#Dumontier2010">17</a>]</span>.
As an example, the Mendelian notion of segregating units of inheritance was
defined and useful many years before a complete mechanistic description of
their cause was available. In this context, we note that there is no commonly
used term to express this form of category; most commonly,
&ldquo;concept&rdquo; is used.</p>
<p>For a field with a core activity of providing definitions, there is
surprisingly little agreement on the meaning of the word
&ldquo;ontology&rdquo;; as there have been many papers on the topic, we
consider just a few that reflect the distinction between these approaches.
Probably the most commonly cited
definition&nbsp;<span class="cite">[<a href="#Gruber1992">18</a>]</span>
describes an ontology as &ldquo;a specification of a conceptualization&rdquo;.
This definition emphasises the formality (i.e. logical and, therefore,
computationally amenable) aspect to ontology development.</p>
<p>This is countered with a realist definition; while the requirements from
Gruber&rsquo;s definition&mdash;a formal specification&mdash;are necessary,
realist ontologies add the requirement that &ldquo;the nodes and edges
correspond not to concepts but, rather, to entities in
reality&rdquo;&nbsp;<span class="cite">[<a href="#Ceusters2006">19</a>]</span>.</p>
<p>What does &ldquo;reality&rdquo; in this context actually mean? Definitions
such as &ldquo;that which exists&rdquo; are strangely circular leaving the
question of what &ldquo;exists&rdquo; means.
Smith&nbsp;<span class="cite">[<a href="#smith2004beyond">12</a>]</span>
adds the proviso that reality is &ldquo;captured in scientific laws&rdquo;.
Being a scientific law is not strictly enough, as some are later shown to be
wrong, but a scientific law is the current best attempt at reality; this
possibility does not make an ontology non-realist. For a realist ontology, the
nodes are &ldquo;universals&rdquo;&mdash;entities in reality&mdash;rather than
concepts; at least one particular must exist for every universal.</p>
<p>This still leaves the difficulty of applying the realist definition in
practice. So most scientists will happily accept, for example, that a cell is
real as it is an entity that can be observed, interacted with and manipulated.
However, concepts such as
&ldquo;function&rdquo;&nbsp;<span class="cite">[<a href="#Lord2009">13</a>]</span>
have raised more
discussion&nbsp;<span class="cite">[<a href="#Shrager2003">20</a>]</span>;
is this &ldquo;real&rdquo; or just a word biologists use as a point of
reference? While the definition involving &ldquo;entities in reality&rdquo;
may be of philosophical interest, it is hard to turn into a specific assay:
how do we test whether a particular concept is, also, a universal? Instead of a
clear assay for existence, realism offers direction about what concepts are
NOT reality, rather than those that are reality. For example, and perhaps
ironically given the negative practical definition of reality, a statement
such as:</p>
<pre>
  Dog is_a not Cat
</pre>
<p>is not held to be a statement about reality as it is a logically
constructed example of subsumption (an <tt>is_a</tt> relationship); there is
no real universal containing particular <tt>not Cat</tt>s in existence.
Likewise,</p>
<pre>
  Dog is_a (Dog or Cat)
</pre>
<p>as the existence of particular <tt>Dog</tt>s and <tt>Cat</tt>s does not
mean that there are any particular <tt>Dog or Cat</tt>s (examples modified
from&nbsp;<span class="cite">[<a href="#smith2004beyond">12</a>]</span>).</p>
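<p>Read set-theoretically, as in OWL, both statements are unremarkable. The following Python sketch shows this for a single, small interpretation; the individuals and the helper <tt>is_a</tt> are hypothetical, and a real reasoner considers all possible interpretations rather than the one finite extension assumed here. The first subsumption also relies on the assumption that no individual is both a <tt>Dog</tt> and a <tt>Cat</tt>.</p>

```python
# One hypothetical interpretation: a small universe of named individuals.
universe = {"rex", "fido", "tom", "felix", "nemo"}
dog = {"rex", "fido"}
cat = {"tom", "felix"}

# Complement and union read as set operations over the universe.
not_cat = universe - cat
dog_or_cat = dog | cat


def is_a(sub, sup):
    """Subsumption in this interpretation: every member of sub is in sup."""
    return sub.issubset(sup)
```

<p>Here <tt>is_a(dog, not_cat)</tt> and <tt>is_a(dog, dog_or_cat)</tt> both hold; the realist objection is not that the inference fails, but that <tt>not_cat</tt> and <tt>dog_or_cat</tt> name no universal.</p>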
<p>This is not meant to provide a complete introduction to
&ldquo;realism&rdquo;, but to provide a grounding for the discussion that
follows; we will consider the issues raised by realism, throughout the paper.
A more philosophical treatment of realism is given by
Merrill&nbsp;<span class="cite">[<a href="#Merrill2010">16</a>]</span>.
It is useful to note
Gruber&rsquo;s&nbsp;<span class="cite">[<a href="#Gruber1992">18</a>]</span>
statement that &ldquo;And it [a computational ontology] is certainly a
different sense of the word than its use in philosophy.&rdquo; In this paper,
we are concerned with the ontologies as computational artefacts.</p>
<p>To summarise, a realist approach to ontology says that the categories or
universals into which objects or particulars fall have an existence in their
own right. It is these universals and <em>only</em> these universals that a
realist approach says should be the nodes within an ontology. In this paper we
examine whether this approach is an adequate means to provide an account for
the data produced by biomedicine.</p>
<h2 id="a0000000007">Models that represent reality</h2>
<p>In this section, we suggest that many universals have a range of
representations. In some cases, the choice of representation may be obvious,
such as length which has a natural scientific representation in SI units. In
many cases, however, there is no clear set of criteria for choosing between
representations. We consider the way that one quality, <em>colour</em>, could
be represented ontologically.</p>
<p>Colour is a complex phenomenon. The colour of an object or other phenomenon
arises, in part, from that object and, in part, from the eye that perceives
it.</p>
<p>A representation of the physical reality would be an account of the
reflection, transmission and perception of light by an organism. Such an
account of the reality of light and its perception might cover the following
facts: Chlorophyll is green in reflection and red in transmission; a flower
petal appears white to a human, but has UV stripes to a bee; the plant leaf
and the algae appear green to humans, but have different reflection spectra
because their chlorophylls co-ordinate to their Mg<sup>2+</sup> ion in
different ways.</p>
<p>There have been a number of different attempts to represent the
complexities of colour numerically, for a number of different purposes. These
are models that allow us to describe colour, without having to deal with the
underlying physics or reality of colour. Probably the best known of these are
RGB (Red, Green, Blue) or HSV (Hue, Saturation, Value), both of which are
additive colour models appropriate for describing colour on a display screen.
CMYK (Cyan, Magenta, Yellow and Black) is a subtractive colour model and
commonly used for printing.</p>
<p>Collectively these representation schemes are known
as <i class="itshape">colour models</i>. That none of these schemes has become
predominant reflects both their different uses and the preferences of
different user groups.</p>
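<p>To make the point concrete, the same colour can be written down in each of these models. The sketch below uses the <tt>colorsys</tt> module from the Python standard library for HSV; the CMYK conversion is a naive, device-independent formula that we assume purely for illustration (real printing workflows use colour profiles), and the function and variable names are hypothetical.</p>

```python
import colorsys


def rgb_to_cmyk(r, g, b):
    """Naive RGB to CMYK conversion; all channels are floats in 0..1."""
    k = 1.0 - max(r, g, b)
    if k == 1.0:  # pure black: avoid dividing by zero
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)


# One colour, three descriptions.
pure_red_rgb = (1.0, 0.0, 0.0)
pure_red_hsv = colorsys.rgb_to_hsv(*pure_red_rgb)
pure_red_cmyk = rgb_to_cmyk(*pure_red_rgb)
```

<p>None of the three tuples is more &ldquo;real&rdquo; than the others; each is a model of the same colour, chosen for a purpose.</p>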
<p>For the ontology builder, this leaves us with a difficult choice:</p>
<ol class="enumerate">
<li>
<p>We bless one of the colour models, substituting the model for the
    underlying physics and do not describe the others.</p>
</li>
<li>
<p>We describe all of the colour models, but do not describe that they are
    part of a colour model.</p>
</li>
<li>
<p>We explicitly describe the reality of the physics, biology and the
    relationship to the different colour models, reflecting the practise of
    describing colour in much of science.</p>
</li>
</ol>
<p>Currently, considering the PATO ontology, which is documented as being
built according to realist principles, the first approach has been taken,
using the HSV scheme. So, PATO has a term <b class="bfseries">Color Hue</b>
(PATO:15) that is defined as:</p>
<blockquote class="quote"><p>
  <i class="itshape">&ldquo;A chromatic scalar-circular quality inhering in an
  object that manifests in an observer by virtue of the dominant wavelength of
  the visible light; may be subject to fiat divisions, typically into 7 or 8
  spectra.&rdquo;</i>
</p></blockquote>
<p>Using this model, PATO describes <b class="bfseries">red</b> (PATO:322)
as:</p>
<blockquote class="quote"><p>
  <i class="itshape">&ldquo;A color hue with high wavelength of the long-wave
  end of the visible spectrum, evoked in the human observer by radiant energy
  with wavelengths of approximately 630 to 750 nanometers.&rdquo;</i>
</p></blockquote>
<p>This modelling approach has a number of limitations.</p>
<ul class="itemize">
<li>
<p>The decision to choose one colour model or the other is arbitrary.
    While there are reasonable justifications for the use of HSV as opposed
    to, for example, RGB, there is no <i class="itshape">a priori</i>
    justification for use of an additive colour model as opposed to a
    subtractive model. Both are valid, for different usage; in general,
    reflective colour is more common in biology (e.g. pigmentation) than
    emitted colour (e.g. fluorescence) which would suggest that subtractive
    models are more generally applicable, but a full treatment requires
    both.</p>
</li>
<li>
<p>There are no terms which can be used to express data described
    according to other colour models, necessitating a transformation between
    the different models into the officially &ldquo;blessed&rdquo; version
    during application of the ontology. These transformations may be lossy and
    not fully reversible.</p>
</li>
</ul>
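<p>The loss of information in such transformations is easy to demonstrate. The sketch below assumes, purely for illustration, that data arrive as 8-bit RGB and that the &ldquo;blessed&rdquo; model stores HSV as whole degrees and whole percentages; neither assumption comes from PATO, and the function names are hypothetical. There are 256 grey levels in 8-bit RGB but only 101 possible percentage values, so most greys cannot survive the round trip.</p>

```python
import colorsys


def to_quantised_hsv(r, g, b):
    """8-bit RGB to HSV stored as whole degrees and whole percentages."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return (round(h * 360) % 360, round(s * 100), round(v * 100))


def from_quantised_hsv(h, s, v):
    """Quantised HSV back to 8-bit RGB."""
    r, g, b = colorsys.hsv_to_rgb(h / 360, s / 100, v / 100)
    return (round(r * 255), round(g * 255), round(b * 255))


# Round-trip every 8-bit grey level and record which ones come back changed.
greys = [(n, n, n) for n in range(256)]
round_tripped = [from_quantised_hsv(*to_quantised_hsv(*grey)) for grey in greys]
changed = [grey for grey, back in zip(greys, round_tripped) if grey != back]
```

<p>A finer-grained HSV encoding would reduce the loss, but that is itself a modelling decision that the ontology would need to document.</p>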
<p>The second approach is also possible. This would allow expression of data
in multiple colour models, however:</p>
<ul class="itemize">
<li>
<p>The ontology would tend to get rather confusing as more colour models
    are added; colour would have children &ldquo;Hue&rdquo;, &ldquo;Red&rdquo;
    and &ldquo;Cyan&rdquo; and seven other sibling terms.</p>
</li>
<li>
<p>It is not clear which terms comprise a colour model: do values for
    &ldquo;Hue&rdquo;, &ldquo;Green&rdquo; and &ldquo;Magenta&rdquo; specify a
    colour?</p>
</li>
<li>
<p>It is not clear whether terms that occur in the other contexts are
    equivalent. Is &ldquo;Red as in RGB&rdquo; the same as or different
    from <b class="bfseries">Red</b> (PATO:322)? Is &ldquo;Hue as in HSV&rdquo;
    the same as or different from &ldquo;Hue as in HSL&rdquo; (HSL is another
    additive colour model)?</p>
</li>
</ul>
<p>The third approach does not suffer from the limitations described. We
suggest from this analysis that it is necessary, if unfortunate, for some
qualities to be explicitly described with multiple representations. To avoid
confusion, the universal quality, colour, would need to be explicitly
described as having multiple valid models. Yet, realism argues that we should
not do this, as colour is real and not a model; moreover, the focus on
realism means that the documentation does not describe the choices that have
been made, nor refer to the relationship between <b class="bfseries">Color
Hue</b> (PATO:15) and &ldquo;Hue as in HSV&rdquo;. In short, realism has
limited our ability to represent colour.</p>
<h3 id="a0000000008">Related Examples</h3>
<p>There are many different examples of this issue; having two or more models
  to describe the same part of reality is common. The distance between two
  markers on a chromosome can be measured using (one of a number of) genetic
  techniques. Some qualities have a bewildering array of different
  measurements associated with them; Wikipedia, for example, lists 13
  different measurements of concentration such as molarity or \(gm^{-3}\).</p>
<p>This issue has been previously recognised. In computing science, explicitly
modelling one model in another is a form of <em>metamodelling</em>. Other,
non-realist, upper-ontologies such as DOLCE use the concept
of <tt class="ttfamily">Quale</tt> to describe a cognitive abstraction (such
as Colour), including those over a physical quality (such as the spectral
properties of reflected
light)&nbsp;<span class="cite">[<a href="#Seyed2009">21</a>]</span>.</p>
<h2 id="a0000000009">Sequences and the Central Dogma</h2>
<p>The central dogma of molecular biology suggests that all genetic
information is encoded in the DNA of a cell, as the ordered nucleotides that
comprise the DNA. RNA is transcribed from this DNA. The RNA molecule also has
a defined order of nucleotides related to the DNA. Finally the RNA is
translated into protein.</p>
<p>Consider an ontology describing these entities. First, the DNA molecule has
a number of properties; as well as physical dimensions (discussed further in
&ldquo;sec:limits-consistency&rdquo;), including a length expressed in metres,
it consists of a number of monomeric units. So, for example, we might say a
DNA molecule with a series of nucleotide residues represented
as <tt class="ttfamily">&lsquo;GATC&rsquo;</tt> <tt class="ttfamily">has&shy;Monomeric&shy;Part</tt> <tt class="ttfamily">4</tt>.</p>
<p>This causes a slight worry from a realist perspective; the number 4 may not
  be a realist universal. There are no instances of 4. In this case, the
  number 4 is being used to describe a part of reality, so this is allowable
  in a realist ontology. Alternatively, we could describe the same reality
  using units (traditionally base-pairs or bp). Therefore,
  the <tt class="ttfamily">DNA
  molecule</tt> <tt class="ttfamily">has&shy;Polymer&shy;Length</tt> 4bp.</p>
<p>Accepting the use of natural numbers in this way, also means that we accept
  the use of sets and sequences to describe reality. One definition of 4 is a
  sequence. Stating that the DNA molecule represented with the
  sequence <tt class="ttfamily">&lsquo;GATC&rsquo;</tt> <tt class="ttfamily">has&shy;Polymer&shy;Length</tt> <tt class="ttfamily">4bp</tt>
  is equivalent, therefore, to stating that
  it <tt class="ttfamily">hasSequence</tt> <tt class="ttfamily">&lsquo;NNNN&rsquo;</tt>
  where <tt class="ttfamily">&lsquo;N&rsquo;</tt> is any nucleotide
  residue.</p>
<p>It should be noted, however, that the usefulness of these statements stems
from our <em>implicit</em> knowledge. The number 4 is a natural number,
so <tt class="ttfamily">has&shy;Monomeric&shy;Part</tt> <tt class="ttfamily">4.2</tt>
is not possible. If a new monomer is attached to our DNA molecule, it will
now <tt class="ttfamily">has&shy;Monomeric&shy;Part</tt> <tt class="ttfamily">5</tt>,
because the natural numbers are additive. We understand the operation of
natural numbers as part of our shared, background knowledge, and we can apply
this knowledge here.</p>
<p>Having described that the DNA molecule represented
as <tt class="ttfamily">&lsquo;GATC&rsquo;</tt> <tt class="ttfamily">has&shy;Polymer&shy;Length</tt> <tt class="ttfamily">4</tt>
(or <tt class="ttfamily">hasSequence</tt> <tt class="ttfamily">&lsquo;NNNN&rsquo;</tt>)
we might wish to be more specific about the order of nucleotide residues and
state <tt class="ttfamily">hasSequence</tt> <tt class="ttfamily">&lsquo;GATC&rsquo;</tt>.
The implicit background knowledge we used previously about the natural numbers
still applies here.</p>
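<p>The relationship between sequence and length described above can be sketched in a few lines of Python. This is only an illustration: the class name <tt class="ttfamily">DnaMolecule</tt>, its attribute names and the four-letter alphabet are assumptions made for the example, not part of any existing ontology or library.</p>

```python
from dataclasses import dataclass

DNA_ALPHABET = frozenset("GATC")  # assumed alphabet, for illustration only


@dataclass(frozen=True)
class DnaMolecule:
    """Hypothetical model of a DNA molecule described by its residue sequence."""

    has_sequence: str

    def __post_init__(self):
        unknown = set(self.has_sequence) - DNA_ALPHABET
        if unknown:
            raise ValueError(f"not DNA residues: {sorted(unknown)}")

    @property
    def has_polymer_length(self) -> int:
        # Derived from the sequence, so it is always a natural number.
        return len(self.has_sequence)

    def attach(self, residue: str) -> "DnaMolecule":
        # Attach exactly one monomer; the result is one residue longer.
        if len(residue) != 1:
            raise ValueError("attach expects a single residue")
        return DnaMolecule(self.has_sequence + residue)


molecule = DnaMolecule("GATC")
longer = molecule.attach("A")
print(molecule.has_polymer_length, longer.has_polymer_length)
```

<p>The length is never stored separately, so a length of 4.2 cannot be expressed; the background knowledge about natural numbers is carried by the programming language rather than stated in the model.</p>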
<p>Next consider the process of transcription. The previous discussion about
DNA likewise applies to RNA. The RNA molecule will,
however, <tt class="ttfamily">hasSequence</tt> <tt class="ttfamily">&lsquo;GAUC&rsquo;</tt>,
as RNA uses a different set of bases to DNA. Mathematically, one sequence can
be determined from the other by applying a mapping; though the mapping is a
human activity, not a representation of biochemical reality. To describe this,
we have two options:</p>
<ul class="itemize">
<li>
<p>Taking the realist approach, we can continue to rely on
    the <em>implicit</em> knowledge of the biologist, as we have previously
    relied on an implicit understanding of the natural numbers.</p>
</li>
<li>
<p>We can be explicit about the properties of these sequences (additional
    to those properties shared with the naturals). We can talk about non-real
    world concepts such as alphabets, transformations and how these map to the
    real entities involved.</p>
</li>
</ul>
<p>It should be noted that the former severely limits the ability to describe
the central dogma. The transformation of DNA to RNA sequence is simple, but
the transformation of RNA to protein is more complex. Again, the choice is
between representing reality or representing how we practise science.</p>
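<p>To make the second option concrete, the two transformations can be written down explicitly. The sketch below is hedged in two ways: it follows the simplification used in the text, where the RNA is the DNA sequence with T replaced by U (strand complementarity is ignored), and the codon table is deliberately partial, listing only four entries of the standard genetic code. The function names are hypothetical.</p>

```python
# Assumption: RNA is written as the DNA sequence with thymine (T) replaced
# by uracil (U), as in the text; strand complementarity is ignored.
DNA_TO_RNA = str.maketrans("T", "U")

# Deliberately partial codon table (standard genetic code); the full table
# has 64 entries. "*" marks a stop codon.
CODON_TABLE = {"AUG": "M", "GAU": "D", "CGA": "R", "UAA": "*"}


def transcribe(dna: str) -> str:
    """Map a DNA sequence to an RNA sequence, one residue at a time."""
    return dna.translate(DNA_TO_RNA)


def translate(rna: str) -> str:
    """Map an RNA sequence to a protein sequence, three residues at a time."""
    protein = []
    for i in range(0, len(rna) - len(rna) % 3, 3):
        residue = CODON_TABLE[rna[i:i + 3]]  # KeyError if codon is not listed
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(transcribe("GATC"))
print(translate(transcribe("ATGGATCGATAA")))
```

<p>Transcription here is a one-to-one relabelling, whereas translation needs a reading frame, a lookup table and a stop condition, which is the difference in complexity noted above.</p>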
<h3 id="a0000000010">Related examples</h3>
<p>The issues relating to sequences are fairly general. In computer science
terms, these are abstract data types. The DNA sequence is a kind of sequence
with special properties (a limited alphabet). Many of the physical quantities
in science have special properties in this way. Consider:</p>
<dl class="description">
<dt>Temperature:</dt>
<dd>
<p>While these look like positive real numbers, temperatures can only
    meaningfully be subtracted from each other, which gives information about
    heat-flow between two bodies. Other operations (addition, multiplication)
    which are useful for real numbers have little meaning for temperature.</p>
</dd>
<dt>Recombination Distance:</dt>
<dd>
<p>These look like probabilities but are not, requiring a transformation
    to add.</p>
</dd>
</dl>
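<p>Both examples can be sketched as abstract data types. In the sketch below the <tt class="ttfamily">Temperature</tt> class is hypothetical, and Haldane&rsquo;s map function is our choice of transformation for recombination; it assumes no crossover interference, and other map functions would give different results.</p>

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Temperature:
    """Hypothetical temperature type, stored in kelvin."""

    kelvin: float

    def __post_init__(self):
        if self.kelvin < 0:
            raise ValueError("temperature cannot be below absolute zero")

    def __sub__(self, other):
        # The difference is a plain number of kelvin, not another Temperature.
        if not isinstance(other, Temperature):
            return NotImplemented
        return self.kelvin - other.kelvin

    # No __add__ or __mul__ is defined, so Temperature + Temperature
    # raises TypeError.


def add_recombination_fractions(r1: float, r2: float) -> float:
    """Combine two adjacent recombination fractions (each below 0.5).

    Assumption: Haldane's map function (no crossover interference).
    """
    d1 = -0.5 * math.log(1 - 2 * r1)  # map distance in Morgans
    d2 = -0.5 * math.log(1 - 2 * r2)
    return 0.5 * (1 - math.exp(-2 * (d1 + d2)))


body = Temperature(310.15)
room = Temperature(293.15)
print(body - room)
print(add_recombination_fractions(0.1, 0.2))
```

<p>Under this assumption, fractions of 0.1 and 0.2 combine to about 0.26 rather than 0.3; the map distances add, the fractions themselves do not.</p>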
<p>There is a limitation on the ability to use abstract data types within a
given ontology language; in most cases, the expressivity of the language will
not allow arbitrary mathematical relations. Some languages, such as OWL,
provide &ldquo;concrete domains&rdquo;; these provide extension
points within the ontology language where, for example, the special properties
of temperature could be represented; other languages do not. In either case,
there are limitations to these capabilities; for example, the constraint and
behaviour of a concrete domain needs to be interpreted with its own semantics
within a reasoner, rather than expressed explicitly within the ontology. It
may make more sense in many circumstances to describe the existence of a
mathematical model as discussed in &ldquo;sec:go-where-science&rdquo;.</p>
<h2 id="a0000000011">The limitations of computers</h2>
<p>Modelling continuous properties is a common problem in ontological
engineering. For example, according to statistics the western world is now
facing an obesity epidemic; in short many or most of us weigh too much.
Understanding, however, exactly what &ldquo;too much&rdquo; means is not
necessarily simple; a common technique to use is body mass index
(BMI)&mdash;body weight divided by square of the height, which is a continuous
value. The BMI range is split into 4 categories: Obese (&gt;30), Overweight
(&gt;25), Normal (&gt;18.5) and Underweight (&lt;18.5). These categories
represent ranges of the value of BMI.</p>
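<p>As a minimal sketch, the mapping from the continuous value to the four categories looks like this. The function names are hypothetical, and the text does not say which category a BMI of exactly 18.5 falls into; assigning it to Normal is an assumption.</p>

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Body weight divided by the square of the height."""
    return weight_kg / height_m ** 2


def bmi_category(bmi: float) -> str:
    # Thresholds as given in the text. A value of exactly 18.5 is assigned
    # to "Normal" here; that boundary choice is an assumption.
    if bmi > 30:
        return "Obese"
    if bmi > 25:
        return "Overweight"
    if bmi >= 18.5:
        return "Normal"
    return "Underweight"


print(bmi_category(body_mass_index(70.0, 1.75)))
```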
<p>This data simplification has many justifications. On an individual basis,
the BMI is not a particularly accurate measure, so the simplification does not
lose much accuracy. It is also easier to describe to patients, for whom a
&ldquo;BMI of 25&rdquo; will be less comprehensible than being
&ldquo;overweight&rdquo;.</p>
<p>Modelling some of this is straight-forward. Height and weight are modelled
as properties of the individual. The BMI would therefore appear to be a
property of the individual as it is a restatement of two existing properties.
It would appear, therefore, that the category into which an individual falls
should also be a property of the individual.</p>
<p>Consider the values of the property next. These categories are an
abstraction over the real-world properties. Although height as an integer
value is expressed using a non-real-world entity, it is a description of a
part of reality. A range, however, in the BMI does not describe part of
reality in the same sense. There are no instances of BMI &ldquo;Obese&rdquo;.
In a realist ontology, therefore, it is unclear what the relationship is
between BMI Obese and the individual person.</p>
<p>For the statistician or computer scientist, there is an additional
advantage to the simplification; four discrete groups have better
computational properties than a continuous measure. Database queries become
easier to write, and quicker to run. This is also true for the ontology
builder; simplifying the real-world may fulfil the needs of an application for
which the ontology is built, while avoiding unnecessary complexity. This is a
widely used method for representing partitions of continuous values, the
appropriately named <em>value
partition</em>&nbsp;<span class="cite">[<a href="#rector2005">22</a>]</span>.</p>
<p>In the case of BMI there is a pre-existing social agreement toward a set of
categories; however, even in the absence of such an agreement, the ontology
builder might wish to represent a continuous range as a value partition to
decrease the complexity of their ontology. The value partition is useful, but
many of the concepts involved are not realist universals. The choice, then, is
between modelling &ldquo;reality&rdquo; and modelling a simplification that is easier
to use and has better computational properties.</p>
<h3 id="a0000000012">Related Examples</h3>
<p>Splitting the two cases, there are many examples of pre-existing
simplifications. From medicine, there are so many that it seems to be the norm
rather than the exception: hypo- vs hyperthermic; hypo- vs hypertensive; hypo-
vs hyperglycemic. In many cases, these ranges have standard interpretations
akin to the BMI.</p>
<p>There are likewise a number of constructions or design patterns that reduce
complexity, extend the effective capabilities of the language or simply
provide standard solutions to common
problems&nbsp;<span class="cite">[<a href="#egana2008">23</a>]</span>.</p>
<h2 id="a0000000013">To go where science has gone before</h2>
<p>Many experiments in biomedicine require the measurement of some physical
property of a biological system. Take, for example, the measurement of heart
rate; in standard practice, this is measured in beats per minute, and is
calculated simply by counting beats (\(b\)) over a time period (\(t\))
and dividing one by the other (\(b/t\)). However, what time period is
appropriate? We might choose 60s, but this raises the question, what is the
meaning of heart rate over shorter periods?</p>
<p>Fortunately, there is a standard solution to this problem, which is to
  define heart rate using differential calculus; so heart rate becomes \(db/dt\).</p>
<p>The derivative, \(db/dt\), presents some problems from a realist
perspective. As noted previously (see &ldquo;sec:sequ-centr-dogma&rdquo;), it
is possible to associate real numbers with entities; however, \(db/dt\) is
\(0/0\). It is not clear whether this quantity is a universal; it is
certainly the case that the expression \(db/dt\) is not a universal, yet
such values and calculus itself are powerful tools within science, and not using
them within ontological models is a severe restriction.</p>
<p>We can describe this ontologically in three ways:</p>
<ul class="itemize">
<li>
<p>We can model the real world entities involved &ndash; beats, time and
    describe nothing else.</p>
</li>
<li>
<p>We can describe rate in mathematical terms. In this case, we are
    defining the heart rate as a mathematical abstraction.</p>
</li>
<li>
<p>We can model the heart rate as a real world entity, \(db/dt\) as a
    mathematical entity and explicitly state that \(db/dt\) is a model of
    heart rate.</p>
</li>
</ul>
<p>These different solutions present different advantages. The first is
  consistent with realism. The second is consistent with the most common
  definition used within science. The third is consistent with both but it is
  unclear when to use which term (for example, is \(\Delta {}b/\Delta{} t\) 
  an approximation of \(db/dt\), a quantification of the real world
  quality or both)?</p>
<p>In most cases for the description of science, the second option makes most
sense; conflating the mathematical model with the real entity enables us to
use the advantages of two different modelling techniques without introducing
the confusion of the third option.</p>
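<p>The gap between the counted quantity \(\Delta b/\Delta t\) and the modelled quantity \(db/dt\) can be illustrated numerically. The sketch below uses hypothetical beat timestamps and function names; the per-interval estimate is one common way of approximating an instantaneous rate, not the only one.</p>

```python
def mean_rate_bpm(beat_times_s, start_s, end_s):
    """Delta-b / delta-t: beats counted in [start_s, end_s), in beats per minute."""
    beats = sum(1 for t in beat_times_s if start_s <= t < end_s)
    return 60.0 * beats / (end_s - start_s)


def instantaneous_rate_bpm(beat_times_s):
    """One rate estimate per inter-beat interval: 60 / (t[i+1] - t[i])."""
    return [60.0 / (b - a) for a, b in zip(beat_times_s, beat_times_s[1:])]


# Hypothetical beat timestamps in seconds: steady at first, then faster.
beats = [0.0, 1.0, 2.0, 3.0, 3.5, 4.0, 4.5, 5.0]

print(mean_rate_bpm(beats, 0.0, 5.0))
print(mean_rate_bpm(beats, 0.0, 3.0), mean_rate_bpm(beats, 3.0, 5.0))
print(instantaneous_rate_bpm(beats))
```

<p>The answer depends on the window chosen: the whole record, its first three seconds and its last two seconds give three different rates for the same heart.</p>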
<h3 id="a0000000014">Related Examples</h3>
<p>There are many related examples from mechanics, electromagnetics or
chemistry; as with value partitions in medicine, so many that they appear to
be the norm. All of these subject areas have direct relevance to biology and,
perhaps even more so, to the equipment used in the practice of biology.</p>
<p>Mechanical examples would include velocity (\(dr/dt\)) and acceleration
(\(d^2r/dt^2\)). Electromagnetics would include current (\(dQ/dt\))
and capacitance (\(dQ/dV\)). Chemistry examples would include rate
constants and pH. In biology, population biology, systems biology and
neurosciences make wide use of mathematical models. The lack of a link in
realist ontologies to these mathematical models is not free from consequences
(described further in &ldquo;sec:discussion&rdquo;).</p>
<p>The more general issue comes not from relating to differential calculus,
but relating to pre-existing non-ontological techniques; for example, taxonomy
in the Linnaean sense. There have been many discussions about whether species
and higher taxa are reflective of reality; it is certainly the case that a
number of higher taxa do not reflect
phylogeny&nbsp;<span class="cite">[<a href="#Schulz2008">24</a>]</span>.
Given that it is of uncertain status, should we represent taxonomy as a
quality of an organism, an independent conceptualisation of the biologists or
both?</p>
<h2 id="a0000000015">The limits of consistency</h2>
<p>Physical biological entities such as cells and organisms have an extent in
the real world. This paper&rsquo;s first author, for example, has a height of
around 1.8m; a similar value cannot be applied meaningfully to the electronic
version of this document, although it may apply to the paper that it may be
printed on.</p>
<p>There are a number of different, well-understood mechanisms for
representing physical space. We can use a dimensional or cartesian model, with
three perpendicular lines with a linear scale. We can use a polar model,
expressing extent using angles and a single distance. Modern physics has told
us, however, that all of these are limited models of reality; physics
generally uses a four dimensional Minkowskian spacetime model; here the axes
are not linear; motion of the observer down one will change values down the
others. Alternatively, at a quantum level, length is a probability
distribution.</p>
<p>For the ontology builder, this leaves a difficult choice and the same
choice discussed previously in &ldquo;sec:colo-colo-models&rdquo;: Represent
the reality physicists relate; bless one, ignore the rest; describe their
components but not their models; explicitly describe them.</p>
<p>If the ontology builder is to be consistent, then, they should make the
same choice in both cases; if we describe colour models, we should explicitly
describe Minkowskian spacetime, quantum probability distributions, cartesian
and polar systems.</p>
<p>There are, however, two important differences to colour models. First,
there is a strong social bias toward cartesian systems. Secondly, within the
scope of biology and the life sciences, four dimensional spacetime or quantum
models confuse rather than simplify; the relativistic corrections produce such
small differences that they are statistically meaningless; similarly,
describing a leg as a probability distribution adds little other than
complexity.</p>
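<p>The size of the relativistic correction at biological speeds can be estimated directly. The sketch below computes \(\gamma - 1\), where \(\gamma = 1/\sqrt{1 - v^2/c^2}\) is the Lorentz factor; the speed of 30 m/s is a hypothetical figure chosen to stand for a fast-moving animal, and the function name is ours.</p>

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, exact by definition of the metre


def lorentz_excess(speed_m_s: float) -> float:
    """Return gamma - 1 for the given speed.

    Written with log1p/expm1 so that the tiny result is not lost to
    floating-point rounding.
    """
    beta_sq = (speed_m_s / SPEED_OF_LIGHT) ** 2
    return math.expm1(-0.5 * math.log1p(-beta_sq))


# Hypothetical speed for a fast-moving animal.
print(lorentz_excess(30.0))
```

<p>The result is of the order of \(10^{-15}\), far below the precision of any length or time measurement made in a biological experiment.</p>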
<p>This leaves the ontology builder with two options:</p>
<ol class="enumerate">
<li>
<p>We can build an ontology with a consistent relationship to reality. So,
    having decided to explicitly represent colour models, this suggests that
    we should also explicitly model 3D space, 4D spacetime and the various
    co-ordinate systems that are used to describe these.</p>
</li>
<li>
<p>We build an ontology with an inconsistent relationship to reality. So,
    we might be explicit about colour models, but arbitrarily bless 3
    dimensional space, using cartesian co-ordinates.</p>
</li>
</ol>
<p>The compromise here is very straight-forward. The first solution retains
its consistency to reality, the second is consistent with usability and usage;
for biomedicine, a 3D cartesian co-ordinate system plus time is likely to be
enough for the foreseeable future and makes life easier in the meantime.</p>
<p>The Newtonian view of the world is the best model in this case: it is good
enough. When building an ontology for biomedicine, it makes most sense to use
this view as it will produce the results required. If, in the future,
biomedicine advances so that relativistic or quantum representations are
necessary, then current ontologies will need refactoring; even then, this
future cost is likely to be offset by gains in the present.</p>
<h3 id="a0000000016">Related examples</h3>
<p>In the choice of units for measurement for scientific purposes, SI units
are to be preferred. It should be noted, here, that there is a domain
dependency; for an engineering ontology, the use of American imperial units
would be inevitable.</p>
<p>For most of biology it is unnecessary to distinguish between the length of
the calendar year and the astronomical year&mdash;the latter changing with
respect to variability in the motion of the earth. There are occasions when
this distinction may be important for data integration in bioinformatics as
leap years and leap seconds show.</p>
<p>An ecologist counting the number of trees in a sampling square 100m by
100m will take the area as 10,000m<sup>2</sup>; the surface is, however,
neither smooth nor a Euclidean plane, so this area is wrong in reality. For
much of ecology, this distinction will not matter. Again, there is a domain
dependency here; whale or bird biologists interested in migration patterns may
well care about the curvature of the earth.</p>
<h1 id="a0000000017">Discussion</h1>
<p>Realism has been held up as a methodology for &ldquo;good&rdquo;
ontological modelling, and the production of more tightly defined and
consistent ontologies. In this paper, we have discussed five different cases,
with biological examples, that we might wish to model ontologically; for each,
we have presented different models, describing the same underlying science. In
each case, a realist solution is possible, but places either limitations or
awkwardness on the models produced.</p>
<p>Building an ontology with a consistent relationship to reality may help to
enable
interoperability&nbsp;<span class="cite">[<a href="#Smith2007">7</a>]</span>
under some circumstances. If, however, it disallows modifications for
computability (see &ldquo;sec:work-around-comp&rdquo;), or requires arbitrary
blessing for one form of specification over another (see
&ldquo;sec:colo-colo-models&rdquo;) it may have the opposite effect.</p>
<p>Nor are the issues discussed in this paper free from consequences. In
&ldquo;sec:go-where-science&rdquo;, we discussed interoperability with
existing scientific models. Mathematics and physics have produced complex,
refined and expressive notation systems, representing a deep understanding of
how numbers and the physical world work. These are, however, not being used in
current ontologies and this results in a lack of precision, errors and
omissions:</p>
<dl class="description">
<dt>Lack of Precision:</dt>
<dd>
<p>The PATO term <b class="bfseries">speed</b> (PATO:8) which is defined
    as:</p>
<blockquote class="quote"><p>
      <i class="itshape">&ldquo;A physical quality inhering in a bearer by
      virtue of the bearer&rsquo;s rate of change of position&rdquo;</i>
    </p></blockquote>
<p>with a synonym of <tt class="ttfamily">velocity</tt>; from this
    definition, we cannot distinguish the vector and scalar quantities of
    velocity and speed; indeed, it is not clear which of these
    two <b class="bfseries">speed</b> (PATO:8) is.
    Meanwhile <b class="bfseries">acceleration</b> (PATO:1028) is defined
    as:</p>
<blockquote class="quote"><p>
      <i class="itshape">&ldquo;&hellip; the rate of change of the
      bearer&rsquo;s velocity in either speed or direction&rdquo;</i>
    </p></blockquote>
<p>which is implicitly a vector quantity, and contradicts the statement
    that speed and velocity are synonyms. The mathematical definitions
    (velocity as \(dr/dt\), speed \(\left|{dr/dt}\right|\),
    acceleration \(d^2r/dt^2\)) are precise, concise and accurate.</p>
</dd>
<dt>Errors:</dt>
<dd>
<p>Similarly, <b class="bfseries">length</b> (PATO:122) is defined as a
    quality; qualities have to inhere in <tt class="ttfamily">Independent
    Continuant</tt>s; as a <tt class="ttfamily">Spatial Region</tt> is a child
    of <tt class="ttfamily">Continuant</tt> but not of <tt class="ttfamily">Independent Continuant</tt>, this means
    that <tt class="ttfamily">Spatial Region</tt>s cannot
    bear <tt class="ttfamily">length</tt>s. In short, in current versions of
    BFO, there is no intuitive way of modelling the length of a region in
    space.</p>
</dd>
<dt>Omissions:</dt>
<dd>
<p>BFO is mass-centric; it is currently unclear where many physical
    entities exist, examples including energy, waves (through a medium) or EM
    radiation. Likewise, it lacks a natural position for numbers (that have no
    particulars), patterns and distributions. Yet, these entities are key to a
    physical description of the world.</p>
</dd>
</dl>
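<p>The distinction between speed and velocity raised under &ldquo;Lack of Precision&rdquo; can be made concrete with a numerical sketch. The bearer here is hypothetical: a point moving around a unit circle at one radian per second, with the derivatives approximated by finite differences.</p>

```python
import math


def position(t):
    # Hypothetical bearer moving on a unit circle at 1 radian per second.
    return (math.cos(t), math.sin(t))


def velocity(t, h=1e-5):
    # Central-difference approximation to dr/dt (a vector).
    (x0, y0), (x1, y1) = position(t - h), position(t + h)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))


def speed(t):
    # |dr/dt| (a scalar).
    return math.hypot(*velocity(t))


def acceleration(t, h=1e-4):
    # Central second difference approximating d^2r/dt^2 (a vector).
    (xm, ym), (x, y), (xp, yp) = position(t - h), position(t), position(t + h)
    return ((xp - 2 * x + xm) / h ** 2, (yp - 2 * y + ym) / h ** 2)


for t in (0.0, math.pi / 2):
    print(velocity(t), speed(t), acceleration(t))
```

<p>The speed stays at 1 throughout while the velocity changes direction, and the acceleration is non-zero; treating speed and velocity as synonyms cannot account for this.</p>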
<p>To our mind, these are indicative of some of the most serious flaws of
realism-based ontology building. It makes little sense to replicate the models
of physics using English instead of a more precise mathematical notation. If
BFO had been built using direct links to a grounded physical model of the
world, it seems likely that these problems would not have arisen.</p>
<p>We have discussed a number of concrete examples where building an ontology
by considering realist concerns has detrimental consequences for the model. We
believe that the real world entities and the relationships between them are
only one consideration among many: simplicity, usability, fitness for purpose
are equally important.</p>
<p>Taken to its most extreme form realism, it seems to these authors, would
produce models unsuitable for use within science. There is a choice between a
correct account of reality that does not allow the data of science to be
adequately described and a description of reality that takes into account how
science is performed. Fortunately, most &ldquo;realist&rdquo; ontologies are
not really so: PATO&rsquo;s representation of HSV for modelling colour is not a bad
decision; it represents a straight-forward, pragmatic approach to ontology
building, where the representation has been chosen on the basis of a use case,
not the entities as they exist in reality. Similarly BFO uses a 3D plus time
model of reality; it suggests that lengths are properties of the entity alone,
without reference to the observer. This is not a true reflection of reality,
but one which is a good enough approximation for use within the biomedical
sciences; in short, usability and simplicity have been considered to be more
important in the modelling process than the relationship of the model to
reality. In accepting these compromises, BFO has placed itself squarely as a
computational rather than philosophical ontology.</p>
<p>Despite these concerns, realism has made a contribution to the field of
biomedical ontology engineering. By emphasising the importance of real-world
entities and by encouraging a more specific interpretation than the
generalisation of a &ldquo;conceptualisation&rdquo;, realism helps to avoid
the introduction of unnecessary layers of abstraction. A consideration of the
entities in reality may be a part of an ontology engineering process; ontology
builders should have careful and considered reasons for diverging from
modelling in this way, and ontologies should explicitly describe through
annotations the terms that do or may diverge from this view. Ontology builders
should, however, be free to make this decision; the acceptance of compromise
with respect to reality will result in simpler and more effective knowledge
artefacts.</p>
<p>Johansson&nbsp;<span class="cite">[<a href="#Johansson2006">10</a>]</span>
when discussing realism asks the rhetorical question: &ldquo;would you like to
be treated for a physiological illness by a <em>(non-realist)</em> physician
who is not sure that there are human bodies?&rdquo; &ndash; (our emphasis). As
scientists, our reply would be if their survival and success statistics were
the best, we would not care whether they were a realist, a non-realist or a
robot which admitted of no philosophical position at all; also, using a doctor
who was strictly realist and thus cut off from much of the practice of science
(such as determining heart rate) would disturb many patients. As
bioinformaticians, we build ontologies to provide a descriptive and predictive
model of the wealth of experimental data that is now available. In biology,
the job of an ontologist is to describe data such that it can be analysed.
Naturally this entails a description of entities in reality; it also, however,
entails a description of science, and it entails compromise; we overlook this
to our peril. The last 200 years of science shows the success and strength of
this position; it is on this groundwork that we should build for the
future.</p>
<div>
<h1>Bibliography</h1>
<dl class="bibliography">
<dt>[<a name="Ashburner2000" id="Ashburner2000">1</a>]</dt>
<dd>
<p>Ashburner M, Ball C, Blake J, Botstein D, Butler H, et&nbsp;al.
      (2000) Gene Ontology: a tool for the unification of biology. The Gene
      Ontology Consortium. Nat Genet 25: 25&ndash;9.</p>
</dd>
<dt>[<a name="handbook2" id="handbook2">2</a>]</dt>
<dd>
<p>Stevens R, Lord P (2008) Application of ontologies in bioinformatics.
      In: Staab S, Studer R, editors, Handbook on Ontologies in Information
      Systems, Springer. Second edition.
      URL <a href="http://www.cs.man.ac.uk/~stevensr/papers/handbook2.pdf">http://www.cs.man.ac.uk/~stevensr/papers/handbook2.pdf</a>.</p>
</dd>
<dt>[<a name="Zeeberg2003" id="Zeeberg2003">3</a>]</dt>
<dd>
<p>Zeeberg B, Feng W, Wang G, Wang M, Fojo A, et&nbsp;al. (2003)
      GoMiner: a resource for biological interpretation of genomic and
      proteomic data. Genome Biol 4: R28.</p>
</dd>
<dt>[<a name="Wolstencroft2006" id="Wolstencroft2006">4</a>]</dt>
<dd>
<p>Wolstencroft K, Lord P, Tabernero L, Brass A, Stevens R (2006)
      Protein classification using ontology classification. Bioinformatics 22:
      e530&ndash;538.</p>
</dd>
<dt>[<a name="Lord2003" id="Lord2003">5</a>]</dt>
<dd>
<p>Lord PW, Stevens RD, Brass A, Goble CA (2003) Investigating semantic
      similarity measures across the gene ontology: the relationship between
      sequence and annotation. Bioinformatics 19: 1275&ndash;1283.</p>
</dd>
<dt>[<a name="Whetzel2006a" id="Whetzel2006a">6</a>]</dt>
<dd>
<p>Whetzel PL, Parkinson H, Causton HC, Fan L, Fostel J, et&nbsp;al.
      (2006) The MGED Ontology: a resource for semantics-based description of
      microarray experiments. Bioinformatics 22: 866&ndash;873.</p>
</dd>
<dt>[<a name="Smith2007" id="Smith2007">7</a>]</dt>
<dd>
<p>Smith B, Ashburner M, Rosse C, Bard J, Bug W, et&nbsp;al. (2007) The
      OBO Foundry: coordinated evolution of ontologies to support biomedical
      data integration. Nat Biotechnol 25: 1251&ndash;1255.</p>
</dd>
<dt>[<a name="OBOFoundry2006" id="OBOFoundry2006">8</a>]</dt>
<dd>
<p>OBO Foundry Consortium (2006). OBO Foundry
      Principles. <a href="http://obofoundry.org/wiki/index.php/OBO_Foundry_Principles">http://obofoundry.org/wiki/index.php/OBO_Foundry_Principles</a>.</p>
</dd>
<dt>[<a name="OBOFoundry2008" id="OBOFoundry2008">9</a>]</dt>
<dd>
<p>OBO Foundry Consortium (2008). OBO Foundry
      Principles. <a href="http://obofoundry.org/wiki/index.php/OBO_Foundry_Principles">http://obofoundry.org/wiki/index.php/OBO_Foundry_Principles</a>.</p>
</dd>
<dt>[<a name="Johansson2006" id="Johansson2006">10</a>]</dt>
<dd>
<p>Johansson I (2006) Bioinformatics and biological reality. J Biomed
      Inform 39: 274&ndash;287.</p>
</dd>
<dt>[<a name="Grenon2004" id="Grenon2004">11</a>]</dt>
<dd>
<p>Grenon P, Smith B, Goldberg L (2004) Biodynamic ontology: applying
      BFO in the biomedical domain. Stud Health Technol Inform 102:
      20&ndash;38.</p>
</dd>
<dt>[<a name="smith2004beyond" id="smith2004beyond">12</a>]</dt>
<dd>
<p>Smith B (2004) Beyond concepts: ontology as reality representation.
      In: Formal ontology in information systems: proceedings of the third
      conference (FOIS-2004). Ios Pr Inc, p.&nbsp;73.</p>
</dd>
<dt>[<a name="Lord2009" id="Lord2009">13</a>]</dt>
<dd>
<p>Lord P (2009) An Evolutionary Approach to Function. In:
      Bio-Ontologies 2009: Knowledge in Biology.
      URL <a href="http://hdl.handle.net/10101/npre.2009.3228.1">http://hdl.handle.net/10101/npre.2009.3228.1</a>.</p>
</dd>
<dt>[<a name="Smith2005" id="Smith2005">14</a>]</dt>
<dd>
<p>Smith B, Ceusters W, Klagges B, K&ouml;hler J, Kumar A, et&nbsp;al.
      (2005) Relations in biomedical ontologies. Genome Biol 6: R46.</p>
</dd>
<dt>[<a name="Russell1946" id="Russell1946">15</a>]</dt>
<dd>
<p>Russell B (1946) A History of Western Philosophy. Routledge.</p>
</dd>
<dt>[<a name="Merrill2010" id="Merrill2010">16</a>]</dt>
<dd>
<p>Merrill G (2010) Ontological realism: methodology or misdirection.
      Applied Ontology 5: 79&ndash;108.</p>
</dd>
<dt>[<a name="Dumontier2010" id="Dumontier2010">17</a>]</dt>
<dd>
<p>Dumontier M, Hoehndorf R (2010) Realism for scientific ontologies.
      In: 6th International Conference on Formal Ontology in Information
      Systems.</p>
</dd>
<dt>[<a name="Gruber1992" id="Gruber1992">18</a>]</dt>
<dd>
<p>Gruber T (1992). What is an ontology?
      URL <a href="http://www-ksl.stanford.edu/kst/what-is-an-ontology.html">http://www-ksl.stanford.edu/kst/what-is-an-ontology.html</a>.</p>
</dd>
<dt>[<a name="Ceusters2006" id="Ceusters2006">19</a>]</dt>
<dd>
<p>Ceusters W, Smith B (2006) A realism-based approach to the evolution
      of biomedical ontologies. AMIA Annu Symp Proc : 121&ndash;125.</p>
</dd>
<dt>[<a name="Shrager2003" id="Shrager2003">20</a>]</dt>
<dd>
<p>Shrager J (2003) The fiction of function. Bioinformatics 19:
      1934&ndash;1936.</p>
</dd>
<dt>[<a name="Seyed2009" id="Seyed2009">21</a>]</dt>
<dd>
<p>Seyed AP (2009) BFO/DOLCE Primitive Relation Comparison. In:
      BioOntologies 2009: Knowledge in Biology.</p>
</dd>
<dt>[<a name="rector2005" id="rector2005">22</a>]</dt>
<dd>
<p>Rector A (2005). Representing specified values in owl: &ldquo;value
      partitions&rdquo; and &ldquo;value sets&rdquo;. W3C Working Group Note.
      URL <a href="http://www.w3.org/TR/swbp-specified-values/">http://www.w3.org/TR/swbp-specified-values/</a>.</p>
</dd>
<dt>[<a name="egana2008" id="egana2008">23</a>]</dt>
<dd>
<p>Egana M, Rector A, Stevens R, Antezana E (2008) Applying Ontology
      Design Patterns in Bio-ontologies, Springer Berlin/Heidelberg. pp.
      7&ndash;16.</p>
</dd>
<dt>[<a name="Schulz2008" id="Schulz2008">24</a>]</dt>
<dd>
<p>Schulz S, Stenzhorn H, Boeker M (2008) The ontology of biological
      taxa. Bioinformatics 24: i313&ndash;i321.</p>
</dd>
</dl>
</div>
</div> <!-- kcite-section 1713 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/07/realism-and-science/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Leaving BFO Discuss</title>
		<link>http://www.russet.org.uk/blog/2010/03/leaving-bfo-discuss/</link>
		<comments>http://www.russet.org.uk/blog/2010/03/leaving-bfo-discuss/#comments</comments>
		<pubDate>Tue, 23 Mar 2010 10:50:00 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1632</guid>
		<description><![CDATA[Introduction A few weeks ago I unsubscribed from the BFO discuss mailing list. I&#8217;ve been reading and posting there since March 2007; in that time I&#8217;ve managed to send 492 mail messages which surprises even me. As a mailing list, BFO discuss is a slightly bruising experience: it&#8217;s a bit like a bar fight; one [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1632">
<hr /> 
<h2><a name="_introduction"></a>Introduction</h2>
<p>A few weeks ago I unsubscribed from the BFO discuss <a href="http://groups.google.com/group/bfo-discuss">mailing list</a>. I&#8217;ve been reading and posting there since March 2007; in that time I&#8217;ve managed to send <a href="http://groups.google.com/groups/profile?show=more&amp;enc_user=QA5SShwAAACILHBRb8Eg-ALotJYU_6N2amAs_YdJkcjhgBJtEbF5Ig&amp;group=bfo-discuss">492</a> mail messages, which surprises even me. As a mailing list, BFO discuss is a slightly bruising experience: it&#8217;s a bit like a bar fight; one person swings a punch and everyone just piles in. I joined the mailing list because BFO has become something of a force within the bio-ontology community and I wanted to help make sure it was fit for purpose; however, I have to admit that I have been as guilty of reaching for the nearest available pool cue as the next ontologist. Not the best side of me, but there you have it.</p>
<p>During my time on the mailing list, I have learnt a lot about BFO and the realist philosophy that, in theory, underpins it. Actually, BFO is not at all bad; for me, though, realism is largely without merit. One of the main difficulties with realism is that it carries with it the idea that, by thinking very hard, you can come up with a &#8220;representation of reality&#8221;. I think that this is mistaken. As scientists, we should be wary of thinking too much; our role, whenever possible, is to think just enough to get us to the start of the next experiment. This doesn&#8217;t seem to happen with BFO; in the time that I have been on the mailing list, BFO itself has changed very little; the constant feedback and iteration to accommodate new knowledge and experience is largely not happening. I have qualms with many parts of BFO (for example, I have discussed the issues with the <tt>Realizable Entity</tt> <a href="http://precedings.nature.com/documents/3228/version/1">hierarchy</a>). However, for me, the worst outcomes of the philosophical approach have happened as a result of not considering the advanced models that physics has produced to explain the experimental data that we see. I give four examples.</p>
<hr /> 
<h2><a name="_length_in_space"></a>Length in Space</h2>
<p>BFO makes a very high-level split between <tt>Independent</tt> and <tt>Dependent Continuants</tt>. A continuant is something that persists over time, but which exists in full for this entire time: my computer or me, for instance, as opposed to a process, not all of which exists at any point in time. The distinction between an independent and dependent continuant depends on whether this entity exists on its own; for my height, a dependent continuant, to exist, I also have to exist. Once I cease to exist, so does my height. This seems okay, but in tying physical dimensions to an independent continuant, BFO has made a fundamental error: how do we express the length of a <tt>Spatial Region</tt>? Length is a dependent continuant and, so, there must be an independent continuant in which it <tt>inheres</tt>. Unfortunately, <tt>Spatial Region</tt> is not an independent continuant itself.</p>
<p>There are solutions, of course; we can think of another relation, other than <tt>inheres</tt>, to link <tt>Spatial Region</tt> and <tt>Length</tt>. But, we still need an Independent Continuant to exist that this length <tt>inheres</tt> in. Another possibility is to describe the length of a spatial region as the length of an Independent Continuant that could exist in it. But, it is easy to think of Spatial Regions in which no Independent Continuant can exist (for example, the Spatial Region 1m longer than the longest object in the universe). BFO would be modelling the world backward; physics uses a coordinate system and places objects within that; this approach would use objects to define the coordinate system.</p>
<p>Currently, this problem seems to have been <a href="http://groups.google.com/group/bfo-discuss/msg/6ef551a3d7767679">accepted</a> by some of the authors of BFO; however, there is no solution. If BFO had started from the mathematical models of physics, to me it seems likely that we would not be in this position.</p>
<hr /> 
<h2><a name="_change_in_process"></a>Change in Process</h2>
<p>BFO suggests that <tt>Occurrents</tt> (such as a process) can have properties in a similar way that independent continuants can have qualities. I have a length, a process may have a duration. However, BFO suggests that the properties of an Occurrent cannot change; rather, there must be a new Occurrent.</p>
<p>Again, this makes little sense, and ignores very simple physical examples. Consider, for example, a car first travelling at 10ms<sup>-1</sup>, then 20ms<sup>-1</sup>. Consider the process of motion. BFO would have us model this as 3 processes; car moving at 10ms<sup>-1</sup>, car moving at 20ms<sup>-1</sup> and a single motion process of which the other two are part.</p>
<p>For a simple example, this style of modelling may work. However, consider the earth travelling around the sun. The problem is that the motion is continually changing; the earth&#8217;s velocity changes infinitesimally toward the sun, so it&#8217;s always accelerating. Worse, the acceleration also changes infinitesimally, as the earth&#8217;s location relative to the sun changes. So, to model this in BFO, we need an infinite number of processes (for both the motion and acceleration). We could argue that while the velocity and acceleration change constantly, the angular velocity and speed of the earth are constant, so why not model the process in these terms? Unfortunately, even this is not true; the earth moves in an ellipse, not a circle, even if it&#8217;s very close to a circle. So, the angular velocity and speed change continually also.</p>
<p>The physics of this is, as I have said, straightforward. The earth&#8217;s motion has a velocity and acceleration expressed as (nearly) two sine waves along the two axes.</p>
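<p>As a rough illustration (a sketch of the idealised case, not anything taken from BFO or the mailing list), the two sine waves can be written down in a few lines of Python. The names <tt>orbit_state</tt>, <tt>RADIUS</tt> and <tt>OMEGA</tt> are made up for this example, the numbers are approximate, and the orbit is assumed to be a perfect circle, which is only nearly true.</p>

```python
import math

RADIUS = 1.496e11                        # assumed orbital radius in metres (about 1 AU)
OMEGA = 2 * math.pi / (365.25 * 86400)   # assumed angular speed in rad/s (one turn per year)

def orbit_state(t):
    """Return position, velocity and acceleration as (x, y) pairs at time t in seconds."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    position = (RADIUS * c, RADIUS * s)
    velocity = (-RADIUS * OMEGA * s, RADIUS * OMEGA * c)              # dr/dt
    acceleration = (-RADIUS * OMEGA**2 * c, -RADIUS * OMEGA**2 * s)   # d2r/dt2
    return position, velocity, acceleration

# One continuous description covers every instant; no per-instant process is needed.
for day in (0, 91, 182):
    r, v, a = orbit_state(day * 86400)
    print(day, math.hypot(*v), math.hypot(*a))
```

<p>Each component of the velocity and of the acceleration is a sine wave in time, and the acceleration always points back along the position vector toward the sun. An elliptical orbit needs a slightly more involved function, but still only one.</p>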
<hr /> 
<h2><a name="_rate_of_change"></a>Rate of Change</h2>
<blockquote><p>In order to get to the subtleties in a clearer fashion, we remind you of a joke which you surely must have heard. At the point where a lady in a car is caught by a cop, the cop comes up to her and says, &#8220;Lady, you were going 60 miles an hour!&#8221; She says, &#8220;That&#8217;s impossible, sir, I was travelling only seven minutes. It is ridiculous &#8211; how can I go 60 miles an hour when I wasn&#8217;t going an hour?&#8221;</p>
<p align="right"> &#8212; Richard Feynman </p>
</blockquote>
<p>In a short, recent thread, it appears that there has been discussion on those qualities that need a <a href="http://groups.google.com/group/bfo-discuss/msg/cc3880bd24ad5380">period of time</a> to have meaning. The examples given include velocity and acceleration. But does this make any sense? It is certainly the case, as the Feynman quote shows, that the definition of velocity is not obvious. But it&#8217;s also a known issue. Feynman&#8217;s story shows that it can be very hard to describe exactly what you mean when <em>talking</em> about velocity; it&#8217;s for this reason that physics uses mathematical notation, where we can be precise. Velocity is \(dr/dt\), acceleration is \(d^{2}r/dt^{2}\). As I have <a href="http://groups.google.com/group/bfo-discuss/msg/4d1a576292c25a00">said</a>, these examples do not stand alone&#8201;&#8212;&#8201;the same applies to many other qualities, including those where change is not over time.</p>
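<p>Spelt out, the notation says the following. The last line is a worked version of the joke, under the assumption (mine, for illustration) that the lady drove at a constant speed and covered 7 miles in her seven minutes.</p>

```latex
\[ v(t) = \frac{dr}{dt} = \lim_{\Delta t \to 0} \frac{r(t + \Delta t) - r(t)}{\Delta t} \]
\[ a(t) = \frac{dv}{dt} = \frac{d^{2}r}{dt^{2}} \]
\[ \frac{7\ \text{miles}}{7/60\ \text{hours}} = 60\ \text{miles per hour} \]
```

<p>The limit is what removes the need for any particular period of time: velocity is defined at an instant, and the interval used to measure it can be as short as we like.</p>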
<p>In short, it makes little sense to create distinctions in our physical model of the world that physics does not make. We are creating work for ourselves and confusion for everyone else.</p>
<hr /> 
<h2><a name="_absolute_space"></a>Absolute Space</h2>
<p>BFO distinguishes between <tt>Sites</tt> and <tt>SpatialRegions</tt>; the idea is to distinguish between bits of space in general, and holes&#8201;&#8212;&#8201;the lumen of the gut, for instance. This seems reasonable at first sight. However, this is being done by suggesting that a <tt>Site</tt> is relative to an <tt>IndependentContinuant</tt> while <tt>SpatialRegions</tt> are absolute.</p>
<p>In short, over 100 years after Michelson-Morley, BFO has reinvented absolute space. The justification for this is that, according to one of the authors, without absolute space, <a href="http://groups.google.com/group/bfo-discuss/msg/5cc343b4bc59d6f9">problems</a> arise. The problems haven&#8217;t been described in detail, but apparently, involve things moving through space or changing shape.</p>
<p>BFO is put forward as a &#8220;realist&#8221; ontology&#8201;&#8212;&#8201;that is, it models the key entities as they exist in reality. And, the reality is this; there is no evidence that absolute space exists and, indeed, very strong evidence that it does not. It is also hard to see how this could cause problems; Einstein removed absolute space from the model that physics uses a century ago. Now, admittedly, this produces some really weird and counter-intuitive results, but only when two objects are moving rapidly with respect to each other. Relativity does not cause any problems that are not necessary to describe the world. In practice for &#8220;everyday&#8221; physics, the upshot is that you just define (or assume) a frame of reference; there is normally an obvious one, but any frame will do, and the results will come out the same.</p>
<p>My <a href="http://groups.google.com/group/bfo-discuss/msg/b9af427fb23689e0">post</a> on this produced some interesting replies. <a href="http://groups.google.com/group/bfo-discuss/msg/5ffd0249ff4f547a">Bjoern Peters</a> straightforwardly agreed. <a href="http://groups.google.com/group/bfo-discuss/msg/d744dd81bc9b926b">Alan Ruttenberg</a> suggested that I was arguing space doesn&#8217;t exist; while <a href="http://groups.google.com/group/bfo-discuss/msg/a3ec1c054a82fa62">Barry Smith</a> argued that having this (false!) distinction in BFO is necessary for practical reasons.</p>
<p>At which point, I unsubscribed.</p>
<hr /> 
<h2><a name="_conclusions"></a>Conclusions</h2>
<p>I am not arguing here that BFO is totally broken or has no purpose. To some extent, I am yet to be convinced that having any upper ontology helps with ontology building: arguing against, they are hard to understand and often result in a top-down design which ends in philosophical arguments and analysis paralysis; arguing for, they provide some basic structure or a design pattern, which can ease the task of starting to build an ontology, or to understand someone else&#8217;s. I am unsure yet whether they help with (computational) interoperability; by analogy to software, design patterns are good for the developer but do not provide any more guarantees. In general, though, I work on the basis that the use of a common framework seems a sensible idea; it is something we should try until we have enough data to make a more coherent decision. BFO provides one such basic framework; and, in general, it&#8217;s okay so long as we do not take it too seriously. We should be willing to ignore it when it fails.</p>
<p>However, realism has much less going for it. It is based on the conceit that we should look at reality; now, within a scientific context, this means experimental data. The statement that science should use experimental data, though, is obvious and is a truism; it cannot, therefore, itself define a methodology.</p>
<p>In practice, however, BFO has been built leaning on 2000 years of philosophy; and here lies the mistake. We should acknowledge our limitations as ontologists; we have nothing at all to add to a physical model of the universe as the physicists have already done it. All we need is to represent their model; we should not be looking at experimental data, because someone else has already done it for us. The problems described here are all avoided by the simple mathematical model that physics uses&#8201;&#8212;&#8201;4 dimensions, or real number lines, at 90 degrees to each other, and by the use of calculus to describe change.</p>
<p>In BFO, we see an attempt to consider the key entities as they exist in reality; and, the bottom line here, is that at least for these few classes, BFO has done a bad job of it. It has misunderstood lengths and space, developed a process model that is unmanageable and made distinctions that are known to be wrong. Biology is built on top of the other sciences, and it will not benefit the cause of bio-ontologies if we ignore them. Worse, biologists attempting to use BFO will find it hard to apply models which are demonstrably wrong; what criteria can we apply to distinguish <tt>SpatialRegions</tt> and <tt>Sites</tt>, when physics tells us that these criteria do not and cannot exist? Finally, as ontologists, we should accept our limitations and the limitations of the technology; we should not attempt to re-represent knowledge which has already been modelled in more appropriate ways.</p>
<p>We should be experimenting and testing more than we are thinking; we should be embracing change when we are wrong. We should be leaning on 200 years of physics and biology, not 2000 years of philosophy.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1632 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/03/leaving-bfo-discuss/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>To Bio-Ontologies 2009</title>
		<link>http://www.russet.org.uk/blog/2009/06/to-bio-ontologies-2009/</link>
		<comments>http://www.russet.org.uk/blog/2009/06/to-bio-ontologies-2009/#comments</comments>
		<pubDate>Fri, 26 Jun 2009 21:51:15 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1287</guid>
		<description><![CDATA[So, this year of Bio-Ontologies is upon me; I&#8217;m sitting in the airport waiting to fly in the wrong direction; although I&#8217;ve noticed that the airport signs no longer call this &#8220;waiting time&#8221; but &#8220;shopping time&#8221;. It&#8217;s 12 years on now; I can&#8217;t remember whether this makes it the oldest SIG at ISMB, but it [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1287">
<p>So, this year of Bio-Ontologies is upon me; I&#8217;m sitting in the airport waiting to fly in the wrong direction; although I&#8217;ve noticed that the airport signs no longer call this &#8220;waiting time&#8221; but &#8220;shopping time&#8221;.</p>
<p>It&#8217;s 12 years on now; I can&#8217;t remember whether this makes it the oldest SIG at ISMB, but it must be close. Perhaps it is surprising that a small meeting like this has lasted so long, but during its time the use of ontologies within biology has blossomed; to some extent, this is true of the outside world also. This year has carried on with the trend. Gone are the days when we got only enough papers to fill the day; we&#8217;ve stretched the day out, we&#8217;ve added a poster session, but still we get more. The number of attendees has gone up somewhat also. It&#8217;s good to see.</p>
<p>For me, bio-ontologies has also been the centre of my entry into the field; Edmonton was the first ontology paper that I ever presented &#8212; perhaps depressingly, still some of my best work. This year has special significance for me. I&#8217;m giving a paper myself for the first time since Edmonton; perhaps fitting to end off as I began, because this will also be my last year as conference chair. I&#8217;ve been involved now for 6 of the 12 years; while I&#8217;ve enjoyed it and felt privileged to do the work, it&#8217;s enough. Organising is hard work, even now when I understand the process well. In the last few years, I&#8217;ve tried to push the workshop to be a bit broader than just ontologies, to take in all new forms and technologies for representing and distributing knowledge; I&#8217;ve met with some, but limited, success. A workshop with a 12-year pedigree takes some time to move. I was heartened to see that it was the first SIG to get a subject on the official conference friendfeed. Web 2.0 is upon us. With luck, this will become a bigger part of the meeting. If so, this will be other people&#8217;s achievement, not mine. Did I mention that this is my last year?</p>
<p>I&#8217;m looking forward to giving my paper on functions and roles in ontologies. One of the more minor reasons for retiring is that it&#8217;s easier to publish in a workshop which you are not organising. I&#8217;m surprisingly nervous about the talk; probably as much so as in Edmonton. I&#8217;ve been practicing the talk incessantly, to the point that my back is complaining from too much sitting. It&#8217;s my first ever single author paper. I&#8217;m hoping that people will like the paper; its message is simple and straightforward. Of course, this doesn&#8217;t mean that it&#8217;s correct. Last year&#8217;s paper on a similar topic caused quite a fuss (which, let&#8217;s be honest, was partly my fault) and I know that some in the audience will be quite vehement in their opposition to mine. Even though I&#8217;ve been over it so many times, I have the back-of-my-mind fear that there is a big hole that I&#8217;ve missed.</p>
<p>I guess this is good; it means that I&#8217;m excited about my own paper in a way that I haven&#8217;t been for years. A bit of fuss will mean that other people are too, for good or for ill. In the end, I&#8217;ll probably be most disappointed if the paper goes with a whimper not a bang.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1287 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/06/to-bio-ontologies-2009/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing Omnencap</title>
		<link>http://www.russet.org.uk/blog/2009/06/introducing-omnencap/</link>
		<comments>http://www.russet.org.uk/blog/2009/06/introducing-omnencap/#comments</comments>
		<pubDate>Fri, 19 Jun 2009 20:27:26 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1269</guid>
		<description><![CDATA[Ah, it goes on and on. After my last attempt at literate OWL programming, called omnsplit, I decided that there was a problem; this version splits the OWL file into individual statements, and puts them into files with the same name as the OWL class (property, or whatever). The problem is that, for an ontology [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1269">
<p>Ah, it goes on and on. After my last attempt at literate OWL programming, called <a href="http://www.russet.org.uk/blog/2009/06/introducing-omnsplit/">omnsplit</a>, I decided that there was a problem; this version splits the OWL file into individual statements, and puts them into files with the same name as the OWL class (property, or whatever).</p>
<p>The problem is that, for an ontology like OBI, you get 1400 individual files; this is just inconvenient as many applications don&#8217;t like this many files in a directory. Also, there is a naming constraint; you can only use characters legal in the file system; this doesn&#8217;t include &#8220;:&#8221; if you want to be Windows (NTFS) compliant.</p>
<p>So, for my new system, I decided to generate an index file, which just points at locations in the ontology file. Initially, I was just going to index the main ontology file; in the end, I decided a partial copy was the way forward; generating both the index and indexed file ensures that they will stay in sync.</p>
<p>It required a bit of nasty latex hacking; the basic problem was avoiding the limitation of being only able to use legal LaTeX macro characters (that is, letters). The system now works like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt>
<i><font color="#9A1900">%% This is generated by python which also generates the</font></i>
<i><font color="#9A1900">%% function_ont.spt file which is a copy of the ontology (with a</font></i>
<i><font color="#9A1900">%% few new lines gone).</font></i>

<i><font color="#9A1900">%% This just defines a new macro in what appears to be an</font></i>
<i><font color="#9A1900">%% unnecessarily complex way.</font></i>
<b><font color="#0000FF">\expandafter\def\csname</font></b> OmnEntityHeaderheader<b><font color="#0000FF">\endcsname</font></b><i><font color="#9A1900">%</font></i>
<font color="#009900">{\lstinputlisting[language=omn,firstline=1,lastline=8]{function_ont.spt}}</font>

<i><font color="#9A1900">%% But the use of \expandafter and \csname means that you can</font></i>
<i><font color="#9A1900">%% use any character you like, including underscores and numbers</font></i>
<i><font color="#9A1900">%% in the macro name.</font></i>
<b><font color="#0000FF">\expandafter\def\csname</font></b> OmnEntityObjectPropertyhas_role<b><font color="#0000FF">\endcsname</font></b><i><font color="#9A1900">%</font></i>
<font color="#009900">{\lstinputlisting[language=omn,firstline=206,lastline=219]{function_ont.spt}}</font>

<i><font color="#9A1900">%% We can now define two commands in the style file. Again</font></i>
<i><font color="#9A1900">%% we use \csname so that we are not bound to characters legal</font></i>
<i><font color="#9A1900">%% in latex macros.</font></i>
<b><font color="#0000FF">\newcommand</font></b><font color="#009900">{\omnclass}</font><font color="#993399">[2]</font><font color="#009900">{\csname OmnEntityClass#1#2\endcsname}</font>
<b><font color="#0000FF">\newcommand</font></b><font color="#009900">{\omnobjprop}</font><font color="#993399">[2]</font><font color="#009900">{\csname OmnEntityObjectProperty#1#2\endcsname}</font>

<i><font color="#9A1900">%% now in our source, we can do things like this.</font></i>
<b><font color="#0000FF">\omnobjprop</font></b><font color="#009900">{}{has_role}</font>

</tt></pre>
</pre>
</td>
</tr>
</table>
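<p>The listing only shows the generated LaTeX. For the generating side, here is a minimal Python sketch of how such an index might be produced; it is an illustration under assumptions rather than the actual script. The names <tt>build_index</tt> and <tt>FRAME</tt> are invented for the example, the list of frame keywords is partial, and the header macro is left out.</p>

```python
import re

# Keywords that start a new entity in Manchester syntax (a partial, assumed list).
FRAME = re.compile(r"^(Class|ObjectProperty|DataProperty|Individual):\s*(\S+)")

def build_index(omn_lines, listing_file):
    """Return LaTeX macro definitions, one per entity, pointing at line ranges."""
    starts = []
    for number, line in enumerate(omn_lines, start=1):
        match = FRAME.match(line)
        if match:
            # Assumption: drop ":" so that prefix and name simply run together.
            starts.append((number, match.group(1), match.group(2).replace(":", "")))
    macros = []
    for position, (first, kind, name) in enumerate(starts):
        # An entity runs until the line before the next one, or to the end of the file.
        if position + 1 == len(starts):
            last = len(omn_lines)
        else:
            last = starts[position + 1][0] - 1
        macros.append(
            "\\expandafter\\def\\csname OmnEntity%s%s\\endcsname%%\n"
            "{\\lstinputlisting[language=omn,firstline=%d,lastline=%d]{%s}}"
            % (kind, name, first, last, listing_file)
        )
    return macros

example = [
    "Prefix: obo: ...",
    "",
    "ObjectProperty: has_role",
    "    Domain: Thing",
    "",
    "Class: Function",
    "    SubClassOf: Thing",
]
print("\n\n".join(build_index(example, "function_ont.spt")))
```

<p>Because the line numbers are computed from the same lines that would be written to the copied file, the index and the indexed file cannot drift apart.</p>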
<p>Using an index in this way also has another advantage. I&#8217;ve had to make a decision whether to go with rdfs:label or the entity name. I can now back out of this; I can just use both in the index file, without too much extra space, so that either would be referencable within the latex.</p>
<p>To me, this feels like the right solution. It&#8217;s relatively simple (with a bit of nasty latex, which is nicely hidden), it doesn&#8217;t depend on the file system. It needs a bit more work to bring it to completion, but not that much.</p>
<p>Sadly <a href="http://bio-ontologies.org.uk/">bio-ontologies</a> looms, so next week will be getting ready for that; perhaps I can finish this off on the way back. &#8220;Sadly&#8221; is perhaps a poor choice of words; I&#8217;m greatly looking forward to it, but I&#8217;ve kind of had the bit between my teeth with python and latex hacking for the last few weeks.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1269 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/06/introducing-omnencap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing omnsplit</title>
		<link>http://www.russet.org.uk/blog/2009/06/introducing-omnsplit/</link>
		<comments>http://www.russet.org.uk/blog/2009/06/introducing-omnsplit/#comments</comments>
		<pubDate>Wed, 17 Jun 2009 12:02:27 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1258</guid>
		<description><![CDATA[After a bit of struggle, I now have another literate OWL tool working, along the lines discussed in a previous blog post. Rather than generating the OWL documentation, I now split a Manchester syntax file up, so that I can refer to bits of it. I have this working with OBI, using Protege to produce [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1258">
<p>After a bit of struggle, I now have another literate OWL tool working, along the lines discussed in a <a href="http://www.russet.org.uk/blog/2009/06/literate-omn/">previous</a> blog post. Rather than generating the OWL documentation, I now split a Manchester syntax file up, so that I can refer to bits of it. I have this working with OBI, using Protege to produce a single merged ontology file, in Manchester syntax.</p>
<p>The current implementation is rather simple; it produces one file-per-entity in the OWL file which I don&#8217;t think is entirely good. When run on OBI, it creates over 1400 files which is a lot. The other problem is that I&#8217;ve had to do some dubious hacking to get the file names to work out. I have to remove spaces and &#8220;\&#8221;&#8217;s, as well as &#8220;:&#8221; which is illegal on NTFS.</p>
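<p>The file-name hack amounts to something like the following Python. This is a simplified sketch rather than the code in the repository; the function name <tt>entity_to_filename</tt> and the <tt>.omn</tt> suffix are assumptions made for the example, as is the reading that the characters to strip are spaces, backslashes and colons.</p>

```python
import re

def entity_to_filename(entity_name, suffix=".omn"):
    """Turn an OWL entity name into a file name that is safe on NTFS."""
    # Strip spaces, backslashes and colons from the name.
    cleaned = re.sub(r"[ \\:]", "", entity_name)
    return cleaned + suffix

print(entity_to_filename("obo:OBI 0000070"))
```

<p>Part of what makes this dubious is that two different entity names can collapse to the same file name once characters are removed.</p>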
<p>There&#8217;s also a problem with some of the OWL. Unfortunately, the OBI to OWL conversion process has a reification step which I don&#8217;t quite understand the purpose of. This comes out as this sort of anonymous individual. I&#8217;m not sure at all how the definition has come out as the rdfs:label, but, for sure, you can&#8217;t use this as a filename!</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Individual</font></b>: relationship:genid7

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"C located_in C' if and only if: given any c that</font>
<font color="#FF0000">instantiates C at a time t, there is some c' such that: c' instantiates</font>
<font color="#FF0000">C' at time t and c *located_in* c'. (Here *located_in* is the</font>
<font color="#FF0000">instance-level location relation.)"</font>@en,
        oboInOwl:hasDbXref relationship:genid8

    <font color="#990000">Types:</font>
        oboInOwl:Definition
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>I think I might change the implementation a bit, though. Having 1400 files in one directory is not good. My idea is to serialize the entire file out as latex, with lots of macros, autogenerated.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><i><font color="#9A1900">%% this would appear in the generated file</font></i>
<b><font color="#0000FF">\newcommand</font></b><font color="#009900">{\OwlClassowlthing}</font>{
  <b><font color="#0000FF">\begin</font></b><font color="#009900">{omn}</font>
Class: owl:Thing
  <b><font color="#0000FF">\end</font></b><font color="#009900">{omn}</font>
}

<i><font color="#9A1900">%% then in your latex file you would do</font></i>
<b><font color="#0000FF">\owlclass</font></b><font color="#009900">{owl}{Thing}</font>

<i><font color="#9A1900">%% which would just resolve to the class above</font></i>
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>The only worry with this is that latex would then have to read a large file, even if most of the macros are not used. This might be really, really slow. Well, we can but try.</p>
<p>As before, the current version is available at <tt>git://github.com/phillord/literate_omn.git</tt>.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1258 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/06/introducing-omnsplit/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Literate OMN</title>
		<link>http://www.russet.org.uk/blog/2009/06/literate-omn/</link>
		<comments>http://www.russet.org.uk/blog/2009/06/literate-omn/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 16:00:18 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1213</guid>
		<description><![CDATA[Well, after a reasonable degree of struggle, I managed to get the first version of my literate OWL system working. As well as learning python, I&#8217;ve had a go with git; my repo is hosted on github at git://github.com/phillord/literate_omn.git. There are three components. omnextract.py this pulls out all the referenced omn files from the TeX [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1213">
<p>Well, after a reasonable degree of struggle, I managed to get the first version of my literate OWL system working. As well as learning python, I&#8217;ve had a go with git; my repo is hosted on github at <tt>git://github.com/phillord/literate_omn.git</tt>. There are three components.</p>
<table cellpadding="4"> 
<tr valign="top"> 
<td> <strong>omnextract.py</strong> </td>
<td> this pulls out all the referenced <tt>omn</tt> files from the TeX document and produces the complete omn file. </td>
</tr>
<tr valign="top"> 
<td> <strong>omn.sty</strong> </td>
<td> this is a driver for the listings package which does syntax highlighting in TeX. </td>
</tr>
<tr valign="top"> 
<td> <strong>omndoc.sty</strong> </td>
<td> this provides commands for including files into the TeX. It&#8217;s a thin wrapper around the listings package. </td>
</tr>
</table>
<p>I decided to make <tt>omn.sty</tt> separate from <tt>omndoc.sty</tt> as it works standalone, if you just want to use the listings package on its own. At the moment, you can only include files; environments don&#8217;t work. You can see the <a href="http://www.russet.org.uk/blog/wp-content/uploads/2009/06/all-test.pdf">pdf</a> it creates from this TeX:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">\documentclass</font></b><font color="#009900">{article}</font>

<b><font color="#0000FF">\usepackage</font></b><font color="#993399">[pdftex]</font><font color="#009900">{color}</font>
<b><font color="#0000FF">\usepackage</font></b><font color="#009900">{omndoc}</font>

<b><font color="#0000FF">\title</font></b><font color="#009900">{A Test Document for OMNDoc}</font>
<b><font color="#0000FF">\author</font></b><font color="#009900">{Phillip Lord}</font>
<i><font color="#9A1900">%% should be ignored by latex, but read by python</font></i>

<b><font color="#0000FF">\omndoc</font></b><font color="#009900">{all_test.omn}</font>

<b><font color="#0000FF">\begin</font></b><font color="#009900">{document}</font>
<b><font color="#0000FF">\maketitle</font></b>

Here is a piece of OWL that should be readable in the documentation and in the
OMN output.

<b><font color="#0000FF">\begin</font></b><font color="#009900">{omn}</font>
Class: FirstClass
<b><font color="#0000FF">\end</font></b><font color="#009900">{omn}</font>

<b><font color="#0000FF">\omn</font></b><font color="#009900">{first.pomn}</font>

Here is a piece of OWL that should be readable in the OMN output but is too
boring to be worthy of consideration for the documentation.

<i><font color="#9A1900">% \ignore{</font></i>
<i><font color="#9A1900">%   \begin{omn}</font></i>
<i><font color="#9A1900">%     Class: BoringOWL</font></i>
<i><font color="#9A1900">%   \end{omn}</font></i>
<i><font color="#9A1900">% }</font></i>

<b><font color="#0000FF">\ignore</font></b><font color="#009900">{\omn{second.pomn}}</font>

Here is a piece of broken OWL that should be rendered in the documentation (as
broken!) but should be ignored in the OMN.

<i><font color="#9A1900">% \begin{notomn}</font></i>
<i><font color="#9A1900">% Clazz: BrokenOmn</font></i>
<i><font color="#9A1900">% \end{notomn}</font></i>

<b><font color="#0000FF">\notomn</font></b><font color="#009900">{third.pomn}</font>

<b><font color="#0000FF">\end</font></b><font color="#009900">{document}</font>
</tt></pre>
</pre>
</td>
</tr>
</table>
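<p>The extraction step, for a document like the one above, can be sketched in a few lines of Python. This is a simplified illustration of the idea, not the contents of <tt>omnextract.py</tt>; the names <tt>referenced_files</tt> and <tt>combine</tt> are invented for the example, and the file contents in the usage lines are stand-ins.</p>

```python
import re

# Matches \omn{file}, including inside \ignore{...}, but not \notomn{file} or \omndoc{file}.
OMN_INCLUDE = re.compile(r"\\omn\{([^}]*)\}")

def referenced_files(tex_source):
    """List the files named in omn include commands, in document order."""
    return OMN_INCLUDE.findall(tex_source)

def combine(tex_source, read_file):
    """Concatenate the referenced snippets into one ontology text."""
    return "\n".join(read_file(name) for name in referenced_files(tex_source))

tex = (
    "\\omndoc{all_test.omn}\n"
    "\\omn{first.pomn}\n"
    "\\ignore{\\omn{second.pomn}}\n"
    "\\notomn{third.pomn}\n"
)
snippets = {"first.pomn": "Class: FirstClass", "second.pomn": "Class: BoringOWL"}
print(combine(tex, snippets.get))
```

<p>Passing the reader in as a function keeps the sketch independent of the file system; in practice it would open each snippet file from disk.</p>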
<p>I&#8217;m starting to debate with myself, though, whether I have gone the right route here. The problem is that splitting the omn file up into bits is a pain. It only supports one way of working; if you want to use Protege, for example, to edit the file, you can&#8217;t; you can only view. We even miss the big advantage of literate programming; one source for both document and computation. But, then, you are stuck with a poor editing environment for either the documentation or computational representation.</p>
<p>I&#8217;ve been thinking instead of a system which would look like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">\omndoc</font></b><font color="#009900">{function.omn}</font>

<b><font color="#0000FF">\omnClass</font></b><font color="#009900">{Function}</font>

<b><font color="#0000FF">\omnProperty</font></b><font color="#009900">{has_role}</font>

<b><font color="#0000FF">\omnSummary</font></b><font color="#009900">{}</font>
<b><font color="#0000FF">\omnMissing</font></b><font color="#009900">{}</font>
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>Now, the python component would split the <tt>function.omn</tt> file instead of combining it. Each class, individual or property would be put into its own file. The <tt>\omnClass</tt> macro would then just be a simple include (again using the listings package); it would show the class inline. <tt>\omnSummary</tt> would include some TeX (generated from python) saying how many classes and so forth were in the <tt>omn</tt> file; <tt>\omnMissing</tt> would produce a list of Classes that are not explicitly included. Given a big monitor, you could work on the two sources (documentation and ontology) side-by-side, with only a little bit of editing to support jump-to or equivalent. Finally, it would be more syntax-independent. The TeX would not need to be changed to support, for example, the XML syntax. Just some python to split the XML document up into snippets.</p>
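<p>As a first guess at what the python behind <tt>\omnMissing</tt> might look like, here is a small sketch. Nothing here exists yet; the function name <tt>missing_classes</tt> and both regular expressions are assumptions, and it only handles classes, not properties or individuals.</p>

```python
import re

CLASS_FRAME = re.compile(r"^Class:\s*(\S+)", re.MULTILINE)
CLASS_MACRO = re.compile(r"\\omnClass\{([^}]*)\}")

def missing_classes(omn_source, tex_source):
    """Return classes declared in the ontology but never included in the TeX."""
    declared = CLASS_FRAME.findall(omn_source)
    included = set(CLASS_MACRO.findall(tex_source))
    return [name for name in declared if name not in included]

omn = "Class: Function\n\nClass: Role\n    SubClassOf: Thing\n\nObjectProperty: has_role\n"
tex = "\\omndoc{function.omn}\n\\omnClass{Function}\n\\omnProperty{has_role}\n"
print(missing_classes(omn, tex))
```

<p>The same two lists would also give the counts needed for <tt>\omnSummary</tt>.</p>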
<p>I shall start coding this over the next couple of days. I think I already have most of the python that I need so, hopefully, it should not take too long.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1213 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/06/literate-omn/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Literate OWL (well on blogs)</title>
		<link>http://www.russet.org.uk/blog/2009/05/literate-owl-well-on-blogs/</link>
		<comments>http://www.russet.org.uk/blog/2009/05/literate-owl-well-on-blogs/#comments</comments>
		<pubDate>Fri, 22 May 2009 15:23:42 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1204</guid>
		<description><![CDATA[My next blog post was going to be about function, as I have just had a paper about it accepted. But, I got slightly side-tracked along the way, thinking about Literate Programming as it applies to OWL. While an ontology is (or, to my mind, should be) a computational artifact, it&#8217;s a bit different from [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1204">
<p>My next blog post was going to be about function, as I have just had a <a href="http://hdl.handle.net/10101/npre.2009.3228.1">paper</a> about it accepted. But, I got slightly side-tracked along the way, thinking about <a href="http://en.wikipedia.org/wiki/Literate_programming">Literate Programming</a> as it applies to OWL. While an ontology is (or, to my mind, should be) a computational artifact, it&#8217;s a bit different from a program; the main thing is that it doesn&#8217;t run; it doesn&#8217;t have that functional test that a program does. This is not to say that an ontology is not an application-dependent entity. It can be, but even then it needs to have a program built on it.</p>
<p>One of the upshots of this is that a narrative justification for an Ontology is fairly important; currently, we spend far too long on mailing lists, arguing about ontology terms and, to my mind, not enough of this is reflected in the final outcome. If, on the other hand, we moved to a situation where adding a new concept was equivalent to writing a paper, we might have less of this. Discussion would be a bit more focussed; besides which, most scientists are experienced with writing and reviewing papers, so we&#8217;d just be better at it.</p>
<p>For this to happen productively, though, the paper has to become, itself, a computational artifact. It&#8217;s no good having documentation that has to be kept in-sync with the ontology; we will just end up with multiple versions, and will never quite know what we are talking about; my discussions about BFO have shown me this; do we mean the OWL, the definitions in the OWL, the papers or what? We should be able to generate both readable documentation and computational OWL at the same time. In short, literate programming.</p>
<p>Now, I know that <a href="http://www.cs.man.ac.uk/~bparsia/">Bijan Parsia</a> has been <a href="http://www.webont.org/owled/2008/papers/owled2008eu_submission_4.pdf">investigating</a> this also, but I wanted to think a little bit about how it would fit into my environment.</p>
<p>One thought was to get the system working within <a href="http://www.methods.co.nz/asciidoc/">asciidoc</a> which I am using to generate these pages. This turned out to be simple enough; take, for instance, this definition for BiologicalFunction.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: BiologicalFunction
    <font color="#990000">Annotations:</font>
    rdfs:comment
<font color="#FF0000">"Definition: A biological function is a realizable entity that inheres in continuant</font>
<font color="#FF0000">which is realized in an activity, and where the homologous structure(s) of</font>
<font color="#FF0000">individuals of closely related species (or identical species) fulfil this</font>
<font color="#FF0000">same biological function."</font>,

    <font color="#990000">SubClassOf:</font>
        Function
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>Asciidoc uses <a href="http://www.gnu.org/software/src-highlite/source-highlight.html">source-highlight</a> for its syntax highlighting. I had to add a bit of config (which, annoyingly, needs to be placed into the main install directory for source-highlight, rather than in a user space dot-directory).</p>
<p>Unfortunately, this is not going to be as good as you might hope for printed documentation. The obvious solution here is to aim at LaTeX. I think that I am going to have a quick go at producing something like this, inspired by <a href="http://www.haskell.org/haskellwiki/Literate_programming">Literate Haskell</a>. Basically, I need three tags which look like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt>
<b><font color="#0000FF">\begin</font></b><font color="#009900">{owl}</font>
Class: Thing
<b><font color="#0000FF">\end</font></b><font color="#009900">{owl}</font>

<b><font color="#0000FF">\ignore</font></b>{
<b><font color="#0000FF">\begin</font></b><font color="#009900">{owl}</font>
Class: BoringOWL
<b><font color="#0000FF">\end</font></b><font color="#009900">{owl}</font>
}

<b><font color="#0000FF">\begin</font></b><font color="#009900">{notowl}</font>
Clazz: BrokenOwl
<b><font color="#0000FF">\end</font></b><font color="#009900">{notowl}</font>

</tt></pre>
</pre>
</td>
</tr>
</table>
<p>The first copes with OWL that should appear both in the documentation and code (that is most of it). The second covers OWL that should appear just in the code; the Haskell example is for a &#8220;help&#8221; function; I suspect that this is rarely needed for OWL. The final example appears just in documentation; it would be useful for anti-examples (&#8220;Don&#8217;t do this!!!&#8221;). My plan would be to pre-process the latex just using regexps, nothing complex, to dump the OWL to a file, mostly because I don&#8217;t know how to get latex to do it. Meanwhile, these two macros would just be defined in terms of the <a href="http://www.tug.org/texlive/Contents/live/texmf-dist/doc/latex/listings">Listings</a> package (which means writing yet another syntax highlighting set of regexps, oh dear).</p>
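<p>The regexp pre-processing could be as small as the following python sketch. The function name <tt>extract_owl</tt> is made up for the example, and it assumes that <tt>owl</tt> environments are never nested and that <tt>\begin{owl}</tt> and <tt>\end{owl}</tt> appear literally in the source. OWL wrapped in <tt>\ignore</tt> is still picked up, while <tt>notowl</tt> blocks are skipped because the environment name does not match.</p>

```python
import re

# Match the body of every owl environment; DOTALL lets "." span newlines.
# "\begin{notowl}" is not matched because the name after "{" differs.
OWL_ENV = re.compile(r"\\begin\{owl\}\n?(.*?)\\end\{owl\}", re.DOTALL)


def extract_owl(latex_source):
    """Return the concatenated bodies of all owl environments."""
    bodies = OWL_ENV.findall(latex_source)
    # Make sure each body ends with exactly the newline it needs
    # so that consecutive snippets do not run together.
    return "".join(b if b.endswith("\n") else b + "\n" for b in bodies)
```

<p>Writing the returned string to a file would give the OWL side of the document; the header would still have to come from somewhere, which this sketch ignores.</p>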
<p>Well, this is okay, but has two problems: first, it means writing OWL inside latex which means that editor support is going to be rubbish; second, what if I want to blog AND print a document? My solution to this is to move my ontologies to being multi-file based. As far as I can tell, Manchester OWL is order independent (except for the header). So the plan would be to write multiple files, each with a few Concepts in:</p>
<pre>
 function/header.omn
 function/function.omn
 function/biological_function.omn
 function/artifactual_function.omn</pre>
<p>Generating a complete Manchester syntax file from this would be easy (more or less, just run <tt>cat</tt>). This could be supported within latex by adding some include macros. Again, this is trivial to do with the listings package.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">\owl</font></b><font color="#009900">{function.omn}</font>
<b><font color="#0000FF">\ignore</font></b><font color="#009900">{\owl{help.omn}}</font>
<b><font color="#0000FF">\noowl</font></b><font color="#009900">{broken.omn}</font>
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>Likewise, asciidoc supports it using include macros. I shall give this a go next week. I shall produce a document describing the axiomatisation for function in OWL that started all of this off.</p>
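<p>For the &#8220;more or less, just run <tt>cat</tt>&#8221; step above, the only wrinkle is that the header has to come first. A minimal python sketch, assuming the directory layout shown earlier and a made-up function name <tt>combine</tt>:</p>

```python
from pathlib import Path


def combine(directory, header_name="header.omn"):
    """Concatenate every .omn file in `directory`, header first.

    The remaining files are taken in sorted order; since the frames are
    assumed to be order independent, any stable order would do.
    """
    d = Path(directory)
    header = d / header_name
    others = sorted(p for p in d.glob("*.omn") if p.name != header_name)
    parts = [p.read_text(encoding="utf-8") for p in [header] + others]
    # Separate the files with one blank line so frames never run together.
    return "\n".join(part.rstrip("\n") + "\n" for part in parts)
```

<p>Calling <tt>combine("function")</tt> and writing the result out would give a single file that can be loaded as normal.</p>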
<p>PS Just finished this, and found out that blogpost stripped off all my nice syntax highlighting. Took a bit of effort but (hopefully) it should all be back in again now.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1204 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/05/literate-owl-well-on-blogs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

