<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>An Exercise in Irrelevance</title>
	<atom:link href="http://www.russet.org.uk/blog/feed" rel="self" type="application/rss+xml" />
	<link>http://www.russet.org.uk/blog</link>
	<description>Knowledge, Biology and Ontologies</description>
	<lastBuildDate>Thu, 02 May 2013 11:41:38 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Supporting OBO style identifiers in Tawny</title>
		<link>http://www.russet.org.uk/blog/2929</link>
		<comments>http://www.russet.org.uk/blog/2929#comments</comments>
		<pubDate>Thu, 02 May 2013 11:41:37 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Tawny-OWL]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2929</guid>
		<description><![CDATA[Tawny-OWL is a library which enables the programmatic construction of OWL. One of the limitations of tawny has been that it did not implement numeric, semantics-free identifiers; tawny builds identifiers from the Clojure symbols used to describe the class. So, in my pizza ontology, for instance, PizzaTopping gets an IRI [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2929">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Supporting+OBO+style+identifiers+in+Tawny&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-05-02&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2929&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>Tawny-OWL <span id="cite_ITEM-2929-0" name="citation"><a href="#ITEM-2929-0">[1]</a></span> is a library which enables the programmatic construction of OWL <span id="cite_ITEM-2929-1" name="citation"><a href="#ITEM-2929-1">[2]</a></span>. One of the limitations of tawny has been that it did not implement numeric, semantics-free identifiers <span id="cite_ITEM-2929-2" name="citation"><a href="#ITEM-2929-2">[3]</a></span>; tawny builds identifiers from the Clojure symbols used to describe the class. So, in my <a href="https://github.com/phillord/tawny-pizza">pizza ontology</a>, for instance, <tt>PizzaTopping</tt> gets an IRI ending in <tt>PizzaTopping</tt>. Semantics-free identifiers have some significant advantages; the principal one is that they establish an identity for an object which can persist even if the properties (the labels for instance) change, as I have described previously <span id="cite_ITEM-2929-3" name="citation"><a href="#ITEM-2929-3">[4]</a></span>.</p>
<p>However, semantics-free identifiers do not come for free; they also have significant disadvantages, mainly that they make the life of developers harder and code less readable <span id="cite_ITEM-2929-2" name="citation"><a href="#ITEM-2929-2">[3]</a></span>. I&#8217;ve previously suggested solutions to this problem when it afflicts OWL Manchester syntax <span id="cite_ITEM-2929-4" name="citation"><a href="#ITEM-2929-4">[5]</a></span>.</p>
<p>With tawny, the IRIs that are used to identify concepts can easily be separated from the Clojure symbols that are used to identify them; the initial link between them was simply one of convenience. So supporting numeric IRIs was possible with very little adjustment; the core <tt>owl.clj</tt> required only that one fixed function call become a call to a first-class function.</p>
<p>One of the purposes of tawny is to enable a more agile development methodology than we have at present, so clearly I did not want the developer to have to manage this process by hand. Moreover, as <a href="http://sourceforge.net/mailarchive/forum.php?thread_name=CAFKQJ8mA92zh8mv27ME0RS%3DWLdw86sV9DVrmnsLiuxPw%3DkWK8g%40mail.gmail.com&amp;forum_name=obi-devel">recent discussions</a> on the OBI mailing list show, the co-ordination of identifiers can be a significant difficulty. As James Malone has recently described, the URIgen tool offers a solution to this problem <span id="cite_ITEM-2929-5" name="citation"><a href="#ITEM-2929-5">[6]</a></span>. <a href="http://www.ebi.ac.uk/~jupp/">Simon Jupp</a>, who is the primary developer of URIgen, kindly discussed the details with me, which has helped me form my ideas about a suitable workflow, and I have borrowed heavily from URIgen (and the Protege plugin) for this. While I will probably implement a URIgen client for tawny in the future, my initial approach uses a slightly different idea. In general, with tawny, I have been advocating using standard software development tools, instead of specific ontology ones <span id="cite_ITEM-2929-1" name="citation"><a href="#ITEM-2929-1">[2]</a></span>; rather than co-ordinating developers through the use of a centralised server, it seems to me to make more sense to use whatever version control system the developers already have. To that end, I have implemented a file-based system for storing identifiers; given that most bio-ontologies remain under 50,000 terms in size, I think that this is plausible, especially as it is simple in tawny to modularise the source (if not the ontology, which remains a hard research problem). In this case, I have used a properties file, since it is a simple and human-readable format.</p>
<p>This works as follows. First, we define a new ontology, with an iri-gen frame, which uses the obo-iri-generate function. Of course, this is generic so it is possible to use arbitrary strategies for generating an IRI.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>defontology pizzaontology
  <b><font color="#000080">:iri</font></b> <font color="#FF0000">"http://www.ncl.ac.uk/pizza-obo"</font>
  <b><font color="#000080">:prefix</font></b> <font color="#FF0000">"piz:"</font>
  <b><font color="#000080">:comment</font></b> <font color="#FF0000">"An example pizza using OBO style ids"</font>
  <b><font color="#000080">:versioninfo</font></b> <font color="#FF0000">"Unreleased Version"</font>
  <b><font color="#000080">:annotation</font></b> <font color="#FF0000">(</font>seealso <font color="#FF0000">"Manchester Version"</font><font color="#FF0000">)</font>
  <b><font color="#000080">:iri-gen</font></b> tawny.obo/obo-iri-generate
  <font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>Next, we need to restore the mapping between names and IRIs. We need to do this before we create any classes. In the first instance, this file will be empty, and will contain no mappings; this is not problematic.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>tawny.obo/obo-restore-iri <font color="#FF0000">"./src/tawny/obo/pizza/pizza_iri.props"</font><font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
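<p>As a rough illustration of what restoring the mapping involves, the sketch below reads a Java properties source into a plain Clojure map. It is not the tawny.obo implementation; the helper name <tt>read-iri-props</tt>, and the assumption that each line maps a name to an IRI, are made up for this example.</p>

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical helper, not part of tawny.obo: load a properties
;; source (a file path or a Reader) into a plain Clojure map.
;; An empty source gives an empty map, as on the very first run.
(defn read-iri-props
  [source]
  (with-open [r (io/reader source)]
    (let [props (java.util.Properties.)]
      (.load props r)
      (into {} props))))
```

<p>Called on the path of a properties file, this should return a map with the stored names as keys and their IRIs as values.</p>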
<p>Now, we define concepts, properties and so forth as normal.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> CheeseTopping
  <b><font color="#000080">:label</font></b> <font color="#FF0000">"Cheese Topping"</font><font color="#FF0000">)</font>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> MeatTopping
  <b><font color="#000080">:label</font></b> <font color="#FF0000">"Meat Topping"</font><font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>The difference in how the IRI is created should be transparent to the developer at this point. Behind the scenes we are using this logic.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>defn obo-iri-generate-or-retrieve
  [name remembered current]
  <font color="#FF0000">(</font>or <font color="#FF0000">(</font>get remembered name<font color="#FF0000">)</font>
      <font color="#FF0000">(</font>get current name<font color="#FF0000">)</font>
      <font color="#FF0000">(</font>str obo-pre-iri <font color="#FF0000">"#"</font>
           <font color="#FF0000">(</font>java.util.UUID/randomUUID<font color="#FF0000">))))</font></tt></pre>
</td>
</tr>
</table>
<p>Or, in English: if the name (&#8220;CheeseTopping&#8221;) has been stored in our properties file, use that IRI; if the name has already been used in the current session, use that IRI; failing that, create a new IRI from a random UUID. I have used a UUID rather than autominting new identifiers because tawny is programmatic; it is very easy to create 1000 concepts where you meant to create 10, which would result in a lot of new identifiers. It makes more sense to mint permanent identifiers explicitly, as part of a release process.</p>
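<p>To make the lookup order concrete, here is a small self-contained sketch. The function body repeats the logic shown above; the value of <tt>obo-pre-iri</tt>, the two example maps and the six-digit form of the permanent ID are assumptions chosen for illustration only.</p>

```clojure
;; Assumed base IRI, taken from the ontology defined earlier in the post.
(def obo-pre-iri "http://www.ncl.ac.uk/pizza-obo")

;; Same lookup order as the function above, repeated so that the
;; sketch runs on its own.
(defn obo-iri-generate-or-retrieve
  [name remembered current]
  (or (get remembered name)
      (get current name)
      (str obo-pre-iri "#" (java.util.UUID/randomUUID))))

;; Illustrative data: one name restored from the properties file,
;; one name already seen in this session.
(def remembered
  {"CheeseTopping" "http://www.ncl.ac.uk/pizza-obo/PIZZA_000001"})
(def current
  {"MeatTopping" "http://www.ncl.ac.uk/pizza-obo#session-iri"})

;; Stored in the properties file: the stored IRI is returned.
(obo-iri-generate-or-retrieve "CheeseTopping" remembered current)
;; Seen in this session only: the session IRI is returned.
(obo-iri-generate-or-retrieve "MeatTopping" remembered current)
;; Unknown name: a fresh UUID-based IRI is generated.
(obo-iri-generate-or-retrieve "FishTopping" remembered current)
```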
<p>This also works for programmatic use of tawny, regardless of whether concepts are added to the local namespace. This code creates many classes all at once, but does not add them to the namespace. Their IDs will still be stored.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>doseq [n <font color="#FF0000">(</font>map #<font color="#FF0000">(</font>str <font color="#FF0000">"n"</font> %<font color="#FF0000">)</font> <font color="#FF0000">(</font>range <font color="#993399">1</font> <font color="#993399">20</font><font color="#FF0000">))</font>]
  <font color="#FF0000">(</font>owlclass n<font color="#FF0000">)</font>
   <font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>Finally, we need to store the IRIs we have created. Both full IDs and UUIDs are stored; so new classes will get a random UUID, but it will persist over time, providing some interoperability with external users who can use the short-term identifier in the knowledge that it may change.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>tawny.obo/obo-store-iri <font color="#FF0000">"./src/tawny/obo/pizza/pizza_iri.props"</font><font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>At the same time, we report obsolete terms. These are those with permanent identifiers, which are present in the properties file, but have not been created in the current file. Currently, these are just printed to screen, but I could generate classes and place them under an &#8220;obsolete&#8221; superclass.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>tawny.obo/obo-report-obsolete<font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
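<p>The obsolete check described above amounts to a filtered set difference. The following is only a sketch of that idea, not the code in tawny.obo: the helper name <tt>obsolete-names</tt>, and the use of an IRI prefix to recognise permanent identifiers, are assumptions.</p>

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: names that have a permanent IRI in the
;; properties file (remembered) but were not created in the
;; current session (current). Temporary UUID IRIs are ignored.
(defn obsolete-names
  [prefix remembered current]
  (sort (for [[n iri] remembered
              :when (and (str/starts-with? iri prefix)
                         (not (contains? current n)))]
          n)))
```

<p>Here <tt>prefix</tt> would be something like the permanent ID prefix used at release time, and both maps go from names to IRIs.</p>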
<p>Finally, at release point, a single function is called to generate the new IDs. This is done numerically, starting from the largest ID. If there are multiple developers, this step has to be co-ordinated, or it is going to break; but this is little different from a release point of any software project.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>tawny.obo/obo-generate-permanent-iri <font color="#FF0000">"./src/tawny/obo/pizza/pizza_iri.props"</font> <font color="#FF0000">"http://www.ncl.ac.uk/pizza-obo/PIZZA_"</font><font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
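<p>For readers who want a picture of what numeric minting might look like, the sketch below gives every name that still has a temporary IRI the next free number, counting up from the largest number already in use. It is an illustration under stated assumptions rather than the tawny.obo code: the helper names, the alphabetical order in which pending names are numbered and the six-digit zero padding are all hypothetical.</p>

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: the number at the end of a permanent IRI,
;; or nil when the IRI is a temporary (UUID) one.
(defn permanent-number
  [prefix iri]
  (when (str/starts-with? iri prefix)
    (let [suffix (subs iri (count prefix))]
      (when (re-matches #"\d+" suffix)
        (Long/parseLong suffix)))))

;; Give every name without a permanent IRI the next free number,
;; starting one above the largest number already in use.
(defn mint-permanent
  [prefix name-to-iri]
  (let [used (keep #(permanent-number prefix %) (vals name-to-iri))
        start (inc (reduce max 0 used))
        pending (sort (for [[n iri] name-to-iri
                            :when (nil? (permanent-number prefix iri))]
                        n))]
    (into name-to-iri
          (map (fn [n i] [n (format "%s%06d" prefix i)])
               pending
               (iterate inc start)))))
```

<p>Because the numbering depends on the largest existing ID, two developers running this independently would hand out the same numbers, which is why the step needs co-ordinating.</p>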
<p>I think this workflow makes sense, but only use in practice will show for sure. If the requirement for co-ordination over minting of real IDs is problematic, then URIgen would provide a nice solution. I can also see problems with my use of props files; I have sorted them numerically which makes them easier to read (and predictably ordered), but this has the disadvantage that changes are likely to happen near the end, which is likely to result in conflicts. While these would be relatively simple conflicts, merging is necessarily painful. This could be avoided by storing permanent IDs in one file, and UUIDs in per-developer files.</p>
<p>This is the last feature I am planning to add to the current iteration of tawny; I want to complete the documentation for all functions (this has already been done for owl.clj, but not the other namespaces), and the tutorial. For the 0.12 cycle, I plan to make tawny complete for OWL2 (basically, this means adding datatypes).</p>
<p>This article describes a SNAPSHOT of tawny, available on github (<a href="https://github.com/phillord/tawny-owl">https://github.com/phillord/tawny-owl</a>). All the examples shown here come from (yet another!) version of the pizza ontology, also available on github (<a href="https://github.com/phillord/tawny-obo-pizza">https://github.com/phillord/tawny-obo-pizza</a>).</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2929-0'></a>
P. Lord, "Programming OWL", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2214">http://www.russet.org.uk/blog/2214</a>


</li>
<li><a name='ITEM-2929-1'></a>
P. Lord, "The Semantic Web takes Wing: Programming Ontologies with Tawny-OWL", <i>An Exercise in Irrelevance</i>, 2013. <a href="http://www.russet.org.uk/blog/2366">http://www.russet.org.uk/blog/2366</a>


</li>
<li><a name='ITEM-2929-2'></a>
P. Lord, "Semantics-Free Ontologies", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2040">http://www.russet.org.uk/blog/2040</a>


</li>
<li><a name='ITEM-2929-3'></a>
P. Lord, "Permalink Semantics", <i>An Exercise in Irrelevance</i>, 2011. <a href="http://www.russet.org.uk/blog/1908">http://www.russet.org.uk/blog/1908</a>


</li>
<li><a name='ITEM-2929-4'></a>
P. Lord, "OBO Format and Manchester Syntax", <i>An Exercise in Irrelevance</i>, 2009. <a href="http://www.russet.org.uk/blog/1470">http://www.russet.org.uk/blog/1470</a>


</li>
<li><a name='ITEM-2929-5'></a>
J. Malone, "Keeping it Agile: the secret to a fitter ontology in 4* easy** steps!", <i>James Malone's EBI Blog</i>, 2013. <a href="http://jamesmaloneebi.blogspot.co.uk/2013/04/keeping-it-agile-secret-to-fitter.html">http://jamesmaloneebi.blogspot.co.uk/2013/04/keeping-it-agile-secret-to-fitter.html</a>


</li>
</ol>

</div> <!-- kcite-section 2929 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2929/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Twenty-Five Shades of Greycite: Semantics for referencing and preservation</title>
		<link>http://www.russet.org.uk/blog/2915</link>
		<comments>http://www.russet.org.uk/blog/2915#comments</comments>
		<pubDate>Mon, 29 Apr 2013 11:09:54 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>
		<category><![CDATA[greycite]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2915</guid>
		<description><![CDATA[Academic literature makes heavy use of references; effectively links to other, previous work that supports, or contradicts the current work. This referencing is still largely textual, rather than using a hyperlink as is common on the web. As well as being time consuming for the author, it is also difficult to extract the [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2915">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Twenty-Five+Shades+of+Greycite%3A+Semantics+for+referencing+and%0A++preservation&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-04-29&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2915&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span>
<hr /> 
<h2><a name="_abstract"></a>Abstract</h2>
<p> <p>  Semantic publishing can enable richer documents with clearer, computationally
interpretable properties. For this vision to become reality, however, authors
must benefit from this process, so that they are incentivised to add these
semantics. Moreover, the publication process that generates final content must
allow and enable this semantic content. Here we focus on author-led or "grey"
literature, which uses a convenient and simple publication pipeline. We
describe how we have used metadata in articles to enable richer referencing of
these articles and how we have customised the addition of these semantics to
articles. Finally, we describe how we use the same semantics to aid in digital
preservation and non-repudiability of research articles.
</p><ul><li> Phillip Lord</li><li> Lindsay Marshall</li></ul><ul><li><a href="http://arxiv.org/abs/1304.7151">http://arxiv.org/abs/1304.7151</a></li></ul> 
<hr /> 
<h2><a name="_plain_english_summary"></a>Plain English Summary</h2>
<p>Academic literature makes heavy use of references; effectively links to other, previous work that supports, or contradicts the current work. This referencing is still largely textual, rather than using a hyperlink as is common on the web. As well as being time consuming for the author, it is also difficult to extract the references computationally, as the references are formatted in many different ways.</p>
<p>Previously, we have described a system which works with identifiers such as ArXiv IDs (used to reference this article above!), PubMed IDs and DOIs. With this system, called kcite, the author supplies the ID, and kcite generates the reference list, leaving the ID underneath which is easy to extract computationally. The data used to generate the reference comes from specialised bibliographic servers.</p>
<p>In this paper, we describe two new systems. The first, called Greycite, provides similar bibliographic data for any URL; it is extracted from the URL itself, using a wide variety of markup and some <em>ad-hoc</em> tricks, which the paper describes. As a result it works on many web pages (we predict about 1% of the total web, or a much higher percentage of &#8220;interesting&#8221; websites). Our second system, kblog-metadata, provides a flexible system for generating this data. Finally, we discuss ways in which the same metadata can be used for digital preservation, by helping to track articles as and when they move across the web.</p>
<p>This paper was first written for the Sepublica 2013 workshop.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 2915 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2915/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The pain of foreign names</title>
		<link>http://www.russet.org.uk/blog/2910</link>
		<comments>http://www.russet.org.uk/blog/2910#comments</comments>
		<pubDate>Thu, 25 Apr 2013 12:14:12 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2910</guid>
		<description><![CDATA[Our original intention with Greycite was to build a tool which can provide bibliographic metadata for any URL, to support my own kcite referencing tool. While it still fulfils this function, it also turns out to be a useful, general-purpose tool for investigating the metadata in various web pages. And this reveals some interesting [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2910">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=The+pain+of+foreign+names&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-04-25&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2910&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>Our original intention with Greycite <span id="cite_ITEM-2910-0" name="citation"><a href="#ITEM-2910-0">[1]</a></span> was to build a tool which can provide bibliographic metadata for any URL, to support my own kcite referencing tool <span id="cite_ITEM-2910-1" name="citation"><a href="#ITEM-2910-1">[2]</a></span>. While it still fulfils this function, it also turns out to be a useful, general-purpose tool for investigating the metadata in various web pages. And this reveals some interesting results. We discovered a nice example of this recently while adding <a href="http://en.wikipedia.org/wiki/RIS_%28file_format%29">RIS</a> support.</p>
<p>The <a href="http://www.nature.com/embor/journal/v14/n3/full/embor201311a.html">paper</a> in question comes from EMBO reports <span id="cite_ITEM-2910-2" name="citation"><a href="#ITEM-2910-2">[3]</a></span>. At first sight, the RIS for this page taken from Greycite looks reasonable.</p>
<pre style="padding:0.5em; color:gray;">TY - ELEC
UR - http://www.nature.com/embor/journal/v14/n3/full/embor201311a.html
Y2 - 2013-04-18 12:54:54
TI - The economics of creative research
JO - EMBO reports
PY - 2012
DA - 2012-02-08
DO - 10.1038/embor.2013.11
AU - Cou|[eacute]|e, Ivan
ER -</pre>
<p>However, something strange is going on with the author; poor old Ivan Couée&#8217;s name has been rather broken. So, why is this happening? Looking at the underlying HTML the first thing that hits you is a lot of space; there are over 50 empty lines at the beginning of the file; still, this is only a problem for people strange enough to be reading the HTML.</p>
<p>However, eventually we get to the metadata, first Dublin Core and then what we describe as Google Scholar metadata (since this is where we found it). And there we have it; greycite is reporting the metadata as it is. The author&#8217;s name is represented with <tt>|[eacute]|</tt> as a letter.</p>
<pre style="padding:0.5em; color:gray;">&lt;meta name="dc.language" content="en" /&gt;
&lt;meta name="dc.rights" content="&amp;#169; 2012 Nature Publishing Group" /&gt;
&lt;meta name="dc.title" content="The economics of creative research" /&gt;
&lt;meta name="dc.creator" content="Ivan Cou|[eacute]|e" /&gt;
&lt;meta name="dc.identifier" content="doi:10.1038/embor.2013.11" /&gt;
&lt;meta name="dc.date" content="2012-02-08" /&gt;

&lt;meta name="citation_publisher" content="Nature Publishing Group" /&gt;
&lt;meta name="citation_authors" content="Ivan Cou|[eacute]|e" /&gt;
&lt;meta name="citation_title" content="The economics of creative research" /&gt;
&lt;meta name="citation_date" content="2012-02-08" /&gt;
&lt;meta name="citation_volume" content="14" /&gt;
&lt;meta name="citation_issue" content="3" /&gt;
&lt;meta name="citation_firstpage" content="222" /&gt;
&lt;meta name="citation_doi" content="doi:10.1038/embor.2013.11" /&gt;
&lt;meta name="citation_journal_title" content="EMBO reports" /&gt;</pre>
<p>As far as we can tell this is an error; HTML entities or extended character sets are entirely valid, but <tt>|[eacute]|</tt> does not appear to be a valid representation. Interestingly enough, there also appears to be some slightly buggy code in the PRISM metadata, which I am sure should not be this.</p>
<pre style="padding:0.5em; color:gray;">&lt;meta name="prism.issn" content="ERROR! NO ISSN" /&gt;
&lt;meta name="prism.eIssn" content="ERROR! NO EISSN" /&gt;</pre>
<p>My guess is that the problem is at the point of website generation rather than deeper in the bowels of the publishing system; grabbing the metadata for this article from CrossRef by content negotiation <span id="cite_ITEM-2910-3" name="citation"><a href="#ITEM-2910-3">[4]</a></span> shows the correct name.</p>
<pre style="padding:0.5em; color:gray;">{"volume":"14",
 "issue":"3",
 "DOI":"10.1038/embor.2013.11",
 "URL":"http://dx.doi.org/10.1038/embor.2013.11","title":"The economics of
        creative research",
 "container-title":"EMBO reports",
 "publisher":"Nature Publishing Group",
 "issued":{"date-parts":[[2012,2,8]]},
 "author":[{"family":"Couée","given":"Ivan"}],
 "editor":[],"page":"222-225",
 "type":"article-journal"}</pre>
<p>We empathise with the publishers here. Getting character sets correct is the bane of everyone&#8217;s life; given the state of computing when multi-lingual character sets appeared, we guess it is not an example of premature optimisation, but an example of an optimisation you wish had never happened. The world would be an easier place if everything had used Unicode from the start.</p>
<p>The current metadata for this paper can be seen on <a href="http://greycite.knowledgeblog.org/?uri=http://www.nature.com/embor/journal/v14/n3/full/embor201311a.html">greycite</a> or <a href="http://greycite.knowledgeblog.org/metadata/13923">in detail</a>. Hopefully, this will be updated in time!</p>
<p>This post was written by Phillip Lord and Lindsay Marshall</p>
<p><strong>Update</strong></p>
<p>Spelling mistake corrected, bibliography added. Thanks to Christian Perfect for bug report.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2910-0'></a>
L. Marshall, and P. Lord, "GreyCite", <i>GreyCite</i>. <a href="http://greycite.knowledgeblog.org/">http://greycite.knowledgeblog.org/</a>


</li>
<li><a name='ITEM-2910-1'></a>
S. Cockell, and P. Lord, "KCite Plugin", <i>Knowledge Blog</i>, 2011. <a href="http://knowledgeblog.org/kcite-plugin">http://knowledgeblog.org/kcite-plugin</a>


</li>
<li><a name='ITEM-2910-2'></a>
I. Couée, "The economics of creative research", <i>EMBO reports</i>, vol. 14, pp. 222-225, 2012. <a href="http://dx.doi.org/10.1038/embor.2013.11">http://dx.doi.org/10.1038/embor.2013.11</a>


</li>
<li><a name='ITEM-2910-3'></a>
G. Bilder, "Content Negotiation for CrossRef DOIs | CrossTech", <i>CrossTech</i>, 2011. <a href="http://www.crossref.org/CrossTech/2011/04/content_negotiation_for_crossr.html">http://www.crossref.org/CrossTech/2011/04/content_negotiation_for_crossr.html</a>


</li>
</ol>

</div> <!-- kcite-section 2910 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2910/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Overlays over arXiv</title>
		<link>http://www.russet.org.uk/blog/2367</link>
		<comments>http://www.russet.org.uk/blog/2367#comments</comments>
		<pubDate>Wed, 10 Apr 2013 16:34:13 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2367</guid>
		<description><![CDATA[Much has been said about overlay journals. The idea is simple; the journal essentially becomes a selector, a channel, with the paper itself being hosted elsewhere, such as arXiv. This holds a certain amount of attraction for me; I already post my new papers on arXiv. I have been posting them here also. [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2367">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Overlays+over+arXiv&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-04-10&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2367&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>Much has been said about overlay journals <span id="cite_ITEM-2367-0" name="citation"><a href="#ITEM-2367-0">[1]</a></span>. The idea is simple; the journal essentially becomes a selector, a channel, with the paper itself being hosted elsewhere, such as arXiv.</p>
<p>This holds a certain amount of attraction for me; I already post my new papers on arXiv. I have been posting them here also <span id="cite_ITEM-2367-1" name="citation"><a href="#ITEM-2367-1">[2]</a></span>. This works well, but is hampered by technology. Mostly I write papers in LaTeX, and I have written tools to make these suitable for WordPress <span id="cite_ITEM-2367-2" name="citation"><a href="#ITEM-2367-2">[3]</a></span>; these work well enough to publish an entire thesis <span id="cite_ITEM-2367-3" name="citation"><a href="#ITEM-2367-3">[4]</a></span>. However, the process of doing this is not slick <span id="cite_ITEM-2367-4" name="citation"><a href="#ITEM-2367-4">[5]</a></span>. For instance, when trying to publish one of my own papers, I have had problems as I used a theorem environment <span id="cite_ITEM-2367-5" name="citation"><a href="#ITEM-2367-5">[6]</a></span>. While <a href="http://plastex.sourceforge.net/">PlasTeX</a> is a nice tool, the key problem is that it is fundamentally a different interpreter from TeX. Eventually, perhaps, LuaTeX will get an HTML backend, but until this happens the system will always fail in some cases.</p>
<p>So, I wanted to investigate whether it was possible to build Overlay functionality into a personal publication framework, such as the WordPress installation I host these articles on. Well, it turns out that, combined with the tools that I have written for manipulating metadata <span id="cite_ITEM-2367-6" name="citation"><a href="#ITEM-2367-6">[7]</a></span>, it is relatively simple to do so; my first attempt at this is now available for my OWLED 2013 paper <span id="cite_ITEM-2367-7" name="citation"><a href="#ITEM-2367-7">[8]</a></span>. The title, authors (just me in this case), date, abstract and PDF link all come directly from <a href="http://arxiv.org/">arXiv</a>. Full text is not available from arXiv, and in any case it would suffer from all the issues described earlier; in the end, the PDF is probably the best representation of this paper. I have supplemented this with a plain English summary, something that I have wanted to do for years, but have not managed to start. If the reviewers will allow me to do so, I will also attach the reviews when they become available.</p>
<p>The code for this is not quite ready to release yet: however, it will potentially work over any eprints repository, and I have connected it up to Greycite also <span id="cite_ITEM-2367-8" name="citation"><a href="#ITEM-2367-8">[9]</a></span>, so it can be used over any source that greycite can interpret.</p>
<p>All a little clunky, but I think that this is the future. The Journal is dead, Long Live the article.</p>
<p><strong>Update</strong></p>
<p>Fixed DOI.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2367-0'></a>
T. Gowers, "Why I've also joined the good guys", <i>Gowers's Weblog</i>, 2013. <a href="http://gowers.wordpress.com/2013/01/16/why-ive-also-joined-the-good-guys/">http://gowers.wordpress.com/2013/01/16/why-ive-also-joined-the-good-guys/</a>


</li>
<li><a name='ITEM-2367-1'></a>
P. Lord, "Realism and Science", <i>An Exercise in Irrelevance</i>, 2010. <a href="http://www.russet.org.uk/blog/1713">http://www.russet.org.uk/blog/1713</a>


</li>
<li><a name='ITEM-2367-2'></a>
P. Lord, "Latex to WordPress", <i>An Exercise in Irrelevance</i>, 2010. <a href="http://www.russet.org.uk/blog/1740">http://www.russet.org.uk/blog/1740</a>


</li>
<li><a name='ITEM-2367-3'></a>
A. Lister, "PhD Thesis: Table of Contents", <i>the mind wobbles</i>, 2013. <a href="http://themindwobbles.wordpress.com/2013/01/02/phd-thesis-table-of-contents/">http://themindwobbles.wordpress.com/2013/01/02/phd-thesis-table-of-contents/</a>


</li>
<li><a name='ITEM-2367-4'></a>
. @wordpressdotcom, "Converting a Latex Thesis to Multiple Wordpress Posts", <i>the mind wobbles</i>, 2012. <a href="http://themindwobbles.wordpress.com/2012/06/14/converting-a-latex-thesis-to-multiple-wordpress-posts/">http://themindwobbles.wordpress.com/2012/06/14/converting-a-latex-thesis-to-multiple-wordpress-posts/</a>


</li>
<li><a name='ITEM-2367-5'></a>
P. Lord, "An evolutionary approach to Function", <i>Journal of Biomedical Semantics</i>, vol. 1, pp. S4, 2010. <a href="http://dx.doi.org/10.1186/2041-1480-1-S1-S4">http://dx.doi.org/10.1186/2041-1480-1-S1-S4</a>


</li>
<li><a name='ITEM-2367-6'></a>
P. Lord, "Kblog Metadata Plugin", <i>Knowledge Blog</i>, 2012. <a href="http://knowledgeblog.org/kblog-metadata">http://knowledgeblog.org/kblog-metadata</a>


</li>
<li><a name='ITEM-2367-7'></a>
P. Lord, "The Semantic Web takes Wing: Programming Ontologies with Tawny-OWL", <i>An Exercise in Irrelevance</i>, 2013. <a href="http://www.russet.org.uk/blog/2366">http://www.russet.org.uk/blog/2366</a>


</li>
<li><a name='ITEM-2367-8'></a>
L. Marshall, and P. Lord, "GreyCite", <i>GreyCite</i>. <a href="http://greycite.knowledgeblog.org/">http://greycite.knowledgeblog.org/</a>


</li>
</ol>

</div> <!-- kcite-section 2367 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2367/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Semantic Web takes Wing: Programming Ontologies with Tawny-OWL</title>
		<link>http://www.russet.org.uk/blog/2366</link>
		<comments>http://www.russet.org.uk/blog/2366#comments</comments>
		<pubDate>Tue, 09 Apr 2013 20:03:46 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[Tawny-OWL]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2366</guid>
		<description><![CDATA[Abstract Plain English Summary In this paper, I describe some new software, called Tawny-OWL, that addresses the issue of building ontologies. An ontology is a formal hierarchy, which can be used to describe different parts of the world, including biology which is my main interest. Building ontologies in any form is hard, but many ontologies [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2366">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=The+Semantic+Web+takes+Wing%3A+Programming+Ontologies+with+Tawny-OWL&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-04-09&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2366&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span>
<hr /> 
<h2><a name="_abstract"></a>Abstract</h2>
<p> <p>  The Tawny-OWL library provides a fully-programmatic environment for ontology
building; it enables the use of a rich set of tools for ontology development,
by recasting development as a form of programming. It is built in Clojure - a
modern Lisp dialect, and is backed by the OWL API. Used simply, it has a
similar syntax to OWL Manchester syntax, but it provides arbitrary
extensibility and abstraction. It builds on existing facilities for Clojure,
which provides a rich and modern programming tool chain, for versioning,
distributed development, build, testing and continuous integration. In this
paper, we describe the library, this environment and its potential
implications for the ontology development process.
</p><ul><li> Phillip Lord</li></ul><ul><li><a href="http://arxiv.org/abs/1303.0213">http://arxiv.org/abs/1303.0213</a></li></ul> 
<hr /> 
<h2><a name="_plain_english_summary"></a>Plain English Summary</h2>
<p>In this paper, I describe some new software, called Tawny-OWL, that addresses the issue of building ontologies. An ontology is a formal hierarchy, which can be used to describe different parts of the world, including biology which is my main interest.</p>
<p>Building ontologies in any form is hard, but many ontologies are repetitive, having many similar terms. Current ontology building tools tend to require a significant amount of manual intervention. Rather than look to creating new tools, Tawny-OWL is a library written in a full programming language, which helps to redefine the problem of ontology building as one of programming. Instead of building new ontology tools, the hope is that Tawny-OWL will enable ontology builders to just use existing tools that are designed for general purpose programming. As there are many more people involved in general programming, many tools already exist and are very advanced.</p>
<p>This is the first paper on the topic, although it has been discussed before <a href="http://www.russet.org.uk/blog/category/all/professional/tech/tawny-owl">here</a>.</p>
<p>This paper was written for the OWLED workshop in 2013.</p>
<hr /> 
<h2><a name="_reviews"></a>Reviews</h2>
<p>Reviews are posted here with the kind permission of the reviewers. Reviewers are identified or remain anonymous at their option. Copyright of the review remains with the reviewer and is not subject to the overall blog license.</p>
<h3><a name="_review_1"></a>Review 1</h3>
<p>The given paper is a solid presentation of a system for supporting the development of ontologies &#8211; and therefore not really a scientific/research paper.</p>
<p>It describes Tawny OWL in a sufficiently comprehensive and detailed fashion to understand both the rationale behind as well as the functioning of that system. The text itself is well written and also well structured. Further, the combination of the descriptive text in conjunction with the given (code) examples make the different functionality highlights of Tawny OWL very easy to grasp and appraise.</p>
<p>As another big plus of this paper, I see the availability of all source code which supports the fact that the system is indeed actually available &#8211; instead of being just another description of a &#8220;hidden&#8221; research system.</p>
<p>The possibility to integrate Tawny OWL in a common (programming) environment, the abstraction level support, the modularity and the testing &#8220;framework&#8221; along with its straightforward syntax make it indeed very appealing and sophisticated.</p>
<p>But the just said comes with a little warning: My above judgment (especially the last comment) are highly biased by the fact that I am also a software developer. And thus I do not know how much the above would apply to non-programmers as well.</p>
<p>And along with the above warning, I actually see a (more global) problem with the proposed approach to ontology development: The mentioned &#8220;waterfall methodologies&#8221; are still most often used for creating ontologies (at least in the field of biomedical ontologies) and thus I wonder how much programmatic approaches, as implemented by Tawny OWL, will be adapted in the future. Or in which way they might get somehow integrated in those methodologies.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 2366 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2366/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Archiving of Scientific Material</title>
		<link>http://www.russet.org.uk/blog/2360</link>
		<comments>http://www.russet.org.uk/blog/2360#comments</comments>
		<pubDate>Fri, 05 Apr 2013 15:42:27 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2360</guid>
		<description><![CDATA[In this article, I consider the practical issues with archiving of scientific material placed on the web; I will describe the motivation for doing this, the background and consider the various mechanisms for doing so. As part of our work on knowledgeblog , we have been investigating ways of injecting the formal technical aspects of [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2360">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Archiving+of+Scientific+Material&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-04-05&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2360&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>In this article, I consider the practical issues with archiving of scientific material placed on the web; I will describe the motivation for doing this, the background and consider the various mechanisms for doing so.</p>
<p>As part of our work on knowledgeblog <span id="cite_ITEM-2360-0" name="citation"><a href="#ITEM-2360-0">[1]</a></span>, we have been investigating ways of injecting the formal technical aspects of the scientific publication process into this form of publication. The reasons for this are myriad: if the scientist can control the form, they can innovate in their presentation how they choose; the publication process itself becomes very simple and straight-forward (as opposed to the authoring, which is as hard as it ever was). Finally, it means that scientists can publish as they go, as I have done and am doing on my work with <a href="http://www.russet.org.uk/blog/category/all/professional/tech/tawny-owl">Tawny-OWL</a>. This latter point has many potential implications: firstly, it makes science much more interactive&#8201;&#8212;&#8201;scientists can publish things that they are not clear on, early ideas and can (and I do) get feedback on this early; secondly, it should help to overcome publication bias as it is much lighter-weight than the current publication process. Scientists are more likely to publish negative results if the process is easy and not expensive. And, lastly, it can help to establish provenance for the work; if every scientist published in this way, scientific fraud would be much harder, as a fraudulent scientist would have to produce a coherent, faked set of data from the early days of the work.</p>
<p>However, to achieve this, posts must still be available. The scientific record needs to be maintained. Now, this should not be an issue. I write this blog in Asciidoc <span id="cite_ITEM-2360-1" name="citation"><a href="#ITEM-2360-1">[2]</a></span>, and rarely use images, so the source is quite small. In fact, since I moved to WordPress in 2009 <span id="cite_ITEM-2360-2" name="citation"><a href="#ITEM-2360-2">[3]</a></span>, it totals about 725k; so it would fit on a <a href="http://en.wikipedia.org/wiki/Floppy_disk#3+1.E2.81.842-inch_floppy_disk_.28.22Microfloppy.22.29">floppy</a>, which is a crushing blow to my ego. So, how easy is it to archive your content?</p>
<p>The difficulty here is that there is no obvious person to do this. Like many universities, I have access to an <a href="http://eprints.ncl.ac.uk/">eprints</a> archive. Unfortunately, this is mainly used for REF, and has no programmatic interface. The university also has a <a href="http://www.lockks.org">LOCKSS</a> box. However, this is not generally available for the work that the University staff has produced, but only for journals that the University has bought; so I have to give my work away to a paywall publisher, or pay lots to an open access publisher to access this.</p>
<p>Another possibility would be to use Figshare. Now, I have some qualms about Figshare anyway; it appears to be a walled garden, the Facebook of science. Others, however, do not worry about this and are using Figshare. Carl Boettiger <span id="cite_ITEM-2360-3" name="citation"><a href="#ITEM-2360-3">[4]</a></span>, for instance, archives his notebook on Figshare. But there is a problem: consider the <a href="http://figshare.com/articles/Lab_Notebook_2012/106620">2012</a> archive; it is a tarball, with Markdown files inside; I know what to do with this, but many people will not. And it is only weakly linked to the original publication link. Titus Brown had the same idea <span id="cite_ITEM-2360-4" name="citation"><a href="#ITEM-2360-4">[5]</a></span>, and claims the added value of DOIs, something I find dubious <span id="cite_ITEM-2360-5" name="citation"><a href="#ITEM-2360-5">[6]</a></span>. Again, though, the same problem; Figshare archives the <a href="http://figshare.com/articles/Test_blog_post/97901">source</a>. The most extreme example of this comes from Karthik Ram who has <a href="http://figshare.com/articles/git_repository_for_paper_on_git_and_reproducible_science/155613">published</a> a <a href="http://git-scm.com/">Git</a> repository; unsurprisingly, it is impossible to interact with as a repo.</p>
<p>Figshare likes to make great play of the fact that it is backed by CLOCKSS&#8201;&#8212;&#8201;this is a set of distributed copies maintained by some research libraries. Now, it might seem sensible that CLOCKSS would offer this service (at a price, of course) to researchers. Perhaps they do. But the website reveals nothing about this. And, although I tried, they did not respond to emails either. Rather like DOIs, the infrastructure is built around scale; in short, you need a publisher or some other institution involved; all very well, but this contradicts the desire for a light-weight publication mechanism. There is a second problem with CLOCKSS; it is a dark archive, that is, its content only becomes available to the public after a &#8220;trigger event&#8221;; the publisher going bust, the website going down and so on. Now, data which is on the web and, critically, archived by someone other than the author essentially becomes non-repudiable and time-stamped. I can prove (to a lower-bound) when I said something. And you can prove that I said something even if I wish I hadn&#8217;t. In a strict sense, this is true if the data is in CLOCKSS; but in a practical sense, it is not, as checking when and what I said becomes too much of a burden to be useful.</p>
<p>So, we move onto web archiving. The idea of web archiving is attractive to me for one main reason; it is not designed for science. It is a general purpose, commodity solution, rather like a blog engine. If one thing scientific publication needs more than anything, it is to move the technology base away from bespoke and toward commodity.</p>
<p>One of the most straight-forward solutions for web archiving is <a href="http://www.webcitation.org">WebCite</a>; the critical advantage that this has is that it provides an on-demand service. I have been using it for a while to archive this site; greycite <span id="cite_ITEM-2360-6" name="citation"><a href="#ITEM-2360-6">[7]</a></span> now routinely submits new items here, if we can extract enough metadata from them. The archiving is quick and effective. The fly-in-the-ointment is that WebCite has funding issues and is threatened with closure at the end of 2013. The irony is that it claims it needs $25,000 to continue. Set against the millions put aside for APCs <span id="cite_ITEM-2360-7" name="citation"><a href="#ITEM-2360-7">[8]</a></span>, or the thousands NPG claims is necessary to publish a single paper <span id="cite_ITEM-2360-8" name="citation"><a href="#ITEM-2360-8">[9]</a></span>, or the millions that ACM spends supporting its digital library <span id="cite_ITEM-2360-9" name="citation"><a href="#ITEM-2360-9">[10]</a></span>, this is small beer, and it shows the lack of seriousness with which we take web archiving. I hope it survives; if it does, Gunther Eysenbach, who runs it, tells me that they plan to expand the services they offer. It may yet become the archiving option of choice.</p>
<p>I have been able to find no on-demand alternative to <a href="http://www.webcitation.org">WebCite</a>. However, there are several other archives available. I have been using the <a href="http://www.webarchive.org.uk">UK Web Archive</a> for a while now. I first heard about this service, irony of ironies, on the radio. Since I <a href="http://www.webarchive.org.uk/wayback/archive/20100711220041/http://knowledgeblog.org/">first</a> used it to archive <a href="http://knowledgeblog.org">knowledge blog</a> and <a href="http://www.webarchive.org.uk/wayback/archive/20130126131815/http://www.russet.org.uk/blog/">later</a> used it to archive this site, the process has got a lot easier. No longer do I need to send signed physical copyright permission; first it was electronic (email I think). It now appears that the law is changing to allow them to archive more widely (the BBC covered this in a <a href="http://www.bbc.co.uk/news/entertainment-arts-22028738">story</a>, categorized under &#8220;entertainment and arts&#8221; and which is largely focused on Stephen Fry&#8217;s tweets), although this will be a dark archive. Currently, this journal has been archived only once; from my other sites, it appears that they have a six month cycle. So, while this provides good digital preservation, it is a less good solution from the perspective of non-repudiability; there is a significant gap before the archive happens, and a slightly longer one till the archive is published.</p>
<p>The UKWA is, as the name suggests, UK specific. Another solution is to use, of course, <a href="http://archive.org">archive.org</a>, which might be considered to be the elephant in the room for web archiving. Unlike the UKWA, they don&#8217;t take submissions, but just crawl the web (although I suspect that the UKWA will start doing this also now). Getting onto a crawl can, therefore, be rather hit-and-miss. Frustratingly, they do have an &#8220;upload my data&#8221; service, which you can access through a logged in account; but not an &#8220;archive my URL&#8221; service. Again, a very effective resource from a digital preservation perspective, but with similar problems to the UKWA from the point-of-view of non-repudiability. The archives take time to appear; in my experience, somewhat longer than the UKWA. I have also contacted their commercial wing, <a href="http://archive-it.org">http://archive-it.org</a>. Their software and the crawls that they offer could easily be configured to do the job, but unfortunately, they are currently aimed very much at the institutional level: their smallest package provides around 100Gb of storage; this blog can be archived in around 130Mb (this is without deduplication which would save a lot); even a fairly <a href="http://blogs.ch.cam.ac.uk/pmr/">prolific blogger</a> comes in at around 250Mb. The price, unfortunately, reflects this. Although, again, it is on a par with my yearly publication costs, so is well within an average research budget.</p>
<p>Of course, these solutions are not exclusive; with <a href="http://greycite.knowledgeblog.org">greycite</a> we have started to add tools to support these options. For instance, kblog-metadata <span id="cite_ITEM-2360-10" name="citation"><a href="#ITEM-2360-10">[11]</a></span> now supports an &#8220;archives&#8221; widget which is in use on this page; this links directly through to all the archives we know about. For individual pages, these are deep links, so you can see archived versions of each article straight-forwardly. The data, which we discover by probing, comes from greycite; we may move later to using <a href="http://www.webarchive.org.uk/mementos/search/http://www.russet.org.uk/blog">Mementos</a>. greycite itself archives metadata about webpages, so we link to this also. As a side effect, this also means that each article is submitted to greycite, which in turn causes archiving of the page through WebCite. Likewise, archive locations are returned within the BibTeX downloads, which is useful for those referencing sites.</p>
<p>Finally, greycite now generates pURLs&#8201;&#8212;&#8201;these are two-step resolution URLs which work rather like DOIs (or actually DOIs operate like pURLs, since as far as I am aware, pURLs predate the web infrastructure for DOIs). These resolve directly to the website in question. With a little support greycite can track content as and if it moves around the web; even if this fails, and an article disappears, greycite will redirect to the nearest web archive.</p>
<p>In summary, there is no perfect solution available at the moment, but there are many options; in many cases, archiving will happen somewhat magically. As we have found with many other aspects of author self-publishing on the web, it is possible to architecturally replicate many of the guarantees provided by the scientific publication industry through the simple use of web technology. Tools like greycite and kblog-metadata are useful in uncovering the archives that are already there, and linking these together with pURLs. Taken together, I have a reasonable degree of confidence that this material will be available in 10 or 50 years&#8217; time. Whether anyone will still be reading it, well, that is a different issue entirely.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2360-0'></a>
P. Lord, "Knowledge Blog", <i>Knowledge Blog</i>, 2009. <a href="http://knowledgeblog.org/">http://knowledgeblog.org/</a>


</li>
<li><a name='ITEM-2360-1'></a>
P. Lord, "Posting to a Knowledge Blog in asciidoc", <i>The Knowledgeblog Process</i>, 2011. <a href="http://process.knowledgeblog.org/167">http://process.knowledgeblog.org/167</a>


</li>
<li><a name='ITEM-2360-2'></a>
P. Lord, "New Day, New Blog", <i>An Exercise in Irrelevance</i>, 2009. <a href="http://www.russet.org.uk/blog/1175">http://www.russet.org.uk/blog/1175</a>


</li>
<li><a name='ITEM-2360-3'></a>
C. Boettiger, "Carl Boettiger", <i>Lab Notebook</i>, 2013. <a href="http://www.carlboettiger.info/index.html">http://www.carlboettiger.info/index.html</a>


</li>
<li><a name='ITEM-2360-4'></a>
T. Brown, "Posting blog entries to figshare", <i>Living in an Ivory Basement</i>, 0. <a href="http://ivory.idyll.org/blog/posting-blog-entries-to-figshare.html">http://ivory.idyll.org/blog/posting-blog-entries-to-figshare.html</a>


</li>
<li><a name='ITEM-2360-5'></a>
P. Lord, and S. Cockell, "The Problem with DOIs", <i>An Exercise in Irrelevance</i>, 2011. <a href="http://www.russet.org.uk/blog/1849">http://www.russet.org.uk/blog/1849</a>


</li>
<li><a name='ITEM-2360-6'></a>
L. Marshall, and P. Lord, "GreyCite", <i>GreyCite</i>. <a href="http://greycite.knowledgeblog.org/">http://greycite.knowledgeblog.org/</a>


</li>
<li><a name='ITEM-2360-7'></a>
"RCUK announces block grants for universities to aid drives to open access to research outputs", 2012. <a href="http://www.rcuk.ac.uk/media/news/2012news/Pages/121108.aspx">http://www.rcuk.ac.uk/media/news/2012news/Pages/121108.aspx</a>


</li>
<li><a name='ITEM-2360-8'></a>
A. Jha, "Open access to research is inevitable, says Nature editor-in-chief", <i>the Guardian</i>, 2012. <a href="http://www.guardian.co.uk/science/2012/jun/08/open-access-research-inevitable-nature-editor">http://www.guardian.co.uk/science/2012/jun/08/open-access-research-inevitable-nature-editor</a>


</li>
<li><a name='ITEM-2360-9'></a>
P. Lord, "The Naivete of Scientists", <i>An Exercise in Irrelevance</i>, 2011. <a href="http://www.russet.org.uk/blog/1924">http://www.russet.org.uk/blog/1924</a>


</li>
<li><a name='ITEM-2360-10'></a>
P. Lord, "Kblog Metadata Plugin", <i>Knowledge Blog</i>, 2012. <a href="http://knowledgeblog.org/kblog-metadata">http://knowledgeblog.org/kblog-metadata</a>


</li>
</ol>

</div> <!-- kcite-section 2360 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2360/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Open Access Response to HEFCE</title>
		<link>http://www.russet.org.uk/blog/2349</link>
		<comments>http://www.russet.org.uk/blog/2349#comments</comments>
		<pubDate>Mon, 25 Mar 2013 08:12:20 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2349</guid>
		<description><![CDATA[HEFCE is currently asking for feedback on the role of Open Access in the next REF. While I have a number of technical suggestions, I think that the biggest and best contribution that HEFCE could make to the next REF is to state publicly that all journal/conference/venue metadata be removed from papers [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2349">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Open+Access+Response+to+HEFCE&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-03-25&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2349&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>HEFCE is currently asking for <a href="http://www.hefce.ac.uk/whatwedo/rsrch/rinfrastruct/openaccess/">feedback</a> on the role of Open Access in the next REF. While I have a number of technical suggestions, I think that the biggest and best contribution that HEFCE could make to the next REF is to state publicly that all journal/conference/venue metadata be removed from papers before they are sent for review.</p>
<p>It is time that we stopped judging books by their cover. It would be a fantastic contribution if HEFCE could take a lead on this. This is my full response.</p>
<hr /> 
<h2><a name="_expectations_for_open_access"></a>Expectations for Open Access</h2>
<p>I feel that one key issue is missing from this document. Scientists still have problems in some areas (including mine of computing science) in that the &#8220;high-impact&#8221; journals or conferences often provide no or prohibitively expensive open access options. In the past, I have refused to publish in these journals because I wish my work to remain open access and instead published elsewhere. However, this works directly against my own interests in the current REF as the research will be judged less good. The use of journals as a primary indicator of quality also works against my ability to choose cheaper venues. Few people believe statements that research will not be judged on publication venue; indeed, as an individual academic, I have even been told to directly comment on the venue in my return.</p>
<p>One simple and yet enormous contribution that procedures for the next REF could make to Open Access is not to coerce, but to remove this enormous barrier. This could happen simply and straight-forwardly by removing all journal and publication venue metadata from papers when presented to reviewers. Of course, reviewers could work around this (the data is a google search away), but the message sent by such a step would be enormous.</p>
<p>The general expectations for OA publishing seem reasonable. However, I think I would add a further specific requirement. Currently, it is very hard to find the location of a green OA copy of any article. Making articles available is not enough; they must be discoverable. Therefore, I would suggest a specific requirement that a primary identifier (DOI, ISSN, ISBN or URL) <strong>must</strong> be present in the institutional repository, and this must be visible on the web page and present in computational metadata. Finally, making the paper discoverable is also not enough. There must be computational and human-readable metadata making clear the contents of the paper are Open Access; without this form of explicit statement, the only safe course of action for readers to take is to assume the copyright default position that you cannot use the material.</p>
<hr /> 
<h2><a name="_institutional_repositories"></a>Institutional Repositories</h2>
<p>Despite the significant investment, our experience is that few people ever retrieve data from institutional repositories. Partly, this is because it is difficult to link between articles on a journal website and articles in institutional repositories. As a second problem, institutional repositories provide an inconsistent experience, both for computational and human access. For instance, the presentation of identifiers such as DOIs is inconsistent. Even when present, DOIs are often inaccurate, containing syntactic errors which prevent their usage.</p>
<p>Ultimately, institutional repositories would be much better if there were a single infrastructure maintained at a national level (or international). In fact, a strong exemplar for this already exists in the form of arXiv. The ability to update the repository could be devolved to individual institutions. An authentication framework for this is already in place through JE-S.</p>
<p>Linking between institutional repositories and subject repositories unfortunately is likely to be difficult from a social perspective; there are many subject repositories and the institutional repositories are not likely to link to them well, because they are not experts in these repositories. This might be more plausible in a single national repository.</p>
<p>The better solution is to enable authors of papers to perform this linking. Scientists, who actually care that the links work and point to the correct place, are best placed to do this. This could be supported in the REF by making linking to data, software or other subject repositories an explicit criterion; this happens in some disciplines (for example, in bioinformatics a clear statement of if and where software is available and under what conditions is often asked for by reviewers).</p>
<hr /> 
<h2><a name="_approach_to_exceptions"></a>Approach to Exceptions</h2>
<p>If exceptions are to be for a transitional period, then any exceptions given should be marked with a &#8220;sell-by&#8221; date, after which they should no longer be considered valid.</p>
<p>It is worth reiterating that embargoes really only benefit the publishers; ensuring that the REF framework allows academics to choose their publication venue more freely, rather than effectively requiring them to publish in selected &#8220;high-impact&#8221; venues would enable them to choose venues with short, or no embargo period. The most effective mechanism for achieving this would be to remove all publication venue information from future REF returns. The research would be judged on the basis of the research, and not the publication venue.</p>
<hr /> 
<h2><a name="_open_data"></a>Open Data</h2>
<p>There is more complexity behind the requirement for open data than for open access, particularly where the data needs to remain confidential for reasons of data protection. Having said all of this, there are many disciplines (again bioinformatics is an obvious example) where the majority of data is open. Making a decision now to rule this out of scope, for a REF which may be a significant distance in future seems premature.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 2349 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2349/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The evil a space can do</title>
		<link>http://www.russet.org.uk/blog/2340</link>
		<comments>http://www.russet.org.uk/blog/2340#comments</comments>
		<pubDate>Thu, 21 Mar 2013 10:51:11 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>
		<category><![CDATA[greycite]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2340</guid>
		<description><![CDATA[Recently, I was contacted by a Kcite user who had found an interesting problem. They had cut and pasted a DOI from an American Society for Microbiology article [webcite], and then used this in a blog post. But it was not working. The user actually did identify the problem, which was a strange character in the DOI. [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2340">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=The+evil+a+space+can+do&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-03-21&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2340&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>Recently, I was contacted by a Kcite <span id="cite_ITEM-2340-0" name="citation"><a href="#ITEM-2340-0">[1]</a></span> user who had found an interesting problem. They had cut and pasted a DOI from an American Society for Microbiology <a href="http://aac.asm.org/content/55/4/1494">article</a> <a href="http://www.webcitation.org/6FHUw0icU">[webcite]</a>, and then used this in a blog post. But it was not working. The user actually did identify the problem, which was a strange character in the DOI.</p>
<p>So, I decided to investigate a bit further. Looking at the source for the page, the DOI appears mostly fine; it is not formatted according to the CrossRef display guidelines <span id="cite_ITEM-2340-1" name="citation"><a href="#ITEM-2340-1">[2]</a></span>, but they are hardly alone in this.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">&lt;span class="slug-doi"&gt;10.1128/​AAC.01664-10
&lt;/span&gt;</pre>
</td>
</tr>
</table>
<p>However, looking a bit further, at the raw bytes of this source, we see this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">00006260: 2020 2020 2020 2020 203c 7370 616e 2063           &lt;span c
00006270: 6c61 7373 3d22 736c 7567 2d64 6f69 223e  lass="slug-doi"&gt;
00006280: 3130 2e31 3132 382f e280 8b41 4143 2e30  10.1128/...AAC.0
00006290: 3136 3634 2d31 300a 2020 2020 2020 2020  1664-10.</pre>
</td>
</tr>
</table>
<p>The byte sequence &#8220;e2808b&#8221; is the UTF-8 encoding of U+200B, &#8220;zero width space&#8221;. The first time I saw this, my initial inclination was to suggest that it is the publishers being a pain and trying to prevent automatic harvesting of DOIs.</p>
<p>Actually, I suspect that this is not the case, as the DOI is in the page metadata:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">&lt;meta content="10.1128/AAC.01664-10" name="citation_doi" /&gt;</pre>
</td>
</tr>
</table>
<p>It is also present in multiple other locations, in their social bookmarking widgets. And there it is unmolested by spaces. So, why have they done this? The answer, I think, is that they display their DOI in a widget which is &#8220;cleverly&#8221; written to appear static on the screen (well, sort of, but this is a different story). And their widget is not wide enough; the zero-width space marks a place where a line break is permitted, so it allows them to control where the line break will happen. None the less, this piece of insanity prevents cutting and pasting of the DOI, and worse does so in a way which is very hard to detect, for humans at least. This kind of error even gets into institutional repositories, which significantly hinders their usefulness <span id="cite_ITEM-2340-2" name="citation"><a href="#ITEM-2340-2">[3]</a></span>. A quick check suggests this is ubiquitous for the American Society for Microbiology website. Consider:</p>
<ul> 
<li> <a href="http://jvi.asm.org/content/87/7/3903.full">http://jvi.asm.org/content/87/7/3903.full</a> </li>
<li> <a href="http://cmr.asm.org/content/26/1/2.full">http://cmr.asm.org/content/26/1/2.full</a> </li>
<li> <a href="http://jcm.asm.org/content/51/4/1066.full">http://jcm.asm.org/content/51/4/1066.full</a> </li>
</ul>
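<p>The zero-width space found above can be removed before a pasted DOI is used. What follows is a minimal sketch in Python, not anything that Kcite itself does; the function name <tt>clean_doi</tt> is hypothetical, and it assumes that dropping every Unicode format character (category Cf, which includes U+200B) is acceptable for a DOI.</p>

```python
import unicodedata


def clean_doi(text):
    """Remove invisible format characters and outer whitespace from a pasted DOI.

    Assumption: no legitimate DOI relies on a Unicode format (Cf) character.
    """
    # Category "Cf" covers U+200B (zero width space) and the zero-width joiners.
    visible = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return visible.strip()
```

<p>Filtering on the Unicode category, rather than on U+200B alone, also catches zero-width joiners and similar invisible characters.</p>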
<p>The CrossRef display guidelines are a little bit ambiguous here. Technically, as the zero-width space cannot be seen, it could be considered within the guidelines. I shall write to them to find out.</p>
<p>In case this article sounds overly pious, I have to raise my hand here in shame, as I have used the same technique for different purposes. An article that I published yesterday on inline citations for kcite <span id="cite_ITEM-2340-3" name="citation"><a href="#ITEM-2340-3">[4]</a></span> uses zero-width joiners to break up a short-code, so that it is displayed rather than interpreted. If the example is cut and pasted from the article into a new WordPress post, it will not work because of this. I will fix this soon, using unicode entities for the brackets instead.</p>
<p><strong>Update</strong></p>
<p>Thanks to some swift action by Geoff Bilder, CrossRef&#8217;s display guidelines have now been updated. While it will take a while, the knock-on effects of this change will be significant.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2340-0'></a>
S. Cockell, and P. Lord, "KCite Plugin", <i>Knowledge Blog</i>, 2011. <a href="http://knowledgeblog.org/kcite-plugin">http://knowledgeblog.org/kcite-plugin</a>


</li>
<li><a name='ITEM-2340-1'></a>
"DOI display guidelines", <a href="http://www.crossref.org/02publishers/doi_display_guidelines.html">http://www.crossref.org/02publishers/doi_display_guidelines.html</a>


</li>
<li><a name='ITEM-2340-2'></a>
J. Cope, "OA DOI resolver status update", <i>eRambler</i>, 2013. <a href="http://erambler.co.uk/blog/doi2oa-status-update/">http://erambler.co.uk/blog/doi2oa-status-update/</a>


</li>
<li><a name='ITEM-2340-3'></a>
P. Lord, "Inline Citations with Kcite", <i>The Knowledgeblog Process</i>, 2013. <a href="http://process.knowledgeblog.org/309">http://process.knowledgeblog.org/309</a>


</li>
</ol>

</div> <!-- kcite-section 2340 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2340/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Why Metadata Must be Useful</title>
		<link>http://www.russet.org.uk/blog/2336</link>
		<comments>http://www.russet.org.uk/blog/2336#comments</comments>
		<pubDate>Fri, 08 Mar 2013 11:05:17 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2336</guid>
		<description><![CDATA[Adding metadata to article could be done by many people. This could be the author, and in the ideal world, this would be the author. They know most about the content and are best placed to put the most knowledge into it. But, we have to answer the question, why would they do this? We [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2336">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Why+Metadata+Must+be+Useful&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-03-08&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2336&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>Adding metadata to an article could be done by many people. This could be the author, and in the ideal world, this <strong>would</strong> be the author. They know most about the content and are best placed to put the most knowledge into it. But, we have to answer the question, why would they do this? We have previously argued that semantic metadata must be useful to the people who are producing it <span id="cite_ITEM-2336-0" name="citation"><a href="#ITEM-2336-0">[1]</a></span>. For this, we need tools that extract and consume this metadata.</p>
<p>I discovered a nice example of this recently while reading an interesting paper from Yimei Zhu and Rob Procter <span id="cite_ITEM-2336-1" name="citation"><a href="#ITEM-2336-1">[2]</a></span>, investigating how PhD students use various tools to communicate. I was interested in citing this paper. The paper can be found on the web at the <a href="https://www.escholar.manchester.ac.uk/item/?pid=uk-ac-man-scw:187789">Manchester escholar site</a> <a href="http://www.webcitation.org/6Ev81Ndwo">[webcite]</a>. What metadata is in this page? Well, our Greycite <span id="cite_ITEM-2336-2" name="citation"><a href="#ITEM-2336-2">[3]</a></span> tool is designed for this purpose; unfortunately, it <a href="http://greycite.knowledgeblog.org/?uri=https://www.escholar.manchester.ac.uk/item/?pid=uk-ac-man-scw:187789">suggests</a> that there is very little in the way of metadata.</p>
<p>I contacted the escholar helpdesk, and they confirmed that there really is no embedded metadata; greycite has not just missed it. The strange thing is that, several days later, I managed to get to a <a href="http://www.manchester.ac.uk/escholar/uk-ac-man-scw:187789">very similar page</a> <a href="http://www.webcitation.org/6Ev7xe8Gd">[webcite]</a>. It has a different layout and colour scheme, but it&#8217;s clearly the same. The bibliographic metadata fields (not easily extractable, sadly!) appear identical. However, investigating the metadata in this page, we see a very different story. It is full of <a href="http://greycite.knowledgeblog.org/?uri=https://www.escholar.manchester.ac.uk/uk-ac-man-scw:187789">Dublin Core</a>. It&#8217;s not ideally laid out, but it is all that we need for citation.</p>
<p>Unfortunately, there is no link between the two, nor do I know why Manchester has these two different pages; perhaps one is designed to replace the other. And, of course, from the point of view of the reader, there is no reason why they would suspect that one contains metadata and the other does not.</p>
<p>The point here is not to criticise Manchester library services. Instead, it is to raise the question, why are the two locations so different in terms of their metadata? My suspicion is that the real answer is simple: very few people have noticed, and no one really cares. It might be argued that metadata must be correct to be useful. The evidence suggests that the inverse is true: metadata must be useful to be correct.</p>
<p>With tools like Greycite <span id="cite_ITEM-2336-2" name="citation"><a href="#ITEM-2336-2">[3]</a></span> and kblog-metadata <span id="cite_ITEM-2336-3" name="citation"><a href="#ITEM-2336-3">[4]</a></span>, making the metadata useful is a key aim. Using kcite, I can now reference any article here in journal, or at bio-ontologies <span id="cite_ITEM-2336-4" name="citation"><a href="#ITEM-2336-4">[5]</a></span>. So now kcite users care about the metadata. From this page you can download a bib file for this article, or even for every article on the site (all 500+). This metadata comes directly from Greycite, which in turn scrapes it from this website. So now the site operator (me!) cares. And, I use the bib files to drive the tools that I use to cite my own work. So, now, the author (also me!) cares.</p>
<p>There is a chicken-and-egg situation here; why write the tools to operate over metadata when no one is using the metadata? Fortunately, with kcite we have had a gradual path: first we used DOIs, then pubmed IDs, then arXiv, and now any URI at all. And with Greycite, we have used a lot of heuristics, and quite a few metadata formats. While it has been a significant amount of work, metadata is now making our lives easier. This is the way that it must be.</p>
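<p>To give a flavour of what consuming embedded metadata involves, here is a minimal sketch in Python that collects the name and content pairs from the meta elements of a page. It is an illustration only, and not how Greycite is implemented; the names <tt>MetaCollector</tt> and <tt>extract_metadata</tt> are hypothetical, and which field names a page uses (for example <tt>citation_doi</tt>, or Dublin Core names such as <tt>DC.title</tt>) is an assumption that has to be checked per site.</p>

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from the meta elements of an HTML page."""

    def __init__(self):
        super().__init__()
        # Maps a lower-cased field name to every value seen for it.
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.fields.setdefault(name.lower(), []).append(content)


def extract_metadata(html_text):
    """Return a dict of metadata field names to lists of values."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return collector.fields
```

<p>Field names are lower-cased because sites are inconsistent about capitalisation; a real tool would still need per-format heuristics on top of this.</p>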
<p><strong>Update</strong></p>
<p>Typographical correction.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2336-0'></a>
P. Lord, S. Cockell, and R. Stevens, "Three Steps to Heaven", <i>SePublica 2012</i>, 2012. <a href="http://www.russet.org.uk/blog/2054">http://www.russet.org.uk/blog/2054</a>


</li>
<li><a name='ITEM-2336-1'></a>
Y. Zhu, and R. Procter, "Use of blogs, Twitter and Facebook by PhD Students for Scholarly Communication: A UK study", <i>In:  2012 China New Media Communication Association Annual Conference, Macao International Conference ; 06 Dec 2012-08 Dec 2012; Macao . 2012.</i>, 2012. <a href="http://www.escholar.manchester.ac.uk/uk-ac-man-scw:187789">http://www.escholar.manchester.ac.uk/uk-ac-man-scw:187789</a>


</li>
<li><a name='ITEM-2336-2'></a>
L. Marshall, and P. Lord, "GreyCite", <i>GreyCite</i>. <a href="http://greycite.knowledgeblog.org/">http://greycite.knowledgeblog.org/</a>


</li>
<li><a name='ITEM-2336-3'></a>
P. Lord, "Kblog Metadata Plugin", <i>Knowledge Blog</i>, 2012. <a href="http://knowledgeblog.org/kblog-metadata">http://knowledgeblog.org/kblog-metadata</a>


</li>
<li><a name='ITEM-2336-4'></a>
P. Lord, "Table of Contents", <i>Bio-Ontologies SIG</i>, 2011. <a href="http://bio-ontologies.knowledgeblog.org/table-of-contents">http://bio-ontologies.knowledgeblog.org/table-of-contents</a>


</li>
</ol>

</div> <!-- kcite-section 2336 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2336/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Splitting a Mercurial Repository</title>
		<link>http://www.russet.org.uk/blog/2333</link>
		<comments>http://www.russet.org.uk/blog/2333#comments</comments>
		<pubDate>Sun, 03 Mar 2013 17:28:01 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2333</guid>
		<description><![CDATA[The Mercurial repository for KnowledgeBlog has been starting to show the strain for a while now. Firstly, when it was created we were all new to mercurial; for instance it contains the trunk directory which is really a Subversion metaphor. The second problem is that it is a single large repository, which maps to the [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2333">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Splitting+a+Mercurial+Repository&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-03-03&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2333&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>The <a href="http://code.google.com/p/knowledgeblog/">Mercurial repository</a> for KnowledgeBlog <span id="cite_ITEM-2333-0" name="citation"><a href="#ITEM-2333-0">[1]</a></span> has been starting to show the strain for a while now. Firstly, when it was created we were all new to Mercurial; for instance it contains the <tt>trunk</tt> directory which is really a Subversion metaphor. The second problem is that it is a single large repository, which maps to the development directory on my hard drive; there is now a lot of experimental software on my hard drive which I don&#8217;t want in a public environment, so I am now faced with either an enormous <tt>.hgignore</tt> or more &#8220;untracked&#8221; files than tracked. Not ideal.</p>
<p>At the same time, I have more recently moved mostly toward using git; actually, I still think Mercurial is nicer than git; the interface to the commands is cleaner, and the functionality is not that different. However, there is a fantastic UI, <a href="http://philjackson.github.com/magit/">magit</a>, for Emacs, while the equivalent for Mercurial is not as good. This is important to me. So, I wanted to try and address both of the issues at the same time; splitting the repository up, and moving to git.</p>
<p>The process for achieving this turned out to be relatively simple; Mercurial comes with a fantastic extension called convert. This is actually a general purpose extension to convert from other version control systems into Mercurial; however, it will also convert one hg repo to another. It has the ability to both filter the existing repo and rename locations at the same time. To create my new repository I used these commands:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">mkdir mathjax-latex-hg
cd mathjax-latex-hg
## create filemap.txt
hg init
hg convert --filemap filemap.txt devel-hg-old/ .</pre>
</td>
</tr>
</table>
<p>which create a new Mercurial repository, and convert the data from the old, tangled repository. The <tt>filemap.txt</tt> file contains a couple of lines only:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">include trunk/plugins/mathjax-latex
rename trunk/plugins/mathjax-latex .</pre>
</td>
</tr>
</table>
<p>These filter for just the mathjax-latex plugin and move all its files to the top level. This is the only part of the process that needs changing to export different parts of the repo, as I have done four times now. This now gives me a Mercurial repository in the right shape. Now, we create a new git repo, and import the untangled Mercurial repo into git. Again, reasonably straightforward. <tt>hg-fast-export</tt> is the name of the command on Ubuntu, which is more sensible than the original <tt>fast-export</tt>, which is both overly generic and a hostage to the future.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">cd ..
git init mathjax-latex-git
cd mathjax-latex-git
hg-fast-export -r ../mathjax-latex-hg
git checkout HEAD</pre>
</td>
</tr>
</table>
<p>Finally, the repo needs to be made publicly available, in this case on GitHub. And all is complete.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">git remote add origin git@github.com:phillord/mathjax-latex.git
git push -u origin master</pre>
</td>
</tr>
</table>
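<p>Since the filemap is the only part that changes from one export to the next, it could be generated rather than written by hand. A minimal sketch in Python, assuming the two-line include and rename layout shown earlier; <tt>make_filemap</tt> is a hypothetical helper and not part of Mercurial.</p>

```python
def make_filemap(subdirectory):
    """Return filemap text for hg convert that keeps one subdirectory
    and moves its contents to the top level of the new repository."""
    path = subdirectory.strip("/")
    if not path:
        raise ValueError("subdirectory must not be empty")
    # One line to filter, one line to relocate, as in the hand-written file.
    return "include {0}\nrename {0} .\n".format(path)
```

<p>Writing the result of <tt>make_filemap("trunk/plugins/mathjax-latex")</tt> to <tt>filemap.txt</tt> gives the same two lines as the file above; the other steps stay as they are.</p>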
<p>Of course, mathjax-latex does not actually need updating, because it is feature complete and working. However, the WordPress <a href="http://wordpress.org/extend/plugins/mathjax-latex/">plugin page</a> now includes a nasty warning, so I probably need to update it just to avoid this. Bit of a pain, especially as the only way of doing this involves updating the Subversion repository, which I don&#8217;t actually use.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2333-0'></a>
P. Lord, "Knowledge Blog", <i>Knowledge Blog</i>, 2009. <a href="http://knowledgeblog.org/">http://knowledgeblog.org/</a>


</li>
</ol>

</div> <!-- kcite-section 2333 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2333/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is Peer Review the Future?</title>
		<link>http://www.russet.org.uk/blog/2327</link>
		<comments>http://www.russet.org.uk/blog/2327#comments</comments>
		<pubDate>Fri, 22 Feb 2013 14:08:09 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2327</guid>
		<description><![CDATA[Today, I received an email from a journal, asking me if I would review a paper. The paper in question is by, among others, Iddo Friedberg, and can be read on arXiv . I&#8217;ve known Iddo Friedberg for a while; he was an earlier user of my semantic similarity work , for protein function prediction [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2327">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Is+Peer+Review+the+Future%3F&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-02-22&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2327&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>Today, I received an email from a journal, asking me if I would review a paper. The paper in question is by, among others, Iddo Friedberg, and can be read on arXiv <span id="cite_ITEM-2327-0" name="citation"><a href="#ITEM-2327-0">[1]</a></span>. I&#8217;ve known Iddo Friedberg for a while; he was an earlier user of my semantic similarity work <span id="cite_ITEM-2327-1" name="citation"><a href="#ITEM-2327-1">[2]</a></span>, for protein function prediction <span id="cite_ITEM-2327-2" name="citation"><a href="#ITEM-2327-2">[3]</a></span>, and was also the editor for our paper on realism in ontology development <span id="cite_ITEM-2327-3" name="citation"><a href="#ITEM-2327-3">[4]</a></span>. I would have liked to review this paper, and I feel a little bad because I know these things are important for the careers of the scientists.</p>
<p>So, why did I decline? Well, nice and simple; the page charges are just too high. There is no real justification for this as it can be done much more cheaply <span id="cite_ITEM-2327-4" name="citation"><a href="#ITEM-2327-4">[5]</a></span>&#8201;&#8212;&#8201;£200 or so seems reasonable; moreover, I think it is bad for science because it is one of the factors that cause authors to think very carefully, and often save up work for &#8220;a bigger publication&#8221;. This can delay publication for years after the work has happened. Scientists have to think carefully about their research, and their work; thinking about whether to publish now or later is one piece of baggage that we could do without <span id="cite_ITEM-2327-5" name="citation"><a href="#ITEM-2327-5">[6]</a></span>.</p>
<p>The real irony of the situation though is that the peer review for this paper is happening, and has already happened. The paper concerns bias in Gene Ontology annotations of protein functions. Iddo <a href="https://mailman.stanford.edu/pipermail/go-discuss/2013-January/006280.html">posted</a> his work to the various Gene Ontology mailing lists; unsurprisingly, the GO annotation team saw the paper, and <a href="http://www.ebi.ac.uk/training/online/trainers/rachael-huntley">Rachael Huntley</a> <a href="https://mailman.stanford.edu/pipermail/go-friends/2013-January/002045.html">responded</a>. The academic debate has started, and is in full swing. Others may see and contribute. And, frankly, the quality of the discussion going on there, and the depth of the analysis is higher than I would have given. No journal has been involved; it happened because there is a mailing list which the scientists in question used.</p>
<p>The current peer-review <strong>system</strong> does not add value; it is my peers and the scientific debate that do this. And this can, and will happen, regardless of the journals; indeed, in this case, why don&#8217;t the journal editors just read the mailing list?</p>
<p>So why do scientists, including myself, continue to publish in this way? It can often be difficult particularly where there are no open access options available <span id="cite_ITEM-2327-6" name="citation"><a href="#ITEM-2327-6">[7]</a></span>. We have to; it&#8217;s part, indeed, the main part of our assessment <span id="cite_ITEM-2327-7" name="citation"><a href="#ITEM-2327-7">[8]</a></span>. As I have said before, this is now the only reason I publish in this way <span id="cite_ITEM-2327-8" name="citation"><a href="#ITEM-2327-8">[9]</a></span>.</p>
<p>Having said this, I do have my doubts. I feel somewhat guilty toward Iddo Friedberg, for instance. There is also a degree of hypocrisy in this&#8201;&#8212;&#8201;I will still submit to journals (for my own sake, of course, but also for my PhD students); will people, perhaps, wish to not review my articles? What would happen if everybody thought like this (here, I can use the Yossarian defence: then I&#8217;d be a damn fool to think any different)? If I set the bar at £200, then who will I review for? Well, I do review for conferences and workshops where I can. Still, I feel that this is not enough; people review my work, I should review theirs. So, I state here, that subject to some time constraints, I will happily review work that is posted either to the web in this form, or to sites such as <a href="http://arxiv.org/">arXiv</a>. Reviews will be posted here, on this blog.</p>
<p>I have my doubts; but open access is not enough. Publication <strong>must</strong> get lighter, faster and much, much cheaper. I would welcome alternative courses of action.</p>
<p>Many thanks to <a href="http://blog.sjcockell.me/">Simon Cockell</a> and <a href="http://jamesmaloneebi.blogspot.co.uk/">James Malone</a> who peer-reviewed this post, and provided helpful comments. I am also grateful to <a href="http://iddo-friedberg.org/">Iddo Friedberg</a> who gave me permission to use the story about his paper in this way. The opinions expressed here are, however, my own.</p>
<p><strong>Update</strong></p>
<p>In response to feedback from <a href="http://svpow.com/">Mike Taylor</a>, it is worth pointing out that I do not review for paywall journals, and have not for quite a while.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2327-0'></a>
A.M. Schnoes, D.C. Ream, A.W. Thorman, P.C. Babbitt, and I. Friedberg, "Biases in the Experimental Annotations of Protein Function and their
  Effect on Our Understanding of Protein Function Space", <i>arXiv</i>, 2013. <a href="http://arxiv.org/abs/1301.1740">http://arxiv.org/abs/1301.1740</a>


</li>
<li><a name='ITEM-2327-1'></a>
P.W. Lord, R.D. Stevens, A. Brass, and C.A. Goble, "Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation", <i>Bioinformatics</i>, vol. 19, pp. 1275-1283, 2003. <a href="http://dx.doi.org/10.1093/bioinformatics/btg153">http://dx.doi.org/10.1093/bioinformatics/btg153</a>


</li>
<li><a name='ITEM-2327-2'></a>
I. Friedberg, M. Jambon, and A. Godzik, "New avenues in protein function prediction", <i>Protein Science</i>, vol. 15, pp. 1527-1529, 2006. <a href="http://dx.doi.org/10.1110/ps.062158406">http://dx.doi.org/10.1110/ps.062158406</a>


</li>
<li><a name='ITEM-2327-3'></a>
P. Lord, and R. Stevens, "Adding a Little Reality to Building Ontologies for Biology", <i>PLoS ONE</i>, vol. 5, pp. e12258, 2010. <a href="http://dx.doi.org/10.1371/journal.pone.0012258">http://dx.doi.org/10.1371/journal.pone.0012258</a>


</li>
<li><a name='ITEM-2327-4'></a>
R. Mounce, "The Gold OA plot v0.2", <i>Ross Mounce</i>, 2012. <a href="http://rossmounce.co.uk/2012/09/04/the-gold-oa-plot-v0-2/">http://rossmounce.co.uk/2012/09/04/the-gold-oa-plot-v0-2/</a>


</li>
<li><a name='ITEM-2327-5'></a>
P. Lord, "Why academic publishing is like a coffee shop", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2248">http://www.russet.org.uk/blog/2248</a>


</li>
<li><a name='ITEM-2327-6'></a>
P. Lord, "Open Access and the Semantic Web", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2157">http://www.russet.org.uk/blog/2157</a>


</li>
<li><a name='ITEM-2327-7'></a>
D. Colquhoun, "Is Queen Mary University of London trying to commit scientific suicide?", <i>DC's Improbable Science</i>, 2012. <a href="http://www.dcscience.net/?p=5388">http://www.dcscience.net/?p=5388</a>


</li>
<li><a name='ITEM-2327-8'></a>
P. Lord, "Bringing Things to Life", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2170">http://www.russet.org.uk/blog/2170</a>


</li>
</ol>

</div> <!-- kcite-section 2327 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2327/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing Times for Tawny</title>
		<link>http://www.russet.org.uk/blog/2324</link>
		<comments>http://www.russet.org.uk/blog/2324#comments</comments>
		<pubDate>Fri, 15 Feb 2013 12:56:22 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Tawny-OWL]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2324</guid>
		<description><![CDATA[Tawny OWL, my library for building ontologies is now reaching a nice stage of maturity; it is possible to build ontologies, reason over them and so forth. We have already started to use the programmable nature of Tawny, trivially with disjoints , as well as allowing the ontology developer to choose the identifiers that they [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2324">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Testing+Times+for+Tawny&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-02-15&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2324&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>Tawny OWL, my library for building ontologies <span id="cite_ITEM-2324-0" name="citation"><a href="#ITEM-2324-0">[1]</a></span> is now reaching a nice stage of maturity; it is possible to build ontologies, reason over them and so forth. We have already started to use the programmable nature of Tawny, trivially with disjoints <span id="cite_ITEM-2324-1" name="citation"><a href="#ITEM-2324-1">[2]</a></span>, as well as allowing the ontology developer to choose the identifiers that they use to interact with the concepts <span id="cite_ITEM-2324-2" name="citation"><a href="#ITEM-2324-2">[3]</a></span>. However, I wanted to explore further the usefulness of a programmatic environment.</p>
<p>One standard facility present in most languages is a test harness, and Clojure is no exception in this regard. Tawny already comes with a set of predicates for testing superclasses, both asserted and inferred, which provides a good basis for unit testing. So, my test Pizza ontology provides a nice example, essentially testing definitions for <tt>CheesyPizza</tt>&#8201;&#8212;&#8201;the tests should cover both positive and negative cases.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>deftest CheesyShort
  <font color="#FF0000">(</font>is <font color="#FF0000">(</font>r/isuperclass? p/FourCheesePizza p/CheesyPizza<font color="#FF0000">))</font>
  <font color="#FF0000">(</font>is <font color="#FF0000">(</font>r/isuperclass? p/MargheritaPizza p/CheesyPizza<font color="#FF0000">))</font>
  <font color="#FF0000">(</font>is
   <font color="#FF0000">(</font>not <font color="#FF0000">(</font>r/isuperclass? p/MargheritaPizza p/FourCheesePizza<font color="#FF0000">))))</font></tt></pre>
</td>
</tr>
</table>
<p>While this is nice, it is not enough in some cases where I want to test that things do not happen. For this I introduce a new macro, <tt>with-probe-entities</tt>, which adds &#8220;probe classes&#8221; into the ontology&#8201;&#8212;&#8201;that is, a class which is there only for the purpose of a test. In this case, I test the definition of <tt>VegetarianPizza</tt> to see whether <tt>MargheritaPizza</tt> reasons correctly as a subclass. Additionally, though, I also check to see whether a subclass of <tt>VegetarianPizza</tt> and <tt>CajunPizza</tt>&#8201;&#8212;&#8201;which contains sausage&#8201;&#8212;&#8201;is inconsistent. This test could be more specific, as it tests for general coherency, although I do check for this independently. The <tt>with-probe-entities</tt> macro cleans up after itself. All entities (which can be of any kind and not just classes) are removed from the ontology afterwards, so independence of testing is not compromised.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>deftest VegetarianPizza
  <font color="#FF0000">(</font>is
   <font color="#FF0000">(</font>r/isuperclass? p/MargheritaPizza p/VegetarianPizza<font color="#FF0000">))</font>

  <font color="#FF0000">(</font>is
   <font color="#FF0000">(</font>not
    <font color="#FF0000">(</font>o/with-probe-entities
      [c <font color="#FF0000">(</font>o/owlclass <font color="#FF0000">"probe"</font>
                     <b><font color="#000080">:subclass</font></b> p/VegetarianPizza p/CajunPizza<font color="#FF0000">)</font>]
      <font color="#FF0000">(</font>r/coherent?<font color="#FF0000">)))))</font></tt></pre>
</td>
</tr>
</table>
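<p>The clean-up behaviour can be sketched in a few lines of plain Clojure. This is not Tawny&#8217;s implementation, just the general shape: the <tt>ontology</tt> atom and the <tt>with-probe-names</tt> macro are hypothetical stand-ins, with a set of strings playing the part of the ontology.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><tt>;; stand-in for an ontology: just a set of entity names
(def ontology (atom #{}))

(defmacro with-probe-names
  "Hypothetical stand-in for with-probe-entities: add the names,
  run the body, then remove the names even if the body throws."
  [names &amp; body]
  `(let [names# ~names]
     (swap! ontology into names#)
     (try
       ~@body
       (finally
         (apply swap! ontology disj names#)))))

;; the probe is visible inside the body ...
(def seen-inside
  (with-probe-names ["probe"]
    (contains? @ontology "probe")))

;; ... and gone again afterwards
(def seen-after (contains? @ontology "probe"))</tt></pre>
</td>
</tr>
</table>
<p>Doing the removal in <tt>finally</tt> means a failing test body cannot leave probe entities behind for the next test.</p>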
<p>Of course, a natural consequence of the addition of tests is the desire to run them frequently; moreover, the desire to run them in a clean environment. The solution to this turns out to be simple. <a href="https://travis-ci.org/">Travis-CI</a> integrates nicely with GitHub, so the addition of a simple YAML file of this form enables continuous integration of both the Pizza ontology and its environment (such as Tawny, for instance).</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">language: clojure
lein: lein2
jdk:
  - openjdk7</pre>
</td>
</tr>
</table>
<p>The output of this process is <a href="https://travis-ci.org/phillord/tawny-pizza">available</a> for all to read, along with the tests for my mavenized version of <a href="https://travis-ci.org/phillord/hermit-maven">Hermit</a>, and also <a href="https://travis-ci.org/phillord/tawny-owl">tawny</a> itself. This is not the first time that ontologies have been continuously integrated <span id="cite_ITEM-2324-3" name="citation"><a href="#ITEM-2324-3">[4]</a></span>; however, the nice advantage of this is that I have not had to install anything. It even works against external ontologies: so we have both <a href="https://travis-ci.org/phillord/tawny-go">GO</a> and <a href="https://travis-ci.org/phillord/tawny-obi">OBI</a>. Currently, these work against static versions of GO and OBI. I could automate this process from the respective repositories of these projects, by pulling with git-svn and pushing again to github.</p>
<p>All in all, though, the process of recasting ontology building as a programming task is turning out to be an interesting experience. Much of the tooling that enables collaborative ontology building just works. It holds much promise for the future.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2324-0'></a>
P. Lord, "OWL Concepts as Lisp Atoms", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2254">http://www.russet.org.uk/blog/2254</a>


</li>
<li><a name='ITEM-2324-1'></a>
P. Lord, "Disjoints in Clojure-owl", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2275">http://www.russet.org.uk/blog/2275</a>


</li>
<li><a name='ITEM-2324-2'></a>
P. Lord, "Clojure OWL 0.2", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2303">http://www.russet.org.uk/blog/2303</a>


</li>
<li><a name='ITEM-2324-3'></a>
C.J. Mungall, H. Dietze, S.J. Carbon, A. Ireland, S. Bauer, and S. Lewis, "Continuous Integration of Open Biological Ontology Libraries", 2012. <a href="http://bio-ontologies.knowledgeblog.org/405">http://bio-ontologies.knowledgeblog.org/405</a>


</li>
</ol>

</div> <!-- kcite-section 2324 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2324/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Computing Publication Online</title>
		<link>http://www.russet.org.uk/blog/2322</link>
		<comments>http://www.russet.org.uk/blog/2322#comments</comments>
		<pubDate>Fri, 08 Feb 2013 12:22:14 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2322</guid>
		<description><![CDATA[A lot has been said about the scientific publication process, and how the publishers add value. I have commented before on the joys of being asked to pay extra page charges for colour pixels , which as a naive scientist I would think costs the same as black and white ones. I am not always [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2322">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Computing+Publication+Online&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-02-08&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2322&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>A lot has been said about the scientific publication process, and how the publishers add value. I have commented before on the joys of being asked to pay extra page charges for colour pixels <span id="cite_ITEM-2322-0" name="citation"><a href="#ITEM-2322-0">[1]</a></span>, which as a naive scientist <span id="cite_ITEM-2322-1" name="citation"><a href="#ITEM-2322-1">[2]</a></span> I would think costs the same as black and white ones. I am not always convinced of the value that is bought but even I am occasionally surprised by how paleolithic the industry can be. An example is a new article on <a href="http://www.computer.org/portal/web/computingnow/content?g=53319&amp;type=article&amp;urlTitle=clojure-for-number-crunching-on-multicore-machin-1">Clojure for concurrency</a>.</p>
<p>The article itself is fine, and I make no comments on it. But the display is highly interesting. Firstly, it uses a nice Javascript paging interface in case your browser is not equipped with scrollbars. If you manage to defeat this (there is a &#8220;Display stuff on one page&#8221; button), then scroll down to Figure 2 which is a code listing. It is split into two parts: Figure 2 and Figure 2 (continued)&#8201;&#8212;&#8201;if anyone can tell me why this is a good idea on the web, I&#8217;d be interested. Even better, it&#8217;s a GIF. An image, of a piece of code. Not easy to say for sure, but it even looks rather like an image of a print out.</p>
<p>Still, all is saved, because if you scroll to the end, you find that all the code is on github. There is even a URL there which takes you straight to it; or, rather, you can cut-and-paste it into your browser, because it&#8217;s not hyperlinked, and neither are any of the others.</p>
<p>A quick check suggests that this is <a href="http://www.computer.org/portal/web/computingnow/content?g=53319&amp;type=article&amp;urlTitle=caring-for-your-data">normal practice</a>. All a bit of an epic fail really.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2322-0'></a>
P. Lord, "Bringing Things to Life", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2170">http://www.russet.org.uk/blog/2170</a>


</li>
<li><a name='ITEM-2322-1'></a>
P. Lord, "The Naivete of Scientists", <i>An Exercise in Irrelevance</i>, 2011. <a href="http://www.russet.org.uk/blog/1924">http://www.russet.org.uk/blog/1924</a>


</li>
</ol>

</div> <!-- kcite-section 2322 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2322/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Remembering the World as it used to be</title>
		<link>http://www.russet.org.uk/blog/2316</link>
		<comments>http://www.russet.org.uk/blog/2316#comments</comments>
		<pubDate>Fri, 11 Jan 2013 16:50:37 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Clojure-owl]]></category>
		<category><![CDATA[Tawny-OWL]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2316</guid>
		<description><![CDATA[I have been working on a Clojure library for developing OWL ontologies . There have been two significant advances with this library recently. First, I have changed its name from clojure-owl to tawny-owl. I was never really happy with the original name; I think it is bad practice to name something after the language it [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2316">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Remembering+the+World+as+it+used+to+be&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2013-01-11&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2316&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>I have been working on a Clojure library for developing OWL ontologies <span id="cite_ITEM-2316-0" name="citation"><a href="#ITEM-2316-0">[1]</a></span>. There have been two significant advances with this library recently. First, I have changed its name from clojure-owl to tawny-owl. I was never really happy with the original name; I think it is bad practice to name something after the language it uses (even partly, as the many jlibraries attest), and there were several other libraries around for manipulating OWL in Clojure, albeit in different ways. &#8220;Tawny&#8221; is simple, straightforward and memorable, I think. At the same time, I moved to <a href="https://github.com/phillord/tawny-owl">Github</a> because I can now just update <tt>readme.md</tt>, rather than having to update a separate website.</p>
<p>Perhaps more importantly, I have put in new code for handling changes to external ontologies, which is particularly important for external libraries.</p>
<p>Throughout the development of tawny-owl, I have focused on providing an environment that is easy to use for the developer; so, classes, properties and other entities are represented as lisp symbols <span id="cite_ITEM-2316-1" name="citation"><a href="#ITEM-2316-1">[2]</a></span>. This works well and produces very attractive looking code in, for example, my version of the <a href="http://code.google.com/p/pizza-clj/">Pizza ontology</a>. I have also written code so that ontologies only available as OWL files can be treated as first-class citizens: very easy in a highly dynamic language like Lisp.</p>
<p>However, it causes problems when combined with an ontology such as OBI <span id="cite_ITEM-2316-2" name="citation"><a href="#ITEM-2316-2">[3]</a></span>. The difficulty here is that OBI uses semantics-free identifiers <span id="cite_ITEM-2316-3" name="citation"><a href="#ITEM-2316-3">[4]</a></span>. While there are some good reasons for this, it would result in Clojure of the form:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> OBI:0034322
     <b><font color="#000080">:subclass</font></b> OBI:0034321<font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>Clearly, this is not good, and something that I want to avoid. So, instead, we apply a transform function to OBI when importing it; basically, this munges the <tt>rdfs:label</tt> annotation, turning it into something that is a legal Clojure symbol.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt>  <b><font color="#000080">:transform</font></b>
  <i><font color="#9A1900">;; fix the space problem</font></i>
  <font color="#FF0000">(</font>fn [e]
    <font color="#FF0000">(</font>clojure.string/replace
     <i><font color="#9A1900">;; with luck these will always be literals, so we can do this</font></i>
     <i><font color="#9A1900">;; although not true in general</font></i>
     <font color="#FF0000">(</font>.getLiteral
      <i><font color="#9A1900">;; get the value of the annotation</font></i>
      <font color="#FF0000">(</font>.getValue
       <font color="#FF0000">(</font>first
        <i><font color="#9A1900">;; filter for annotations which are labels</font></i>
        <i><font color="#9A1900">;; is lazy, so doesn't eval all</font></i>
        <font color="#FF0000">(</font>filter
         #<font color="#FF0000">(</font>.. % <font color="#FF0000">(</font>getProperty<font color="#FF0000">)</font> <font color="#FF0000">(</font>isLabel<font color="#FF0000">))</font>
         <i><font color="#9A1900">;; get the annotations</font></i>
         <font color="#FF0000">(</font>.getAnnotations e
                          <font color="#FF0000">(</font>owl.owl/get-current-jontology<font color="#FF0000">))))))</font>
     #<font color="#FF0000">"[ /]"</font> <font color="#FF0000">"_"</font>
     <font color="#FF0000">))</font></tt></pre>
</td>
</tr>
</table>
<p>All well and good. However, there is a problem. The label in OBI has two characteristics. First, it is human readable, which is good, and the reason why we are using it. Second, however, it does not carry formal semantics; the developers are free to change these labels whenever they like. Of course, any ontology that I build against my tawnyized version of OBI will break when a label changes. This is not a problem for a GUI like Protege because, perhaps ironically, GUIs are not WYSIWYG: what you see is actually a view of the underlying data model. So, Protege shows you the label, but actually you are manipulating the URI. A dependency can change its labels, and when Protege reloads it, the new labels are what the developer will see.</p>
<p>With code, on the other hand, there is no separation at all. If the label changes, I will <strong>have</strong> to update anything that refers to it, which seems a substantial problem. However, I have now managed to work around this. My new library <tt>memorise</tt> saves all the mappings into a file, then restores them when OBI is loaded. Any old labels that no longer exist but which point to an IRI that still does exist are generated as duplicate symbols pointing to the same OWL object; however, I have done this in such a way that they will emit warnings both when loading and during use, with a description of the new symbol name. This data would also make automatic upgrading possible, of course, using Clojure to perform a big search and replace on the source code. I think that this is a nicer solution than the denormalisation <span id="cite_ITEM-2316-4" name="citation"><a href="#ITEM-2316-4">[5]</a></span> or &#8220;colour cube&#8221; solution <span id="cite_ITEM-2316-3" name="citation"><a href="#ITEM-2316-3">[4]</a></span> that I previously suggested for Manchester syntax. It also shows off the advantage of using a programming language, rather than a static format; I, or any other user of the library, can just add this, as I choose, without having to wait for a standardisation process and tool support to catch up.</p>
<p>This will still leave a secondary problem; it is dependent on the IRI which for pre-release versions of OBI is not fixed, as <a href="http://docs.google.com/Doc?id=dzprnmw_7fdhfg62j">documented</a>. Of course, this problem could go away, if OBI used a tool like <a href="http://code.google.com/p/urigen/">URIGen</a>, or alternatively if OBI released more regularly. Still, the data should also allow a reverse lookup&#8201;&#8212;&#8201;finding out what IRI a label now has.</p>
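<p>The core of the renaming check can be sketched in plain Clojure. This is not the code in <tt>memorise</tt>; the function name <tt>renamed-labels</tt>, the two maps and the example.com IRIs are all invented for illustration, with each map going from symbol name to IRI.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><tt>(require '[clojure.set :as set])

;; invented data: symbol name to IRI, as saved last time ...
(def saved   {"old_label"  "http://example.com/X_0000001"
              "kept_label" "http://example.com/X_0000002"})
;; ... and as found in the freshly loaded ontology
(def current {"new_label"  "http://example.com/X_0000001"
              "kept_label" "http://example.com/X_0000002"})

(defn renamed-labels
  "Map each saved name that has disappeared to the current name
  of the same IRI, provided that IRI still exists."
  [saved current]
  (let [iri-to-name (set/map-invert current)] ; the reverse lookup
    (into {}
          (for [[old-name iri] saved
                :when (and (not (contains? current old-name))
                           (contains? iri-to-name iri))]
            [old-name (iri-to-name iri)]))))

(renamed-labels saved current)
;; =&gt; {"old_label" "new_label"}</tt></pre>
</td>
</tr>
</table>
<p>Each pair in the result is enough to generate a duplicate symbol with a warning, or to drive a search and replace over the source.</p>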
<p>I think these are the main tools that are needed to build against an external resource. The 0.5 version of Tawny is now available on Clojars and Github.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2316-0'></a>
P. Lord, "Programming OWL", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2214">http://www.russet.org.uk/blog/2214</a>


</li>
<li><a name='ITEM-2316-1'></a>
P. Lord, "OWL Concepts as Lisp Atoms", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2254">http://www.russet.org.uk/blog/2254</a>


</li>
<li><a name='ITEM-2316-2'></a>
R.R. Brinkman, M. Courtot, D. Derom, J.M. Fostel, Y. He, P. Lord, J. Malone, H. Parkinson, B. Peters, P. Rocca-Serra, A. Ruttenberg, S. Sansone, L.N. Soldatova, C.J. Stoeckert, J.A. Turner, and J. Zheng, "Modeling biomedical experimental processes with OBI", <i>Journal of Biomedical Semantics</i>, vol. 1, pp. S7, 2010. <a href="http://dx.doi.org/10.1186/2041-1480-1-S1-S7">http://dx.doi.org/10.1186/2041-1480-1-S1-S7</a>


</li>
<li><a name='ITEM-2316-3'></a>
P. Lord, "Semantics-Free Ontologies", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2040">http://www.russet.org.uk/blog/2040</a>


</li>
<li><a name='ITEM-2316-4'></a>
P. Lord, "OBO Format and Manchester Syntax", <i>An Exercise in Irrelevance</i>, 2009. <a href="http://www.russet.org.uk/blog/1470">http://www.russet.org.uk/blog/1470</a>


</li>
</ol>

</div> <!-- kcite-section 2316 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2316/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Clojure OWL 0.2</title>
		<link>http://www.russet.org.uk/blog/2303</link>
		<comments>http://www.russet.org.uk/blog/2303#comments</comments>
		<pubDate>Mon, 03 Dec 2012 16:24:28 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Clojure-owl]]></category>
		<category><![CDATA[Tawny-OWL]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2303</guid>
		<description><![CDATA[I have been developing a library written in Clojure, that I can use for building OWL ontologies programmatically . The basic idea behind this library is to give me something that looks like Manchester syntax , but which is none the less fully programmatic; it can be extended arbitrarily, both for general use and for [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2303">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Clojure+OWL+0.2&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2012-12-03&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2303&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>I have been developing a library written in <a href="http://www.clojure.org">Clojure</a>, that I can use for building OWL ontologies programmatically <span id="cite_ITEM-2303-0" name="citation"><a href="#ITEM-2303-0">[1]</a></span>. The basic idea behind this library is to give me something that looks like Manchester syntax <span id="cite_ITEM-2303-1" name="citation"><a href="#ITEM-2303-1">[2]</a></span>, but which is none the less fully programmatic; it can be extended arbitrarily, both for general use and for one-off, single ontology specific custom code.</p>
<p>This has already shown its worth: for example, adding a syntax for &#8220;some and only&#8221; closure axioms was straightforward; likewise, I can now express disjoints and subclasses implicitly through bracket placement, rather than through named concepts <span id="cite_ITEM-2303-2" name="citation"><a href="#ITEM-2303-2">[3]</a></span>. Although it is in its early days, I have added initial support for ontology design patterns (in this case a value partition), which I think will be extended significantly in future versions. I have used this in my version of the <a href="http://code.google.com/p/pizza-clj/">Pizza ontology</a> which seemed as good a demonstrator to start with as any <span id="cite_ITEM-2303-3" name="citation"><a href="#ITEM-2303-3">[4]</a></span>; this also contains some custom &#8220;one-off&#8221; patterns, for building &#8220;named&#8221; pizzas. Taken together these were enough to constitute the first (unheralded) 0.1 of Clojure-OWL.</p>
<p>However for 0.2, I wanted one more feature that I think makes this now a usable alternative for developing ontologies. I wanted to be able to address ontologies that were built using other technologies, which were accessible only as an OWL file. Of course, the library has always had the ability to build classes using URIs as strings; this facility means that it is possible to address another ontology. However, I wanted ontologies read from OWL files to be first-class citizens; classes and properties should be represented as lisp symbols <span id="cite_ITEM-2303-4" name="citation"><a href="#ITEM-2303-4">[5]</a></span>, providing a degree of safety to the system: it is not possible to refer to a concept not previously defined, nor use a concept where a property is needed.</p>
<p>This turned out to be reasonably straightforward; Clojure-OWL now maps an individual ontology to a Clojure namespace. Reading an ontology in from an OWL file is reasonably simple using the OWL API; finally, as a highly dynamic language, Clojure can create new symbols on the fly with ease. To test this out, I needed a reasonably large and complex ontology: I chose OBI <span id="cite_ITEM-2303-5" name="citation"><a href="#ITEM-2303-5">[6]</a></span> for reasons of familiarity.</p>
<p>The process of integrating it into Clojure-OWL starts to show the power of this approach. A basic outline of the code to achieve this is simple enough. It requires a location, prefix and an identifier. The location is generic, including a stream, so could be anything. I have included &#8220;obi.owl&#8221; as a class resource; I can use a URL, but accessing the network every time I wish to use things is a pain, although this would effectively provide a form of continuous integration.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>defread obi
  <i><font color="#9A1900">;; something that the OWL API can interpret. This includes a stream, so</font></i>
  <i><font color="#9A1900">;; it's totally generic.</font></i>
  <b><font color="#000080">:location</font></b> <font color="#FF0000">(</font>IRI/create <font color="#FF0000">(</font>clojure.java.io/resource <font color="#FF0000">"obi.owl"</font><font color="#FF0000">))</font>
  <i><font color="#9A1900">;; the prefix that you want to use in this case</font></i>
  <b><font color="#000080">:prefix</font></b> <font color="#FF0000">"obo"</font>
  <i><font color="#9A1900">;; normally only things from this IRI will be imported</font></i>
  <b><font color="#000080">:iri</font></b> <font color="#FF0000">"http://purl.obolibrary.org/obo/"</font>
  <font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>On its own though, OBI contains a large number of concepts from many different ontologies. Normally, I filter for only entities whose identifier starts with the IRI above. This fails with OBO ontologies which use a sort of namespacing mechanism and a numeric identifier. So I need to apply a custom filter.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt>  <b><font color="#000080">:filter</font></b>
  <font color="#FF0000">(</font>fn [e]
    <font color="#FF0000">(</font>and <font color="#FF0000">(</font>instance? OWLNamedObject e<font color="#FF0000">)</font>
         <font color="#FF0000">(</font>.startsWith
          <font color="#FF0000">(</font>.toString <font color="#FF0000">(</font>.getIRI e<font color="#FF0000">))</font>
          <font color="#FF0000">"http://purl.obolibrary.org/obo/OBI"</font>
          <font color="#FF0000">))</font>
    <font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>I can think of many other uses for this sort of filtering; if I want to include a subset of entities then this would work also.</p>
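<p>The prefix test at the heart of the filter can be tried on plain strings, without the OWL API. A minimal sketch: the helper name <tt>obi-iri?</tt> and the list of IRIs are made up for the example.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><tt>;; hypothetical helper: the same prefix check as the filter above,
;; applied to an IRI that is already a string
(defn obi-iri?
  [^String iri]
  (.startsWith iri "http://purl.obolibrary.org/obo/OBI"))

;; example input: one OBI term and one term from another OBO namespace
(def iris ["http://purl.obolibrary.org/obo/OBI_0000107"
           "http://purl.obolibrary.org/obo/IAO_0000115"])

(filter obi-iri? iris)
;; =&gt; ("http://purl.obolibrary.org/obo/OBI_0000107")</tt></pre>
</td>
</tr>
</table>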
<p>The next problem is OBI&#8217;s use of semantics-free identifiers <span id="cite_ITEM-2303-6" name="citation"><a href="#ITEM-2303-6">[7]</a></span>. Even if the reasons behind this decision are good, the resulting numeric atoms (<tt>OBI_0000107</tt>) are useless; I want to be able to say <tt>provides_service_consumer_with</tt>. So for this I use a custom transform function. This forms the name of the lisp symbol from the label instead, with a regexp fix to replace characters which are illegal with underscores: spaces for obvious reasons, and &#8220;/&#8221; which Clojure uses as a namespace qualifier.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt>  <b><font color="#000080">:transform</font></b>
  <i><font color="#9A1900">;; fix the space problem</font></i>
  <font color="#FF0000">(</font>fn [e]
    <font color="#FF0000">(</font>clojure.string/replace
     <i><font color="#9A1900">;; with luck these will always be literals, so we can do this</font></i>
     <i><font color="#9A1900">;; although not true in general</font></i>
     <font color="#FF0000">(</font>.getLiteral
      <i><font color="#9A1900">;; get the value of the annotation</font></i>
      <font color="#FF0000">(</font>.getValue
       <font color="#FF0000">(</font>first
        <i><font color="#9A1900">;; filter for annotations which are labels</font></i>
        <i><font color="#9A1900">;; is lazy, so doesn't eval all</font></i>
        <font color="#FF0000">(</font>filter
         #<font color="#FF0000">(</font>.. % <font color="#FF0000">(</font>getProperty<font color="#FF0000">)</font> <font color="#FF0000">(</font>isLabel<font color="#FF0000">))</font>
         <i><font color="#9A1900">;; get the annotations</font></i>
         <font color="#FF0000">(</font>.getAnnotations e
                          <font color="#FF0000">(</font>owl.owl/get-current-jontology<font color="#FF0000">))))))</font>
     #<font color="#FF0000">"[ /]"</font> <font color="#FF0000">"_"</font>
     <font color="#FF0000">))</font></tt></pre>
</td>
</tr>
</table>
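<p>The regexp step on its own is easy to check at the REPL. In this sketch the helper name <tt>label-to-symbol-name</tt> is made up, and the label is passed in as a plain string rather than being pulled from an annotation.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><tt>(require '[clojure.string :as string])

;; hypothetical helper: only the string munging from the transform above
(defn label-to-symbol-name
  [label]
  (string/replace label #"[ /]" "_"))

(label-to-symbol-name "provides service consumer with")
;; =&gt; "provides_service_consumer_with"</tt></pre>
</td>
</tr>
</table>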
<p>The final addition is the ability to import an ontology into the current one; without this, references to another ontology will share URIs, but will not pull the referenced ontology with all its axioms into the current namespace, and reasoning will not work as expected. This is achieved with a single form:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>owlimport obi<font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>Unfortunately, I have had to disable HermiT functionality; our current <a href="https://github.com/phillord/hermit-maven">mavenized version</a> of HermiT <span id="cite_ITEM-2303-7" name="citation"><a href="#ITEM-2303-7">[8]</a></span> is working, but failing a few tests because of incompatibilities with the current OWL API. This will be re-enabled in the new version.</p>
<p>Taken together, I think, clojure-owl now represents a reasonable programmatic environment for OWL. We now have the tools we need to replicate the essential functionality of a tool like Protege; not that I am trying to replace Protege, as I still use it as a viewer for my generated ontologies. But, moreover, I can now extend this functionality. As well as importing an ontology, I can filter the import so that only certain entities are available, an ad-hoc form of privacy. In later versions, I will probably add more explicit support for this. We can now package an OWL ontology in a Jar and publish it to any Maven repository. You may love or hate Maven (generally, the latter), but being able to resolve dependencies is a strong point, especially as it brings versioning with it.</p>
<p>Release 0.2 is now available on <a href="https://clojars.org/uk.org.russet/clojure-owl">Clojars</a> or <a href="http://code.google.com/p/clojure-owl/">Google Code</a>.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2303-0'></a>
P. Lord, "Programming OWL", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2214">http://www.russet.org.uk/blog/2214</a>


</li>
<li><a name='ITEM-2303-1'></a>
M. Horridge, and P.F. Patel-Schneider, "OWL 2 Web Ontology Language
Manchester Syntax (Second
Edition)", <i>W3C</i>, 2012. <a href="http://www.w3.org/TR/owl2-manchester-syntax/">http://www.w3.org/TR/owl2-manchester-syntax/</a>


</li>
<li><a name='ITEM-2303-2'></a>
P. Lord, "Disjoints in Clojure-owl", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2275">http://www.russet.org.uk/blog/2275</a>


</li>
<li><a name='ITEM-2303-3'></a>
. @wordpressdotcom, "Why the Pizza Ontology Tutorial?", <i>Robert Stevens' Blog</i>, 2010. <a href="http://robertdavidstevens.wordpress.com/2010/01/22/why-the-pizza-ontology-tutorial/">http://robertdavidstevens.wordpress.com/2010/01/22/why-the-pizza-ontology-tutorial/</a>


</li>
<li><a name='ITEM-2303-4'></a>
P. Lord, "OWL Concepts as Lisp Atoms", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2254">http://www.russet.org.uk/blog/2254</a>


</li>
<li><a name='ITEM-2303-5'></a>
R.R. Brinkman, M. Courtot, D. Derom, J.M. Fostel, Y. He, P. Lord, J. Malone, H. Parkinson, B. Peters, P. Rocca-Serra, A. Ruttenberg, S. Sansone, L.N. Soldatova, C.J. Stoeckert, J.A. Turner, and J. Zheng, "Modeling biomedical experimental processes with OBI", <i>Journal of Biomedical Semantics</i>, vol. 1, pp. S7, 2010. <a href="http://dx.doi.org/10.1186/2041-1480-1-S1-S7">http://dx.doi.org/10.1186/2041-1480-1-S1-S7</a>


</li>
<li><a name='ITEM-2303-6'></a>
P. Lord, "Semantics-Free Ontologies", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2040">http://www.russet.org.uk/blog/2040</a>


</li>
<li><a name='ITEM-2303-7'></a>
"HermiT Reasoner: Home", <a href="http://hermit-reasoner.com/">http://hermit-reasoner.com/</a>


</li>
</ol>

</div> <!-- kcite-section 2303 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2303/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Publishing With Future Internet</title>
		<link>http://www.russet.org.uk/blog/2297</link>
		<comments>http://www.russet.org.uk/blog/2297#comments</comments>
		<pubDate>Mon, 26 Nov 2012 15:30:08 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2297</guid>
		<description><![CDATA[I have previously described the difficulty that we have had publishing in semantic web conferences ; the two main conferences (ESWC and ISWC) both publish with Springer-Verlag, and so provide no open access option. Although the contents of the paper has been made available now both through arXiv and here, we decided that in the [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2297">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Publishing+With+Future+Internet&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2012-11-26&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2297&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>I have previously described the difficulty that we have had publishing in semantic web conferences <span id="cite_ITEM-2297-0" name="citation"><a href="#ITEM-2297-0">[1]</a></span>; the two main conferences (ESWC and ISWC) both publish with Springer-Verlag, and so provide no open access option.</p>
<p>Although the contents of the paper have now been made available both through <a href="http://arxiv.org/abs/1206.5135v1">arXiv</a> and <a href="http://www.russet.org.uk/blog/2054">here</a>, we decided that in the middle of REF madness, it did not make sense to let the work lie there. So, where to publish?</p>
<p>Well, I was inspired by Ross Mounce&#8217;s post <span id="cite_ITEM-2297-1" name="citation"><a href="#ITEM-2297-1">[2]</a></span> on the various open access options, which shows the entertainingly large gap between the price of open access from different publishers. The gap should only be a surprise to those with little understanding of economics; prices relate to what the market can bear and not to what a service costs to provide, as I have described previously <span id="cite_ITEM-2297-2" name="citation"><a href="#ITEM-2297-2">[3]</a></span>. At the top left of the graph (most permissive license, cheapest article charges) comes <a href="http://www.mdpi.com">MDPI</a>, although the second version of the plot shows cheaper options <span id="cite_ITEM-2297-3" name="citation"><a href="#ITEM-2297-3">[4]</a></span>.</p>
<p>I have never published with MDPI before, but I have recently reviewed a paper for them; I am very selective with reviewing these days, but the paper sounded interesting and I had never heard of <a href="http://www.mdpi.com/journal/futureinternet">Future Internet</a> before. So, this seemed like a reasonable bet. Accordingly, the paper has just been published and comes complete with a DOI <span id="cite_ITEM-2297-4" name="citation"><a href="#ITEM-2297-4">[5]</a></span>; a dubious badge of honour if ever there was one <span id="cite_ITEM-2297-5" name="citation"><a href="#ITEM-2297-5">[6]</a></span>.</p>
<p>So, how was the experience? On the whole I think it was very positive. The review period was pretty impressive, with a turnaround of relatively few days; they seem to have taken the approach of asking reviewers to say no if they cannot return a review within a short time. Ironically, our own response to the reviews was delayed much longer by the start-of-term work bomb, stretching to well over a month. Typesetting was efficient and seems to have been done reasonably; this is not a given, as previous experience has shown <span id="cite_ITEM-2297-6" name="citation"><a href="#ITEM-2297-6">[7]</a></span>.</p>
<p>There are only two things that I dislike. First, they offer full text only as PDF. I use PDFs only under exceptional circumstances these days; the viewers are all clunky and horrible, and fail totally when moving between different screen sizes. Given that MDPI offers an <a href="http://www.mdpi.com/1999-5903/4/4/1004/xml">XML</a> download, the lack of full-text HTML is a bit surprising, and slightly ironic for a paper discussing web publication.</p>
<p>Secondly, they seem to have adopted a strange policy with respect to publication/acceptance dates. Our original submission received this reply:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">Thank you very much for your manuscript:

Manuscript ID: futureinternet-22379
Type of manuscript: Article
Title: Three Steps to Heaven: Semantic Publishing in a Real World Workflow
Authors: Phillip Lord *, Simon Cockell, Robert Stevens
Received: 17 August 2012</pre>
</td>
</tr>
</table>
<p>Yet, by the time the paper was <a href="http://www.mdpi.com/1999-5903/4/4/1004">published</a>, our submission date had morphed into 22 September 2012. Now, we were told that this might happen; apparently, this is their policy if revisions last over a month. Very strange, and I cannot really think of a good reason for not having a submission date which is the same as the date of the submission.</p>
<p>Of course, this does not worry me that much; the work was &#8220;submitted&#8221; and published on 12th April here <span id="cite_ITEM-2297-7" name="citation"><a href="#ITEM-2297-7">[8]</a></span>. I consider this to be the canonical version of my work <span id="cite_ITEM-2297-6" name="citation"><a href="#ITEM-2297-6">[7]</a></span>. However, as a mechanism for secondary publication, Future Internet seems a reasonable bet.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2297-0'></a>
P. Lord, "Open Access and the Semantic Web", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2157">http://www.russet.org.uk/blog/2157</a>


</li>
<li><a name='ITEM-2297-1'></a>
R. Mounce, "A visualization of Gold Open Access options", <i>Ross Mounce</i>, 2012. <a href="http://rossmounce.co.uk/2012/08/30/a-visualization-of-gold-open-access-options/">http://rossmounce.co.uk/2012/08/30/a-visualization-of-gold-open-access-options/</a>


</li>
<li><a name='ITEM-2297-2'></a>
P. Lord, "Why academic publishing is like a coffee shop", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2248">http://www.russet.org.uk/blog/2248</a>


</li>
<li><a name='ITEM-2297-3'></a>
R. Mounce, "The Gold OA plot v0.2", <i>Ross Mounce</i>, 2012. <a href="http://rossmounce.co.uk/2012/09/04/the-gold-oa-plot-v0-2/">http://rossmounce.co.uk/2012/09/04/the-gold-oa-plot-v0-2/</a>


</li>
<li><a name='ITEM-2297-4'></a>
P. Lord, S. Cockell, and R. Stevens, "Three Steps to Heaven: Semantic Publishing in a Real World Workflow", <i>Future Internet</i>, vol. 4, pp. 1004-1015, 2012. <a href="http://dx.doi.org/10.3390/fi4041004">http://dx.doi.org/10.3390/fi4041004</a>


</li>
<li><a name='ITEM-2297-5'></a>
P. Lord and S. Cockell, "The Problem with DOIs", <i>An Exercise in Irrelevance</i>, 2011. <a href="http://www.russet.org.uk/blog/1849">http://www.russet.org.uk/blog/1849</a>


</li>
<li><a name='ITEM-2297-6'></a>
P. Lord, "Bringing Things to Life", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2170">http://www.russet.org.uk/blog/2170</a>


</li>
<li><a name='ITEM-2297-7'></a>
P. Lord, S. Cockell, and R. Stevens, "Three Steps to Heaven", <i>SePublica 2012</i>, 2012. <a href="http://www.russet.org.uk/blog/2054">http://www.russet.org.uk/blog/2054</a>


</li>
</ol>

</div> <!-- kcite-section 2297 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2297/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>A One Man War</title>
		<link>http://www.russet.org.uk/blog/2292</link>
		<comments>http://www.russet.org.uk/blog/2292#comments</comments>
		<pubDate>Wed, 14 Nov 2012 16:44:50 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Communication]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2292</guid>
		<description><![CDATA[I was recently described by Duncan Hull as waging a one man war for metadata on the web. There is a degree of truth in this, of course. Since Lindsay Marshall and myself started work on Greycite (Lindsay writes it, then I break it, roles both of us are happy with), there is a degree [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2292">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=A+One+Man+War&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2012-11-14&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2292&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>I was recently described by <a href="http://duncan.hull.name/">Duncan Hull</a> as waging a one man war for metadata on the web. There is a degree of truth in this, of course. Since Lindsay Marshall <span id="cite_ITEM-2292-0" name="citation"><a href="#ITEM-2292-0">[1]</a></span> and I started work on Greycite <span id="cite_ITEM-2292-1" name="citation"><a href="#ITEM-2292-1">[2]</a></span> (Lindsay writes it, then I break it, roles both of us are happy with), I have found myself continually amazed by which websites do or do not carry metadata. Often there is none whatsoever, and sometimes it&#8217;s just wrong.</p>
<p>What is amazing is that many organisations that you would think really should have metadata don&#8217;t. Toward this end, I have started to compile two pages: <a href="http://www.russet.org.uk/blog/metadata-irony">metadata irony</a> and <a href="http://www.russet.org.uk/blog/metadata-awards">metadata awards</a>. At the moment, these are just some pages, but I might make something better if they get long enough. I guarantee that some of the organisations here will surprise you.</p>
<p>Duncan, incidentally, has reasonably good <a href="http://greycite.knowledgeblog.org/?uri=http://duncan.hull.name/">metadata</a> on his blog. Should probably fix his name though.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2292-0'></a>
L. Marshall, "Lindsay Marshall's Home Page", <i>Catless</i>, 2012. <a href="http://catless.ncl.ac.uk/Lindsay/">http://catless.ncl.ac.uk/Lindsay/</a>


</li>
<li><a name='ITEM-2292-1'></a>
P. Lord, "Greycite", <i>Knowledge Blog</i>, 2012. <a href="http://knowledgeblog.org/greycite">http://knowledgeblog.org/greycite</a>


</li>
</ol>

</div> <!-- kcite-section 2292 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2292/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Disjoints in Clojure-owl</title>
		<link>http://www.russet.org.uk/blog/2275</link>
		<comments>http://www.russet.org.uk/blog/2275#comments</comments>
		<pubDate>Mon, 12 Nov 2012 12:56:36 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Clojure-owl]]></category>
		<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Tawny-OWL]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2275</guid>
		<description><![CDATA[When I started work on Clojure-owl the original intention was to provide myself with a more programmatic environment for writing ontologies, where I could work with a full programming language to define the classes I wanted. After some initial work with functions taking strings, I have moved to an approach where classes (and [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2275">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Disjoints+in+Clojure-owl&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2012-11-12&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2275&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>When I started work on Clojure-owl the original intention was to provide myself with a more programmatic environment for writing ontologies, where I could work with a full programming language to define the classes I wanted <span id="cite_ITEM-2275-0" name="citation"><a href="#ITEM-2275-0">[1]</a></span>. After some initial work with functions taking strings, I have moved to an approach where classes (and other ontological entities) are each assigned to a Lisp symbol <span id="cite_ITEM-2275-1" name="citation"><a href="#ITEM-2275-1">[2]</a></span>. I&#8217;m using &#8220;symbol&#8221; rather than &#8220;atom&#8221; because it&#8217;s a bit more accurate, especially as Clojure uses <a href="http://clojure.org/atoms">&#8220;atom&#8221;</a> with a different meaning.</p>
<p>This means that I now have something which allows me to write ontological terms looking something like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> a<font color="#FF0000">)</font>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> b <b><font color="#000080">:subclass</font></b> a<font color="#FF0000">)</font>

<font color="#FF0000">(</font>defoproperty r<font color="#FF0000">)</font>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> d
     <b><font color="#000080">:subclass</font></b> <font color="#FF0000">(</font>some r b<font color="#FF0000">))</font></tt></pre>
</td>
</tr>
</table>
<p>While this is quite nice, and looks fairly close to Manchester syntax <span id="cite_ITEM-2275-2" name="citation"><a href="#ITEM-2275-2">[3]</a></span>, so far all this really provides me with is a slightly complex mechanism for achieving what I could already do. This raises the questions: why not just use Manchester syntax? Why bother with the Lisp if this is all I am to achieve?</p>
<p>I think I have now got to the point where the advantages are starting to show through, as I have started to create useful macros, which operate at a slightly higher level of abstraction from Manchester syntax. I will explain this using examples, perhaps inevitably, based around pizza <span id="cite_ITEM-2275-3" name="citation"><a href="#ITEM-2275-3">[4]</a></span>, which I have started to develop using Clojure-owl.</p>
<p>First I wanted to be able to define several classes at once, rather than having to use a somewhat long-winded <tt>defclass</tt> form for each; for this I have written a macro called <tt>declare-classes</tt>&#8201;&#8212;&#8201;perhaps a slight misnomer, as it also adds the classes to the ontology. This example shows the purpose:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt>  <font color="#FF0000">(</font><b><font color="#0000FF">declare</font></b>-classes
   GoatsCheeseTopping
   GorgonzolaTopping
   MozzarellaTopping
   ParmesanTopping<font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>In practice, this may not be that useful for an ontology builder, as it creates a bare class; no documentation, nothing else. It may be useful for forward-declaration (like Clojure <a href="http://clojuredocs.org/clojure_core/clojure.core/declare">declare</a>).</p>
<p>One slightly unfortunate consequence of the decision to use lisp symbols is that I now find myself writing a lot of macros. For those who have not used lisp before, most work is done with functions. Macros are only necessary when you wish to extend the language itself. They tend to be more complex to write and to debug, although fortunately they are easy to use. Compare, for example, the definition of <tt>declare-classes</tt> to that of the functional equivalent which uses strings.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font><b><font color="#0000FF">defmacro</font></b> declare-classes
  [&amp; names]
  `<font color="#FF0000">(</font>do ~@<font color="#FF0000">(</font>map
          <font color="#FF0000">(</font>fn [x#]
            `<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> ~x#<font color="#FF0000">))</font>
          names<font color="#FF0000">)))</font>

<font color="#FF0000">(</font><b><font color="#0000FF">defn</font></b> f-declare-classes
  [&amp; names]
  <font color="#FF0000">(</font>dorun
   <font color="#FF0000">(</font>map #<font color="#FF0000">(</font>owlclass %<font color="#FF0000">)</font> names<font color="#FF0000">)))</font></tt></pre>
</td>
</tr>
</table>
<p>Even in this case, there are more hieroglyphics in the macro: two backticks, one unquote splice and some <a href="http://clojuredocs.org/clojure_core/clojure.core/gensym"><tt>gensym</tt></a> symbols, although Clojure&#8217;s slightly irritating lazy sequences and the resultant <tt>dorun</tt> mean that the two are nearly as long as each other. I suspect that the macros are going to get more complex, however. In most cases, it should not be the user of the library who has to cope, though.</p>
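<p>For readers new to macros, <tt>macroexpand-1</tt> shows what a form like this turns into before it is evaluated. The following is only a sketch that runs on its own: <tt>stub-defclass</tt> and <tt>declare-classes-sketch</tt> are made-up stand-ins, not the Clojure-owl definitions, and the stub just binds each symbol to itself.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><tt>(ns macro-sketch)

;; Hypothetical stand-in for the library's defclass: binds the symbol
;; to its own name instead of creating an OWL class.
(defmacro stub-defclass [class-symbol]
  `(def ~class-symbol '~class-symbol))

;; Same shape as declare-classes: one definition form per name,
;; wrapped in a single do.
(defmacro declare-classes-sketch [&amp; names]
  `(do ~@(map (fn [n] `(stub-defclass ~n)) names)))

;; Inspect the expansion without evaluating it.
(macroexpand-1 '(declare-classes-sketch A B))</tt></pre>
</td>
</tr>
</table>
<p>The expansion is a <tt>do</tt> form containing one <tt>stub-defclass</tt> call per name, which is exactly the long-winded code the macro saves writing by hand.</p>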
<p>While this provided a useful convenience, I also wanted a cleaner method for declaring disjoints. Consider this example:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> a<font color="#FF0000">)</font>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> b<font color="#FF0000">)</font>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> c<font color="#FF0000">)</font>

<font color="#FF0000">(</font>disjointclasses a b c<font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>This is reasonably effective, but a pain if there are many classes, as they all need to be listed in the <tt>disjointclasses</tt> list. Worse, this is error prone; it is all too easy to miss a single class out, particularly if a new class is added. So, I have now implemented an <tt>as-disjoint</tt> macro which gives this code:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>as-disjoints
   <font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> a<font color="#FF0000">)</font>
   <font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> b<font color="#FF0000">)</font>
   <font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> c<font color="#FF0000">))</font></tt></pre>
</td>
</tr>
</table>
<p>This should avoid both the risk of dropping a disjoint and the duplication. An even more common form is to wish to declare a set of classes as disjoint children. Again, I have provided a macro for this, which looks like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt> <font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> CheeseTopping<font color="#FF0000">)</font>

 <font color="#FF0000">(</font>as-disjoint-subclasses
  CheeseTopping

  <font color="#FF0000">(</font><b><font color="#0000FF">declare</font></b>-classes
   GoatsCheeseTopping
   GorgonzolaTopping
   MozzarellaTopping
   ParmesanTopping<font color="#FF0000">))</font></tt></pre>
</td>
</tr>
</table>
<p>Although this was not my original intention, these are actually nestable. This gives the interesting side effect that the ontology hierarchy is now represented in the structure of the lisp. The example below is an elided hierarchy from pizza. Lisp programmers will notice I have rather exaggerated the indentation to make the point.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>as-disjoint-subclasses
 PizzaTopping

    <font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> CheeseTopping<font color="#FF0000">)</font>

    <font color="#FF0000">(</font>as-disjoint-subclasses
         CheeseTopping

        <font color="#FF0000">(</font><b><font color="#0000FF">declare</font></b>-classes
            GoatsCheeseTopping<font color="#FF0000">))</font>

    <font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> FishTopping<font color="#FF0000">)</font>
    <font color="#FF0000">(</font>as-disjoint-subclasses
        FishTopping

        <font color="#FF0000">(</font><b><font color="#0000FF">declare</font></b>-classes AnchoviesTopping<font color="#FF0000">))</font>

    <font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> FruitTopping<font color="#FF0000">)</font>
    <font color="#FF0000">(</font>as-disjoint-subclasses
         FruitTopping

         <font color="#FF0000">(</font><b><font color="#0000FF">declare</font></b>-classes PineappleTopping<font color="#FF0000">)))</font></tt></pre>
</td>
</tr>
</table>
<p>Of course, it is not essential to do this. The nested use of <tt>as-disjoint-subclasses</tt> confers no semantics; but it does allow juxtaposition of a class and its children.</p>
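<p>To give a feel for why a wrapper of this kind is short to write, here is a minimal sketch of the idea. It is not the Clojure-owl implementation: <tt>axioms</tt>, <tt>stub-class</tt>, <tt>stub-disjoint</tt> and <tt>as-disjoint-sketch</tt> are all hypothetical names, and the &#8220;ontology&#8221; is just a vector of recorded axioms so that the example runs on its own.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><tt>(ns disjoint-sketch)

;; Hypothetical stand-in for an ontology: a vector of recorded axioms.
(def axioms (atom []))

;; Stub: records a class declaration and returns the class.
(defn stub-class [class-name]
  (swap! axioms conj [:class class-name])
  class-name)

;; Stub: records a single disjointness axiom over all given classes.
(defn stub-disjoint [classes]
  (swap! axioms conj [:disjoint (vec classes)])
  classes)

;; Evaluate every body form, gather the classes they return, and declare
;; them disjoint in one step, so that no class can be forgotten.
;; flatten lets a body form return a list of classes as well as one class.
(defmacro as-disjoint-sketch [&amp; body]
  `(stub-disjoint (flatten (list ~@body))))

(as-disjoint-sketch
 (stub-class :a)
 (stub-class :b)
 (stub-class :c))</tt></pre>
</td>
</tr>
</table>
<p>After this runs, the last recorded axiom lists <tt>:a</tt>, <tt>:b</tt> and <tt>:c</tt> as disjoint; adding a fourth class inside the form extends the disjointness axiom without any other change.</p>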
<p>Being able to build up macros in this way was the main reason I wanted a real programming language. Those described here are, I think, fairly general purpose; so, this form of declaration could also be supported in any of the various syntaxes, although it would require updates to the tools. However, some ontologies will benefit from less general purpose extensions. These are never going to be supported in a syntax specification.</p>
<p>Still, it is not all advantage. Using a programming language means embedding within this language. And this means that some of the names I would like to use are gone; <a href="http://clojuredocs.org/clojure_core/clojure.core/some"><tt>some</tt></a> is the obvious example. While Clojure has good <a href="http://clojure.org/namespaces">namespace</a> support, functions in <tt>clojure.core</tt> are available in all other namespaces; like all lisps, Clojure lacks types which would have avoided the problem. There are other ways around this, but ultimately clashing with these names is likely to bring pain; for example, I could always explicitly reference clojure-owl functions, but writing <tt>owl.owl.defclass</tt> rather than <tt>defclass</tt> seems a poor option. Hence, <tt>some</tt> has become <tt>owlsome</tt>, and <tt>comment</tt> has become <tt>owlcomment</tt>. I have decided to accept the lack of consistency and kept <tt>only</tt> and <tt>label</tt>; the alternative, taken by the <a href="http://owlapi.sourceforge.net/">OWL API</a>, of adding OWL to everything seems too unwieldy.</p>
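<p>For completeness, one of the other ways around the clash is the <tt>:refer-clojure</tt> clause of <tt>ns</tt>, which can exclude chosen <tt>clojure.core</tt> names from a namespace. The sketch below uses a made-up namespace, <tt>ontology.sketch</tt>, and a stub <tt>some</tt> that only builds a data structure; it is an illustration, not how Clojure-owl is organised.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><tt>;; Assumption: ontology.sketch is a hypothetical namespace.
(ns ontology.sketch
  (:refer-clojure :exclude [some]))

;; With clojure.core/some excluded, this namespace can define its own
;; some. This stub just returns a vector describing the restriction.
(defn some [property filler]
  [:some property filler])

(some :hasTopping :CheeseTopping)

;; The core function is still reachable by its fully qualified name.
(clojure.core/some even? [1 2 3])</tt></pre>
</td>
</tr>
</table>
<p>The cost is that every namespace using the library would need the same exclusion, and readers must remember which <tt>some</tt> is meant, which is part of the pain mentioned above.</p>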
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2275-0'></a>
P. Lord, "Programming OWL", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2214">http://www.russet.org.uk/blog/2214</a>


</li>
<li><a name='ITEM-2275-1'></a>
P. Lord, "OWL Concepts as Lisp Atoms", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2254">http://www.russet.org.uk/blog/2254</a>


</li>
<li><a name='ITEM-2275-2'></a>
M. Horridge, and P.F. Patel-Schneider, "OWL 2 Web Ontology Language
Manchester Syntax (Second
Edition)", <i>W3C</i>, 2012. <a href="http://www.w3.org/TR/owl2-manchester-syntax/">http://www.w3.org/TR/owl2-manchester-syntax/</a>


</li>
<li><a name='ITEM-2275-3'></a>
R. Stevens, "Why the Pizza Ontology Tutorial?", <i>Robert Stevens' Blog</i>, 2010. <a href="http://robertdavidstevens.wordpress.com/2010/01/22/why-the-pizza-ontology-tutorial/">http://robertdavidstevens.wordpress.com/2010/01/22/why-the-pizza-ontology-tutorial/</a>


</li>
</ol>

</div> <!-- kcite-section 2275 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2275/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Improving Emacs lisp modes</title>
		<link>http://www.russet.org.uk/blog/2259</link>
		<comments>http://www.russet.org.uk/blog/2259#comments</comments>
		<pubDate>Wed, 24 Oct 2012 17:45:24 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Emacs]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2259</guid>
		<description><![CDATA[I&#8217;ve been writing a lot of lisp recently, both to extend my Emacs environment for OWL, and with my Clojure OWL library. I have been trying out two new modes to support this. The first is paredit.el which I have managed to miss despite knocking out Lisp for years; it&#8217;s a work of [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2259">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Improving+Emacs+lisp+modes&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2012-10-24&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2259&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>I&#8217;ve been writing a lot of lisp recently, both to extend my Emacs environment for OWL <span id="cite_ITEM-2259-0" name="citation"><a href="#ITEM-2259-0">[1]</a></span>, and with my Clojure OWL library <span id="cite_ITEM-2259-1" name="citation"><a href="#ITEM-2259-1">[2]</a></span>. I have been trying out two new modes to support this. The first is paredit.el which I have managed to miss despite knocking out Lisp for years; it&#8217;s a work of insane genius; fantastic when it does the right thing, but sometimes I find myself stuck in a rut. This will probably improve over time, but is only going to work when I am writing a lot of lisp.</p>
<p>My initial solution to this problem was to use the paredit <a href="http://emacswiki.org/emacs/PareditCheatsheet">cheat sheet</a>. This is good but unfortunately does not scale as it is only available as an image. I was a bit surprised to find that this cheat sheet is actually built on information that is embedded in the paredit source. Paredit uses this for generating help strings. This also makes it possible to generate a menu with examples as tooltips, which I have now done with paredit-menu.el. Code is available at <a href="http://code.google.com/p/phillord-emacs-packages/">http://code.google.com/p/phillord-emacs-packages/</a></p>
<p>My second problem was with show-paren mode. This is very useful for lisp, but I find it irritating in other modes. This is particularly the case because of my own pabbrev.el, which offers abbreviation expansions using sq[uare] brackets; even though these are transitory, show-paren highlights them, which produces a rather annoying flickering on screen. Unfortunately, there is no way of blocking this; adding an overlay which blocks show-paren would be the ideal solution. Worse, show-paren, even though it is a minor mode, is global; once it is on, it is on in every buffer. Really, it needs rewriting so that it can be switched on and off on a per-mode basis.</p>
<p>My solution to this is to switch the minor mode on and off depending on the current mode. This isn&#8217;t worth turning into a package (since it&#8217;s a hack), but I put it up here in case anyone finds it useful.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><i><font color="#9A1900">;; show-paren-mode is a minor mode, but it's automatically global. Annoying.</font></i>
<font color="#FF0000">(</font><b><font color="#0000FF">defun</font></b> phil-show-paren-mode-check<font color="#FF0000">()</font>
  <i><font color="#9A1900">;; it isn't and it should be</font></i>
  <font color="#FF0000">(</font><b><font color="#0000FF">when</font></b> <font color="#FF0000">(</font>and <font color="#FF0000">(</font>phil-paren-mode-should-be-active-p<font color="#FF0000">)</font>
             <font color="#FF0000">(</font>not show-paren-mode<font color="#FF0000">))</font>
    <font color="#FF0000">(</font>show-paren-mode <font color="#993399">1</font><font color="#FF0000">))</font>
  <i><font color="#9A1900">;; it is and it shouldn't be</font></i>
  <font color="#FF0000">(</font><b><font color="#0000FF">when</font></b> <font color="#FF0000">(</font>and <font color="#FF0000">(</font>not <font color="#FF0000">(</font>phil-paren-mode-should-be-active-p<font color="#FF0000">))</font>
             show-paren-mode<font color="#FF0000">)</font>
    <font color="#FF0000">(</font>show-paren-mode <font color="#993399">0</font><font color="#FF0000">)))</font>

<font color="#FF0000">(</font>add-hook <font color="#009900">'post-command-hook</font>
          <font color="#009900">'phil-show-paren-mode-check</font><font color="#FF0000">)</font>

<font color="#FF0000">(</font><b><font color="#0000FF">defun</font></b> phil-paren-mode-should-be-active-p<font color="#FF0000">()</font>
  <font color="#FF0000">(</font>memq major-mode phil-paren-mode-active<font color="#FF0000">))</font>

<font color="#FF0000">(</font><b><font color="#0000FF">defvar</font></b> phil-paren-mode-active
  '<font color="#FF0000">(</font>clojure-mode emacs-lisp-mode<font color="#FF0000">))</font></tt></pre>
</td>
</tr>
</table>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2259-0'></a>
P. Lord, "Ontology Building with Emacs", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2161">http://www.russet.org.uk/blog/2161</a>


</li>
<li><a name='ITEM-2259-1'></a>
P. Lord, "OWL Concepts as Lisp Atoms", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2254">http://www.russet.org.uk/blog/2254</a>


</li>
</ol>

</div> <!-- kcite-section 2259 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2259/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OWL Concepts as Lisp Atoms</title>
		<link>http://www.russet.org.uk/blog/2254</link>
		<comments>http://www.russet.org.uk/blog/2254#comments</comments>
		<pubDate>Tue, 23 Oct 2012 15:39:02 +0000</pubDate>
		<dc:creator>Phillip Lord</dc:creator>
				<category><![CDATA[Clojure-owl]]></category>
		<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Tawny-OWL]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=2254</guid>
		<description><![CDATA[With my initial work on developing a Clojure environment for OWL, I was focused on producing something similar to Manchester syntax. Here, I describe my latest extensions, which make more extensive use of Lisp atoms. The practical upshot of this should be to reduce errors due to spelling mistakes, as well as enabling [...]]]></description>
				<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="2254">
<p><!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=OWL+Concepts+as+Lisp+Atoms&amp;rft.source=An+Exercise+in+Irrelevance&amp;rft.date=2012-10-23&amp;rft.identifier=http%3A%2F%2Fwww.russet.org.uk%2Fblog%2F2254&amp;rft.au=Phillip+Lord&amp;rft.format=text&amp;rft.language=English"></span><a name="preamble"></a> 
<p>With my initial work on developing a Clojure environment for OWL <span id="cite_ITEM-2254-0" name="citation"><a href="#ITEM-2254-0">[1]</a></span>, I was focused on producing something similar to Manchester syntax <span id="cite_ITEM-2254-1" name="citation"><a href="#ITEM-2254-1">[2]</a></span>. Here, I describe my latest extensions, which make more extensive use of Lisp atoms. The practical upshot of this should be to reduce errors due to spelling mistakes, as well as enabling me to add simple checks for correctness.</p>
<p>The desire for a simple syntax is an important one. I would like my library to be usable by people not experienced with Lisp, although I am clearly aware that this sort of environment is likely to be aimed at those with some programming skills. I have managed to produce a syntax which, I think, is reasonably straightforward. It has more parentheses than Manchester syntax, but is easier in other ways, especially now that I have learnt a little more about how Clojure namespaces work. For example, this defines a class in OWL.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>owlclass <font color="#FF0000">"HumanArm"</font>
          <b><font color="#000080">:subclass</font></b> <font color="#FF0000">"Arm"</font> <font color="#FF0000">(</font>some <font color="#FF0000">"isPartOf"</font> <font color="#FF0000">"Human"</font><font color="#FF0000">)</font>
          <b><font color="#000080">:annotation</font></b> <font color="#FF0000">(</font>comment <font color="#FF0000">"The Human arm is an Arm which is part of a human"</font><font color="#FF0000">))</font></tt></pre>
</td>
</tr>
</table>
<p>One of my initial desires for the Clojure mode was to enable the use of standard tools that we have come to expect from a modern programming language, which should enable us to build a more pragmatic ontology building methodology <span id="cite_ITEM-2254-2" name="citation"><a href="#ITEM-2254-2">[3]</a></span>. The first of these is a unit testing environment. Clojure already has one of these integrated. So far, I have only used this for testing my own code; so, for example, this is the current unit test for the <tt>owlclass</tt> function used above.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>deftest owlclass
  <font color="#FF0000">(</font>is <font color="#FF0000">(</font>= <font color="#993399">1</font>
         <font color="#FF0000">(</font>do <font color="#FF0000">(</font>o/owlclass <font color="#FF0000">"test"</font><font color="#FF0000">)</font>
             <font color="#FF0000">(</font>.size <font color="#FF0000">(</font>.getClassesInSignature
                     <font color="#FF0000">(</font><font color="#009900">#'o/get-current-jontology</font><font color="#FF0000">))))))</font>
  <font color="#FF0000">(</font>is <font color="#FF0000">(</font>instance? org.semanticweb.owlapi.model.OWLClass
                 <font color="#FF0000">(</font>o/owlclass <font color="#FF0000">"test"</font><font color="#FF0000">))))</font></tt></pre>
</td>
</tr>
</table>
<p>There are, however, some limitations to the approach that I have taken so far. Consider this statement:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>owlclass <font color="#FF0000">"HumanArm"</font>
          <b><font color="#000080">:subclass</font></b> <font color="#FF0000">(</font>some <font color="#FF0000">"isPartOf"</font> <font color="#FF0000">"Humn"</font><font color="#FF0000">)</font> <font color="#FF0000">"Arm"</font>
<font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
<p>This is broken because I have referred to the class <tt>Humn</tt>, which I probably do not want to exist because I have spelt it wrongly. Unfortunately, as it stands my code does not know this and so will create the class &#8220;Humn&#8221;. Now, this form of error is not that likely to happen; tools such as Kudu <span id="cite_ITEM-2254-3" name="citation"><a href="#ITEM-2254-3">[4]</a></span> enforce this correctness in the editor, while pabbrev.el <span id="cite_ITEM-2254-4" name="citation"><a href="#ITEM-2254-4">[5]</a></span> provides &#8220;correctness-by-completion&#8221;. Nonetheless, these errors will happen and I do not want them to. There are a variety of ways that I could build in this form of checking; generally, this would involve introspecting over the ontology to see if classes already exist.</p>
<p>However, I have taken a different approach, so that I can use the Lisp itself to prevent the problem. To do this, for each class created, I generate a new Lisp symbol; likewise for each object property and for the ontology itself. The practical upshot of this is that I can write code like so:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> a<font color="#FF0000">)</font>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> b <b><font color="#000080">:subclass</font></b> a<font color="#FF0000">)</font>

<font color="#FF0000">(</font>defoproperty r<font color="#FF0000">)</font>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> d
     <b><font color="#000080">:subclass</font></b> <font color="#FF0000">(</font>some r b<font color="#FF0000">))</font>

<i><font color="#9A1900">;; will fail as f does not exist</font></i>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> e
     <b><font color="#000080">:subclass</font></b> f<font color="#FF0000">)</font>

<i><font color="#9A1900">;; will fail as r and b are the wrong way around</font></i>
<font color="#FF0000">(</font><b><font color="#0000FF">defclass</font></b> e
     <b><font color="#000080">:subclass</font></b> <font color="#FF0000">(</font>some b r<font color="#FF0000">))</font></tt></pre>
</td>
</tr>
</table>
<p>The advantages are three-fold. Firstly, it&#8217;s slightly shorter, and there is no need to use quotes all over the place. Secondly, it is no longer possible to refer to a class that has not yet been defined: Clojure will pick this up immediately. From the user perspective, you can test your statements as you go by evaluating them as soon as you have written them. Finally, because the atoms carry values which are typed, we can also detect errors such as using a property where a class is necessary.</p>
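<p>A minimal sketch of the idea, assuming plain Clojure maps in place of OWL API objects. The names <tt>entity</tt>, <tt>def-class</tt>, <tt>def-oproperty</tt> and <tt>some-of</tt> are hypothetical stand-ins chosen for illustration, not the library&#8217;s own functions. Each definition binds a symbol to a tagged value, so a misspelt name fails when the form is compiled and a swapped argument fails when the restriction is built.</p>

```clojure
;; Hypothetical stand-ins for illustration; plain maps replace OWL API objects.
(defn entity
  "Builds a tagged map so that later code can check what kind of thing it has."
  [kind entity-name]
  {:kind kind :name entity-name})

(defmacro def-class
  "Binds sym to a class entity named after the symbol."
  [sym]
  `(def ~sym (entity :class ~(name sym))))

(defmacro def-oproperty
  "Binds sym to an object property entity named after the symbol."
  [sym]
  `(def ~sym (entity :object-property ~(name sym))))

(defn some-of
  "Builds an existential restriction, rejecting arguments of the wrong kind."
  [property filler]
  (when-not (= :object-property (:kind property))
    (throw (IllegalArgumentException. "first argument must be an object property")))
  (when-not (= :class (:kind filler))
    (throw (IllegalArgumentException. "second argument must be a class")))
  {:kind :some :property property :filler filler})

(def-class b)
(def-oproperty r)

(some-of r b)      ; accepted
;; (some-of b r)   ; throws IllegalArgumentException: arguments are swapped
;; (some-of r f)   ; does not compile: the symbol f cannot be resolved
```

<p>The unresolved symbol is caught by the Clojure compiler itself, while the swapped arguments are caught by the check on the <tt>:kind</tt> tag; the real code could make the same distinction using the types of the underlying Java objects.</p>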
<p>Of course, the original functions are all still in place; there would be no point defining symbols if the intention was to use the API entirely programmatically. But my intention for Clojure-OWL is to have an environment for humans (well, programmers anyway) to develop ontologies with.</p>
<p>There is a final advantage to this that I have not yet exploited. Currently, I generate the name of the OWL class directly from the symbol name. So, in the above example the class <tt>a</tt> will have the name &#8220;<tt>a</tt>&#8221;. There are some problems with this. Not all characters are legal in Clojure symbol names or in OWL class names, and the two sets of legal characters are not the same. So, while this is a useful default, I will formally separate these. At the same time, I think that this will allow me to address a second problem, that of semantic vs semantics-free identifiers <span id="cite_ITEM-2254-5" name="citation"><a href="#ITEM-2254-5">[6]</a></span>. I can call a class, ontology or object property anything at all, and refer to it with an easy-to-remember identifier. I might use something like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td><!-- Generator: GNU source-highlight 3.1.5 by Lorenzo Bettini http://www.lorenzobettini.it http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#FF0000">(</font>defoproperty has_part
   <b><font color="#000080">:name</font></b> <font color="#FF0000">"BFO_OOOOO51"</font><font color="#FF0000">)</font></tt></pre>
</td>
</tr>
</table>
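<p>One way this separation might look, as a rough sketch only: the macro below is a hypothetical <tt>def-oproperty</tt> built on plain maps, not the library&#8217;s implementation. The symbol is what the programmer types, and the stored name defaults to the symbol&#8217;s own name unless <tt>:name</tt> supplies a different identifier.</p>

```clojure
;; Hypothetical sketch; this def-oproperty is not the library's macro.
(defmacro def-oproperty
  "Binds sym to an object property. The OWL-side name defaults to the
  symbol's own name unless :name supplies a different identifier."
  [sym & {owl-name :name}]
  `(def ~sym {:kind :object-property
              :name ~(or owl-name (name sym))}))

;; Default: the identifier is derived from the symbol.
(def-oproperty is_part_of)

;; Override: a readable symbol in code, an opaque identifier in the ontology.
(def-oproperty has_part :name "BFO_OOOOO51")

(:name is_part_of) ; "is_part_of"
(:name has_part)   ; "BFO_OOOOO51"
```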
<p>There is still a significant amount of work to do; I haven&#8217;t yet covered all of OWL, just the most important parts (i.e. the bits that I use most often). Next, I need to start building some predicates so I can test (asserted) subclass relationships. So far, however, this approach is showing significant promise.</p>
<h2>References</h2>
    <ol>
    <li><a name='ITEM-2254-0'></a>
P. Lord, "Programming OWL", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2214">http://www.russet.org.uk/blog/2214</a>


</li>
<li><a name='ITEM-2254-1'></a>
M. Horridge and P.F. Patel-Schneider, "OWL 2 Web Ontology Language Manchester Syntax (Second Edition)", <i>W3C</i>, 2012. <a href="http://www.w3.org/TR/owl2-manchester-syntax/">http://www.w3.org/TR/owl2-manchester-syntax/</a>


</li>
<li><a name='ITEM-2254-2'></a>
R. Stevens, "Unicorns in my Ontology", <i>Robert Stevens' Blog</i>, 2011. <a href="http://robertdavidstevens.wordpress.com/2011/05/26/unicorns-in-my-ontology/">http://robertdavidstevens.wordpress.com/2011/05/26/unicorns-in-my-ontology/</a>


</li>
<li><a name='ITEM-2254-3'></a>
R. Stevens, "My Own Ontology Projects", <i>Robert Stevens' Blog</i>, 2010. <a href="http://robertdavidstevens.wordpress.com/2010/04/24/my-own-ontology-projects/">http://robertdavidstevens.wordpress.com/2010/04/24/my-own-ontology-projects/</a>


</li>
<li><a name='ITEM-2254-4'></a>
P. Lord, "Ontology Building with Emacs", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2161">http://www.russet.org.uk/blog/2161</a>


</li>
<li><a name='ITEM-2254-5'></a>
P. Lord, "Semantics-Free Ontologies", <i>An Exercise in Irrelevance</i>, 2012. <a href="http://www.russet.org.uk/blog/2040">http://www.russet.org.uk/blog/2040</a>


</li>
</ol>

</div> <!-- kcite-section 2254 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2254/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
