<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>An Exercise in Irrelevance &#187; Science</title>
	<atom:link href="http://www.russet.org.uk/blog/category/all/professional/science/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.russet.org.uk/blog</link>
	<description>Ramblings from Phil Lord&#039;s life</description>
	<lastBuildDate>Mon, 06 Sep 2010 19:42:45 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Latex to WordPress</title>
		<link>http://www.russet.org.uk/blog/2010/08/latex-to-wordpress/</link>
		<comments>http://www.russet.org.uk/blog/2010/08/latex-to-wordpress/#comments</comments>
		<pubDate>Thu, 26 Aug 2010 14:34:34 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1740</guid>
		<description><![CDATA[LaTeX to WordPress Phillip Lord This post describes the process of posting to WordPress from a LaTeX source file, using tools generated as part of the Knowledgeblog project. 1 Introduction About a month ago, we managed to get funding from JISC for knowledgeblog; the idea is to turn a blog platform from something for light [...]]]></description>
			<content:encoded><![CDATA[<div class="titlepage"> 
<h1>LaTeX to WordPress</h1>
<p>Phillip Lord</p>
</div>
<div class="abstract"> This post describes the process of posting to WordPress from a LaTeX source file, using tools generated as part of the Knowledgeblog project. </div>
<h1 id="sec:introduction">1 Introduction</h1>
<p>About a month ago, we managed to get funding from <a href="http://www.jisc.ac.uk">JISC</a> for <a href="http://www.knowledgeblog.org">knowledgeblog</a>; the idea is to turn a blog platform from something for light commentary into a framework for serious scientific publication. One of the key requirements for this is to fit in with people’s existing working practices; and for this, we need a good document creation environment. This means Word and latex. I’ve been working mostly on the latter, and this post is the first outcome. It’s generated totally automatically from latex. This is an advance on my paper on <a href="http://www.russet.org.uk/blog/2010/07/realism-and-science/">realism</a> which was semi-automatically converted, with some hand editing of the HTML. </p>
<p>At the moment, the tool-chain is a little bit clunky, but it will improve! This is not meant to be an announcement that all is ready, just an early alpha release and proof-of-principle. </p>
<h1 id="a0000000002">2 Implementation</h1>
<p>The implementation of this tool-chain uses three pieces of software: </p>
<dl class="description">  
<dt>latextowordpress: </dt>
<dd>
<p>This package, which I have written, uses <a href="http://plastex.sourceforge.net">plasTeX</a> to parse and render the latex into HTML. Most of the work is being performed by plasTeX out-of-the-box, although using a non-default configuration. Math-mode is being treated separately, however, rather than using plasTeX’s default image rendering approach. </p>
</dd>
<dt>blogpost:</dt>
<dd>
<p><a href="http://www.methods.co.nz/asciidoc/#_blogpost_weblog_client">blogpost</a> is being used to actually post the generated HTML onto the web. The HTML can also be cut and pasted directly into wordpress, but blogpost is easier for me, as it’s the usual tool I use anyway (normally over asciidoc source). Blogpost is unmodified. </p>
</dd>
<dt>mathjax-latex: </dt>
<dd>
<p>This is a wordpress plugin, that I have written, which uses <a href="http://www.mathjax.org">MathJax</a> to render math-mode from the original latex in the browser. The plugin just injects the mathjax javascript headers into a post on-demand (i.e. only on posts with math-mode in them). </p>
</dd>
</dl>
<p>Currently, this is all held together with some dodgy makefiles; this will be improved in time. </p>
<p>The first and last of these tools are available from <a href="http://services.knowledgeblog.org/download/">knowledgeblog</a>. I’ve tested them on Ubuntu 10.04 and they are in alpha. Comments are welcome, to <a href="mailto:knowledgeblog-discuss@knowledgeblog.org">knowledgeblog-discuss</a>. </p>
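<p>The on-demand behaviour of the mathjax-latex plugin described above can be sketched as follows. This is a minimal illustration in Python, not the plugin's own code; the delimiter pattern, the function names and the header name are all assumptions made for the example. </p>

```python
import re

# Assumption: math-mode survives into the HTML as \( ... \) or \[ ... \]
# delimiters, as in the inline examples in this post.
MATH_PATTERN = re.compile(r"\\\(.+?\\\)|\\\[.+?\\\]", re.DOTALL)

# Hypothetical stand-in for whatever script headers the real plugin emits.
MATHJAX_HEADERS = ["MathJax.js"]


def needs_mathjax(post_html):
    """Return True when the post contains at least one math-mode span."""
    return MATH_PATTERN.search(post_html) is not None


def headers_for(post_html):
    """Return the extra headers to inject for this post (possibly none)."""
    return list(MATHJAX_HEADERS) if needs_mathjax(post_html) else []
```

<p>Posts without any math-mode get an empty list, so they never pay the cost of loading the javascript. </p>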
<h1 id="a0000000003">3 Key Features</h1>
<p>At the moment, I haven’t fully explored all the features of LaTeX that are well supported. However, all the structural elements (sections, lists), bibliographies, links via the <a href="http://www.tug.org/applications/hyperref/manual.html">hyperref</a> package all seem to work well. </p>
<p>The math mode rendering works well. I’ve been using one famous equation: \(E=mc^2\), as my main test. But more complex examples work also. This is from <a href="http://www.mathjax.org">mathjax</a>:\(J_\alpha (x) = \sum _{m=0}^\infty \frac{(-1)^ m}{m! \,  \Gamma (m + \alpha + 1)}{\left({\frac{x}{2}}\right)}^{2 m + \alpha }\). </p>
<p>I’ve made a few tweaks to this also for common idioms. So the less-than symbol is written in mathmode in latex but rendered directly in HTML: &lt;. </p>
<h1 id="sec:future">4 Future Work</h1>
<p>There are many things left to do yet. The process needs to be made smoother, with a single tool to hook the current tool-chain together; it would be good to attach a PDF generated from the latex also. Currently, titles are set independently (which is why this post appears to have two titles). The mathjax plugin needs configuration options (it overwrites wp-latex functionality at the moment). And there is significant testing to do to see what advanced features (figures critically!) work and don’t work. Still, it’s good to see that most of the tools that I needed to get this to work already existed. With luck, most of the other tools we need will be as good. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/08/latex-to-wordpress/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The problem with institutional repositories</title>
		<link>http://www.russet.org.uk/blog/2010/08/the-problem-with-institutional-repositories/</link>
		<comments>http://www.russet.org.uk/blog/2010/08/the-problem-with-institutional-repositories/#comments</comments>
		<pubDate>Mon, 09 Aug 2010 11:09:50 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1737</guid>
		<description><![CDATA[I don&#8217;t normally use my blog to engage in conversations the way that some people do. I already spend enough time on mailing lists, so using the blog seems redundant for this. However, I will change the habit of a life-time this once, because of an interesting discussion on institutional repositories, which I have previously [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t normally use my blog to engage in conversations the way that some people do. I already spend enough time on mailing lists, so using the blog seems redundant for this. However, I will change the habit of a life-time this once, because of an interesting discussion on <a href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=2527">institutional repositories</a>, which I have <a href="http://www.russet.org.uk/blog/2007/06/institutional-and-subject-archives/">previously</a> written about myself.</p>
<p>To me the difficulty with institutional repositories is this. First, they are a resource. Then, someone says, this is good, everyone should do this. Then, someone else says, hey this is great, we could use this for our RAE (REF, whatever) return.</p>
<p>Now, you have to deposit things in your IR. But people object, on various &#8220;data is mine&#8221; grounds, so perhaps they make the IR non-public. The data model gets tweaked with various additional data (which school, who your line manager is) necessary for RAE. At the same time, your co-authors also have to deposit into their IR. And, if you move, you have to type your entire back catalogue into various repositories for your new institution.</p>
<p>Currently I am supposed to deposit papers in various IRs, including at University and school level, as well as add bibliographic information to various databases. And then, of course, project wikis. And the funders want the information in various databases. All of which is very time consuming, and produces highly duplicated and often error-prone data. In short, it&#8217;s a bad thing.</p>
<p>The irony is, if you google for any of my papers, the main source from which they are scraped is my website. I set this up myself many years ago now; it&#8217;s a simple bibtex to HTML thing (actually not so simple nowadays&#8201;&#8212;&#8201;it grew over time). So, the simplest and most straight-forward solution also turns out to be the best. The most important thing is this; the bibtex files are the ones that I use, for my own work, for citing myself (which, like any good scientist I do as often as possible even when the citation is largely <a href="http://www.russet.org.uk/blog/">irrelevant</a>). The website is what I use, when on the road, to get the PDF of my own papers; if I want to give a reference to someone, I&#8217;ll email a link to my website. So, I keep it up to date, because it&#8217;s to my benefit to do so.</p>
<p>We need a few simple and easy to use standards for bibliographic data. These have to be simple, because they need to fit in with people&#8217;s current work practices; this means they need to be supported in a heterogeneous environment, by many different tools. And they won&#8217;t be, if the standards are hard to develop against.</p>
<p>For data, of course, the issues are somewhat different. Mostly because data needs more structure than human-readable information, and because the data is often large. However, two issues remain: first, we still need to fit with people&#8217;s working practices; second, with data, engaging in the institutional football we see with bibliographic data will still be a bad thing.</p>
<p>Again, simple data standards are what we need. After that, people will choose whatever they choose; the data standard will be enough to bring it all together in the best way that we can.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/08/the-problem-with-institutional-repositories/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A new grant for Knowledgeblog</title>
		<link>http://www.russet.org.uk/blog/2010/08/a-new-grant-for-knowledgeblog/</link>
		<comments>http://www.russet.org.uk/blog/2010/08/a-new-grant-for-knowledgeblog/#comments</comments>
		<pubDate>Mon, 02 Aug 2010 14:06:36 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Grants]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1729</guid>
		<description><![CDATA[  I&#8217;m very pleased that our grant for knowledgeblog has been accepted by JISC. I shall follow the tradition that I set with my last post, of publishing all my primary scientific output on this blog. In this case, I&#8217;m using Word, which like the latex that I used last time isn&#8217;t perfect. Still improving [...]]]></description>
			<content:encoded><![CDATA[<p>
 </p>
<p><span style="font-family:Arial">I&#8217;m very pleased that our grant for <a href="http://www.knowledgeblog.org">knowledgeblog</a> has been accepted by JISC. I shall follow the tradition that I set with my <a href="http://www.russet.org.uk/blog/2010/07/realism-and-science/">last</a> post, of publishing all my primary scientific output on this blog. In this case, I&#8217;m using Word, which, like the latex that I used last time, isn&#8217;t perfect. Still, improving this process is part of the knowledgeblog proposal, so this post is also attacking a key deliverable for the grant!
</span></p>
<p><span style="font-family:Arial">The main content for this post is also available on the <a href="http://knowledgeblog.org/category/all">knowledgeblog</a> events blog.</span></p>
<p> </p>
<p><span style="font-family:Arial"><strong>Outline Project Description
</strong></span></p>
<p><span style="font-family:Arial">The project extends existing blogging tools for use as a lightweight, semantically linked publication environment. This enables researchers to create a hub in the linked-data environment, that we call <em>knowledge</em> or <em>k-blogs</em>.  K-blogs are convenient and straight-forward for authors to use, integrating into researchers&#8217; existing work practices and tools. They provide readers with distributed feedback and commenting mechanisms. We will support three communities (microarray, public health and workflow), providing immediate benefit, in addition to the long term benefit of the platform as a whole.  Additionally, this will enable a user-centric development approach, while showcasing the platform as the basis for next generation research publishing.</span></p>
<h1><span style="font-family:Arial; font-size:11pt">1. Introduction 
</span></h1>
<p><span style="font-family:Arial"><sup>1</sup>This document describes a proposal for a project within the JISC &#8220;Managing research Data&#8221; call. Data comes in many forms, from raw statistics, to highly structured databases, through to textual reports; natural language, although hard to search and manage, is still the richest form of representation; data in the form of reports and publications are the central hub around which all other data sit. This project, therefore, will provide a lightweight, yet extensible, framework for scientific publishing, incorporating a software-supported peer-review process. Bi-directional links will be maintained both between publications and to other forms of data, using semantic markup to enhance the meaning of these links. We will also customize this framework for three communities which, as well as being directly useful, will provide real-world requirements. The project will largely develop &#8220;glue&#8221; between existing, widely-used, open-source software systems, ensuring its sustainability and usefulness past the end of the funding.<br/>
		</span></p>
<h1><span style="font-family:Arial; font-size:11pt">2. Fit to Programme Objectives and Project Outline 
</span></h1>
<p><span style="font-family:Arial"><br/><sup>2</sup>The project call identifies the <strong>complexity</strong> and <strong>hybrid</strong> nature of the UK research data environment; despite this, one central focal point remains &#8212; most researchers spend considerable amounts of time discussing their data in the form of &#8220;paper&#8221; publications. For some, more theoretical disciplines, such as parts of computer science, the paper is the sole output; in others, such as biology, datasets are associated with papers and the <strong>barriers</strong> between &#8220;publication&#8221; and &#8220;data&#8221; are breaking down; most data sources in biology are rich in <strong><em>annotation</em></strong>; text that supports and explains the raw data. It is normally the annotation, not the raw data, which defines the quality of the resource. In these cases, <strong>text</strong> is an <strong>intrinsic</strong> part of the <strong>data</strong>. <br/><br/><sup>3</sup>However, the conventional publication process has changed relatively little; web technologies have largely been adopted as a distribution mechanism. Publications are still <strong>expensive</strong> &#8212; either at subscription or publication time, depending on the business model of the publisher, and involve considerable, time-consuming interactions between author and publisher, often relating to display and presentation issues. This is in stark contrast to, for example, the biological data centres where both raw and annotated data are often made available <strong>within hours </strong>of their generation.<br/><br/><sup>4</sup>This situation is unfortunate because it limits the ability of researchers to customise their publication process for the requirements of their own discipline. 
As demonstrated by Shotton et al, and Rousay et al, it is possible to add considerable value, both enhancing the paper for the reader, as well as providing <strong>direct and semantically enhanced links</strong> to underlying data. The cost of the existing process, however, makes this form of publication unlikely for some data; for example, few scientists publish papers about negative results, resulting in an acknowledged publication bias. As a result, it is <strong>hard</strong> for the semantically enhanced publication to take its place as the central hub for a <strong>linked data</strong> environment as envisioned by Coles and Frey, linking to and between research datasets, and the published knowledge about these datasets. <br/><br/><sup>5</sup>In the last decade, the blog has become a common, web-based publication framework. There are now numerous off-the-shelf tools and platforms for managing blogs, providing a high degree of functionality. Many scientists blog about their work, about other published work (research blogging) or &#8220;live blog&#8221; about conferences and talks as they happen. In this case, the researcher is in charge of their own publication environment, can extend it to their requirements, and publication happens immediately. However, the blog has not yet become a standard means of publication for <strong>primary research output</strong>.<br/><br/><sup>6</sup>Recently, as part of the EPSRC funded Ontogenesis network (ref), we trialled the <strong><em>Knowledge Blog</em></strong> process; in this case aimed at producing an educational resource describing many aspects of ontology development and usage, which might previously have been published in book form. 
We have shown that with this technology base, it is possible to replicate many of the features of the open peer-review, scientific book publication process; following two small meetings, we have written around 20 articles, and the website maintains around 1000 post reads per month (not simple hits!). To achieve this, we used only two features of the blog &#8212; trackbacks (bidirectional links) and categories (hierarchical keywords); although we used the WordPress blogging software, these features are supported by most other systems. We call these articles <strong><em>k-blogs</em></strong>.<br/><br/><sup>7</sup>Currently, however, the k-blog process is not fully supported with blog software alone, nor does it fully support the referencing, advanced linking and provenance needed specifically for research publications. For this project, we propose to provide extensions to support data-rich publications, deeply and semantically linked to other k-blogs and to other forms of data repository. Therefore, the project addresses the objectives and aims of the call through four main workpackages.<br/><br/>1) A documented <strong>k-blog process </strong>(WP1.1) describing different levels of  peer-review suitable for different forms of research data. An implementation (WP1.2), the <strong>k-blog platform</strong>, of this process based around open-source, off-the-shelf software.<br/><br/>2) Extensions to the k-blog platform supporting <strong>linking</strong>. This includes full support for referencing including COinS metadata on posts (WP2.1), client-side and permanently linked versions (WP2.2) and bidirectional links (WP2.3) to other data sets. We will add <strong>semantics</strong> to these links using the Citation Ontology (CiTO) (WP2.4).<br/><br/>3) Support for three specialist environments&#8212;<strong>healthcare</strong> (WP3.1), <strong>microarray</strong> (WP3.2) and <strong>workflows</strong> (WP3.3). 
All are useful in their own right and showcase the extensibility of the framework.<br/><br/>4) <strong>Documentation</strong> and <strong>tooling</strong> to integrate the k-blog process into scientists&#8217; existing working practices and tooling; scientists will be able to publish from Word, OpenOffice, Google Docs or LaTeX (WP4.1). We will add tooling and documentation, as WP4.2, to support the use of reference management tools such as Endnote, Mendeley or Zotero, making use of deliverables from WP2.<br/>
		</span></p>
<h1><span style="font-family:Arial; font-size:11pt">3. Quality of proposal and Robustness of Workplan
</span></h1>
<p>
 </p>
<p>
<h2><span style="font-family:Arial">3.1 WP1: Knowledge Blog Process</span></h2>
<p><span style="font-family:Arial"><br/><br/><sup>8</sup>In this project, we aim to develop a light-weight publication framework, including the desirable aspects of the formal<strong> peer-review process</strong>. However, different forms of scientific publication require different levels of peer-review. For example, for http://ontogenesis.knowledgeblog.org, we require two reviews from an editorial board, assessing quality, appropriate for an educational resource. However, for http://process.knowledgeblog.org, which is intended to contain informal &#8220;how-to&#8221; and request for comment documents, a much lighter-weight, single editorial review assessing scope alone is more appropriate. Deliverable <strong>WP1.1</strong> will consist of <strong>documentation</strong> describing both formally and informally, a number of <strong>levels</strong> for the knowledge blog process, and how these can be achieved using a blog. These documents will, themselves, be published on http://process.knowledgeblog.org.<br/><br/><sup>9</sup>These processes will be <strong>implemented</strong> as Deliverable <strong>WP1.2</strong>, comprising <strong>freely available</strong> and widely used pieces of software, with additional &#8220;glue&#8221;. The basic publication framework will use WordPress 3 (WoP) &#8212; an open-source, multi-site, multi-author blogging system used to provide the hosted blog service at http://www.wordpress.com. While we have found that WoP supports many aspects of this process, particularly from the reader&#8217;s perspective, a significant degree of &#8220;book-keeping&#8221; is required from authors, reviewers and editors. Readers know whether a paper has been reviewed or not, but authors have to remember for themselves who is reviewing the paper. Therefore, we will use a &#8220;ticket system&#8221;, specifically Request Tracker 3 (RT) (http://bestpractical.com/rt/). 
Both WoP and RT are <strong>extensible</strong> with plugins and will be extended and adapted to reflect the k-blog levels of WP1.1.<br/><br/><sup>10</sup>We will use this extensibility to provide a light-weight integration. RT operates as an email response system; by <strong>extending WoP</strong> to send <strong>email</strong> on submission of new papers, this can provide both an integration point, as well as the main point of interaction for authors, reviewers and editors. To provide editorial and reviewer functionality, tickets can be moved between queues; extensions to RT will use standard blogging <strong>XML-RPC</strong> calls to feed back to WoP by, for example, re-categorising papers once accepted. OpenID (http://openid.net) will be used to integrate the user accounts between the two systems. WoP already supports this fully, while RT supports it in skeleton form.<br/><br/><sup>11</sup>Although we will provide an implementation of the <strong>k-blog</strong> process, it will be described sufficiently generically to support complete and independent implementation. 
</span></p>
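<p><span style="font-family:Arial">The re-categorisation step described above might look roughly like the following. This is a sketch in Python for illustration only; it assumes the standard <code>metaWeblog.editPost</code> call, which WordPress implements, and the post id, credentials and category name are placeholders. It only builds the request body, so nothing is sent over the network.</span></p>

```python
import xmlrpc.client


def recategorise_request(post_id, username, password, categories):
    """Build the XML body of a metaWeblog.editPost call that sets categories."""
    # Assumption: the server accepts a struct carrying only the field to change.
    content = {"categories": list(categories)}
    # Parameter order: post id, user name, password, content struct, publish flag.
    params = (str(post_id), username, password, content, True)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.editPost")


# Placeholder values: post 42 is moved to a hypothetical "Accepted" category.
body = recategorise_request(42, "editor", "secret", ["Accepted"])
```

<p><span style="font-family:Arial">Whether a partial struct leaves the other fields of a post untouched depends on the server, so a real extension would probably fetch the post first and send it back complete.</span></p>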
<p>
 </p>
<p><em>3.2 </em><strong>WP2: References and Metadata</strong><br/><sup>12</sup>For k-blogs to become an integral part of the scientific record, they must fully support the semantic and linked data environment. Although WoP supports standard <strong>URI based linking</strong> to resources, and bidirectional &#8220;trackback&#8221; linking to other resources, it lacks complete functionality suitable for research communities. This is a rare example of functionality that is not already provided by WoP or an associated plugin. Deliverable <strong>WP2.1</strong> will fulfil this need; we will support the insertion of at least <strong>DOI</strong>s and <strong>PubMed ID</strong>s (PMID), that will be resolved to full human-readable reference lists for display, using APIs provided by CrossRef and NCBI eUtils respectively. To fully support computational agents wishing to access the same information, references will also support <strong>COinS</strong> metadata, embedded into the display HTML. 
</p>
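<p><span style="font-family:Arial">As an illustration of the COinS metadata mentioned above, the following sketch builds the empty <code>span</code> that carries a URL-encoded OpenURL description of one journal article. It is written in Python for brevity and is not the WP2.1 plugin; the function name and the example DOI are placeholders, and only a handful of the possible fields are shown.</span></p>

```python
from html import escape
from urllib.parse import urlencode


def coins_span(doi, article_title, journal_title, year):
    """Return a COinS span describing one journal article."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", article_title),
        ("rft.jtitle", journal_title),
        ("rft.date", str(year)),
        ("rft_id", "info:doi/" + doi),
    ]
    # URL-encode the key/value pairs, then escape the result for an attribute.
    title = escape(urlencode(fields), quote=True)
    return '<span class="Z3988" title="' + title + '"></span>'


# Placeholder reference; in the proposal the fields would come from CrossRef
# or NCBI eUtils rather than being typed in by hand.
example = coins_span("10.1000/example", "A title", "A journal", 2010)
```

<p><span style="font-family:Arial">The span has no visible content, so the human-readable reference list and the machine-readable metadata can sit side by side in the same HTML.</span></p>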
<p><span style="font-family:Arial">K-blog posts will also require outward facing metadata, that describe the resources they provide in a standards-compliant manner. The Open Archives Initiative (OAI) provides standards that aim to facilitate the efficient dissemination of content. Specifically, the Object Reuse and Exchange specification (<strong>OAI-ORE</strong>) is a standard for the description and exchange of compound digital objects  (such as a WoP post or page). The WordPress OAI-ORE plugin provides link header elements that implement this specification.<br/><br/><sup>13</sup>Our initial investigations into the k-blog process showed that WoP support for versioning and provenance is lacking; the k-blog process involves updating papers after submission but before final acceptance. While WoP stores all these <strong>versions</strong>, these are only currently visible by authors or editors through the administration interface. Whilst existing plugins for WoP already provide some of this functionality, Deliverable<strong> WP2.2</strong> will uncover these to readers, along with a defined permalink scheme for access to all versions, providing full <strong>provenance</strong>. <br/><br/><sup>14</sup>WoP supports <strong>bi-directional</strong> links in the form of trackbacks; this is mediated by XML-RPC calls between resources when a link is made. This will support linking to data where, for example, the data is another <strong>k-blog</strong>; however, general data resources may lack support for this process. Therefore, as Deliverable<strong> WP2.3</strong>, we will provide a trackback proxy, hosted on the http://knowledgeblog.org server, storing and presenting these links for resources  that cannot directly  process trackbacks.<br/><br/><sup>15</sup>To complete this work package, we will add semantics to the links using CiTO, as Deliverable <strong>WP2.4</strong>. 
Therefore, as well as enabling easier data linking and provenance, we will also enable addition of meaning to these links.
</span></p>
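<p><span style="font-family:Arial">The storing and presenting role of the trackback proxy described above can be sketched as a small in-memory store. This is an illustration in Python under stated assumptions, not the WP2.3 design: the class and method names are hypothetical, the handling of the incoming calls is left out entirely, and a real proxy would need persistent storage.</span></p>

```python
from collections import defaultdict


class TrackbackProxy:
    """In-memory store of inbound links for resources that cannot accept pings."""

    def __init__(self):
        # Maps a target URI to the list of (source URI, title) pairs linking to it.
        self._links = defaultdict(list)

    def record(self, target_uri, source_uri, title=""):
        """Remember that source_uri links to target_uri; ignore exact repeats."""
        entry = (source_uri, title)
        if entry not in self._links[target_uri]:
            self._links[target_uri].append(entry)

    def links_for(self, target_uri):
        """Return the (source, title) pairs recorded for target_uri."""
        return list(self._links.get(target_uri, []))


# Placeholder URIs: a k-blog post links to a dataset that cannot take trackbacks.
proxy = TrackbackProxy()
proxy.record("http://example.org/dataset/1", "http://example.org/post/7", "A post")
proxy.record("http://example.org/dataset/1", "http://example.org/post/7", "A post")
```

<p><span style="font-family:Arial">Recording the same link twice leaves a single entry, so repeated notifications from an edited post do not inflate the list presented to readers.</span></p>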
<p>
 </p>
<p>
<h2><span style="font-family:Arial">3.3 WP3 &#8211; Specialist Environments</span></h2>
<p><span style="font-family:Arial"><strong><br/><br/></strong><sup>16</sup>The k-blog platform and process are designed to be flexible and adaptable to the needs of specialist environments. We will use three main use cases to ensure <strong>real world</strong> applicability of the software, as well as <strong>fulfilling</strong> the immediate <strong>needs</strong> of these communities.<br/><br/><sup>17</sup>For Deliverable <strong>WP3.1</strong>, we will add additional features for supporting the microarray community. Currently, the microarray community is well serviced in terms of <strong>metadata</strong> capture (MIAME) and <strong>deposition</strong> in public repositories (ArrayExpress, GEO). As part of WP2, we will support <strong>linking</strong> to these datasets through stable URIs. However, these resources deal only with data generation. Post-processing and analysis is largely captured at the publication stage, often in supplementary material.<br/><br/><sup>18</sup>A substantial amount of this analysis uses BioConductor: a widely used, open-source platform for statistical microarray analysis based on the R statistical programming language. We will extend k-blog with <strong>specific support for R</strong> and BioConductor. Authors will be able to directly embed code into k-blog papers, along with the figures that result; as a result reviewers and readers will be able to see a <strong>computationally precise description </strong>of methods and replicate the generation of figures should they choose.<br/><br/><sup>19</sup>Finally, we will investigate the possibility of publication to a k-blog using only R code and references to public databases, in a process similar to Sweave &#8212; figures will be generated on the server, providing guarantees of correctness and precise provenance. 
The limited scope of this call means this part of WP3.1 will be proof-of-principle only.<br/><br/><sup>20</sup>For <strong>WP3.2</strong>, we will focus on the <strong>public health community</strong> (PHC): a key workforce in delivering quality and effective healthcare by providing timely and accurate public health intelligence (PHI).  PHI is a varied environment performing statistical analyses: producing information figures, diagrams and reports to communicate results to the wider health community.  However, the PHC operates in small groups with little knowledge networking.  The main aim of the k-blog is to improve the availability of health information, data and knowledge, to inform decisions for health protection and care standards as supported by the Quality Improvement Productivity and Prevention initiative.  The NWeHealth <em>e-Lab</em> project, hosted at The University of Manchester, provides an environment to bring together <em>research objects</em> into a single location. As elsewhere, textual data forms the key hub that links together all the other forms of knowledge. By <strong>linking to e-Lab</strong>
			<em>research objects</em> from a k-blog, this link will be made explicit, available, interpretable and directly valuable to the PHC; as a result WP3.2 is synergistic with the rest of the proposal.  This community also brings a set of access control requirements. To support these we will use existing WoP facilities, providing a simple, easy-to-use three level access model.
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>20</sup>For WP3.3, we will generate k-blog <strong>content</strong> about <strong>Taverna</strong> workflows and methods for building them. Workflows have become a popular way of realizing computational analyses and have become an important form of <strong>data</strong>. The <strong>JISC funded myExperiment</strong> project is widely used to disseminate the workflows themselves. Knowledge about issues surrounding workflows is, however, more difficult to produce and disseminate. A k-blog, with its ability to produce short, targeted articles as the need arises and the resources become available for writing, suits the need for Taverna workflow documentation. We will seek k-blogs on Taverna issues such as: the basics of workflow design; how to choose among a set of similar services in producing a workflow; and, the testing of workflows. We will implement a light-weight mechanism, using <strong>trackbacks</strong>, to link between the k-blog and myExperiment. 
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>21</sup>As part of <strong>WP3</strong>, we will also hold four workshops, at 3-month intervals, each focusing on one particular k-blog and community. These workshops will be of the form previously trialled as part of the Ontogenesis network, and will serve several purposes: requirements gathering and feedback for us, education for the community, and development of content that demonstrates the process to the general readership.  
</span></p>
<p>
 </p>
<p>
<h2><span style="font-family:Arial">3.4 WP4 &#8211; Integration with Existing Working Practices</span></h2>
<p><span style="font-family:Arial"><br/><br/><sup>22</sup>For the k-blog process to be <strong>acceptable</strong> to <strong>communities</strong> such as those described in WP3, it must fit with existing working practices. Researchers mostly write documents using a word-processor. Fortunately, as the <strong>k-blog</strong> platform is based on the <strong>widely-used</strong> WoP, which in turn offers a <strong>widely-supported</strong> API, this style of working can be readily integrated. It is already possible to author using Word (2007 onward), OpenOffice, Google Docs and LaTeX using integrated or existing technologies, as demonstrated by our previous work at http://ontogenesis.knowledgeblog.org. For Deliverable <strong>WP4.1</strong>, user-oriented documentation describing these tools will be developed. This documentation will also describe clearly how to present and organise papers in a way which is optimized for the <strong>k-blog</strong> process. While we expect this documentation to take significant time to produce, as we refine it in response to user feedback, it is important to note that a k-blog is already <strong>useful</strong> and <strong>possible</strong>.
</span></p>
<p style="background: white"><span style="color:black; font-family:Arial">To take maximal advantage of linking technologies developed in WP2, we will need to integrate with existing technologies for referencing. As deliverable <strong>WP4.2</strong>, we will add tooling to enable the use of bibliographic tools such as EndNote, Mendeley, Zotero or BibTeX to insert references that <strong>k-blog</strong> can directly translate. Largely, this should consist of &#8220;styles&#8221; that modify the in-text citation, as the reference plugin of <strong>WP2.1</strong> will generate reference lists. As with other deliverables, this tooling will include substantial documentation, developed using the <strong>k-blog</strong> process. 
</span></p>
<h2><span style="font-family:Arial; font-size:11pt">4. Project Timeline
</span></h2>
<p style="background: white">
 </p>
<div>
<table style="border-collapse:collapse" border="0">
<colgroup>
<col style="width:91px"/>
<col style="width:90px"/>
<col style="width:94px"/>
<col style="width:66px"/>
<col style="width:302px"/></colgroup>
<tbody valign="top">
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>Name</strong></span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>Start</strong></span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>End</strong></span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>Staff</strong></span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>Notes</strong></span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">    WP 1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/08/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/10/2010</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 1.1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/08/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">31/08/2010</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">All</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">A documented k-blog process</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 1.2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/09/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/10/2010</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">DS,SC</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Implementation with off-the-shelf software</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">    WP 2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/04/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 2.1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">26/02/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">SC</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">COinS metadata on posts</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 2.2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">29/01/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">SC</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Client-side, permanently linked versions</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 2.3</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">03/01/2011</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">26/02/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">DS</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Bi-directional links to other datasets</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 2.4</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/03/2011</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/04/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">PL</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Semantic linking with CITO</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">    WP 3</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/07/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 3.1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/07/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">GM</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Specialist environment – Healthcare</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 3.2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/07/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">DS</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Specialist environment &#8211; Microarrays</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 3.3</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/07/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">RS</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Specialist environment – Workflows</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">    WP 4</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/08/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/06/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 4.1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/08/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/04/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">GM,DS</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Authoring documentation and tools</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 4.2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/05/2011</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/06/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">GM,SC</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Referencing documentation and tools</span></p>
</td>
</tr>
</tbody>
</table>
</div>
<p style="background: white">
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5. Project Management Arrangements
</span></h2>
<p><span style="font-family:Arial"><sup>23</sup>The project will be managed from Newcastle University; the <strong>primary management</strong> will be by Dr Lord, who will be responsible for:
</span></p>
<ul>
<li>
<div style="background: white"><span style="font-family:Arial">Developing Project Management Plans;
</span></div>
</li>
<li>
<div style="background: white"><span style="font-family:Arial">Ensuring that the Project technical objectives are met;
</span></div>
</li>
<li>
<div style="background: white"><span style="font-family:Arial">Prioritising and reconciling conflicting opportunities;
</span></div>
</li>
<li>
<div style="background: white"><span style="font-family:Arial">Reporting to and collaborating with the JISC Programme Manager;
</span></div>
</li>
<li>
<div style="background: white"><span style="font-family:Arial">Dissemination of the k-blog platform.
</span></div>
</li>
</ul>
<p><span style="font-family:Arial">Project progress will be evaluated through <strong>scheduled</strong>, short, &#8220;<strong>stand-up</strong>&#8221; meetings on a weekly basis, conducted face-to-face, via Skype or phone as appropriate. Although most project staff are co-located, primary <strong>unscheduled</strong> communication will be via a <strong>public mailing list</strong>, ensuring maximum visibility and openness.  <strong>User consultation</strong> will be via the <strong>public mailing list</strong>, as well as through a &#8220;<strong>dogfooding</strong>&#8221; k-blog.  All project staff have been handpicked; they are highly experienced and self-directed, as outlined elsewhere. All are associated with several other projects and duties (research, research support, teaching and training), and are responsible for managing these independent workloads.  
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.1 Risks
</span></h2>
<p><span style="font-family:Arial"><sup>24</sup>Staff Risk – as with all projects, loss of staff could negatively impact this project; however, all staff are on permanent contracts and have long histories in research, so this is less likely. Additionally, by dividing the work between five individuals, we limit the risk should a single person leave. 
</span></p>
<p><span style="font-family:Arial">WoP3 and other dependencies – the project depends on other software, most notably WoP, for which a new version (3.0) is now in beta; however, the software is widely supported. Other software is replaceable. 
</span></p>
<p><span style="font-family:Arial">Standards Shifting – the project depends on a number of standards and these may change. In this project, we will <strong>NOT </strong>support standards, but rather use those that support us. Where standards change rapidly, their implementation will be delayed (until they stabilize) or dropped. None of the standards described here is critical to the success of the project. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.2 IPR Position
</span></h2>
<p><span style="font-family:Arial"><sup>25</sup>All code will be developed under open source licences. WoP and RT are licensed under GPL, so code linking to these will be likewise licensed. Code that is separable will be released under LGPL. Code will remain copyright of the respective institutions or authors. Any documentation produced by project staff relating to the project will be licensed under a Creative Commons Attribution licence. Licensing of individual k-blogs will be delegated, but permissive licences will be encouraged. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.3 Sustainability
</span></h2>
<p><span style="font-family:Arial"><sup>26</sup>This project is largely based around innovative, novel and leading <strong>use of existing</strong> software.  As such, the sustainability of the majority of the technology base is not dependent on project members but on large companies with established and proven business models. The <strong>k-blog</strong> process will be cleanly separated from its implementation, ensuring only weak dependencies on underlying software. Where we produce software &#8220;glue&#8221;, public and widely supported APIs will be used where possible. This will ensure that components are replaceable. All code, including historical versions, will be publicly available. Documents produced by project staff will be publicly available and clearly licensed, so will be archived through the internet &#8220;cloud&#8221; resources; we are also seeking explicit support for archiving from the British Library. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.4 Staff Recruitment
</span></h2>
<p><span style="font-family:Arial"><sup>27</sup>All staff are already in post. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.5 Key Beneficiaries
</span></h2>
<p><span style="font-family:Arial"><sup>28</sup>Our key beneficiaries are the <strong>public health</strong>, <strong>microarray</strong> and <strong>workflow</strong> communities; as the k-blog process is based around commodity software, these groups can use the <strong>basic </strong>environment from the first day of the project to generate and share content. As the project progresses, so will the process, the software to support it and the documentation to explain it; at all stages, the k-blog process fulfils a <strong>clear and immediate need</strong>. While we are specifically targeting these communities, the k-blog process and platform are sufficiently <strong>generic</strong> that they can support a <strong>wide range</strong> of research activities.
</span></p>
<p><span style="font-family:Arial">Although presented here as a single platform, the process and components are <strong>separable</strong> and can benefit communities independently. In particular, the tools and documentation from WP2 and WP4 will find use within the research blogging community, for whom the lack of tooling for referencing is a notable difficulty. Finally, the statement of a peer-review process, and its implementation within RT, will be applicable to any peer-review environment regardless of the form of publication. This includes publications produced using wikis or other Content Management Systems. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.6 Engagement with Community
</span></h2>
<p><span style="font-family:Arial"><sup>29</sup>We consider the mechanisms for engagement with four kinds of community. Engagement with our core <strong>content generating</strong> community is an intrinsic part of this proposal, as described in <strong>WP3</strong>.  Further interaction with more disparate groups will be maintained through personal contacts; each of the five individuals named in this proposal is experienced and embedded in different communities (health care, microarray, ontology, proteomics). Engagement with our core <strong>content consuming</strong> community is, again, an intrinsic part of the proposal; all project communications will be via an open mailing list or k-blog. Project members are active users of Web 2.0 social technologies; our initial trials as part of Ontogenesis showed this approach to be a highly effective form of dissemination, with minimal effort. Engagement with <strong>software users</strong> will be via website and direct interaction. All software will be released or advertised via normal channels (website, versioning, and mailing list), including a (Debian) package repository for those wishing to set up their own server.  Finally, <strong>developer communities</strong> will not be specifically targeted, but our open source, continually integrated development plan will be attractive, and we will accept suitably licensed contributions.  
</span></p>
<p><span style="font-family:Arial"><sup>30</sup>All communities will benefit from the open and agile development methodology we will adopt; changes to the environment will be integrated and released rapidly, ensuring continual improvement and facilitating rapid feedback cycles. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">6. Previous Experience and Project Team
</span></h2>
<p>
 </p>
<p><span style="font-family:Arial"><sup>31</sup><strong>Dr. Phillip Lord</strong> is a Lecturer in Computing Science at Newcastle University. He has a PhD in yeast genetics from the University of Edinburgh, after which he moved into bioinformatics. He is well known for his work on ontologies in biology, as well as his contributions to eScience beginning with his role as an RA on the myGrid project. Since his move to Newcastle, he has been an investigator on three more eScience projects: CARMEN, ONDEX and InstantSOAP, as well as maintaining an active engagement in standards development (OBI, MIGS, MIBBI), and publishing on the fundamentals of ontology design. He was an active participant in the Ontogenesis network, and developed the initial idea for knowledge blogs as part of this. He is an active blogger and developer.
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>32</sup><strong>Dr. Georgina Moulton</strong> is an Education and Development Fellow at The University of Manchester.  Since 2005 her main roles have been to co-ordinate the development and delivery of multi-disciplinary bio/health informatics education programmes; and to facilitate the engagement of biological and health communities in a variety of bio and health informatics research projects (<em>e.g.,</em> ONDEX, Obesity e-Lab).  For 3 years, Georgina was the EPSRC-funded Ontogenesis Network Manager, in which role she co-ordinated the network's activities, expanded the network by facilitating the development of new activities, and was involved in the trial k-blog process.  More recently her work has included the development and delivery, in conjunction with NHS partners, of an education and development programme tailored to match the needs of North West public health analysts and the wider healthcare workforce.  
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>33</sup><strong>Dr. Daniel Swan</strong> has a PhD in developmental biology and continued to work in developmental biology as a post-doctoral researcher before moving into bioinformatics in 2001.  Subsequent positions included working for Bart&#8217;s and the London Genome Centre and the Centre for Ecology and Hydrology in informatics-driven roles dealing with large, distributed biological datasets generated by large user communities.  Currently the manager of the Newcastle University Bioinformatics Support Unit, he leads a small team helping biological researchers to generate, capture, store and analyse their digital data.  His interdisciplinary background means he has a grounding in both computer and biological sciences and is comfortable working on CS-focused projects (CARMEN, InstantSOAP, Bio-Linux) as well as acting in a research capacity analysing high-throughput data. 
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>34</sup><strong>Dr. Simon Cockell</strong> has a PhD in Genetics from Leicester University, and refocussed into Bioinformatics with a Masters degree from Leeds in 2005. From there he moved to Newcastle, and the Bioinformatics Support Unit. Since coming to Newcastle, Simon has worked on a range of projects involving large scale analyses (AptaMEMS-ID), data integration (Ondex) and health informatics (MRC Mitochondrial Disease Cohort). 
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>35</sup><strong>Dr Robert Stevens </strong>is a senior lecturer in Bioinformatics in the Bio and Health Informatics group at the University of Manchester. His main areas of research are in the development and use of semantics within the life sciences. This is blended with the use of e-Science platforms to gather and manage the data and knowledge of the life sciences. He was PI on the Ontogenesis network that ran the meetings for the first k-blog. He is or has been a co-investigator on the myGrid and myExperiment grants that will provide both content and technical input to this project. As well as the JISC funded myExperiment project, Stevens was an investigator on the JISC funded CO-ODE project that developed Protégé 4. On the back of this, Stevens has led the OWL training activities at Manchester that have directly fed into the Ontogenesis k-blog. This range of experience makes Stevens an ideal partner to lead the development of content within this project.
</span></p>
<p><span style="font-family:Arial">
		</span> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/08/a-new-grant-for-knowledgeblog/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Realism and Science</title>
		<link>http://www.russet.org.uk/blog/2010/07/realism-and-science/</link>
		<comments>http://www.russet.org.uk/blog/2010/07/realism-and-science/#comments</comments>
		<pubDate>Wed, 28 Jul 2010 15:28:42 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Papers]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1713</guid>
		<description><![CDATA[This post carries the text of a paper accepted for PLoS One (now published). I publish it here as a pre-print because of the recent discussion on OBO discuss about realism. I have converted this from the original latex, which isn&#8217;t perfect. Apologies for errors. The [PDF] is available here. Adding a little reality to [...]]]></description>
			<content:encoded><![CDATA[<p>This post carries the text of a paper accepted for PLoS One (now <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0012258">published</a>). I publish it here
as a pre-print because of the recent discussion on OBO discuss about realism.
I have converted this from the original latex, which isn&#8217;t perfect. Apologies
for errors.</p>
<p>The <a href="http://homepages.cs.ncl.ac.uk/phillip.lord/download/publications/realism_and_science.pdf">[PDF]</a> 
is available here. </p>
<div>
<p><big class="xlarge"><b class="bfseries">Adding a little reality to
  building ontologies for biology</b><br /></big> Phillip Lord and Robert
  Stevens<br /> School of Computing Science<br /> Claremont Road<br />
  Newcastle University<br /> Newcastle-upon-Tyne, UK<br />
  <a href="mailto:phillip.lord@newcastle.ac.uk">phillip.lord@newcastle.ac.uk</a><br />
  School of Computer Science<br /> The University of Manchester<br /> Oxford
  Road<br /> Manchester, UK<br />
  <a href="mailto:robert.stevens@manchester.ac.uk">robert.stevens@manchester.ac.uk</a></p>
</div>
<h1 id="a0000000002">Abstract</h1>
<p><b class="bfseries">Background:</b> Many areas of biology are open to
mathematical and computational modelling. The application of discrete, logical
formalisms defines the field of biomedical ontologies. Ontologies have been
put to many uses in bioinformatics. The most widespread is for description of
entities about which data have been collected, allowing integration and
analysis across multiple resources. There are now over 60 ontologies in active
use, increasingly developed as large, international collaborations.</p>
<p>There are, however, many opinions on how ontologies should be authored;
that is, what is appropriate for representation. Recently, a common opinion
has been the &ldquo;realist&rdquo; approach that places restrictions upon the
style of modelling considered to be appropriate.</p>
<p><b class="bfseries">Methodology/Principal Findings:</b> Here, we use a
number of case studies for describing the results of biological experiments.
We investigate the ways in which these could be represented using both realist
and non-realist approaches; we consider the limitations and advantages of each
of these models.</p>
<p><b class="bfseries">Conclusions/Significance:</b> From our analysis, we
conclude that while realist principles may enable straight-forward modelling
for some topics, there are crucial aspects of science and the phenomena it
studies that do not fit into this approach; realism appears to be
over-simplistic which, perversely, results in overly complex ontological
models. We suggest that it is impossible to avoid compromise in modelling
ontology; a clearer understanding of these compromises will better enable
appropriate modelling, fulfilling the many needs for discrete mathematical
models within computational biology.</p>
<h1 id="a0000000003">Introduction</h1>
<p>Ontologies are now widely used for describing and enhancing biological
resources and biological data, largely following on from the success of the
Gene
Ontology&nbsp;<span class="cite">[<a href="#Ashburner2000">1</a>]</span>.
Ontologies have been used for many purposes, from schema integration to value
reconciliation to query
interfaces&nbsp;<span class="cite">[<a href="#handbook2">2</a>]</span>.
Ontologies have also become a cornerstone of computational biology and
bioinformatics. As computationally amenable artifacts they are, themselves, a
direct part of computational biology; many computational biologists are
involved in their production and maintenance. Many more use ontologies to
summarise their data, often by looking for
over-representation&nbsp;<span class="cite">[<a href="#Zeeberg2003">3</a>]</span>,
as the basis for drawing computational inferences about
data&nbsp;<span class="cite">[<a href="#Wolstencroft2006">4</a>]</span>,
or as the basis for determining semantic
similarity&nbsp;<span class="cite">[<a href="#Lord2003">5</a>]</span>.
Even those not making direct computational use of ontologies are likely to
come into contact with them, for example, when preparing annotation as part of
their data
release&nbsp;<span class="cite">[<a href="#Whetzel2006a">6</a>]</span>.</p>
<p>It is, therefore, of vital interest to computational biologists that
ontologies for use within biomedicine are fit for purpose. One effort that
aims to increase the quality of the ontologies available within biomedicine is
the &ldquo;OBO
Foundry&rdquo;&nbsp;<span class="cite">[<a href="#Smith2007">7</a>]</span>.
The main tool that it uses for this is &ldquo;an evolving set of shared
principles governing ontology development&rdquo;. The initial eleven
principles of the OBO
Foundry&nbsp;<span class="cite">[<a href="#OBOFoundry2006">8</a>]</span>
were largely concerned with what might be termed &lsquo;good engineering
practice&rsquo; (ontologies must, for example, be openly available, with a
common syntax, well documented, and used). These principles have later been
joined by a further
eleven&nbsp;<span class="cite">[<a href="#OBOFoundry2008">9</a>]</span>;
these include principles such as &ldquo;textual definitions will use the
genus-species form&rdquo;, &ldquo;Use of Basic Formal Ontology&rdquo; and, the
somewhat quixotic, &ldquo;terms [&hellip;] should correspond to instances in
reality&rdquo;. These stem not from engineering practice, but from a
perspective called <i class="itshape">realism</i>.</p>
<p>The many different uses for ontologies that we have described are reflected
in different understandings and methodologies about how and what to represent
in an ontology. Over the last few years, for many uses the paradigm has moved
from &ldquo;a conceptualization of the application domain&rdquo; toward
&ldquo;a description of the key entities in reality&rdquo;; it is this latter
approach that defines
realism&nbsp;<span class="cite">[<a href="#Johansson2006">10</a>]</span>.
This approach to ontology is typified by the Basic Formal Ontology (BFO); a
small upper-ontology for use within science in general and biomedical ontology
building in
particular&nbsp;<span class="cite">[<a href="#Grenon2004">11</a>]</span>.</p>
<p>There has been significant discussion regarding the possibility of
representing <em>only</em> &ldquo;real entities&rdquo; in computational
ontologies&nbsp;<span class="cite">[<a href="#smith2004beyond">12</a>]</span>.
Likewise, there has been significant discussion about the philosophy
surrounding realism and the role of ontology in its
representation&nbsp;<span class="cite">[<a href="#Johansson2006">10</a>]</span>.
While it is argued by some that it is possible to represent <em>only</em>
reality when making a domain description, there has, however, been little
discussion on whether it is necessarily desirable to do so.</p>
<p>In this paper, we consider the implications that realism has for the
choices that are open to the ontologist while they are modelling their domain
of interest. In particular, we consider the implications that this has for the
computational capabilities of any resultant ontology, in terms of its ability
to represent scientific knowledge in a computationally amenable form, as well
as the ability to perform automated inference or statistics over this
knowledge. We suggest that the application of realism results in ontologies
that are over-complex, awkward or limited; as such, realism falls far short of
its aim of increasing the fitness-for-purpose of ontologies. This approach,
therefore, is unlikely to fulfil the needs of computational biologists who
form a substantial part of both the user and developer community for
bio-ontologies.</p>
<h1 id="a0000000004">Methods</h1>
<p>In this paper, we take the approach of a number of worked exemplars; this
is a complementary approach to an in-depth consideration of the modelling
decisions for a particular area or particular ontology, which we have used
previously&nbsp;<span class="cite">[<a href="#Lord2009">13</a>]</span>,
as it allows broader conclusions about the general principles of ontology
development. For each section, as well as the main exemplars, a number of
related examples are briefly discussed, to reinforce that the issues raised
are, indeed, general.</p>
<p>The exemplars have been selected by several criteria. First, the main
exemplars are all taken from within biomedicine; this is also true for the
majority of the related examples. Second, we have chosen exemplars that
provide as wide a coverage of biology as possible. Third, for practical
reasons, we have chosen exemplars where the underlying science is relatively
basic to much of biology and is likely to be immediately clear to the reader
without significant explanation.</p>
<p>We have chosen exemplars requiring as little knowledge of specific
ontologies as possible. We refer to only three. The first is BFO (see
the section <a href="#a0000000006">&ldquo;What is Realism?&rdquo;</a>) which is a canonical example of a realist
ontology. BFO is described as a cross-domain, upper-ontology; as a result,
most terms fail the criteria given above; they are of poor biomedical
relevance, and are not basic science or immediately clear. We have, therefore,
also used PATO
(see <a href="http://obofoundry.org/wiki/index.php/PATO:Main_Page">http://obofoundry.org/wiki/index.php/PATO:Main_Page</a>);
this defines &ldquo;qualities&rdquo; that we might consider attributes of
other entities; so, the authors of this paper have a height, weight and shape,
all of which are considered to be qualities of the authors. Finally, we use
the relationship
ontology&nbsp;<span class="cite">[<a href="#Smith2005">14</a>]</span>;
this describes the relations between entities. So, for example, the height of
the author <em>inheres_in</em> the author.</p>
<p>As discussed in this and other
works&nbsp;<span class="cite">[<a href="#Russell1946">15</a>, <a href="#Merrill2010">16</a>]</span>,
&ldquo;realism&rdquo; is itself poorly defined. Where this lack of definition
makes the consequences of realism hard to determine, we have taken the
practical course, of showing the consequences as they play out in practice; to
an extent, therefore, these three ontologies are not only exemplars for
realism, but define it, as it is currently practised. In short, for this
paper, when we say &ldquo;realism&rdquo;, we largely mean &ldquo;realism as
practised by BFO&rdquo;. We do not claim, in this paper, to address all the
philosophical perspectives that have, through time, carried the name
&ldquo;realism&rdquo;.</p>
<h1 id="a0000000005">Results</h1>
<h2 id="a0000000006">What is Realism?</h2>
<p>Building ontologies based on reality is obviously appealing to most
scientists; after all the study of <em>reality</em> to determine its behaviour
and laws is the goal of scientists. A brief consideration, however, shows that
this notion cannot define a methodology for the building of ontologies.</p>
<p>Within the context of science &ldquo;reality&rdquo; would normally be taken
to mean our experimental or observational data; but the statement that science
(ontologies) should be based on experimental or observational data is a truism
and, as such, has no explanatory power. The &ldquo;real&rdquo; in realism
refers, in fact, to the belief that the categories that we can use to divide
entities are, themselves, real.</p>
<p>This distinction stems from an old argument from philosophy; realism
against conceptualism. Again, both sides of the argument agree that the world
we can perceive and, as scientists, experiment on is mind-independent. The
conceptualist, however, argues that the categories that they
term <em>concepts</em> are a product of social agreement. Conversely, the
realist argues that these categories that they term <em>universals</em> are
themselves real, that is mind independent in their own right, like the
entities they describe.</p>
<p>This distinction may seem fairly confusing; as
Russell&nbsp;<span class="cite">[<a href="#Russell1946">15</a>]</span>
says &ldquo;if I have failed to make Aristotle&rsquo;s theory of universals
clear, that is (I maintain) because it is not clear&rdquo;. In fact, there is
a third possibility that is a more empirical view&mdash;that is, if categories
(or other models) help in describing and predicting experimental data, then
they are useful regardless of whether they are real or
otherwise&nbsp;<span class="cite">[<a href="#Dumontier2010">17</a>]</span>.
As an example, the Mendelian notion of segregating units of inheritance was
defined and useful many years before a complete mechanistic description of
their cause was available. In this context, we note that there is no commonly
used term to express this form of category; most commonly,
&ldquo;concept&rdquo; is used.</p>
<p>For a field with a core activity of providing definitions, there is
surprisingly little agreement on the meaning of the word
&ldquo;ontology&rdquo;; as there have been many papers on the topic, we
consider just a few that reflect the distinction between these approaches.
Probably the most commonly cited
definition&nbsp;<span class="cite">[<a href="#Gruber1992">18</a>]</span>
describes an ontology as &ldquo;a specification of a conceptualization&rdquo;.
This definition emphasises the formality (i.e. logical and, therefore,
computationally amenable) aspect to ontology development.</p>
<p>This is countered with a realist definition; while the requirements from
Gruber&rsquo;s definition&mdash;a formal specification&mdash;are necessary,
realist ontologies add the requirement that &ldquo;the nodes and edges
correspond not to concepts but, rather, to entities in
reality&rdquo;&nbsp;<span class="cite">[<a href="#Ceusters2006">19</a>]</span>.</p>
<p>What does &ldquo;reality&rdquo; in this context actually mean? Definitions
such as &ldquo;that which exists&rdquo; are strangely circular, leaving the
question of what &ldquo;exists&rdquo; means.
Smith&nbsp;<span class="cite">[<a href="#smith2004beyond">12</a>]</span>
adds the proviso that reality is &ldquo;captured in scientific laws&rdquo;.
Being a scientific law is not strictly enough, as some are later shown to be
wrong, but a scientific law is the current best attempt at reality; this
possibility does not make an ontology non-realist. For a realist ontology, the
nodes are &ldquo;universals&rdquo;&mdash;entities in reality&mdash;rather than
concepts; at least one particular must exist for every universal.</p>
<p>This still leaves the difficulty of applying the realist definition in
practice. So most scientists will happily accept, for example, that a cell is
real as it is an entity that can be observed, interacted with and manipulated.
However, concepts such as
&ldquo;function&rdquo;&nbsp;<span class="cite">[<a href="#Lord2009">13</a>]</span>
have raised more
discussion&nbsp;<span class="cite">[<a href="#Shrager2003">20</a>]</span>;
is this &ldquo;real&rdquo; or just a word biologists use as a point of
reference? While definitions involving &ldquo;entities in reality&rdquo;
may be of philosophical interest, they are hard to turn into a specific assay
for testing whether a particular concept is also a universal. Instead of a
clear assay for existence, realism offers direction about what concepts are
NOT reality, rather than those that are reality. For example, and perhaps
ironically given the negative practical definition of reality, a statement
such as:</p>
<pre>
  Dog is_a not Cat
</pre>
<p>is not held to be a statement about reality as it is a logically
constructed example of subsumption (an <tt>is_a</tt> relationship); there is
no real universal containing particular <tt>not Cat</tt>s in existence.
Likewise,</p>
<pre>
  Dog is_a (Dog or Cat)
</pre>
<p>as the existence of particular <tt>Dog</tt>s and <tt>Cat</tt>s does not
mean that there are any particular <tt>Dog or Cat</tt>s (examples modified
from&nbsp;<span class="cite">[<a href="#smith2004beyond">12</a>]</span>).</p>
<p>This is not meant to provide a complete introduction to
&ldquo;realism&rdquo;, but to provide a grounding for the discussion that
follows; we will consider the issues raised by realism, throughout the paper.
A more philosophical treatment of realism is given by
Merrill&nbsp;<span class="cite">[<a href="#Merrill2010">16</a>]</span>.
It is useful to note
Gruber&rsquo;s&nbsp;<span class="cite">[<a href="#Gruber1992">18</a>]</span>
statement: &ldquo;And it [a computational ontology] is certainly a
different sense of the word than its use in philosophy.&rdquo; In this paper,
we are concerned with ontologies as computational artefacts.</p>
<p>To summarise, a realist approach to ontology says that the categories or
universals into which objects or particulars fall have an existence in their
own right. It is these universals and <em>only</em> these universals that a
realist approach says should be the nodes within an ontology. In this paper we
examine whether this approach is an adequate means to provide an account for
the data produced by biomedicine.</p>
<h2 id="a0000000007">Models that represent reality</h2>
<p>In this section, we suggest that many universals have a range of
representations. In some cases, the choice of representation may be obvious,
such as length which has a natural scientific representation in SI units. In
many cases, however, there is no clear set of criteria for choosing between
representations. We consider the way that one quality, <em>colour</em>, could
be represented ontologically.</p>
<p>Colour is a complex phenomenon. The colour of an object or other phenomenon
arises, in part, from that object and, in part, from the eye that perceives
it.</p>
<p>A representation of the physical reality would be an account of the
reflection, transmission and perception of light by an organism. Such an
account of the reality of light and its perception might cover the following
facts: Chlorophyll is green in reflection and red in transmission; a flower
petal appears white to a human, but has UV stripes to a bee; the plant leaf
and the algae appear green to humans, but have different reflection spectra
because their chlorophylls co-ordinate to their Mg<sup>2+</sup> ion in
different ways.</p>
<p>There have been a number of different attempts to represent the
complexities of colour numerically, for a number of different purposes. These
are models that allow us to describe colour, without having to deal with the
underlying physics or reality of colour. Probably the best known of these are
RGB (Red, Green, Blue) or HSV (Hue, Saturation, Value), both of which are
additive colour models appropriate for describing colour on a display screen.
CMYK (Cyan, Magenta, Yellow and Black) is a subtractive colour model and
commonly used for printing.</p>
<p>Collectively these representation schemes are known
as <i class="itshape">colour models</i>. That none of these schemes has become
predominant reflects both their different uses and the preferences of
different user groups.</p>
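<p>As an illustrative sketch only (it is not part of PATO or of any ontology discussed here), the same colour can be restated in several of these models. The example below uses the <tt class="ttfamily">colorsys</tt> module from the Python standard library for RGB and HSV; the <tt class="ttfamily">rgb_to_cmyk</tt> helper is a hypothetical, naive conversion that ignores device colour profiles.</p>

```python
import colorsys


def rgb_to_cmyk(r, g, b):
    """Naive RGB to CMYK conversion (hypothetical helper, no colour profiles)."""
    k = 1.0 - max(r, g, b)
    if k == 1.0:
        # Pure black: the chromatic components are undefined, so use zero.
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)


# One colour, three representations (all components are in the range 0..1).
pure_red_rgb = (1.0, 0.0, 0.0)
pure_red_hsv = colorsys.rgb_to_hsv(*pure_red_rgb)
pure_red_cmyk = rgb_to_cmyk(*pure_red_rgb)
```

<p>Each tuple is a valid description of the same colour; none of them is the underlying physics of reflection or perception.</p>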
<p>For the ontology builder, this leaves us with a difficult choice:</p>
<ol class="enumerate">
<li>
<p>We bless one of the colour models, substituting the model for the
    underlying physics and do not describe the others.</p>
</li>
<li>
<p>We describe all of the colour models, but do not describe that they are
    part of a colour model.</p>
</li>
<li>
<p>We explicitly describe the reality of the physics, biology and the
    relationship to the different colour models, reflecting the practice of
    describing colour in much of science.</p>
</li>
</ol>
<p>Currently, considering the PATO ontology, which is documented as being
built according to realist principles, the first approach has been taken,
using the HSV scheme. So, PATO has a term <b class="bfseries">Color Hue</b>
(PATO:15) that is defined as:</p>
<blockquote class="quote"><p>
  <i class="itshape">&ldquo;A chromatic scalar-circular quality inhering in an
  object that manifests in an observer by virtue of the dominant wavelength of
  the visible light; may be subject to fiat divisions, typically into 7 or 8
  spectra.&rdquo;</i>
</p></blockquote>
<p>Using this model, PATO describes <b class="bfseries">red</b> (PATO:322)
as:</p>
<blockquote class="quote"><p>
  <i class="itshape">&ldquo;A color hue with high wavelength of the long-wave
  end of the visible spectrum, evoked in the human observer by radiant energy
  with wavelengths of approximately 630 to 750 nanometers.&rdquo;</i>
</p></blockquote>
<p>This modelling approach has a number of limitations.</p>
<ul class="itemize">
<li>
<p>The decision to choose one colour model or the other is arbitrary.
    While there are reasonable justifications for the use of HSV as opposed
    to, for example, RGB, there is no <i class="itshape">a priori</i>
    justification for use of an additive colour model as opposed to a
    subtractive model. Both are valid, for different usage; in general,
    reflective colour is more common in biology (e.g. pigmentation) than
    emitted colour (e.g. fluorescence) which would suggest that subtractive
    models are more generally applicable, but a full treatment requires
    both.</p>
</li>
<li>
<p>There are no terms which can be used to express data described
    according to other colour models, necessitating a transformation between
    the different models into the officially &ldquo;blessed&rdquo; version
    during application of the ontology. These transformations may be lossy and
    not fully reversible.</p>
</li>
</ul>
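<p>The loss of information in such a transformation can be sketched as follows. This is a minimal example under stated assumptions: the hue boundaries in <tt class="ttfamily">HUE_TERMS</tt> are invented for illustration and are not PATO definitions, and <tt class="ttfamily">hue_term</tt> is a hypothetical function name.</p>

```python
import colorsys

# Hypothetical fiat divisions of the hue circle, as upper bounds in degrees.
# These boundaries are assumptions for illustration, not PATO definitions.
HUE_TERMS = [
    (15.0, "red"),
    (45.0, "orange"),
    (75.0, "yellow"),
    (165.0, "green"),
    (255.0, "blue"),
    (345.0, "purple"),
    (360.0, "red"),
]


def hue_term(r, g, b):
    """Map an RGB triple (components in 0..1) to a coarse hue term."""
    h, _s, _v = colorsys.rgb_to_hsv(r, g, b)
    degrees = h * 360.0
    for upper, name in HUE_TERMS:
        if upper >= degrees:
            return name
    return "red"


bright_red = (1.0, 0.0, 0.0)
dark_red = (0.5, 0.0, 0.0)
pink = (1.0, 0.6, 0.6)
```

<p>All three RGB values map to the single term &ldquo;red&rdquo;, so the original value cannot be recovered from the term alone.</p>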
<p>The second approach is also possible. This would allow expression of data
in multiple colour models, however:</p>
<ul class="itemize">
<li>
<p>The ontology would tend to get rather confusing as more colour models
    are added; colour would have children &ldquo;Hue&rdquo;, &ldquo;Red&rdquo;
    and &ldquo;Cyan&rdquo; and seven other sibling terms.</p>
</li>
<li>
<p>It is not clear which terms comprise a colour model: do values for
    &ldquo;Hue&rdquo;, &ldquo;Green&rdquo; and &ldquo;Magenta&rdquo; specify a
    colour?</p>
</li>
<li>
<p>It is not clear whether terms that occur in the other contexts are
    equivalent. Is &ldquo;Red as in RGB&rdquo; the same as or different
    from <b class="bfseries">Red</b> (PATO:322)? Is &ldquo;Hue as in HSV&rdquo;
    the same as or different from &ldquo;Hue as in HSL&rdquo; (HSL is another
    additive colour model)?</p>
</li>
</ul>
<p>The third approach does not suffer from the limitations described. We
suggest from this analysis that it is necessary, if unfortunate, for some
qualities to be explicitly described with multiple representations. To avoid
confusion, the universal quality, colour, would need to be explicitly
described as having multiple valid models. Yet, realism argues that we should
not do this, as colour is real and not a model; moreover, the focus on
realism means that the documentation does not describe the choices that have
been made, nor refer to the relationship between <b class="bfseries">Color
Hue</b> (PATO:15) and &ldquo;Hue as in HSV&rdquo;. In short, realism has
limited our ability to represent colour.</p>
<h3 id="a0000000008">Related Examples</h3>
<p>There are many different examples of this issue; having two or more models
  to describe the same part of reality is common. The distance between two
  markers on a chromosome can be measured using (one of a number of) genetic
  techniques. Some qualities have a bewildering array of different
  measurements associated with them; Wikipedia, for example, lists 13
  different measurements of concentration such as molarity or g&nbsp;m<sup>-3</sup>.</p>
<p>This issue has been previously recognised. In computing science, explicitly
modelling one model in another is a form of <em>metamodelling</em>. Other,
non-realist, upper-ontologies such as DOLCE use the concept
of <tt class="ttfamily">Quale</tt> to describe a cognitive abstraction (such
as Colour), including those over a physical quality (such as the spectral
properties of reflected
light)&nbsp;<span class="cite">[<a href="#Seyed2009">21</a>]</span>.</p>
<h2 id="a0000000009">Sequences and the Central Dogma</h2>
<p>The central dogma of molecular biology suggests that all genetic
information is encoded in the DNA of a cell, as the ordered nucleotides that
comprise the DNA. RNA is transcribed from this DNA. The RNA molecule also has
a defined order of nucleotides related to the DNA. Finally the RNA is
translated into protein.</p>
<p>Consider an ontology describing these entities. First, the DNA molecule has
a number of properties; as well as physical dimensions (discussed further in
&ldquo;sec:limits-consistency&rdquo;), including a length expressed in metres,
it consists of a number of monomeric units. So, for example, we might say a
DNA molecule with a series of nucleotide residues represented
as <tt class="ttfamily">&lsquo;GATC&rsquo;</tt> <tt class="ttfamily">has&shy;Monomeric&shy;Part</tt> <tt class="ttfamily">4</tt>.</p>
<p>This causes a slight worry from a realist perspective; the number 4 may not
  be a realist universal. There are no instances of 4. In this case, the
  number 4 is being used to describe a part of reality, so this is allowable
  in a realist ontology. Alternatively, we could describe the same reality
  using units (traditionally base-pairs or bp). Therefore,
  the <tt class="ttfamily">DNA
  molecule</tt> <tt class="ttfamily">has&shy;Polymer&shy;Length</tt> 4bp.</p>
<p>Accepting the use of natural numbers in this way, also means that we accept
  the use of sets and sequences to describe reality. One definition of 4 is a
  sequence. Stating that the DNA molecule represented with the
  sequence <tt class="ttfamily">&lsquo;GATC&rsquo;</tt> <tt class="ttfamily">has&shy;Polymer&shy;Length</tt> <tt class="ttfamily">4bp</tt>
  is equivalent, therefore, to stating that
  it <tt class="ttfamily">hasSequence</tt> <tt class="ttfamily">&lsquo;NNNN&rsquo;</tt>
  where <tt class="ttfamily">&lsquo;N&rsquo;</tt> is any nucleotide
  residue.</p>
<p>It should be noted, however, that the usefulness of these statements stems
from our <em>implicit</em> knowledge. The number 4 is a natural number,
so <tt class="ttfamily">has&shy;Monomeric&shy;Part</tt> <tt class="ttfamily">4.2</tt>
is not possible. If a new monomer is attached to our DNA molecule, it will
now <tt class="ttfamily">has&shy;Monomeric&shy;Part</tt> <tt class="ttfamily">5</tt>,
because the natural numbers are additive. We understand the operation of
natural numbers as part of our shared, background knowledge, and we can apply
this knowledge here.</p>
<p>Having described that the DNA molecule represented
as <tt class="ttfamily">&lsquo;GATC&rsquo;</tt> <tt class="ttfamily">has&shy;Polymer&shy;Length</tt> <tt class="ttfamily">4</tt>
(or <tt class="ttfamily">hasSequence</tt> <tt class="ttfamily">&lsquo;NNNN&rsquo;</tt>)
we might wish to be more specific about the order of nucleotide residues and
state <tt class="ttfamily">hasSequence</tt> <tt class="ttfamily">&lsquo;GATC&rsquo;</tt>.
The implicit background knowledge we used previously about the natural numbers
still applies here.</p>
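<p>The implicit knowledge used in these statements can be made explicit in a few lines. The following is a sketch only; the function names are hypothetical and simply mirror the properties <tt class="ttfamily">hasPolymerLength</tt> and <tt class="ttfamily">hasSequence</tt> used above.</p>

```python
DNA_ALPHABET = frozenset("ACGT")


def check_dna(sequence):
    """Raise ValueError if the string uses letters outside the DNA alphabet."""
    unknown = set(sequence) - DNA_ALPHABET
    if unknown:
        raise ValueError(f"not a DNA sequence: {sorted(unknown)}")
    return sequence


def polymer_length(sequence):
    """Number of monomeric parts; by construction a natural number."""
    return len(check_dna(sequence))


def length_pattern(sequence):
    """The length restated as a sequence of unspecified residues."""
    return "N" * polymer_length(sequence)


molecule = "GATC"
extended = molecule + "A"  # attaching one more monomer
```

<p>Here the additivity of the natural numbers is supplied by the programming language rather than stated in the model, which is the same reliance on background knowledge described above.</p>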
<p>Next consider the process of transcription. The previous discussion about
DNA likewise applies to RNA. The RNA molecule will,
however, <tt class="ttfamily">hasSequence</tt> <tt class="ttfamily">&lsquo;GAUC&rsquo;</tt>,
as RNA uses a different set of bases to DNA. Mathematically, one sequence can
be determined from the other by applying a mapping; though the mapping is a
human activity, not a representation of biochemical reality. To describe this,
we have two options:</p>
<ul class="itemize">
<li>
<p>Taking the realist approach, we can continue to rely on
    the <em>implicit</em> knowledge of the biologist, as we have previously
    relied on an implicit understanding of the natural numbers.</p>
</li>
<li>
<p>We can be explicit about the properties of these sequences (additional
    to those properties shared with the naturals). We can talk about non-real
    world concepts such as alphabets, transformations and how these map to the
    real entities involved.</p>
</li>
</ul>
<p>It should be noted that the former severely limits the ability to describe
the central dogma. The transformation of DNA to RNA sequence is simple, but
the transformation of RNA to protein is more complex. Again, the choice is
between representing reality or representing how we practise science.</p>
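<p>A minimal sketch of the second option, in which alphabets and transformations are written down explicitly, might look like this. It assumes the coding-strand convention for transcription, and <tt class="ttfamily">CODON_TABLE</tt> is a deliberately partial, hypothetical subset of the standard genetic code.</p>

```python
# Transcription as an explicit mapping between two alphabets
# (coding-strand convention assumed: T in DNA becomes U in RNA).
DNA_TO_RNA = str.maketrans("T", "U")

# A partial codon table, for illustration only.
CODON_TABLE = {
    "AUG": "M",   # methionine, also the usual start codon
    "GAU": "D",   # aspartate
    "GAC": "D",   # aspartate: two codons, one residue
    "UGA": None,  # stop codon
}


def transcribe(dna):
    """Map a DNA sequence to the corresponding RNA sequence."""
    return dna.translate(DNA_TO_RNA)


def translate(rna):
    """Read codons from the first position until a stop codon or the end."""
    protein = []
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = rna[i:i + 3]
        residue = CODON_TABLE[codon]
        if residue is None:
            break
        protein.append(residue)
    return "".join(protein)
```

<p>Transcription is a one-to-one substitution of letters, whereas translation needs a reading frame, a many-to-one table and stop signals; this is the difference in complexity noted above.</p>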
<h3 id="a0000000010">Related examples</h3>
<p>The issues relating to sequences are fairly general. In computer science
terms, these are abstract data types. The DNA sequence is a kind of sequence
with special properties (a limited alphabet). Many of the physical quantities
in science have special properties in this way. Consider:</p>
<dl class="description">
<dt>Temperature:</dt>
<dd>
<p>While these look like positive real numbers, temperatures can only
    meaningfully be subtracted from each other, which gives information about
    heat-flow between two bodies. Other operations (addition, multiplication)
    which are useful for real numbers have little meaning for temperature.</p>
</dd>
<dt>Recombination Distance:</dt>
<dd>
<p>These look like probabilities but are not, requiring a transformation
    to add.</p>
</dd>
</dl>
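<p>Both examples can be sketched as abstract data types. The code below is an illustration under stated assumptions: temperatures are held in kelvin, and Haldane&rsquo;s mapping function is used as one possible transformation for recombination fractions (this paper does not name a particular one). All class and function names are hypothetical.</p>

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Temperature:
    """A temperature in kelvin that supports subtraction and nothing else."""
    kelvin: float

    def __sub__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented
        # The result is a plain difference in kelvin, not a Temperature.
        return self.kelvin - other.kelvin


def haldane_distance(r):
    """Map distance in Morgans from a recombination fraction r below 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * r)


def haldane_fraction(d):
    """Recombination fraction from a map distance d in Morgans."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def combine_fractions(r_ab, r_bc):
    """Fraction between A and C from adjacent intervals A-B and B-C."""
    return haldane_fraction(haldane_distance(r_ab) + haldane_distance(r_bc))
```

<p>Because <tt class="ttfamily">Temperature</tt> defines no addition, adding two temperatures raises a <tt class="ttfamily">TypeError</tt>. Under this mapping function, recombination fractions of 0.1 and 0.2 combine to 0.26 rather than 0.3.</p>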
<p>There is a limitation on the ability to use abstract data types within a
given ontology language; in most cases, the expressivity of the language will
not allow arbitrary mathematical relations. Some languages, such as OWL,
provide &ldquo;concrete domains&rdquo;; these provide extension
points within the ontology language where, for example, the special properties
of temperature could be represented; other languages do not. In either case,
there are limitations to these capabilities; for example, the constraints and
behaviour of a concrete domain need to be interpreted with their own semantics
within a reasoner, rather than expressed explicitly within the ontology. It
may make more sense in many circumstances to describe the existence of a
mathematical model as discussed in &ldquo;<a href="#a0000000013">To go where science has gone before</a>&rdquo;.</p>
<h2 id="a0000000011">The limitations of computers</h2>
<p>Modelling continuous properties is a common problem in ontological
engineering. For example, according to statistics the western world is now
facing an obesity epidemic; in short many or most of us weigh too much.
Understanding, however, exactly what &ldquo;too much&rdquo; means is not
necessarily simple; a common technique is to use body mass index
(BMI)&mdash;body weight divided by the square of height, which is a continuous
value. The BMI range is split into four categories: Obese (&gt;30), Overweight
(&gt;25), Normal (&gt;18.5) and Underweight (&lt;18.5). These categories
represent ranges of the value of BMI.</p>
<p>This data simplification has many justifications. On an individual basis,
the BMI is not a particularly accurate measure, so the simplification does not
lose much accuracy. It is also easier to describe to patients, for whom a
&ldquo;BMI of 25&rdquo; will be less comprehensible than being
&ldquo;overweight&rdquo;.</p>
<p>Modelling some of this is straightforward. Height and weight are modelled
as properties of the individual. The BMI would therefore appear to be a
property of the individual as it is a restatement of two existing properties.
It would appear, therefore, that the category into which an individual falls
should also be a property of the individual.</p>
<p>Consider the values of the property next. These categories are an
abstraction over the real-world properties. Although height as an integer
value is expressed using a non-real-world entity, it is a description of a
part of reality. A range, however, in the BMI does not describe part of
reality in the same sense. There are no instances of BMI &ldquo;Obese&rdquo;.
In a realist ontology, therefore, it is unclear what the relationship is
between BMI Obese and the individual person.</p>
<p>For the statistician or computer scientist, there is an additional
advantage to the simplification; four discrete groups have better
computational properties than a continuous measure. Database queries become
easier to write, and quicker to run. This is also true for the ontology
builder; simplifying the real world may fulfil the needs of an application for
which the ontology is built, while avoiding unnecessary complexity. There is a
widely used method for representing partitions of continuous values, the
appropriately named <em>value
partition</em>&nbsp;<span class="cite">[<a href="#rector2005">22</a>]</span>.</p>
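<p>As a minimal sketch of the computational convenience involved, the BMI
partition described above can be written as a lookup over three cut-offs. The
function names, the units (kilograms and metres) and the treatment of values
that fall exactly on a cut-off (assigned here to the higher category) are
assumptions; the ranges as stated above leave the boundary cases open.</p>

```python
from bisect import bisect_right

# Cut-offs and category names as given in the text; boundary handling is assumed.
CUTOFFS = [18.5, 25.0, 30.0]
CATEGORIES = ["Underweight", "Normal", "Overweight", "Obese"]


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight divided by the square of height."""
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a continuous BMI value onto one of the four discrete categories."""
    return CATEGORIES[bisect_right(CUTOFFS, value)]
```

<p>Once the partition is applied, a query such as &ldquo;all overweight
patients&rdquo; becomes an equality test on a label rather than a range
comparison on a continuous value.</p>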
<p>In the case of BMI there is a pre-existing social agreement toward a set of
categories; however, even in the absence of such an agreement, the ontology
builder might wish to represent a continuous range as a value partition to
decrease the complexity of their ontology. The value partition is useful, but
many of the concepts involved are not realist universals. The choice, then, is
between modelling &ldquo;reality&rdquo; and modelling a simplification that is easier
to use and has better computational properties.</p>
<h3 id="a0000000012">Related examples</h3>
<p>Splitting the two cases, there are many examples of pre-existing
simplifications. From medicine, there are so many that it seems to be the norm
rather than the exception: hypo- vs hyperthermic; hypo- vs hypertensive; hypo-
vs hyperglycemic. In many cases, these ranges have standard interpretations
akin to the BMI.</p>
<p>There are likewise a number of constructions or design patterns that reduce
complexity, extend the effective capabilities of the language or simply
provide standard solutions to common
problems&nbsp;<span class="cite">[<a href="#egana2008">23</a>]</span>.</p>
<h2 id="a0000000013">To go where science has gone before</h2>
<p>Many experiments in biomedicine require the measurement of some physical
property of a biological system. Take, for example, the measurement of heart
rate; in standard practice, this is measured in beats per minute, and is
calculated simply by counting beats (\(b\)) over a time period (\(t\))
and dividing one by the other (\(b/t\)). However, what time period is
appropriate? We might choose 60s, but this raises the question, what is the
meaning of heart rate over shorter periods?</p>
<p>Fortunately, there is a standard solution to this problem, which is to
  define heart rate using differential calculus; so heart rate becomes \(db/dt\).</p>
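<p>The difference between the two definitions can be illustrated with a short
sketch. The function names, the representation of beats as a sorted list of
timestamps in seconds, and the use of a single inter-beat interval as an
estimate of \(db/dt\) are all assumptions made for illustration.</p>

```python
from bisect import bisect_left


def mean_rate_bpm(beat_times_s, start_s, end_s):
    """Beats counted in the window [start_s, end_s), scaled to beats per minute.

    Assumes beat_times_s is sorted in ascending order.
    """
    beats = bisect_left(beat_times_s, end_s) - bisect_left(beat_times_s, start_s)
    return 60.0 * beats / (end_s - start_s)


def instantaneous_rate_bpm(beat_times_s, index):
    """Estimate of db/dt at a beat: one beat divided by the preceding interval."""
    if index not in range(1, len(beat_times_s)):
        raise IndexError("need a preceding beat to measure an interval")
    interval_s = beat_times_s[index] - beat_times_s[index - 1]
    return 60.0 / interval_s
```

<p>For beats at 0, 1, 2, 3, 3.5 and 4 seconds, the mean rate over the first
four seconds is 75 beats per minute, while the estimate at the fifth beat is
120; the answer depends on the period chosen, which is the problem that the
derivative is intended to remove.</p>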
<p>The derivative, \(db/dt\), presents some problems from a realist
perspective. As noted previously (see &ldquo;sec:sequ-centr-dogma&rdquo;), it
is possible to associate real numbers with entities; however, \(db/dt\) is
\(0/0\). It is not clear whether this quantity is a universal; it is
certainly the case that the expression \(db/dt\) is not a universal, yet
such values, and calculus itself, are powerful tools within science and not
using them within ontological models is a severe restriction.</p>
<p>We can describe this ontologically in three ways:</p>
<ul class="itemize">
<li>
<p>We can model the real world entities involved &ndash; beats, time and
    describe nothing else.</p>
</li>
<li>
<p>We can describe rate in mathematical terms. In this case, we are
    defining the heart rate as a mathematical abstraction.</p>
</li>
<li>
<p>We can model the heart rate as a real world entity, \(db/dt\) as a
    mathematical entity and explicitly state that \(db/dt\) is a model of
    heart rate.</p>
</li>
</ul>
<p>These different solutions present different advantages. The first is
  consistent with realism. The second is consistent with the most common
  definition used within science. The third is consistent with both but it is
  unclear when to use which term (for example, is \(\Delta {}b/\Delta{} t\) 
  an approximation of \(db/dt\), a quantification of the real world
  quality or both?).</p>
<p>In most cases for the description of science, the second option makes most
sense; conflating the mathematical model with the real entity enables us to
use the advantages of two different modelling techniques without introducing
the confusion of the third option.</p>
<h3 id="a0000000014">Related examples</h3>
<p>There are many related examples from mechanics, electromagnetics or
chemistry; as with value partitions in medicine, so many that they appear to
be the norm. All of these subject areas have direct relevance to biology and,
perhaps even more so, to the equipment used in the practice of biology.</p>
<p>Mechanical examples would include velocity (\(dr/dt\)) and acceleration
(\(d^2r/dt^2\)). Electromagnetics would include current (\(dQ/dt\))
and capacitance (\(dQ/dV\)). Chemistry examples would include rate
constants and pH. In biology, population biology, systems biology and
neurosciences make wide use of mathematical models. The lack of a link in
realist ontologies to these mathematical models is not free from consequences
(described further in the <a href="#a0000000017">Discussion</a>).</p>
<p>The more general issue comes not from relating to differential calculus,
but from relating to pre-existing non-ontological techniques; for example, taxonomy
in the Linnaean sense. There have been many discussions about whether species
and higher taxa are reflective of reality; it is certainly the case that a
number of higher taxa do not reflect
phylogeny&nbsp;<span class="cite">[<a href="#Schulz2008">24</a>]</span>.
Given that it is of uncertain status, should we represent taxonomy as a
quality of an organism, an independent conceptualisation of the biologists or
both?</p>
<h2 id="a0000000015">The limits of consistency</h2>
<p>Physical biological entities such as cells and organisms have an extent in
the real world. This paper&rsquo;s first author, for example, has a height of
around 1.8m; a similar value cannot be applied meaningfully to the electronic
version of this document, although it may apply to the paper that it may be
printed on.</p>
<p>There are a number of different, well-understood mechanisms for
representing physical space. We can use a dimensional or Cartesian model, with
three perpendicular lines with a linear scale. We can use a polar model,
expressing extent using angles and a single distance. Modern physics has told
us, however, that all of these are limited models of reality; physics
generally uses a four-dimensional Minkowskian spacetime model; here the axes
are not linear; motion of the observer down one will change values down the
others. Alternatively, at a quantum level, length is a probability
distribution.</p>
<p>For the ontology builder, this leaves a difficult choice and the same
choice discussed previously in &ldquo;sec:colo-colo-models&rdquo;: Represent
the reality physicists relate; bless one, ignore the rest; describe their
components but not their models; explicitly describe them.</p>
<p>If the ontology builder is to be consistent, then, they should make the
same choice in both cases; if we describe colour models, we should explicitly
describe Minkowskian spacetime, quantum probability distributions, Cartesian
and polar systems.</p>
<p>There are, however, two important differences to colour models. First,
there is a strong social bias toward Cartesian systems. Secondly, within the
scope of biology and the life sciences, four-dimensional spacetime or quantum
models confuse rather than simplify; the relativistic corrections produce such
small differences that they are statistically meaningless; similarly,
describing a leg as a probability distribution adds little other than
complexity.</p>
<p>This leaves the ontology builder with two options:</p>
<ol class="enumerate">
<li>
<p>We can build an ontology with a consistent relationship to reality. So,
    having decided to explicitly represent colour models, this suggests that
    we should also explicitly model 3D space, 4D spacetime and the various
    co-ordinate systems that are used to describe these.</p>
</li>
<li>
<p>We build an ontology with an inconsistent relationship to reality. So,
    we might be explicit about colour models, but arbitrarily bless
    three-dimensional space, using Cartesian co-ordinates.</p>
</li>
</ol>
<p>The compromise here is very straightforward. The first solution retains
its consistency with reality, while the second is consistent with usability and usage;
for biomedicine, a 3D Cartesian co-ordinate system plus time is likely to be
enough for the foreseeable future and makes life easier in the meantime.</p>
<p>The Newtonian view of the world is the best model in this case: it is good
enough. When building an ontology for biomedicine, it makes most sense to use
this view as it will produce the results required. If, in the future,
biomedicine advances so that relativistic or quantum representations are
necessary, then current ontologies will need refactoring; even then, this
future cost is likely to be offset by gains in the present.</p>
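<p>The size of the relativistic correction at biological speeds can be
estimated with a short sketch. The function name and the example speed of
10 m/s (roughly a sprinting human) are illustrative assumptions; the formula is
the standard special-relativistic length contraction,
\(1 - \sqrt{1 - v^2/c^2}\).</p>

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact by SI definition)


def fractional_length_contraction(speed_m_per_s: float) -> float:
    """Return 1 - sqrt(1 - v^2/c^2): the fraction by which a moving length shrinks."""
    beta_squared = (speed_m_per_s / C_M_PER_S) ** 2
    # log1p and expm1 keep precision when beta_squared is near or below
    # the resolution of double-precision floating point.
    return -math.expm1(0.5 * math.log1p(-beta_squared))
```

<p>At 10 m/s the result is of the order of \(10^{-16}\), far below the
resolution of any length measurement made in biomedicine, which is the sense
in which the Newtonian view is good enough.</p>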
<h3 id="a0000000016">Related examples</h3>
<p>In the choice of units for measurement for scientific purposes, SI units
are to be preferred. It should be noted, here, that there is a domain
dependency; for an engineering ontology, the use of American imperial units
would be inevitable.</p>
<p>For most of biology it is unnecessary to distinguish between the length of
the calendar year and the astronomical year&mdash;the latter changing with
respect to variability in the motion of the earth. There are occasions when
this distinction may be important for data integration in bioinformatics as
leap years and leap seconds show.</p>
<p>An ecologist counting the number of trees in a sampling square 100m by
100m will take the area as 10,000m<sup>2</sup>; the surface is, however,
neither smooth nor a Euclidean plane, so this area is wrong in reality. For
much of ecology, this distinction will not matter. Again, there is a domain
dependency here; whale or bird biologists interested in migration patterns may
well care about the curvature of the earth.</p>
<h1 id="a0000000017">Discussion</h1>
<p>Realism has been held up as a methodology for &ldquo;good&rdquo;
ontological modelling, and the production of more tightly defined and
consistent ontologies. In this paper, we have discussed five different cases,
with biological examples, that we might wish to model ontologically; for each,
we have presented different models, describing the same underlying science. In
each case, a realist solution is possible, but places either limitations or
awkwardness on the models produced.</p>
<p>Building an ontology with a consistent relationship to reality may help to
enable
interoperability&nbsp;<span class="cite">[<a href="#Smith2007">7</a>]</span>
under some circumstances. If, however, it disallows modifications for
computability (see &ldquo;sec:work-around-comp&rdquo;), or requires arbitrary
blessing for one form of specification over another (see
&ldquo;sec:colo-colo-models&rdquo;) it may have the opposite effect.</p>
<p>Nor are the issues discussed in this paper free from consequences. In
&ldquo;<a href="#a0000000013">To go where science has gone before</a>&rdquo;, we discussed interoperability with
existing scientific models. Mathematics and physics have produced complex,
refined and expressive notation systems, representing a deep understanding of
how numbers and the physical world work. These are, however, not being used in
current ontologies and this results in a lack of precision, errors and
omissions:</p>
<dl class="description">
<dt>Lack of Precision:</dt>
<dd>
<p>The PATO term <b class="bfseries">speed</b> (PATO:8) is defined
    as:</p>
<blockquote class="quote"><p>
      <i class="itshape">&ldquo;A physical quality inhering in a bearer by
      virtue of the bearer&rsquo;s rate of change of position&rdquo;</i>
    </p></blockquote>
<p>with a synonym of <tt class="ttfamily">velocity</tt>; from this
    definition, we cannot distinguish the vector and scalar quantities of
    velocity and speed; indeed, it is not clear which of these
    two <b class="bfseries">speed</b> (PATO:8) is.
    Meanwhile <b class="bfseries">acceleration</b> (PATO:1028) is defined
    as:</p>
<blockquote class="quote"><p>
      <i class="itshape">&ldquo;&hellip; the rate of change of the
      bearer&rsquo;s velocity in either speed or direction&rdquo;</i>
    </p></blockquote>
<p>which is implicitly a vector quantity, and contradicts the statement
    that speed and velocity are synonyms. The mathematical definitions
    (velocity as \(dr/dt\), speed \(\left|{dr/dt}\right|\),
    acceleration \(d^2r/dt^2\)) are precise, concise and accurate.</p>
</dd>
<dt>Errors:</dt>
<dd>
<p>Similarly, <b class="bfseries">length</b> (PATO:122) is defined as a
    quality; qualities have to inhere in <tt class="ttfamily">Independent
    Continuant</tt>s; as a <tt class="ttfamily">Spatial Region</tt> is a child
    of <tt class="ttfamily">Continuant</tt> this means
    that <tt class="ttfamily">Spatial Region</tt>s cannot
    bear <tt class="ttfamily">length</tt>s. In short, in current versions of
    BFO, there is no intuitive way of modelling the length of a region in
    space.</p>
</dd>
<dt>Omissions:</dt>
<dd>
<p>BFO is mass-centric; it is currently unclear where many physical
    entities exist, examples including energy, waves (through a medium) or EM
    radiation. Likewise, it lacks a natural position for numbers (that have no
    particulars), patterns and distributions. Yet, these entities are key to a
    physical description of the world.</p>
</dd>
</dl>
<p>To our mind, these are indicative of some of the most serious flaws of
realism-based ontology building. It makes little sense to replicate the models
of physics using English instead of a more precise mathematical notation. If
BFO had been built using direct links to a grounded physical model of the
world, it seems likely that these problems would not have arisen.</p>
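<p>The distinction between the vector and scalar quantities discussed above is
easy to state once the mathematical definitions are used directly. The sketch
below is illustrative only: the function names and the finite-difference
estimate of \(dr/dt\) from two positions are our assumptions.</p>

```python
import math


def velocity(r0, r1, dt):
    """Finite-difference estimate of dr/dt: a vector, one component per axis."""
    return tuple((b - a) / dt for a, b in zip(r0, r1))


def speed(v):
    """|dr/dt|: the magnitude of a velocity vector, which is a scalar."""
    return math.sqrt(sum(component * component for component in v))
```

<p>Two bodies moving in opposite directions have different velocities but the
same speed; a definition that treats the two terms as synonyms cannot express
this.</p>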
<p>We have discussed a number of concrete examples where building an ontology
by considering realist concerns has detrimental consequences for the model. We
believe that the real world entities and the relationships between them are
only one consideration among many: simplicity, usability and fitness for purpose
are equally important.</p>
<p>Taken to its most extreme form, realism, it seems to these authors, would
produce models unsuitable for use within science. There is a choice between a
correct account of reality that does not allow the data of science to be
adequately described and a description of reality that takes into account how
science is performed. Fortunately, most &ldquo;realist&rdquo; ontologies are
not really so: PATO&rsquo;s representation of HSV for modelling colour is not a bad
decision; it represents a straightforward, pragmatic approach to ontology
building, where the representation has been chosen on the basis of a use case,
not the entities as they exist in reality. Similarly, BFO uses a 3D plus time
model of reality; it suggests that lengths are properties of the entity alone,
without reference to the observer. This is not a true reflection of reality,
but one which is a good enough approximation for use within the biomedical
sciences; in short, usability and simplicity have been considered to be more
important in the modelling process than the relationship of the model to
reality. In accepting these compromises, BFO has placed itself squarely as a
computational rather than philosophical ontology.</p>
<p>Despite these concerns, realism has made a contribution to the field of
biomedical ontology engineering. By emphasising the importance of real-world
entities and by encouraging a more specific interpretation than the
generalisation of a &ldquo;conceptualisation&rdquo;, realism helps to avoid
the introduction of unnecessary layers of abstraction. A consideration of the
entities in reality may be a part of an ontology engineering process; ontology
builders should have careful and considered reasons for diverging from
modelling in this way, and ontologies should explicitly describe through
annotations the terms that do or may diverge from this view. Ontology builders
should, however, be free to make this decision; the acceptance of compromise
with respect to reality will result in simpler and more effective knowledge
artefacts.</p>
<p>Johansson&nbsp;<span class="cite">[<a href="#Johansson2006">10</a>]</span>
when discussing realism asks the rhetorical question: &ldquo;would you like to
be treated for a physiological illness by a <em>(non-realist)</em> physician
who is not sure that there are human bodies?&rdquo; &ndash; (our emphasis). As
scientists, our reply would be that, if their survival and success statistics were
the best, we would not care whether they were a realist, a non-realist or a
robot which admitted of no philosophical position at all; also, using a doctor
who was strictly realist and thus cut off from much of the practice of science
(such as determining heart rate) would disturb many patients. As
bioinformaticians, we build ontologies to provide a descriptive and predictive
model of the wealth of experimental data that is now available. In biology,
the job of an ontologist is to describe data such that it can be analysed.
Naturally this entails a description of entities in reality; it also, however,
entails a description of science, and it entails compromise; we overlook this
at our peril. The last 200 years of science show the success and strength of
this position; it is on this groundwork that we should build for the
future.</p>
<div>
<h1>Bibliography</h1>
<dl class="bibliography">
<dt>[<a name="Ashburner2000" id="Ashburner2000">1</a>]</dt>
<dd>
<p>Ashburner M, Ball C, Blake J, Botstein D, Butler H, et&nbsp;al.
      (2000) Gene Ontology: a tool for the unification of biology. The Gene
      Ontology Consortium. Nat Genet 25: 25&ndash;9.</p>
</dd>
<dt>[<a name="handbook2" id="handbook2">2</a>]</dt>
<dd>
<p>Stevens R, Lord P (2008) Application of ontologies in bioinformatics.
      In: Staab S, Studer R, editors, Handbook on Ontologies in Information
      Systems, Springer. Second edition.
      URL <a href="http://www.cs.man.ac.uk/~stevensr/papers/handbook2.pdf">http://www.cs.man.ac.uk/~stevensr/papers/handbook2.pdf</a>.</p>
</dd>
<dt>[<a name="Zeeberg2003" id="Zeeberg2003">3</a>]</dt>
<dd>
<p>Zeeberg B, Feng W, Wang G, Wang M, Fojo A, et&nbsp;al. (2003)
      GoMiner: a resource for biological interpretation of genomic and
      proteomic data. Genome Biol 4: R28.</p>
</dd>
<dt>[<a name="Wolstencroft2006" id="Wolstencroft2006">4</a>]</dt>
<dd>
<p>Wolstencroft K, Lord P, Tabernero L, Brass A, Stevens R (2006)
      Protein classification using ontology classification. Bioinformatics 22:
      e530&ndash;538.</p>
</dd>
<dt>[<a name="Lord2003" id="Lord2003">5</a>]</dt>
<dd>
<p>Lord PW, Stevens RD, Brass A, Goble CA (2003) Investigating semantic
      similarity measures across the gene ontology: the relationship between
      sequence and annotation. Bioinformatics 19: 1275&ndash;1283.</p>
</dd>
<dt>[<a name="Whetzel2006a" id="Whetzel2006a">6</a>]</dt>
<dd>
<p>Whetzel PL, Parkinson H, Causton HC, Fan L, Fostel J, et&nbsp;al.
      (2006) The MGED Ontology: a resource for semantics-based description of
      microarray experiments. Bioinformatics 22: 866&ndash;873.</p>
</dd>
<dt>[<a name="Smith2007" id="Smith2007">7</a>]</dt>
<dd>
<p>Smith B, Ashburner M, Rosse C, Bard J, Bug W, et&nbsp;al. (2007) The
      OBO Foundry: coordinated evolution of ontologies to support biomedical
      data integration. Nat Biotechnol 25: 1251&ndash;1255.</p>
</dd>
<dt>[<a name="OBOFoundry2006" id="OBOFoundry2006">8</a>]</dt>
<dd>
<p>OBO Foundry Consortium (2006). OBO Foundry
      Principles. <a href="http://obofoundry.org/wiki/index.php/OBO_Foundry_Principles">http://obofoundry.org/wiki/index.php/OBO_Foundry_Principles</a>.</p>
</dd>
<dt>[<a name="OBOFoundry2008" id="OBOFoundry2008">9</a>]</dt>
<dd>
<p>OBO Foundry Consortium (2008). OBO Foundry
      Principles. <a href="http://obofoundry.org/wiki/index.php/OBO_Foundry_Principles">http://obofoundry.org/wiki/index.php/OBO_Foundry_Principles</a>.</p>
</dd>
<dt>[<a name="Johansson2006" id="Johansson2006">10</a>]</dt>
<dd>
<p>Johansson I (2006) Bioinformatics and biological reality. J Biomed
      Inform 39: 274&ndash;287.</p>
</dd>
<dt>[<a name="Grenon2004" id="Grenon2004">11</a>]</dt>
<dd>
<p>Grenon P, Smith B, Goldberg L (2004) Biodynamic ontology: applying
      BFO in the biomedical domain. Stud Health Technol Inform 102:
      20&ndash;38.</p>
</dd>
<dt>[<a name="smith2004beyond" id="smith2004beyond">12</a>]</dt>
<dd>
<p>Smith B (2004) Beyond concepts: ontology as reality representation.
      In: Formal ontology in information systems: proceedings of the third
      conference (FOIS-2004). Ios Pr Inc, p.&nbsp;73.</p>
</dd>
<dt>[<a name="Lord2009" id="Lord2009">13</a>]</dt>
<dd>
<p>Lord P (2009) An Evolutionary Approach to Function. In:
      Bio-Ontologies 2009: Knowledge in Biology.
      URL <a href="http://hdl.handle.net/10101/npre.2009.3228.1">http://hdl.handle.net/10101/npre.2009.3228.1</a>.</p>
</dd>
<dt>[<a name="Smith2005" id="Smith2005">14</a>]</dt>
<dd>
<p>Smith B, Ceusters W, Klagges B, K&ouml;hler J, Kumar A, et&nbsp;al.
      (2005) Relations in biomedical ontologies. Genome Biol 6: R46.</p>
</dd>
<dt>[<a name="Russell1946" id="Russell1946">15</a>]</dt>
<dd>
<p>Russell B (1946) A History of Western Philosophy. Routledge.</p>
</dd>
<dt>[<a name="Merrill2010" id="Merrill2010">16</a>]</dt>
<dd>
<p>Merrill G (2010) Ontological realism: methodology or misdirection.
      Applied Ontology 5: 79&ndash;108.</p>
</dd>
<dt>[<a name="Dumontier2010" id="Dumontier2010">17</a>]</dt>
<dd>
<p>Dumontier M, Hoehndorf R (2010) Realism for scientific ontologies.
      In: 6th International Conference on Formal Ontology in Information
      Systems.</p>
</dd>
<dt>[<a name="Gruber1992" id="Gruber1992">18</a>]</dt>
<dd>
<p>Gruber T (1992). What is an ontology?
      URL <a href="http://www-ksl.stanford.edu/kst/what-is-an-ontology.html">http://www-ksl.stanford.edu/kst/what-is-an-ontology.html</a>.</p>
</dd>
<dt>[<a name="Ceusters2006" id="Ceusters2006">19</a>]</dt>
<dd>
<p>Ceusters W, Smith B (2006) A realism-based approach to the evolution
      of biomedical ontologies. AMIA Annu Symp Proc: 121&ndash;125.</p>
</dd>
<dt>[<a name="Shrager2003" id="Shrager2003">20</a>]</dt>
<dd>
<p>Shrager J (2003) The fiction of function. Bioinformatics 19:
      1934&ndash;1936.</p>
</dd>
<dt>[<a name="Seyed2009" id="Seyed2009">21</a>]</dt>
<dd>
<p>Seyed AP (2009) BFO/DOLCE Primitive Relation Comparison. In:
      BioOntologies 2009: Knowledge in Biology.</p>
</dd>
<dt>[<a name="rector2005" id="rector2005">22</a>]</dt>
<dd>
<p>Rector A (2005). Representing specified values in OWL: &ldquo;value
      partitions&rdquo; and &ldquo;value sets&rdquo;. W3C Working Group Note.
      URL <a href="http://www.w3.org/TR/swbp-specified-values/">http://www.w3.org/TR/swbp-specified-values/</a>.</p>
</dd>
<dt>[<a name="egana2008" id="egana2008">23</a>]</dt>
<dd>
<p>Egana M, Rector A, Stevens R, Antezana E (2008) Applying Ontology
      Design Patterns in Bio-ontologies, Springer Berlin/Heidelberg. pp.
      7&ndash;16.</p>
</dd>
<dt>[<a name="Schulz2008" id="Schulz2008">24</a>]</dt>
<dd>
<p>Schulz S, Stenzhorn H, Boeker M (2008) The ontology of biological
      taxa. Bioinformatics 24: i313&ndash;i321.</p>
</dd>
</dl>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/07/realism-and-science/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Knowledge Blogging</title>
		<link>http://www.russet.org.uk/blog/2010/06/knowledge-blogging/</link>
		<comments>http://www.russet.org.uk/blog/2010/06/knowledge-blogging/#comments</comments>
		<pubDate>Fri, 18 Jun 2010 16:16:36 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1701</guid>
		<description><![CDATA[Some advance on the knowledge blog front this week. Firstly, myself and Simon Cockell spent a short while setting up a development and testing environment and wrote our first wordpress plugin&#8201;&#8212;&#8201;&#8221;Peaches&#8221; based around the Hello Dolly plugin, but with the lyrics from the Stranglers song instead. We finished this yesterday just before automattic released WordPress [...]]]></description>
			<content:encoded><![CDATA[<p>Some advance on the knowledge blog front this week. Firstly, <a href="http://blog.fuzzierlogic.com/">Simon Cockell</a> and I spent a short while setting up a development and testing environment and wrote our first WordPress plugin&#8201;&#8212;&#8201;&#8221;Peaches&#8221;, based around the Hello Dolly plugin, but with the lyrics from the Stranglers song instead. We finished this yesterday just before Automattic released WordPress 3.0. Hopefully, it will be easy to upgrade. Rather more usefully, I got the very first version of a reference list plugin working. At the moment, it just transforms DOIs into hyperlinks.</p>
<p>And, secondly, I got notification from the British Library that they will be archiving the website. Good news, although there are no archives available yet.</p>
<p>We move forward!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/06/knowledge-blogging/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Grant for Knowledgeblog</title>
		<link>http://www.russet.org.uk/blog/2010/05/new-grant-for-knowledgeblog/</link>
		<comments>http://www.russet.org.uk/blog/2010/05/new-grant-for-knowledgeblog/#comments</comments>
		<pubDate>Wed, 26 May 2010 14:01:29 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1687</guid>
		<description><![CDATA[It&#8217;s been relatively quiet from me for the last few weeks. One of the reasons for this is that I have been submitting a JISC bid. I&#8217;ve not submitted a JISC bid before, so it was quite a lot of work; it&#8217;s exactly the same as a research council proposal, except for all the bits [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been relatively quiet from me for the last few weeks. One of the reasons for this is that I have been submitting a <a href="http://www.jisc.ac.uk/fundingopportunities/funding_calls/2009/12/1409researchdata.aspx">JISC bid</a>. I&#8217;ve not submitted a JISC bid before, so it was quite a lot of work; it&#8217;s exactly the same as a research council proposal, except for all the bits that differ.</p>
<p>The bid, in this case, was for extensions to the <a href="http://www.knowledgeblog.org">Knowledgeblog</a> environment; we want to make sure that it supports research better than it does at present. Our initial experiences were generally good, with a <a href="http://ontogenesis.knowledgeblog.org/647">few naysayers</a>. Additionally, we wanted much better linking to external forms of data; ArrayExpress, Swiss-Prot and the like. And, finally, we wanted to trial this out against a set of specific use cases. Critically, I also got tired of writing &#8220;knowledgeblog&#8221; the entire time, so they will now be &#8220;k-blogs&#8221;.</p>
<p>If it gets accepted, we are proposing to develop some additional functionality, often reusing existing software. We really are trying to avoid developing any software that we don&#8217;t have to. The plans include:</p>
<ol type="1"> 
<li> A documented k-blog process, including information on who does what, and how to use various existing tools (Word and LaTeX in particular). </li>
<li> Proper support for referencing&#8201;&#8212;&#8201;authors should be able to drop in a PMID, or DOI and get a reference list and in-text citation automatically. </li>
<li> Various metadata support, so that the in-text citations have semantics from the reader&#8217;s side. </li>
<li> Trackback proxying for those resources which don&#8217;t support trackbacks. </li>
<li> Integration and additional tooling for adding references and cross-links. </li>
</ol>
<p>I&#8217;m hoping that we get the money; if we do, the work will give us a platform on which to build a publishing environment, a place for an educational resource, and, finally, an excellent extension point for playing with semantic forms of publishing. I am not sure what the odds are; I know quite a few other proposals are going in, and there&#8217;s a reasonable chance that George Osborne will cut the money back before it&#8217;s awarded. All I can do now is wait.</p>
<p>I&#8217;ll probably blog the whole proposal in a few days; this gives me a chance to try out the &#8220;blogging from Word&#8221; experience. How exciting.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/05/new-grant-for-knowledgeblog/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PhD Position Available</title>
		<link>http://www.russet.org.uk/blog/2010/05/phd-position-available/</link>
		<comments>http://www.russet.org.uk/blog/2010/05/phd-position-available/#comments</comments>
		<pubDate>Mon, 17 May 2010 15:49:50 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1685</guid>
		<description><![CDATA[I have a new PhD position available; I am looking to extend some work that I was involved with a while ago now, but into a new area of biology. The idea is that we build an ontological model of the mitochondria, and the knowledge that exists about it. We should be able to build [...]]]></description>
			<content:encoded><![CDATA[<p>I have a new PhD position available; I am looking to <a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/22/14/e530">extend</a> some work that I was involved with a while ago now, but into a new area of biology. The idea is that we build an ontological model of the mitochondria, and the knowledge that exists about it. We should be able to build a light-weight model that covers many areas of the biology as an entire system. This will be useful both as an integration point (a traditional use for ontologies), but also so that we can make predictions and search for inconsistencies in the model. In other words, the ontology should be an integral part of the scientific process; we represent a hypothesis ontologically and then let the reasoner search the data for contradictions.</p>
<p>This is quite exciting, as we did the original work quite a few years ago, and it looked very promising; despite the gap, I still think this could work really well. Since that time, systems biology has gained currency; this work fits, as we aim to look at the mitochondria as a whole. Instead of an in-depth mathematical model of part of the mitochondria, as is common in systems biology, we will have a light-weight logical model of both what we know about the mitochondria <strong>and</strong> how we know it.</p>
<p>Please feel free to distribute this!</p>
<hr /> 
<h2><a name="_phd_studentship_2010"></a>PhD Studentship, 2010</h2>
<p>EPSRC PhD Studentship Building a logical model of biology: the Ontology of Mitochondria</p>
<p>For this project, you will use cutting edge technology designed for the Semantic Web, and apply it to the new field of systems biology. Specifically, you will develop an OWL ontology, a formal, logically specified model, to describe the mitochondria, a subsystem of the cell. You will use this to integrate large amounts of real-world data, to search for inconsistencies and produce predictions about the underlying biology. From a computing perspective, this will result in insights both about the technology, and its scalability; from a systems biology perspective, you gain understanding of the value of models which are wider than traditional mathematical models; from a biomedical perspective, you may gain insight into the functioning and behaviour of a medically important system of the cell.</p>
<p>This is a challenging multi-disciplinary project; applicants are not expected to understand all its aspects at the outset; as a result, it is of interest to those from either a computing science, computational biology or bioinformatics background. Any experience of ontologies, modelling or mitochondrial biology will be an advantage, but is not required. A willingness to learn is critical; students will spend significant time in both a computing science and biology environment, and will become familiar with both.</p>
<p>You should have either a First or 2.1 in Computing Science, a Biological Science or Mathematics, and a distinction level Masters degree in a related subject. Equivalent experience will also be considered.</p>
<p>Depending on how you meet the EPSRC&#8217;s eligibility criteria, you may be entitled to a full or a partial award. A full award covers tuition fees at the UK/EU rate and an annual stipend of £13,290 (2009/10). A partial award covers fees at the UK/EU rate only.</p>
<p>For further details, please contact Phillip Lord &lt;<a href="mailto:phillip.lord@newcastle.ac.uk">phillip.lord@newcastle.ac.uk</a>&gt;.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/05/phd-position-available/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Second Knowledge Blog Meeting</title>
		<link>http://www.russet.org.uk/blog/2010/04/the-second-knowledge-blog-meeting/</link>
		<comments>http://www.russet.org.uk/blog/2010/04/the-second-knowledge-blog-meeting/#comments</comments>
		<pubDate>Mon, 12 Apr 2010 14:53:14 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1681</guid>
		<description><![CDATA[I&#8217;m on my way to the second Knowledge Blog meeting. Well, sort of. The first meeting was badged the &#8220;Ontogenesis Tutorial&#8221; meeting; the focus was on developing a tutorial resource for ontologies. Actually, much the same will be true of this meeting, but I&#8217;ve decided that, for this meeting, as well as addressing the reviews [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m on my way to the second <a href="http://www.knowledgeblog.org">Knowledge Blog</a> meeting. Well, sort of. The <a href="http://www.russet.org.uk/blog/2010/01/the-ontogenesis-tutorial/">first</a> meeting was badged the &#8220;Ontogenesis Tutorial&#8221; meeting; the focus was on developing a tutorial resource for ontologies. Actually, much the same will be true of this meeting, but I&#8217;ve decided that, for this meeting, as well as addressing the reviews for my own article on Ontogenesis, I am going to want to spend some time supporting the process itself. In the first place, this means writing a couple of articles for <a href="http://process.knowledgeblog.org">Process</a>: a new knowledge blog that I am starting for discussion of the process itself.</p>
<p>Since the first meeting, I&#8217;ve had plenty of time to reflect on the general idea of <a href="http://www.knowledgeblog.org">knowledgeblogging</a>. As far as I can see, there is one overwhelming truth about the situation; we got 15 articles in 2 days and, since then, we have been averaging between 500 and 1000 page hits a month. Now, of course, it&#8217;s an open question whether this is at all sustainable; we have no advertising and no financial support. But, still, our most read article (&#8220;What is an Ontology&#8221;) has had several hundred reads and, bottom line, that is pretty good going for an academic article. We might like to think that the work that we do is important (well, it is!), but in publishing terms we are pretty much of a niche market.</p>
<p>On the negative side, we have not had articles flooding in and none of those from the last meeting have got any further. Thinking back to <a href="http://en.wikipedia.org/wiki/Nupedia">Nupedia</a>, many moons ago, it&#8217;s obvious that getting an authorship is always going to be a problem.</p>
<p>I&#8217;m also going to have to think of a snappier and shorter name than &#8220;knowledgeblog&#8221;, which is taking far too long to type. So far:</p>
<dl> 
<dt> k-log </dt>
<dd> Simple, straightforward, but already used </dd>
<dt> knowblog </dt>
<dd> Good, but a homonym for &#8220;noblog&#8221; which is confusing. </dd>
<dt> knoblog </dt>
<dd> Pronounced &#8220;noh-blog&#8221; would be great, but English is not a phonetic language </dd>
<dt> knob </dt>
<dd> &#8220;KNOwledge Blog&#8221;&#8201;&#8212;&#8201;excellent in many ways, but I realise that the entire world does not share my slightly puerile sense of humour. </dd>
</dl>
<p>Hmmm. Comments welcome. So long as they are not about my puerile sense of humour.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/04/the-second-knowledge-blog-meeting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Conclusions from Ontogenesis</title>
		<link>http://www.russet.org.uk/blog/2010/01/conclusions-from-ontogenesis/</link>
		<comments>http://www.russet.org.uk/blog/2010/01/conclusions-from-ontogenesis/#comments</comments>
		<pubDate>Sat, 23 Jan 2010 16:12:25 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1528</guid>
		<description><![CDATA[The Ontogenesis knowledgeblog meeting has now finished; it&#8217;s been a fascinating experience and one that I&#8217;ve enjoyed very much. I was hoping for two things out of the meeting; the first was to get some content. There has been a pressing need introductory material on ontologies for a long time now. We were never going [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://ontogenesis.knowledgeblog.org">Ontogenesis</a> <a href="http://knowledgeblog.org">knowledgeblog</a> meeting has now finished; it&#8217;s been a fascinating experience and one that I&#8217;ve enjoyed very much.</p>
<p>I was hoping for two things out of the meeting; the first was to get some content. There has been a pressing need for introductory material on ontologies for a long time now. We were never going to address this completely in a two day meeting even with the significant number of people that we had in the room. But, we managed to write quite a number of articles between us&#8201;&#8212;&#8201;I rather let the side down with only one small article, but I have the excuse that I was busy answering questions. Most of these have not achieved the required number of reviews yet, although I&#8217;ve just done the second reviews for Mikel&#8217;s, so once that&#8217;s posted, we should be there for at least one article. I think that people enjoyed the process enough that some more articles will appear over time, although, inevitably, losing the immediacy of being in the same room will mean that this process will not happen as rapidly.</p>
<p>The second question was to get a clear understanding of whether the idea of <a href="http://knowledgeblog.org">knowledgeblogging</a> has legs; it seems reasonable in theory, but does it work in practice? There were some issues&#8201;&#8212;&#8201;the server crashing twice out of memory was not ideal, although quickly resolved. Quite a number of people who hadn&#8217;t blogged before found the wordpress interface, particularly the editor, fairly nasty; it&#8217;s not really designed for large posts. The review process also was a little clunky and there were many questions and ideas about this. However, for my money, the 80/20 rule comes in; we got 80 percent there with a more-or-less modified wordpress. Well, maybe, 70/30.</p>
<p>The rest is going to require more thinking about.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/01/conclusions-from-ontogenesis/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Ontogenesis Tutorial</title>
		<link>http://www.russet.org.uk/blog/2010/01/the-ontogenesis-tutorial/</link>
		<comments>http://www.russet.org.uk/blog/2010/01/the-ontogenesis-tutorial/#comments</comments>
		<pubDate>Wed, 20 Jan 2010 19:23:39 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1521</guid>
		<description><![CDATA[I&#8217;m on my way down to Manchester for the Ontogenesis meeting while I was sad enough to blog about on Christmas Day. I&#8217;m looking forward to this meeting a lot; the idea has been in gestation for five or six months since Bio-Ontologies last year. In summary, we are getting a number of people together [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m on my way down to Manchester for the Ontogenesis meeting which I was sad enough to <a href="/2009/12/ontogenesis-and-dois/">blog about</a> on Christmas Day. I&#8217;m looking forward to this meeting a lot; the idea has been in gestation for five or six months since <a href="/2009/06/to-bio-ontologies-2009/">Bio-Ontologies</a> last year. In summary, we are getting a number of people together to write articles for a book, but instead of going through the tedious and difficult process of getting it published we are going to use a <a href="http://ontogenesis.knowledgeblog.org">blog</a>.</p>
<p>I finished fiddling with wordpress yesterday and, hopefully, all is ready (fingers crossed that our server doesn&#8217;t get hacked as happened to this blog a few days ago). I&#8217;m hoping that we manage to get a number of articles written during the meeting; in practice, getting people in one room is the best way of getting these things done. However, this is not a closed process; I&#8217;d welcome articles from anyone, including those not at the meeting. Being blog based, the system is inherently distributed. So, if you have an ontology-related topic that you have a burning desire to write about, please <a href="mailto:phillip.lord@newcastle.ac.uk">contact me</a> and I&#8217;ll let you know whether anyone else is doing it. Alternatively, there is a list of <a href="http://ontogenesis.knowledgeblog.org/topics/">topics</a> that we hope to make a start in covering. The articles will be peer-reviewed and available for the world to see, fully-credited to your name.</p>
<p>I can&#8217;t guarantee that it&#8217;s going to be included in the REF, but I am working on it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/01/the-ontogenesis-tutorial/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Ontogenesis and DOIs</title>
		<link>http://www.russet.org.uk/blog/2009/12/ontogenesis-and-dois/</link>
		<comments>http://www.russet.org.uk/blog/2009/12/ontogenesis-and-dois/#comments</comments>
		<pubDate>Fri, 25 Dec 2009 14:06:42 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1506</guid>
		<description><![CDATA[Okay, so I am totally sad and writing a blog post on Christmas day. Well, the thing is that I&#8217;ve been teaching for months and moving house. This is the first still period that I&#8217;ve had for ages; well, thinking is inevitable. One of the things that I am looking to next year is the [...]]]></description>
			<content:encoded><![CDATA[<p>Okay, so I am totally sad and writing a blog post on Christmas day. Well, the thing is that I&#8217;ve been teaching for months and moving house. This is the first still period that I&#8217;ve had for ages; well, thinking is inevitable.</p>
<p>One of the things that I am looking forward to next year is the last ontogenesis meeting. It&#8217;s been a lot of fun doing these, I&#8217;ve enjoyed them all. The last one is my idea, and I think it&#8217;s going to be good. As an ontologist, you get a lot of questions about how to build ontologies and is there a book. At the moment, there isn&#8217;t really one and it&#8217;s a problem. So, for ontogenesis, we decided to write a set of book chapters; here is the clever bit&#8201;&#8212;&#8201;we just stick them on a blog, because the process of formal publication as a book is long-winded, tiresome and error-prone. I&#8217;m calling the process knowledge blogging&#8201;&#8212;&#8201;it&#8217;s peer-reviewed, formal and with no intention of being regular; articles come when they are written.</p>
<p>I set up the <a href="http://www.knowledgeblog.org">blog</a> some time ago. I haven&#8217;t, as yet, had a lot of time to fiddle with theme or organisation. There is some content, but it&#8217;s just the wordpress default theme. Not ideal, and I hope I will have some time for fixing things after I get back from holidays. I&#8217;ve noticed two problems already though. First is that with longer articles you need section headings and wordpress doesn&#8217;t do them; I&#8217;ve found a solution for this, in the shape of a <a href="http://www.evanscode.com/wordpress-table-of-contents-plugin/">contents table plugin</a>, although subsequent googling also came up with <a href="http://hackadelic.com/solutions/wordpress/toc-boxes">others</a>. This should make navigation a bit better.</p>
<p>The other issue is references&#8201;&#8212;&#8201;I don&#8217;t have a good idea about how to do these sanely. I&#8217;ve been looking for DOI wordpress plugins, but can only find <a href="http://www.crossref.org/CrossTech/2008/02/crossref_citation_plugin_for_w.html">one</a> from crossref which doesn&#8217;t do what I want. This allows you to search for citations; what I wanted was to put a DOI in code and have it present properly.</p>
<p>Still I think I know how to do this; I&#8217;ve found a tool for linking references to the Mormon books; not normally something I would download, but the principle is the same. So I can replace DOIs with a proper link, using a DOI resolver. What I&#8217;d really like to do is have a proper in-text citation also. The documentation on DOIs and metadata harvesting is all rather nasty though; a nice simple REST API would do the trick.</p>
<p>It all confirms my long-held concerns about DOIs; they are a tool for the publishers. Still, perhaps pubmed will come to my rescue. Next place to look.</p>
<p>Happy Christmas to all my subscribers of whom there are very few.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/12/ontogenesis-and-dois/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Homeward from OBI</title>
		<link>http://www.russet.org.uk/blog/2009/10/homeward-from-obi/</link>
		<comments>http://www.russet.org.uk/blog/2009/10/homeward-from-obi/#comments</comments>
		<pubDate>Fri, 23 Oct 2009 19:14:08 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Life]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1497</guid>
		<description><![CDATA[Fours days of ontology bashing at an OBI meeting; this leaves me extremely glad to be going home. The meeting was long, hard and tiring. We got a lot done in the time available, though, and that was impressive. All the people in the room knew what they were doing, and we managed to work [...]]]></description>
			<content:encoded><![CDATA[<p>Four days of ontology bashing at an OBI meeting; this leaves me extremely glad to be going home. The meeting was long, hard and tiring. We got a lot done in the time available, though, and that was impressive. All the people in the room knew what they were doing, and we managed to work together and in parallel to an impressive extent. Even while listening to the main conversation, most people were also Skype chatting about something else to those in and outside the room.</p>
<p>I spent a considerable time working on the paper, which will accompany the release. I got this job mostly to regularise and clean up the English, but in the end did rather more than this; I hope people are not upset about the stuff that I took out; the whole thing was done &#8220;pair programming style&#8221;, although I had different pairs for different sections.</p>
<p>Despite all the efforts, though, there are still tracker items open for the 1.0 release, and that&#8217;s not ideal, but it is good that we are much closer to it.</p>
<p>Philly was much as I remember it; it&#8217;s a reasonably pleasant city. It doesn&#8217;t feel too aggressive and it&#8217;s relatively quiet. As I had a late flight out today (meeting finished yesterday), I spent the time wandering around town; like too many US cities Philly has been built to be easy to drive through, rather than good to live in, but Philly is okay for walking around. They have a nice parkway area on JFK boulevard; I had a nice guided tour around the Rodin museum, which was wonderful, even if lacking The Thinker which is normally their show piece entranceway sculpture. Rodin was big on hands, it turns out, and rather fond of the musculature of backs; the captions on the bronzes suggest that he was having affairs with many of his models, so I wonder if this stems from&#8230;well, you can work it out.</p>
<p>After that, I wandered up to the art museum past the twee statue of Rocky Balboa, and the Converse footprints sculpted in the stairs. The art museum itself is huge; the Thinker is temporarily here, so I got to see it after all, but I think it needs to be outside. As well as the traditional galleries, strangely, they also have a lot of furniture there, and have imported whole rooms from various places. For me, the Asian section was the best; they had an Indian temple, dark and brooding in the half-light, and a Chinese room with the most amazing timbers. I felt the indoor Romanesque outside courtyard (erm&#8230;) was taking it a bit too far.</p>
<p>Not much left to be done after an afternoon full of culture; on the way back to the hotel, I looked for a little park on Chestnut that I had wanted to see and a falafel shop which I had seen signposted. I found neither; the park had left no traces at all, the falafel shop I found a poster for, but I walked the whole street and as far as I can tell 1740 Sansome is a multistory parking lot.</p>
<p>Back where I started, sitting in the airport; tick, tock, tick, tock.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/10/homeward-from-obi/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>OBI Bound</title>
		<link>http://www.russet.org.uk/blog/2009/10/obi-bound/</link>
		<comments>http://www.russet.org.uk/blog/2009/10/obi-bound/#comments</comments>
		<pubDate>Mon, 19 Oct 2009 20:29:52 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Life]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1495</guid>
		<description><![CDATA[Bleary eyed, stacks of chocolate muffins obscuring the &#8220;healthy snacking&#8221; sign, kids on heelers. Yes, I&#8217;m in the airport at stupid-o-clock on saturday morning. I&#8217;m heading out to Philadelphia for an OBI meeting. It&#8217;s an important meeting; OBI has been a long time in gestation, but this should constitute the 1.0 release; it&#8217;s going to [...]]]></description>
			<content:encoded><![CDATA[<p>Bleary eyed, stacks of chocolate muffins obscuring the &#8220;healthy snacking&#8221; sign, kids on heelers. Yes, I&#8217;m in the airport at stupid-o-clock on Saturday morning. I&#8217;m heading out to Philadelphia for an OBI meeting. It&#8217;s an important meeting; OBI has been a long time in gestation, but this should constitute the 1.0 release; it&#8217;s going to be a mass tidy up session.</p>
<p>I&#8217;m quite looking forward to it, in some ways. I quite like Philadelphia, at least if my memory serves me well; I&#8217;ve only been there once, for the SOFG conference many, many moons ago, certainly in my pre-blog days. I remember it as a pleasant town, with a water-front, only slightly scarred by the enormous roads that make US cities less livable than European. I&#8217;m also hoping to catch up with Robin McEntire, who was one of the co-chairs of <a href="http://www.bio-ontologies.org.uk">Bio-Ontologies</a> at ISMB, and is local.</p>
<p>I&#8217;m rather unprepared for the meeting. There has been a lot of activity on the mailing list recently, some of it concerned with paper preparation. But I&#8217;ve been trying to get the rest of my teaching preparation finished (nearly done now) which has left me very busy over the last few weeks; I haven&#8217;t even had time to look at the paper; I&#8217;ve hardly read even the mailing list subject lines. Still, the next week is entirely given over to OBI, which will have to be enough. Travel at this time of year messes with my life to an extent that I&#8217;m certainly not going to feel guilty about it.</p>
<p>The flip side of being busy is that I am now in the process of writing about 5 papers, with the next 2 in my head. After the confusion of moving to Newcastle, working out what research to do and learning to teach, my research was getting a bit stuck; I was running out of ideas for the simple reason of not having time to think. Having an enormous backlog of nearly finished, half-finished, and hardly started good ideas (most of which will, in time, turn out not to be) for papers makes me feel like a proper academic again.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/10/obi-bound/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>EPSRCs Proposal Policy</title>
		<link>http://www.russet.org.uk/blog/2009/10/epsrcs-proposal-policy/</link>
		<comments>http://www.russet.org.uk/blog/2009/10/epsrcs-proposal-policy/#comments</comments>
		<pubDate>Wed, 14 Oct 2009 10:50:27 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1493</guid>
		<description><![CDATA[I was most entertained to read about EPSRCs funding policy changes. Basically, they have taken a long hard look at their system for funding, they have decided that the peer-review system has fundamental problems, and have therefore issued their well thought out and considered solution to the problem: blame the users. Their idea is this; [...]]]></description>
			<content:encoded><![CDATA[<p>I was most entertained to read about EPSRC&#8217;s funding policy changes. Basically, they have taken a long hard look at their system for funding, they have decided that the peer-review system has fundamental problems, and have therefore issued their well thought out and considered solution to the problem: blame the users.</p>
<p>Their idea is this; if you are on too many grants that fail, then you won&#8217;t be allowed to submit again until you have been on some sort of re-education camp. The basic criteria appear to be these: three or more unfunded proposals, ranked in the bottom half, and lower than 25% success over the same two years.</p>
<p>The first criterion is problematic because it is based on an aggregate score; it is impossible to judge in advance whether you are going to be in the bottom half; your proposal could be brilliant and internationally outstanding (EPSRC is like Lake Wobegon, all the grants are above average) and you could still be in the bottom half. The second half of the criterion is also interesting; if you submit a single proposal and it gets rejected then you fall into this category straight away. It&#8217;s also going to mean that it&#8217;s going to be harder to get people to do collaborative grants, as it might bring their stats down. This is after EPSRC have been pushing us for years to put at least 5 different institutions on each proposal if we want it to be funded.</p>
<p>At the same time, information about the REF which is to follow up from the wondrous RAE is starting to trickle out. Nice to see that they are still going to reinforce the existing closed publication system with more bibliometric data. The <a href="http://youaretheref.researchresearch.com/">&#8220;You are the REF&#8221;</a> website offers itself as a way to work out your score. Excitingly the first question is &#8220;What is your discipline?&#8221;; Computer Scientist or Biologist. This seems reflective of the REF documentation that I have seen already. It works on this basis: different disciplines have different rules, so we will make different decisions in each, which is fine, because no one can be in two anyway.</p>
<p>Glad to see that the REF is carrying on the RAE tradition of encouraging multi-disciplinary research.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/10/epsrcs-proposal-policy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OBO Format and Manchester Syntax</title>
		<link>http://www.russet.org.uk/blog/2009/09/obo-format-and-manchester-syntax/</link>
		<comments>http://www.russet.org.uk/blog/2009/09/obo-format-and-manchester-syntax/#comments</comments>
		<pubDate>Thu, 10 Sep 2009 21:42:16 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1470</guid>
		<description><![CDATA[At Neuroinformatics 2009, David Sutherland and I talked about the problems of ontology building. One of the current (and past!) difficulties is to choose an appropriate language for representing the knowledge in your ontology. I thought I would write my thoughts up as a post; this will probably result in the most boring thing I [...]]]></description>
			<content:encoded><![CDATA[<p>At Neuroinformatics 2009, David Sutherland and I talked about the problems of ontology building. One of the current (and past!) difficulties is to choose an appropriate language for representing the knowledge in your ontology. I thought I would write my thoughts up as a post; this will probably result in the most boring thing I have ever written (I am sure someone will point out worse offenses); syntax is dull but distressingly important.</p>
<p>In bioinformatics, there are essentially two choices: OWL and OBO (format). A second issue is finding a good environment for developing the ontology; this divides between Protege, OBO-Edit and the ever-present &#8220;text editor&#8221;. It&#8217;s often the case that we want to use both of these at the same time. Take, for example, OBI, which I am involved in. While the ontology itself is being developed in OWL, many of its dependent ontologies are built using OBO; being purist and demanding one is really not an option. OWL itself has many different syntaxes; at the moment, I generally prefer Manchester syntax because you can edit it with a text editor, which is really not so easy with any of the XML representations.</p>
<p>While these two languages have somewhat different expressivity, how to translate both the syntax and the semantics has been described elsewhere a number of times. One of the recurrent problems, however, stems from the best practices and the syntax of identifiers.</p>
<p>OBO makes use of a numerical, semantics-free identifier and a namespace, with a syntax of <tt>NAMESPACE:IDENTIFIER</tt>. So, a Gene Ontology term looks like <tt>GO:0003674</tt>. The namespace is not constrained to be two letters and has mechanisms for world-uniqueness, in that people talk to each other and sort it out, if they clash. The use of a semantics-free identifier means that term names can be changed while maintaining the implied meaning with the term; the label for the term, meanwhile, provides a human readable version, which can be shown to users of the ontology. I will call these the OBO identifier and OBO label respectively.</p>
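<p>As a rough illustration of the <tt>NAMESPACE:IDENTIFIER</tt> syntax, here is a minimal Python sketch that splits an OBO identifier into its two parts. The function name <tt>split_obo_id</tt> is hypothetical and not part of any OBO tooling; the numeric part is deliberately kept as a string so that leading zeros survive.</p>

```python
def split_obo_id(obo_id):
    """Split an OBO identifier such as 'GO:0003674' into (namespace, local id).

    Hypothetical helper for illustration only. The local id stays a string,
    because converting it to an integer would drop the leading zeros.
    """
    namespace, sep, local_id = obo_id.partition(":")
    if not sep or not namespace or not local_id:
        raise ValueError("expected NAMESPACE:IDENTIFIER, got %r" % (obo_id,))
    return namespace, local_id


if __name__ == "__main__":
    print(split_obo_id("GO:0003674"))
```

<p>Nothing here checks that the namespace is registered anywhere; as described above, uniqueness is sorted out by people talking to each other.</p>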
<p>Translating this, however, into OWL, including Manchester syntax causes significant problems. The naturalistic translation is to turn the OBO identifier into the identifier in OWL; the OBO namespace would become an XML namespace, the OBO identifier would become an XML identifier. Unfortunately, this doesn&#8217;t work. First, the OBO identifier is genuinely just a short string and XML requires a URI; so a mapping between OBO identifiers and URIs is necessary. Second, the OBO identifier is numerical; unfortunately, while the identifiers in OWL can contain numbers they have to start with a non-numerical character. The standard translation, therefore, uses in most cases an OBO-wide URL (<a href="http://purl.obolibrary.org/obo/">http://purl.obolibrary.org/obo/</a>), although some ontologies have their own namespace (GO uses <a href="http://purl.org/obo/owl/GO#">http://purl.org/obo/owl/GO#</a>). The OBO identifier is mapped to a valid identifier by sticking a prefix onto the numbers. So, we have identifiers such as <tt>GO:GO_0042101</tt> or <tt>obo:OBI_1110045</tt>. There are also some OBO ontologies for which this does NOT occur; for instance, BFO classes in OBI come out with identifiers of the form <tt>snap:Continuant</tt> or <tt>span:Process</tt>, except for one which is <tt>bfo:Entity</tt>.</p>
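<p>The prefixing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name <tt>obo_to_owl</tt> and its parameters are hypothetical, and forming the full URI by simply appending the local name to the base URL is my assumption rather than something any particular converter is known to do. It does not handle the BFO-style exceptions mentioned above.</p>

```python
# Base URL quoted in the text for the OBO-wide namespace.
OBO_BASE = "http://purl.obolibrary.org/obo/"


def obo_to_owl(obo_id, base=OBO_BASE, prefix="obo"):
    """Map an OBO identifier to (prefixed name, full URI).

    Hypothetical helper: the namespace is stuck onto the front of the
    number so that the local name starts with a non-numerical character.
    Appending the local name to the base URL is an assumption.
    """
    namespace, sep, number = obo_id.partition(":")
    if not sep or not namespace or not number:
        raise ValueError("expected NAMESPACE:IDENTIFIER, got %r" % (obo_id,))
    local_name = namespace + "_" + number
    return prefix + ":" + local_name, base + local_name


if __name__ == "__main__":
    print(obo_to_owl("OBI:1110045"))
    # GO has its own namespace, so the base and prefix are passed explicitly.
    print(obo_to_owl("GO:0042101", base="http://purl.org/obo/owl/GO#", prefix="GO"))
```

<p>Passing the base and prefix as arguments keeps the per-ontology namespace choice out of the function itself, which seemed the simplest way to cover both the OBO-wide case and the GO case.</p>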
<p>Again, all perfectly reasonable, but unfortunately, when converted to Manchester syntax it means that we end up with classes that look like this slightly elided class from OBI:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: obo:OBI_1110161

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font>@en,

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661,
        obo:OBI_0000299 some (obo:IAO_0000109
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 some obo:OBI_1110196))
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>which completely defeats the aim of a human-readable syntax. Now OBO format has much the same problem; relationships to other classes are specified using cross-references to their identifiers which are, essentially, unreadable. OBO format works around this with a denormalisation, as can be seen from this somewhat elided example from IAO:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre>
<pre><tt>[Term]
id: IAO:0000027
name: data item
def: <font color="#FF0000">"a data item is an information content entity that is intended...."</font>
is_a: IAO:0000030 ! information content entity
</tt></pre>
</pre>
</td>
</tr>
</table>
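<p>The cross-reference in this case is a subsumption link to <tt>IAO:0000030</tt>; the text after the <tt>!</tt> is a comment which repeats the label of the referenced term for the benefit of the human reader.</p>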
<p>One solution would be to use the <tt>rdfs:label</tt> in place of the identifier. So, we would have something that looked like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre>
<pre><tt><b><font color="#0000FF">Class</font></b>: <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font> @en

    <font color="#990000">Annotations:</font>
        obo:identifier <font color="#FF0000">"1110161"</font>

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661,
        obo:OBI_0000299 some (obo:IAO_0000109
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 some obo:OBI_1110196))

</tt></pre>
</pre>
</td>
</tr>
</table>
<p>Other identifiers would have to be changed as well. I&#8217;ve also added the <tt>obo:identifier</tt> line (which I think would be valid, but might require the creation of an OWL individual). Without this, it would not be possible to go backward.</p>
<p>However, this is problematic as it changes the serialisation between the OWL Manchester syntax and other syntaxes of OWL. The class identifier has to be URI-legal, and the OBO label here is not. We could do a syntactic conversion (e.g. <tt>T%20cell%20epitope</tt>) but this, again, reduces readability, defeating the point. Also, the <tt>rdfs:label</tt> would become part of the final identifier URI, which then becomes a semantics-heavy identifier. Finally, it would require an OBO-specific loading of the Manchester syntax, taking the URI identifier from the annotation block, and the <tt>rdfs:label</tt> from the class name.</p>
<p>So, is there any solution? First, there are tooling solutions. In Protege, it is already possible to use any component of the definition in the display. So, you can set the <tt>rdfs:label</tt> as the main display form. Tooling solutions are attractive, but there is a problem; you have to extend all tools to support this view. I realise that the number of freaks who wish to edit OWL with Emacs is not that large, so this might not seem an issue. However, many people wish to develop ontologies collaboratively using version control; if you want to compare versions you use diff, so we now need a Manchester syntax diff viewer. Also, if you want to do some Perl hacking, or straightforward search and replace, again, it&#8217;s all harder.</p>
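<p>To make the syntactic conversion concrete, here is a small Python sketch using the standard library&#8217;s <tt>urllib.parse.quote</tt>. Gluing the encoded label onto the OBO base URL is purely an assumption for illustration; it is exactly the semantics-heavy identifier that the paragraph above argues against.</p>

```python
from urllib.parse import quote, unquote

label = "T cell epitope ELISA IL-1b assay"

# Percent-encode the label so that it is legal inside a URI;
# each space becomes %20, letters, digits and the hyphen are left alone.
encoded = quote(label)

# Hypothetical label-based class URI (an assumption, not an OBO convention).
class_uri = "http://purl.obolibrary.org/obo/" + encoded

print(encoded)
print(class_uri)

# The conversion loses nothing, so the label can always be recovered.
print(unquote(encoded) == label)
```

<p>The round trip works, but the encoded form is hardly easier on the eye than the number it replaces, and renaming the term now changes its identifier.</p>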
<p>To some extent this might seem trivial, but then the entire purpose of Manchester syntax (and the functional syntax) is to have an easy to read and manipulate syntax which the XML version of OWL is not. This purpose is defeated if it&#8217;s hard to read.</p>
<p>So, a second, non-tooling solution. The obvious answer is to take the OBO approach and add comments. Now, the Manchester syntax includes a comment character (#), although the last time I tried, the Protege parser didn&#8217;t implement it. Nonetheless, it allows this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre>
<pre><tt><b><font color="#0000FF">Class</font></b>: obo:OBI_1110161 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font>@en,

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661,
        obo:OBI_0000299 some (obo:IAO_0000109
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 some obo:OBI_1110196))
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>This is not too bad, but it doesn&#8217;t work well for complex class expressions. I can&#8217;t be bothered to look up the labels and have reused one, but you get something like:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre>
<pre><tt><b><font color="#0000FF">Class</font></b>: obo:OBI_1110161 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en,</font></i>

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font>@en,

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661, <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
        obo:OBI_0000299 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
        some (obo:IAO_0000109 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
             some obo:OBI_11101 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
             ))
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>This has three problems. Firstly, we have used comments &#8220;meaningfully&#8221;, yet nothing distinguishes these comments from other, normal comments. Secondly, we have had to reformat the output because we have only a &#8220;to-end-of-line&#8221; comment character. Thirdly, it looks horrible.</p>
<p>So, my minimal solution would be this; we introduce some new comment characters, which are treated as comments normally, but which carry enough semantics to allow a warning when they are wrong; rather like Javadoc, which is a comment wrt the language, but is structured and meaningful wrt the documentation. Tooling could be used to check that the labels masquerading as comments are correct wrt the identifiers.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre>
<pre><tt><b><font color="#0000FF">Class</font></b>: obo:OBI_1110161 [T cell epitope ELISA IL-1b assay],

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font>@en,

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661 [blah],
        obo:OBI_0000299 [longer blah]
        some (obo:IAO_0000109 [more]
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 [stuff]
        some obo:OBI_11101 [OBI Thing]
        ))
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>This is still not ideal; it would require an extension to Manchester syntax, but it&#8217;s minimal, and it does support the semantics-free identifiers in OBO in a way which does not require extensive tooling. It&#8217;s worth reiterating here that OBO&#8217;s semantics-free identifiers are a good thing; so, supporting them supports other people who may wish to do the same, sensible thing. It does have the disadvantage of duplicating information, but at least in a way that is checkable.</p>
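<p>The check itself could be very small. Here is a minimal Python sketch, assuming the bracket syntax proposed above and a hypothetical <tt>labels</tt> table mapping identifiers to their <tt>rdfs:label</tt>; in practice that table would have to come from the ontology itself. The name <tt>check_labels</tt> and the regular expression are mine, not part of any existing tool.</p>

```python
import re

# A prefixed identifier followed by a bracketed label,
# e.g. obo:OBI_1110161 [T cell epitope ELISA IL-1b assay]
ANNOTATED_ID = re.compile(r"([A-Za-z][\w.-]*:\w+)\s*\[([^\]]*)\]")


def check_labels(text, labels):
    """Return (identifier, found, expected) for each wrong bracketed label.

    labels maps identifiers to their rdfs:label; an identifier missing
    from the table is reported with an expected value of None.
    """
    problems = []
    for identifier, found in ANNOTATED_ID.findall(text):
        found = found.strip()
        expected = labels.get(identifier)
        if expected != found:
            problems.append((identifier, found, expected))
    return problems


# Hypothetical label table and input, for illustration only.
labels = {
    "obo:OBI_1110161": "T cell epitope ELISA IL-1b assay",
    "obo:OBI_0000661": "blah",
}
text = """Class: obo:OBI_1110161 [T cell epitope ELISA IL-1b assay]
    SubClassOf:
        obo:OBI_0000661 [not blah]
"""
for identifier, found, expected in check_labels(text, labels):
    print("warning:", identifier, "is labelled", repr(found), "but should be", repr(expected))
```

<p>Because the bracketed labels are ignored by the parser proper, a stale label would only ever produce a warning like this one, never a broken ontology.</p>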
<p>Comments welcome!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/09/obo-format-and-manchester-syntax/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Neuroinformatics 2009</title>
		<link>http://www.russet.org.uk/blog/2009/09/neuroinformatics-2009/</link>
		<comments>http://www.russet.org.uk/blog/2009/09/neuroinformatics-2009/#comments</comments>
		<pubDate>Thu, 10 Sep 2009 13:42:10 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1468</guid>
		<description><![CDATA[This is the third year in a row that I have been to Neuroinformatics (or its forerunner, Databasing the Brain). It&#8217;s still turning out to be an enjoyable meeting, even though there is still lots of it that I don&#8217;t understand. Come to think of it, perhaps because there is lots of it that I don&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>This is the third year in a row that I have been to Neuroinformatics (or its forerunner, Databasing the Brain). It&#8217;s still turning out to be an enjoyable meeting, even though there is still lots of it that I don&#8217;t understand. Come to think of it, perhaps because there is lots of it that I don&#8217;t understand.</p>
<p>Pilsen (or Plzen) is, perhaps, a strange place for the meeting. It&#8217;s a bit of a pig to get to, as the airport is in Prague. Likewise, the conference centre was a bit out of town, so you had to get a taxi if you wanted food in the evening. Still, the venue itself worked well. Slightly flaky wireless, but it had tables upstairs on a balcony; a lot of people migrated up there as the meeting went on, making the auditorium a little deserted.</p>
<p>Although I&#8217;ve said I didn&#8217;t understand lots of it, many of the keynotes this year were on bioinformatics, systems biology or data integration, which I know well. As well as that, there was a (semantic) web and ontology section. I enjoyed Tim Clark&#8217;s talk, as he&#8217;s made stuff that lots of people are actually using, although I don&#8217;t think he explained why during his talk.</p>
<p>The section on high performance computing was probably the least relevant. While they&#8217;ve become interested in power consumption recently, these guys are still obsessed with teraflops (&#8230;now petaflops&#8230;now exaflops). To be honest, I don&#8217;t care. With more power, you can build more granular, higher resolution models, but I doubt that will bring you anything, unless you also have more granular data. They should be worried about discs (always the Cinderella of the hardware world, only slightly more interesting than printers), because it&#8217;s discs which carry the data. While we are at it, spinning discs use lots of power. And they have more flashing lights than CPUs. The hardware guys should be talking about disc space. The neuroscientists should be worrying about filling discs up. Neuroinformaticians should make sure that they end up with an exabyte dataset; not 1000 petabyte datasets or worse, 1,000,000 gigabyte datasets.</p>
<p>I tried to get a bit of Web 2.0 stuff happening at the meeting. David Sutherland set up a <a href="http://www.friendeed.com/incfpilsen">friendfeed room</a>. First day, we were sitting next to each other like two sad blokes at a party full of women, sending each other messages on our iPhones. Although it was a neuroinformatics meeting, so largely without the women. Second day, mostly it was just me, sad, lonely and pathetic. Still, having said that, I did manage to meet almost all of those subscribed to the room, which you couldn&#8217;t achieve at ISMB nowadays. <a href="http://friendfeed.com/deconstructingzaniness">Pavan Ramkumar</a> said hello at lunch, and then later at the airport. I met <a href="http://friendfeed.com/sarahmaynard">Sarah Maynard</a> at her poster; it had ontologies, OWL and information content-based similarity measures; bound to make me happy. Only Lisa Kjonigsen remained in cyberspace only. With luck, next year, more people will join; not least because I&#8217;ll probably not go to Japan.</p>
<p>I had a quick go at live blogging also; to be honest, I am not a natural. The problem is I have too much desire to editorialise. The <a href="http://themindwobbles.wordpress.com/">roboblogger</a> tells me that she just blogs the notes that she would have taken anyway; my notes, on the other hand, are full of comment, invective and questions. Perhaps I could just put these into the asciidoc source of my blog as comments. I stopped live blogging on the last day, not for these reasons, but largely out of a desire not to hold my crushing ignorance of the topics being discussed up to public scrutiny.</p>
<p>Neuroinformatics (the meeting) is changing. I have to believe that more genomic and multiomic data integration is a good thing. The brain is a hard thing to figure out; I have to believe that using more data, more types of data and a heavier use of nice, simple, model organisms is going to increase the rate of advance; with all the fuss about systems biology, it&#8217;s easy to forget the fabulous success of the last 100 years of reductionist biology, which made systems biology possible. This has to be the way forward for neuroscience. Even if it does make the meeting more usual and, perhaps, less interesting for me as a result.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/09/neuroinformatics-2009/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Data-management on the Web Scale, Alon Halevy</title>
		<link>http://www.russet.org.uk/blog/2009/09/data-management-on-the-web-scale-alon-halevy/</link>
		<comments>http://www.russet.org.uk/blog/2009/09/data-management-on-the-web-scale-alon-halevy/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 15:25:30 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[LiveConference]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1464</guid>
		<description><![CDATA[This is a live blog from Neuroinformatics 2009. Data management, the view from 50,000 feet: dimensions are the amount of structure and the number of data sources. More structure, fewer data sources. Distinguishes between parallelisation and heterogeneity. Can distribute data across tables in an organised way (this is parallelisation); or, you can have lots of data, spread across resources, [...]]]></description>
			<content:encoded><![CDATA[<p><strong>This is a live blog from Neuroinformatics 2009.</strong></p>
<p>Data management, the view from 50,000 feet: dimensions are the amount of structure and the number of data sources. More structure, fewer data sources.</p>
<p>Distinguishes between parallelisation and heterogeneity. Can distribute data across tables in an organised way&#8201;&#8212;&#8201;this is parallelisation; or, you can have lots of data, spread across resources, with multiple entities and with no common plan.</p>
<p>Outline&#8201;&#8212;&#8201;data integration and suggest data spaces as a solution.</p>
<p>Databases are so successful because they provide a level of abstraction over the data. Data integration is a higher level of abstraction still because you don&#8217;t have to worry how the data is stored or structured.</p>
<p>Mediated schema, uses a mediation language, a mapping tool, and then a set of wrappers over the datasources, which map them to a common syntax (relational database for example).</p>
<p>So, we know how to do it, but the cost of building data integration systems is really high. Creating the mediated schema or ontology is hard; sometimes it&#8217;s impossible. Mapping source to mediated schema can be a nightmare, because you need many people from both sides of the mediation. There are some automated systems, but a human is always needed. Data level mappings (changing IDs, synonyms and so on). Social costs.</p>
<p>One of the problems with data integration is that it costs a lot early, but yields very little until much later, when it&#8217;s all done. What we really want is pay-as-you-go data management; want useful data out early and constantly.</p>
<p>Every time a human does something with data, they are telling you some information about the data. If you can capture this information then you can do useful stuff with it.</p>
<p>Structured data on the web: the deep web, which is data behind forms; and two others. So, deep web. Knowledge which is not accessible through general purpose search engines; cars, houses and so on are examples of this. Uses data spaces as a way of doing this; learned 5000 different data sources in two months.</p>
<p>One possible way to access the deep web is to put queries against web forms. Have to guess what to put in; one way is to just use words on the form page in the first place. Currently, Google gives much knowledge from this deep web; has the biggest impact on the deep web.</p>
<p>Web tables; can we exploit the knowledge from the tables better? There are 14 billion tables on the web, of which about 154 million are interesting; the rest are formatting or whatever. First problem is to identify schema elements; these are expressible in HTML but actually no one uses it. So have to guess. They got 2.6 million schemas. Would be good to put these into autocomplete (although not sure where).</p>
<p><a href="http://tables.googlelabs.com">Fusion tables</a> lets you upload data and collaborate on the visualisation of it. Changes the visualisation options depending on the data types.</p>
<p>Conclusions&#8201;&#8212;&#8201;bottom up data-integration, which is more realistic than top-down. Dataspaces are an approach. Fusion tables is good.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/09/data-management-on-the-web-scale-alon-halevy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Neurolex and NIF, Jeff Grethe</title>
		<link>http://www.russet.org.uk/blog/2009/09/neurolex-and-nif-jeff-grethe/</link>
		<comments>http://www.russet.org.uk/blog/2009/09/neurolex-and-nif-jeff-grethe/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 10:04:22 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[LiveConference]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1462</guid>
		<description><![CDATA[Guiding principles of NIF. Builds heavily on existing technologies. Information resources come in all sorts of size and shape. Highest level NIF registry. Web index of resources which are relevant to neurosciences. NIF resource diversity&#8201;&#8212;&#8201;three different levels of data, with increasing amount of structure. Is GRM1 in cerebral cortex? NIF system allows searching over multiple [...]]]></description>
			<content:encoded><![CDATA[<p>Guiding principles of NIF. Builds heavily on existing technologies. Information resources come in all sorts of size and shape.</p>
<p>Highest level NIF registry. Web index of resources which are relevant to neurosciences.</p>
<p>NIF resource diversity&#8201;&#8212;&#8201;three different levels of data, with increasing amount of structure.</p>
<p>Is GRM1 in cerebral cortex? NIF system allows searching over multiple different resources. But problems; inconsistent and sparse annotation of scientific data. Many different names of the same thing and so on. Added to this there are over 2000 databases in the registry.</p>
<p>Uses mixed searching, combining ontological information with string-based search; the latter is important where there is no annotation. Can also do query expansion with the ontology to get better querying.</p>
<p>Building ontologies is difficult even for limited domains, never mind all of neurosciences. Trying to do this with multiple levels. NeuroLex: single inheritance, lexicon. NIFSTD: standardize modules under same upper ontology. NIFPlus: create intra-domain and more useful hierarchies using properties and restrictions.</p>
<p>Using logical classification as a result of properties of the entities.</p>
<p>Question: how to get the community involved? Need to provide an easy to use platform for community collaboration. They have a semantic wiki for contributing to NeuroLex. Really lowers the barriers to entry for domain experts who wish to use (and extend!) these terms.</p>
<p>Lots of people are starting to use the resources (they find this out because people complain when the systems are broken!).</p>
<p>Contributing to NeuroLex. Don&#8217;t need an account, but better if you have one; everything is online. The main thing that they are looking for is content, content, content. The more stuff the better. Finally, getting people to value ontologists is really important.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/09/neurolex-and-nif-jeff-grethe/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Semantic Integration of biomedical web communities, Tim Clark</title>
		<link>http://www.russet.org.uk/blog/2009/09/semantic-integration-of-biomedical-web-communities-tim-clark/</link>
		<comments>http://www.russet.org.uk/blog/2009/09/semantic-integration-of-biomedical-web-communities-tim-clark/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 09:40:43 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[LiveConference]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1460</guid>
		<description><![CDATA[This is a live blog from Neuroinformatics 2009. Motivation, what is the common feature of a set of disorders. They are all complex disorders, which we don&#8217;t really understand. Alzforum is a nice example of an early web community. Alzheimers forum. Works as an ongoing journal club, with curated discussions. Started off during the early [...]]]></description>
			<content:encoded><![CDATA[<p><strong>This is a live blog from Neuroinformatics 2009.</strong></p>
<p>Motivation, what is the common feature of a set of disorders. They are all complex disorders, which we don&#8217;t really understand.</p>
<p>Alzforum is a nice example of an early web community. Alzheimers forum. Works as an ongoing journal club, with curated discussions. Started off during the early days of the web.</p>
<p>Developed StemBook, which is an online book, launched about a year ago. Discussion of stuff that is happening. PD Online Research is another community website (for Parkinson&#8217;s disease), using a toolkit that they have developed. Linking across these forums can be a problem; need some form of shared terminology server. Science Collaboration Framework. Based around Drupal, allows common collaborative tools for biomedicine, shared ontologies/vocabulary and so on.</p>
<p>How do you link between these communities? Issues of semantic annotation; how does this happen? There are systems which allow you to guess what an ontology is; building a system which should work across lots of different content management facilities. This can bring lots of benefits, as the additional semantics allows you to work around synonyms etc.</p>
<p>Discourse ontologies. SALT&#8201;&#8212;&#8201;semantic annotation of latex.</p>
<p>Need to support a spectrum of different knowledge structures, thesauri and so on. Less complex == more tractable to biologists. Complex and formalised, tractable to computers.</p>
<p>They are now integrating the discourse ontology into myExperiment and others.</p>
<p>Using existing work on entity recognition to try to produce a provenance-aware representation of these results.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/09/semantic-integration-of-biomedical-web-communities-tim-clark/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Semantic Web for Neuroscience, Alan Ruttenberg</title>
		<link>http://www.russet.org.uk/blog/2009/09/semantic-web-for-neuroscience-alan-ruttenberg/</link>
		<comments>http://www.russet.org.uk/blog/2009/09/semantic-web-for-neuroscience-alan-ruttenberg/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 09:11:37 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[LiveConference]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1458</guid>
		<description><![CDATA[This is a live blog from Neuroinformatics 2009. Creative Commons is based around issues with data and copyright, trying to change the idea that not sharing is the default. Science Commons looks at the issues specific to science. Semantic web in a nutshell; adds to web standards and practices encouraging common naming, ontology development, expression [...]]]></description>
			<content:encoded><![CDATA[<p><strong>This is a live blog from Neuroinformatics 2009.</strong></p>
<p>Creative Commons is based around issues with data and copyright, trying to change the idea that not sharing is the default. Science Commons looks at the issues specific to science.</p>
<p>Semantic web in a nutshell; adds to web standards and practices encouraging common naming, ontology development, expression in a knowledge representation language, easy integration over multiple sources; works both inside and outside organisational boundaries.</p>
<p>Why should you want this? Network effects, people can use their own skills, and combine knowledge from many different sources. Provides efficiencies at the global scale.</p>
<p>Copy and paste for the semantic web; a mashup with knowledge from the Allen Brain Institute and the Google API. Had to screenscrape Allen Brain for this.</p>
<p>Trying to look for druggable targets in pyramidal neurons. Google provides too many results, so does PubMed. Shows complex SPARQL query over the knowledge from the web; crossing from MeSH to gene to GO. This may not be the best query, but it&#8217;s nonetheless useful and will make biologists happy.</p>
<p>A brief jump into ontology making. Terms that mix up material and neurotransmitter. Uses example, peptide, neurotransmitter, hormone and ligand; all of these could be peptides, although not necessarily. Need to untangle these. In many cases, these have already been done (ChEBI). Move from English to OWL.</p>
<p>How to build consensus in ontology building; somewhat related to the OBO Foundry rules. Another program is the INCF program for ontology of neural structures.</p>
<p>Challenges: building bigger ontologies is hard. Barriers to sharing are a major difficulty.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/09/semantic-web-for-neuroscience-alan-ruttenberg/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
