<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>An Exercise in Irrelevance &#187; Tech</title>
	<atom:link href="http://www.russet.org.uk/blog/category/all/professional/tech/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.russet.org.uk/blog</link>
	<description>Ramblings from Phil Lord&#039;s life</description>
	<lastBuildDate>Thu, 02 Feb 2012 14:11:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>KCite &#8212; the next generation</title>
		<link>http://www.russet.org.uk/blog/2011/12/kcite-the-next-generation/</link>
		<comments>http://www.russet.org.uk/blog/2011/12/kcite-the-next-generation/#comments</comments>
		<pubDate>Tue, 13 Dec 2011 11:12:48 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1954</guid>
		<description><![CDATA[Well, I am pleased to say that we have now released the new version of kcite. It&#8217;s been a while in coming&#8201;&#8212;&#8201;I had the difficult bit of the code working about 5 months ago, but then got caught up in teaching. Kcite is our bibliography manager which enables citations such as this one , using [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1954">
<p><a name="preamble"></a> 
<p>Well, I am pleased to say that we have now released the new version of kcite. It&#8217;s been a while in coming&#8201;&#8212;&#8201;I had the difficult bit of the code working about 5 months ago, but then got caught up in teaching. Kcite is our bibliography manager which enables citations such as this one <span class="kcite" kcite-id="ITEM-1">(doi:10.1371/journal.pone.0012258)</span>
, using DOI or PubMed IDs.</p>
<p>Kcite now uses the marvellous <a href="https://bitbucket.org/fbennett/citeproc-js/wiki/Home">citeproc.js</a> to render the bibliography on the client. The main advantage of this for this release is that the bibliography formatting is slightly more regular than before. We&#8217;ve also switched to name-author style as the default. The disadvantage is that the browser has to execute a lot of JavaScript client-side. I&#8217;ve made efforts to ensure that this is not too onerous: on my desktop, I have been rendering 200-300 item bibliographies, which is much more than most people will use in practice.</p>
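<p>As a rough illustration of what rendering involves, the sketch below turns one citeproc-style JSON record, of the kind kcite embeds in the page, into a plain author-year string. It is written in Python for brevity and is not citeproc-js, which drives its formatting from citation style files; the name <tt>format_reference</tt> and the output layout are assumptions made for this example.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">def format_reference(item):
    """Format one citeproc-style JSON record as a plain author-year string."""
    # Join authors as "Family, Given"; records with no authors get a placeholder.
    authors = "; ".join(
        "{}, {}".format(a["family"], a["given"]) for a in item.get("author", [])
    ) or "Unknown author"
    # "issued" holds nested date parts: [[year, month, day]].
    year = item["issued"]["date-parts"][0][0]
    return "{} ({}). {}. {}.".format(
        authors, year, item["title"], item["container-title"]
    )


# The record kcite resolved for the citation in this post, trimmed to four fields.
item = {
    "title": "Adding a Little Reality to Building Ontologies for Biology",
    "author": [
        {"family": "Lord", "given": "Phillip"},
        {"family": "Stevens", "given": "Robert"},
    ],
    "container-title": "PLoS ONE",
    "issued": {"date-parts": [[2010, 9, 3]]},
}
print(format_reference(item))</pre>
</td>
</tr>
</table>
<p>Run as written, this prints &#8220;Lord, Phillip; Stevens, Robert (2010). Adding a Little Reality to Building Ontologies for Biology. PLoS ONE.&#8221; Letting the reader choose a citation style would amount to swapping this one function for another.</p>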
<p>In future versions, however, I feel the use of citeproc-js will really come into its own. We should be able to enable the user to select their own citation style (currently this is the choice of the authors, which makes little sense). We can also add any semantics to the HTML that we choose&#8201;&#8212;&#8201;CiTO will come properly, for instance. I can also clean up the &#8220;unresolved&#8221; and &#8220;timed out&#8221; references. However, the first thing on the list is to make the callback for the bibliographic data asynchronous. Client-side this <strong>should</strong> be easy, as we are already using jQuery. Server-side requires rewrite rules, which I haven&#8217;t done before, but I think should not be too hard.</p>
<p>On a separate track, now that I have kcite on what I think is a stable technological footing, I can start to extend it in other ways, the most obvious being additional forms of identifiers, critically including WordPress posts with kcite enabled. I&#8217;m also pleased that CrossRef have recently added the ability to retrieve metadata in citeproc format (JSON), which means I can skip an integration step.</p>
<p>However, before all of that, we need to restore <a href="http://www.russet.org.uk/blog/2011/09/kblog-has-been-compromised/">kblog</a>. We&#8217;ve taken the opportunity to move it to a better technological footing, and have started to prepare the new machine that it will be hosted on. This has taken a long time, due to a busy start to the (academic) year. Hopefully, getting hacked is not something we will repeat soon.</p>
<p>The current release of kcite is 1.4.1. This fixes two bugs: one reported by Carl Boettinger (so that now the JavaScript only loads when necessary), and another I found while writing this post, which made editors appear as authors.</p>


<p>Bibliography
      <div class="kcite-bibliography"></div>
</p>


<script type="text/javascript">
      var kcite_citation_data;
      if( kcite_citation_data == undefined ){
          kcite_citation_data = [];
      }
      kcite_citation_data[ 1954 ] = {"ITEM-1":{"source":"doi","identifier":"10.1371/journal.pone.0012258","resolved":true,"id":"ITEM-1","title":"Adding a Little Reality to Building Ontologies for Biology","author":[{"family":"Lord","given":"Phillip"},{"family":"Stevens","given":"Robert"}],"container-title":"PLoS ONE","issued":{"date-parts":[[2010,9,3]]},"page":"e12258-","volume":"5","issue":"9","DOI":"10.1371/journal.pone.0012258","type":"article-journal"}};
</script>


</div> <!-- kcite-section 1954 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2011/12/kcite-the-next-generation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on a Chimney</title>
		<link>http://www.russet.org.uk/blog/2011/10/thoughts-on-a-chimney/</link>
		<comments>http://www.russet.org.uk/blog/2011/10/thoughts-on-a-chimney/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 12:53:23 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1943</guid>
		<description><![CDATA[While I am currently spending a significant amount of my time promoting the idea that blog technology can be, and should be used for serious scientific material, I thought I would make a post of a different and perhaps more traditional vein: that is, a light-weight idea, with no serious research behind it, but Years [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1943">
<p><a name="preamble"></a> 
<p>While I am currently spending a significant amount of my time promoting the idea that blog technology can be, and should be, used for serious scientific material, I thought I would make a post in a different and perhaps more traditional vein: that is, a light-weight idea, with no serious research behind it. Years ago now, I created an <a href="http://homepages.cs.ncl.ac.uk/phillip.lord/wiki/energy/index.html">Energy Wiki</a> full of daft ideas for making energy. I last revisited this in 2009, with an idea for <a href="http://www.russet.org.uk/blog/2009/05/the-sea-cylinder-storage-system/">storing energy at sea</a>. I&#8217;d actually forgotten that part of the reason for this was to try out Inkscape, which is part of the reason for this post. I wanted to try a bit of multi-media, that is, a blog post with an image in it. High tech.</p>
<p>So, the idea. One form of renewable energy is the <a href="http://en.wikipedia.org/wiki/Solar_updraft_tower">Solar Updraft Tower</a>, also known as a solar chimney. This works straightforwardly enough: you build a large greenhouse in a desert, with a very large chimney in the middle. The top of the chimney is in cold air, the bottom in hot, and an updraft results; stick a turbine in or at the base of the chimney, and you get energy out.</p>
<p>The problem is that, to work at all efficiently, you need a big temperature differential, and so a tall chimney. This in turn means a wide chimney, both to support a substantial updraft, and for mechanical reasons. Tall means 500m or more. The bottom line is that a pretty significant capital expenditure is required, followed by a relatively long pay-back period, which in turn means that the biggest single expense of the project is likely to be interest charges, rather than anything else.</p>
<p>So, my idea is to use an inflatable chimney instead. Initially, I thought about some kind of helium lifting scheme, but then I realised that this makes no sense; why not use hot air, which after all is what the whole system is designed to generate? Consider, for instance, the following organisation:</p>
<p><img src="http://www.russet.org.uk/blog/wp-content/uploads/2011/10/inflatable_solar_chimney.png" style="border-width: 0;" alt="Inflatable Chimney" height="500"></p>
<p>Essentially, it&#8217;s a traditional balloon with a hole in the middle. Obviously the whole system is stackable&#8201;&#8212;&#8201;a second balloon could be placed on top of the first and so on. The whole structure could be assembled or disassembled as desired. Unfortunately, though, this would probably take quite a bit of work.</p>
<p>My second thought came from the idea that, while most designs for solar chimneys have the chimney in the middle of the greenhouse, it doesn&#8217;t really need to be. A horizontal pipe to the middle would be enough. The chimney could be outside of the greenhouse. The advantage that this brings is that the tower could be raised or lowered in situ, without the risk of it falling on, and damaging, the greenhouse. So my second idea was to build the chimney as two cylinders, with the gap between them serving as the inflatable, buoyant structure. By pleating the cylinders in opposite directions like so:</p>
<p><img src="http://www.russet.org.uk/blog/wp-content/uploads/2011/10/concertina_chimney.png" style="border-width: 0;" alt="Concertina Chimney" height="500"></p>
<p>the whole structure should concertina up and down. By inflating from the top and deflating from the bottom, it should be possible to raise or lower the entire system by opening and shutting vents at the bottom or top of each section to the inside of the chimney.</p>
<p>One advantage of this system is that as the chimney gets higher, the temperature differential between the inside and the outside gets greater, which should mean that the taller the tower, the more buoyant the sections get; this should help to keep the entire thing as upright as possible, as will the air travelling through the middle, like some gigantic party blower.</p>
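<p>To put a rough number on the buoyancy, the sketch below uses the ideal gas law to estimate how much mass one cubic metre of warm air can support. The temperatures in the usage note are illustrative assumptions rather than figures for any real plant, pressure is held at its sea-level value for simplicity even though it falls with height, and the function names are made up for this example.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">R = 8.314        # J/(mol K), molar gas constant
M_AIR = 0.02896  # kg/mol, mean molar mass of dry air
P = 101325       # Pa, sea-level pressure (assumed constant with height)


def air_density(celsius):
    """Ideal-gas density of dry air in kg per cubic metre."""
    return P * M_AIR / (R * (celsius + 273.15))


def lift_per_cubic_metre(inside_c, outside_c):
    """Mass in kg that one cubic metre of warm air can support."""
    return air_density(outside_c) - air_density(inside_c)


# Assumed example: air at 60C inside the sections, 20C outside.
print(round(lift_per_cubic_metre(60, 20), 3), "kg per cubic metre")</pre>
</td>
</tr>
</table>
<p>With those assumed temperatures the lift comes out at a little under 0.15 kg per cubic metre, so the sections would need a large volume relative to the weight of their fabric. The lift grows as the temperature gap widens, which is the effect described above.</p>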
<p>Another addition that comes to mind would be to add inflatable half-toroids around the chimney at regular intervals. With a curve on the top, and a flat bottom-side, the entire thing should operate like an aerofoil, lifting the tower up; so, the windier it gets, the greater the lift, which is just what is needed to keep it as upright as possible. This should mean that the chimney can operate in relatively high wind levels.</p>
<p>This kind of system could even work in concert with a fixed chimney&#8201;&#8212;&#8201;extending the height by 500m say, and increasing its efficiency. It could also act as a supplement&#8201;&#8212;&#8201;operating only on very hot days when the greenhouse has excess capacity. Or, finally, it could operate while the main chimney was being built, meaning that a plant can start generating income earlier, which should reduce the cost of interest payments.</p>
<p>Of course, this all comes with drawbacks: the ongoing running costs are likely to be significant; wind will remain a significant factor regardless; and, finally, inflating the tower will use hot air, which will reduce the efficiency of the whole system. Are these flaws significant? Well, as I said, this post is light-weight with no serious research behind it. I have no idea, nor any really clear idea about how to work out these costs. Answers on a postcard please.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1943 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2011/10/thoughts-on-a-chimney/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Kblog has been compromised</title>
		<link>http://www.russet.org.uk/blog/2011/09/kblog-has-been-compromised/</link>
		<comments>http://www.russet.org.uk/blog/2011/09/kblog-has-been-compromised/#comments</comments>
		<pubDate>Tue, 20 Sep 2011 13:35:07 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1939</guid>
		<description><![CDATA[I have been pushing the idea of Kblogs&#8201;&#8212;&#8201;scientific publishing using commodity software&#8201;&#8212;&#8201;for a year or so know. Our main site, Knowledgeblog.org has got around 100 articles now, and has had about 50k page views (or about 4x the number of raw page hits) and has generated a certain presence on the internet. While this is [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1939">
<p><a name="preamble"></a> 
<p>I have been pushing the idea of Kblogs&#8201;&#8212;&#8201;scientific publishing using commodity software&#8201;&#8212;&#8201;for a year or so now. Our main site, <a href="http://knowledgeblog.org">Knowledgeblog.org</a> has got around 100 articles now, and has had about 50k page views (or about 4x the number of raw page hits) and has generated a certain presence on the internet. While this is generally good, the price of fame is that we have moved somewhat up the list of potential hack targets. Unfortunately, this has resulted in two compromises on the machine; they were probably not disconnected, although we have no evidence to link the two at the moment.</p>
<p>The first was through the timthumb zero day vulnerability. It involved a code injection into a WordPress installation using a thumbnail generator with a dodgy bit of PHP in it. We cleaned the system up as well as we were able and went from there. Sadly, a couple of days ago, we had a second break-in. This was a more serious and directed attack (the timthumb was scripted, and we were one of several thousands of sites to be hit). In this case, the machine has been root compromised, and the web server used to gather username/passwords in a phishing expedition. We do have backups and all of the content. There were a number of things that we could have done to secure the machine further, at least one of which may have prevented the hack, but there are only so many hours in the day.</p>
<p>So, where does this leave us? Is the whole idea of knowledgeblog broken? Personally, I do not think so. While I have been critical of the cost associated with academic publishing, I am aware that it cannot happen for free. Running and maintaining a web server takes money; it is something that we have been doing on a shoe-string for a while, especially since our JISC money ran out. In the couple of years that we have run knowledgeblog, I think that we have learned and shown a lot. As well as page views and content, we have shown that scientific publishing can be easy for the author; that we can generate attractive articles this way; that we can start to embed computationally accessible knowledge into these articles. We have shown that we can do peer-review, if we need. We have shown we can <a href="http://wayback.archive.org/web/*/http://knowledgeblog.org">archive</a> and preserve for the future. We have shown that knowledgeblog is good for grey literature. We have added <a href="http://www.russet.org.uk/blog/2011/02/the-problem-with-dois/">DOIs</a>. Multiple authors. Good looking <a href="http://www.russet.org.uk/blog/2010/08/latex-to-wordpress/">maths</a>. We even have some preliminary stats on how much publication costs from Word doc to website.</p>
<p>At the moment, though, we do not have a business model. It is clear that if we are to move this forward, it needs to be run as a service, managed, and looked after, something which is neither my expertise nor my desire. The analogy that I have made earlier with <a href="http://www.russet.org.uk/blog/2011/06/the-naivete-of-scientists/">Wikipedia</a> is, I think, a good one; it would be good to move this into a foundation status.</p>
<p>The path from here to there is a long one, however. For the moment, we will restore knowledgeblog, and it will re-emerge, although at this time of year, it will take a while. But we look to the future as well.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1939 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2011/09/kblog-has-been-compromised/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WordPress for Authoring</title>
		<link>http://www.russet.org.uk/blog/2011/05/wordpress-for-authoring/</link>
		<comments>http://www.russet.org.uk/blog/2011/05/wordpress-for-authoring/#comments</comments>
		<pubDate>Tue, 10 May 2011 14:58:21 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1902</guid>
		<description><![CDATA[In a typically thoughtful post, Peter Sefton discusses the advantages and disadvantages of WordPress as an authoring environment. I though I would clarify my feelings on this a little. Previously, from our experience on Knowledge Blog suggests to us that the WordPress environment is very poor for editing, something we have expressed in our process [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1902">
<p><a name="preamble"></a> 
<p>In a typically thoughtful <a href="http://jiscpub.blogs.edina.ac.uk/2011/05/10/wordpress/">post</a>, Peter Sefton discusses the advantages and disadvantages of WordPress as an authoring environment. I thought I would clarify my feelings on this a little.</p>
<p>Our experience on <a href="http://www.knowledgeblog.org">Knowledge Blog</a> suggests to us that the WordPress environment is very poor for editing, something we have expressed in our <a href="http://process.knowledgeblog.org/3">process</a> documentation.</p>
<p>I should be clear that this is in the context of knowledgeblog. Academics have their own way of working, and normally are used to this. They use tools which fit with their lives. For example, Google Docs is a good tool but, basically, useless if you do most of your paper writing on a plane. The same will be true for tools such as <a href="http://annotum.wordpress.com">Annotum</a> if it ever appears. It is hard to beat Word and email (or frequently Dropbox nowadays).</p>
<p>Of course, there are other ways; for example WordPress offers &#8220;A complete revision history of the document is maintained with the ability to roll-back to earlier versions&#8221;. But, then, so does Word with Dropbox. And the WordPress facilities are in no way comparable to the versioning that you get with LaTeX, or asciidoc and Subversion or Git. Although, in practice, I rarely use versioning when authoring, and Dropbox&#8217;s poor man&#8217;s roll-back is enough.</p>
<p>The only clear advantage of using WordPress tools is that you don&#8217;t need a two stage publication process. But the general idea behind blogs is that publication does not happen often; it happens once, and then the post remains. This is in contrast to a Wiki, where using external editing tools is impractical at best. And the situation is very similar to current publication where PDF is the common medium.</p>
<p>My conclusion&#8201;&#8212;&#8201;there are lots of people, lots of use cases, and lots of requirements. I don&#8217;t say that authoring <strong>must</strong> be independent from the publication environment; I do say that publication environment <strong>must</strong> not require a single authoring tool. Fortunately, for the tools that we have created for <a href="http://www.knowledgeblog.org">kblog</a>, we can afford to be agnostic. They will work integrated with WordPress editing also. Still, I just spent 10 minutes longer making this post than I needed to, to stop the shortcodes in Peter&#8217;s quote below from being kcite&#8217;d (check the source for the trick!), which was harder because I use asciidoc. There are going to be problems. Supporting a heterogeneous environment is painful. I wish there were a perfect solution, but there are just a set of messy compromises.</p>
<p>Peter also makes a second point about our plugins (and others): that is, that they are non-standard.</p>
<blockquote><p>There are similar issues/risks with stuff like WordPress shortcodes such as KCite from KnowledgeBlogs. It’s a great tool for authors, allowing them to cite things in a rational way:</p>
<p><tt>DOI Example – &#91;cite source=’doi’]10.1021/jf904082b[/cite]</tt></p>
<p><tt>PMID example – &#91;cite source=’pubmed’]17237047[/cite]</tt></p>
<p>But it’s proprietary to a particular processing environment.</p>
<p>There is a risk of creating a new form of the proprietary lock-in we had up until recently (and arguably we still have) with document formats like Microsoft’s .doc.</p>
<p align="right"> &#8212; Peter Sefton </p>
</blockquote>
<p>It&#8217;s a fair point, and one which I agree with. The last thing that we need is hundreds of independent shortcode or other syntaxes; I mean, imagine what a nightmare it would be if every single Wiki engine and text conversion tool used their own, almost identical, but slightly different and incompatible syntax. Hmmm.</p>
<p>We chose to use shortcodes for two highly pragmatic reasons. First, WordPress has nice support for them. Building a shortcode handler is nice and simple and does not require us to build regexps (the first version did it by hand for one reason or another, and the regexps were painful). The second reason stems from our desire for a decoupled authoring environment. Shortcodes pass through the HTML publishing step without escaping; to use XML or HTML compliant mechanisms would require us to change, for example, the HTML export mechanism of Word. Not somewhere we wished to go.</p>
<p>In practice, however, I don&#8217;t think that this is a major problem, if the code is written carefully. With Mathjax-latex, the shortcodes are translated into MathJax syntax, then MathJax does the rest. The development version of kcite works this way&#8201;&#8212;&#8201;the shortcodes are translated into a <tt>span</tt>-tag based microformat, then the bibliography tools operate on the client to format the bibliography. So long as the code is crafted reasonably, it should not be dependent on WordPress.</p>
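<p>To illustrate how little of this needs WordPress, here is a sketch in Python that pulls citations out of shortcode-marked text and swaps each one for a numbered placeholder. It is not how kcite is implemented (kcite is a WordPress plugin and uses the WordPress shortcode handler); the function names, the placeholder format and the assumption of plain ASCII quotes in the shortcode are all choices made for this example.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">import itertools
import re

# Matches a cite shortcode with a source attribute; ASCII quotes are assumed.
CITE = re.compile(r"\[cite source='(\w+)'\](.+?)\[/cite\]")


def extract_citations(text):
    """Return (source, identifier) pairs in order of appearance."""
    return CITE.findall(text)


def number_citations(text):
    """Replace each shortcode with a numbered marker such as {ITEM-1}."""
    counter = itertools.count(1)
    return CITE.sub(lambda match: "{ITEM-%d}" % next(counter), text)


text = "See [cite source='doi']10.1021/jf904082b[/cite] for details."
print(extract_citations(text))
print(number_citations(text))</pre>
</td>
</tr>
</table>
<p>The list of pairs is what a metadata lookup would consume, and the numbered markers are what a client-side formatter would later replace, so neither half has to know which publishing system produced the text.</p>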
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1902 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2011/05/wordpress-for-authoring/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Greyhole and Scientific Data Handling</title>
		<link>http://www.russet.org.uk/blog/2011/02/greyhole-and-scientific-data-handling/</link>
		<comments>http://www.russet.org.uk/blog/2011/02/greyhole-and-scientific-data-handling/#comments</comments>
		<pubDate>Sun, 20 Feb 2011 21:37:38 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1875</guid>
		<description><![CDATA[I was delighted recently to discover Greyhole. Essentially, it&#8217;s a system that allows you to configure a Samba share at one end, and a bunch of disks at the other. The disks get the data shared between them, with a configurable level of duplication. It&#8217;s aimed mainly at the home user, who wants a higher [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1875">
<p> <a name="preamble"></a> 
<p>I was delighted recently to discover <a href="http://www.greyhole.org">Greyhole</a>. Essentially, it&#8217;s a system that allows you to configure a Samba share at one end, and a bunch of disks at the other. The disks get the data shared between them, with a configurable level of duplication. It&#8217;s aimed mainly at the home user, who wants a higher degree of data security than the single drive approach provides, but is not going to go the expensive and poorly scalable RAID approach.</p>
<p>The implementation is fairly straightforward and elegant. The Samba share is provided by a customised Samba virtual file system. This augments the standard process by logging to a spool region (one file per file operation). A daemon consumes these files, stuffing them into a database, then consumes the entries in the database. Essentially, if anything has changed, greyhole rsyncs the change to one or more of the backend disks.</p>
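<p>The shape of that pipeline can be sketched in a few lines of Python. This is only an illustration of the description above, not Greyhole&#8217;s code: the database stage is skipped, <tt>shutil.copy2</tt> stands in for rsync, the rotation used to pick disks is invented, and every name here is hypothetical.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">import shutil
from pathlib import Path


def drain_spool(spool, share, disks, copies=2):
    """Replay logged write operations onto `copies` of the backing disks."""
    copies = min(copies, len(disks))
    done = []
    # One spool file per operation: the name orders them, the body holds the path.
    for op in sorted(spool.iterdir()):
        relative = op.read_text().strip()
        source = share / relative
        # Pick target disks by simple rotation (an invented placement policy).
        start = len(done) % len(disks)
        targets = [disks[(start + i) % len(disks)] for i in range(copies)]
        for disk in targets:
            dest = disk / relative
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, dest)  # stand-in for rsync
        op.unlink()  # the operation has been consumed
        done.append((relative, [disk.name for disk in targets]))
    return done</pre>
</td>
</tr>
</table>
<p>Called with a spool directory, the shared directory and a list of disk mount points (all <tt>Path</tt> objects), it returns which disks received each file. The point of the spool is that the share only has to log what happened; all the decisions about duplication live in the daemon.</p>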
<p>It&#8217;s a really nice system. I must admit that PHP wouldn&#8217;t have been my first choice, but that is horses for courses. Likewise, the dependency on Samba is unfortunate&#8201;&#8212;&#8201;I always found it a pig to configure, besides which I&#8217;d like to use this internally on a Linux box. I had a <a href="http://code.google.com/p/greyhole/issues/detail?id=35">discussion</a> with the author, Guillaume Boudreau, who confirmed my initial feeling that the Samba VFS could be easily replaced with another, such as FUSE. I&#8217;d like to have a go at doing this work, and it&#8217;s very possible&#8201;&#8212;&#8201;basically, it requires a big merge between Guillaume&#8217;s VFS and the FUSE-based <a href="http://loggedfs.sourceforge.net">loggedfs</a>. If I had written any C before, I could probably do it in a day or so, but as it stands, it is likely to take longer.</p>
<p>As well as home usage, though, this could also be good for the researcher. While a small lab could pay for managed storage, this tends to come in at £1000 per TB, per annum. Most labs don&#8217;t need 24/7 recovery though, and the data is often write once, read occasionally. Greyhole would work out for 1TB at 200 quid (for a low-wattage PC server), plus 100 quid for two 1TB discs, which would cost, say, 40 quid to power for a year (say, 15W for the computer, 10W for the hard drives, and a bit more for networking, adaptors, USB hubs and so on). For lab usage, the drives would probably last 2-3 years at least, while an all solid state computer might last twice this long. More storage space could be added as needed, dropping the cost per TB substantially, although how scalable greyhole is I don&#8217;t know.</p>
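<p>Putting those figures side by side, a quick calculation in Python gives the yearly cost for 1TB of duplicated storage. The sketch assumes the hardware cost is spread evenly over its lifetime and that the whole machine is replaced at the end of it, which is pessimistic if the computer outlives the drives; <tt>annual_cost</tt> is a name made up for this example.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">def annual_cost(hardware, power_per_year, lifetime_years):
    """Hardware spread evenly over its lifetime, plus yearly power."""
    return hardware / lifetime_years + power_per_year


pc, discs, power = 200, 100, 40  # pounds, the rough figures quoted in the text
managed = 1000                   # pounds per TB per year for managed storage

for years in (2, 3):
    diy = annual_cost(pc + discs, power, years)
    print(years, "year lifetime:", round(diy), "pounds a year, against", managed)</pre>
</td>
</tr>
</table>
<p>That works out at 190 pounds a year over a two year lifetime and 140 over three, so even on the shorter lifetime it is roughly a fifth of the managed price, before counting anyone&#8217;s time to look after it.</p>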
<p>The general approach could be used more widely, though. As well as JBOD spanning, what about:</p>
<table cellpadding="4"> 
<tr valign="top"> 
<td> <strong>Blackhole</strong>  </td>
<td> 
<p> The lab runs a local disc for their own data access needs, which is backed up to an institutional data store somewhere off-site. The daemon could be configured to use late night bandwidth, which would only compromise data security slightly. </p>
</td>
</tr>
<tr valign="top"> 
<td> <strong>Whitehole</strong>  </td>
<td> 
<p> More in line with my style of science, the local disc would be backed up to a publicly accessible repository. Obviously this would require suitable metadata to describe the status of the data, but everything would be sharable and accessible as it was produced. </p>
</td>
</tr>
<tr valign="top"> 
<td> <strong>Wormhole</strong>  </td>
<td> 
<p> Many labs collaborate with one or two others. A wormhole file system would be configured so that data placed on my file share would magically appear, read-only, in one or more places on the internet, using an rsync/ssh pipe. My collaborators&#8217; data would, likewise, appear on my disc. </p>
</td>
</tr>
<tr valign="top"> 
<td> <strong>Plughole</strong>  </td>
<td> 
<p> This would replicate the normal scientific &#8220;supplementary data&#8221; process for releasing data publicly. Essentially, everything on the file system would, after a significant period, be converted into an Excel spreadsheet with no column titles or any additional metadata. This would then be placed in a web accessible location for between 2 and 6 months, before being randomly deleted. </p>
</td>
</tr>
</table>
<p>I&#8217;m buying a low power consumption <a href="http://www.aleutia.com/products/t1">PC</a> to try out greyhole in its current form, to see how it goes.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1875 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2011/02/greyhole-and-scientific-data-handling/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tooled up bibliographies</title>
		<link>http://www.russet.org.uk/blog/2011/02/tooled-up-bibliographies/</link>
		<comments>http://www.russet.org.uk/blog/2011/02/tooled-up-bibliographies/#comments</comments>
		<pubDate>Fri, 11 Feb 2011 17:35:47 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1862</guid>
		<description><![CDATA[I&#8217;ve just got around to installing the magnificient kcite plugin that Simon Cockell wrote for knowledgeblog. It&#8217;s actually a really simple plugin, but it&#8217;s tremedously useful. For instance, I can now cite my own papers on reality , function or protein classification and all the metadata will be gathered and cited for me in a [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1862">
<p> <a name="preamble"></a> 
<p>I&#8217;ve just got around to installing the magnificent <a href="http://knowledgeblog.org/kcite-plugin/">kcite</a> plugin that <a href="http://blog.fuzzierlogic.com/">Simon Cockell</a> wrote for <a href="http://knowledgeblog.org">knowledgeblog</a>. It&#8217;s actually a really simple plugin, but it&#8217;s tremendously useful. For instance, I can now cite my own papers on reality <span class="kcite" kcite-id="ITEM-1">(doi:10.1371/journal.pone.0012258)</span>
, function <span class="kcite" kcite-id="ITEM-2">(doi:10.1186/2041-1480-1-S1-S4)</span>
 or protein classification <span class="kcite" kcite-id="ITEM-3">(doi:10.1093/bioinformatics/btl208)</span>
 and all the metadata will be gathered and cited for me in a nice reference list at the end.</p>
<p>Of course, I am used to the good life, and this is still all a bit clunky for me. I wanted support from my text editor. For this blog, I use a tool-chain of Emacs, asciidoc and blogpost. But for references I use reftex mode and bibtex. Now I realise that this is a pretty minority tool-chain, but it seemed to me that it should be possible to get it working. And it is, actually, pretty easy. Very rough and ready, but the lisp is below. Obviously, this will need fiddling with for each user, and I will improve it over time.</p>
<p>But it demonstrates the point, I think. A little bit of glue can produce a pretty good publishing tool chain, relatively quickly.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" style="margin:0.2em 0;"> 
<tr>
<td style="padding:0.5em;">
<pre style="margin:0; padding:0;">(add-hook 'adoc-mode-hook
          'phil-asciidoc-reftex-support)

(defvar phil-reftex-citation-override nil)

(defun phil-asciidoc-reftex-support()
  (reftex-mode 1)
  (make-local-variable 'phil-reftex-citation-override)
  (setq phil-reftex-citation-override t)
  (make-local-variable 'reftex-default-bibliography)
  (setq reftex-default-bibliography
        '("~/documents/bibtex/phil_lord_refs.bib"
          "~/documents/bibtex/phil_lord/journal_papers.bib"
          "~/documents/bibtex/phil_lord/conference_papers.bib"
          )))

(defadvice reftex-format-citation (around phil-asciidoc-around activate)
  (if phil-reftex-citation-override
      (progn
        (setq ad-return-value (phil-reftex-format-citation entry format)))
    ad-do-it))

(defun phil-reftex-format-citation( entry format )
  (let ((doi (reftex-get-bib-field "doi" entry)))
    (format "pass:[[cite]%s[/cite\\]]" doi)))</pre>
</td>
</tr>
</table>


<p>Bibliography
      <div class="kcite-bibliography"></div>
</p>


<script type="text/javascript">
      var kcite_citation_data;
      if( kcite_citation_data == undefined ){
          kcite_citation_data = [];
      }
      kcite_citation_data[ 1862 ] = {"ITEM-1":{"source":"doi","identifier":"10.1371/journal.pone.0012258","resolved":true,"id":"ITEM-1","title":"Adding a Little Reality to Building Ontologies for Biology","author":[{"family":"Lord","given":"Phillip"},{"family":"Stevens","given":"Robert"}],"container-title":"PLoS ONE","issued":{"date-parts":[[2010,9,3]]},"page":"e12258-","volume":"5","issue":"9","DOI":"10.1371/journal.pone.0012258","type":"article-journal"},"ITEM-2":{"source":"pubmed","identifier":"20626924","resolved":true,"id":"ITEM-2","title":"An evolutionary approach to Function.","author":[{"family":"Lord","given":"Phillip"}],"container-title":"Journal of biomedical semantics","issued":{"date-parts":[[2010,6,22]]},"type":"article-journal"},"ITEM-3":{"source":"doi","identifier":"10.1093/bioinformatics/btl208","resolved":true,"id":"ITEM-3","title":"Protein classification using ontology classification","author":[{"family":"Wolstencroft","given":"K."},{"family":"Lord","given":"P."},{"family":"Tabernero","given":"L."},{"family":"Brass","given":"A."},{"family":"Stevens","given":"R."}],"container-title":"Bioinformatics","issued":{"date-parts":[[2006,7,15]]},"page":"e530-e538","volume":"22","issue":"14","DOI":"10.1093/bioinformatics/btl208","type":"article-journal"},"ITEM-4":{"source":"doi","identifier":"","resolved":true,"id":"ITEM-4","title":"","author":[],"container-title":null,"type":"article-journal"}};
</script>


</div> <!-- kcite-section 1862 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2011/02/tooled-up-bibliographies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Problem with DOIs</title>
		<link>http://www.russet.org.uk/blog/2011/02/the-problem-with-dois/</link>
		<comments>http://www.russet.org.uk/blog/2011/02/the-problem-with-dois/#comments</comments>
		<pubDate>Tue, 08 Feb 2011 16:20:48 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1849</guid>
		<description><![CDATA[This article was jointly authored by Phillip Lord and Simon Cockell. Rhodopsin is a protein found in the eye, which mediates low-light-level vision. It is one of the 7-transmembrane domain proteins and is found in many organisms including human. Rhodopsin has a number of identifiers attached to it, which allow you to get additional data [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1849">
<p> <a name="preamble"></a> 
<p>This article was jointly authored by Phillip Lord and <a href="http://blog.fuzzierlogic.com/archives/473">Simon Cockell</a>.</p>
<p>Rhodopsin is a protein found in the eye, which mediates low-light-level vision. It is one of the 7-transmembrane domain proteins and is found in many organisms including human.</p>
<p>Rhodopsin has a number of identifiers attached to it, which allow you to get additional data about the protein. For instance, the human version is identified by the string &#8220;OPSD_HUMAN&#8221; in <a href="http://www.uniprot.org">uniprot</a>. If you wish, you can go to <a href="http://www.uniprot.org/OPSD_HUMAN">http://www.uniprot.org/OPSD_HUMAN</a> and find additional information. Actually, this URI redirects to <a href="http://www.uniprot.org/P08100.html">http://www.uniprot.org/P08100.html</a>. P08100 is an alternative (semantic-free) identifier for the same protein; P08100 is called the accession number and it is stable, as you can read in the <a href="http://expasy.org/sprot/userman.html#AC_line">user manual</a>. If you don&#8217;t like the HTML presentation, you can always get the traditional structured text so beloved of bioinformatics; this is at <a href="http://www.uniprot.org/P08100.txt">http://www.uniprot.org/P08100.txt</a>. Or the Uniprot XML (that is at <a href="http://www.uniprot.org/P08100.xml">http://www.uniprot.org/P08100.xml</a>). Or <a href="http://www.uniprot.org/P08100.rdf">http://www.uniprot.org/P08100.rdf</a> if you want RDF. If you just want the sequence, that is at <a href="http://www.uniprot.org/P08100.fasta">http://www.uniprot.org/P08100.fasta</a>, or <a href="http://www.uniprot.org/P08100.gff">http://www.uniprot.org/P08100.gff</a> if you want the sequence features. You might be worried about changes over time, in which case you can see all versions at <a href="http://www.uniprot.org/uniprot/P08100?version=*">http://www.uniprot.org/uniprot/P08100?version=*</a>. Or if you are worried about changes in the future, then <a href="http://www.uniprot.org/uniprot/P08100.rss?version=*">http://www.uniprot.org/uniprot/P08100.rss?version=*</a> is the place to be. Obviously, if you want to move outward from here to the DNA sequence, or a report about the protein family, or any of the domains, then all of that is linked from here. 
If you don&#8217;t want to code this for yourself, there are libraries in perl, python and java which will handle these forms of data for you.</p>
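<p>To show how regular this scheme is, here is a minimal Python sketch that builds the URLs listed above from an accession number. The function names are our own, purely for illustration, and the URL pattern is just the one quoted in this post; whether the service still honours it is an assumption.</p>

```python
# Sketch only: the URL pattern is copied from the examples in this post.
# BASE, FORMATS and both function names are hypothetical, not a Uniprot API.
BASE = "http://www.uniprot.org"
FORMATS = ("html", "txt", "xml", "rdf", "fasta", "gff")


def uniprot_url(accession, fmt="html"):
    """Return the URL for one representation of a Uniprot entry."""
    if fmt not in FORMATS:
        raise ValueError("unsupported format: " + fmt)
    return "{0}/{1}.{2}".format(BASE, accession, fmt)


def all_representations(accession):
    """Map each format name to its URL for the given accession."""
    return {fmt: uniprot_url(accession, fmt) for fmt in FORMATS}
```

<p>Calling <code>uniprot_url("P08100", "fasta")</code> gives the sequence URL quoted above; one stable identifier plus a file extension is the whole interface.</p>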
<p>So this might be overkill, but the point is surely clear enough. It&#8217;s very easy to get the data in a variety of formats, through stable identifiers. The history is clear, and the future as clear as it can be. The technology is simple and straightforward for both humans and computers to access. The world of the biologist is a good place to be.</p>
<p>What does this have to do with DOIs? Let&#8217;s consider a selection of publications from one of us. Of course, one of the nice things about DOIs is that you can convert them into URIs. But what do they point to? Well, a variety of different things. Maybe the <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0012258">full HTML article</a>. Or, <a href="http://www.liebertonline.com/doi/abs/10.1089/omi.2008.0080">perhaps</a> an HTML abstract and a picture of the front page. Or more <a href="http://dx.doi.org/10.1109/MIS.2004.1265885">links</a>. Or, bizarrely, a <a href="http://bib.oxfordjournals.org/content/8/1/45">list</a> of the author biographies. Or just another image of a print out of the <a href="http://dx.doi.org/10.1007/11574620_56">front page</a> of an identified digital object.</p>
<p>These are a selection from our conference and journal publications. Obviously, this doesn&#8217;t cover many of our conference papers, as most don&#8217;t have DOIs unless they are published by a big publisher. Or our books. These are published by big publishers, but obviously they are books which is different. I&#8217;ve also organised or been on the PC for a number of workshops. They don&#8217;t have DOIs either. All of them do have URIs.</p>
<p>In no case can we guarantee that what we see today will be the same as what we get tomorrow, even though DOIs are supposedly persistent. The presentation of the HTML on those pages that display HTML is wildly different; in many cases, there is no standard metadata. Given the DOI, there doesn&#8217;t appear to be a standard way to get hold of the metadata. If you poke around really hard on the DOI website, you may get to <a href="http://www.doi.org/tools.html">http://www.doi.org/tools.html</a>. At this point, you probably already know about <a href="http://dx.doi.org">http://dx.doi.org</a>, which allows you to resolve a DOI through HTTP. The list of links doesn&#8217;t take that long to work through, so you might eventually get to <a href="http://www.crossref.org">http://www.crossref.org</a>. From here, you can perform searches, including extracting metadata for articles; obviously, you need to register, and you need an API key for this. It doesn&#8217;t always work, so if that fails, you can try <a href="http://www.pubmed.org">http://www.pubmed.org</a>, which returns metadata for some DOIs that CrossRef doesn&#8217;t, but doesn&#8217;t hold a DOI for every publication it lists (even those that have them), so it also fails in unpredictable ways.</p>
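<p>The one thing that is simple is turning a DOI into a URI for the resolver. A minimal Python sketch follows; the function name and the handling of a leading &#8220;doi:&#8221; prefix are our own assumptions. It makes no network request, because, as described above, what comes back from the resolver is not predictable.</p>

```python
from urllib.parse import quote

# Resolver address as given in this post.
RESOLVER = "http://dx.doi.org/"


def doi_to_uri(doi):
    """Turn a bare or 'doi:'-prefixed DOI into a resolver URI (hypothetical helper)."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # Keep the slash between prefix and suffix; percent-encode anything unusual.
    return RESOLVER + quote(doi, safe="/")
```

<p>This gets you a URI, but nothing about the metadata, the format or the version of what it points to.</p>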
<p>The difference between the two situations couldn&#8217;t really be clearer. Within biology, we have an open, accessible and usable system. With DOIs, we don&#8217;t. The DOI handbook spends an awful lot of time describing the advantages of DOIs for publishers; very little is spent on the advantages for the people generating and accessing the content. It is totally unclear to us what use case DOIs are trying to address from our point of view; whatever it is, they certainly seem to fail in their purpose.</p>
<p>So, why do we care about this? Well, recently, we have been implementing DOIs for <a href="http://www.knowledgeblog.org">kblogs</a>. <a href="http://ontogenesis.knowledgeblog.org">Ontogenesis</a> articles now all have DOIs. When we were originally thinking about <a href="http://www.knowledgeblog.org">kblogs</a>, our investigations on how to mint new DOIs came to very little. If DOIs are hard to use, creating them is even worse: you need a Registration Authority; setting this up within a university would be a nightmare. Compare this to the £9 credit card transaction required for a domain name (even this can be quite hard in a University setting!). In the end, we have managed to achieve this using <a href="http://www.datacite.org">DataCite</a>. Ironically, they are misusing technology intended for articles to represent data; we are misusing DataCite to represent articles again. We also have to keep a hard record of our own of the DOIs we have minted, because, despite the fact all this information is stored in the DataCite database, there is no way of discovering if a DOI points at a given URL using the DataCite API, so we have no way of doing a reverse lookup from a blogpost to discover its DOI.</p>
<p>We&#8217;ve also created a <a href="http://knowledgeblog.org/kcite-plugin/">referencing</a> system for WordPress. This does DOI lookups for the user, currently using CrossRef, or PubMed. We are not sure yet whether we can retrieve DataCite metadata in this way also.</p>
<p>The irony of this is that it is all totally pointless. WordPress already creates permalinks, based on a URI. These URIs are trackback/pingback capable so can be used bi-directionally. We have added support so that URIs maintain their own version history, so that you can see all previous versions. If you do not trust us, or if we go away, then <a href="http://knowledgeblog.org">URIs</a> are archived and versioned by the <a href="http://www.webarchive.org.uk/wayback/archive/20110111230930/http://knowledgeblog.org/">UK Web archive</a>. Currently, we are adding features for better metadata support, which will use a simple REST style API like Uniprot. Hopefully, multiple format and subsection access will follow also.</p>
<p>So, why are we using DOIs at all? For the same reason as DataCite which has as one of its <a href="http://datacite.org/whatisdc.html">aims</a> &#8220;to increase acceptance of research data as legitimate, citable contributions to the scientific record&#8221;. We need DOIs for <a href="http://www.knowledgeblog.org">kblog</a> because, although DOIs are pointless, they have become established, they are used for assigning credit, and they are used as a badge of worth. For us, we find it unfortunate that, in the process of using DOIs, we are supporting their credentials as a badge of worth, but it seems the path of least resistance.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1849 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2011/02/the-problem-with-dois/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Latex to WordPress</title>
		<link>http://www.russet.org.uk/blog/2010/08/latex-to-wordpress/</link>
		<comments>http://www.russet.org.uk/blog/2010/08/latex-to-wordpress/#comments</comments>
		<pubDate>Thu, 26 Aug 2010 15:34:34 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1740</guid>
		<description><![CDATA[LaTeX to WordPress Phillip Lord This post describes the process of posting to WordPress from a LaTeX source file, using tools generated as part of the Knowledgeblog project. 1 Introduction About a month ago, we managed to get funding from JISC for knowledgeblog; the idea is to turn a blog platform from something for light [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1740">
<div class="titlepage"> 
<h1>LaTeX to WordPress</h1>
<p>Phillip Lord</p>
</p></div>
<div class="abstract"> This post describes the process of posting to WordPress from a LaTeX source file, using tools generated as part of the Knowledgeblog project. </div>
<h1 id="sec:introduction">1 Introduction</h1>
<p>About a month ago, we managed to get funding from <a href="http://www.jisc.ac.uk">JISC</a> for <a href="http://www.knowledgeblog.org">knowledgeblog</a>; the idea is to turn a blog platform from something for light commentary into a framework for serious scientific publication. One of the key requirements for this is to fit in with people’s existing working practices; and for this, we need a good document creation environment. This means word and latex. I’ve been working mostly on the latter, and this post is the first outcome. It’s generated totally automatically from latex. This is an advance on my paper on <a href="http://www.russet.org.uk/blog/2010/07/realism-and-science/">realism</a> which was semi-automatically converted, with some hand editing of the HTML. </p>
<p>At the moment, the tool-chain is a little bit clunky, but it will improve! This is not meant to be an announcement that all is ready, just an early alpha release and proof-of-principle. </p>
<h1 id="a0000000002">2 Implementation</h1>
<p>The implementation of this tool-chain uses three pieces of software: </p>
<dl class="description">  
<dt>latextowordpress: </dt>
<dd>
<p>This package, which I have written, uses <a href="http://plastex.sourceforge.net">plasTeX</a> to parse and render the latex into HTML. Most of the work is being performed by plasTeX out-of-the-box, although using a non-default configuration. Math-mode is being treated separately, however, rather than using plasTeX’s default image rendering approach. </p>
</dd>
<dt>blogpost:</dt>
<dd>
<p><a href="http://www.methods.co.nz/asciidoc/#_blogpost_weblog_client">blogpost</a> is being used to actually post the generated HTML onto the web. The HTML can also be cut and pasted directly into wordpress, but blogpost is easier for me, as it’s the usual tool I use anyway (normally over asciidoc source). Blogpost is unmodified. </p>
</dd>
<dt>mathjax-latex: </dt>
<dd>
<p>This is a wordpress plugin, that I have written, which uses <a href="http://www.mathjax.org">MathJax</a> to render math-mode from the original latex in the browser. The plugin just injects the mathjax javascript headers into a post on-demand (i.e. only on posts with math-mode in them). </p>
</dd>
</dl>
<p>Currently, this is all held together with some dodgy makefiles; this will be improved in time. </p>
<p>The first and last of these tools are available from <a href="http://services.knowledgeblog.org/download/">knowledgeblog</a>. I’ve tested them on Ubuntu 10.04 and they are in alpha. Comments are welcome, to <a href="mailto:knowledgeblog-discuss@knowledgeblog.org">knowledgeblog-discuss</a>. </p>
<h1 id="a0000000003">3 Key Features</h1>
<p>At the moment, I haven’t fully explored all the features of LaTeX that are well supported. However, all the structural elements (sections, lists), bibliographies, links via the <a href="http://www.tug.org/applications/hyperref/manual.html">hyperref</a> package all seem to work well. </p>
<p>The math mode rendering works well. I’ve been using one famous equation: \(E=mc^2\), as my main test. But more complex examples work also. This is from <a href="http://www.mathjax.org">mathjax</a>:\(J_\alpha (x) = \sum _{m=0}^\infty \frac{(-1)^ m}{m! \,  \Gamma (m + \alpha + 1)}{\left({\frac{x}{2}}\right)}^{2 m + \alpha }\). </p>
<p>I’ve made a few tweaks to this also for common idioms. So the less-than symbol is written in mathmode in latex but rendered directly in HTML: &lt;. </p>
<h1 id="sec:future">4 Future Work</h1>
<p>There are many things left to do yet. The process needs to be made smoother, with a single tool to hook the current tool-chain together; it would be good to attach a PDF generated from the latex also. Currently, titles are set independently (which is why this post appears to have two titles). The mathjax plugin needs configuration options (it overwrites wp-latex functionality at the moment). And there is significant testing to do to see what advanced features (figures critically!) work and don’t work. Still, it’s good to see that most of the tools that I needed to get this to work already existed. With luck, most of the other tools we need will be as good. </p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1740 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/08/latex-to-wordpress/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A new grant for Knowledgeblog</title>
		<link>http://www.russet.org.uk/blog/2010/08/a-new-grant-for-knowledgeblog/</link>
		<comments>http://www.russet.org.uk/blog/2010/08/a-new-grant-for-knowledgeblog/#comments</comments>
		<pubDate>Mon, 02 Aug 2010 14:06:36 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Grants]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1729</guid>
		<description><![CDATA[  I&#8217;m very pleased that our grant for knowledgeblog has been accepted by JISC. I shall follow the tradition that I set with my last post, of publishing all my primary scientific output on this blog. In this case, I&#8217;m using Word, which like the latex that I used last time isn&#8217;t perfect. Still improving [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1729">
<p>
 </p>
<p><span style="font-family:Arial">I&#8217;m very pleased that our grant for <a href="http://www.knowledgeblog.org">knowledgeblog</a> has been accepted by JISC. I shall follow the tradition that I set with my <a href="http://www.russet.org.uk/blog/2010/07/realism-and-science/">last</a> post, of publishing all my primary scientific output on this blog. In this case, I&#8217;m using Word, which like the latex that I used last time isn&#8217;t perfect. Still improving this process is part of the knowledgeblog proposal, so this post is also attacking a key deliverable for the grant!
</span></p>
<p>
</span></p>
<p><span style="font-family:Arial">The main content for this post is also available on the <a href="http://knowledgeblog.org/category/all">knowledgeblog</a> events blog.</p>
<p> </p>
<p><span style="font-family:Arial"><strong>Outline Project Description
</strong></span></p>
<p><span style="font-family:Arial">The project extends existing blogging tools for use as a lightweight, semantically linked publication environment. This enables researchers to create a hub in the linked-data environment, that we call <em>knowledge</em> or <em>k-blogs</em>.  K-blogs are convenient and straightforward for authors to use, integrating into researchers&#8217; existing work practices and tools. They provide readers with distributed feedback and commenting mechanisms. We will support three communities (microarray, public health and workflow), providing immediate benefit, in addition to the long term benefit of the platform as a whole.  Additionally, this will enable a user-centric development approach, while showcasing the platform as the basis for next generation research publishing.</span></p>
<h1><span style="font-family:Arial; font-size:11pt">1. Introduction 
</span></h1>
<p><span style="font-family:Arial"><sup>1</sup>This document describes a proposal for a project within the JISC &#8220;Managing research Data&#8221; call. Data comes in many forms, from raw statistics, to highly structured databases, through to textual reports; natural language, although hard to search and manage, is still the richest form of representation; data in the form of reports and publications are the central hub around which all other data sit. This project, therefore, will provide a lightweight, yet extensible, framework for scientific publishing, incorporating a software-supported peer-review process. Bi-directional links will be maintained both between publications and to other forms of data, using semantic markup to enhance the meaning of these links. We will also customize this framework for three communities which, as well as being directly useful, will provide real-world requirements. The project will largely develop &#8220;glue&#8221; between existing, widely-used, open-source software systems, ensuring its sustainability and usefulness past the end of the funding.<br/>
		</span></p>
<h1><span style="font-family:Arial; font-size:11pt">2. Fit to Programme Objectives and Project Outline 
</span></h1>
<p><span style="font-family:Arial"><br/><sup>2</sup>The project call identifies the <strong>complexity</strong> and <strong>hybrid</strong> nature of the UK research data environment; despite this, one central focal point remains &#8212; most researchers spend considerable amounts of time discussing their data in the form of &#8220;paper&#8221; publications. For some, more theoretical disciplines, such as parts of computer science, the paper is the sole output; in others, such as biology, datasets are associated with papers and the <strong>barriers</strong> between &#8220;publication&#8221; and &#8220;data&#8221; are breaking down; most data sources in biology are rich in <strong><em>annotation</em></strong>: text that supports and explains the raw data. It is normally the annotation, not the raw data, which defines the quality of the resource. In these cases, <strong>text</strong> is an <strong>intrinsic</strong> part of the <strong>data</strong>. <br/><br/><sup>3</sup>However, the conventional publication process has changed relatively little; web technologies have largely been adopted as a distribution mechanism. Publications are still <strong>expensive</strong> &#8212; either at subscription or publication time, depending on the business model of the publisher, and involve considerable, time-consuming interactions between author and publisher, often relating to display and presentation issues. This is in stark contrast to, for example, the biological data centres where both raw and annotated data are often made available <strong>within hours </strong>of their generation.<br/><br/><sup>4</sup>This situation is unfortunate because it limits the ability of researchers to customise their publication process for the requirements of their own discipline. 
As demonstrated by Shotton et al, and Rousay et al, it is possible to add considerable value, both enhancing the paper for the reader, as well as providing <strong>direct and semantically enhanced links</strong> to underlying data. The cost of the existing process, however, makes this form of publication unlikely for some data; for example, few scientists publish papers about negative results, resulting in an acknowledged publication bias. As a result, it is <strong>hard</strong> for the semantically enhanced publication to take its place as the central hub for a <strong>linked data</strong> environment as envisioned by Coles and Frey, linking to and between research datasets, and the published knowledge about these datasets. <br/><br/><sup>5</sup>In the last decade, the blog has become a common, web-based publication framework. There are now numerous off-the-shelf tools and platforms for managing blogs, providing a high-degree of functionality. Many scientists blog about their work, about other published work (research blogging) or &#8220;live blog&#8221; about conferences and talks as they happen. In this case, the researcher is in-charge of their own publication environment, can extend it to their requirements, and publication happens immediately. However, the blog has not yet become a standard means of publication for <strong>primary research output</strong>.<br/><br/><sup>6</sup>Recently, as part of the EPSRC funded Ontogenesis network (ref), we trialled the <strong><em>Knowledge Blog</em></strong> process; in this case aimed at producing an educational resource describing many aspects of ontology development and usage, which might previously have been published in book form. 
We have shown that with this technology base, it is possible to replicate many of the features of the open peer-review, scientific book publication process; following two small meetings, we have written around 20 articles, and the website maintains around 1000 post reads per month (not simple hits!). To achieve this, we used only two features of the blog &#8212; trackbacks (bidirectional links) and categories (hierarchical keywords); although we used the WordPress blogging software, these features are supported by most other systems. We call these articles <strong><em>k-blogs</em></strong>.<br/><br/><sup>7</sup>Currently, however, the k-blog process is not fully supported with blog software alone, nor does it fully support the referencing, advanced linking and provenance needed specifically for research publications. For this project, we propose to provide extensions to support data-rich publications, deeply and semantically linked to other k-blogs and to other forms of data repository. Therefore, the project addresses the objectives and aims of the call through four main workpackages.<br/><br/>1) A documented <strong>k-blog process </strong>(WP1.1) describing different levels of  peer-review suitable for different forms of research data. An implementation (WP1.2), the <strong>k-blog platform</strong>, of this process based around open-source, off-the-shelf software.<br/><br/>2) Extensions to the k-blog platform supporting <strong>linking</strong>. This includes full support for referencing including COINS metadata on posts (WP2.1), client-side and permanently linked versions (WP2.2) and bidirectional links (WP2.3) to other data sets. We will add <strong>semantics</strong> to these links using the Citation Ontology (CiTO) (WP2.4).<br/><br/>3) Support for three specialist environments&#8212;<strong>healthcare</strong> (WP3.1), <strong>microarray</strong> (WP3.2) and <strong>workflows</strong> (WP3.3). 
All useful in their own right and showcasing the extensibility of the framework.<br/><br/>4) <strong>Documentation</strong> and <strong>tooling</strong> to integrate the k-blog process into scientists&#8217; existing working practice and tooling; scientists will be able to publish from Word, OpenOffice, Google Docs or LaTeX (WP4.1). We will add tooling and documentation, as WP4.2, to support the use of reference management tools such as Endnote, Mendeley or Zotero, making use of deliverables from WP2.<br/>
		</span></p>
<h1><span style="font-family:Arial; font-size:11pt">3. Quality of proposal and Robustness of Workplan
</span></h1>
<p>
 </p>
<p>
<h2><span style="font-family:Arial">3.1 WP1: Knowledge Blog Process</span></h2>
<p><span style="font-family:Arial"><br/><br/><sup>8</sup>In this project, we aim to develop a light-weight publication framework, including the desirable aspects of the formal<strong> peer-review process</strong>. However, different forms of scientific publication require different levels of peer-review. For example, for http://ontogenesis.knowledgeblog.org, we require two reviews from an editorial board, assessing quality, appropriate for an educational resource. However, for http://process.knowledgeblog.org, which is intended to contain informal &#8220;how-to&#8221; and request for comment documents, a much lighter-weight, single editorial review assessing scope alone is more appropriate. Deliverable <strong>WP1.1</strong> will consist of <strong>documentation</strong> describing both formally and informally, a number of <strong>levels</strong> for the knowledge blog process, and how these can be achieved using a blog. These documents will, themselves, be published on http://process.knowledgeblog.org.<br/><br/><sup>9</sup>These processes will be <strong>implemented</strong> as Deliverable <strong>WP1.2</strong>, comprising <strong>freely available</strong> and widely used pieces of software, with additional &#8220;glue&#8221;. The basic publication framework will use WordPress 3 (WoP) &#8212; an open-source, multi-site, multi-author blogging system used to provide the hosted blog service at http://www.wordpress.com. While we have found that WoP supports many aspects of this process, particularly from the reader&#8217;s perspective, a significant degree of &#8220;book-keeping&#8221; is required from authors, reviewers and editors. Readers know whether a paper has been reviewed or not, but authors have to remember for themselves who is reviewing the paper. Therefore, we will use a &#8220;ticket system&#8221;, specifically Request Tracker 3 (RT) (http://bestpractical.com/rt/). 
Both WoP and RT are <strong>extensible</strong> with plugins and will be extended and adapted to reflect the k-blog levels of WP1.1.<br/><br/><sup>10</sup>We will use this extensibility to provide a light-weight integration. RT operates as an email response system; by <strong>extending WoP</strong> to send <strong>email</strong> on submission of new papers, this can provide both an integration point, as well as the main point of interaction for authors, reviewers and editors. To provide editorial and reviewer functionality tickets can be moved between queues; extensions to RT will use standard blogging <strong>XML-RPC</strong> calls to feedback to WoP by, for example, re-categorising papers once accepted. OpenID (http://openid.net) will be used to integrate the user accounts between the two systems. WoP already supports this fully, while RT supports it in skeleton form.<br/><br/><sup>11</sup>Although we will provide an implementation of the <strong>k-blog</strong> process, it will be described sufficiently generically to support complete and independent implementation. 
</span></p>
<p>
 </p>
<p><em>3.2 </em><strong>WP2: References and Metadata</strong><br/><sup>12</sup>For k-blogs to become an integral part of the scientific record, they must fully support the semantic and linked data environment. Although WoP supports standard <strong>URI based linking</strong> to resources, and bidirectional &#8220;trackback&#8221; linking to other resources, it lacks complete functionality suitable for research communities. This is a rare example of functionality that is not already provided by WoP or an associated plugin. Deliverable <strong>WP2.1</strong> will fulfil this need; we will support the insertion of at least <strong>DOI</strong>s and <strong>PubMed ID</strong>s (PMID), that will be resolved to full human-readable reference lists for display, using APIs provided by CrossRef and NCBI eUtils respectively. To fully support computational agents wishing to access the same information, references will also support <strong>COinS</strong> metadata, embedded into the display HTML. 
</p>
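<p><span style="font-family:Arial">As a rough illustration of the COinS part of WP2.1, the sketch below (Python, for illustration only) builds the URL-encoded OpenURL ContextObject string that COinS places in the title attribute of an HTML span with class Z3988. The function name and the choice of fields are our assumptions here, not a description of the plugin&#8217;s implementation.</span></p>

```python
from urllib.parse import urlencode


def coins_title(doi, article_title, journal):
    """Build the key/value string for a COinS span title (hypothetical helper)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),                       # OpenURL version
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),  # journal metadata format
        ("rft_id", "info:doi/" + doi),                    # the DOI as an identifier
        ("rft.atitle", article_title),
        ("rft.jtitle", journal),
    ]
    # urlencode percent-encodes each value and joins the pairs in order.
    return urlencode(fields)
```

<p><span style="font-family:Arial">The resulting string would still need HTML-escaping before being written into the attribute of the span.</span></p>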
<p><span style="font-family:Arial">K-blog posts will also require outward facing metadata, that describe the resources they provide in a standards-compliant manner. The Open Archives Initiative (OAI) provides standards that aim to facilitate the efficient dissemination of content. Specifically, the Object Reuse and Exchange specification (<strong>OAI-ORE</strong>) is a standard for the description and exchange of compound digital objects (such as a WoP post or page). The WordPress OAI-ORE plugin provides link header elements that implement this specification.<br/><br/><sup>13</sup>Our initial investigations into the k-blog process showed that WoP support for versioning and provenance is lacking; the k-blog process involves updating papers after submission but before final acceptance. While WoP stores all these <strong>versions</strong>, these are only currently visible by authors or editors through the administration interface. Whilst existing plugins for WoP already provide some of this functionality, Deliverable<strong> WP2.2</strong> will uncover these to readers, along with a defined permalink scheme for access to all versions, providing full <strong>provenance</strong>. <br/><br/><sup>14</sup>WoP supports <strong>bi-directional</strong> links in the form of trackbacks; this is mediated by XML-RPC calls between resources when a link is made. This will support linking to data where, for example, the data is another <strong>k-blog</strong>; however, general data resources may lack support for this process. Therefore, as Deliverable<strong> WP2.3</strong>, we will provide a trackback proxy, hosted on the http://knowledgeblog.org server, storing and presenting these links for resources that cannot directly process trackbacks.<br/><br/><sup>15</sup>To complete this work package, we will add semantics to the links using CiTO, as Deliverable <strong>WP2.4</strong>. 
Therefore, as well as enabling easier data linking and provenance, we will also enable addition of meaning to these links.
</span></p>
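<p><span style="font-family:Arial">To make the trackback proxy of WP2.3 and the typed links of WP2.4 more concrete, here is a minimal in-memory sketch in Python. It is an illustration under stated assumptions, not the planned implementation: the class <em>TrackbackProxy</em>, its method names and the example URLs are all hypothetical, storage is a plain dictionary rather than a server, and the CiTO namespace used is assumed to be the one in current use.</span></p>

```python
from collections import defaultdict

# Assumed CiTO namespace; relation names such as "cites" are appended to it.
CITO = "http://purl.org/spar/cito/"


class TrackbackProxy:
    """Stores inbound links on behalf of resources that cannot receive
    trackbacks themselves (hypothetical, in-memory only)."""

    def __init__(self):
        # target URL -> list of (source URL, relation URI) pairs
        self._links = defaultdict(list)

    def record(self, source, target, relation="cites"):
        """Remember that `source` links to `target` with a typed relation."""
        self._links[target].append((source, CITO + relation))

    def links_to(self, target):
        """Return a copy of every recorded link pointing at `target`."""
        return list(self._links.get(target, []))


if __name__ == "__main__":
    proxy = TrackbackProxy()
    proxy.record(
        "http://knowledgeblog.org/example-post",  # hypothetical k-blog post
        "http://example.org/dataset/42",          # hypothetical dataset
        "usesDataFrom",
    )
    for source, relation in proxy.links_to("http://example.org/dataset/42"):
        print(source, relation)
```

<p><span style="font-family:Arial">Keeping the relation alongside each stored link means that the same record serves both the bi-directional link display and the added semantics.</span></p>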
<p>
 </p>
<p>
<h2><span style="font-family:Arial">3.3 WP3 &#8211; Specialist Environments</span></h2>
<p><span style="font-family:Arial"><strong><br/><br/></strong><sup>16</sup>The k-blog platform and process are designed to be flexible and adaptable to the needs of specialist environments. We will use three main use cases to ensure <strong>real world</strong> applicability of the software, as well as <strong>fulfilling</strong> the immediate <strong>needs</strong> of these communities.<br/><br/><sup>17</sup>For Deliverable <strong>WP3.1</strong>, we will add features to support the microarray community. Currently, the microarray community is well served in terms of <strong>metadata</strong> capture (MIAME) and <strong>deposition</strong> in public repositories (ArrayExpress, GEO). As part of WP2, we will support <strong>linking</strong> to these datasets through stable URIs. However, these resources deal only with data generation. Post-processing and analysis are largely captured at the publication stage, often in supplementary material.<br/><br/><sup>18</sup>A substantial amount of this analysis uses BioConductor: a widely used, open-source platform for statistical microarray analysis based on the R statistical programming language. We will extend k-blog with <strong>specific support for R</strong> and BioConductor. Authors will be able to directly embed code into k-blog papers, along with the figures that result; reviewers and readers will therefore be able to see a <strong>computationally precise description </strong>of methods and replicate the generation of figures should they choose.<br/><br/><sup>19</sup>Finally, we will investigate the possibility of publication to a k-blog using only R code and references to public databases, in a process similar to Sweave; figures will be generated on the server, providing guarantees of correctness and precise provenance. 
The limited scope of this call means this part of WP3.1 will be proof-of-principle only.<br/><br/><sup>20</sup>For <strong>WP3.2</strong>, we will focus on the <strong>public health community</strong> (PHC): a key workforce in delivering quality and effective healthcare by providing timely and accurate public health intelligence (PHI). PHI is a varied environment performing statistical analyses and producing information figures, diagrams and reports to communicate results to the wider health community. However, the PHC operates in small groups with little knowledge networking. The main aim of the k-blog is to improve the availability of health information, data and knowledge, to inform decisions for health protection and care standards as supported by the Quality Improvement Productivity and Prevention initiative. The NWeHealth <em>e-Lab</em> project, hosted at The University of Manchester, provides an environment to bring together <em>research objects</em> into a single location. As elsewhere, textual data forms the key hub that links together all the other forms of knowledge. By <strong>linking to e-Lab</strong>
			<em>research objects</em> from a k-blog, this link will be made explicit, available, interpretable and directly valuable to the PHC; as a result WP3.2 is synergistic with the rest of the proposal. This community also brings a set of access control requirements. To support these we will use existing WoP facilities, providing a simple, easy-to-use three-level access model.
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>20</sup>For WP3.3, we will generate k-blog <strong>content</strong> about <strong>Taverna</strong> workflows and methods for building them. Workflows have become a popular way of realizing computational analyses and have become an important form of <strong>data</strong>. The <strong>JISC-funded myExperiment</strong> project is widely used to disseminate the workflows themselves. Knowledge about issues surrounding workflows is, however, more difficult to produce and disseminate. A k-blog, with its ability to produce short, targeted articles as the need arises and the resources become available for writing, suits the need for Taverna workflow documentation. We will seek k-blogs on Taverna issues such as the basics of workflow design, how to choose among a set of similar services in producing a workflow, and the testing of workflows. We will implement a lightweight mechanism, using <strong>trackbacks</strong>, to link between the k-blog and myExperiment. 
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>21</sup>As part of <strong>WP3</strong>, we will also hold four workshops, at 3-month intervals, each focusing on one particular k-blog and community. These workshops will take the form previously trialled as part of the Ontogenesis network, and will serve several purposes: requirements gathering and feedback for us, education for the community, and development of content that demonstrates the process to the general readership.  
</span></p>
<p>
 </p>
<p>
<h2><span style="font-family:Arial">3.4 WP4 &#8211; Integration with Existing Working Practices</span></h2>
<p><span style="font-family:Arial"><br/><br/><sup>22</sup>For the k-blog process to be <strong>acceptable</strong> to <strong>communities</strong> such as those described in WP3, it must fit with existing working practices. Researchers mostly write documents using a word processor. Fortunately, as the <strong>k-blog</strong> platform is based on the <strong>widely-used</strong> WoP, which in turn offers a <strong>widely-supported</strong> API, this style of working can be readily integrated. It is already possible to author using Word (2007 onward), OpenOffice, Google Docs and LaTeX using integrated or existing technologies, as demonstrated by our previous work at http://ontogenesis.knowledgeblog.org. For Deliverable <strong>WP4.1</strong>, user-oriented documentation describing these tools will be developed. This documentation will also describe clearly how to present and organise papers in a way that is optimized for the <strong>k-blog</strong> process. While we expect this documentation to take a significant time to produce, as we refine it in response to user feedback, it is important to note that a k-blog is already <strong>useful</strong> and <strong>possible</strong>.
</span></p>
<p style="background: white"><span style="color:black; font-family:Arial">To take maximal advantage of linking technologies developed in WP2, we will need to integrate with existing technologies for referencing. As Deliverable <strong>WP4.2</strong>, we will add tooling to enable the use of bibliographic tools such as EndNote, Mendeley, Zotero or BibTeX to insert references that <strong>k-blog</strong> can directly translate. Largely, this should consist of &#8220;styles&#8221; that modify the in-text citation, as the reference plugin of <strong>WP2.1</strong> will generate reference lists. As with other deliverables, this tooling will include substantial documentation, developed using the <strong>k-blog</strong> process. 
</span></p>
<h2><span style="font-family:Arial; font-size:11pt">4. Project Timeline
</span></h2>
<p style="background: white">
 </p>
<div>
<table style="border-collapse:collapse" border="0">
<colgroup>
<col style="width:91px"/>
<col style="width:90px"/>
<col style="width:94px"/>
<col style="width:66px"/>
<col style="width:302px"/></colgroup>
<tbody valign="top">
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>Name</strong></span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>Start</strong></span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>End</strong></span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>Staff</strong></span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  solid 0.5pt; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial"><strong>Notes</strong></span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">    WP 1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/08/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/10/2010</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 1.1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/08/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">31/08/2010</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">All</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">A documented k-blog process</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 1.2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/09/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/10/2010</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">DS,SC</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Implementation with off-the-shelf software</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">    WP 2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/04/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 2.1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">26/02/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">SC</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">COinS metadata on posts</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 2.2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">29/01/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">SC</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Client-side, permanently linked versions</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 2.3</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">03/01/2011</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">26/02/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">DS</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Bi-directional links to other datasets</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 2.4</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/03/2011</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/04/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">PL</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Semantic linking with CiTO</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">    WP 3</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/07/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 3.1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/07/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">DS</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Specialist environment &#8211; Microarrays</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 3.2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/07/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">GM</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Specialist environment – Healthcare</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 3.3</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">01/11/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/07/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">RS</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Specialist environment – Workflows</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">    WP 4</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/08/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/06/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt"> </td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 4.1</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/08/2010</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/04/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">GM,DS</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Authoring documentation and tools</span></p>
</td>
</tr>
<tr style="height: 20px">
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  solid 0.5pt; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">      WP 4.2</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">02/05/2011</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p style="text-align: right"><span style="font-family:Arial">30/06/2011</span></p>
</td>
<td style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">GM,SC</span></p>
</td>
<td vAlign="bottom" style="padding-left: 7px; padding-right: 7px; border-top:  none; border-left:  none; border-bottom:  solid 0.5pt; border-right:  solid 0.5pt">
<p><span style="font-family:Arial">Referencing documentation and tools</span></p>
</td>
</tr>
</tbody>
</table>
</div>
<p style="background: white">
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5. Project Management Arrangements
</span></h2>
<p><span style="font-family:Arial"><sup>23</sup>The project will be managed from Newcastle University; <strong>primary management</strong> will be by Dr Lord, who will be responsible for:
</span></p>
<ul>
<li>
<div style="background: white"><span style="font-family:Arial">Developing Project Management Plans;
</span></div>
</li>
<li>
<div style="background: white"><span style="font-family:Arial">Ensuring that the Project technical objectives are met;
</span></div>
</li>
<li>
<div style="background: white"><span style="font-family:Arial">Prioritising and reconciling conflicting opportunities;
</span></div>
</li>
<li>
<div style="background: white"><span style="font-family:Arial">Reporting to and collaborating with the JISC Programme Manager;
</span></div>
</li>
<li>
<div style="background: white"><span style="font-family:Arial">Dissemination of the k-blog platform.
</span></div>
</li>
</ul>
<p><span style="font-family:Arial">Project progress will be evaluated through <strong>scheduled</strong>, short, &#8220;<strong>stand-up</strong>&#8221; meetings on a weekly basis, conducted face-to-face, via Skype or by phone as appropriate. Although most project staff are co-located, primary <strong>unscheduled</strong> communication will be via a <strong>public mailing list</strong>, ensuring maximum visibility and openness. <strong>User consultation</strong> will be via the <strong>public mailing list</strong>, as well as through a &#8220;<strong>dogfooding</strong>&#8221; k-blog. All project staff have been handpicked; they are highly experienced and self-directed, as outlined elsewhere. All are associated with several other projects and duties (research, research support, teaching and training), and are responsible for managing these independent workloads.  
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.1 Risks
</span></h2>
<p><span style="font-family:Arial"><sup>24</sup>Staff Risk – as with all projects, loss of staff could negatively impact this project; however, all staff are on permanent contracts and have long histories in research, so this is less likely. Additionally, by dividing the work between five individuals, we limit the risk should a single person leave. 
</span></p>
<p><span style="font-family:Arial">WoP3 and other dependencies – the project depends on other software, most notably WoP, for which a new version (3.0) is now in beta; however, the software is widely supported. Other software is replaceable. 
</span></p>
<p><span style="font-family:Arial">Standards Shifting – the project depends on a number of standards and these may change. In this project, we will <strong>NOT </strong>support standards, but rather use those that support us. Where standards change rapidly, their implementation will be delayed (until they stabilize) or dropped. None of the standards described here is critical to the success of the project. </span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.2 IPR Position
</span></h2>
<p><span style="font-family:Arial"><sup>25</sup>All code will be developed under open source licences. WoP and RT are licensed under the GPL, so code linking to these will be likewise licensed. Code that is separable will be released under the LGPL. Code will remain copyright of the respective institutions or authors. Any documentation produced by project staff relating to the project will be licensed under a Creative Commons Attribution licence. Licensing of individual k-blogs will be delegated, but permissive licences will be encouraged. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.3 Sustainability
</span></h2>
<p><span style="font-family:Arial"><sup>26</sup>This project is largely based around innovative, novel and leading <strong>use of existing</strong> software. As such, the sustainability of the majority of the technology base is dependent not on project members but on large companies with established and proven business models. The <strong>k-blog</strong> process will be cleanly separated from its implementation, ensuring only weak dependencies on underlying software. Where we produce software &#8220;glue&#8221;, public and widely supported APIs will be used where possible. This will ensure that components are replaceable. All code, including historical versions, will be publicly available. Documents produced by project staff will be publicly available and clearly licensed, so will be archived through internet &#8220;cloud&#8221; resources; we are also seeking explicit support for archiving from the British Library. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.4 Staff Recruitment
</span></h2>
<p><span style="font-family:Arial"><sup>27</sup>All staff are already in post. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.5 Key Beneficiaries
</span></h2>
<p><span style="font-family:Arial"><sup>28</sup>Our key beneficiaries are the <strong>public health</strong>, <strong>microarray</strong> and <strong>workflow</strong> communities; as the k-blog process is based around commodity software, these groups can use the <strong>basic </strong>environment from the first day of the project to generate and share content. As the project progresses, so will the process, the software to support it and the documentation to explain it; at all stages, the k-blog process fulfils a <strong>clear and immediate need</strong>. While we are specifically targeting these communities, the k-blog process and platform are sufficiently <strong>generic</strong> that they can support a <strong>wide range</strong> of research activities.
</span></p>
<p><span style="font-family:Arial">Although presented here as a single platform, the process and components are <strong>separable</strong> and can benefit communities independently. In particular, the tools and documentation from WP2 and WP4 will find use within the research blogging community, who find the lack of tooling for referencing especially difficult. Finally, the statement of a peer-review process, and its implementation within RT, will be applicable to any peer-review environment regardless of the form of publication. This includes publications published using wiki or other Content Management Systems. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">5.6 Engagement with Community
</span></h2>
<p><span style="font-family:Arial"><sup>29</sup>We consider the mechanism for engagement with four kinds of community: engagement with our core <strong>content generating</strong> community is an intrinsic part of this proposal, as described in <strong>WP3</strong>.  Further interaction with more disparate groups will be maintained through personal contacts; each of the five individuals named in this proposal is experienced and embedded in different communities (health care, microarray, ontology, proteomics). Engagement with our core <strong>content consuming</strong> community is, again, an intrinsic part of the proposal; all project communications will be via open mailing list or k-blog. Project members are active users of Web 2.0 social technologies; our initial trials as part of Ontogenesis showed this approach to be a highly effective form of dissemination, with minimal effort. Engagement with <strong>software users</strong> will be via website and direct interaction. All software will be released or advertised via normal channels (website, versioning, and mailing list), including a (debian) package repository for those wishing to set up their own server.  Finally, <strong>developer communities</strong> will not be specifically targeted, but our open source, continually integrated development plan will be attractive, and we will accept suitably licensed contributions.  
</span></p>
<p><span style="font-family:Arial"><sup>30</sup>All communities will benefit from the open and agile development methodology we will adopt; changes to the environment will be integrated and released rapidly, ensuring continual improvement and facilitating rapid feedback cycles. 
</span></p>
<p>
 </p>
<h2><span style="font-family:Arial; font-size:11pt">6. Previous Experience and Project Team
</span></h2>
<p>
 </p>
<p><span style="font-family:Arial"><sup>31</sup><strong>Dr. Phillip Lord</strong> is a Lecturer of Computing Science at Newcastle University. He has a PhD in yeast genetics from University of Edinburgh, after which he moved into bioinformatics. He is well known for his work on ontologies in biology, as well as his contributions to eScience beginning with his role as an RA on the myGrid project. Since his move to Newcastle, he has been an investigator on three more eScience projects: CARMEN, ONDEX and InstantSOAP, as well as maintaining an active engagement in standards development (OBI, MIGS, MIBBI), and publishing on the fundamentals of ontology design. He was an active participant in the Ontogenesis network, and developed the initial idea for knowledge blogs as part of this. He is an active blogger and developer.
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>32</sup><strong>Dr. Georgina Moulton</strong> is an Education and Development Fellow at The University of Manchester.  Since 2005 her main roles have been to co-ordinate the development and delivery of multi-disciplinary bio/health informatics education programmes; and to facilitate the engagement of biological and health communities in a variety of bio and health informatics research projects (<em>e.g.,</em> ONDEX, Obesity e-Lab).  For 3 years, Georgina was the EPSRC funded Ontogenesis Network Manager, in which role she co-ordinated the activities of the network, expanded the network by facilitating the development of new activities, and was involved in the trial k-blog process.  More recently her work includes the development and delivery, in conjunction with NHS partners, of an education and development programme tailored to match the needs of North West public health analysts and the wider healthcare workforce.  
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>33</sup><strong>Dr. Daniel Swan</strong> has a PhD in developmental biology and continued to work in developmental biology as a post-doctoral researcher before moving into bioinformatics in 2001.  Subsequent positions included working for Bart&#8217;s and the London Genome Centre and the Centre for Hydrology and Ecology in informatics-driven roles dealing with large, distributed biological datasets generated by large user communities.  Currently the manager of the Newcastle University Bioinformatics Support Unit, he leads a small team helping biological researchers to generate, capture, store and analyse their digital data.  His interdisciplinary background means he has grounding in both computer and biological sciences and is comfortable working on CS-focused projects (CARMEN, InstantSOAP, Bio-Linux) as well as acting in a research capacity analysing high-throughput data. 
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>34</sup><strong>Dr. Simon Cockell</strong> has a PhD in Genetics from Leicester University, and refocussed into Bioinformatics with a Masters degree from Leeds in 2005. From there he moved to Newcastle, and the Bioinformatics Support Unit. Since coming to Newcastle, Simon has worked on a range of projects involving large scale analyses (AptaMEMS-ID), data integration (Ondex) and health informatics (MRC Mitochondrial Disease Cohort). 
</span></p>
<p>
 </p>
<p><span style="font-family:Arial"><sup>35</sup><strong>Dr. Robert Stevens </strong>is a senior lecturer in Bioinformatics in the Bio and Health Informatics group at the University of Manchester. His main areas of research are in the development and use of semantics within the life sciences. This is blended with the use of e-Science platforms to gather and manage the data and knowledge of the life sciences. He was PI on the Ontogenesis network that ran the meetings for the first k-blog. He is or has been a co-investigator on the myGrid and myExperiment grants that will provide both content and technical input to this project. As well as the JISC funded myExperiment project, Stevens was an investigator on the JISC funded CO-ODE project that developed Protégé 4. On the back of this, Stevens has led the OWL training activities at Manchester that have directly fed into the Ontogenesis k-blog. This range of experience makes Stevens an ideal partner to lead the development of content within this project.
</span></p>
<p><span style="font-family:Arial">
		</span> </p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1729 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/08/a-new-grant-for-knowledgeblog/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Deleted My Networking</title>
		<link>http://www.russet.org.uk/blog/2010/07/deleted-my-networking/</link>
		<comments>http://www.russet.org.uk/blog/2010/07/deleted-my-networking/#comments</comments>
		<pubDate>Tue, 27 Jul 2010 10:52:20 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1709</guid>
		<description><![CDATA[While travelling on Elba, I suffered the misfortune of a virus attack; I don&#8217;t use AV software these days, since it tends to break other things which take a long time to fix, and it&#8217;s been many years since I&#8217;ve lost a machine to malicious software. The process, though, was quite entertaining. First, I started [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1709">
<p>While travelling on <a href="http://www.russet.org.uk/blog/2010/07/elba/">Elba</a>, I suffered the misfortune of a virus attack; I don&#8217;t use AV software these days, since it tends to break other things which take a long time to fix, and it&#8217;s been many years since I&#8217;ve lost a machine to malicious software.</p>
<p>The process, though, was quite entertaining. First, I started getting an error stating that system.exe needed .net to run properly. After a while, a Windows update happened, along with the normal malicious software removal update. This found the virus, probably killed it, then stuck up a dialog saying &#8220;Some of your files were nasty, so they need to be restored, please insert your Windows SP3 disk&#8221;. Clicking &#8220;ok&#8221; said &#8220;I can&#8217;t find the disk, perhaps a) you put the wrong disk in or b) your drive isn&#8217;t working&#8221;. Or c) you are on holiday, and your disk is 1000 miles away, and anyway, the machine is old enough to have come with SP2. All of which rather raises the question of why the software that I&#8217;d just downloaded from Microsoft can&#8217;t also download from Microsoft the system components to replace the ones that it has deleted.</p>
<p>After the reboot, all trace of networking software had been blitzed from the machine; I couldn&#8217;t even use loopback addresses. In the end, I&#8217;ve done a complete factory reset from the recovery partition which I thought I had deleted years ago. The process took about 15 minutes to recover windows, 1 hour to recover the sony application layer, 2 hours to remove all the sony application layer (one application at a time, including the 10 different wallpaper packages, because add/remove programs doesn&#8217;t allow multiple select), except for the power management tweaks and drivers, then another hour trying to figure out NTFS file permissions so that I could read my files.</p>
<p>Actually, the process hasn&#8217;t been a complete loss; I was thinking of re-installing the OS anyway. The boot had got to around 3 to 4 minutes which was getting daft. Now, with a clean OS, a complete reboot takes under a minute. It&#8217;s also been a bit of a walk down memory lane; currently, I have no internet, so my computer is in 2005 state; there is Office 2003 trial, a rubbishy media centre thing Sony probably wrote as an answer to iTunes, and macromedia flash. I had an emacs install exe in my recycle bin which I managed to recover before the reset so, ironically, Emacs is the newest piece of software I have on here.</p>
<p>Still, this administrative nightmare makes me wonder what to do next; XP is not long for this world, vista is a poll tax on wheels, and I am just not sure I can be bothered with learning 7. I&#8217;ve used windows on the road for a long time, but I think I may go for a small, light netbook running linux. There have been a couple of times when I have needed MS Office, but it&#8217;s not that common, and there is always a work-around.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1709 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2010/07/deleted-my-networking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OBO Format and Manchester Syntax</title>
		<link>http://www.russet.org.uk/blog/2009/09/obo-format-and-manchester-syntax/</link>
		<comments>http://www.russet.org.uk/blog/2009/09/obo-format-and-manchester-syntax/#comments</comments>
		<pubDate>Thu, 10 Sep 2009 21:42:16 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1470</guid>
		<description><![CDATA[At Neuroinformatics 2009, David Sutherland and I talked about the problems of ontology building. One of the current (and past!) difficulties is to choose an appropriate language for representing the knowledge in your ontology. I thought I would write my thoughts up as a post; this will probably result in the most boring thing I [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1470">
<p>At Neuroinformatics 2009, David Sutherland and I talked about the problems of ontology building. One of the current (and past!) difficulties is to choose an appropriate language for representing the knowledge in your ontology. I thought I would write my thoughts up as a post; this will probably result in the most boring thing I have ever written (I am sure someone will point out worse offenses); syntax is dull but distressingly important.</p>
<p>In bioinformatics, there are essentially two choices: OWL and OBO (format). A second issue is finding a good environment for developing the ontology; this divides between Protege, OBO-Edit and the ever-present &#8220;text editor&#8221;. It&#8217;s often the case that we want to use both of these at the same time. Take, for example, OBI, which I am involved in. While the ontology itself is being developed in OWL, many of its dependent ontologies are built using OBO; being purist and demanding one is really not an option. OWL itself has many different syntaxes; at the moment, I generally prefer Manchester syntax because you can edit it with a text editor, which is really not so easy with any of the XML representations.</p>
<p>While these two languages have somewhat different expressivity, how to translate both the syntax and the semantics between them has been described elsewhere. One of the recurrent problems, however, stems from the best practices and the syntax of identifiers.</p>
<p>OBO makes use of a numerical, semantics-free identifier and a namespace, with a syntax of <tt>NAMESPACE:IDENTIFIER</tt>. So, a Gene Ontology term looks like <tt>GO:0003674</tt>. The namespace is not constrained to be two letters and has mechanisms for world-uniqueness, in that people talk to each other and sort it out if they clash. The use of a semantics-free identifier means that term names can be changed while maintaining the implied meaning of the term; the label for the term, meanwhile, provides a human-readable version, which can be shown to users of the ontology. I will call these the OBO identifier and OBO label respectively.</p>
<p>Translating this into OWL, including Manchester syntax, however, causes significant problems. The naturalistic translation is to turn the OBO identifier into the identifier in OWL; the OBO namespace would become an XML namespace, the OBO identifier would become an XML identifier. Unfortunately, this doesn&#8217;t work. First, the OBO identifier is genuinely just a short string and XML requires a URI; so a mapping between OBO identifiers and URIs is necessary. Second, the OBO identifier is numerical; unfortunately, while the identifiers in OWL can contain numbers, they have to start with a non-numerical character. The standard translation, therefore, uses in most cases an OBO-wide URL (<a href="http://purl.obolibrary.org/obo/">http://purl.obolibrary.org/obo/</a>), although some ontologies have their own namespace (GO uses <a href="http://purl.org/obo/owl/GO#">http://purl.org/obo/owl/GO#</a>). The OBO identifier is mapped to a valid identifier by sticking a prefix onto the numbers. So, we have identifiers such as <tt>GO:GO_0042101</tt> or <tt>obo:OBI_1110045</tt>. There are also some OBO ontologies for which this does NOT occur; for instance, BFO classes in OBI come out with identifiers of the form <tt>snap:Continuant</tt> or <tt>span:Process</tt>, except for one which is <tt>bfo:Entity</tt>.</p>
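<p>The identifier mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name <tt>obo_to_owl</tt> and its signature are hypothetical, and the prefixing rule is inferred from the examples in this post rather than taken from any particular converter.</p>
<pre>
```python
# Sketch only: the default base URI and the underscore rule come from the
# examples in the post; the function name and signature are hypothetical.
OBO_BASE = "http://purl.obolibrary.org/obo/"


def obo_to_owl(obo_id, base=OBO_BASE):
    """Map an OBO identifier such as 'OBI:1110045' to a (URI, local name) pair."""
    namespace, sep, number = obo_id.partition(":")
    if not sep or not namespace or not number:
        raise ValueError("expected NAMESPACE:IDENTIFIER, got %r" % obo_id)
    # Prefixing the namespace gives a local name that starts with a letter.
    local = namespace + "_" + number
    return base + local, local


print(obo_to_owl("OBI:1110045"))
print(obo_to_owl("GO:0042101", base="http://purl.org/obo/owl/GO#"))
```
</pre>
<p>The second call passes the GO-specific namespace explicitly, which mirrors the fact that not every ontology uses the OBO-wide URL.</p>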
<p>Again, all perfectly reasonable, but unfortunately, when converted to Manchester syntax it means that we end up with classes that look like this slightly elided class from OBI:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: obo:OBI_1110161

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font>@en,

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661,
        obo:OBI_0000299 some (obo:IAO_0000109
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 some obo:OBI_1110196))
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>which completely defeats the aim of a human-readable syntax. Now OBO format has much the same problem; relationships to other classes are specified using cross-references to their identifiers which are, essentially, unreadable. OBO format works around this with a denormalisation as can be seen from this somewhat elided example from IAO:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt>[Term]
id: IAO:0000027
name: data item
def:<font color="#FF0000">"a data item is an information content entity that is intended...."</font>
is_a: IAO:0000030 ! information content entity
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>The cross reference in this case is a subsumption link to <tt>IAO:0000030</tt>.</p>
<p>One solution would be to use the <tt>rdfs:label</tt> in place of the identifier. So, we would have something that looked like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font> @en

    <font color="#990000">Annotations:</font>
        obo:identifier <font color="#FF0000">"1110161"</font>

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661,
        obo:OBI_0000299 some (obo:IAO_0000109
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 some obo:OBI_1110196))

</tt></pre>
</pre>
</td>
</tr>
</table>
<p>Other identifiers would have to be changed as well. I&#8217;ve also added the <tt>obo:identifier</tt> line (which I think would be valid, but might require the creation of an OWL individual). Without this, it would not be possible to go backward.</p>
<p>However, this is problematic as it changes the serialization between the OWL Manchester syntax and other syntaxes of OWL. The class identifier has to be URI legal, and the OBO label here is not. We could do a syntactic conversion (e.g. <tt>T%20cell%20epitope</tt>) but this, again, reduces readability, defeating the point. Also, the <tt>rdfs:label</tt> would become part of the final identifier URI, which then becomes a semantics-heavy identifier. Finally, it would require an OBO-specific loading of the Manchester syntax, taking the URI identifier from the annotation block, and the <tt>rdfs:label</tt> from the class name.</p>
<p>So, is there any solution? First, there are tooling solutions. In Protege, it is already possible to use any component of the definition in the display. So, you can set the <tt>rdfs:label</tt> as the main display form. Tooling solutions are attractive, but there is a problem; you have to extend all tools to support this view; I realise that the number of freaks who wish to edit OWL with emacs is not that large, so this might not seem an issue. However, many people wish to develop ontologies collaboratively using version control; if you want to compare versions you use diff, so we now need a Manchester syntax diff viewer. Also, if you want to do some perl hacking, or straight-forward search and replace, again, it&#8217;s all harder.</p>
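<p>To see what the syntactic conversion does to a full label, here is a small sketch using Python&#8217;s standard <tt>urllib.parse.quote</tt>. The helper name <tt>label_to_uri</tt> is hypothetical, and using the OBO-wide URL as the base is an assumption made for illustration.</p>
<pre>
```python
from urllib.parse import quote

OBO_BASE = "http://purl.obolibrary.org/obo/"  # base URI taken from the post


def label_to_uri(label, base=OBO_BASE):
    """Hypothetical helper: build a URI whose local part is the escaped label."""
    # safe="" escapes every reserved character, including "/".
    return base + quote(label, safe="")


print(label_to_uri("T cell epitope ELISA IL-1b assay"))
# prints http://purl.obolibrary.org/obo/T%20cell%20epitope%20ELISA%20IL-1b%20assay
```
</pre>
<p>The result is a legal URI, but hardly more pleasant to read than the numeric identifier it replaces.</p>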
<p>To some extent this might seem trivial, but then the entire purpose of Manchester syntax (and the functional syntax) is to have an easy to read and manipulate syntax which the XML version of OWL is not. This purpose is defeated if it&#8217;s hard to read.</p>
<p>So, a second non-tooling solution. The obvious answer is to take the OBO approach and add comments. Now, the Manchester syntax includes a comment character (#), although last time I tried, the Protege parser didn&#8217;t implement this. None the less, it allows this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: obo:OBI_1110161 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font>@en,

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661,
        obo:OBI_0000299 some (obo:IAO_0000109
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 some obo:OBI_1110196))
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>This is not too bad, but it doesn&#8217;t work well for complex class expressions. I can&#8217;t be bothered to look up the labels and have reused one, but you get something like:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: obo:OBI_1110161 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en,</font></i>

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font>@en,

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661, <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
        obo:OBI_0000299 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
        some (obo:IAO_0000109 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
             some obo:OBI_11101 <i><font color="#9A1900">#"T cell epitope ELISA IL-1b assay"@en</font></i>
             ))
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>This has three problems. Firstly, we have used comments &#8220;meaningfully&#8221;, yet we can&#8217;t distinguish between these comments and other normal comments. Secondly, we have had to reformat the output because we have only a &#8220;to-end-of-line&#8221; comment character. Thirdly, it looks horrible.</p>
<p>So, my minimal solution would be this; we introduce some new comment characters, which are treated as comments normally, but which carry enough semantics to allow a warning when they are wrong; rather like Javadoc, which is a comment wrt the language, but is structured and meaningful wrt the documentation. Tooling could be used to check that the labels masquerading as comments are correct wrt the identifiers.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Class</font></b>: obo:OBI_1110161 [T cell epitope ELISA IL-1b assay],

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"T cell epitope ELISA IL-1b assay"</font>@en,

    <font color="#990000">SubClassOf:</font>
        obo:OBI_0000661 [blah],
        obo:OBI_0000299 [longer blah]
        some (obo:IAO_0000109 [more]
        <b><font color="#000080">and</font></b> (obo:IAO_0000136 [stuff]
        some obo:OBI_11101 [OBI Thing]
        ))
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>This is still not ideal; it would require extension to Manchester syntax, but it&#8217;s minimal, and it does support the semantics-free identifiers in OBO in a way which does not require extensive tooling. It&#8217;s worth reiterating here that OBO&#8217;s semantics-free identifiers are a good thing; so, supporting them supports other people who may wish to do the same, sensible thing. It does have the disadvantage of duplicating information, but at least in a way that is checkable.</p>
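<p>The kind of check this would allow can be sketched roughly as follows. The bracket form is only the proposal made in this post, not part of Manchester syntax; the function <tt>check_labels</tt>, the regular expression and the label table are all hypothetical, and a real tool would read the expected labels from the ontology itself.</p>
<pre>
```python
import re

# Matches 'prefix:Local_name [label]'; the bracket form is the post's proposal,
# not part of the Manchester syntax specification.
LABELLED_ID = re.compile(r"(\w+:\w+)\s*\[([^\]]*)\]")


def check_labels(text, labels):
    """Return warnings for bracketed labels that disagree with `labels`.

    `labels` maps identifiers to their expected rdfs:label; in practice it
    would be read from the ontology itself.
    """
    warnings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for identifier, label in LABELLED_ID.findall(line):
            expected = labels.get(identifier)
            if expected is None:
                warnings.append("line %d: no known label for %s" % (number, identifier))
            elif expected != label:
                warnings.append(
                    "line %d: %s is labelled [%s], expected [%s]"
                    % (number, identifier, label, expected)
                )
    return warnings


example = """Class: obo:OBI_1110161 [T cell epitope ELISA IL-1b assay]
    SubClassOf:
        obo:OBI_0000661 [blah]
"""
known = {
    "obo:OBI_1110161": "T cell epitope ELISA IL-1b assay",
    "obo:OBI_0000661": "some other label",  # made-up label for illustration
}
for warning in check_labels(example, known):
    print(warning)
```
</pre>
<p>Because the labels are only checked, never trusted, a stale label produces a warning rather than a change in meaning; the identifier remains the source of truth.</p>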
<p>Comments welcome!</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1470 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/09/obo-format-and-manchester-syntax/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Modification in the Future</title>
		<link>http://www.russet.org.uk/blog/2009/09/modification-in-the-future/</link>
		<comments>http://www.russet.org.uk/blog/2009/09/modification-in-the-future/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 14:53:53 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1466</guid>
		<description><![CDATA[Make has been driving me mad for the last week. It keeps on complaining about &#8220;modification time in the future&#8221;. Normally, this happens because you&#8217;re using remote files from a server which doesn&#8217;t have sync&#8217;d time. But this is rare these days. Anyway, it was complaining that the file was 10E+06 seconds in the future; [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1466">
<p><a href="http://www.gnu.org/software">Make</a> has been driving me mad for the last week. It keeps on complaining about &#8220;modification time in the future&#8221;. Normally, this happens because you&#8217;re using remote files from a server which doesn&#8217;t have sync&#8217;d time. But this is rare these days. Anyway, it was complaining that the file was 10E+06 seconds in the future; that&#8217;s a really, really big clock skew.</p>
<p>Did a bit of poking around. One possibility I found was that it was due to a limitation in FAT32; hmmm, not likely. Didn&#8217;t have time for more. I am at a conference; supposed to be paying some attention.</p>
<p>Anyway, the solution came to me today. Or rather the cause, because the solution was obvious. Turns out, when I changed timezone to Czech, I pushed the month back to August. What I don&#8217;t understand is that I was sure windows synced to an NTP server running somewhere. What does it do when you change the month?</p>
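<p>For tracking down which files trigger the warning, something like the following Python sketch may help. It is not what Make does internally; the function name <tt>future_files</tt> and the one-second tolerance are assumptions made for illustration.</p>
<pre>
```python
import os
import time


def future_files(root, now=None, tolerance=1.0):
    """Yield (path, seconds_ahead) for files modified later than `now`."""
    now = time.time() if now is None else now
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(directory, name)
            try:
                ahead = os.path.getmtime(path) - now
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if ahead > tolerance:
                yield path, ahead


if __name__ == "__main__":
    for path, ahead in future_files("."):
        print("%s is %.0f seconds in the future" % (path, ahead))
```
</pre>
<p>Once the clock is right again, resetting the modification time of the listed files (for example with <tt>touch</tt>) should stop the complaint.</p>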
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1466 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/09/modification-in-the-future/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Installing Vista</title>
		<link>http://www.russet.org.uk/blog/2009/08/installing-vista/</link>
		<comments>http://www.russet.org.uk/blog/2009/08/installing-vista/#comments</comments>
		<pubDate>Thu, 27 Aug 2009 16:02:50 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1426</guid>
		<description><![CDATA[This year, our clusters are going to be moved over to Vista, so I&#8217;ve decided to downgrade my windows box from XP to vista. It&#8217;s been an inevitable fun-filled afternoon as a result. Tried a remote installation to save the effort of finding disks. Unfortunately, we tried an installation which booted into Windows 7, and [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1426">
<p>This year, our clusters are going to be moved over to Vista, so I&#8217;ve decided to downgrade my windows box from XP to vista. It&#8217;s been an inevitable fun-filled afternoon as a result.</p>
<p>Tried a remote installation to save the effort of finding disks. Unfortunately, we tried an installation which booted into Windows 7, and then allowed you to install vista from there; this results in a mysterious 100M partition for use with bitlocker; vista doesn&#8217;t know about this, so mounted it as D drive and, as it&#8217;s marked as a system partition, you can&#8217;t change this. Three installations later, it was gone, and Vista is installed.</p>
<p>Next up, install synergy. Turns out that this is hosed because of UAC &#8212; the Vista access control. <a href="http://www.howtogeek.com/howto/windows-vista/fixing-problems-with-synergy-on-windows-vista/">How to Geek</a> was very helpful, although their technique doesn&#8217;t completely work. I have some ideas, but basically, had to turn off all UAC elevation dialogs (as synergy doesn&#8217;t work then, which rather defeats the point), and I have to start it by hand every login. At this juncture, a hardware KVM seems an option, but it&#8217;s clunky in comparison to synergy.</p>
<p>Cygwin installation has been okay, except for some mysterious &#8220;Program Compatibility&#8221; dialog which tells me that I have done things wrong and offers to make my life better. Next up is the problem of getting security permissions on my files on D, which think that they are owned by another user (from my old OS). Normal Windows problem: I can&#8217;t get the permissions set up, or percolating downward, whatever I do.</p>
<p>(At this point, a friend popped in and said, &#8220;Why don&#8217;t you install Windows 7 instead&#8221;. Not the first to ask).</p>
<p>Think I now have the security permissions set, although it&#8217;s going to take about 2 hours to find out for sure, as it traverses my file system. Cygwin appears to have another strange problem where a bash window doesn&#8217;t respond to a click&#8212;if you want to move it from the front, you have to use the taskbar.</p>
<p>Emacs, skype, miktex all seem to have installed okay; neither webcam nor sound drivers worked in the default installation, but vista did manage to find them, so no complaints there really. I&#8217;ve also found one major advantage; when you switch the irritating desktop sounds off, windows no longer asks you whether you want to save the old scheme (yes, being the default); well worth the billions of dollars spent on vista. The machine balked after all these installs, with explorer up to 100% CPU. A restart has solved that.</p>
<p>Installing cygwin sshd was a bit hard; the trick is to run ssh-host-config in a <tt>cygwin.bat</tt> run as administrator. It all works fine then, except for the bit where you try to ssh in to the machine. Then you always get <tt>Connection Closed</tt>. Giving up for now.</p>
<p>How would I have got this far without the wondrous <a href="http://www.cs.ncl.ac.uk/people/gerry.tomlinson">Gerry Tomlinson</a> to help me out? No idea.</p>
<p>On the flip side, though, I was interested to see that one of my own great ideas, first expounded in my work on <a href="http://homepages.cs.ncl.ac.uk/phillip.lord/wiki/energy/GeneratingSewageSystem.html">Generating Sewage Systems</a>, has been taken up by the Institution of Mechanical Engineers, in a report which has even got as far as the <a href="http://news.bbc.co.uk/1/hi/sci/tech/8223528.stm">BBC</a>. Yep, algae reactors down the side of buildings. It&#8217;s the way forward.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1426 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/08/installing-vista/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Avoiding the Upgrade Problem</title>
		<link>http://www.russet.org.uk/blog/2009/08/avoiding-the-upgrade-problem/</link>
		<comments>http://www.russet.org.uk/blog/2009/08/avoiding-the-upgrade-problem/#comments</comments>
		<pubDate>Wed, 12 Aug 2009 11:06:35 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1413</guid>
		<description><![CDATA[I&#8217;ve generally been reasonably impressed with wordpress since I moved to it from my old, emacs-driven system. It seems to work mostly and it&#8217;s reasonably easy to manage. One problem has been the regularity of the updates; worse, they all tend to be security updates (2.8.4 was to correct a problem where a crafted URI [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1413">
<p>I&#8217;ve generally been reasonably impressed with wordpress since I moved to it from my old, emacs-driven system. It seems to work mostly and it&#8217;s reasonably easy to manage.</p>
<p>One problem has been the regularity of the updates; worse, they all tend to be security updates (2.8.4 was to correct a problem where a crafted URI allowed overwrite of the admin password). So, you have to update. Often.</p>
<p>Fortunately, wordpress provides an automatic mechanism for achieving this. Less fortunately, it doesn&#8217;t work for me. We&#8217;ve finally pinned down why, which is too tedious to explain, but I don&#8217;t like the mechanism anyway, as I have to give wordpress my username/password (for the command line, not for wordpress).</p>
<p>So, I&#8217;m trying another solution. Check the whole thing out of <a href="http://codex.wordpress.org/Installing/Updating_WordPress_with_Subversion">SVN</a>. I&#8217;ve just moved over to this mechanism for the 2.8.4 upgrade and it seems to work. This is actually the same amount of effort as a regular manual upgrade; you just <tt>svn co</tt> rather than <tt>wget/unzip</tt>. In future, it should be much easier, though. Just a simple <tt>svn switch</tt>. No fiddling with moving <tt>wp-config</tt> across, and <tt>wp-content</tt> should be unaffected. Even better, the one hack that I have had to apply to <tt>formatting.php</tt> every time should be automatically merged in, or will conflict &#8212; in which case, it will be good to be warned.</p>
<p>I&#8217;ll post again in a few updates&#8217; time if it all works; if this blog suddenly goes offline, well, probably this wasn&#8217;t such a good idea after all.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1413 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/08/avoiding-the-upgrade-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Categories</title>
		<link>http://www.russet.org.uk/blog/2009/07/new-categories/</link>
		<comments>http://www.russet.org.uk/blog/2009/07/new-categories/#comments</comments>
		<pubDate>Tue, 21 Jul 2009 16:40:10 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Professional]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1380</guid>
		<description><![CDATA[Following my holiday, I&#8217;ve decided to create two new categories for my blog, one for all my professional pieces and one for my personal. This blog fulfils two main purposes. Firstly, it serves as a memory aid for myself; I can look back at the things and the ideas that I&#8217;ve had in the past. [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1380">
<p>Following my holiday, I&#8217;ve decided to create two new categories for my blog, one for all my professional pieces and one for my personal.</p>
<p>This blog fulfils two main purposes. Firstly, it serves as a memory aid for myself; I can look back at the things and the ideas that I&#8217;ve had in the past. Secondly, I use it to publish these ideas. I&#8217;m aware that the former is more important than the latter; like most blogs, this site does not get heavy traffic.</p>
<p>I do publish about my personal life here, but this is not a full disclosure blog; it&#8217;s called &#8220;an Exercise in Irrelevance&#8221; for exactly this reason. I put occasional reviews of things up; places I&#8217;ve visited or music that I&#8217;ve listened to. All about my reactions to public events. This blog isn&#8217;t meant to be a soap opera.</p>
<p>I also publish posts about my work here. I think, over time, these will become more important; recently, I&#8217;ve been using the blog as a <a href="http://www.russet.org.uk/blog/2009/06/introducing-omnencap/">lab book</a> but I think it will also start to become a more formal <a href="http://www.russet.org.uk/blog/2009/06/publication-by-blog/">publication route</a>.</p>
<p>Given this, I think it makes sense to separate the two strands, to enable the few subscribers that I have to choose whether to read about my life outside science or not. <a href="http://www.russet.org.uk/blog/category/all/personal/">Personal</a>, <a href="http://www.russet.org.uk/blog/category/all/professional/">Professional</a> or <a href="http://www.russet.org.uk/blog/category/all/">Everything</a>, the choice is yours.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1380 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/07/new-categories/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Blogpost Fiddling</title>
		<link>http://www.russet.org.uk/blog/2009/07/blogpost-fiddling/</link>
		<comments>http://www.russet.org.uk/blog/2009/07/blogpost-fiddling/#comments</comments>
		<pubDate>Thu, 02 Jul 2009 20:29:41 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1312</guid>
		<description><![CDATA[I think I now have my blogging environment as I want it. I&#8217;ve been using blogpost.py to do my posting. I couldn&#8217;t let go of my text only environment. I don&#8217;t care if it&#8217;s old fashioned, but I like the separation of editing and viewing. In this case, I&#8217;ve even had to learn asciidoc, but [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1312">
<p>I think I now have my blogging environment as I want it. I&#8217;ve been using <tt>blogpost.py</tt> to do my posting. I couldn&#8217;t let go of my text only environment. I don&#8217;t care if it&#8217;s old fashioned, but I like the separation of editing and viewing. In this case, I&#8217;ve even had to learn <tt>asciidoc</tt>, but it was worth the effort.</p>
<p>Today, I think I have fiddled with <tt>blogpost.py</tt> for the last time. I can now set both categories and status (published or unpublished) from within the blogfile. I&#8217;d added a <tt>post</tt> command previously; originally, <tt>blogpost</tt> had a create command and an update command.</p>
<p>The big advantage with this is that all the information about the blog is apparent from the file; this means I can use a single make file to compile the lot. Any changes that I make while on the road will automatically publish to the web when I get online again. I can even put a catch-up in my backfile to make sure everything is up-to-date.</p>
<p>Okay, so I am sad; so sue me.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1312 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/07/blogpost-fiddling/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Publication by Blog</title>
		<link>http://www.russet.org.uk/blog/2009/06/publication-by-blog/</link>
		<comments>http://www.russet.org.uk/blog/2009/06/publication-by-blog/#comments</comments>
		<pubDate>Fri, 26 Jun 2009 23:51:31 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1289</guid>
		<description><![CDATA[Blogs are generally seen as a slightly dubious part of the scientific publishing landscape. This is not, of course, unreasonable. I put stuff up here, for example, such as my idea for IDs that I&#8217;ve thought about for a few days, but that I am unlikely to follow any further, or opinion pieces on [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1289">
<p>Blogs are generally seen as a slightly dubious part of the scientific publishing landscape. This is not, of course, unreasonable. I put stuff up here, for example, such as my idea for <a href="http://www.russet.org.uk/blog/2009/05/identifiers_for_science/">IDs</a> that I&#8217;ve thought about for a few days, but that I am unlikely to follow any further, or opinion pieces on <a href="http://www.russet.org.uk/blog/2009/05/the-comparative-advantage-of-bees/">bees</a> about which I have as little expertise as the average journalist.</p>
<p>Fundamentally, though, despite its current use, a blog is just a media channel; you can use it to transfer anything you like. A scientific paper, for instance. This might be useful. While, for instance, I love open access publication, it&#8217;s quite expensive, particularly as the cash tends to come out of my own budget, at least until I can get the library to pay.</p>
<p>So, I&#8217;ve been thinking about a cheap and cheerful blog-based system. It would work like this. The author would simply publish their paper onto their own blog. Next, they would send a request (using one of these pingback or trackback thingies that I haven&#8217;t worked out yet) to a &#8220;journal&#8221; which would also be a blog, in this case a private one. The editor would then invite comments from willing reviewers using the same technique. Reviewers could then read the blog post and comment on it using their own blog. After the normal revision cycle, the editor would make a decision. If it was accepted, the author&#8217;s blog post would be linked from the journal&#8217;s main feed (probably grabbing an archival copy at the same time). If it was not accepted, the author could try another journal, this time with initial reviews in hand; the process would not need to be repeated.</p>
<p>This would have several advantages over the current system. Formatting and presentational problems would disappear because they would be controlled by the authors. Prepublication would become unnecessary, because submission and publication would become the same thing. The role of the journal would be limited to what they are best at; getting reviewers in and rubber stamping a seal of approval on worthy papers. Finally, the tireless work of reviewers would be publicly acknowledged; their own blogs would have a record of every review that they have ever done.</p>
<p>All the technology for this already exists; it just needs some social conventions layering on top.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1289 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/06/publication-by-blog/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Introducing Omnencap</title>
		<link>http://www.russet.org.uk/blog/2009/06/introducing-omnencap/</link>
		<comments>http://www.russet.org.uk/blog/2009/06/introducing-omnencap/#comments</comments>
		<pubDate>Fri, 19 Jun 2009 20:27:26 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1269</guid>
		<description><![CDATA[Ah, it goes on and on. After my last attempt at literate OWL programming, called omnsplit, I decided that there was a problem; this version splits the OWL file into individual statements, and puts them into files with the same name as the OWL class (property, or whatever). The problem is that, for an ontology [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1269">
<p>Ah, it goes on and on. After my last attempt at literate OWL programming, called <a href="http://www.russet.org.uk/blog/2009/06/introducing-omnsplit/">omnsplit</a>, I decided that there was a problem; this version splits the OWL file into individual statements, and puts them into files with the same name as the OWL class (property, or whatever).</p>
<p>The problem is that, for an ontology like OBI, you get 1400 individual files; this is just inconvenient as many applications don&#8217;t like this many files in a directory. Also, there is a naming constraint; you can only use characters legal in the file system; this doesn&#8217;t include &#8220;:&#8221; if you want to be Windows (NTFS) compliant.</p>
<p>So, for my new system, I decided to generate an index file, which just points at locations in the ontology file. Initially, I was just going to index the main ontology file; in the end, I decided a partial copy was the way forward; generating both the index and the indexed file ensures that they will stay in sync.</p>
<p>It required a bit of nasty latex hacking; the basic problem was avoiding the limitation of being only able to use legal LaTeX macro characters (that is letters). The system now works like this:</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt>
<i><font color="#9A1900">%% This is generated by python which also generates the</font></i>
<i><font color="#9A1900">%% function_ont.spt file which is a copy of the ontology (with a</font></i>
<i><font color="#9A1900">%% few new lines gone).</font></i>

<i><font color="#9A1900">%% This just defines a new macro in what appears to be an</font></i>
<i><font color="#9A1900">%% unnecessarily complex way.</font></i>
<b><font color="#0000FF">\expandafter\def\csname</font></b> OmnEntityHeaderheader<b><font color="#0000FF">\endcsname</font></b><i><font color="#9A1900">%</font></i>
<font color="#009900">{\lstinputlisting[language=omn,firstline=1,lastline=8]{function_ont.spt}}</font>

<i><font color="#9A1900">%% But the use of \expandafter and \csname means that you can</font></i>
<i><font color="#9A1900">%% use any character you like, including underscores and numbers</font></i>
<i><font color="#9A1900">%% in the macro name.</font></i>
<b><font color="#0000FF">\expandafter\def\csname</font></b> OmnEntityObjectPropertyhas_role<b><font color="#0000FF">\endcsname</font></b><i><font color="#9A1900">%</font></i>
<font color="#009900">{\lstinputlisting[language=omn,firstline=206,lastline=219]{function_ont.spt}}</font>

<i><font color="#9A1900">%% We can now define two commands in the style file. Again</font></i>
<i><font color="#9A1900">%% we use \csname so that we are not bound to characters legal</font></i>
<i><font color="#9A1900">%% in latex macros.</font></i>
<b><font color="#0000FF">\newcommand</font></b><font color="#009900">{\omnclass}</font><font color="#993399">[2]</font><font color="#009900">{\csname OmnEntityClass#1#2\endcsname}</font>
<b><font color="#0000FF">\newcommand</font></b><font color="#009900">{\omnobjprop}</font><font color="#993399">[2]</font><font color="#009900">{\csname OmnEntityObjectProperty#1#2\endcsname}</font>

<i><font color="#9A1900">%% now in our source, we can do things like this.</font></i>
<b><font color="#0000FF">\omnobjprop</font></b><font color="#009900">{}{has_role}</font>

</tt></pre>
</pre>
</td>
</tr>
</table>
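<p>For illustration, here is a minimal Python sketch of how index lines of this shape could be generated. It is not the script in the repository; the frame keywords in <tt>FRAME</tt>, the function names and the rule of simply deleting the colon from prefixed names are all assumptions made for the example, and it does not emit the header entry shown above.</p>
<pre>
```python
import re

# Assumed frame-start pattern: a Manchester syntax keyword at the start of
# a line, followed by the entity name. The real generator may differ.
FRAME = re.compile(r"^(Class|ObjectProperty|DataProperty|Individual): (\S+)")


def macro_name(kind, entity):
    """Build the control-sequence name, e.g. OmnEntityObjectPropertyhas_role."""
    # Remove the colon so that a two-argument call such as
    # \omnclass{owl}{Thing} would resolve to OmnEntityClassowlThing.
    return "OmnEntity" + kind + entity.replace(":", "")


def build_index(lines, listing_file):
    """Return one \\csname definition per frame, covering its line range."""
    starts = []
    for number, line in enumerate(lines, start=1):
        match = FRAME.match(line)
        if match:
            starts.append((number, match.group(1), match.group(2)))
    index = []
    for position, (first, kind, entity) in enumerate(starts):
        # A frame runs up to the line before the next frame, or to the
        # end of the file for the last frame.
        if position + 1 == len(starts):
            last = len(lines)
        else:
            last = starts[position + 1][0] - 1
        index.append(
            "\\expandafter\\def\\csname %s\\endcsname%%\n"
            "{\\lstinputlisting[language=omn,firstline=%d,lastline=%d]{%s}}"
            % (macro_name(kind, entity), first, last, listing_file)
        )
    return index
```
</pre>
<p>Calling <tt>build_index</tt> on the lines of the copied ontology file, with <tt>function_ont.spt</tt> as the second argument, returns the definitions as strings ready to be written to the index file.</p>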
<p>Using an index in this way also has another advantage. I&#8217;ve had to make a decision whether to go with rdfs:label or the entity name. I can now back out of this; I can just use both in the index file, without too much extra space, so that either would be referencable within the latex.</p>
<p>To me, this feels like the right solution. It&#8217;s relatively simple (with a bit of nasty latex, which is nicely hidden), it doesn&#8217;t depend on the file system. It needs a bit more work to bring it to completion, but not that much.</p>
<p>Sadly <a href="http://bio-ontologies.org.uk/">bio-ontologies</a> looms, so next week will be getting ready for that; perhaps I can finish this off on the way back. &#8220;Sadly&#8221; is perhaps a poor choice of words; I&#8217;m greatly looking forward to it, but I&#8217;ve kind of had the bit between my teeth with python and latex hacking for the last few weeks.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1269 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/06/introducing-omnencap/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WordPress Upgrade</title>
		<link>http://www.russet.org.uk/blog/2009/06/wordpress-upgrade/</link>
		<comments>http://www.russet.org.uk/blog/2009/06/wordpress-upgrade/#comments</comments>
		<pubDate>Wed, 17 Jun 2009 13:10:28 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1267</guid>
		<description><![CDATA[Just upgraded to WordPress 2.8. The automatic update didn&#8217;t work; this seems to be a continual problem which stems from wordpress not being in the default location. For some reason, it wants to push from the new version rather than pull under these circumstances. Not good. So, I did the manual upgrade; unfortunately the admin [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1267">
<p>Just upgraded to WordPress 2.8. The automatic update didn&#8217;t work; this seems to be a continual problem which stems from wordpress not being in the default location. For some reason, it wants to push from the new version rather than pull under these circumstances. Not good.</p>
<p>So, I did the manual upgrade; unfortunately the admin page crashed out with an error:</p>
<p><tt>PHP Fatal error: Call to a member function read() on a non-object in wp-includes/theme.php on line 387</tt></p>
<p>This has been reported <a href="http://wordpress.org/support/topic/280088">here</a> and <a href="http://libraryvoice.com/archives/2009/06/11/a-little-snag-in-upgrading-to-wordpress-2-8/">here</a></p>
<p>It&#8217;s this bit of code causing the problems.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><font color="#009900">$template_dir</font> <font color="#990000">=</font> @ <b><font color="#000000">dir</font></b><font color="#990000">(</font><font color="#FF0000">"$theme_root/$template"</font><font color="#990000">);</font>
                <b><font color="#0000FF">if</font></b> <font color="#990000">(</font> <font color="#009900">$template_dir</font> <font color="#990000">)</font> <font color="#FF0000">{</font>
                        <b><font color="#0000FF">while</font></b> <font color="#990000">(</font> <font color="#990000">(</font><font color="#009900">$file</font> <font color="#990000">=</font> <font color="#009900">$template_dir</font><font color="#990000">-&gt;</font><b><font color="#000000">read</font></b><font color="#990000">())</font> <font color="#990000">!==</font> false <font color="#990000">)</font> <font color="#FF0000">{</font>
<i><font color="#9A1900">// etc</font></i>
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>It appeared to be only my modified version of the theme (Evanesence) causing the problem; it&#8217;s not very modified, so I removed the modifications one by one. For no readily apparent reason the problem appears to be a subdirectory called &#8220;images.old&#8221;. Surely, not a good reason for a crash.</p>
<p>Weird and wonderful.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1267 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/06/wordpress-upgrade/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing omnsplit</title>
		<link>http://www.russet.org.uk/blog/2009/06/introducing-omnsplit/</link>
		<comments>http://www.russet.org.uk/blog/2009/06/introducing-omnsplit/#comments</comments>
		<pubDate>Wed, 17 Jun 2009 12:02:27 +0000</pubDate>
		<dc:creator>Phil Lord</dc:creator>
				<category><![CDATA[Ontology]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://www.russet.org.uk/blog/?p=1258</guid>
		<description><![CDATA[After a bit of struggle, I now have another literate OWL tool working, along the lines discussed in a previous blog post. Rather than generating the OWL documentation, I now split a Manchester syntax file up, so that I can refer to bits of it. I have this working with OBI, using Protege to produce [...]]]></description>
			<content:encoded><![CDATA[<div class="kcite-section" kcite-section-id="1258">
<p>After a bit of struggle, I now have another literate OWL tool working, along the lines discussed in a <a href="http://www.russet.org.uk/blog/2009/06/literate-omn/">previous</a> blog post. Rather than generating the OWL documentation, I now split a Manchester syntax file up, so that I can refer to bits of it. I have this working with OBI, using Protege to produce a single merged ontology file, in Manchester syntax.</p>
<p>The current implementation is rather simple; it produces one file-per-entity in the OWL file which I don&#8217;t think is entirely good. When run on OBI, it creates over 1400 files which is a lot. The other problem is that I&#8217;ve had to do some dubious hacking to get the file names to work out. I have to remove spaces and &#8220;\&#8221; characters, as well as &#8220;:&#8221;, which is illegal on NTFS.</p>
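<p>The file name clean-up amounts to something like the following Python sketch. The function name and the <tt>.omn</tt> extension are assumptions for the example, not necessarily what omnsplit does.</p>
<pre>
```python
def entity_filename(entity, extension=".omn"):
    """Map an OWL entity name to a file name that is legal on NTFS."""
    # The characters removed are the ones named in the post: spaces,
    # backslashes and colons. The extension is a hypothetical default.
    for unwanted in (" ", "\\", ":"):
        entity = entity.replace(unwanted, "")
    return entity + extension
```
</pre>
<p>Note that simply deleting characters means two different entity names can collapse onto the same file name, which is part of why this feels like dubious hacking.</p>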
<p>There&#8217;s also a problem with some of the OWL. Unfortunately, the OBI to OWL conversion process has a reification step which I don&#8217;t quite understand the purpose of. This comes out as this sort of anonymous individual. I&#8217;m not sure at all how the definition has come out as the rdfs:label, but, for sure, you can&#8217;t use this as a filename!</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><b><font color="#0000FF">Individual</font></b>: relationship:genid7

    <font color="#990000">Annotations:</font>
        rdfs:label <font color="#FF0000">"C located_in C' if and only if: given any c that</font>
<font color="#FF0000">instantiates C at a time t, there is some c' such that: c' instantiates</font>
<font color="#FF0000">C' at time t and c *located_in* c'. (Here *located_in* is the</font>
<font color="#FF0000">instance-level location relation.)"</font>@en,
        oboInOwl:hasDbXref relationship:genid8

    <font color="#990000">Types:</font>
        oboInOwl:Definition
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>I think I might change the implementation a bit, though. Having 1400 files in one directory is not good. My idea is to serialize the entire file out as latex, with lots of macros, autogenerated.</p>
<table border="0" bgcolor="#e8e8e8" width="100%" cellpadding="10">
<tr>
<td>
<pre><!-- Generator: GNU source-highlight 2.11.1
by Lorenzo Bettini

http://www.lorenzobettini.it

http://www.gnu.org/software/src-highlite -->
<pre><tt><i><font color="#9A1900">%% this would appear in the generated file</font></i>
<b><font color="#0000FF">\newcommand</font></b><font color="#009900">{\OwlClassowlthing}</font>{
  <b><font color="#0000FF">\begin</font></b><font color="#009900">{omn}</font>
Class: owl:Thing
  <b><font color="#0000FF">\end</font></b><font color="#009900">{omn}</font>
}

<i><font color="#9A1900">%% then in your latex file you would do</font></i>
<b><font color="#0000FF">\owlclass</font></b><font color="#009900">{owl}{Thing}</font>

<i><font color="#9A1900">%% which would just resolve to the class above</font></i>
</tt></pre>
</pre>
</td>
</tr>
</table>
<p>The only worry with this is that latex would then have to read a large file, even if most of the macros are not used. This might be really, really slow. Well, we can but try.</p>
<p>As before, the current version is available at <tt>git://github.com/phillord/literate_omn.git</tt>.</p>
<!-- kcite active, but no citations found -->
</div> <!-- kcite-section 1258 -->]]></content:encoded>
			<wfw:commentRss>http://www.russet.org.uk/blog/2009/06/introducing-omnsplit/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

