This is just a short introduction to Michael Bell, my PhD student. He’s now in the second year of his PhD, and has been looking at annotation in biological databases. More specifically, we are trying to define quality measures for textual annotation, based around the bulk properties of these databases. It’s related to, but distinct from, my earlier work on semantic similarity. The question is whether we can judge the quality of sentences, words or records based on how they have been used previously, and how far they have spread.
Paola Marchionni of JISC has given her permission to reproduce the feedback from the peer review of my last JISC grant application, which sadly failed. I want to publish it here as part of my desire for open science, rather than as an opportunity to reply, which, perhaps unfortunately, the JISC process does not otherwise allow.
I was delighted recently to discover Greyhole. Essentially, it’s a system that allows you to configure a Samba share at one end, and a bunch of disks at the other. The data is spread across the disks, with a configurable level of duplication. It’s aimed mainly at the home user who wants a higher degree of data security than a single drive provides, but who is not going to take the expensive and poorly scalable RAID route.
I was entertained by a couple of articles recently, one from PLoS Blogs and one from Ed Yong, both bemoaning the low social status of bloggers, at least in some people’s minds. As the front page of the PLoS blog says:
I’ve just got around to installing the magnificent kcite plugin that Simon Cockell wrote for knowledgeblog. It’s actually a really simple plugin, but it’s tremendously useful. For instance, I can now cite my own papers on reality [cite source=‘doi’]10.1371/journal.pone.0012258[/cite], function [cite source=‘doi’]10.1186/2041-1480-1-S1-S4[/cite] or protein classification [cite source=‘doi’]10.1093/bioinformatics/btl208[/cite] and all the metadata will be gathered and cited for me in a nice reference list at the end.