Archive for the ‘Professional’ Category

Josh Brown from JISC has given his permission for me to reproduce the feedback from the peer review of my last JISC grant, which bounced. A shame, as it would have provided us with an opportunity to test out knowledgeblog on papers from the wild, while also producing a great demonstrator of the advantages of using web technology to distribute papers on the web rather than just dumping a link to a PDF.

With luck, we can rejuvenate this work in another way.

“One bid (Bid no 8: Newcastle University) was flagged by one of the markers as being out of scope, despite receiving good marks and positive comments from the other two markers.

The original terms of the call specifically state that projects must add value to existing peer reviewed journals. Projects seeking solely to create new publications are specifically excluded. (Please review the sections Expected Outputs and Requirements of the call for more detail on these conditions.)

Bid no 8 states:

“we will identify authors within Newcastle, take their open-access publications and recast them into a form suitable for WordPress”

The bid is clearly designed to aggregate content that has been published elsewhere, largely based on content held within Newcastle’s institutional repository. No existing, peer-reviewed scholarly journal is involved in this project.

While the creation of a web-native publishing tool clearly has merit, as identified by the two markers who praised this bid, the funding call is, as stated, intended to add value to existing publications. In the absence of an existing peer-reviewed publication as a partner in this project, the bid is out of scope”

The panel agreed with this analysis, which meant that, despite the fact that the project was viewed unanimously as a very strong proposal on its own merits, we were obliged to decline to fund this project. The requirement for direct partnership with an existing peer-reviewed scholarly journal for all projects in this strand was imposed after lengthy discussion, and for a range of reasons, including sustainability, tight time-frames and so on, and it was felt that this should be upheld.

— Josh Brown

So, to start with a rant.

I have reached a key and pivotal point in my life. I have decided that I never, ever, ever want to see permalinks with any semantics in them, ever again. And before anyone gets clever, yes, I know that this post has semantics in its permalink.

Recently I was looking through Knowledge Blog and realised that I have made a mistake with the permalink structure. When we created Ontogenesis I used semantic links, that is, permalinks with the title of the article in them, because I thought that they would be more popular with authors and easier to remember. However, I didn’t want name clashes, land grabs or disambiguation of the sort that you get on Wikipedia. So I added in a date as well as a uniquish identifier. I realised quickly that I had managed to combine the worst of both worlds; people wished to change the titles of their articles, and the permalinks no longer fitted. And the links were still hard to remember. So I moved Ontogenesis onto the simple number-based permalink structure that it has today. As a concession to usability, I didn’t use the basic ?p=192 that is the default, but instead the rewritten /192, which is easier. As far as I can tell, WordPress remembers old permalinks; they do not just go away when the overall structure is changed, and links are preserved. They really are as permanent as these things go.

But I had not fixed the other knowledgeblog subdomains consistently. Process, which defines and documents the process of knowledgeblog itself, was still set up with the older style identifiers. So I changed it; for example, http://process.knowledgeblog.org/archives/19 became plain http://process.knowledgeblog.org/19. I don’t understand why, as WordPress seemed to maintain the links last time, but apparently this broke an email Dan Swan had sent out advertising our Bioinformatics Write-a-thon.
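One way to guard against this kind of breakage is an explicit redirect from the old form of the link to the new one. The following is only a sketch in Python, not anything WordPress itself does; the function name `redirect_target` and the pattern are made up for illustration, and the only thing taken from the text above is the /archives/19 to /19 mapping.

```python
import re
from typing import Optional

# Hypothetical pattern for the older style permalinks, e.g. /archives/19
OLD_STYLE = re.compile(r"^/archives/(\d+)/?$")


def redirect_target(path: str) -> Optional[str]:
    """Return the new numeric permalink for an old-style path, or None."""
    match = OLD_STYLE.match(path)
    if match is None:
        return None  # not an old-style link, so nothing to redirect
    return "/" + match.group(1)


print(redirect_target("/archives/19"))  # /19
```

A web server or plugin would then answer any request for which this returns a value with a permanent redirect, so that links in old emails keep working.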

While I have generally purged semantics from links, WordPress still maintains the “title as link” approach for pages, as opposed to posts. I guess this makes sense, as you generally don’t have that many pages, but in this case it has shot me in the foot. I started to re-create a “Who are we” page for the www main domain of knowledgeblog. This ended up with a URL of http://www.knowledgeblog.org/who-are-we; but then I got distracted and left the job half-done. Moreover, I wanted to use my normal editing environment. So I trashed the page. Today, I created another page with the same name. But this got a URL of http://www.knowledgeblog.org/who-are-we-2. Ugly. WordPress would not let me rename this permalink, so I tried resurrecting the trashed post and changing its content. For reasons that I don’t understand, this didn’t work either and I ended up with http://www.knowledgeblog.org/who-are-we-3. I tried changing this to http://www.knowledgeblog.org/who which works, but redirects to http://www.knowledgeblog.org/who-are-we-3.
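The -2 and -3 suffixes are what you would expect from a slug allocator that never releases a name once it has been handed out, even when the page is trashed. The sketch below illustrates that behaviour in Python; it is an assumption about the mechanism, not WordPress’s actual code, and `slugify` and `allocate_slug` are hypothetical names.

```python
def slugify(title):
    """Lower-case the title and join its words with hyphens."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return "-".join(cleaned.split())


def allocate_slug(title, taken):
    """Return the first free slug for a title and record it as taken.

    Assumption: trashed pages stay in `taken`, so their slugs are never reused.
    """
    base = slugify(title)
    slug, suffix = base, 2
    while slug in taken:
        slug = f"{base}-{suffix}"
        suffix += 1
    taken.add(slug)
    return slug


taken = set()
print(allocate_slug("Who are we", taken))  # who-are-we
print(allocate_slug("Who are we", taken))  # who-are-we-2
print(allocate_slug("Who are we", taken))  # who-are-we-3
```

A numeric identifier avoids the problem entirely, since nothing about the title is ever encoded in the link.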

So, WordPress is doing (mostly) the right thing, but it still all worked against me. I don’t understand, however, why WordPress doesn’t allow you to set default permalinks for pages as well as posts. It should do, but as far as I can tell, it does not.

The irony of this is that this is not a new issue. I even wrote a post about Manchester syntax and OBO which largely revolves around this issue. I know about the importance of semantics-free identifiers, and I should have known better than to make a mess of things this way, both on knowledgeblog and indeed on this blog. It just goes to show that handling change is hard and living with a nasty legacy is often the result. I guess that it is a nice example of the advantages and disadvantages of semantics and the compromises that have to be made in any engineering situation.

I haven’t decided yet, but I think I will change the permalink structure of this blog in a few days’ time. I am hopeful that existing links will be maintained, but that all future ones will exist only in numeric form. Fingers crossed, it will all work.

In a typically thoughtful post, Peter Sefton discusses the advantages and disadvantages of WordPress as an authoring environment. I thought I would clarify my feelings on this a little.

Previously, our experience on Knowledge Blog suggested to us that the WordPress environment is very poor for editing, something we have expressed in our process documentation.

I should be clear that this is in the context of knowledgeblog. Academics have their own way of working, and normally are used to this. They use tools which fit with their lives. For example, Google docs is a good tool but, basically, useless if you do most of your paper writing on a plane. The same will be true for tools such as Annotum if it ever appears. It is hard to beat Word and email (or frequently dropbox nowadays).

Of course, there are other ways; for example WordPress offers “A complete revision history of the document is maintained with the ability to roll-back to earlier versions”. But, then, so does Word with dropbox. And the WordPress facilities are in no way comparable to the versioning that you get with latex, or asciidoc and Subversion or Git. Although, in practice, I rarely use versioning when authoring, and dropbox’s poor man’s roll-back is enough.

The only clear advantage of using WordPress tools is that you don’t need a two stage publication process. But the general idea behind blogs is that publication does not happen often; it happens once, and then the post remains. This is in contrast to a Wiki, where using external editing tools is impractical at best. And the situation is very similar to current publication where PDF is the common medium.

My conclusion: there are lots of people, lots of use cases, and lots of requirements. I don’t say that authoring must be independent from the publication environment; I do say that the publication environment must not require a single authoring tool. Fortunately, for the tools that we have created for kblog, we can afford to be agnostic. They will work integrated with WordPress editing also. Still, I just spent 10 minutes longer making this post than I needed to, to stop the shortcodes in Peter’s quote below from being kcite’d (check the source for the trick!), which was harder because I use asciidoc. There are going to be problems. Supporting a heterogeneous environment is painful. I wish there were a perfect solution, but there are just a set of messy compromises.

Peter also makes a second point about our plugins (and others): that is, that they are non-standard.

There are similar issues/risks with stuff like WordPress shortcodes such as KCite from KnowledgeBlogs. It’s a great tool for authors, allowing them to cite things in a rational way:

DOI Example – [cite source=’doi’]10.1021/jf904082b[/cite]

PMID example – [cite source=’pubmed’]17237047[/cite]

But it’s proprietary to a particular processing environment.

There is a risk of creating a new form of the proprietary lock-in we had up until recently (and arguably we still have) with document formats like Microsoft’s .doc.

— Peter Sefton

It’s a fair point, and one which I agree with. The last thing that we need is hundreds of independent shortcode or other syntaxes; I mean, imagine what a nightmare it would be if every single Wiki engine and text conversion tool used their own, almost identical, but slightly different and incompatible syntax. Hmmm.

We chose to use shortcodes for two highly pragmatic reasons. First, WordPress has nice support for them. Building a shortcode handler is nice and simple and does not require us to build regexps (the first version did it by hand for one reason or another, and the regexps were painful). The second reason stems from our desire for a decoupled authoring environment. Shortcodes pass through the HTML publishing step without escaping; to use XML or HTML compliant mechanisms would require us to change, for example, the HTML export mechanism of Word. Not somewhere we wished to go.

In practice, however, I don’t think that this is a major problem, if the code is written carefully. With Mathjax-latex, the shortcodes are transformed into Mathjax syntax, then mathjax does the rest. The development version of kcite works this way: the shortcodes are translated into a span-tag based microformat, then the bibliography tools operate on the client to format the bibliography. So long as the code is crafted reasonably, it should not be dependent on WordPress.
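To make the decoupling concrete, here is a rough sketch in Python of translating a cite shortcode into a span outside WordPress altogether. It is not the kcite implementation: the span’s class and data-source attribute are invented for illustration, the real microformat may look quite different, and the pattern only handles the simple form of the shortcode with straight quotes.

```python
import re
from html import escape

# Matches e.g. [cite source='doi']10.1021/jf904082b[/cite]
CITE = re.compile(
    r"\[cite source=(['\"])(?P<source>[a-z]+)\1\](?P<ident>[^\[]+)\[/cite\]"
)


def cite_to_span(text):
    """Replace each cite shortcode with a (hypothetical) span microformat."""

    def replace(match):
        source = escape(match.group("source"), quote=True)
        ident = escape(match.group("ident").strip(), quote=True)
        return f'<span class="cite" data-source="{source}">{ident}</span>'

    return CITE.sub(replace, text)


print(cite_to_span("DOI example: [cite source='doi']10.1021/jf904082b[/cite]"))
```

Anything that can emit such a span, whether a WordPress shortcode handler or another publishing tool, could then hand over to the same client-side bibliography code.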

This is just a short introduction to Michael Bell, my PhD student. He’s now in the second year of his PhD, and has been looking at annotation in biological databases. More specifically, we are trying to define quality measures for textual annotation, based around the bulk properties of these databases. It’s related to, but distinct from my early work on semantic similarity. The question is whether we can judge the quality of sentences, words or records based on how they have been used previously, and how far they have spread.

Michael has now started to blog his work, following on from my own knowledgeblog work, and our general commitment to open science. As part of his work, he is starting to build web delivered tools, as it is a useful way of navigating the complex knowledge space of biological data. So, his website is also part of his work.

A good example of this is a recent blog post which discusses the creation of word clouds for all historical versions of Swiss-Prot and TrEMBL and, because everyone loves a word cloud, it is well worth a look.

This is the latest grant that we have submitted to JISC, in this case for a new application of the knowledgeblog platform. As usual, it is a direct post from Word, so there may be a few presentational issues in it.

The grant is currently under review; I will post the outcome and any feedback (if possible) once I have a result.

Outline Project Description

In this project, we will generate a large body of web content, demonstrating the applicability of commodity blogging technology as a supplement to the University’s existing eprints archive. Through the use of technology pioneered by the JISC funded Knowledgeblog project, we will publish 100+ scientific articles, from a variety of different word-processing environments, in a structured-web capable form rather than as PDF. This content will then be augmented to demonstrate the advantages of leverage from a commodity platform, enabling novel mechanisms of publication.

1. Introduction

The modern publishing industry has been massively affected by the development of the web. However the impact has been highly varied across different domains. Publications that address news events or encyclopedic knowledge have been very heavily affected; other areas have changed little. The web initially developed from the desires of scientists to share knowledge; in some areas, such as biology, the uptake of web technologies has been little short of extraordinary. It is ironic, therefore, that the publishing of formal academic papers has been affected relatively little by the web. Although content page listings may have been largely replaced by RSS or email, and papers may be available as HTML, they are still largely constrained by the print requirements, packaged as PDFs, poorly linked, with static figures.

An alternative publication mechanism has already been funded by JISC as part of the “Managing Research Data” programme. As part of the Knowledgeblog project, we have investigated using a publication tool, which integrates well with scientists’ existing work-practices, based around a commodity blogging engine, namely WordPress. There are a number of tools such as Open Journal Systems, or organizations like Scielo which allow the web publication of academic articles. While these have large user bases (OJS: 6000 journals; Scielo: 600), currently, WordPress is used to drive around 10% of the world’s websites; a user base orders of magnitude larger. WordPress, therefore, performs the basic tasks of publishing articles extremely well, scaling to millions of page hits, enjoys tool support from many word processing environments and benefits from many augmentations for specialist audiences. We have extended this tool with a few specialised extensions of our own and, as a result, made it more suitable for academic publishing. We have then used this tool as the basis for two journals, in this case, aimed at producing educational resources describing ontology technology (http://ontogenesis.knowledgeblog.org), and the JISC-funded Taverna workflow system (http://taverna.knowledgeblog.org).

These two resources are, in effect, “gold open-access”, although not requiring author payment. They present content which has not been presented elsewhere, but was written for the purpose; articles have been through (or are progressing through) a formal review process. While this has provided a useful resource, generating over 15k page views, these resources are designed to be coherent in scope; although this is generally a positive virtue, by definition it allows us to investigate the suitability of the tooling for only a small number of articles and a limited domain.

Newcastle University has a strong history in supporting gold open access publication: it was the site for the first open access law journal in the UK (http://webjcli.ncl.ac.uk/). In addition, it has a large and successful eprints repository (http://eprints.ncl.ac.uk), currently hosting 50k articles or bibliographic records. In this project, we will exploit the eprints archive to provide content, building a substantial knowledge resource; this will demonstrate both the suitability of the Knowledgeblog tool-chain as the basis for green open access publication and the value of this novel form of publication, and will provide the vital testing against content “from the wild”, allowing us to extend the suitability of this tool-chain to as many areas of academic discourse as possible.

2. Fit to call

The project call notes that JISC is funding or has funded many projects relating to scholarly communication. These include: infrastructural support in the form of institutional repositories; support for open-access; and support for novel mechanisms of publication such as overlay journals. Specifically, theme D – campus-based publishing – is aimed at increasing the capacity of the sector to publish and disseminate research outputs directly. The call also highlights attempts such as the “Beyond the PDF” workshop to move toward more structured forms of knowledge; while, in theory, PDF is capable of supporting relatively rich structuring, in practice, most of the tools which generate files in this format produce a relatively opaque, binary artefact from which it is difficult to extract information, or to repurpose or recast that in any way.

While open-access publishing has made significant strides in the last 10 years, becoming an accepted part of the academic landscape, Gold open-access – the publication of original content – still accounts for the minority of academic publications. Green open-access – author publication of content often published elsewhere – now accounts for up to 25% of the literature in some fields.

Institutional repositories such as that run by Newcastle (http://eprints.ncl.ac.uk) or author archiving on their website (e.g. http://homepages.cs.ncl.ac.uk/phillip.lord/publications.html) are the most common route for green open-access publication. While increasing access to academic materials is a very positive step, this form of publication is largely limited to providing access to a PDF. From neither the authors’ nor the readers’ point of view is there significant added value to the publication. For example, our experience is that authors are often equivocal or uninterested in publication in institutional repositories as it is “just-one-more-thing” to do, while maintaining a website requires significant technical expertise.

For this grant, academics at Newcastle supported by the infrastructure provided by the local librarians will provide an alternative; we will identify authors within Newcastle, take their open-access publications and recast them into a form suitable for WordPress. We will do this with their active permission and engagement, using the tooling we have developed or documented as part of the previously-funded JISC “knowledgeblog” project. Where authors wish to, we will support them in performing this work for themselves; where they do not want “just-one-more-thing”, we will leverage off the existing eprints process, and perform this work for them. In general, this can be performed directly using MS Word, latex or other word-processing software, whichever is the authors’ preferred editing environment. In addition, we will use this process to increase the usability of the tooling, increasing both the ability of authors to publish their work directly in this fashion and the likelihood that they will do so. As this proposal is built on existing work from the University eprints archive, library-support is implicit within FEC and not specifically or additionally costed.

Once publications are available in this framework, authors and readers will be able to take advantage of the additional features which come either from WordPress directly, or from augmentations provided or assessed by the WebPrints team. For example, authors will be able to see rich content-access statistics, including page-views, referrer and incoming link information. Published articles will be bi-directionally linkable using trackbacks. Authors will be able to add tags, zoomable equations or automatically generated reference lists depending on their level of technical competence. For viewers, category- and tag-based RSS feeds will be available, and searching and bi-directional linking (again!) will be possible. As a result of the work from the previous knowledgeblog grant, all posts will be tagged with metadata, in various forms, and will be available for formal archiving outside of the University.

The publication framework is based around WordPress which is freely available, scalable, stable and hardened by its multiple user base. The system is continually updated, but has a good reputation for maintaining backward compatibility. The authoring framework is based around commodity tools such as Word or latex. Most of the workflow process within Newcastle is pre-existing as part of the eprints service. This project therefore provides a sustainable and novel enhancement to the existing process.

3. Workplan

3.1 WP1 Management, Systems Administration and Set up.

This work package will fulfil the basic management and administrative tasks required for the project. This will include setup of the repository, styling and theming appropriately for the project; definition of a basic workflow for management of documents and metadata; fulfilment of standard JISC reporting requirements.

We request additional funding of 1k as part of this work-package for virtual server upgrades (additional disk space), dropbox space to enable document management, and WordPress anti-comment spam support.

3.2 WP2 User documentation.

Most of the operational, “how-to” documentation is already available: either at http://process.knowledgeblog.org (developed by the JISC funded knowledgeblog project); or, as the repository is based on commodity technology, from many publicly available websites.

However, there will be information specific to the WebPrints archive; about copyright, about document management, and about the relationship to the university. For this, we will need to generate some specific documentation.

As the project progresses, we will improve and enhance this documentation, based on our experiences, including for example, statistics on how long author self-deposition takes.

3.3 WP3 Author advertising and Material identification

We will seek active engagement with our user community, by linking into the current eprints system. Combined with the Newcastle-specific, internal “myimpact” database (which was designed to capture research outputs for the next REF), this will enable us to identify new publications as they come out. In the first instance, we will select material that has been published in open access journals (or where embargo periods, or other conditions allow). We will contact authors individually, inform them of our project, and advise them about the methods for recasting of their paper (see WP4).

We will not preselect on the basis of academic quality, only technical and legal (copyright) grounds. Although the eprints service displays full text as PDF only, the myimpact database in many cases also stores MS Word (or equivalent) formatted data. We will, therefore, prefer papers where this data is available. We will prefer papers which are recent over those which are older. Finally, we will prefer papers which give us a wide spread of authorship and discipline.

Although the focus of this proposal is on the provision of a service for publication of green open access material in a fully web-capable format, we will be happy to receive grey literature, on an author-publication basis.

3.4 WP4 Paper recasting

This work package will take papers selected as part of WP3 and publish them to the WebPrints archive. In most cases, this work will be performed using tooling developed or documented by the previously funded JISC knowledgeblog project.

We will publish articles in three ways:

  • WebPrints team published. All work will be performed by members of the WebPrints team. For each paper, we will write a short report, describing any issues with the publication process, and any errors seen (which we will hand-correct). We will gather statistics on the time taken to publish. Papers will be published on an “as-is” basis; that is we will not seek to enhance the content at this point. We will add metadata in a structured way, which will be accessible from the web presented version.
  • Author published, WebPrints supported. We will work directly with authors to publish papers and help them. Where possible, we will augment and add new features (latex maths support, citation). These papers will be marked as featured, and augmented. Again, we will gather statistics on the time taken to publish, broken down for additional functionality.
  • Author published. Authors will publish directly into WebPrints, using either their pre-existing experience, or our own user documentation. We will request, but not require, statistical feedback. Publication will be as the author wishes: as-is, or augmented with additional functionality.

All papers will be annotated with standard metadata in a structured form; our previous work means that this metadata will be available from the web presentation of the paper.

3.5 WP5 Repository and process enhancement

For this package, we will focus on two key aspects: tooling for publishing papers and their presentation once there.

For the presentational issues, in the first instance we will focus on enhancements which do not require support from the article material. For example, we will add metadata to articles, which will allow us to generate metadata headers (COinS, standard meta tags etc) without further analysis of the article material itself. Likewise, our experience with the knowledgeblog project means that we can support “out-of-the-box”: multiple export formats (including HTML, PDF and ePUB); site wide indexes (by year, author, subject etc); comments; trackbacks and page feeds (including from subsections). Through use of third-party software, we will also be able to add: related papers through textual analysis; tag clouds; twitter backs; automated multi-lingual presentation and social networking support.
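As an illustration of generating a metadata header purely from stored metadata, the following Python sketch builds a COinS span for a journal article. It is not code from the knowledgeblog project; the function name `coins_span` and the example values are placeholders, and only a handful of the possible OpenURL keys are shown.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author, journal, year):
    """Build a COinS span (class Z3988) from article metadata alone."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.atitle", title),
        ("rft.au", author),
        ("rft.jtitle", journal),
        ("rft.date", str(year)),
    ]
    # URL-encode the key/value pairs, then escape the result for use
    # inside an HTML attribute (so "&" becomes "&amp;").
    query = urlencode(fields)
    return f'<span class="Z3988" title="{escape(query, quote=True)}"></span>'


# Placeholder values, not a real article.
print(coins_span("An example title", "A. Author", "An Example Journal", 2011))
```

Because the span is built only from the metadata record, nothing in the article body needs to be parsed or modified.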

 

We will also investigate enhancements which require modification of the original content (and therefore increased interaction with authors). From the knowledgeblog project these will include: scalable equation presentation; and client-side generated bibliographies. We will also add “custom posts” for supplementary material (spreadsheets for instance). And, finally, through the use of third-party material, enhancements such as syntax highlighting, zoomable maps, slideshows and so forth. This part of the proposal is designed to be open-ended and exploratory; which forms of enhancement we pursue will depend on the types of papers selected and interactions with the authors. There are currently over 13,000 plugins available for WordPress, which provides us with a considerable resource to build from.

3.6 Timetable

Name                            Begin date  End date  Resources
WP1.1 – Setup Repository        02/05/11    14/05/11  SC, AL, DS
WP1.2 – Document Workflow       02/05/11    14/05/11  PL
WP2.1 – User Documentation      09/05/11    24/05/11  DS, PL
WP2.2 – User Statistics         16/05/11    31/08/11  SC, AL
WP3.1 – Author Engagement       16/05/11    31/08/11  SC, AL, DS, PL
WP4.1 – Paper Recasting         01/06/11    30/09/11  SC, AL, DS, PL
WP5.1 – Repository Enhancement  01/07/11    30/09/11  SC, AL, DS, PL

4. Deliverables

  • A repository of open-access articles in a fully web-capable format. This will act as a supplement to the existing eprints archive at Newcastle. We expect to generate around 100 articles in this form, although this is likely to be an underestimate. We are currently estimating throughput from our experiences with Knowledgeblog, which involved relatively few articles. The process should benefit from high-throughput experience.
  • Further documentation, published on http://process.knowledgeblog.org, describing the process that we have used to set up this repository.
  • Enhancements to tooling, enabling others to publish more easily in this manner.
  • Additional experience and software enhancing the presentation of data held in this form.

5. Project management arrangements

The project will be managed by Dr Lord, who will be responsible for:

  • Developing Project Management Plans;
  • Ensuring that the Project work package objectives are met;
  • Prioritising and reconciling conflicting opportunities;
  • Reporting to and collaborating with the JISC programme manager;
  • Dissemination of research results.

 

Project progress will be evaluated through scheduled, short, “stand-up” meetings on a weekly basis, conducted face-to-face, via Skype or phone as appropriate. Primary unscheduled communication will be via a public mailing list, ensuring maximum visibility and openness. We will use other readily available tooling to manage the document process pipeline – Google spreadsheets, dropbox, and likewise for software development (Google code). All staff are associated with other projects or service provision (research, teaching, training); they will be individually responsible for managing these workloads, and are highly experienced at doing so.

5.1 Risk Management

Staff risks – the basic organisation of the project has been designed to mitigate against staffing issues. All staff are in post and are highly experienced, with long track records at Newcastle. Costs have been split three ways; therefore, even in the unlikely event that one member of the team leaves during the project, it will not cause significant disruption.

Software risks – we are using commodity technology, which is very well proven and supported. None of the software is critical (even our basic blogging engine, WordPress, is replaceable). Therefore, while changes in third-party software might degrade or slow progress, they will not halt it.

Engagement Risks – the project requires a level of engagement from Newcastle researchers, which may not materialize. We have minimized this risk by minimizing the effort the engagement takes on behalf of the researchers. The project members are well known to many in the university (DS and SC comprise the “Bioinformatics Support Unit” and have worked for many PIs personally). We have active engagement from the library, in particular from Moira Bent (Science Faculty Liaison Librarian), and Paula Fitzpatrick (Digital Libraries).

5.2 IPR position

The bulk of the content handled by this work will come from authors within the University. The current restrictive copyright requirements of many publishers place uncertain limits on what can or cannot be done with this content. For this reason, we will use articles that have been published with or have become available under a Creative Commons or other open access license.

Project members will release written work (documentation etc) under a Creative Commons Attribution ShareAlike 3.0 Unported License (CC BY-SA), which allows re-use and modification for non-commercial purposes with attribution. This is in line with the JISC Model Licence. Software linked to WordPress will be released under GPL, as required by the WordPress license. Software which is separable will be released under LGPL. Software linked to other third-party libraries may use another license if required; this will be limited to Free/Open source licences.

5.3 Sustainability

This project is largely based around innovative, novel and leading use of existing software. As such the sustainability of the majority of the technology base is not dependent on project members but on large companies with established and proven business models.

The WebPrints archive will be run from the same server as knowledgeblog.org; this is being developed and maintained and will be for the foreseeable future, and the addition of the WebPrints archive will not be a substantial additional cost. However, should this cease to happen, the content of the WebPrints archive will be under a Creative Commons or equivalent permissive license. This will make it possible for the JISC funded UK Web Archive to store the website for the future.

Although we will not be able to sustain publication by the WebPrints team past the lifetime of this proposal without further funding, author publication will be possible; our experience with existing tooling is that this is possible for many, although it requires some level of technical skill, depending on the word-processor package and the level of complexity of the paper.

5.4 Staff Recruitment

All staff are already in post. Recruitment during the project will therefore be unnecessary.

5.5 Key Beneficiaries

Our immediate beneficiaries are Newcastle University staff, who will have their work published using a novel publication technique. Critically, we will demonstrate the value of this form of publication to both researchers and librarians within the University, who will in future be better placed to use or support this technology to publish their own or others' work.

 

Although presented here as a discrete project, the work fits within the background of the wider blogging community. So, our own knowledgeblog project and website will be able to take advantage of software improvements that will happen as a result of this work. Additionally, the general academic blogging community will gain a new resource. Increasingly, this community is a critical path for public engagement in the academic process.

5.6 Community Engagement

Community engagement will take place initially by direct contact; we will email authors to ask for their engagement in the publishing process. This should have the secondary effect of advertising the presence of our project. We have active engagement from the library staff, who are well known within the University. In terms of engagement with the resource outside of Newcastle, we will make active use of various web and social networking facilities. Our experience has shown that this can generate significant amounts of engagement in a relatively short period of time. Finally, we will advertise the work through standard academic channels of conference and journal publication; although effective, this tends to be slow. This is problematic for a short project, hence we consider this to be a secondary means of communication.

 

6. Budget

 

Removed for privacy reasons.

7. Project Team

 

Dr. Phillip Lord is a Lecturer in Computing Science at Newcastle University. He has a PhD in yeast genetics from the University of Edinburgh, after which he moved into bioinformatics. He is well known for his work on ontologies in biology, as well as his contributions to eScience, beginning with his role as an RA on the myGrid project. Since his move to Newcastle, he has been an investigator on three more eScience projects (CARMEN, ONDEX and InstantSOAP), as well as maintaining an active engagement in standards development (OBI, MIGS, MIBBI) and publishing on the fundamentals of ontology design. He is an active participant in the Scientific Blogging community and developed the initial idea for knowledgeblogs. As well as managing the knowledgeblog project, he is the developer of tools such as “Latextowordpress”, as well as WordPress plugins such as “Mathjax-latex” and “Kcite”, all of which improve the usefulness of WordPress for academic communication.

 

Dr. Daniel Swan has a PhD in developmental biology and continued to work in developmental biology as a post-doctoral researcher before moving into bioinformatics in 2001. Subsequent positions included working for Bart’s and the London Genome Centre and the Centre for Hydrology and Ecology in informatics-driven roles dealing with large, distributed biological datasets generated by large user communities. Currently the manager of the Newcastle University Bioinformatics Support Unit, he leads a small team helping biological researchers generate, capture, store and analyse their digital data. His interdisciplinary background means he has a grounding in both computer and biological sciences and is comfortable working on CS-focused projects (CARMEN, InstantSOAP, Bio-Linux) as well as acting in a research capacity analysing high-throughput data. He is currently active within the knowledgeblog project, having been responsible for adding software support for a review process, gravatars, syntax highlighting, and PDF and ePUB exports.

 

Dr. Simon Cockell has a PhD in Genetics from Leicester University, and refocussed into Bioinformatics with a Masters degree from Leeds in 2005. From there he moved to Newcastle and the Bioinformatics Support Unit. Since coming to Newcastle, Simon has worked on a range of projects involving large-scale analyses (AptaMEMS-ID), data integration (Ondex) and health informatics (MRC Mitochondrial Disease Cohort). He is currently active within the knowledgeblog project, having been responsible for metadata support (including COinS) and navigational support (for both humans and robots), and is a co-author of kcite and mathjax-latex.

 

Allyson Lister worked for 6 years at the EBI in Cambridge, developing and producing the UniProt/TrEMBL protein database. She is currently focusing on the use of ontologies for the semantic integration of systems biology data in her current job at CISBAN at Newcastle University. Both at the EBI and at Newcastle University, she developed structured data formats including UniProt/TrEMBL and SBML. She has also been an early adopter of blog technology as a mechanism for communication of both her own and others' primary research. Since 2006, she has co-authored a number of posts with other bloggers in the community and has been invited to be a guest author at both the ISCB news and the BioSharing blog. She has published papers highlighting the importance of social networking and live blogging to bioinformatics.

Paola Marchionni of JISC has given her permission to reproduce the feedback from the peer-review of my last JISC grant, which sadly failed. I want to publish it here as part of my desire for open science, rather than as an opportunity to reply which, perhaps unfortunately, the JISC process does not otherwise allow.

I am a little surprised by some of the comments, to be honest. The main criticism was more expected though, which essentially says “it’s not crowd-sourcing if you pay people to develop content”. You have to try these things, but I did think that actually paying for content might be considered to be a little revolutionary. Ah, well, better luck next time.

Markers felt the form of this proposal was “robust”; however, there wasn’t enough clarity on the deliverables and especially on how the value of what was being produced would be assessed downstream. They felt there was also some lack of information on how the currently JISC-funded K-Blog project, due for completion in July 2011, related to this project and what the impact on its team would be, which seems to be the same team as the one proposed for this project.

The main concerns, however, were around whether this could really qualify as a crowdsourcing or community project (it was felt it was more about disclosing data than community engagement), also considering that the authors of the articles would be paid. There were some doubts about the sustainability of the project beyond the 7-month duration of the funding, as lack of funding would prevent more articles being created and metadata added by the team. One marker also felt that a risk analysis should have taken into account the risk of disparate communities not being aware of the content and not using or engaging with it. A clearer identification of the various communities the project aimed to reach and a more targeted strategy for engaging with such communities would have been useful.

Finally, another issue that was raised was that there wasn’t sufficient information on how the partnership with Manchester University would work, either formally or informally, and the dissemination plans could have been stronger, as they relied mainly on the role of K-Blog.

— Paola Marchionni