About

This is the full text of a grant proposal called “Knowledge in Biology” that we submitted to JISC, as a follow-up to our knowledgeblog grant. Unfortunately, this proposal was not accepted. This blog post is the direct output of Word; apologies if the conversion is imperfect.

 

 

 

Outline Project Description:

Many disciplines within the sciences are knowledge-rich; of these, biology is an extreme example. In order to make advances, biologists need to be able to access knowledge from both their own and related communities in an easily digestible form. However, the publishing of this knowledge does not fit well with existing scientific communities, as it is often not regarded as “research based”; rather, it is a stored body of grey literature, often not publicly available. In the Knowledge in Biology project, we will engage with the community of biologists themselves and with the disparate communities in disciplines that work with them. We will generate substantial content describing how “Knowledge in Biology” is both produced and consumed in the pursuit of new discoveries, by commissioning the authorship of this content directly from the funding for this project.

We will leverage the output of the JISC-funded Knowledge Blog platform as a tool for coordination, publication and dissemination of this content. The result will be a publicly accessible, high-impact resource of short, readable and accessible articles describing how to gather, manipulate and synthesise knowledge in biology. This will be of significant value in supporting the multidisciplinary research that is necessary for advances in modern biomedicine.

 

1. Introduction

1This document describes a proposal for a project within the JISC “e-Content” programme call.

2Modern biology is a rich, complex, multi-disciplinary field. In particular, practitioners need knowledge about how to access, organise and structure knowledge itself. As a result, members of the community often need to cross the boundaries of traditional societal structures within research. By definition, this is not well supported by the more formal structures that scientists use for the publication and dissemination of knowledge. So while the information exists, it is not accessible: it is hidden from the community on the desks and hard drives of individuals.

3One of the difficulties with migrating this community-based knowledge away from grey literature to a more openly accessible, archived and referenceable form is the lack of a formal reward structure. Although scientists may engage in this form of activity from a sense of public duty, this form of documentation is not critical for their career advancement, or for gaining academic credibility, and so it is rarely made a priority. While technological advances have made publication of this material straightforward, the social structure of science has not supported it. As a result, there is a large body of knowledge about how biologists conduct their work that is simply lost to the community, meaning that considerable time and effort are spent recreating this knowledge, only for it to be lost again.

4We plan to circumvent this societal barrier using a novel approach: we will directly commission the authoring and reviewing of articles embodying this content. As the knowledge will often be readily available to individual members of the community, and we are aiming for articles which have neither the size nor the complexity of formal research publications, it will be possible to generate a substantial body of content at relatively low cost.

5An ideal mechanism for publication of this knowledge has already been funded by JISC as part of the “Managing Research Data” programme. This is the Knowledge Blog project: a lightweight publication tool, based around a commodity blogging engine, that integrates well with scientists’ existing work practices. This “Knowledge in Biology” (KiB) project will build on the work from Knowledge Blog, to the benefit of both. This project will gain a technological underpinning at little cost, since Knowledge Blog already exists and will require only a small increase in resources to manage the additional content and traffic; Knowledge Blog will gain substantial content and enormously increased visibility.

6The KiB project will provide a small amount of funding for the management and commissioning of articles, but the majority of the funds will be spent in individually small amounts, crowd-sourcing the development of a novel digital content resource and engaging the community of biomedical researchers as both authors and reviewers. The content will address key issues relating to knowledge in biology such as data standards, linked data, knowledge in synthetic biology and statistical approaches to knowledge, as well as “softer” issues such as the use of Web 2.0, the social web, and the blogosphere as tools for the biomedical researcher.

2.1 WP1 – Knowledge Blog (k-blog) maintenance and support

 

7The primary purpose of this proposal is to generate significant quantities of digital, community-developed content. The k-blog platform already exists, supported by a previous JISC call. We are not, therefore, proposing to make significant enhancements to either the process or the software in the course of this project. However, the additional load placed on the platform will require a small amount of administrative work in terms of maintenance.

8In addition, we will need to provide support to the users of the platform; while k-blog is relatively easy to use, issues do arise with authoring, with formatting or with exceptional requests (for example, multi-media documents).

9For articles to be properly citable and maintainable, manual intervention is required to supplement the text with computationally accessible metadata, including DOI assignment. This enables improved archiving and discovery, which increases the value of the resource. As part of WP1, we will annotate documents with this metadata to ensure consistency and to avoid placing the burden on the main authors.

10We will install and refine a licensing plugin for the k-blog platform, which clearly displays license information for each article, based on the author’s selection.

 

2.2 WP2 – Management of publication process

 

11Articles in KiB will be produced by crowd-sourcing and by the in-house team (WP4). Our aim is to bootstrap the KiB k-blog so that it reaches a critical mass of articles that will attract both readers and more authors. We will commission articles from specified, expert authors with the incentive of a small payment. To receive the payment, the contributor will be required both to submit an article and to review another article.

12In preparation for this work, we have compiled a list of topics for KiB and put names against these topics. We have clustered the topics around themes in KiB: the role of semantics in biology; the representation of knowledge in ontologies, terminologies and vocabularies; data integration to create knowledge resources; data and knowledge standards; knowledge technologies such as RDF, linked data, OWL, etc.; text mining; and case studies and applications of knowledge in biology. These clusters, and more, will become the categories in the KiB k-blog. The letters of support indicate the significant number of authors who have promised to write an article on one of these topics. We will seek as wide a selection of authors as possible, guided by our advisory committee (see Section 2.8), to help give the KiB k-blog a balanced view on knowledge in biology. A significant part of this WP will be the commissioning of these articles and discussions with authors on this new digital content sourced from the community.

13This process will need managing: requests for particular articles (WP2.1); negotiation on topic and scope (WP2.2); management of the author-guided review process (WP2.3); and enabling payments to be made. This activity will help ensure that the core of the KiB k-blog will be of sufficient quality to attract readers to comment and contribute articles, as well as to simply read and learn.

2.3 WP3 – Outreach and Community Engagement

 

14Outreach and community engagement are intrinsic to this project. The presence of a high-quality, organised resource, freely available on the web, will attract readers; likewise, a widely-read resource will be attractive as a publication centre for authors, particularly when supported by funding as part of WP2. The use of a rapid publication framework, available on the web, archived by the British Library and indexed for searching by Google, is therefore our main form of outreach.

15However, this process can be augmented. All content will be available under a Creative Commons license, making it reusable with citation outside of the KiB environment. We will maintain active “Social Web” streams through Twitter. We will solicit articles relating to the use of Twitter and the blogosphere from members of the scientific blogging community; as well as generating content, this will leverage their existing readership, raising awareness of KiB as a resource for both readers and authors. We will maintain a well-advertised mailing list allowing requests for, or offers of, new articles, whether commissioned or otherwise.

16Finally, we will advertise the resource through normal academic channels of paper and poster presentation. Where possible, we will also propose micro-workshops (also known as Birds of a Feather meetings) at suitable meetings/unconferences.

2.4 WP4 – ‘In house’ article authoring

 

17The staff on the project will contribute a significant number of articles to the KiB k-blog. Stevens will produce 20 articles, Lord 10 articles and Swan 10 articles (WP4.1). Both Lord and Stevens have already contributed articles to the Ontogenesis k-blog and will extend the Ontogenesis material into the wider KiB topics. These will include articles on tips for modelling in OWL; using ontologies with linked data; converting data to RDF and linked data; online knowledge resources; using ontologies in over-representation analysis of microarray data; integration strategies; and so on. Some of these in-house articles will act as glue that draws together many of the other articles. For example, an article on the role of knowledge in biology will draw together the need for the k-blog and act as a pathfinder. Where appropriate, we will use tools such as “Anthologize” and “Web Trails” to facilitate these aggregation activities. In-house articles will be reviewed (WP4.2) by an external reviewer, potentially from the pool of contributors sourced in WP2.

2.5 WP5 – Project Management and JISC Requirements

 

18Management of the project will use regular weekly teleconferences to ensure that all aspects are proceeding according to the project plan. In addition, we will fulfil the legal requirements for collaboration agreements and the formal reporting requirements from JISC as part of WP5.

19To ensure maximum community and public engagement in this proposal, all appropriate documents will be posted using the k-blog environment in addition to those locations specified by JISC, except where that information is withheld under normal FOI rules.

20Finally, we will gather and collate statistics on the use of these articles as measures of impact; directly in terms of page views from the underlying k-blog platform; indirectly from incoming links (both those using trackbacks, and those discovered using Web searching tools) and comments; and finally through secondary indicators such as Twitter and email communications. These statistics will also be made publicly available where appropriate.

2.6 Timetable

 

| Name | Start | End | Staff | Notes |
|---|---|---|---|---|
| WP1 | 1/3/2011 | 30/9/2011 | DS, PL | Maintenance of k-blog infrastructure |
| WP2 | 1/3/2011 | 31/7/2011 | | |
| -WP2.1 | 1/3/2011 | 1/4/2011 | ALL | Crowdsourcing of articles |
| -WP2.2 | 1/4/2011 | 31/7/2011 | ALL | Content negotiation and creation |
| -WP2.3 | 1/4/2011 | 30/9/2011 | ALL | Articles reviewed and published |
| WP3 | 1/3/2011 | 30/9/2011 | ALL | Outreach and engagement |
| WP4 | 1/3/2011 | 30/9/2011 | | |
| -WP4.1 | 1/3/2011 | 31/7/2011 | ALL | In-house content generation |
| -WP4.2 | 1/4/2011 | 30/9/2011 | ALL | In-house articles review and publication |
| WP5 | 1/3/2011 | 30/9/2011 | PL | Project management and JISC compliance |

 

2.7 Deliverables

 

21A high-quality body of content, consisting of a series of articles from multiple authors, describing different topics fitting within the theme of “Knowledge in Biology”. 40 of these articles will be authored in-house. A further 200 will be sourced with consultancy payment. We anticipate that many others will come from enthusiastic, crowd-sourced authors engaged with the process.

 

22A website, based on the k-blog platform, that delivers this content.

 

2.8 Project management arrangements

 

23The project will be managed from Newcastle University; the primary management will be from Dr Lord, who will be responsible for:

 

    – Developing Project Management Plans;

    – Ensuring that the Project work package objectives are met;

    – Prioritising and reconciling conflicting opportunities;

    – Reporting to and collaborating with the JISC Programme Manager;

    – Dissemination of community content.

 

24Project progress will be evaluated through scheduled, short, “stand-up” meetings on a weekly basis, conducted face-to-face, via Skype or phone as appropriate. Primary unscheduled communication will be via public mailing list, ensuring maximum visibility and openness. User consultation will be via public mailing list. Close tracking of requests for content and payment of authors is essential, and transparent procedures will be put in place for this. All staff are associated with several other projects and duties (research, research support, teaching and training), and are responsible for managing these independent workloads. All have experience with the k-blog platform and process.

 

25We have formed a small, unpaid advisory committee from recognised experts in the field. They will be invited to give feedback on the topics covered at 2, 4 and 6 months into the project; this will help to ensure an even and representative coverage of the area that is not overly biased by the particular interests of the staff on the project. Mark Musen (Stanford), Chris Rawlings (BBSRC Rothamsted) and David Shotton (Oxford) have all agreed to serve on this committee.

2.9 Risks

 

26Staff Risk – as with all projects, loss of staff could negatively impact on this project; however, all staff are on permanent contracts and have long histories in research, so this is less likely. Additionally, the nature of the workload means that all staff would be able to cover duties relating to sourcing and generating community content, which limits the risk should a single person leave.

 

27Lack of community engagement – the strength of this proposal depends on contributions from many different authors, generating novel and currently unavailable content. There is therefore a risk that the community will not wish to contribute. We have limited this risk by offering to pay people consultancy rates, an unusual reward within academic research; however, we will only need to commit funds following the submission of the content, so should authors not deliver, we will reallocate these funds. Should we still find it hard to solicit contributions, we will increase the rates per article.

 

28Technology dependencies – Content will be disseminated in the form of k-blogs, and thus there is a dependency on the k-blog platform. It is already suitably developed and packaged. The k-blog platform is a publishing framework only; it is not essential for the authoring of articles. This limits the scope of the risk. Content could be published independently of the k-blog platform, with only a small loss in the feature set. Additionally, content could be relocated elsewhere at any time; it would retain its value outside of the k-blog platform. With the archival agreement under the Sustainability section, archives of the original KiB content will always be available.

 

2.10 IPR position

 

29It is essential that content is released with as few restrictions as possible on re-use and re-purposing, but authors must be allowed to maintain credit associated with the original work, or they are unlikely to contribute. Project members agree to release their work under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License (CC BY-NC-SA), which allows re-use and modification for non-commercial purposes with attribution. This is in line with the JISC Model Licence. Authors invited to submit articles will be allowed to choose a Creative Commons licence of their own but will be strongly encouraged to use as permissive a licence as possible. Choice is offered to allow for different institutional policies on published content. Public domain submissions will also be accepted to accommodate US government employees; these submissions will be uncommissioned.

 

2.11 Sustainability

 

30To maintain the persistence of the online resources beyond the end of the project, documents produced by project staff and KiB contributors will be publicly available and clearly licensed. The k-blog site and sub-domains are already archived by the UK Web Archive, in which JISC is an active partner. The Digital Curation Centre will be asked to provide strategies for long-term database archiving.

 

2.12 Staff Recruitment

 

31All staff are already in post.

 

3 Impact

 

32Our key beneficiaries are the community of researchers working to develop knowledge in biology. Specifically, this focuses on groups involved in data standards, linked data, knowledge in synthetic biology and statistical analysis of biological data. The needs of this community are clearly demonstrated by our Ontogenesis experiment, which is currently receiving 1000 page views per month for a small number of articles. Simple question and answer websites such as http://biostar.stackexchange.com/ receive over 2000 page views per week; however, there is a gap between this and more formal knowledge.

 

33We will generate statistical information, using the k-blog platform, as a clear metric of impact; for freely available, reusable and web-delivered content, indicators such as page views are well recognised and are the main form of impact assessment. Both natively, and through tools such as Google Analytics, the k-blog platform can provide comprehensive and detailed feedback on access to individual articles. We will also exploit secondary impact measures, including the appearance of suitable hashtags on Twitter; comments and trackbacks to articles on KiB; and, finally, links to KiB as provided by web search.

 

34We will seek to increase impact through a number of activities in addition to normal academic channels. First, we will invite contributions from well-known members of the scientific blogging community, which should result in secondary readership. Second, we will invite contributions on relevant topics that have recently become of public interest. Third, we will monitor article popularity; for areas that prove to be of interest or are controversial, we will seek to commission additional content.

 

4 Partnership and dissemination

 

35Internal engagement of core project members, and of the wider community of researchers crowd-sourced to supply content, will be via the mailing list, after initial approaches are made. The plans for content generation are further outlined in WP3 and WP4. Content generation will allow further interaction with more disparate groups (content consumers), who will be encouraged to engage through the k-blog process and the project mailing lists. The advisory committee will be able to ensure that our engagement with the content-producing community is representative of the community. The nature of the k-blog process means dissemination is intrinsic to content generation.

36Project members are on the existing JISC-funded Knowledge Blog grant in the “Managing Research Data” programme. We will approach individuals with funding from this and other programmes, requesting articles describing the value of these projects to biologists. We will, of course, also be pleased if JISC programme managers wish to contribute articles to this knowledge in biology resource.

6 Previous experience of the Project Team

 

37Dr Phillip Lord is a Lecturer in Computing Science at Newcastle University. He has a PhD in yeast genetics from the University of Edinburgh, after which he moved into bioinformatics. He is well known for his work on ontologies in biology, as well as his contributions to eScience, beginning with his role as an RA on the myGrid project. Since his move to Newcastle, he has been an investigator on three more eScience projects (CARMEN, ONDEX and InstantSOAP), as well as maintaining an active engagement in standards development (OBI, MIGS, MIBBI), and publishing on the fundamentals of ontology design. He was an active participant in the Ontogenesis network, and is currently leading the JISC-funded Knowledge Blog project. He is an active blogger and developer.

 

38Dr Robert Stevens is a Reader in Bioinformatics in the Bio and Health Informatics group at the University of Manchester. His main areas of research are in the development and use of semantics within the life sciences. This is blended with the use of e-Science platforms to gather and manage the data and knowledge of the life sciences. He was PI on the Ontogenesis network that ran the meetings for the first Knowledge Blog. He is or has been a co-investigator on the myGrid and myExperiment grants that will provide both content and technical input to this project. As well as the JISC-funded myExperiment project, Stevens was an investigator on the JISC-funded CO-ODE project that developed Protégé 4. On the back of this, Stevens has led the OWL training activities at Manchester that have directly fed into the Ontogenesis Knowledge Blog. Stevens currently leads content development for the JISC Knowledge Blog grant.

 

39Dr Daniel Swan has a PhD in developmental biology and continued to work in developmental biology as a post-doctoral researcher before moving into bioinformatics in 2001. Subsequent positions included working for Bart’s and the London Genome Centre and the Centre for Hydrology and Ecology in informatics-driven roles dealing with large, distributed biological datasets generated by large user communities. Currently the manager of the Newcastle University Bioinformatics Support Unit, he leads a small team helping biological researchers to generate, capture, store and analyse their digital data. His interdisciplinary background means he has grounding in both computer and biological sciences and is comfortable working on CS-focused projects (CARMEN, InstantSOAP, Bio-Linux). He has most recently been involved in the JISC Knowledge Blog grant, providing technical support and engagement with the microarray community.

 
