An Exercise in Irrelevance - Extended Semantic Web Conference

It was interesting to go to ESWC 2012; it has been quite a few years since I have been to ESWC or, indeed, any semantic web conference. While I am not generally a live-blogger, I have already commented on some aspects of the conference (n.d.a). Here I will just consider a few of the talks which leapt out at me for good or bad reasons.

I did enjoy the first keynote from Abraham Bernstein (n.d.b); it was a brave talk, not because it managed to wind Greek mythology into it, but because he started off with the opening credits from Star Trek. At a computing conference, this is setting yourself up with a hard act to follow. If I can over-simplify, the key thesis of the talk was largely that trying things out in practice is the best way to see how things work in theory. This is a theme I shall return to.

As I have suggested previously, a talk on the Music Ontology also interested me (n.d.c). Essentially, the idea here is to define a metric assessing the quality of an ontology by measuring how well it fulfils the user requirements. This looks very useful, although at the moment it does not appear that all of the measurements are automated. Automation would be ideal because this form of measure would potentially be very useful in more agile forms of ontology development; essentially, it could take the place of an automated test framework, allowing the developer to ask whether newly added concepts had helped to address more user queries or not.
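To make the "automated test framework" idea concrete, here is a minimal sketch in Python. It is not the method from the paper: it assumes, purely for illustration, that each user query has already been reduced by hand to the set of concepts needed to answer it, and it scores an ontology by the fraction of queries whose concepts are all present. The function name, the concept names and the queries are all hypothetical.

```python
def requirement_coverage(ontology_terms, user_queries):
    """Return the fraction of user queries whose required concepts
    are all present in the ontology (a deliberately crude measure)."""
    terms = {t.lower() for t in ontology_terms}
    if not user_queries:
        return 0.0
    satisfied = sum(
        1
        for required in user_queries.values()
        if {r.lower() for r in required} <= terms
    )
    return satisfied / len(user_queries)


# Hypothetical user queries, each mapped to the concepts it needs.
queries = {
    "who performed this track": ["Artist", "Track"],
    "which record is this track on": ["Track", "Record"],
    "where was this recorded": ["Recording", "Place"],
}

# Hypothetical ontology before and after adding one concept.
before = {"Artist", "Track"}
after = before | {"Record"}

print(requirement_coverage(before, queries))
print(requirement_coverage(after, queries))
```

Run after every change, a score like this behaves much like a test suite: if adding a concept does not move the number, it has not helped with any of the queries collected so far.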

The last paper I was involved with at this conference was about a semantic service matching framework called Feta (n.d.d). Since I left Manchester, I have rather lost touch with this work, although the research theme can still be seen in BioCatalogue (Bhagat et al. 2010). I was therefore interested to listen to a talk on semantic web services (n.d.e), particularly as it was using the information content measures that I used many years ago over the Gene Ontology (n.d.f). Unfortunately, not that much appears to have changed since I left the field. The only advance seems to have been the generation of a “gold standard” dataset; while this is not a bad thing, it is also a reflection that SWS are just not being used in the wild. I also worry, though, about the methodology of testing against a predefined gold standard. To me, it seems like a case of cherry-picking. The results just would not have been reported had they not shown some improvement over previous metrics; the risk is that the metric is being tuned to the individual gold standard, rather than to the general research problem.
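For readers who have not met information content measures, the sketch below shows the general idea in Python. The information content of a term is the negative log of how often that term, or anything beneath it, is used; the Resnik-style similarity of two terms is the information content of their most informative shared ancestor. The hierarchy and the annotation counts are invented for illustration and are not taken from any of the papers discussed here.

```python
import math

# Hypothetical toy hierarchy: each term maps to its direct parents.
PARENTS = {
    "service": [],
    "alignment": ["service"],
    "retrieval": ["service"],
    "pairwise_alignment": ["alignment"],
    "multiple_alignment": ["alignment"],
}

# Hypothetical number of annotations made directly to each term.
COUNTS = {
    "service": 0,
    "alignment": 2,
    "retrieval": 10,
    "pairwise_alignment": 5,
    "multiple_alignment": 3,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """-log p(term), where p counts annotations to the term and its descendants.
    Assumes every term has at least one annotation at or below it."""
    total = sum(COUNTS.values())
    usage = sum(n for t, n in COUNTS.items() if term in ancestors(t))
    return -math.log(usage / total)


def resnik_similarity(a, b):
    """Information content of the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    return max(information_content(t) for t in common)


print(resnik_similarity("pairwise_alignment", "multiple_alignment"))
print(resnik_similarity("pairwise_alignment", "retrieval"))
```

The two alignment terms share the reasonably specific ancestor "alignment" and so score above zero, while a pair whose only shared ancestor is the root scores zero, because the root carries no information.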

One criticism that I cannot make of Maria Keet’s presentation on mereotopological relationships is the lack of testing (n.d.g). While the first part of the paper deals with the theoretical underpinning of part-whole relationships, a significant part of the paper shows their user testing of OntoPartS, a tool they have developed to allow ontology developers to pick the correct type of relationship. While the paper gives good examples, showing that the part-whole relationships used are valuable, in the sense that they allow inferences that could otherwise not happen, I worry about their interpretation of their user testing, which suggests that it takes “a mere 4 minutes to choose the correct relation”. This might be reasonable for an ontology developer, but users will need access to these distinctions if they are to be useful, and 4 minutes is a long time. I think, for this reason, I would much prefer the user-driven approach of the Music Ontology to the extension-from-theory approach when determining what part-whole relationships we need. The Gene Ontology managed to get an awful long way with one relationship (Bada et al. 2004).

My biggest worry about the conference as a whole, though, is how similar the experience is now to five years ago. This makes me worry that the field is not advancing. Perhaps part of the reason for this is the lack of strong application drivers. This seems to be acknowledged by the separation of the conference into “research” and “in-use” tracks. This categorisation seems broken anyway: a paper on “Evaluating scientific hypotheses using the SPARQL Inferencing Notation” (Callahan and Dumontier 2012) is apparently not research, but in-use. However, “Curate and storyspace: An ontology and web-based environment for describing curatorial narratives.” (Mulholland, Wolff, and Collins 2012) is research; and, as a result, we should conclude not useful?

In the end, the semantic web has been significantly rebranded as the linked data initiative. In the bar, I heard the comment, “ah, but don’t they know they will need semantics eventually”. Well, yes, they will. And “they” probably know this. Google has, or at least is investigating, more semantic representations, rather than the pure statistical approach it started off with. But ultimately heavy-duty semantics is only ever going to be a niche market, used by people who care enough, and need the expressivity enough, for it to be worth the hassle. I’ve been working with and on ontologies for many years, and I know the value of a reasoner, and the value of heavy-duty logics. But, if we let ourselves be overwhelmed by the technology, we miss the reality that we can achieve a lot with very little. Perhaps the best indication of this is that the award for most influential paper from 7 years ago went to a paper on SIOC, which is relatively light in terms of semantics (Breslin et al. 2005). The semantic web community (of which I have never been more than an interloper) may like to say that a little semantics goes a long way (n.d.h, https://www.cs.rpi.edu/~hendler/LittleSemanticsWeb.html), but I am not sure that it actually believes it.

n.d.a. https://www.russet.org.uk/blog/2012/05/semantic-web-irony.

n.d.b. https://dx.doi.org/10.1007/978-3-642-30284-8_1.

n.d.c. https://dx.doi.org/10.1007/978-3-642-30284-8_24.

n.d.d. https://dx.doi.org/10.1007/11431053_2.

n.d.e. https://dx.doi.org/10.1007/978-3-642-30284-8_40.

n.d.f. https://dx.doi.org/10.1093/bioinformatics/btg153.

n.d.g. https://dx.doi.org/10.1007/978-3-642-30284-8_23.

n.d.h. https://www.cs.rpi.edu.

Bada, Michael, Robert Stevens, Carole Goble, Yolanda Gil, Michael Ashburner, Judith A. Blake, J. Michael Cherry, Midori Harris, and Suzanna Lewis. 2004. “A Short Study on the Success of the Gene Ontology.” Journal of Web Semantics 1 (2): 235–40. https://doi.org/10.1016/j.websem.2003.12.003.

Bhagat, J., F. Tanoh, E. Nzuobontane, T. Laurent, J. Orlowski, M. Roos, K. Wolstencroft, et al. 2010. “BioCatalogue: A Universal Catalogue of Web Services for the Life Sciences.” Nucleic Acids Research 38 (Web Server): W689–94. https://doi.org/10.1093/nar/gkq394.

Breslin, John G., Andreas Harth, Uldis Bojars, and Stefan Decker. 2005. “Towards Semantically-Interlinked Online Communities.” In Lecture Notes in Computer Science, 500–514. Springer Berlin Heidelberg. https://doi.org/10.1007/11431053_34.

Callahan, Alison, and Michel Dumontier. 2012. “Evaluating Scientific Hypotheses Using the SPARQL Inferencing Notation.” In Lecture Notes in Computer Science, 647–58. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-30284-8_50.

Mulholland, Paul, Annika Wolff, and Trevor Collins. 2012. “Curate and Storyspace: An Ontology and Web-Based Environment for Describing Curatorial Narratives.” In Lecture Notes in Computer Science, 748–62. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-30284-8_57.