Archive for June, 2009

After a bit of struggle, I now have another literate OWL tool working, along the lines discussed in a previous blog post. Rather than generating the OWL documentation, I now split a Manchester syntax file up, so that I can refer to bits of it. I have this working with OBI, using Protege to produce a single merged ontology file, in Manchester syntax.

The current implementation is rather simple; it produces one file per entity in the OWL file, which I don’t think is entirely good. When run on OBI, it creates over 1400 files, which is a lot. The other problem is that I’ve had to do some dubious hacking to get the file names to work out. Firstly, I have to remove spaces and “\”s, as well as “:”, which is illegal on NTFS.
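
The clean-up of names could look something like the sketch below. This is an illustration rather than the code in the repo: the function name and the .pomn suffix are assumptions, and it only strips the characters mentioned above.

```python
import re

# Characters stripped from entity names: whitespace, backslashes and colons.
_UNSAFE = re.compile(r"[\s\\:]")


def entity_filename(entity_name, suffix=".pomn"):
    """Turn an entity name such as 'owl:Thing' into a usable file name.

    The suffix is a hypothetical choice for this sketch.
    """
    return _UNSAFE.sub("", entity_name) + suffix
```

Note that simply deleting characters can make two different entities collide on the same file name, which is part of why this counts as dubious hacking.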

There’s also a problem with some of the OWL. Unfortunately, the OBI to OWL conversion process has a reification step which I don’t quite understand the purpose of. This comes out as this sort of anonymous individual. I’m not sure at all how the definition has come out as the rdfs:label, but, for sure, you can’t use this as a filename!


Individual: relationship:genid7

    Annotations:
        rdfs:label "C located_in C' if and only if: given any c that
instantiates C at a time t, there is some c' such that: c' instantiates
C' at time t and c *located_in* c'. (Here *located_in* is the
instance-level location relation.)"@en,
        oboInOwl:hasDbXref relationship:genid8

    Types:
        oboInOwl:Definition

I think I might change the implementation a bit, though. Having 1400 files in one directory is not good. My idea is to serialize the entire file out as latex, with lots of macros, autogenerated.


%% this would appear in the generated file
\newcommand{\OwlClassowlthing}{
  \begin{omn}
Class: owl:Thing
  \end{omn}
}

%% then in your latex file you would do
\owlclass{owl}{Thing}

%% which would just resolve to the class above

The only worry with this is that latex would then have to read a large file, even if most of the macros are not used. This might be really, really slow. Well, we can but try.
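
Generating the macros is mostly a naming problem, since a latex macro name can only contain letters. A rough sketch of the python side follows; the function names and the digit-spelling trick are assumptions made for illustration. It covers only naming and emitting the definitions, not whether the omn environment behaves inside a macro body.

```python
import re

# Digits are not allowed in macro names, so spell them out (an assumption
# made for this sketch; any letters-only encoding would do).
_DIGIT_WORDS = ["Zero", "One", "Two", "Three", "Four",
                "Five", "Six", "Seven", "Eight", "Nine"]


def macro_name(prefix, local_name, kind="Class"):
    """Build a letters-only macro name, e.g. OwlClassowlthing for owl:Thing."""
    raw = (prefix + local_name).lower()
    spelled = re.sub(r"\d", lambda m: _DIGIT_WORDS[int(m.group())], raw)
    letters = re.sub(r"[^A-Za-z]", "", spelled)
    return "Owl" + kind + letters


def macro_definition(prefix, local_name, frame_text, kind="Class"):
    """Emit a newcommand wrapping one entity in the omn environment."""
    name = macro_name(prefix, local_name, kind)
    return ("\\newcommand{\\%s}{\n  \\begin{omn}\n%s\n  \\end{omn}\n}\n"
            % (name, frame_text))
```

Calling macro_definition("owl", "Thing", "Class: owl:Thing") gives a definition of the same shape as the generated-file example above.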

As before, the current version is available at git://github.com/phillord/literate_omn.git.

Just finished two new books in a row. It’s an unusual feature of my life that I am relaxed enough to do this; normally, when my head is too full of stuff, or my diary is too full of deadlines, I tend to be too tired in the evening to give my full attention to reading. At the moment, most of my colleagues seem to be chasing exam marking deadlines, so I feel slightly guilty at this, perhaps, but then I’ve been using the time to think about some new ideas for research, so not too guilty. And, as I mentioned, reading some books; so, onto these.

Both were birthday presents, neither my choice, which made them all the more interesting. The first was by Barack Obama. I’ve never been a great one for political memoirs; why spend money on buying a book full of lies or, at least, honest attempts at deception? Actually, though, Obama writes very lucidly and entertainingly. In this media- and fame-driven world, I guess this is as good a recommendation for presidency as any. The book’s interesting, although a bit motherhood and apple-pie; so saying that I agree with a lot of it doesn’t really say much. Perhaps the most telling line was from the section on foreign affairs: we must, he says, act multilaterally when possible, with international agreement; and international agreement doesn’t mean “armed only with the signatures of Britain and Togo”. Depressing, rather than insulting, because he is right.

On the other hand, Enough is about the problems with ever seeking more of, well, lots of different things. As an information and music junkie, perhaps, you might think this is not a great message for me; but as it happens, I’m quite ascetic; I don’t have that many things and like to keep the things that I have; I’m not a great traveller. I even thought very hard about buying a new bike, and didn’t spend nearly as much as I could have because, well, the one I bought was enough. Enough is a funny, entertaining and timely rant about the more, more culture that we live in. Good read, well worth looking at.

The weekend just gone was the Northern Rock Cyclone weekend. With three events (the Leazes Criterium, the Cyclone and the Beaumont Trophy) it’s a bit of a feast for a cycling nutter like myself. The Criterium is a short race just down from my work; the Beaumont is a long, elite-rider race. The Cyclone is a fun run, on a bike. It’s a bit of an embarrassment of riches, to be honest. Both this year and last, I’ve not seen all of the Criterium and totally missed the Beaumont because I wanted to sleep early or was knackered from the Cyclone.

I went for the 62 mile ride this year, as last. I quite fancied trying the 100, but in the end decided that it was a bit beyond me; I think it would have taken 7 or 8 hours. In the end, I was quite glad about this; I managed 80 miles a couple of weekends ago, but the Cyclone route is actually pretty hilly; up and down all the way. The weather forecast was not good; rain was expected with 60% chance; in the end, though, it was lovely all day. The hills were still there, but I felt much fitter this year; even the middle section (which is a killer) was slow but not painful like last time. Although, admittedly, this year on the way back, I just got off on the middle section of the Ryals; it’s not a big hill but really steep. If you don’t believe me, try the picture report which has Bradley Wiggins grimacing going up it. I think I was being a bit wimpish; I think I had the legs and gears for it this year, but last year I had a feet-stuck-in-the-pedals experience that I wasn’t keen to replicate.

On the ride down from Stamfordham, I felt good, pushing reasonably hard, chasing the clock; I managed to just beat 5 hours (4:40 cycling by my computer).

It felt great and a fitting christening for my new bike; not so new now with 500 miles on the clock. Fingers crossed for next year.

Well, after a reasonable degree of struggle, I managed to get the first version of my literate OWL system working. As well as learning python, I’ve had a go with git; my repo is hosted on github at git://github.com/phillord/literate_omn.git. There are three components.

omnextract.py: this pulls out all the referenced omn files from the TeX document and produces the complete omn file.
omn.sty: this is a driver for the listings package which does syntax highlighting in TeX.
omndoc.sty: this provides commands for including files into the TeX. It’s a thin wrapper around the listings package.

I decided to make omn.sty separate from omndoc.sty as it works standalone, if you just want to use the listings package on its own. At the moment, you can only include files; environments don’t work. You can see the PDF it creates from this TeX:


\documentclass{article}

\usepackage[pdftex]{color}
\usepackage{omndoc}

\title{A Test Document for OMNDoc}
\author{Phillip Lord}
%% should be ignored by latex, but read by python

\omndoc{all_test.omn}

\begin{document}
\maketitle

Here is a piece of OWL that should be readable in the documentation and in the
OMN output.

\begin{omn}
Class: FirstClass
\end{omn}

\omn{first.pomn}

Here is a piece of OWL that should be readable in the OMN output but is too
boring to be worthy of consideration for the documentation.

% \ignore{
%   \begin{omn}
%     Class: BoringOWL
%   \end{omn}
% }

\ignore{\omn{second.pomn}}

Here is a piece of broken OWL that should be rendered in the documentation (as
broken!) but should be ignored in the OMN.

% \begin{notomn}
% Clazz: BrokenOmn
% \end{notomn}

\notomn{third.pomn}

\end{document}
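
To give a flavour of what the extraction step has to do with a document like the one above, here is a minimal sketch. It is not the code in omnextract.py; the function names are made up, and it assumes that \omn{...} marks a file for the ontology, \notomn{...} does not, and \omndoc{...} names the output file.

```python
import re

# \omn{file} includes a file in both the document and the ontology.
# Neither pattern matches \notomn{...}, and the first does not match
# \omndoc{...} because it requires the brace straight after "omn".
_OMN_INCLUDE = re.compile(r"\\omn\{([^}]+)\}")
_OMN_TARGET = re.compile(r"\\omndoc\{([^}]+)\}")


def strip_comment(line):
    """Drop a TeX comment: everything from an unescaped % onwards."""
    return re.sub(r"(?<!\\)%.*", "", line)


def referenced_files(tex_source):
    """Return (output_file, [included files]) for a TeX source string."""
    target = None
    includes = []
    for line in tex_source.splitlines():
        line = strip_comment(line)
        found = _OMN_TARGET.search(line)
        if found and target is None:
            target = found.group(1)
        includes.extend(_OMN_INCLUDE.findall(line))
    return target, includes
```

Run over the test document, this would pick out first.pomn and second.pomn for all_test.omn, leaving third.pomn for the documentation only. Concatenating the files in that order is then the easy part.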

I’m starting to debate with myself, though, whether I have gone the right route here. The problem is that splitting the omn file up into bits is a pain. It only supports one way of working; if you want to use Protege, for example, to edit the file, you can’t; you can only view. We even miss the big advantage of literate programming; one source for both document and computation. But, then, you are stuck with a poor editing environment for either the documentation or computational representation.

I’ve been thinking instead of a system which would look like this:


\omndoc{function.omn}

\omnClass{Function}

\omnProperty{has_role}

\omnSummary{}
\omnMissing{}

Now, the python component would split the function.omn file instead of combining it. Each class, individual or property would be put into its own file. The \omnClass macro would then just be a simple include (again using the listings package); it would show the class inline. \omnSummary would include some TeX (generated from python) saying how many classes and so forth were in the omn file; \omnMissing would produce a list of Classes that are not explicitly included. Given a big monitor, you could work on the two sources (documentation and ontology) side-by-side, with only a little bit of editing to support jump-to or equivalent. Finally, it would be more syntax-independent. The TeX would not need to be changed to support, for example, the XML syntax. Just some python to split the XML document up into snippets.
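
A first stab at the splitting might look like the sketch below. It is only a sketch: it assumes each frame starts at the beginning of a line with one of the Manchester syntax frame keywords, and the function names are invented here rather than taken from any existing code.

```python
import re

# Frame keywords that start a new entity; assumed to sit in column 0.
_FRAME_START = re.compile(
    r"^(Class|Individual|ObjectProperty|DataProperty"
    r"|AnnotationProperty|Datatype):\s*(\S+)"
)


def split_frames(omn_text):
    """Split Manchester syntax text into (kind, name, text) frames."""
    frames = []
    current = None
    for line in omn_text.splitlines():
        start = _FRAME_START.match(line)
        if start:
            current = [start.group(1), start.group(2), [line]]
            frames.append(current)
        elif current is not None:
            current[2].append(line)
        # Lines before the first frame (prefixes and so on) are skipped.
    return [(kind, name, "\n".join(body).rstrip())
            for kind, name, body in frames]


def summary(frames):
    """Count frames per kind, the sort of figure omnSummary might report."""
    counts = {}
    for kind, _name, _text in frames:
        counts[kind] = counts.get(kind, 0) + 1
    return counts


def missing(frames, documented):
    """Classes never mentioned in the document, as for omnMissing."""
    return [name for kind, name, _text in frames
            if kind == "Class" and name not in documented]
```

Each frame's text would be written to its own file for the include macros, while summary and missing would feed the generated TeX.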

I shall start coding this over the next couple of days. I think I already have most of the python that I need so, hopefully, it should not take too long.

Learning a new language is always a bit stressful. I thought that I would learn python; I need a new rapid-development, build-some-scripts, but-doesn’t-look-as-awful-as-perl type of language. I’ve recently learnt lua which was fun, but then it’s meant as a very small, quick language. It’s nice, but not really the perl-u-like that I wanted.

I have actually been through the process of learning python in the past; I used to generate my website with ht2html which was quite cute and did the job; it was written in python, and I needed some skills to fiddle with its output. In the end, I decided that table within table presentation was not ideal and that CSS was the way to go, so I moved to muse which I still use nowadays.

As always, learning a new language is frustrating as you realise that you don’t know how to do even the most elementary things, and bugs are a nightmare to hunt down. Simon has been helping me lots with some of my more “I’ve really screwed this up” questions and I now have a working version of my literate owl system. I’ll post the results of this soon; there are a few tweaks that need to happen first.

Along the way, I came across a very weird problem. My script was failing totally; it always appeared to crash with a syntax error. It took several seconds to do this and, at the same time, the mouse cursor changed into a cross. I came across a thread which looked like the same thing, but in a totally different setting. The cause? Well, my script was…


#/usr/bin/env python

import re
import sys

def main():
    TheProgramHere()

The problem is on the first line; the second character should be !. Without this the script is interpreted by bash, and import is part of ImageMagick: it grabs a screenshot, hence the cross-shaped cursor. I finally worked out what was happening when I found two large files, one called “re” and one called “sys”, in the local directory. Computers can be irritating at times.
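
A check along these lines would have saved me some time. It is purely illustrative; the function name and the messages are invented.

```python
def shebang_problem(first_line):
    """Describe what is wrong with a script's first line, or return None."""
    if first_line.startswith("#!"):
        return None
    # A comment containing a path is probably a shebang missing its '!'.
    if first_line.startswith("#") and "/" in first_line:
        return "looks like a shebang with the '!' missing"
    return "no shebang line"
```

Feeding it the first line of the script above reports the missing character straight away.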