Archive for the ‘Tech’ Category

This year, our clusters are going to be moved over to Vista, so I’ve decided to downgrade my windows box from XP to vista. It’s been an inevitable fun-filled afternoon as a result.

Tried a remote installation to save the effort of finding disks. Unfortunately, we tried an installation which booted into Windows 7, and then allowed you to install vista from there; this results in a mysterious 100M partition for use with bitlocker; vista doesn’t know about this, so mounted it as D drive and, as it’s marked as a system partition, you can’t change this. Three installations later, it was gone, and Vista is installed.

Next up, install synergy. Turns out that this is hosed because of UAC, the Vista access control. How-To Geek was very helpful, although their technique doesn’t completely work. I have some ideas, but basically, I had to turn off all UAC elevation dialogs (as synergy doesn’t work then, which rather defeats the point), and I have to start it by hand at every login. At this juncture, a hardware KVM seems an option, but it’s clunky in comparison to synergy.

Cygwin installation has been okay, except for some mysterious “Program Compatibility” dialog which tells me that I have done things wrong and offers to make my life better. Next up is the problem of getting security permissions on my files on D, which think that they are owned by another user (from my old OS). Normal Windows problem: I can’t get the permissions set up, or percolating downward, whatever I do.

(At this point, a friend popped in and said, “Why don’t you install Windows 7 instead”. Not the first to ask).

Think I now have the security permissions set, although it’s going to take about 2 hours to find out for sure, as it traverses my file system. Cygwin appears to have another strange problem where a bash window doesn’t respond to a click—if you want to move it from the front, you have to use the taskbar.

Emacs, skype, miktex all seem to have installed okay; neither webcam nor sound drivers worked in the default installation, but vista did manage to find them, so no complaints there really. I’ve also found one major advantage; when you switch the irritating desktop sounds off, windows no longer asks you whether you want to save the old scheme (yes, being the default); well worth the billions of dollars spent on vista. The machine balked after all these installs, with explorer up to 100% CPU. A restart has solved it.

Installing cygwin sshd was a bit hard; the trick is to run ssh-host-config in a cygwin.bat run as administrator. It all works fine then, except for the bit where you try to ssh in to the machine. Then you always get Connection Closed. Giving up for now.

How would I have got this far without the wondrous Gerry Tomlinson to help me out? No idea.

On the flip side, though, I was interested to see that one of my own great ideas, first expounded in my work on Generating Sewage Systems, has been taken up by the Institute of Mechanical Engineering, in a report which has even got as far as the BBC. Yep, algae reactors down the side of buildings. It’s the way forward.

I’ve generally been reasonably impressed with wordpress since I moved to it from my old, emacs-driven system. It seems to work mostly and it’s reasonably easy to manage.

One problem has been the regularity of the updates; worse, they all tend to be security updates (2.8.4 was to correct a problem where a crafted URI allowed overwrite of the admin password). So, you have to update. Often.

Fortunately, wordpress provides an automatic mechanism for achieving this. Less fortunately, it doesn’t work for me. We’ve finally pinned down why, which is too tedious to explain, but I don’t like the mechanism anyway, as I have to give wordpress my username/password (for the command line, not for wordpress).

So, I’m trying another solution. Check the whole thing out of SVN. I’ve just moved over to this mechanism for the 2.8.4 upgrade and it seems to work. This is actually the same amount of effort as a regular manual upgrade; you just svn co rather than wget/unzip. In future, it should be much easier, though. Just a simple svn switch. No fiddling with moving wp-config across, and wp-content should be unaffected. Even better, the one hack that I have had to apply to formatting.php every time should be automatically merged in, or will conflict, in which case it will be good to be warned.

I’ll post again in a few updates’ time if it all works; if this blog suddenly goes offline, well, probably this wasn’t such a good idea after all.

Following my holiday, I’ve decided to create two new categories for my blog, one for all my professional pieces and one for my personal.

This blog fulfils two main purposes. Firstly, it serves as a memory aid for myself; I can look back at the things and the ideas that I’ve had in the past. Secondly, I use it to publish these ideas. I’m aware that the former is more important than the latter; like most blogs, this site is not heavy traffic.

I do publish about my personal life here, but this is not a full disclosure blog; it’s called “an Exercise in Irrelevance” for exactly this reason. I put occasional reviews of things up; places I’ve visited or music that I’ve listened to. All about my reactions to public events. This blog isn’t meant to be a soap opera.

I also publish posts about my work here. I think, over time, these will become more important; recently, I’ve been using the blog as a lab book, but I think it will also start to become a more formal publication route.

Given this, I think it makes sense to separate the two strands, to enable the few subscribers that I have to choose whether to read about my life outside science or not. Personal, Professional or Everything, the choice is yours.

I think I now have my blogging environment as I want it. I’ve been using blogpost.py to do my posting. I couldn’t let go of my text only environment. I don’t care if it’s old fashioned, but I like the separation of editing and viewing. In this case, I’ve even had to learn asciidoc, but it was worth the effort.

Today, I think I have fiddled with blogpost.py for the last time. I can now set both categories and status (published or unpublished) from within the blogfile. I’d added a post command previously; originally, blogpost used to have a create and update command.

The big advantage with this is that all the information about the blog is apparent from the file; this means I can use a single make file to compile the lot. Any changes that I make while on the road will automatically publish to the web when I get online again. I can even put a catch-up in my backfile to make sure everything is up-to-date.

Okay, so I am sad; so sue me.

Blogs are generally seen as a slightly dubious part of the scientific publishing landscape. This is not, of course, unreasonable. I put stuff up here, for example, such as my idea for IDs that I’ve thought about for a few days, but that I am unlikely to follow any further, or opinion pieces on bees about which I have as little expertise as the average journalist.

Fundamentally, though, despite its current use, a blog is just a media channel; you can use one to transfer anything you like. A scientific paper, for instance. This might be useful. While, for instance, I love open access publication, it’s quite expensive, particularly as the cash tends to come out of my own budget, at least until I can get the library to pay.

So, I’ve been thinking about a cheap and cheerful blog-based system. It would work like this. The author would simply publish their paper onto their own blog. Next, they would send a request (using one of these pingback or trackback thingies that I haven’t worked out yet) to a “journal” which would also be a blog, in this case a private one. The editor would then invite comments from willing reviewers using the same technique. Reviewers could then read the blog post and comment on it using their own blog. After the normal revision cycle, the editor would make a decision. If it was accepted, the author’s blog post would be linked from the journal’s main feed (probably grabbing an archival copy at the same time). If it was not accepted, the author could try another journal, this time with initial reviews in hand; the process would not need to be reiterated.
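To pin the moving parts down, here is a toy python model of that workflow. It is only a sketch: the class, the function names and the idea of tracking everything by URL are my own inventions for illustration, and the pingback/trackback plumbing is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    """One paper, living on the author's own blog (hypothetical model)."""
    paper_url: str
    # Reviews are posts on the reviewers' own blogs, so we only keep links.
    review_urls: list = field(default_factory=list)
    # Name of the journal that linked the paper; empty until accepted.
    accepted_by: str = ""


def review(submission, review_url):
    """A reviewer comments on their own blog and tells the journal where."""
    submission.review_urls.append(review_url)


def decide(submission, journal, accept):
    """The editor's decision: link the paper from the journal feed, or not."""
    if accept:
        submission.accepted_by = journal
    # On rejection nothing is thrown away: the existing reviews travel with
    # the submission to whichever journal the author tries next.
    return submission
```

The point of the model is the rejection branch: because the reviews are public posts rather than private emails, a second journal starts with them already attached.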

This would have several advantages over the current system. Formatting and presentational problems would disappear because they would be controlled by the authors. Prepublication would become unnecessary, because submission and publication would become the same thing. The role of the journal would be limited to what they are best at; getting reviewers in and rubber stamping a seal of approval on worthy papers. Finally, the tireless work of reviewers would be publicly acknowledged; their own blogs would have a record of every review that they have ever done.

All the technology for this already exists; it just needs some social conventions layering on top.

Ah, it goes on and on. After my last attempt at literate OWL programming, called omnsplit, I decided that there was a problem; this version splits the OWL file into individual statements, and puts them into files with the same name as the OWL class (property, or whatever).

The problem is that, for an ontology like OBI, you get 1400 individual files; this is just inconvenient, as many applications don’t like this many files in a directory. Also, there is a naming constraint; you can only use characters legal in the file system; this doesn’t include “:” if you want to be Windows (NTFS) compliant.
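The naming constraint is easy to show in python. This is a rough check rather than a complete one: the function name is mine, and it only covers the punctuation characters that Windows refuses in file names, not reserved names or control characters.

```python
# Punctuation that Windows does not allow in a file name.
WINDOWS_ILLEGAL = set('<>:"/\\|?*')


def usable_as_filename(entity_name):
    """True if entity_name contains none of the characters listed above."""
    return not (set(entity_name) & WINDOWS_ILLEGAL)
```

So an entity called has_role is fine as a file name, but anything with a prefix, such as rdfs:label, is not.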

So, for my new system, I decided to generate an index file, which just points at locations in the ontology file. Initially, I was just going to index the main ontology file; in the end, I decided a partial copy was the way forward; generating both the index and the indexed file ensures that they will stay in sync.

It required a bit of nasty latex hacking; the basic problem was avoiding the limitation of being only able to use legal LaTeX macro characters (that is letters). The system now works like this:



%% This is generated by python which also generates the
%% function_ont.spt file which is a copy of the ontology (with a
%% few new lines gone).

%% This just defines a new macro in what appears to be an
%% unnecessarily complex way.
\expandafter\def\csname OmnEntityHeaderheader\endcsname%
{\lstinputlisting[language=omn,firstline=1,lastline=8]{function_ont.spt}}

%% But the use of \expandafter and \csname means that you can
%% use any character you like, including underscores and numbers
%% in the macro name.
\expandafter\def\csname OmnEntityObjectPropertyhas_role\endcsname%
{\lstinputlisting[language=omn,firstline=206,lastline=219]{function_ont.spt}}

%% We can now define two commands in the style file. Again
%% we use \csname so that we are not bound to characters legal
%% in latex macros.
\newcommand{\omnclass}[2]{\csname OmnEntityClass#1#2\endcsname}
\newcommand{\omnobjprop}[2]{\csname OmnEntityObjectProperty#1#2\endcsname}

%% now in our source, we can do things like this.
\omnobjprop{}{has_role}
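For what it’s worth, the python side of this can be sketched in a few lines. This is not the real script: the function name build_index, the regular expression, and the assumption that every frame starts with a line such as “ObjectProperty: has_role” and runs until the next frame are all mine, for illustration only. It also ignores the header entry.

```python
import re

# Assumed frame keywords from the Manchester OWL syntax. A frame is taken to
# start at an unindented line like "ObjectProperty: has_role" and to run
# until the line before the next frame. Real files may need more care.
FRAME = re.compile(r"^(Class|ObjectProperty|DataProperty|Individual):\s*(\S+)")


def build_index(lines, listing_file):
    """Return LaTeX macro definitions pointing at line ranges of listing_file."""
    starts = []  # (line number, frame type, entity name)
    for number, line in enumerate(lines, start=1):
        match = FRAME.match(line)
        if match:
            starts.append((number, match.group(1), match.group(2)))

    macros = []
    for i, (first, kind, name) in enumerate(starts):
        # A frame ends just before the next one starts, or at end of file.
        last = starts[i + 1][0] - 1 if i + 1 < len(starts) else len(lines)
        macros.append(
            "\\expandafter\\def\\csname OmnEntity%s%s\\endcsname%%\n"
            "{\\lstinputlisting[language=omn,firstline=%d,lastline=%d]{%s}}"
            % (kind, name, first, last, listing_file)
        )
    return macros
```

Calling build_index on the lines of the ontology copy, with "function_ont.spt" as the second argument, gives one definition per frame in the same shape as the hand-written ones above. Because the same pass over the file yields both the copy and the line numbers, the two cannot drift apart.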

Using an index in this way also has another advantage. I’ve had to make a decision whether to go with rdfs:label or the entity name. I can now back out of this; I can just use both in the index file, without too much extra space, so that either would be referencable within the latex.

To me, this feels like the right solution. It’s relatively simple (with a bit of nasty latex, which is nicely hidden), it doesn’t depend on the file system. It needs a bit more work to bring it to completion, but not that much.

Sadly bio-ontologies looms, so next week will be getting ready for that; perhaps I can finish this off on the way back. “Sadly” is perhaps a poor choice of words; I’m greatly looking forward to it, but I’ve kind of had the bit between my teeth with python and latex hacking for the last few weeks.