Archive for the ‘Professional’ Category

Oh dear, if it seems that we have been here before, it’s because we have. Another Ubuntu upgrade, another broken Marble Mouse.

Took me a while to work out this one, but the answer is hidden in a bug report for Red Hat. Actually, if I had read my last blog post I might have worked it out also.

What happens is that Wayland, the new, er, well, whatever it is, for 17.10 looks at the Marble Mouse, says “it has no scroll wheel”, and so disables the scroll method. Which is unfortunate because then the emulation doesn’t work.

The solution is to turn it on again:

xinput --set-prop "Logitech USB Trackball" "libinput Scroll Method Enabled" 0 0 1
xinput --set-prop "Logitech USB Trackball" "libinput Button Scrolling Button" 8
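
A minimal sketch of wrapping these two commands so they can be reapplied from a startup script. The helper name enable_button_scroll is made up for illustration, and the presence check (grepping the output of xinput --list for the device name) is an assumption about what is good enough here, not something xinput itself provides.

```shell
#!/bin/sh
# Sketch: re-enable button scrolling for a named pointer device.
# enable_button_scroll is a hypothetical helper name, not part of xinput.
enable_button_scroll() {
    device=$1   # device name as shown by `xinput --list`
    button=$2   # button to hold down while scrolling

    # Report failure and stop if the device is not listed.
    if ! xinput --list | grep -Fq "$device"; then
        echo "device not found: $device" >&2
        return 1
    fi

    # The same two commands as above: select the "button" scroll method,
    # then choose which button triggers it.
    xinput --set-prop "$device" "libinput Scroll Method Enabled" 0 0 1 &&
    xinput --set-prop "$device" "libinput Button Scrolling Button" "$button"
}

# Example call, e.g. from a session startup script:
# enable_button_scroll "Logitech USB Trackball" 8
```

Using the device name rather than its number keeps the call the same on machines where the numbering differs.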

Dearie me.

Update

Worked. Was happy. Now it’s stopped working. Less happy.


Abstract

The process of building ontologies is a difficult task that involves collaboration between ontology developers and domain experts and requires an ongoing interaction between them. This collaboration is made more difficult because the two groups tend to use different tool sets, which can hamper this interaction. In this paper, we propose to decrease this distance between domain experts and ontology developers by creating more readable forms of ontologies, and further to enable editing in normal office environments. Building on a programmatic ontology development environment, such as Tawny-OWL, we are now able to generate these readable/editable forms from the raw ontological source and its embedded comments. We have implemented this translation to HTML for reading; this environment provides rich hyperlinking as well as active features such as hiding the source code in favour of comments. We are now working on translation to a Word document that also enables editing. Taken together, this should provide a significant new route for collaboration between the ontologist and domain specialist.

  • Aisha Blfgeh
  • Phillip Lord


Plain English Summary

Ontologies are a mechanism for organising data, so that it can be generated, searched and retrieved accurately. They do this by building a computational model of an area of knowledge or domain.

But, building ontologies is a challenge for a number of reasons. One of the main problems is that building an ontology requires two skill sets: the use and manipulation of a complex formalism, which tends to be the job of an ontologist; and, the deep understanding of the area that is being modelled, which is the job of a domain specialist. It is fairly rare to find a person who understands both areas; so people have to collaborate.

In this paper, we describe a new mechanism to enable this collaboration; rather than trying to train domain specialists to build ontologies or use ontology tooling, we instead manipulate an ontology so that it can be viewed as an office doc, which ultimately is the tool that most people are familiar with.

A chicken is an egg’s way of making another egg

One of the joys of Ontology building is that you can end up in some fairly obscure arguments; the one I got in today is whether a sperm is a human being. Of course, this is silly, but mostly because of the limitation of our language. I would like to describe here why a sperm is a human individual and why it is important.

One of the long running discussions in the ontology community is how we define function. With respect to biological organisms and biological function this is particularly challenging; in fact, biology continually raises questions and exceptions which is part of the fun.

I added my contribution to definitions of function several years ago (1309.5984), built largely around evolution and, more importantly, homology.

One of the issues with the other definitions available at the time, and specifically with BFO, is the use of a definition as follows:

A biological function is a function which inheres in an independent continuant that is i) part of an organism[…]

The point, here, is that by definition an organism cannot have a function because an organism cannot be part of an organism. This works well for people, but badly for some organisms, especially eusocial ones like ants which appear to have functions in their society which they have evolved to fulfil. My argument here is that it also means that a sperm cannot have a function, because, actually, a sperm is an organism. Of course, this seems daft; a sperm is, surely, part of an organism in the same way that a blood cell is. However, this is not true.

All organisms have a genome — their genetic material. Many organisms have a single copy of their genome; for single celled organisms, this gets doubled before they divide, so actually they have two copies of their genome much of the time, but these two copies are identical. These organisms are called haploid.

However, as you might expect, sexual reproduction makes things more complex. This involves taking two previously independent organisms, merging them, then dividing again. The merged organism has, now, two different copies of its genome; such organisms are called diploid.

Once this happens, the life cycle of an organism gets more complex. Some organisms, such as the yeast Schizosaccharomyces pombe, really dislike being diploid. It’s possible to maintain them in the lab, but generally given the choice they sporulate and become haploid again. However, others such as brewer’s yeast (Saccharomyces cerevisiae) behave differently. It grows, develops and lives in the haploid form; but it also does this in the diploid form and is quite happy.

Many plants do this also and exist in both a multicellular haploid form (called the gametophyte) and a multicellular diploid form called the sporophyte. In the flowering plants, the gametophyte is very small, and exists entirely within the sporophyte stage; in other plants, the gametophyte is larger. But both forms can grow and develop, a process called alternation of generations.

As far as I know, no animals do this. However, there are quite a few organisms where both a haploid and a diploid form exist; the male ant that I referred to earlier will be haploid, while the females are diploid. This doesn’t disadvantage the male; it simply produces sperm which are genetic clones of itself.

In humans, like the flowering plants, the diploid form is dominant. There are two haploid forms, the egg and sperm, both single cells; the female form exists entirely within the diploid from which it arose; the sperm can travel a bit further but not much.

Of course, in most practical circumstances, the sperm would appear to be a part of the man that produced them; if I was building a medical ontology, I would make this statement, because it would fulfil everyone’s intuition, and common medical and legal practice.

But, there is no real justification for this. It exists, it is independent from that man, has a different genome from that man; it is an organism in the same way that a gametophyte or a male ant is an independent organism. For a biological ontology, working cross-species, we have no basis for making this distinction; if the sperm has a function of fertilizing an egg, then the man has the function of producing more sperm. Alternatively, if a man cannot have a function, neither can a sperm.

Does this mean that sperm is a human being? Obviously this would be silly, nor is a sperm a person; but it is an organism and it is human. We just lack a word to describe this.

This discussion came up at ICBO 2017, following a discussion with Barry Smith.



Abstract

As the quantity of data being deposited into biological databases continues to increase, it becomes ever more vital to develop methods that enable us to understand this data and ensure that the knowledge is correct. It is widely held that data percolates between different databases, which causes particular concerns for data correctness; if this percolation occurs, incorrect data in one database may eventually affect many others while, conversely, corrections in one database may fail to percolate to others. In this paper, we test this widely held belief by directly looking for sentence reuse both within and between databases. Further, we investigate patterns of how sentences are reused over time. Finally, we consider the limitations of this form of analysis and the implications that this may have for bioinformatics database design. We show that reuse of annotation is common within many different databases, and that there is also a detectable level of reuse between databases. In addition, we show that there are patterns of reuse that have previously been shown to be associated with percolation errors.

  • Michael J Bell
  • Phillip Lord


Plain English Summary

Bioinformaticians store large amounts of data about proteins in their databases, which we call annotation. This annotation is often repetitive; this happens because a database might store information about proteins from different organisms and these organisms have very similar proteins. Additionally, there are many databases which store different but related information and these often have repetitive information.

We have previously looked at this repetitiveness within one database, and shown that it can lead to problems where one copy will be updated but another will not. We can detect this by looking for certain patterns of reuse.

In this paper, we explicitly study the repetition between databases; in some cases, databases are extremely repetitive, containing less than 1% of original sentences. Moreover, we can detect text that is shared between databases and find the same patterns in these that we previously used to detect errors.

This paper opens up new possibilities using bulk data analysis to help improve the quality of knowledge in these databases.
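
To give a flavour of what looking for shared sentences can mean at its very simplest, here is a sketch with standard Unix tools. It is not the analysis used in the paper. It assumes each file holds one annotation sentence per line, and the names shared_sentences and unique_percent are made up for illustration.

```shell
#!/bin/sh
# shared_sentences A B: print each sentence (one per line) found in both files.
# Hypothetical helper; assumes exact, whole-line matches are what we want.
shared_sentences() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    sort -u "$1" > "$tmp_a"      # distinct sentences of the first file
    sort -u "$2" > "$tmp_b"      # distinct sentences of the second file
    comm -12 "$tmp_a" "$tmp_b"   # keep only lines common to both
    rm -f "$tmp_a" "$tmp_b"
}

# unique_percent FILE: percentage of lines in FILE that are distinct
# (integer division, so the result is rounded down).
unique_percent() {
    total=$(( $(wc -l < "$1") ))
    if [ "$total" -eq 0 ]; then
        echo 0
        return
    fi
    distinct=$(( $(sort -u "$1" | wc -l) ))
    echo $(( 100 * distinct / total ))
}
```

Exact matching like this would miss sentences that differ only in punctuation or whitespace, so it can only under-count reuse.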

Years ago, after problems with my wrist, I moved to using a trackball whenever I can. Good move it was too, but I am left with one pain. I use a Logitech Marble Mouse and it has no scroll wheel; this is sad because I have loved scroll wheels since they came out. So, instead, I use scroll wheel emulation: you hold down a button and trackball moves are interpreted as scroll events.

Now, this leaves me with one remaining pain. For no readily apparent reason, the method for configuring it has moved from one place to another, normally every couple of releases. At one point, it was in xorg.conf, then in HAL, then, for a joyful period, in the gpointing-device-settings GUI, which then broke and disappeared, and I ended up with xinput run from a shell script.

Having just upgraded to Ubuntu 17.04, guess what? Broken again.

I have been using this:

xinput set-button-map "Logitech USB Trackball" 1 2 3 4 5 6 7 8 9
xinput set-int-prop "Logitech USB Trackball" "Evdev Wheel Emulation Button" 8 8
xinput set-int-prop "Logitech USB Trackball" "Evdev Wheel Emulation" 8 1
xinput set-int-prop "Logitech USB Trackball" "Evdev Wheel Emulation Axes" 8 6 7 4 5
xinput set-int-prop "Logitech USB Trackball" "Evdev Wheel Emulation X Axis" 8 6
xinput set-int-prop "Logitech USB Trackball" "Evdev Drag Lock Buttons" 8 9

Well from a bug report here:

https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/1682193/comments/7

It turns out that the reason this no longer works is that Evdev is not used anymore, thanks to the move to Wayland; now I need to use libinput. Unfortunate, since these “big” Linux issues (Unity, Wayland, systemd) are something that I try very hard not to have to care about at all.

I tried lots of dead reckoning with libinput but got all sorts of errors, or just non-functioning. Eventually, I worked out the right way forward this way.

xinput --list
⎡ Virtual core pointer                        id=2    [master pointer  (3)]
⎜   ↳ Virtual core XTEST pointer                    id=4    [slave  pointer  (2)]
⎜   ↳ HID 1267:0103                                 id=11   [slave  pointer  (2)]
⎜   ↳ Logitech USB Trackball                        id=12   [slave  pointer  (2)]

Which gives the device number (12). Then

xinput --list-props 12
Device 'Logitech USB Trackball':
        Device Enabled (136):   1
        libinput Scroll Methods Available (284):        0, 0, 1
        libinput Scroll Method Enabled (285):   0, 0, 1
        libinput Scroll Method Enabled Default (286):   0, 0, 1
        libinput Button Scrolling Button (287): 2
        libinput Button Scrolling Button Default (288): 2

This is mostly correct: the last 1 on Scroll Method Enabled means “button”. But the 2 on Button Scrolling Button is not so good. It needs to be 8.
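
Reading a value off this listing can also be scripted. The following is a sketch only: prop_value is a hypothetical helper that reads text in the format shown above on standard input, and it assumes the property name is followed by a number in parentheses and then a colon, as in this output.

```shell
#!/bin/sh
# prop_value NAME: read `xinput --list-props` output on stdin and print the
# value of the property called NAME. Hypothetical helper, not part of xinput.
prop_value() {
    awk -v name="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop leading indentation
            prefix = name " ("                  # name followed by its number
            if (index(line, prefix) == 1) {     # exact name, not "... Default"
                sub(/^[^:]*:[ \t]*/, "", line)  # keep what follows the colon
                print line
                exit
            }
        }'
}

# Example use:
# xinput --list-props "Logitech USB Trackball" \
#     | prop_value "libinput Button Scrolling Button"
```

Matching on the name plus the opening parenthesis is what stops "Button Scrolling Button" from also matching the "Button Scrolling Button Default" line.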

Next up, set-int-prop is deprecated, so let’s not use that. So I tried this instead:

xinput set-prop 12 287 8

The 12 is the device number, 287 is the property number (found from --list-props above), and the 8 is the correct button. But this is ugly and incomprehensible; moreover, I do not know if the numbers (12 and 287) will remain the same across all my computers. So, let’s use names instead.

xinput --set-prop "Logitech USB Trackball" "libinput Button Scrolling Button" 8

Which leaves me with these properties:

xinput --list-props 12
Device 'Logitech USB Trackball':
        Device Enabled (136):   1
        libinput Scroll Methods Available (284):        0, 0, 1
        libinput Scroll Method Enabled (285):   0, 0, 1
        libinput Scroll Method Enabled Default (286):   0, 0, 1
        libinput Button Scrolling Button (287): 8
        libinput Button Scrolling Button Default (288): 2

In the end, the configuration is simple. Now, please, please devs, don’t break it again!

Update

The original post was wrong but settings from earlier experiments were persisting when I thought they were not.

I have written about assess previously (http://www.russet.org.uk/blog/3135); it is a tool which provides predicates, macros and functions to support testing for Emacs. It is actually agnostic to the test environment, although it has specialised support for ERT.

My new release of assess (v0.3.2) includes one significant change, and two new features. I have updated the call capture functionality; the first version stored all the call data in a global variable, which was quick and easy, but clearly not a long-term solution. It now uses closures instead, which means that several functions can be captured at once. This also allows the first new feature, which is the ability to capture calls to hooks, with the function assess-call-capture-hook, which takes a hook and a lambda, and returns any calls to the hook when the lambda is evaluated. As an example usage, from assess-call-tests.el:

(should
 (equal
  '(nil nil)
  (assess-call-capture-hook
   'assess-call-test-hook
   (lambda ()
     (run-hooks 'assess-call-test-hook)
     (run-hooks 'assess-call-test-hook)))))

This is good functionality and should be very useful. The API could be improved a bit; a macro version would avoid the explicit lambda, for example. And returning a list of nil means this function also works with hooks with args, but is a bit ugly for hooks without (which are the majority).

The second area that I wanted to address has come about because of my hacking into the Emacs undo system. This is hard to test automatically; I have often found myself writing things like this test from simple-test.el.

(should
 (with-temp-buffer
   (setq buffer-undo-list nil)
   (insert "hello")
   (member (current-buffer) undo-auto--undoably-changed-buffers)))

This is okay, but it’s painful to write; I am trying to robotize Emacs, and it’s not easy. Sometimes it’s hard to work out exactly what set of functions you need to call. It would be much easier just to type a key sequence and have Emacs run this for you.

Fortunately, Emacs has special support for this in the form of keyboard macros; you can remember, store and save any set of keypresses and run them, rerun them, automate them or, most importantly, save them to a file as a lisp expression. This example, for instance, comes from viper-test.el.

(kmacro-call-macro nil nil nil
                   [left
                    ;; Delete "c"
                    backspace
                    left left left
                    ;; Delete "a"
                    backspace
                    ;; C-/ or undo
                    67108911])

This is okay, but it’s still not ideal. I have had to add comments to make the test clear by hand. It’s not easy to read, and what is that 67108911 about? It comes from somewhere in Emacs and is stable into the future. But, you only have my word for it that this is undo. It would be all too easy to get this wrong, to have the wrong comment. Tests need to be readable.
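
For what it is worth, the number does decompose neatly. As the Emacs Lisp manual describes character modifier bits, the Control modifier on a key like / is stored as 2**26, and / itself is character 47. The sketch below only does that arithmetic, in shell rather than Lisp; emacs_ctrl_code is a hypothetical helper name, not anything Emacs provides.

```shell
#!/bin/sh
# emacs_ctrl_code CHAR: print the integer Emacs would use for C-CHAR when
# the Control modifier is kept as a separate bit (2**26 = 67108864).
# Hypothetical helper for illustration only.
emacs_ctrl_code() {
    char_code=$(printf '%d' "'$1")   # numeric code of the character, 47 for /
    ctrl_bit=$(( 1 << 26 ))          # the Control modifier bit
    echo $(( ctrl_bit + char_code ))
}

emacs_ctrl_code /   # prints 67108911
```

This still leaves the point standing: a reader should not have to do bit arithmetic to find out that a test presses undo.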

Fortunately, Emacs provides a nice solution, in the form of edmacro — this is a major-mode for editing macros after they have been created. It also defines a human readable version of a macro. We can parse this and then execute it directly. This example comes from simple-test.el.

(execute-kbd-macro
  (read-kbd-macro
        "
C-c C-o                 ;; latex-insert-block
RET                     ;; newline
C-/                     ;; undo
"))

The advantage of this is that I didn’t actually write this string; I recorded the macro, then edited it and copied the contents of the edmacro buffer.

This is still not easy enough, though; I want an easier way of editing the macro as it appears in the test. This is, unfortunately, difficult as edit-kbd-macro is not easy to invoke programmatically; it absolutely hard-codes user interaction (I did even try invoking edit-kbd-macro using a keyboard macro!). So, I have given up with that approach in the short term. Instead, I have written a function assess-robot-execute-kmacro that combines read-kbd-macro and execute-kbd-macro, but also sets the macro as the last macro, making it easy to edit. I’ve also added a keybinding to edmacro to copy the macro to the kill-ring. And here is a test using it:

(should
 (assess=
   "hello"
   (assess-robot-with-switched-buffer-string
     (assess-robot-execute-kmacro
"
hello                   ;; self-insert-command * 5
"))))

This also demonstrates the final quirk. Keyboard macros work in whichever buffer is selected, not the one which is current. We cannot use with-temp-buffer to select one temporarily and run the macro in it. So I have added macros to display a buffer temporarily instead.

As with many parts of assess, the back end is quite convoluted and complex, as many parts of Emacs were not written with testing in mind (that is, they predate the whole idea of unit testing by many years!). But, I hope that the API that assess provides is simple, clear and flexible.

Assess is available at GitHub and MELPA.

Feedback is, as always, welcome.

Bibliography