An Exercise in Irrelevance - Emacs Testing with Assess

Assess is my new package supporting testing in Emacs. It has grown out of my frustration with the existing framework while building the lentic package (n.d.).

For quite a while, the only testing framework in Emacs has been ERT (the Emacs Regression Testing tool), which is part of core. More recently, a number of new ones have arrived. For example, buttercup and ecukes both provide behaviour-driven testing, rather like Jasmine and Cucumber respectively. Both are worth looking at: I’ve used Ecukes for testing Cask, and it’s nicely implemented and very usable. Assess is rather less radical than these, though. It focuses on providing a general set of tools for testing, mostly in the form of macros and predicates that should be useful.

For example, a recurrent problem during the development of lentic was ensuring that no buffers were left around after a test. Particularly when a test fails, leftover buffers can lead to unexpected failures in later tests. For this purpose, I have added a macro called assess-with-preserved-buffer-list. Any buffers created inside this macro will be removed afterwards. Hence this:

(assess-with-preserved-buffer-list
  (get-buffer-create "a")
  (get-buffer-create "b")
  (get-buffer-create "c"))

This preserves the buffer state: a, b and c will be removed when the macro exits.
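
As a rough illustration of the idea, here is a minimal sketch of how such a macro could be written with nothing but core Emacs. The name my-with-preserved-buffer-list is hypothetical, and this is not necessarily how assess implements it: the sketch simply records the buffer list on entry and kills anything new on exit.

```elisp
;; Hypothetical sketch; `my-with-preserved-buffer-list' is not part of assess.
(defmacro my-with-preserved-buffer-list (&rest body)
  "Run BODY, then kill any buffers created while it ran."
  (declare (indent 0) (debug t))
  (let ((before (make-symbol "before")))
    `(let ((,before (buffer-list)))
       (unwind-protect
           (progn ,@body)
         ;; This cleanup runs on normal exit and on error alike,
         ;; so a failing test cannot leak buffers into later tests.
         (dolist (buf (buffer-list))
           (unless (memq buf ,before)
             (kill-buffer buf)))))))

(my-with-preserved-buffer-list
  (get-buffer-create "a")
  (get-buffer-create "b"))
```

The unwind-protect is the important part of the design: the cleanup has to happen even when the body signals an error, which is exactly the failing-test case described above.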

Assess also provides some handy predicates for testing. So, we can compare the contents of strings, buffers and files easily. For example:

;; Compare Two Strings
(assess= "hello" "goodbye")

;; Compare the contents of Two Buffers
(assess=
  (assess-buffer "assess.el")
  (assess-buffer "assess-previous.el"))

;; Compare the contents of Two files
(assess=
  (assess-file "~/.emacs")
  (assess-file "~/.emacs"))

Again, this has all been done carefully to avoid changing state. The last example should work whether or not ~/.emacs is already open, and will not result in a new buffer being created.
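
One way to read a file without visiting it is to insert its contents into a temporary buffer. The helper below is a hypothetical sketch of that technique using only core functions; assess-file itself may well work differently.

```elisp
;; Hypothetical helper; assess's own `assess-file' may work differently.
(defun my-file-contents (file)
  "Return the contents of FILE as a string without visiting it."
  (with-temp-buffer
    ;; `insert-file-contents' does not create a file-visiting buffer.
    (insert-file-contents file)
    (buffer-string)))

;; Demonstrate on a throwaway file so the example is self-contained.
(let ((file (make-temp-file "assess-sketch")))
  (unwind-protect
      (progn
        (write-region "hello" nil file)
        (string= (my-file-contents file) "hello"))
    (delete-file file)))
;; => t
```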

As well as string comparison, we can also check that indentation is working correctly. This example, for instance, takes an indented string, un-indents it and re-indents it according to a specific mode, then checks that nothing has changed.

(assess-roundtrip-indentation=
  'emacs-lisp-mode
  "(assess-with-find-file\n    \"~/.emacs\"\n  (buffer-string))")
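
The round trip itself is simple enough to sketch with core functions. The function my-roundtrip-indentation= below is a hypothetical illustration, not the implementation in assess, and it makes one simplifying assumption: leading whitespace is stripped from every line, including lines inside multi-line strings.

```elisp
;; Hypothetical sketch; not the implementation of
;; `assess-roundtrip-indentation='.
(defun my-roundtrip-indentation= (mode indented)
  "Return non-nil if re-indenting INDENTED under MODE reproduces it."
  (with-temp-buffer
    (insert indented)
    (funcall mode)
    ;; Un-indent: strip leading whitespace from every line.
    (goto-char (point-min))
    (while (re-search-forward "^[ \t]+" nil t)
      (replace-match ""))
    ;; Re-indent: let the mode put the indentation back.
    (indent-region (point-min) (point-max))
    (string= indented (buffer-string))))

(my-roundtrip-indentation=
 'emacs-lisp-mode
 "(defun f (x)\n  (+ x 1))")
```

The example uses a plain defun rather than assess-with-find-file so that it does not depend on assess being loaded for its indentation rules.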

Likewise, we can check fontification with the assess-face-at= function. In this case, we are checking that three words get highlighted correctly.

(assess-face-at=
 "(defun x ())\n(defmacro y ())\n(defun z ())"
 'emacs-lisp-mode
 '("defun" "defmacro" "defun")
 'font-lock-keyword-face)
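
For a sense of what a check like this involves, the sketch below fontifies a string in a temporary buffer and reads the face text property at a given word. The name my-face-at-word is hypothetical, and this is only an approximation of the idea, not the code behind assess-face-at=.

```elisp
;; Hypothetical sketch; not the implementation of `assess-face-at='.
(defun my-face-at-word (mode text word)
  "Fontify TEXT under MODE and return the face at the first WORD."
  (with-temp-buffer
    (insert text)
    (funcall mode)
    ;; Force fontification; temporary buffers are not fontified lazily.
    (font-lock-ensure)
    (goto-char (point-min))
    (when (search-forward word nil t)
      (get-text-property (match-beginning 0) 'face))))

(eq (my-face-at-word 'emacs-lisp-mode "(defun x ())" "defun")
    'font-lock-keyword-face)
```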

Finally, in terms of test functions, I have recently added two new pieces of functionality. Call capturing was an idea I stole from buttercup: it is a non-interactive equivalent to function tracing, returning the parameters and return values of each call to a function.

(assess-call-capture
  '+
  (lambda()
    (+ 1 1)))
;; => (((1 1) . 2))
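
The same kind of capture can be sketched with the standard advice mechanism. The names my-call-capture and my-add below are hypothetical, and I am assuming an :around advice as the technique purely for illustration; assess may do this differently.

```elisp
;; Hypothetical sketch using `advice-add'; not the assess implementation.
(defun my-call-capture (symbol fn)
  "Call FN and return a list of (ARGS . RETURN-VALUE) for calls to SYMBOL."
  (let* ((calls nil)
         (recorder (lambda (orig &rest args)
                     (let ((ret (apply orig args)))
                       (push (cons args ret) calls)
                       ret))))
    (unwind-protect
        (progn
          (advice-add symbol :around recorder)
          (funcall fn))
      ;; Always remove the advice, even if FN signals an error.
      (advice-remove symbol recorder))
    (nreverse calls)))

;; A throwaway function to capture calls to.
(defun my-add (a b)
  (+ a b))

(my-call-capture 'my-add (lambda () (my-add 1 1)))
;; => (((1 1) . 2))
```

The sketch captures a user-defined function rather than + because advice on primitives is not reliably seen by compiled callers.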

And discover provides a drop-in replacement for ert-run-tests-batch-and-exit, except that it automatically finds and loads test files, based on a set of heuristics. I’ve already started to use this instead of ert-runner, as it requires no configuration.

The final thing I wanted to address was better reporting. While everything described so far is test-environment agnostic, I’ve only managed to advance reporting for ERT. All of the functions that I have written plug into ERT, and so produce richer output. Without assess, a failed string comparison is reported like this:

F temp
    (ert-test-failed
     ((should
       (string= "a" "b"))
      :form
      (string= "a" "b")
      :value nil))

When comparing "a" and "b", this output is fine, but if the strings are more complex (say, for example, a short piece of code that you expect to indent in a certain way), it is hard to work out what the differences are.

So, assess extends ERT so that it now calls diff on the strings. We now get this explanation instead:

F test-assess=
    (ert-test-failed
     ((should
       (assess= "a" "b"))
      :form
      (assess= "a" "b")
      :value nil :explanation "Strings:
a
and
b
Differ at:*** /tmp/a935uPW  2016-01-20 13:25:47.373076381 +0000
--- /tmp/b9357Zc    2016-01-20 13:25:47.437076381 +0000
***************
*** 1 ****
! a
\\ No newline at end of file
--- 1 ----
! b
\\ No newline at end of file

"))

Verbose for sure, but very useful for identifying issues, especially whitespace-related ones. By default, this uses the diff command, but it is also extensible, and has a simple fallback in its absence.
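
ERT lets a predicate supply its own :explanation through the ert-explainer symbol property. The sketch below shows that mechanism with a hypothetical predicate, my-string=, whose explainer reports only the first differing index instead of running diff. It illustrates the ERT hook, and is not a copy of what assess does.

```elisp
(require 'ert)

;; Hypothetical predicate and explainer; assess's own reporting runs diff.
(defun my-string= (a b)
  "Return non-nil if strings A and B are equal."
  (string= a b))

(defun my-string=-explain (a b)
  "Explain why strings A and B differ, or return nil if they do not."
  (let ((result (compare-strings a nil nil b nil nil)))
    (unless (eq result t)
      ;; abs(result) - 1 is the number of leading characters that match.
      (format "Strings differ at index %d" (1- (abs result))))))

;; ERT looks up this property when a `should' form fails.
(put 'my-string= 'ert-explainer 'my-string=-explain)

(ert-deftest my-string=-sketch ()
  (should (my-string= "abc" "abc")))
```

With this in place, a failing (should (my-string= "abc" "abd")) would carry an :explanation of "Strings differ at index 2" alongside the usual :form and :value.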

There is more functionality to assess than that shown here: it can create multiple temporary buffers in a single scope; it can create “related” temporary files and prevent conflicts if files are already open; it can re-indent the contents of files rather than buffers, and so on. I think assess is a nice addition to the testing capabilities of Emacs.

More is needed, and some of this involves changes to Emacs core, to deal with packages which are noisy, enforce interactivity, and so forth. I would also like to add support for “robotized” tests running keyboard macros, tests for checking hooks, output to the message buffer, and testing for asynchronous callbacks. But assess is quick, easy to use and makes testing of many features of Emacs much easier. I’ve started to use it in my own packages, and will eventually use it in all of them.

My plans for the future are to move assess to ELPA, and then eventually to Emacs core after 25.1, probably as ert-assess. I hope that, along with my restructuring of the Emacs unit test files, this will make testing of Emacs core simpler and more straightforward; if it does, this should make the merge hassles (mostly to other people, sadly) caused by file moves worthwhile. Emacs will be easier to develop, and simpler to change.

Feedback is, as always, welcome.

n.d. https://www.russet.org.uk/blog/3071.