Systematic concurrency testing meets Haskell.
Go to file
2017-04-08 05:42:25 +01:00
concurrency Use stylish-haskell to format import lists. 2017-04-08 05:42:25 +01:00
dejafu Use stylish-haskell to format import lists. 2017-04-08 05:42:25 +01:00
dejafu-tests Add a basic test for uninterruptible masks. 2017-03-04 05:36:32 +00:00
hunit-dejafu Use stylish-haskell to format import lists. 2017-04-08 05:42:25 +01:00
tasty-dejafu Use stylish-haskell to format import lists. 2017-04-08 05:42:25 +01:00
.gitignore Add note about performance profiling to README. 2017-03-20 19:27:42 +00:00
.hlint.yaml Add an HLint2 config file & fix warnings. 2017-04-08 05:42:25 +01:00
.stylish-haskell.yaml Use stylish-haskell to format import lists. 2017-04-08 05:42:25 +01:00
.travis.yml Add Stackage LTS-8 to the travis config. 2017-04-05 21:06:25 +01:00
CHANGELOG.markdown Add a repo-wide changelog. 2017-04-07 20:22:22 +01:00
README.markdown Add a repo-wide changelog. 2017-04-07 20:22:22 +01:00
stack.yaml Use the https clone path for test-framework-hunit. 2017-03-20 18:54:24 +00:00
test-build.sh Remove async-dejafu from travis. 2017-03-03 22:19:42 +00:00

dejafu Build Status

[Déjà Fu is] A martial art in which the user's limbs move in time as well as space, […] It is best described as "the feeling that you have been kicked in the head this way before"

-- Terry Pratchett, Thief of Time

Have you ever written a concurrent Haskell program and then, Heaven forbid, wanted to test it? Testing concurrency is normally a hard problem, because of the nondeterminism of scheduling: you can run your program ten times and get ten different results if you're unlucky.

There is a solution! Through these libraries, you can write concurrent programs and test them deterministically. By abstracting out the actual implementation of concurrency through a typeclass, an alternative implementation can be used for testing, allowing the systematic exploration of the possible results of your program.

Table of Contents

Packages

Version Intended Users Summary
concurrency [docs] [hackage] 1.1.2.0 Authors Typeclasses, functions, and data types for concurrency and STM.
dejafu [docs] [hackage] 0.5.1.3 Testers Systematic testing for Haskell concurrency.
hunit-dejafu [docs] [hackage] 0.4.0.1 Testers Deja Fu support for the HUnit test framework.
tasty-dejafu [docs] [hackage] 0.4.0.0 Testers Deja Fu support for the Tasty test framework.

Each package has its own README and CHANGELOG in its subdirectory.

There is also dejafu-tests, the test suite for dejafu. This is in a separate package due to Cabal being bad with test suite transitive dependencies.

Concurrent Programming

You should read Parallel and Concurrent Programming in Haskell, by Simon Marlow. It's very good, and the API of the concurrency package is intentionally kept very similar to the async, base, and stm packages, so all the knowledge transfers.

Asynchronous IO

The wonderful async package by Simon Marlow greatly eases the difficulty of writing programs which merely need to perform some asynchronous IO. The concurrency package includes an almost-total reimplementation of async.

For example, assuming a suitable getURL function, to fetch the contents of two web pages at the same time:

withAsync (getURL url1) $ \a1 -> do
  withAsync (getURL url2) $ \a2 -> do
    page1 <- wait a1
    page2 <- wait a2
    -- ...

The withAsync function starts an operation in a separate thread, and kills it if the inner action finishes before it completes.

Another example, this time waiting for any of a number of web pages to download, and cancelling the others as soon as one completes:

let download url = do
      res <- getURL url
      pure (url, res)

downloads <- mapM (async . download) urls
(url, res) <- waitAnyCancel downloads
printf "%s was first (%d bytes)\n" url (B.length res)

The async function starts an operation in another thread but, unlike withAsync takes no inner action to execute: the programmer needs to make sure the computation is waited for or cancelled as appropriate.

Threads and MVars

The fundamental unit of concurrency is the thread, and the most basic communication mechanism is the MVar:

main = do
  var <- newEmptyMVar
  fork $ putMVar var 'x'
  fork $ putMVar var 'y'
  r <- takeMVar m
  print r

The fork function starts executing a MonadConc action in a separate thread, and takeMVar/putMVar are used to communicate values (newEmptyMVar just makes an MVar with nothing in it). This will either print 'x' or 'y', depending on which of the two threads "wins".

On top of the simple MVar, we can build more complicated concurrent data structures, like channels. A collection of these are provided in the concurrency package.

If a thread attempts to read from an MVar which is never written to, or write to an MVar which is never read from, it blocks.

Software Transactional Memory

Software transactional memory (STM) simplifies stateful concurrent programming by allowing complex atomic state operations. Whereas only one MVar can be modified atomically at a time, any number of TVars can be. STM is normally provided by the stm package, but the concurrency package exposes it directly.

For example, we can swap the values of two variables, and read them in another thread:

main = do
  var1 <- newTVar 'x'
  var2 <- newTVar 'y'
  fork . atomically $ do
    a <- readTVar var1
    b <- readTVar var2
    writeTVar var2 a
    writeTVar var1 b
  a <- atomically $ readTVar var1
  b <- atomically $ readTVar var2
  print (a, b)

Even though the reads and writes appear to be done in multiple steps inside the forked thread, the entire transaction is executed in a single step, by the atomically function. This means that the main thread will observe the values ('x', 'y') or ('y', 'x'), it can never get ('x', 'x') as naive MVar implementation would.

Relaxed Memory and CRefs

There is a third type of communication primitive, the CRef (known in normal Haskell as the IORef). These do not impose synchronisation, and so the behaviour of concurrent reads and writes depends on the memory model of the underlying processor.

crefs = do
  r1 <- newCRef False
  r2 <- newCRef False
  x <- spawn $ writeCRef r1 True >> readCRef r2
  y <- spawn $ writeCRef r2 True >> readCRef r1
  (,) <$> readMVar x <*> readMVar y

Here spawn forks a thread and gives an MVar which can be read from to get the return value. Under a sequentially consistent memory model, there are three possible results: (True, True), (True, False), and (False, True). Under the relaxed memory model of modern processors, the result (False, False) is also possible. Relaxed memory models allow for reads and writes to be re-ordered between threads.

For testing, three memory models are supported (with the default being TSO):

  • Sequential Consistency: A program behaves as a simple interleaving of the actions in different threads. When a CRef is written to, that write is immediately visible to all threads.

  • Total Store Order (TSO): Each thread has a write buffer. A thread sees its writes immediately, but other threads will only see writes when they are committed, which may happen later. Writes are committed in the same order that they are created.

  • Partial Store Order (PSO): Each CRef has a write buffer. A thread sees its writes immediately, but other threads will only see writes when they are committed, which may happen later. Writes to different CRefs are not necessarily committed in the same order that they are created.

Program Testing

If you just write your concurrent program using the MonadConc and MonadSTM typeclasses (maybe with MonadIO if you need IO as well), then it is testable with dejafu!

Testing is similar to unit testing, the programmer produces a self-contained monadic action to execute. It is run under many schedules, and the results gathered into a list. This is a little different from normal unit testing, where you just return "true" or "false", here you have many results, and need to decide if the collection is valid or not.

For the simple cases, where you just want to check something is deterministic and doesn't deadlock, there is a handy autocheck function. For example:

example = do
  a <- newEmptyMVar
  b <- newEmptyMVar

  c <- newMVar 0

  let lock m = putMVar m ()
  let unlock = takeMVar

  j1 <- spawn $ lock a >> lock b >> modifyMVar_ c (return . succ) >> unlock b >> unlock a
  j2 <- spawn $ lock b >> lock a >> modifyMVar_ c (return . pred) >> unlock a >> unlock b

  takeMVar j1
  takeMVar j2

  takeMVar c

The correct result is 0, as it starts out as 0 and is incremented and decremented by threads 1 and 2, respectively. However, note the order of acquisition of the locks in the two threads. If thread 2 pre-empts thread 1 between the acquisition of the locks (or if thread 1 pre-empts thread 2), a deadlock situation will arise, as thread 1 will have lock a and be waiting on b, and thread 2 will have b and be waiting on a.

Here is what autocheck has to say about it:

> autocheck example
[fail] Never Deadlocks (checked: 5)
        [deadlock] S0------------S1-P2--S1-
[pass] No Exceptions (checked: 12)
[fail] Consistent Result (checked: 11)
        0 S0------------S2-----------------S1-----------------S0----

        [deadlock] S0------------S1-P2--S1-
False

It identifies the deadlock, and also the possible results the computation can produce, and displays a simplified trace leading to each failing outcome. The traces contain thread numbers, which the programmer can give a thread a name when forking. It also returns false as there are test failures.

Note that if your test case does IO, the IO will be executed a lot of times. It needs to be deterministic enough to not invalidate the results of testing. That may seem a burden, but it's a requirement of any form of testing.

Porting

As a general rule of thumb, to convert some existing code to work with dejafu:

  • Depend on "concurrency".
  • Import Control.Concurrent.Classy.* instead of Control.Concurrent.*
  • Change IO a to MonadConc m => m a
  • Change STM a to MonadSTM stm => stm a
  • Parameterise all the types by the monad: MVar -> MVar m, TVar -> TVar stm, IORef -> CRef m, etc
  • Fix the type errors.

Contributing

Bug reports, pull requests, and comments are very welcome!

The general idea (which I'm trying out as of Feb 2017) is that master should always be at most a minor version ahead of what is released on hackage, there shouldn't be any backwards-incompatible changes. Backwards-incompatible changes go on the next-major branch. This is to make it feasible to fix bugs without also introducing breaking changes, even if work on the next major version has already begun.

Feel free to contact me on GitHub, through IRC (#haskell on freenode), or email (mike@barrucadu.co.uk).

Code Coverage

hpc can generate a coverage report from the execution of dejafu-tests:

$ stack build --coverage
$ stack exec dejafu-tests
$ stack hpc report --all dejafu-tests.tix

This will print some stats and generate an HTML coverage report:

Generating combined report
 52% expressions used (4052/7693)
 48% boolean coverage (63/129)
      43% guards (46/106), 31 always True, 9 always False, 20 unevaluated
      68% 'if' conditions (11/16), 2 always True, 3 unevaluated
      85% qualifiers (6/7), 1 unevaluated
 61% alternatives used (392/635)
 80% local declarations used (210/261)
 26% top-level declarations used (280/1063)
The combined report is available at /home/barrucadu/projects/dejafu/.stack-work/install/x86_64-linux/nightly-2016-06-20/8.0.1/hpc/combined/custom/hpc_index.html

The highlighted code in the HTML report emphasises branch coverage:

  • Red means a branch was evaluated as always false.
  • Green means a branch was evaluated as always true.
  • Yellow means an expression was never evaluated.

See also the stack coverage documentation.

Performance

GHC can generate performance statistics from the execution of dejafu-tests:

$ stack build --profile
$ stack exec  -- dejafu-tests +RTS -p
$ less dejafu-tests.prof

This prints a detailed breakdown of where memory and time are being spent:

    Mon Mar 20 19:26 2017 Time and Allocation Profiling Report  (Final)

       dejafu-tests +RTS -p -RTS

    total time  =      105.94 secs   (105938 ticks @ 1000 us, 1 processor)
    total alloc = 46,641,766,952 bytes  (excludes profiling overheads)

COST CENTRE                           MODULE                     %time %alloc

findBacktrackSteps.doBacktrack.idxs'  Test.DejaFu.SCT.Internal    21.9   12.0
==                                    Test.DejaFu.Common          12.4    0.0
yieldCount.go                         Test.DejaFu.SCT             12.1    0.0
dependent'                            Test.DejaFu.SCT              5.1    0.0
runThreads.go                         Test.DejaFu.Conc.Internal    2.7    4.1
[...]

dejafu-tests is a good target for profiling, as it is a fairly representative use: a testsuite where results will be quickly summarised and printed. It may not be so useful for judging performance of programs which keep the test results around for a long time.

Release Notes / Change Logs

See the CHANGELOG.markdown file.

Bibliography

These libraries wouldn't be possible without prior research, which I mention in the documentation. Haddock comments get the full citation, whereas in-line comments just get the shortened name:

There are also a couple of papers on dejafu itself: