A quick walkthrough of Haskell's QuickCheck testing library, using examples to help you get started. This post assumes a basic knowledge of Haskell, but nothing too taxing.

I'm giving a talk on QuickCheck at an Equal Experts Functional Programming meetup in the coming weeks and decided to write up a variation of my talk here for future reference. Hopefully this helps a much wider group than what I can address here in London and also provides a good reference point for those that do attend the talk.

Testing Approach

QuickCheck is a specification-based testing library that feeds randomly-generated values into properties that define how the application should behave. These are universally quantified properties that should hold for all values in the domain.

Properties

Properties are essentially executable specifications that can be universally quantified over. They specify what you expect of your program and can be executed against the program to ensure that it conforms to the spec. Exhaustive testing is practically impossible for all but the smallest programs, so it verifies that the property holds for 100 randomly-generated test cases (this can be configured, of course). Over time, the random generation of test cases ensures that the property is sufficiently tested with a variety of inputs.

This is usually when skeptics raise their (justified) concerns that we can't be sure to have achieved sufficient code coverage if we haven't explicitly selected the test cases. While that's true to some extent, a fair bit of research has been done in this area (over many decades) and the results indicate that random testing is just as good, if not better, than explicit partition testing when roughly 20% more points are used (see Hamlet's Random Testing paper for more details - he also rightly questions the scientific basis of projects choosing to have 80% test coverage... but that's another issue).

So what does a property look like? Well, here's one that specifies that sorting must be idempotent:

Running this in ghci produces the following output:

You can find out more about properties in the docs .

Generators

Given that properties are executed with randomly-generated inputs, we need some way to randomly produce values. QuickCheck provides generators for standard types in Haskell, so you don't need to create your own Int generator - but you will need to create generators for your own types.

There are a bunch of generator combinators that you can use to produce your own types. For example, to produce an Int in the range [0,100] , you can simply use choose (0,100) , which randomly selects a value in the range with equal probability. Or if you want to adjust the distribution, you could use something like frequency [(3, choose (0,50)), (1, choose (51,100))] (which produces a weighed random distribution).

The above examples are intentionally quite simple to show how generator combinators can be used to produce values, but now we want to see how to use them to produce values for our own types. We'll start with a very trivial example:

Here we've created our own data type called Age that can be constructed with a single Int value. To allow QuickCheck to automatically produce random values of type Age , we must create an instance of the Arbitrary typeclass for Age . Here we define the arbitrary function to produce a value of Age with a random value between zero and one hundred (assuming this is a reasonable age range). Now we can define properties that accept values of type Age and QuickCheck will supply it with arbitrary values whenever it is executed.

Let's say we now want to define a Person data type that incorporates our Age type. Because Age is already an instance of the Arbitrary typeclass, we can simply use the arbitrary function to produce values of it for us. You'll see what I mean in the example below:

If you want to sample the values produced by your generator, you can execute sample (arbitrary :: Gen Person) in ghci and it'll print out some sample values. You'll see from the output below that the names of our generated Person type are not really suitable. I'll leave it up to you to create a more suitable generator for names.

You can find out more about generators in the docs .

Modifiers

Sometimes you want to create properties with well-defined categories of values - for example: positive or negative numbers, or non-empty lists. These aren't different types - they're just constraints to apply to the values of the type. Of course you don't want to create custom generators for these all the time. Fortunately, QuickCheck recognises that these situations are common enough for a general solution, and it provides modifiers for this purpose.

Here are two properties that specify our expectations of squares and cubes. Of course, all squares are larger than or equal to their square roots, but cubes only share the same property for all non-negative inputs (because the multiplication of three negative values always produces a negative value).

In prop_CubeOfPositive , we've used the NonNegative modifier to ensure that only zero or positive values are provided for x . Without this modifier, we would need to provide a generator that only produces non-negative values. Instead, this is done for us by QuickCheck.

You can find out more about modifiers in the docs .

QuickCheck in Action

We've just been laying the groundwork to understand the main aspects of defining properties and creating generators. Now we'll put it into practice with another example that incorporates a few more features of QuickCheck. I'll use a function sent to me by a colleague that produces a Fibonacci sequence. One of the nice things with this particular function is that it produces an infinite list - so we get to see how QuickCheck handles infinite data structures and some of the power of Haskell's laziness (for those that might be unfamiliar with Haskell).

First, execute that function in ghci and see what it does. Of course, it'll never terminate so you need to use it in a way that causes it to stop, like take 10 fibs . This should be a familiar sight... we all hopefully remember Fibonacci sequences from our school days. So now let's put QuickCheck to work testing that our function actually does produce a valid Fibonacci sequence.

You'll notice in the example below I've used the forAll function to allow explicit provision of a test case generator rather than automatically using a suitable instance of Arbitrary based on the inferred type.

So let's walk through what prop_Fibonacci actually specifies. Firstly, we refer to three consecutive values contained within the result of fibs : x , y and z . With these references in hand, we specify that the last value must equal the sum of the first two - which is precisely what a Fibonacci sequence is. Secondly, we constrain the values that we extract from fibs to ensure that our function can execute in a reasonable amount of time. Of course, this property should hold for all values of n , but we can't practically test that given the infinite nature of a fibonacci sequence and the limited computing power available to us. So instead, we limit the infinite to some reasonable finite subset and test that part instead (hence the use of forAll ). By the way, this is not that different from testing manually... no human being can test an infinite structure any more than a computer can.

Now let's consider what we would've done without QuickCheck. Typically, a few test cases would be selected and explicitly coded into the test. These tests would run repeatedly for the same values, which is actually more of a regression test than a test intended to find new defects. Depending on the approach taken by the developer, the test might select specific values and ensure they are what we expect them to be; or they may take the relative approach followed in prop_Fibonacci and express the relationship between consecutive values in the sequence. For the purposes of our explanation, it really doesn't matter. The point is that example-based tests are often as static at run-time as they are at compile-time.

Test Data Distribution

Earlier we looked at using the frequency generator combinator to produce a weighed distribution of random values according to some requirement. Test data distribution can also be affected or monitored from your properties, so we'll look at an example of that now.

Let's review the sorting property we defined earlier and retrofit it with some useful extensions. We'll start by getting a more detailed view of what values are being used to test our implementation.

The collect function allows us to label our test cases so the output displays something a bit more meaningful. If you recall, the output previously told us that the test passes for 100 cases. With this modification, it also prints out the length of the list and what percentage of test cases were executed with lists of this length. Now, we get something a bit more detailed:

This is a bit nasty... a lot more detailed than is helpful. So instead, we can classify our tests into categories. Given that we're testing a sorting algorithm, we're concerned that our test might not run against lists of sufficient length (after all, how many bugs are you likely to find when sorting an empty list or a list of only one element). So do this, we use classify instead, which accepts a predicate and a label. In our case, we've labelled all lists of length 0 or 1 as "trivial". The accompanying output is a lot more readable and also more useful.

Hopefully this has given you an idea of how to confirm that your properties are executed against meaningful inputs. There are more options available to you that I won't go into here, but take a look at things like cover , label and ==> for more details .

Shrinking

One of the difficulties with random test cases is that you do sometimes find a failing test case that is absolutely massive (or at least, larger than it needs to be). Fortunately QuickCheck has capabilities in this area to help you. It's called shrinking . When a property fails for some input, QuickCheck retries the property with a slightly smaller variation of the original input. It continues to do this until the property passes, at which point you've found the smallest possible value that causes the property to fail.

Like random data generation, shrinking is not left to chance. It is controlled by the shrink function on the Arbitrary typeclass. The default implementation returns [] , which means that the value cannot be shrunk any further. If you have a data type that can be shrunk, you'll need to provide an implementation that is appropriate for your needs. The shrink function should return a list of "all the possible immediate shrinks of the given value". Take a look at the Arbitrary instances provided by QuickCheck for an example of what shrink does.

Just as an example, here's what shrink does to a short list of integers:

You can see that not only is the list itself shrunk, but the values within it too. I'll leave you to play around with it on your own.

Further Reading

As promised, this was a very brief intro to QuickCheck... too brief in many respects. I hope it has at least helped you get started with it and explained the basics well enough. If you want to dive into this in more detail, here are a few references to get you started.

I'd strongly recommend starting off by reading .

Pedro Vasconcelos has also written an introduction to QuickCheck on FP Complete's School of Haskell .



Published

21 February 2014

Share

submit to reddit