# A brief intro to QuickCheck

*Functional Programming, Haskell, and Testing*

A quick walkthrough of Haskell's QuickCheck testing library, using examples to help you get started. This post assumes a basic knowledge of Haskell, but nothing too taxing.

I'm giving a talk on QuickCheck at an Equal Experts Functional Programming meetup in the coming weeks and decided to write up a variation of my talk here for future reference. Hopefully this helps a much wider group than what I can address here in London and also provides a good reference point for those that do attend the talk.

## Testing Approach

QuickCheck is a specification-based testing library that feeds randomly-generated values into properties that define how the application should behave. These are universally quantified properties that should hold for all values in the domain.

## Properties

Properties are essentially executable specifications that can be universally quantified over. They specify what you expect of your program and can be executed against the program to ensure that it conforms to the spec. Exhaustive testing is practically impossible for all but the smallest programs, so it verifies that the property holds for 100 randomly-generated test cases (this can be configured, of course). Over time, the random generation of test cases ensures that the property is sufficiently tested with a variety of inputs.

This is usually when skeptics raise their (justified) concerns that we can't be sure to have achieved sufficient code coverage if we haven't explicitly selected the test cases. While that's true to some extent, a fair bit of research has been done in this area (over many decades) and the results indicate that random testing is just as good, if not better, than explicit partition testing when roughly 20% more points are used (see Hamlet's
*
Random Testing
*
paper for more details - he also rightly questions the scientific basis of projects choosing to have 80% test coverage... but that's another issue).

So what does a property look like? Well, here's one that specifies that sorting must be idempotent:

Running this in
`
ghci
`
produces the following output:

You can find out more about properties in the docs .

## Generators

Given that properties are executed with randomly-generated inputs, we need some way to randomly produce values. QuickCheck provides generators for standard types in Haskell, so you don't need to create your own
```
Int
```

generator - but you will need to create generators for your own types.

There are a bunch of generator combinators that you can use to produce your own types. For example, to produce an
```
Int
```

in the range
```
[0,100]
```

, you can simply use
```
choose (0,100)
```

, which randomly selects a value in the range with equal probability. Or if you want to adjust the distribution, you could use something like
```
frequency [(3, choose (0,50)), (1, choose (51,100))]
```

(which produces a weighed random distribution).

The above examples are intentionally quite simple to show how generator combinators can be used to produce values, but now we want to see how to use them to produce values for our own types. We'll start with a very trivial example:

Here we've created our own data type called
```
Age
```

that can be constructed with a single
```
Int
```

value. To allow QuickCheck to automatically produce random values of type
```
Age
```

, we must create an instance of the
```
Arbitrary
```

typeclass for
```
Age
```

. Here we define the
```
arbitrary
```

function to produce a value of
```
Age
```

with a random value between zero and one hundred (assuming this is a reasonable age range). Now we can define properties that accept values of type
```
Age
```

and QuickCheck will supply it with
```
arbitrary
```

values whenever it is executed.

Let's say we now want to define a
```
Person
```

data type that incorporates our
```
Age
```

type. Because
```
Age
```

is already an instance of the
```
Arbitrary
```

typeclass, we can simply use the
```
arbitrary
```

function to produce values of it for us. You'll see what I mean in the example below:

If you want to sample the values produced by your generator, you can execute
```
sample (arbitrary :: Gen Person)
```

in
`
ghci
`
and it'll print out some sample values. You'll see from the output below that the names of our generated
```
Person
```

type are not really suitable. I'll leave it up to you to create a more suitable generator for names.

You can find out more about generators in the docs .

## Modifiers

Sometimes you want to create properties with well-defined categories of values - for example: positive or negative numbers, or non-empty lists. These aren't different
*
types
*
- they're just constraints to apply to the values of the type. Of course you don't want to create custom generators for these all the time. Fortunately, QuickCheck recognises that these situations are common enough for a general solution, and it provides modifiers for this purpose.

Here are two properties that specify our expectations of squares and cubes. Of course, all squares are larger than or equal to their square roots, but cubes only share the same property for all
*
non-negative
*
inputs (because the multiplication of three negative values always produces a negative value).

In
```
prop_CubeOfPositive
```

, we've used the
```
NonNegative
```

modifier to ensure that only zero or positive values are provided for
```
x
```

. Without this modifier, we would need to provide a generator that only produces non-negative values. Instead, this is done for us by QuickCheck.

You can find out more about modifiers in the docs .

## QuickCheck in Action

We've just been laying the groundwork to understand the main aspects of defining properties and creating generators. Now we'll put it into practice with another example that incorporates a few more features of QuickCheck. I'll use a function sent to me by a colleague that produces a Fibonacci sequence. One of the nice things with this particular function is that it produces an infinite list - so we get to see how QuickCheck handles infinite data structures and some of the power of Haskell's laziness (for those that might be unfamiliar with Haskell).

First, execute that function in
`
ghci
`
and see what it does. Of course, it'll never terminate so you need to use it in a way that causes it to stop, like
```
take 10 fibs
```

. This should be a familiar sight... we all hopefully remember Fibonacci sequences from our school days. So now let's put QuickCheck to work testing that our function actually does produce a valid Fibonacci sequence.

You'll notice in the example below I've used the
```
forAll
```

function to allow explicit provision of a test case generator rather than automatically using a suitable instance of
```
Arbitrary
```

based on the inferred type.

So let's walk through what
```
prop_Fibonacci
```

actually specifies. Firstly, we refer to three consecutive values contained within the result of
```
fibs
```

:
```
x
```

,
```
y
```

and
```
z
```

. With these references in hand, we specify that the last value must equal the sum of the first two - which is precisely what a Fibonacci sequence is. Secondly, we constrain the values that we extract from
```
fibs
```

to ensure that our function can execute in a reasonable amount of time. Of course, this property should hold for
*
all
*
values of
```
n
```

, but we can't practically test that given the infinite nature of a fibonacci sequence and the limited computing power available to us. So instead, we limit the infinite to some reasonable finite subset and test that part instead (hence the use of
```
forAll
```

). By the way, this is not that different from testing manually... no human being can test an infinite structure any more than a computer can.

Now let's consider what we would've done without QuickCheck. Typically, a few test cases would be selected and explicitly coded into the test. These tests would run repeatedly for the
*
same
*
values, which is actually more of a regression test than a test intended to find
*
new
*
defects. Depending on the approach taken by the developer, the test might select specific values and ensure they are what we expect them to be; or they may take the relative approach followed in
```
prop_Fibonacci
```

and express the relationship between consecutive values in the sequence. For the purposes of our explanation, it really doesn't matter. The point is that example-based tests are often as static at run-time as they are at compile-time.

## Test Data Distribution

Earlier we looked at using the
```
frequency
```

generator combinator to produce a weighed distribution of random values according to some requirement. Test data distribution can also be affected or monitored from your properties, so we'll look at an example of that now.

Let's review the sorting property we defined earlier and retrofit it with some useful extensions. We'll start by getting a more detailed view of what values are being used to test our implementation.

The
```
collect
```

function allows us to label our test cases so the output displays something a bit more meaningful. If you recall, the output previously told us that the test passes for 100 cases. With this modification, it also prints out the length of the list and what percentage of test cases were executed with lists of this length. Now, we get something a bit more detailed:

This is a bit nasty... a lot more detailed than is helpful. So instead, we can classify our tests into categories. Given that we're testing a sorting algorithm, we're concerned that our test might not run against lists of sufficient length (after all, how many bugs are you likely to find when sorting an empty list or a list of only one element). So do this, we use
```
classify
```

instead, which accepts a predicate and a label. In our case, we've labelled all lists of length 0 or 1 as "trivial". The accompanying output is a lot more readable and also more useful.

Hopefully this has given you an idea of how to confirm that your properties are executed against meaningful inputs. There are more options available to you that I won't go into here, but take a look at things like
```
cover
```

,
```
label
```

and
```
==>
```

for
more details
.

## Shrinking

One of the difficulties with random test cases is that you do sometimes find a failing test case that is absolutely massive (or at least, larger than it needs to be). Fortunately QuickCheck has capabilities in this area to help you. It's called
*
shrinking
*
. When a property fails for some input, QuickCheck retries the property with a slightly smaller variation of the original input. It continues to do this until the property passes, at which point you've found the smallest possible value that causes the property to fail.

Like random data generation, shrinking is not left to chance. It is controlled by the
```
shrink
```

function on the
```
Arbitrary
```

typeclass. The default implementation returns
```
[]
```

, which means that the value cannot be shrunk any further. If you have a data type that can be shrunk, you'll need to provide an implementation that is appropriate for your needs. The
```
shrink
```

function should return a list of "all the possible immediate shrinks of the given value". Take a look at the Arbitrary instances provided by QuickCheck for an example of what shrink does.

Just as an example, here's what shrink does to a short list of integers:

You can see that not only is the list itself shrunk, but the values within it too. I'll leave you to play around with it on your own.

## Further Reading

As promised, this was a very brief intro to QuickCheck... too brief in many respects. I hope it has at least helped you get started with it and explained the basics well enough. If you want to dive into this in more detail, here are a few references to get you started.

I'd strongly recommend starting off by reading .

Pedro Vasconcelos has also written an introduction to QuickCheck on FP Complete's School of Haskell .