Proposal: free shrinking with QuickCheck
Reid Draper
reiddraper at gmail.com
Fri Nov 22 00:27:16 UTC 2013
I originally sent this message to the QuickCheck mailing list,
but haven't heard anything in a while, and realized there are
only ten subscribers. Below is the original message, any feedback
appreciated.
---
I have an idea to eliminate the `shrink` function from the `Arbitrary` type
class. Currently, users can optionally implement shrinking manually, this tends
to be boilerplate code and is "...clearly a generic programming problem" [1].
Further, users often have to create separate types to accommodate their
domain's restrictions. For example, a user may wish to only generate
power-of-two numbers for their test. Their code simply uses `Int`, but with
QuickCheck they must create a `newtype` wrapper and implement this logic in
both the `arbitrary` and `shrink` implementation. My idea is to
eliminate the `shrink` function, and to integrate shrinking with the
`arbitrary` function. Currently, the type for `arbitrary` is:
arbitrary :: Gen a
and `Gen` has the definition:
newtype Gen a = MkGen{ unGen :: StdGen -> Int -> a }
I suggest that instead of the inner-function returning `a`s, it should return
`RoseTree a`. The root of the tree is the generated value, and the rest of the
tree is all of the ways to shrink the value. Here's how things would fit
together:
data RoseTree a = RoseTree a [RoseTree a]
-- this is the same as the current Gen
newtype Generator a = Gen { unGen :: StdGen -> Int -> a }
newtype Gen a = MkGen { unGen :: Gen (RoseTree a) }
Conveniently, `Gen` still has a unary type constructor of `a`. Further, if
users have implemented their `arbitrary` implementations in terms of the
QuickCheck combinators, their code won't have to change at all (I don't think…).
The lower-level combinators would be updated to manipulate trees, with
functions like `joinRose` and `filterRose`. Let's next look at how `Gen`
would implement the monad type class:
instance Monad Gen where
return = MkGen . return . return
gen >>= f = MkGen $ helper (unGen gen) (unGen . f)
-- sequence is from Data.Traversable
where helper m k = m >>= \y -> fmap joinRose $ sequence $ fmap k y
The implementation of `return` is clear. `>>=` isn't too bad either: the
function provided to bind will return `Gen b`'s, and `joinRose` and `sequence`
help us pull this `Generator (RoseTree b))` 'out', much like `join` does. This
means our users can still write code like:
(arbitrary :: Gen [Int]) >>= elements
Which brings up a good point. The above code has an issue, `elements` is a
partial function with respect to the empty list. With the current
implementation, we'd use the `NonEmptyList` modifier, which much respect this
predicate both during generation and shrinking. This change would allow all
predicates to be expressed simply with `suchThat`, which, since it acts on both
values _and_ shrink trees, applies the predicate in both places. The
implementation of `suchThat` would have to unwrap `Gen`, and apply `roseFilter`
to the `RoseTree` inside of `Generator`, using `fmap`.
I have implemented the above description in my Clojure port of QuickCheck,
called simple-check [1], and it seems to be working quite nicely. Further, I
suspect Erlang QuickCheck [3] is doing something similar, though their
implementation is not open source, I can't presume too much.
Clearly this would be a big change, and my goal now is simply to start a
discussion: how does this sound? What's wrong, what's likely to break? Feedback
and criticism welcome.
Reid
[1] Scrap your boilerplate with class: extensible generic functions: http://research.microsoft.com/pubs/67439/gmap3.pdf
[2] https://github.com/reiddraper/simple-check
[3] http://www.quviq.com/index.html
More information about the Libraries
mailing list