mirror of https://github.com/ilyakooo0/roboservant.git synced 2024-11-03 20:02:50 +03:00

generate contextually sensible fuzz tests for servant apps

roboservant

Automatically fuzz your servant APIs in a context-aware way.

example

See the full example in EXAMPLE.md.

why?

Servant gives us a lot of information about what a server can do. We use this information to generate arbitrarily long request/response sessions and verify properties that should hold over them.

how?

In essence, fuzz @Api yourServer config makes a series of calls to your API and records the results in a type-indexed dictionary. Those results are then available as prerequisites for other calls, so as the run proceeds, more and more API calls become possible.
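To make the idea of a type-indexed dictionary concrete, here is a minimal sketch built on Data.Dynamic from base and Data.Map from containers. It is not roboservant's actual data structure: the names Stash, emptyStash, record and available are hypothetical and exist only for this illustration.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Dynamic (Dynamic, dynTypeRep, fromDynamic, toDyn)
import qualified Data.Map.Strict as Map
import Data.Maybe (mapMaybe)
import Data.Proxy (Proxy (..))
import Data.Typeable (TypeRep, Typeable, typeRep)

-- | Hypothetical store: every value seen so far, grouped by its runtime type.
newtype Stash = Stash (Map.Map TypeRep [Dynamic])

emptyStash :: Stash
emptyStash = Stash Map.empty

-- | Record a value that came back from a call. Newest values go first.
record :: Typeable a => a -> Stash -> Stash
record x (Stash m) = Stash (Map.insertWith (++) (dynTypeRep d) [d] m)
  where
    d = toDyn x

-- | Every stored value of the requested type, ready to feed another call.
available :: forall a. Typeable a => Stash -> [a]
available (Stash m) =
  mapMaybe fromDynamic (Map.findWithDefault [] (typeRep (Proxy :: Proxy a)) m)
```

In this sketch, a call that needs an Int can only be attempted once available returns a non-empty list at type Int, which is the sense in which earlier responses unlock later calls.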

We explicitly do not try to come up with plausible values that haven't somehow come back from the API. That's straying into QuickCheck/Hedgehog territory: if you want that, generate the values on that side and set them as seeds in the configuration.

what does it mean to be "available"?

In a simple API, you may make a call and get back a Foo, which will allow you to make another call that requires a Foo. In a more complicated app, it's likely that you'll send a request body that includes many subcomponents, and it's likely you'll get a response that needs to be broken down into pieces before it's useful.

To cope with this, we have the typeclasses BuildFrom and Breakdown. You can write instances for them if you feel like it, and indeed it's currently required for recursive datatypes if you don't want the fuzzer to hang, but for the majority of your types it should be sufficient to derive them generically. (Sensible instances are provided for lists.)

There are two basic strategies here. In some cases, you want to regard a type as indivisible: that's why we like newtypes, right? In this case, we can derive using the Atom strategy.

deriving via (Atom NewtypedKey) instance Breakdown NewtypedKey
deriving via (Atom NewtypedKey) instance BuildFrom NewtypedKey

This is saying "NewtypedKey can neither be built from components nor broken down for spare parts. Hands off!". This is a good strategy for key types, for instance.

If instead it's a big complicated thing with lots of juicy subcomponents, we want to rip it apart using Generics and feast on its succulent headmeats:

deriving via (Compound Payload) instance Breakdown Payload
deriving via (Compound Payload) instance BuildFrom Payload
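The difference between the two strategies can be sketched with hand-written instances of a stand-in class. Everything here is an assumption made for illustration: BreakdownSketch, pieces, piecesOf, UserKey and the fields of this Payload are hypothetical, and the library's real Breakdown class and its generic machinery may differ in detail.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Maybe (mapMaybe)
import Data.Typeable (Typeable)

-- | Hypothetical stand-in for the library's Breakdown class.
class BreakdownSketch a where
  pieces :: a -> [Dynamic]

newtype UserKey = UserKey Int
  deriving (Eq, Show)

data Payload = Payload
  { owner :: UserKey
  , note  :: String
  }
  deriving (Eq, Show)

-- Atom-style: the key is indivisible, so only the whole value is offered.
instance BreakdownSketch UserKey where
  pieces k = [toDyn k]

-- Compound-style: offer the whole value plus whatever its fields yield.
instance BreakdownSketch Payload where
  pieces p = toDyn p : pieces (owner p) ++ [toDyn (note p)]

-- | Pull the pieces of one particular type back out of a breakdown.
piecesOf :: Typeable b => [Dynamic] -> [b]
piecesOf = mapMaybe fromDynamic
```

With these definitions, breaking down Payload (UserKey 7) "hi" yields three pieces (the payload, its key and its note), while breaking down UserKey 7 yields only the key itself; the Int inside it is never exposed.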

priming the pump

Sometimes there are values we'd like to smuggle into the API that are not derivable from within the API itself: sometimes this is a warning sign that your API is incomplete, but it can be quite reasonable to require identifying credentials within an API and not provide a way to get them. It might also be reasonable to have some sample values that the user is expected to come up with.

For those cases, override the seed in the Config with a list of seed values, suitably hashed:

defaultConfig { seed = [hashedDyn creds, hashedDyn userJwt]}
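Why a seed matters can be seen in a toy reachability model. This is a simplified sketch, not the fuzzer's algorithm: types are represented by plain strings, calls are picked deterministically rather than randomly, and Call, reachable and exampleCalls are hypothetical names invented for the illustration.

```haskell
import qualified Data.Set as Set

-- | Hypothetical model of an endpoint: it needs some types and yields one.
data Call = Call
  { callName :: String
  , needs    :: [String] -- names of the types required as arguments
  , yields   :: String   -- name of the type found in the response
  }

-- | Names of the calls that become possible, in the order they unlock,
-- starting from the seeded type names.
reachable :: [String] -> [Call] -> [String]
reachable seeds calls = go (Set.fromList seeds) []
  where
    go have done =
      case [ c
           | c <- calls
           , callName c `notElem` done
           , all (`Set.member` have) (needs c)
           ] of
        []      -> reverse done
        (c : _) -> go (Set.insert (yields c) have) (callName c : done)

-- | An API where credentials cannot be obtained from the API itself.
exampleCalls :: [Call]
exampleCalls =
  [ Call "login"      ["Creds"]              "Session"
  , Call "listOrders" ["Session"]            "OrderId"
  , Call "getOrder"   ["Session", "OrderId"] "Order"
  ]
```

Here reachable [] exampleCalls is empty because nothing can produce Creds, whereas reachable ["Creds"] exampleCalls unlocks login, then listOrders, then getOrder.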

why not servant-quickcheck?

servant-quickcheck is a great package and I've learned a lot from it. Unfortunately, as mentioned previously, there's a lot of the state space you just can't explore without context: modern webapps are full of pointer-like structures, whether they're URLs or database keys/uuids, and servant-quickcheck requires that you be able to generate these without context via Arbitrary.

limitations and future work

Currently, the display of failing traces is pretty tragic, both in the formatting and in its non-minimality. This is pretty ticklish: arguably the right way to do this is to return a trace that we can also rerun, and let QuickCheck or Hedgehog a level up shrink it until it's satisfactorily short. In the interest of being useful earlier rather than later, I'm releasing v1.0 before I crack this particular nut. We do know which of the calls we made fed data into the failing case, so the trace should show that distinction visibly: other calls without direct data dependencies may also have mattered, but we definitely know we need the direct data dependencies.

The provenance stuff is a bit underbaked. It should at least pull a representation of the route chosen rather than just an integer index.

It would also be nice to have a robust strategy for deriving instances for recursive datatypes, or at least to reject attempts to derive them instead of ending up in an infinite loop.

Currently the FlattenServer instance for :> is quadratic. It would be nice to fix this but I lack the art.