Add a dev and a user guide (#120)

* Add a dev and a user guide

* Changes
This commit is contained in:
mniip 2021-09-07 22:33:51 +03:00 committed by GitHub
parent e71cbde9c1
commit a1ea91fc50
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 200 additions and 0 deletions

149
docs/Development_guide.md Normal file
View File

@ -0,0 +1,149 @@
CompaREST
=========
The tool works by simultaneously looking at a pair of nodes in OpenAPI schema
trees. At every place we try to understand whether one side (the producer) can
produce something that the other side (the consumer) cannot consume. This often
involves recursively running the same check on child sub-nodes. The necessary
criteria are implemented manually for all types of nodes that appear in the
schema. The `Subtree` class in `Data.OpenApi.Compare.Subtree` designates types
of nodes for which this compatibility checking is defined. The implementations
for various types reside in `Data.OpenApi.Compare.Validate.*`.
Producer/Consumer
-----------------
Rather than looking at which schema is the server's and which is the client's,
in terms of compatibility checking it is more useful to track which side is the
"producer" and which is the "consumer". At the root of the schema the producer
is the client, but as we descend into an HTTP response, the direction flips and
the producer becomes the server.
We deal with a lot of things in pairs, where one thing belongs to the producer
and another belongs to the consumer. The `ProdCons` datatype (in
`Data.OpenApi.Compare.Subtree`) is just a pair type that provides an Applicative
abstraction for working with two things simultaneously.
Trees
-----
We directly use the datatypes defined in the `openapi3` library. For reasons of
identifiability and memoization we tag all nodes in the trees with a path from
the root. The tree is heterogeneous (nodes have different types), and every type
of node has a different arrangement of children. There is a `Step` data family
that defines all the possible children (of a specific type) of a node (of a
specific type).
A `Trace` is a sequence of steps, except in a path the types of adjacent steps
have to agree. The `Paths` datatype achieves this. Together all paths form a
`Category`, and we often concatenate them using `>>>`. This is defined in
`Data.OpenApi.Compare.Paths`.
We often keep a node and a path to it next to each other using an environment
comonad. `Traced` is a type alias around `Env` which ensures that the type of
the path matches the type of the node.
Issues
------
The output of the tool is a list of compatibility issues that came up during
checking. The `Issue` data family describes the kinds of issues that can occur
at each type of node. An issue is supposed to characterize a set of interactions
(requests or responses) that demonstrate that the schemas are incompatible.
Each issue is tagged with a `Behavior` that describes the path to reproducing
the issue. While a `Trace` describes a syntactic path down the schema tree, a
`Behavior` describes which part of an interaction needs to be focused on for the
issue to manifest itself.
In most places where we keep multiple issues (e.g. the output), to store them we
use a prefix tree map, indexed by both types and values of `Behavior`s. See
`Data.OpenApi.Compare.PathsPrefixTree`.
Report
------
The checker ultimately outputs a list of `Issue`s, together with `Behavior`s
that cause them. We run the checker twice: forwards and backwards -- to detect
non-breaking changes. All of this is then compiled into a report in
`Data.OpenApi.Compare.Report`. The tree structure of headings is computed from
`Behavior`s, using `Jet`s to collapse particular sequences of behavior steps
into a single human-readable header item. The text of each paragraph comes from
`describeIssue`.
We then use the `pandoc` library to render the report in a variety of formats.
Formulas
--------
The schema has references, so it can end up being recursive. Similarly the
process for establishing compatibility can end up being recursive. To keep track
of this, the checker actually manipulates *formulas* which can have variables in
them, defined in (`Data.OpenApi.Compare.Formula`).
A formula represents a set of issues, with the empty set corresponding to a
successful compatibility check. A formula can have conjunctions, where we only
report success if all parts succeeded, and if not we take the union of all
issues. A formula can also have disjunctions, but there it's impossible to
guarantee that the issues for the parts also make sense as issues for the whole
thing. Due to this if all disjuncted parts have issues we instead return a
different issue (usually non-descriptive).
The `FormulaF` datatype implements a functor that carries formulaic calculations
as well as a result value (a co-Yoneda extension of formulas for the most part).
The Applicative interface provides conjunction, and the Alternative interface
provides disjunction.
When we detect recursion we introduce a variable, compute the compatibility
check as a formula with that variable, and then we solve a fixed point equation
(`maxFixpoint`) with the variable to obtain the answer.
This `FomulaF` is further wrapped in a couple monad transformers
(`SemanticCompatFormula`, `StructuralCompatFormula`), and the whole checker is
then implemented in the resulting Applicative using ApplicativeDo.
Memoization
-----------
The `Data.OpenApi.Compare.Memo` module contains utilities for stateful
memoization and recursion detection. The entire checking process is memoized,
and can detect, propagate, and resolve recursion.
Structural/Semantic Compatibility
---------------------------------
When we encounter two nodes we first try to optimistically establish whether
they're structurally equal, i.e. the trees are exactly the same modulo reference
inlining. This check is also memoized and recursion-aware. If the check succeeds
we conclude that the schemas are compatible. Otherwise we fall back to the
"normal" semantic compatibility.
JSON Schema
-----------
The language that describes JSON payloads is probably the richest part of
OpenAPI schemas, and the checker for it also comes with a lot of moving parts.
JSON schema allows arbitrary set arithmetic using `allOf`, `anyOf`, `not` and
`oneOf`. To fully check `not` and `oneOf` we would need to compare set
subtraction, which involves negating a comparison result. If recursion is
involved we would need to be able to solve equations with negation, which we
cannot do. So `not` is unsupported altogether, and `oneOf` is only supported
when it has disjoint branches and behaves like `anyOf`.
With that in mind a JSON schema is first pre-processed into a Disjunctive Normal
Form (`DNF` in `Data.OpenApi.Compare.Validate.Schema.DNF`). This pre-processing
happens in `Data.OpenApi.Compare.Validate.Schema.Process`.
The DNF contains elementary clauses that match on a specific aspect of a JSON
object (`Condition`). Most clauses only affect objects of a single type, so JSON
values and clauses are segregated by type (`JsonType`, `TypedValue` in
`Data.OpenApi.Compare.Validate.Schema.JsonFormula`)
Then we reduce the problem of comparing two DNF's to the comparison of a
conjunction of clauses in the producer with a single clause in the consumer
(`checkFormulas` in `Data.OpenApi.Compare.Validate.Schema`).
When we do disjunction in the formula we lose information, so we try to avoid
having disjunctions in the consumer DNF. This is done by the means of
partitioning. We look for factors that would let us partition the set of all
objects into several bins, in a way that hopefully corresponds only a single
consumer to each producer. This is implemented in
`Data.OpenApi.Compare.Validate.Schema.Partition`. Currently we can only
partition by an `enum` value that is accessible via a chain of `required`
fields.
The same partitioning mechanism is used to test whether the branches of a
`oneOf` are disjoint, and emit a warning otherwise.

51
docs/User_guide.md Normal file
View File

@ -0,0 +1,51 @@
CompaREST User Guide
====================
Running the Tool
----------------
The tool accepts two OpenAPI 3.0.0 schema files in either JSON or YAML format.
One is assumed to be the client version of the schema, and the other is the
server version.
The tool will look for changes between the schemas and detect whether they are
breaking or not -- that is, if they prevent interoperability between the client
and the server.
Running:
```
comparest -c client.json -s server.json
```
will output a markdown report of the changes to the standard output.
Compatibility status can be signaled via the exit code using
`--signal-exit-code`. In case no report is needed, output can be suppressed with
`--silent`. For example:
```
comparest -c client.json -s server.json --signal-exit-code --silent
echo $?
```
If there were changes that the tool determined to be breaking, the exit code
will be 1. If there were some changes the tool couldn't understand, the exit
code will be 2. Otherwise if there were no breaking changes, the exit code will
be 0.
Controlling the Report
----------------------
By default the report includes breaking changes, as well as non-breaking:
changes that would be considered breaking in the opposite direction. To only
include breaking changes in the report use `--only-breaking`. The `--all` option
restores the default behavior.
The report can be formatted in a variety of ways supported by Pandoc. The `-o`
option causes the report to be written to a file. The format of the file is
determined from the extension. The supported extensions include:
- `.md` for markdown
- `.html` for an HTML snippet
- `.rst` for restructured text
- no extension for a self-contained HTML document with styles
By default the report is split up into parts relating to different paths,
requests, responses, etc, using headers of various levels. Alternatively, the
report can use indended block-quotes to visualize the tree structure of the
report. The header style is enabled with `--header-style`, and the block-quote
style is enabled with `--folding-block-quotes-style`.