From a1ea91fc50db6f56a8cfe5e9be20eb0b2b19ff17 Mon Sep 17 00:00:00 2001 From: mniip Date: Tue, 7 Sep 2021 22:33:51 +0300 Subject: [PATCH] Add a dev and a user guide (#120) * Add a dev and a user guide * Changes --- docs/Development_guide.md | 149 ++++++++++++++++++++++++++++++++++++++ docs/User_guide.md | 51 +++++++++++++ 2 files changed, 200 insertions(+) create mode 100644 docs/Development_guide.md create mode 100644 docs/User_guide.md diff --git a/docs/Development_guide.md b/docs/Development_guide.md new file mode 100644 index 0000000..26a35ba --- /dev/null +++ b/docs/Development_guide.md @@ -0,0 +1,149 @@ +CompaREST +========= +The tool works by simultaneously looking at a pair of nodes in OpenAPI schema +trees. At every place we try to understand whether one side (the producer) can +produce something that the other side (the consumer) cannot consume. This often +involves recursively running the same check on child sub-nodes. The necessary +criteria are implemented manually for all types of nodes that appear in the +schema. The `Subtree` class in `Data.OpenApi.Compare.Subtree` designates types +of nodes for which this compatibility checking is defined. The implementations +for various types reside in `Data.OpenApi.Compare.Validate.*`. + +Producer/Consumer +----------------- +Rather than looking at which schema is the server's and which is the client's, +in terms of compatibility checking it is more useful to track which side is the +"producer" and which is the "consumer". At the root of the schema the producer +is the client, but as we descend into an HTTP response, the direction flips and +the producer becomes the server. + +We deal with a lot of things in pairs, where one thing belongs to the producer +and another belongs to the consumer. The `ProdCons` datatype (in +`Data.OpenApi.Compare.Subtree`) is just a pair type that provides an Applicative +abstraction for working with two things simultaneously. 
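As a rough illustration of the pair-with-Applicative idea described above, here is a minimal sketch. It is not copied from `Data.OpenApi.Compare.Subtree`: the field names `producer` and `consumer`, the helper `flipProdCons`, and the `example` value are assumptions made for this illustration, and the real definitions may differ.

```haskell
-- Illustrative sketch only; names other than ProdCons are assumptions.

-- | A pair holding the producer's and the consumer's version of a thing.
data ProdCons a = ProdCons
  { producer :: a
  , consumer :: a
  }
  deriving (Eq, Show)

-- | Apply the same function to both sides.
instance Functor ProdCons where
  fmap f (ProdCons p c) = ProdCons (f p) (f c)

-- | Combine two pairs side by side: producer with producer,
-- consumer with consumer.
instance Applicative ProdCons where
  pure x = ProdCons x x
  ProdCons fp fc <*> ProdCons xp xc = ProdCons (fp xp) (fc xc)

-- | Hypothetical helper: swap the roles, as happens when the check
-- descends from a request into an HTTP response.
flipProdCons :: ProdCons a -> ProdCons a
flipProdCons (ProdCons p c) = ProdCons c p

-- | Pair a shared label with a per-side value using the Applicative.
example :: ProdCons (String, String)
example = (,) <$> pure "userId" <*> ProdCons "client schema" "server schema"
```

The Applicative instance is what lets code that handles "one thing" be lifted to handle the producer's and the consumer's thing in lockstep, without ever mixing the two sides.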
+ +Trees +----- +We directly use the datatypes defined in the `openapi3` library. For reasons of +identifiability and memoization we tag all nodes in the trees with a path from +the root. The tree is heterogeneous (nodes have different types), and every type +of node has a different arrangement of children. There is a `Step` data family +that defines all the possible children (of a specific type) of a node (of a +specific type). + +A `Trace` is a sequence of steps in which the types of adjacent steps have to +agree. The `Paths` datatype achieves this. Together all paths form a +`Category`, and we often concatenate them using `>>>`. This is defined in +`Data.OpenApi.Compare.Paths`. + +We often keep a node and a path to it next to each other using an environment +comonad. `Traced` is a type alias around `Env` which ensures that the type of +the path matches the type of the node. + +Issues +------ +The output of the tool is a list of compatibility issues that came up during +checking. The `Issue` data family describes the kinds of issues that can occur +at each type of node. An issue is supposed to characterize a set of interactions +(requests or responses) that demonstrate that the schemas are incompatible. + +Each issue is tagged with a `Behavior` that describes the path to reproducing +the issue. While a `Trace` describes a syntactic path down the schema tree, a +`Behavior` describes which part of an interaction needs to be focused on for the +issue to manifest itself. + +In most places where we keep multiple issues (e.g. the output), we store them in +a prefix tree map, indexed by both types and values of `Behavior`s. See +`Data.OpenApi.Compare.PathsPrefixTree`. + +Report +------ +The checker ultimately outputs a list of `Issue`s, together with `Behavior`s +that cause them. We run the checker twice, forwards and backwards, to detect +non-breaking changes. All of this is then compiled into a report in +`Data.OpenApi.Compare.Report`.
The tree structure of headings is computed from +`Behavior`s, using `Jet`s to collapse particular sequences of behavior steps +into a single human-readable header item. The text of each paragraph comes from +`describeIssue`. + +We then use the `pandoc` library to render the report in a variety of formats. + +Formulas +-------- +The schema has references, so it can end up being recursive. Similarly, the +process for establishing compatibility can end up being recursive. To keep track +of this, the checker actually manipulates *formulas* which can have variables in +them, defined in `Data.OpenApi.Compare.Formula`. + +A formula represents a set of issues, with the empty set corresponding to a +successful compatibility check. A formula can have conjunctions, where we only +report success if all parts succeeded, and if not we take the union of all +issues. A formula can also have disjunctions, but there it's impossible to +guarantee that the issues for the parts also make sense as issues for the whole +thing. Because of this, if all disjuncts have issues we instead return a +different issue (usually non-descriptive). + +The `FormulaF` datatype implements a functor that carries formulaic calculations +as well as a result value (a co-Yoneda extension of formulas for the most part). +The Applicative interface provides conjunction, and the Alternative interface +provides disjunction. + +When we detect recursion we introduce a variable, compute the compatibility +check as a formula with that variable, and then we solve a fixed point equation +(`maxFixpoint`) with the variable to obtain the answer. + +This `FormulaF` is further wrapped in a couple of monad transformers +(`SemanticCompatFormula`, `StructuralCompatFormula`), and the whole checker is +then implemented in the resulting Applicative using ApplicativeDo. + +Memoization +----------- +The `Data.OpenApi.Compare.Memo` module contains utilities for stateful +memoization and recursion detection.
The entire checking process is memoized, +and can detect, propagate, and resolve recursion. + +Structural/Semantic Compatibility +--------------------------------- +When we encounter two nodes we first try to optimistically establish whether +they're structurally equal, i.e. the trees are exactly the same modulo reference +inlining. This check is also memoized and recursion-aware. If the check succeeds +we conclude that the schemas are compatible. Otherwise we fall back to the +"normal" semantic compatibility check. + + +JSON Schema +----------- +The language that describes JSON payloads is probably the richest part of +OpenAPI schemas, and the checker for it also comes with a lot of moving parts. + +JSON Schema allows arbitrary set arithmetic using `allOf`, `anyOf`, `not` and +`oneOf`. To fully check `not` and `oneOf` we would need to compare set +subtraction, which involves negating a comparison result. If recursion is +involved we would need to be able to solve equations with negation, which we +cannot do. So `not` is unsupported altogether, and `oneOf` is only supported +when it has disjoint branches and behaves like `anyOf`. + +With that in mind a JSON schema is first pre-processed into a Disjunctive Normal +Form (`DNF` in `Data.OpenApi.Compare.Validate.Schema.DNF`). This pre-processing +happens in `Data.OpenApi.Compare.Validate.Schema.Process`. + +The DNF contains elementary clauses that match on a specific aspect of a JSON +object (`Condition`). Most clauses only affect objects of a single type, so JSON +values and clauses are segregated by type (`JsonType`, `TypedValue` in +`Data.OpenApi.Compare.Validate.Schema.JsonFormula`). + +Then we reduce the problem of comparing two DNFs to the comparison of a +conjunction of clauses in the producer with a single clause in the consumer +(`checkFormulas` in `Data.OpenApi.Compare.Validate.Schema`). + +When we do disjunction in the formula we lose information, so we try to avoid +having disjunctions in the consumer DNF.
This is done by means of +partitioning. We look for factors that would let us partition the set of all +objects into several bins, in a way that hopefully pairs each producer with only +a single consumer. This is implemented in +`Data.OpenApi.Compare.Validate.Schema.Partition`. Currently we can only +partition by an `enum` value that is accessible via a chain of `required` +fields. + +The same partitioning mechanism is used to test whether the branches of a +`oneOf` are disjoint, and to emit a warning otherwise. diff --git a/docs/User_guide.md b/docs/User_guide.md new file mode 100644 index 0000000..7b85a63 --- /dev/null +++ b/docs/User_guide.md @@ -0,0 +1,51 @@ +CompaREST User Guide +==================== + +Running the Tool +---------------- +The tool accepts two OpenAPI 3.0.0 schema files in either JSON or YAML format. +One is assumed to be the client version of the schema, and the other is the +server version. + +The tool will look for changes between the schemas and detect whether they are +breaking or not -- that is, whether they prevent interoperability between the +client and the server. + +Running: +``` +comparest -c client.json -s server.json +``` +will output a markdown report of the changes to the standard output. + +Compatibility status can be signaled via the exit code using +`--signal-exit-code`. In case no report is needed, output can be suppressed with +`--silent`. For example: +``` +comparest -c client.json -s server.json --signal-exit-code --silent +echo $? +``` +If there were changes that the tool determined to be breaking, the exit code +will be 1. If there were some changes the tool couldn't understand, the exit +code will be 2. Otherwise, if there were no breaking changes, the exit code will +be 0. + +Controlling the Report +---------------------- +By default the report includes breaking changes, as well as non-breaking +changes: those that would be considered breaking in the opposite direction.
To only +include breaking changes in the report, use `--only-breaking`. The `--all` +option restores the default behavior. + +The report can be formatted in a variety of ways supported by Pandoc. The `-o` +option causes the report to be written to a file. The format of the file is +determined from the extension. The supported extensions include: + - `.md` for markdown + - `.html` for an HTML snippet + - `.rst` for reStructuredText + - no extension for a self-contained HTML document with styles + +By default the report is split up into parts relating to different paths, +requests, responses, etc., using headers of various levels. Alternatively, the +report can use indented block-quotes to visualize the tree structure of the +report. The header style is enabled with `--header-style`, and the block-quote +style is enabled with `--folding-block-quotes-style`.