From a1ea91fc50db6f56a8cfe5e9be20eb0b2b19ff17 Mon Sep 17 00:00:00 2001
From: mniip <mniip@users.noreply.github.com>
Date: Tue, 7 Sep 2021 22:33:51 +0300
Subject: [PATCH] Add a dev and a user guide (#120)

* Add a dev and a user guide

* Changes
---
 docs/Development_guide.md | 149 ++++++++++++++++++++++++++++++++++++++
 docs/User_guide.md        |  51 +++++++++++++
 2 files changed, 200 insertions(+)
 create mode 100644 docs/Development_guide.md
 create mode 100644 docs/User_guide.md

diff --git a/docs/Development_guide.md b/docs/Development_guide.md
new file mode 100644
index 0000000..26a35ba
--- /dev/null
+++ b/docs/Development_guide.md
@@ -0,0 +1,149 @@
+CompaREST
+=========
+The tool works by simultaneously looking at a pair of nodes in OpenAPI schema
+trees. At every place we try to understand whether one side (the producer) can
+produce something that the other side (the consumer) cannot consume. This often
+involves recursively running the same check on child sub-nodes. The necessary
+criteria are implemented manually for all types of nodes that appear in the
+schema. The `Subtree` class in `Data.OpenApi.Compare.Subtree` designates types
+of nodes for which this compatibility checking is defined. The implementations
+for various types reside in `Data.OpenApi.Compare.Validate.*`.
+
+Producer/Consumer
+-----------------
+Rather than looking at which schema is the server's and which is the client's,
+in terms of compatibility checking it is more useful to track which side is the
+"producer" and which is the "consumer". At the root of the schema the producer
+is the client, but as we descend into an HTTP response, the direction flips and
+the producer becomes the server.
+
+We deal with a lot of things in pairs, where one thing belongs to the producer
+and another belongs to the consumer. The `ProdCons` datatype (in
+`Data.OpenApi.Compare.Subtree`) is just a pair type that provides an Applicative
+abstraction for working with two things simultaneously.
+
+Trees
+-----
+We directly use the datatypes defined in the `openapi3` library. For reasons of
+identifiability and memoization we tag all nodes in the trees with a path from
+the root. The tree is heterogeneous (nodes have different types), and every type
+of node has a different arrangement of children. There is a `Step` data family
+that defines all the possible children (of a specific type) of a node (of a
+specific type).
+
+A `Trace` is a sequence of steps, except in a path the types of adjacent steps
+have to agree. The `Paths` datatype achieves this. Together all paths form a
+`Category`, and we often concatenate them using `>>>`. This is defined in
+`Data.OpenApi.Compare.Paths`.
+
+We often keep a node and a path to it next to each other using an environment
+comonad. `Traced` is a type alias around `Env` which ensures that the type of
+the path matches the type of the node.
+
+Issues
+------
+The output of the tool is a list of compatibility issues that came up during
+checking. The `Issue` data family describes the kinds of issues that can occur
+at each type of node. An issue is supposed to characterize a set of interactions
+(requests or responses) that demonstrate that the schemas are incompatible.
+
+Each issue is tagged with a `Behavior` that describes the path to reproducing
+the issue. While a `Trace` describes a syntactic path down the schema tree, a
+`Behavior` describes which part of an interaction needs to be focused on for the
+issue to manifest itself.
+
+In most places where we keep multiple issues (e.g. the output), to store them we
+use a prefix tree map, indexed by both types and values of `Behavior`s. See
+`Data.OpenApi.Compare.PathsPrefixTree`.
+
+Report
+------
+The checker ultimately outputs a list of `Issue`s, together with `Behavior`s
+that cause them. We run the checker twice: forwards and backwards -- to detect
+non-breaking changes. All of this is then compiled into a report in
+`Data.OpenApi.Compare.Report`. The tree structure of headings is computed from
+`Behavior`s, using `Jet`s to collapse particular sequences of behavior steps
+into a single human-readable header item.  The text of each paragraph comes from
+`describeIssue`.
+
+We then use the `pandoc` library to render the report in a variety of formats.
+
+Formulas
+--------
+The schema has references, so it can end up being recursive. Similarly the
+process for establishing compatibility can end up being recursive. To keep track
+of this, the checker actually manipulates *formulas* which can have variables in
+them, defined in (`Data.OpenApi.Compare.Formula`).
+
+A formula represents a set of issues, with the empty set corresponding to a
+successful compatibility check. A formula can have conjunctions, where we only
+report success if all parts succeeded, and if not we take the union of all
+issues. A formula can also have disjunctions, but there it's impossible to
+guarantee that the issues for the parts also make sense as issues for the whole
+thing. Due to this if all disjuncted parts have issues we instead return a
+different issue (usually non-descriptive).
+
+The `FormulaF` datatype implements a functor that carries formulaic calculations
+as well as a result value (a co-Yoneda extension of formulas for the most part).
+The Applicative interface provides conjunction, and the Alternative interface
+provides disjunction.
+
+When we detect recursion we introduce a variable, compute the compatibility
+check as a formula with that variable, and then we solve a fixed point equation
+(`maxFixpoint`) with the variable to obtain the answer.
+
+This `FomulaF` is further wrapped in a couple monad transformers
+(`SemanticCompatFormula`, `StructuralCompatFormula`), and the whole checker is
+then implemented in the resulting Applicative using ApplicativeDo.
+
+Memoization
+-----------
+The `Data.OpenApi.Compare.Memo` module contains utilities for stateful
+memoization and recursion detection. The entire checking process is memoized,
+and can detect, propagate, and resolve recursion.
+
+Structural/Semantic Compatibility
+---------------------------------
+When we encounter two nodes we first try to optimistically establish whether
+they're structurally equal, i.e. the trees are exactly the same modulo reference
+inlining. This check is also memoized and recursion-aware. If the check succeeds
+we conclude that the schemas are compatible. Otherwise we fall back to the
+"normal" semantic compatibility.
+
+
+JSON Schema
+-----------
+The language that describes JSON payloads is probably the richest part of
+OpenAPI schemas, and the checker for it also comes with a lot of moving parts.
+
+JSON schema allows arbitrary set arithmetic using `allOf`, `anyOf`, `not` and
+`oneOf`. To fully check `not` and `oneOf` we would need to compare set
+subtraction, which involves negating a comparison result. If recursion is
+involved we would need to be able to solve equations with negation, which we
+cannot do. So `not` is unsupported altogether, and `oneOf` is only supported
+when it has disjoint branches and behaves like `anyOf`.
+
+With that in mind a JSON schema is first pre-processed into a Disjunctive Normal
+Form (`DNF` in `Data.OpenApi.Compare.Validate.Schema.DNF`). This pre-processing
+happens in `Data.OpenApi.Compare.Validate.Schema.Process`.
+
+The DNF contains elementary clauses that match on a specific aspect of a JSON
+object (`Condition`). Most clauses only affect objects of a single type, so JSON
+values and clauses are segregated by type (`JsonType`, `TypedValue` in
+`Data.OpenApi.Compare.Validate.Schema.JsonFormula`)
+
+Then we reduce the problem of comparing two DNF's to the comparison of a
+conjunction of clauses in the producer with a single clause in the consumer
+(`checkFormulas` in `Data.OpenApi.Compare.Validate.Schema`).
+
+When we do disjunction in the formula we lose information, so we try to avoid
+having disjunctions in the consumer DNF. This is done by the means of
+partitioning. We look for factors that would let us partition the set of all
+objects into several bins, in a way that hopefully corresponds only a single
+consumer to each producer. This is implemented in
+`Data.OpenApi.Compare.Validate.Schema.Partition`. Currently we can only
+partition by an `enum` value that is accessible via a chain of `required`
+fields.
+
+The same partitioning mechanism is used to test whether the branches of a
+`oneOf` are disjoint, and emit a warning otherwise.
diff --git a/docs/User_guide.md b/docs/User_guide.md
new file mode 100644
index 0000000..7b85a63
--- /dev/null
+++ b/docs/User_guide.md
@@ -0,0 +1,51 @@
+CompaREST User Guide
+====================
+
+Running the Tool
+----------------
+The tool accepts two OpenAPI 3.0.0 schema files in either JSON or YAML format.
+One is assumed to be the client version of the schema, and the other is the
+server version.
+
+The tool will look for changes between the schemas and detect whether they are
+breaking or not -- that is, if they prevent interoperability between the client
+and the server.
+
+Running:
+```
+comparest -c client.json -s server.json
+```
+will output a markdown report of the changes to the standard output.
+
+Compatibility status can be signaled via the exit code using
+`--signal-exit-code`. In case no report is needed, output can be suppressed with
+`--silent`. For example:
+```
+comparest -c client.json -s server.json --signal-exit-code --silent
+echo $?
+```
+If there were changes that the tool determined to be breaking, the exit code
+will be 1. If there were some changes the tool couldn't understand, the exit
+code will be 2. Otherwise if there were no breaking changes, the exit code will
+be 0.
+
+Controlling the Report
+----------------------
+By default the report includes breaking changes, as well as non-breaking:
+changes that would be considered breaking in the opposite direction. To only
+include breaking changes in the report use `--only-breaking`. The `--all` option
+restores the default behavior.
+
+The report can be formatted in a variety of ways supported by Pandoc. The `-o`
+option causes the report to be written to a file. The format of the file is
+determined from the extension. The supported extensions include:
+ - `.md` for markdown
+ - `.html` for an HTML snippet
+ - `.rst` for restructured text
+ - no extension for a self-contained HTML document with styles
+
+By default the report is split up into parts relating to different paths,
+requests, responses, etc, using headers of various levels. Alternatively, the
+report can use indended block-quotes to visualize the tree structure of the
+report. The header style is enabled with `--header-style`, and the block-quote
+style is enabled with `--folding-block-quotes-style`.