graphql-engine/v3/crates/utils/jsonschema-tidying
Tom Harding 1c6b1dffc8 Remove duplicate JSON Schema entries (#692)
<!-- Thank you for submitting this PR! :) -->

## Description

When we generate a JSON schema via `schemars`, we end up with duplicate
types in the schema, which get names like `ValueExpression2`, `Role2`,
and so on. This isn't ideal, and seems to arise for two reasons:

1. The type is polymorphic, and is monomorphised in two ways, and thus
the types can't be unified.
2. The type is monomorphic, but is used inside and outside of its home
module.

The first problem was fixed previously by splitting polymorphic types,
but the second has proven to be a bit more work. This PR finally solves
the problem by introducing a new library, `jsonschema-tidying`:

* First, we search the definitions within the JSON schema for any whose
names end in a number, such as `ValueExpression2` or `MetadataV2`.
* Then, we look for types whose names match everything up to the final
numeric digits, and discard any types for whom we can't find a match (so
we keep `ValueExpression2` because `ValueExpression` exists, but discard
`MetadataV2` because `MetadataV` does not).
* Next, we remove the duplicate definition from the definitions map,
potentially breaking links in both the schema _and_ the rest of the
definitions map.
* Finally, we traverse the entirety of the tree looking for any
references to the duplicate entry, and replace them with references to
the original entry.

This PR has no direct user-facing change, however it _will_ have an
effect on the docs generation code, which will hopefully result in
tidier documentation.

<!--
  Questions to consider answering:
  1. What user-facing changes are being made?
4. What are issues related to this PR? (Consider adding `(close
#<issue-no>)` to the PR title)
  5. What is the conceptual design behind this PR?
  6. How can this PR be tested/verified?
  7. Does the PR have limitations?
  8. Does the PR introduce breaking changes?
-->

## Changelog

- Add a changelog entry (in the "Changelog entry" section below) if the
changes
  in this PR have any user-facing impact. See
[changelog
guide](https://github.com/hasura/graphql-engine-mono/wiki/Changelog-Guide).
- If no changelog is required ignore/remove this section and add a
  `no-changelog-required` label to the PR.

### Product

_(Select all products this will be available in)_

- [ ] community-edition
- [ ] cloud
<!-- product : end : DO NOT REMOVE -->

### Type

<!-- See changelog structure:
https://github.com/hasura/graphql-engine-mono/wiki/Changelog-Guide#structure-of-our-changelog
-->

_(Select only one. In case of multiple, choose the most appropriate)_

- [ ] highlight
- [x] enhancement
- [x] bugfix
- [ ] behaviour-change
- [ ] performance-enhancement
- [ ] security-fix
<!-- type : end : DO NOT REMOVE -->

### Changelog entry

<!--
  - Add a user understandable changelog entry
- Include all details needed to understand the change. Try including
links to docs or issues if relevant
  - For Highlights start with a H4 heading (#### <entry title>)
  - Get the changelog entry reviewed by your team
-->

_Replace with changelog entry_

<!-- changelog-entry : end : DO NOT REMOVE -->

<!-- changelog : end : DO NOT REMOVE -->

V3_GIT_ORIGIN_REV_ID: fe73acf7d9df0b9867852e673e53cb086e3725d3
2024-06-11 08:28:30 +00:00
..
src Remove duplicate JSON Schema entries (#692) 2024-06-11 08:28:30 +00:00
Cargo.toml Remove duplicate JSON Schema entries (#692) 2024-06-11 08:28:30 +00:00
README.md Remove duplicate JSON Schema entries (#692) 2024-06-11 08:28:30 +00:00

jsonschema_tidying

The default schemars output can contain duplicate entries for the same type, even when monomorphised. This package searches for (and removes) duplicate entries, resolving any links that are broken in the process.