# Semantic `semantic` is a Haskell library and command line tool for parsing, analyzing, and comparing source code. | Table of Contents | | :------------- | | [Usage](#usage) | | [Language support](#language-support) | | [Development](#development) | | [Technology and architecture](#technology-and-architecture) | | [Licensing](#licensing) | ## Usage Run `semantic --help` for complete list of up-to-date options. #### Parse ``` Usage: semantic parse ([--sexpression] | [--json] | [--json-graph] | [--symbols] | [--dot] | [--show] | [--quiet]) [FILES...] Generate parse trees for path(s) Available options: --sexpression Output s-expression parse trees (default) --json Output JSON parse trees --json-graph Output JSON adjacency list --symbols Output JSON symbol list --dot Output DOT graph parse trees --show Output using the Show instance (debug only, format subject to change without notice) --quiet Don't produce output, but show timing stats ``` #### Diff ``` Usage: semantic diff ([--sexpression] | [--json] | [--json-graph] | [--toc] | [--dot] | [--show]) [FILE_A] [FILE_B] Compute changes between paths Available options: --sexpression Output s-expression diff tree (default) --json Output JSON diff trees --json-graph Output JSON diff trees --toc Output JSON table of contents diff summary --dot Output the diff as a DOT graph --show Output using the Show instance (debug only, format subject to change without notice) ``` #### Graph ``` Usage: semantic graph ([--imports] | [--calls]) [--packages] ([--dot] | [--json] | [--show]) ([--root DIR] [--exclude-dir DIR] DIR:LANGUAGE | FILE | --language ARG (FILES... | --stdin)) Compute a graph for a directory or from a top-level entry point module Available options: --imports Compute an import graph (default) --calls Compute a call graph --packages Include a vertex for the package, with edges from it to each module --dot Output in DOT graph format (default) --json Output JSON graph --show Output using the Show instance (debug only, format subject to change without notice) --root DIR Root directory of project. Optional, defaults to entry file/directory. --exclude-dir DIR Exclude a directory (e.g. vendor) --language ARG The language for the analysis. --stdin Read a list of newline-separated paths to analyze from stdin. ``` ## Language support | Priority | Language | Parse | Assign | Diff | ToC | Symbols | Import graph | Call graph | Control flow graph | | :---: | :------------- | :---: | :---: | :---: | :--:| :---: | :---: | :---: | :---: | | 1 | Ruby | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | | | 2 | JavaScript | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | | | 3 | TypeScript | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | | | 4 | Python | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | | | 5 | Go | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | | | | PHP | ✅ | ✅ | ✅ | ✅ | ✅ | | | | | | Java | ✅ | ✅ | ✅ | 🔶 | ✅ | | | | | | JSON | ✅ | ✅ | ✅ | N/A | N/A | N/A | N/A| | | | JSX | ✅ | ✅ | ✅ | 🔶 | | | | | | | Haskell | ✅ | ✅ | ✅ | 🔶 | ✅ | | | | | | Markdown | ✅ | ✅ | ✅ | 🔶 | N/A | N/A | N/A |   | * ✅ — Supported * 🔶 — Partial support * 🚧 — Under development ## Development We use `cabal's` [Nix-style local builds][nix] for development. To get started quickly: ```bash git clone git@github.com:github/semantic.git cd semantic git submodule sync --recursive && git submodule update --init --recursive --force cabal new-update cabal new-build cabal new-test cabal new-run semantic -- --help ``` `semantic` requires GHC 8.6.4. We recommend using [`ghcup`][ghcup] to sandbox GHC versions. Our version bounds are based on [Stackage][stackage] LTS versions. The current LTS version is 13.13; `stack` build should also work if you prefer. [nix]: https://www.haskell.org/cabal/users-guide/nix-local-build-overview.html [stackage]: https://stackage.org [ghcup]: https://www.haskell.org/ghcup/ ## Technology and architecture Architecturally, `semantic`: 1. Reads blobs. 2. Generates parse trees for those blobs with [tree-sitter][tree-sitter] (an incremental parsing system for programmings tools). 3. Assigns those trees into a generalized representation of syntax. 4. Performs analysis, computes diffs, or just returns parse trees. 5. Renders output in one of many supported formats. Semantic leverages a number of interesting algorithms and techniques: - Myers' algorithm (SES) as described in the paper [*An O(ND) Difference Algorithm and Its Variations*][SES] - RWS as described in the paper [*RWS-Diff: Flexible and Efficient Change Detection in Hierarchical Data*][RWS]. - Open unions and [data types à la carte](http://www.cs.ru.nl/~W.Swierstra/Publications/DataTypesALaCarte.pdf). - An implementation of [Abstracting Definitional Interpreters][adi] extended to work with an à la carte representation of syntax terms. [SES]: http://www.xmailserver.org/diff2.pdf [RWS]: https://db.in.tum.de/~finis/papers/RWS-Diff.pdf [adi]: https://plum-umd.github.io/abstracting-definitional-interpreters/ [tree-sitter]: https://github.com/tree-sitter/tree-sitter ## Contributions Contributions are welcome! Please see our [contribution guidelines](CONTRIBUTING.md) and our [code of conduct](CODE_OF_CONDUCT.md) for details on how to participate in our community. ## Licensing Semantic is licensed under the [MIT license](LICENSE).