1
1
mirror of https://github.com/github/semantic.git synced 2024-11-28 01:47:01 +03:00
semantic/ROADMAP.md

3.4 KiB
Raw Blame History

Semantic Roadmap

Semantic Code produces data around codes structure and meaning. This is made available via GraphQL APIs, enabling others (both at GitHub and elsewhere) to build features.

See also our roadmap project for our current efforts.

Objectives

Run a resilient, scalable service

Our service should operate with a minimum of headaches. Our service should scale with GitHubs customer base and our language support.

Task PRP Priority (1 to 3) Amount of work (1 to 4)
Production readiness @tclem 1 3
Architecture review @tclem 1

Provide data about code to users, integrators, & researchers

Task PRP Priority (1 to 3) Amount of work (1 to 4)
GraphQL parse tree API @tclem
GraphQL diff API 2

GraphQL APIs

  • Parse tree.
  • Diff.
  • Repository and/or project-level (semantic index)
    • Dependencies
    • Some sort of persistence to “link” together different parse trees/repositories/etc.

Ongoing Work:

  • Extend/improve the data provided via GraphQL. API consumers will have to specifically opt in to any new fields in order to receive them.

LSP integration

Enables:

  • Find References from blobs
  • Go To definition
  • Find workspace symbols

Ongoing Work:

  • Hosting language servers
  • Hosting LSP servers in containers
  • Persistence or caching of requests

Semantic analysis

  • À la carte syntax
  • Cyclomatic complexity
  • Modular abstract interpreters
  • Type inference
    • Is this program well-typed?
    • What is the type at any particular node?

Ongoing Work:

  • type signatures,
  • parse/type errors,
  • callers/callees,
  • labelling symbol declarations (for e.g. jump to…/ToC as well as code search)
  • linking symbol references to declarations.

Language support

  • Python

Ongoing Work:

  • Extend the set of supported languages, guided by data on language usage.
  • Provide/improve tooling for writing/testing tree-sitter grammars.
  • Review contributions to open-source grammars.
  • Automate as much as possible to keep us focused on the core. Can we reduce/offload this in other ways?

Operability/Production readiness

  • Caching
  • Performance
  • Security review of tree-sitter
  • Resolving tree-sitter error recovery hangs

Ongoing Work:

  • Performance improvements.
  • Resiliency improvements/maintenance/operational excellence.
  • Metrics. How does it perform, how are people using it, what data do people care about, what dont they use at all.

ToC

  • Performance with large parse trees
  • Depends on improved operability
  • Production ready tree-sitter
  • ToC in Enterprise