…and update our API usages.
Tree-sitter harmonized the API differences between `web-tree-sitter` and
`node-tree-sitter` in version 0.22.0. This is the first time we’ve had to deal
with that. Luckily, the changes were mostly palatable.
The biggest API difference is in `Query#captures`; two positional arguments for
defining the extent of the query have been moved to keyword arguments. We’ve
updated our internal usages, but any community packages that relied on the old
function signature would break if we didn’t do anything about it. So we’ve
wrapped the `Query#captures` method in one of our own; it detects usages that
expect the old signature and rearranges their arguments, issuing a deprecation
warning in the process. Hopefully this generates enough noise that any such
packages understand what’s going on and can update.
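
For illustration, here is a minimal sketch of how such a wrapper could detect the old signature. It is not the code we shipped: `wrapCaptures`, the `warn` parameter, and the stub query are hypothetical, and the option names `startPosition` and `endPosition` are assumed to be the keyword equivalents of the old positional arguments.

```javascript
// Hypothetical sketch; names and option keys are assumptions, not our real code.
function wrapCaptures(query, warn = console.warn) {
  const original = query.captures.bind(query);

  // A point is assumed to look like { row, column }.
  const looksLikePoint = (value) =>
    value != null && typeof value === 'object' &&
    'row' in value && 'column' in value;

  query.captures = function (node, ...rest) {
    const [first, second] = rest;
    if (looksLikePoint(first)) {
      // Old style: captures(node, startPosition, endPosition).
      warn('Query#captures: positional range arguments are deprecated; ' +
        'pass an options object instead.');
      const options = { startPosition: first };
      if (looksLikePoint(second)) options.endPosition = second;
      return original(node, options);
    }
    // New style: captures(node, options).
    return original(node, first);
  };
  return query;
}

// Stub standing in for a real query object, so the sketch runs on its own.
const stubQuery = {
  captures(node, options = {}) {
    return { node, options };
  }
};
const warnings = [];
const wrapped = wrapCaptures(stubQuery, (message) => warnings.push(message));

const oldStyle = wrapped.captures(
  'root', { row: 0, column: 0 }, { row: 5, column: 0 }
);
const newStyle = wrapped.captures(
  'root', { startPosition: { row: 1, column: 0 } }
);
```

In this sketch only the old-style call produces a warning; the new-style call passes straight through to the original method.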
Other API changes are more obscure — which is good, because we can’t wrap them
the way we wrapped `Query#captures`. They involve conversion of functions to
getters (`node.hasErrors` instead of `node.hasErrors()`), and there’s no good
way to make both usages work… short of wrapping nodes in `Proxy` objects, and
that’s not on the table.
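
A package that has to run against both versions could paper over the difference on its own side. The helper below is a hypothetical sketch of that idea (`readNodeFlag` and both stub nodes are made up for illustration); it is not something we provide.

```javascript
// Hypothetical caller-side helper: read a node property whether the
// installed version exposes it as a method or as a getter.
function readNodeFlag(node, name) {
  const value = node[name];
  return typeof value === 'function' ? value.call(node) : value;
}

// Stubs standing in for real nodes from the old and new APIs.
const oldStyleNode = { hasErrors() { return true; } };
const newStyleNode = { get hasErrors() { return true; } };

const fromOld = readNodeFlag(oldStyleNode, 'hasErrors');
const fromNew = readNodeFlag(newStyleNode, 'hasErrors');

// Calling the getter like a function is what breaks old callers:
// `newStyleNode.hasErrors()` throws a TypeError.
let oldCallThrows = false;
try {
  newStyleNode.hasErrors();
} catch (error) {
  oldCallThrows = error instanceof TypeError;
}
```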
A lot has changed in `tree-sitter` since we last upgraded `web-tree-sitter`,
so I updated our documentation about building a custom version of
`web-tree-sitter`.
This one’s got all the frills, including injections into HTML documents and a PHPDoc grammar.
As part of this change, we're also migrating to `web-tree-sitter` version 0.20.8 with some customizations. The PR I submitted at https://github.com/tree-sitter/tree-sitter/pull/2795 has landed on this fork, though if the same issues get fixed in a different way upstream, I'll adopt that approach as well. The PHPDoc parser needed another external added.
I think the previous WASM file was built from 0.19.0, but much has been fixed since then.
This change required a new export (`isalpha`) and therefore a rebuild of the WASM Tree-sitter bindings themselves.
The `tree-sitter-ruby` parser wanted to call
`_ZNSt3__212basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE17__assign_no_aliasILb1EEERS5_PKcm`
and couldn't find it. I added that function to `exports.json` (as described in
`vendor/README.md`) and rebuilt `web-tree-sitter`. Seems to be fixed now.
The comment in `src/web-tree-sitter.js` and the documentation in
`vendor/web-tree-sitter` explain why this change is needed. I hope it's
temporary.
I haven't removed our `web-tree-sitter` dependency in `package.json` because
it's a useful way of declaring (and remembering) which version of tree-sitter
we've built against, and because it lets us flip a single config flag in
`src/web-tree-sitter.js` to compare the behavior of our custom version with that
of the stock version. The cost of keeping the stock version around is under 300
kilobytes.
The custom builds will come from specialty branches on our tree-sitter fork.
When we upgrade to a new version, someone should follow the directions in
`vendor/web-tree-sitter/README.md` to create a new specialty branch.
When someone wants to build a `language-foo` package and finds that
`tree-sitter-foo` needs a function we don't yet export, they should be able to
see the warning we generate in the console and file a ticket. If we're on our
game, we should be able to generate a new web-tree-sitter build and get it into
a rolling release within a week or so.
I hope web-tree-sitter someday finds a way to fix this so that we no longer need
to do this chore. See https://github.com/tree-sitter/tree-sitter/issues/949.
Just to document this: I tried a HUGE number of ideas to make the tests
run reliably. They need to wait until the editor is ready to be
tokenized, but our callbacks are not reliable. For the first
tokenization with tree-sitter, they are called _after_ we register the
callback, but for the second, they are called _before_ we register
them. There's no reliable way to _actually listen_ to changes, and none
of these callbacks were even _called_ by TextMate grammars.
So, this `.ready` will stay here so we know when TreeSitter is ready to
tokenize.
A quick explanation of this: the old code matched only if the full
scope was identical. This is fine when we want to test a single grammar,
but TextMate grammars, for example, always added `.ruby` as the last
part of the scope. That was quite bad for TreeSitter, which didn't do
the same thing. So now it matches against a finely crafted RegExp that
checks whether the full scope matches from the beginning, OR whether
part of the scope (up to just before a `.`) matches. So, for example,
for `constant.other.ruby` it'll match `constant`, `constant.other`, and
`constant.other.ruby`, but it'll NOT MATCH `constant.oth`.
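
A rough sketch of that matching rule, assuming the check is a prefix match that must end at a `.` or at the end of the scope. The names `escapeRegExp` and `scopeMatches` are hypothetical and the real matcher may be built differently.

```javascript
// Hypothetical sketch of the prefix rule described above.
function escapeRegExp(text) {
  // Escape characters that have special meaning in a RegExp (notably `.`).
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function scopeMatches(actualScope, expectedScope) {
  // The expected scope must match from the beginning and stop either at
  // a `.` boundary or at the very end of the actual scope.
  const pattern = new RegExp(`^${escapeRegExp(expectedScope)}(\\.|$)`);
  return pattern.test(actualScope);
}

const actual = 'constant.other.ruby';
const results = [
  'constant',
  'constant.other',
  'constant.other.ruby',
  'constant.oth'
].map((expected) => scopeMatches(actual, expected));
// results is [true, true, true, false]
```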
If no error message is given, show the filename and line number when a
test fails because a `waitsFor` call timed out.
Co-authored-by: Max Brunsfeld <maxbrunsfeld@github.com>