$base now works the way $self worked previously; $self itself still needs to
be fixed. Inside an embedded grammar, $self refers to the embedded grammar,
while $base refers to the overall grammar.
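
A minimal sketch of the intended distinction, in Python for illustration only. The `resolve_include` helper and the dictionary-shaped grammars are hypothetical and are not the project's actual API; they only show which grammar each reference should resolve to once $self is fixed.

```python
def resolve_include(name, base_grammar, current_grammar):
    """Resolve an include reference (hypothetical helper, not the real API).

    base_grammar    -- the overall grammar that tokenization started with
    current_grammar -- the grammar whose rules contain the include
    """
    if name == "$base":
        return base_grammar
    if name == "$self":
        return current_grammar
    if name.startswith("#"):
        # Repository lookups are local to the grammar doing the include.
        return current_grammar["repository"][name[1:]]
    raise KeyError(f"unknown include: {name}")


# Hypothetical grammars: JavaScript embedded inside HTML.
html = {"scopeName": "text.html.basic", "repository": {}}
js = {"scopeName": "source.js",
      "repository": {"string": {"match": r'"[^"]*"'}}}

inside_embedded_self = resolve_include("$self", html, js)
inside_embedded_base = resolve_include("$base", html, js)
```

When the include is evaluated at the top level, both arguments are the same grammar, so $self and $base coincide.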
Fix how repositories work in TextMate grammars
Previously, we treated all grammar repositories as rules, but some grammars have repositories that are a single pattern. If a repository is a single pattern, we now transform it into a rule with one pattern.
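
The normalization can be sketched roughly as follows, in Python for illustration. `normalize_repository` is a hypothetical name, and the test used to recognize a single pattern (a top-level `match` or `begin` key) is an assumption about the grammar data, not necessarily the check the real code performs.

```python
def normalize_repository(repository):
    """Return a copy in which every entry is a rule with a 'patterns' list."""
    normalized = {}
    for name, entry in repository.items():
        # Assumption: an entry that carries its own regex is a single pattern.
        is_single_pattern = "match" in entry or "begin" in entry
        if is_single_pattern:
            # Wrap the lone pattern so callers can always iterate 'patterns'.
            normalized[name] = {"patterns": [entry]}
        else:
            normalized[name] = entry
    return normalized


repository = {
    "number": {"match": r"\d+", "name": "constant.numeric"},
    "values": {"patterns": [{"include": "#number"}]},
}
normalized = normalize_repository(repository)
```

After this step every repository entry has the same shape, so the rest of the code only has to handle rules.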
We compile one giant regex out of the individual regexes for each pattern by wrapping each one in a capture group and joining the groups with alternation (|). We then use the index of the wrapper group that matched to determine which pattern actually matched, and shift the capture indices of that subtree so they appear to start from index 0 and line up with the capture indices declared on the pattern. More complex patterns are still broken, but basic patterns and patterns with captures work.
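
Here is a rough sketch of the idea using Python's `re` module, for illustration only. The function names are hypothetical and the real implementation uses a different regex engine, so details will differ. One known limit of this simple version: numbered backreferences inside a pattern would point at the wrong group once the patterns are combined.

```python
import re


def compile_combined(sources):
    """Join pattern sources into one regex, one wrapper group per pattern."""
    parts, offsets, counts = [], [], []
    next_index = 1  # group 0 is the whole match, so wrappers start at 1
    for source in sources:
        inner_groups = re.compile(source).groups
        offsets.append(next_index)   # index of this pattern's wrapper group
        counts.append(inner_groups)  # how many groups the pattern declares
        parts.append(f"({source})")
        next_index += 1 + inner_groups
    return re.compile("|".join(parts)), offsets, counts


def match_combined(combined, offsets, counts, text, pos=0):
    """Return (pattern_index, captures) with captures renumbered from 0."""
    m = combined.search(text, pos)
    if m is None:
        return None
    for pattern_index, offset in enumerate(offsets):
        if m.start(offset) != -1:  # this wrapper group took part in the match
            captures = [m.group(offset + k)
                        for k in range(counts[pattern_index] + 1)]
            return pattern_index, captures
    return None


sources = [r"(\d+)\.(\d+)", r"([a-z]+)=(\w+)"]
combined, offsets, counts = compile_combined(sources)
result = match_combined(combined, offsets, counts, "x=42")
# result == (1, ["x=42", "x", "42"])
```

In the renumbered list, entry 0 is the text matched by the winning pattern and entries 1..n are that pattern's own capture groups, which is what the pattern's capture scopes expect.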
Our previous implementation only allowed for a single layer of capture groups. Now we can have captures within captures. I achieved this by converting the match into a tree before generating tokens. If any capture scopes are specified, we always emit a token for every capture group in the match. This may create some redundant tokens (a series of two or more tokens with the same scopes), but the output will at least be technically correct. For now, I think the overhead of removing these redundancies exceeds the cost of keeping them.
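
One way to picture the match-to-tree step, sketched in Python for illustration. `build_capture_tree` is a hypothetical name, and nesting is inferred here from the captured spans, which is a simplifying assumption: groups with empty or identical spans can be misplaced, and the real code may determine nesting differently.

```python
import re


def build_capture_tree(match):
    """Nest capture groups by span containment (a heuristic sketch)."""
    root = {"index": 0, "span": match.span(0), "children": []}
    stack = [root]
    for index in range(1, match.re.groups + 1):
        start, end = match.span(index)
        if start == -1:
            continue  # this group did not take part in the match
        node = {"index": index, "span": (start, end), "children": []}
        # Groups are numbered in order of their opening parenthesis, so the
        # parent is the nearest earlier group whose span contains this one.
        # The root is never popped: groups captured inside a lookahead can
        # fall outside group 0 and simply end up attached to the root.
        while len(stack) > 1 and not (
            stack[-1]["span"][0] <= start and end <= stack[-1]["span"][1]
        ):
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


m = re.match(r"((\w+)\.(\w+))\(\)", "os.getcwd()")
tree = build_capture_tree(m)
call = tree["children"][0]
# call is group 1 ("os.getcwd"); its children are groups 2 and 3
```

Walking this tree depth-first and emitting one token per node gives a token for every capture group, including the redundant ones mentioned above.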
This will help in dealing with capture groups that nest within other capture groups, and also with trailing lookahead groups that don't belong in the main match. I made it a class method because it's stateless, which made it easier to test.