Commit Graph

4 Commits

Author SHA1 Message Date
Bartosz Nitka
74936df848 Make matching anywhere vs at pos obvious
Summary:
This change refactors the Engine to use a different
code path for when we're calling `lookupItem` to find
a first token `Node` matching the rule and a different
one for subsequent ones.

This division lets us get better invariants and more importantly
do full text regexp matches only when necessary.

This should be particularly useful for longer texts.

Reviewed By: patapizza

Differential Revision: D4953918

fbshipit-source-id: e3a69ad
2017-04-28 09:19:20 -07:00
Bartosz Nitka
879b103ca3 Fix indexing problems with new regexp matcher
Summary:
My change had a couple of problems:
* utf8 character width logic was completely wrong for characters that need 3 or 4 bytes
* `Array.listArray (start, end)` produces an array where `end` is a valid index
* because of ^ the `arraySize` logic also has to change

Reviewed By: watashi, darshankapashi

Differential Revision: D4894355

fbshipit-source-id: 8d07dfd
2017-04-14 15:49:17 -07:00
Bartosz Nitka
e37bb7c186 Duckling monad for Engine
Summary:
This converts the code to monadic style, so that
we can in the future:
* stop threading the `Document` parameter everywhere
* keep some state, like regexp match cache (I've already checked that it makes a substantial difference)

There should be no difference in performance or behavior
at this point.

Reviewed By: patapizza

Differential Revision: D4778808

fbshipit-source-id: a167ed8
2017-03-31 14:19:40 -07:00
Bartosz Nitka
bd94622f64 Move tests to tests and exes to exe
Summary:
This works around https://github.com/haskell/cabal/issues/4350
If we don't do this files get compiled multiple times
and cabal is unhappy.

Reviewed By: patapizza

Differential Revision: D4782749

fbshipit-source-id: 5bbe425
2017-03-27 16:04:24 -07:00