This code benchmarks the "string" primitive and various non-primitive combinators.
The code coverage of these benchmarks is not 100%: new benchmarks should be added
as relevant performance questions are discovered.
Various languages may vary in how hexadecimal and octal literals should
be prefixed. Following the spirit of the new lexer we leave this to
programmer to decide.
Eliminated ‘Text.Megaparsec.Language’ module because at this point it is
clear that already existing definitions are of little use in
Megaparsec. I started writing “default” language definition in
‘Text.Megaparsec.Lexer’.
At this point it should be possible to parse languages where indentation
matters, although we will need to provide more helpers to make it
easier.
The single test covers 100 % of the module's code. However it doesn't
check quality of error messages, so we still have room for improvement.
Manual tests show that error messages are good.
Obviously order does matter here, since ‘Monoid’ instance for ‘Hints’ is
derived from [], so (<>) is the same as (++) and we should be careful
to keep things in the right order.
These parsers are considered deprecated:
* ‘chainl’
* ‘chainl1’
* ‘chainr’
* ‘chainr1’
* ‘sepEndBy’
* ‘sepEndBy1’
Apart from this, the commit includes various cosmetic changes in
module ‘Text.Megaparsec.Combinator’.
The following functions and data types have been renamed:
* ‘permute’ → ‘makePermParser’
* ‘buildExpressionParser’ → ‘makeExprParser’
* ‘GenLanguageDef’ → ‘LanguageDef’
* ‘GenTokenParser’ → ‘Lexer’
* ‘makeTokenParser’ → ‘makeLexer’
The improved error messages in Megaparsec are quite sensitive to how
parsers are written, which parts of parser are labeled, etc. Current
implementation of token parsers in ‘Text.Megaparsec.Token’ is written
without this in mind. We will improve the module later, for now let us
rewrite/simplify some parts to avoid failing tests.
If ‘x’ in ‘x >>= y’ consumes input but produces some hints, we should
accumulate them nonetheless. Why it's important can be demonstrated by
the following test:
many (char 'a') >> many (char 'b') >> eof
This should fail on input "ac" with the following message:
parse error at line 1, column 2:
unexpected 'c'
expecting 'a', 'b' or end of input
As you can see even though parser ‘many (char 'a')’ consumed input, its
hits may be useful later.