Multi-character operators should use ‘try’ in order to be reported
correctly (as “operator”). I've mentioned it in doc-string of
‘makeExprParser’.
It's tempting to include ‘try’ directly in expression parsing code, but
following general spirit of Parsec toward ‘try’, I think current
solution is the best.
Eliminated ‘Text.Megaparsec.Language’ module because at this point it is
clear that already existing definitions are of little use in
Megaparsec. I started writing “default” language definition in
‘Text.Megaparsec.Lexer’.
At this point it should be possible to parse languages where indentation
matters, although we will need to provide more helpers to make it
easier.
The following functions and data types have been renamed:
* ‘permute’ → ‘makePermParser’
* ‘buildExpressionParser’ → ‘makeExprParser’
* ‘GenLanguageDef’ → ‘LanguageDef’
* ‘GenTokenParser’ → ‘Lexer’
* ‘makeTokenParser’ → ‘makeLexer’
* Removed ‘optionMaybe’ parser, because ‘optional’ from
‘Control.Applicative’ does the same thing.
* Renamed ‘tokenPrim’ → ‘token’, removed old ‘token’, because
‘tokenPrim’ is more general and ‘token’ is little used.
* Fixed bug with ‘notFollowedBy’ always succeeded with parsers that
don't consume input, see #6.
* Hint system introduced that greatly improved quality of error messages
and made code of ‘Text.Megaparsec.Prim’ a lot clearer.
The improvements affected other modules too:
* Some parsers from ‘Text.Megaparsec.Combinators’ now live in
‘Text.Megaparsec.Prim’.
* Hint system improved error messages, so I needed to rewrite test for
‘Text.Megaparsec.Char.eol’, since it's error messages are very
intelligent now and cannot be emulated by ‘newline’ and ‘crlf’ parsers
used separately.
* Test for Bug9 from old-tests is passed successfully again.
Added new character parsers in ‘Text.Megaparsec.Char’:
* ‘controlChar’
* ‘printChar’
* ‘markChar’
* ‘numberChar’
* ‘punctuationChar’
* ‘symbolChar’
* ‘separatorChar’
* ‘asciiChar’
* ‘latin1Char’
* ‘charCategory’
Renamed some parsers:
‘spaces’ → ‘space’
‘space’ → ‘spaceChar’
‘lower’ → ‘lowerChar’
‘upper’ → ‘upperChar’
‘letter’ → ‘letterChar’
‘alphaNum’ → ‘alphaNumChar’
‘digit’ → ‘digitChar’
‘octDigit’ → ‘octDigitChar’
‘hexDigit’ → ‘hexDigitChar’
Descriptions of old parsers have been updated to accent some
Unicode-specific moments. For example, old description of ‘letter’
stated that it parses letters from “a” to “z” and from “A” to “Z”. This
is wrong, since it used ‘Data.Char.isAlpha’ predicate internally and
thus parsed many more characters.