design-patterns-for-parser-combinators

j-mie6/design-patterns-for-parser-combinators

mirror of https://github.com/j-mie6/design-patterns-for-parser-combinators.git synced 2024-10-03 17:29:44 +03:00

Author	SHA1	Message	Date
Jamie Willis	e497fcff48	Removed cabal freeze and quick-check replay	2021-08-26 16:12:56 +01:00
Nicolas Wu	c8c135f182	Added parser, but tests timeout (#6 ) This is the trace for our Haskell Symposium talk, consisting of the following "steps". Check out the original PR to see each of the individual commits and how it all came together! * Added parser, but tests timeout * Added chainl1 and chainr1 combinators (Pattern 1a: Homogeneous Chains) ``` {---------------------------------------------} {- Pattern 1a: Homogenous Chains -} {---------------------------------------------} {- For binary operators where the -} {- associativity is not specified, use -} {- `chainl1` or `chainr1` to combine -} {- operands with the operators. -} {---------------------------------------------} ``` The combinators `chainl1` and `chainr1` are commonly found in many parser combinator libraries, and are the canonical way of handling the left-recursion problem _without_ resorting to grammar refactoring. In `megaparsec`, however, the chains are replaced by precedence parsers. * Fixed parser using chainl1 ``` {---------------------------------------------} {- Pattern 1a: Homogenous Chains -} {---------------------------------------------} {- For binary operators where the -} {- associativity is not specified, use -} {- `chainl1` or `chainr1` to combine -} {- operands with the operators. -} {---------------------------------------------} ``` * Fixed whitespace parsing issue (Patterns 2a and 2bi) ``` {---------------------------------------------} {- Pattern 2a: Whitespace Combinators -} {---------------------------------------------} {- Build a `lexeme` combinator dedicated to -} {- consuming trailing whitespace after a -} {- given lexeme. Build a `fully` combinator -} {- to consume initial whitespace and the end -} {- of input. -} {---------------------------------------------} ``` After factoring the tokens into `Lexer.hs`, building the new combinators helps separate individual responsibilities. ``` {---------------------------------------------} {- Pattern 2bi: Tokenising Combinators -} {---------------------------------------------} {- Annotate terminals with a `token` -} {- combinator, built on `lexeme`, to -} {- atomically parse them with whitespace -} {- consumed. -} {---------------------------------------------} ``` Building a `token` combinator helps keep things atomic. * Added keyword combinator to fix "negatex" -> "negate x" (Pattern 2bii: Keyword Combinators) ``` {---------------------------------------------} {- Pattern 2bii: Keyword Combinators -} {---------------------------------------------} {- Avoid using `string` with `token` for -} {- keywords, use a `keyword` combinator that -} {- enforces that the keyword does not form a -} {- valid prefix of another token. -} {---------------------------------------------} ``` Using a `keyword` combinator helps to distingish keywords from identifiers and other tokens. * Fixed pretty printer Pretty printer was missing its parentheses, and the tests are comparing on pretty printed terms for some reason? All that is fixed now! * Added heterogeneous chains: infixl1, infixr1, prefix, and postfix (Pattern 1b: Heterogeneous Chains) ``` {---------------------------------------------} {- Pattern 1b: Heterogeneous Chains -} {---------------------------------------------} {- For associative binary operators where -} {- operand types may differ, use `infixl1` -} {- or `infixr1` to combine operands with -} {- their operators, in conjunction with -} {- strongly typed semantic actions -} {---------------------------------------------} ``` The problem with the traditional `chainl1` and `chainr1` is that they are indistinguishable in their types, which can cause minor typos to cause bugs. We offer _refined_ versions called `infixl1` and infixr1` that encode the associativity in the types. To use these, the AST needs to more strongly adhere to the grammar, though the original chains can be recovered by using `id :: a -> a`. * Switched to strong ast and infixes ``` {---------------------------------------------} {- Pattern 1b: Heterogeneous Chains -} {---------------------------------------------} {- For associative binary operators where -} {- operand types may differ, use `infixl1` -} {- or `infixr1` to combine operands with -} {- their operators, in conjunction with -} {- strongly typed semantic actions -} {---------------------------------------------} ``` Switching over to the `StrongAST`, broken into `Expr`, `Term`, `Negate`, and `Atom` allows us to use the heterogeneous chains, and improve the type safety of the parser. * Used OverloadedStrings to encapsulate lexing behaviour, tests working! (Pattern 2c: Overloaded Strings) ``` {---------------------------------------------} {- Pattern 2c: Overloaded Strings -} {---------------------------------------------} {- Hide tokenising logic by allowing string -} {- literals to serve as parsers. -} {---------------------------------------------} ``` By using `OverloadedStrings` we can hide all the tokenising logic behind the `IsString` instance for `Parser ()`. This means that we can add and remove keywords without needing to change anything in the parser, whilst keeping it looking clean. If the extension is undesirable, a combinator performing the job of `fromString` is also a great idea. Co-authored-by: Jamie Willis <j.willis19@imperial.ac.uk>	2021-08-26 16:08:23 +01:00
Jamie Willis	62b92b07dd	Linked usernames [skip ci]	2021-08-25 13:52:25 +01:00
Jamie Willis	eb9d9a34c9	added blank line to MiniParsec [skip ci]	2021-08-25 12:05:31 +01:00
Jamie Willis	d4f0082857	disabled CI on PR, our PRs are on this repo anyway, so push suffices [skip ci]	2021-08-25 11:00:38 +01:00
Jamie Willis	2eb9b01de2	CI fixed, but tests must be enabled before they can run on CI [skip ci]	2021-08-25 10:57:27 +01:00
Jamie Willis	9dd1ceaaa2	fix ci	2021-08-25 10:50:56 +01:00
Jamie Willis	85a975c88b	Changed CI	2021-08-25 10:42:05 +01:00
Jamie Willis	dd50152779	Add authors [skip ci]	2021-08-25 10:33:09 +01:00
Jamie Willis	4827144be1	Added CI	2021-08-24 15:22:42 +01:00
Jamie Willis	39995e6b8d	Added freeze file and relaxed base	2021-08-24 15:20:46 +01:00
Jamie Willis	d78b63e5f4	compare expressions based on pretty for normalisation	2021-08-24 14:34:41 +01:00
Jamie Willis	620889272e	Add timeout to test	2021-08-24 14:04:55 +01:00
Jamie Willis	06299910b4	removed redundant parentheses, it's just a wrapper... right?	2021-08-24 13:16:24 +01:00
Jamie Willis	aa815ed4cf	Readme update, more emoji for that authentic start-up feel	2021-08-23 17:48:55 +01:00
Jamie Willis	651d8f4a8d	Change company name	2021-08-23 11:27:17 +01:00
Jamie Willis	4c2513c84e	typos	2021-08-23 11:19:09 +01:00
Jamie Willis	16e401cd66	Added readme	2021-08-23 11:14:52 +01:00
Jamie Willis	96b481223e	Fleshed out the tests and infrastructure	2021-08-23 09:06:11 +01:00
Jamie Willis	f246d3db6b	Almost working parser, enough to check the infrastructure is working!	2021-08-23 09:04:30 +01:00
Jamie Willis	ea0d85d71f	Added parser tests	2021-08-20 17:26:29 +01:00
Jamie Willis	cdfd6a89a4	Added tests for converters	2021-08-20 17:19:58 +01:00
Jamie Willis	2c17568e9e	Added convertion and a pretty printer	2021-08-20 16:38:47 +01:00
Jamie Willis	c595c07035	Added test folder and broke the calculator into main	2021-08-20 15:44:41 +01:00
Jamie Willis	d652614755	Initial repo, can someone write the parser?	2021-08-20 15:24:06 +01:00

25 Commits