From 5be3ee9e20bd64ff122097cfb013583ff1025a38 Mon Sep 17 00:00:00 2001
From: Simon Michael
Date: Sun, 7 Apr 2024 23:15:15 -1000
Subject: [PATCH] imp: disallow date: in expr: OR expressions, avoiding unclear semantics [#2177][#2178]

---
 hledger-lib/Hledger/Query.hs | 18 +++++++++++++-----
 hledger/hledger.m4.md        | 29 +++++++++++++++++++----------
 hledger/test/query-expr.test | 19 +++++++++----------
 3 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/hledger-lib/Hledger/Query.hs b/hledger-lib/Hledger/Query.hs
index a662a2495..72c3013fd 100644
--- a/hledger-lib/Hledger/Query.hs
+++ b/hledger-lib/Hledger/Query.hs
@@ -324,10 +324,12 @@ parseQueryTerm d s = parseQueryTerm d $ defaultprefix<>":"<>s
 -- prefix-operator "NOT e" is always parsed before "e AND e", "e AND e" before "e OR e",
 -- and "e OR e" before "e e".
 --
--- The space-separation operator is left as it was the default before the introduction of
--- boolean operators. It takes the behaviour defined in the interpretQueryList function,
--- whereas the NOT, OR, and AND operators simply wrap a list of queries with the associated
---
+-- The "space" operator still works as it did before the introduction of boolean operators:
+-- it combines terms according to their types, using parseQueryList.
+-- Whereas the new NOT, OR, and AND operators work uniformly for all term types.
+-- There is an exception: queries being OR'd may not specify a date period,
+-- because that can produce multiple, possibly disjoint, report periods and result sets,
+-- and we don't have report semantics worked out for it yet. (#2178)
 --
 -- The result of this function is either an error encountered during parsing of the
 -- expression or the combined query and query options.
@@ -361,8 +363,14 @@ parseBooleanQuery d t =
         _ -> (simplifyQuery $ f qs, qoptss')
 
     -- Containing query expressions separated by "or".
+    -- If there's more than one, make sure none contains a "date:".
     orExprsP :: SimpleTextParser (Query, [QueryOpt])
-    orExprsP = combineWith Or <$> andExprsP `sepBy` (try $ skipNonNewlineSpaces >> string' "or" >> skipNonNewlineSpaces1)
+    orExprsP = do
+      exprs <- andExprsP `sepBy` (try $ skipNonNewlineSpaces >> string' "or" >> skipNonNewlineSpaces1)
+      if ( length exprs > 1
+        && (any (/=Any) $ map (filterQuery queryIsDateOrDate2 . fst) exprs))
+      then fail "sorry, using date: in OR expressions is not supported."
+      else return $ combineWith Or exprs
 
       where
         -- Containing query expressions separated by "and".
diff --git a/hledger/hledger.m4.md b/hledger/hledger.m4.md
index cf9891711..d71e1614f 100644
--- a/hledger/hledger.m4.md
+++ b/hledger/hledger.m4.md
@@ -5082,22 +5082,31 @@ The [print](#print) command is a little different, showing transactions which:
 - match all the other terms.
 
 We also support more complex boolean queries with the `expr:` prefix.
-This allows one to combine queries using `AND`, `OR`, and `NOT`.
-(`NOT` is equivalent to the `not:` prefix.) Some examples:
+This allows one to combine query terms using `and`, `or`, `not` keywords (case insensitive),
+and to group them by enclosing in parentheses.
 
-- Match transactions with 'cool' in the description AND with the 'A' tag
+Some examples:
 
-  `expr:"desc:cool AND tag:A"`
+- Exclude account names containing 'food':
 
-- Match transactions NOT to the 'expenses:food' account OR with the 'A' tag
+  `expr:"not food"` (`not:food` is equivalent)
 
-  `expr:"NOT expenses:food OR tag:A"`
+- Match things which have 'cool' in the description and the 'A' tag:
 
-- Match transactions NOT involving the 'expenses:food' account OR
-  with the 'A' tag AND involving the 'expenses:drink' account.
-  (the AND is implicitly added by space-separation, following the rules above)
+  `expr:"desc:cool and tag:A"` (`expr:"desc:cool tag:A"` is equivalent)
+
+- Match things which either do not reference the 'expenses:food' account, or do have the 'A' tag:
+
+  `expr:"not expenses:food or tag:A"`
+
+- Match things which either do not reference the 'expenses:food' account,
+  or which reference the 'expenses:drink' account and also have the 'A' tag:
+
+  `expr:"expenses:food or (expenses:drink and tag:A)"`
+
+`expr:` has a restriction: `date:` queries may not be used inside `or` expressions.
+That would allow disjoint report periods or disjoint result sets, with unclear semantics for our reports.
 
-  `expr:"expenses:food OR (tag:A expenses:drink)"`
 
 ## Queries and command options
 
diff --git a/hledger/test/query-expr.test b/hledger/test/query-expr.test
index 8b452a99c..ead6d525c 100644
--- a/hledger/test/query-expr.test
+++ b/hledger/test/query-expr.test
@@ -147,14 +147,13 @@ $ hledger -f - print expr:"not (tag:transactiontag=B)"
 
 >=
 
-# ** 11. Posting-based reports handle OR'd open-ended date periods properly. (#2177)
-<
-2023-12-26 2023
-  (2023)  2023
+# ** 11. With expr:, it's possible for a query (with OR) to specify multiple different date periods.
+# This is problematic for report semantics in several ways. For example,
+# expr:'(date:2023 AND drinks) OR (date:2024 AND food)' produces two disjoint result sets, and
+# expr:'date:feb or date:may or date:nov' produces three disjoint report periods with holes between them.
+# Can all of our reports handle holes properly, calculate historical starting balances properly, etc ?
+# For now the answer is no and therefore OR-ing of date periods must be disallowed. (#2178)
+$ hledger -f- reg expr:'date:2023 OR date:2024'
+>2 /using date: in OR expressions is not supported/
+>=1
 
-2024-01-26 2024
-  (2024)  2024
-
-$ hledger -f- reg -w80 expr:'date:2023 or date:2024'
-2023-12-26 2023 (2023) 2023 2023
-2024-01-26 2024 (2024) 2024 4047