mirror of https://github.com/simonmichael/hledger.git (synced 2025-01-06 02:23:46 +03:00)
;csv: clarify that whitespace is not stripped when matching
parent 3919f0945a
commit 6b2dfad98c
@@ -1075,12 +1075,16 @@ getEffectiveAssignment rules record f = lastMay $ map snd $ assignments
   where
     -- does this individual matcher match the current csv record ?
     matcherMatches :: Matcher -> Bool
-    matcherMatches (RecordMatcher pat) = regexMatchesCI pat wholecsvline
+    matcherMatches (RecordMatcher pat) = regexMatchesCI pat' wholecsvline
       where
-        -- a synthetic whole CSV record to match against; note, it has
-        -- no quotes enclosing fields, and is always comma-separated,
-        -- so may differ from the actual record, and may not be valid CSV.
-        wholecsvline = dbg3 "wholecsvline" $ intercalate "," record
+        pat' = dbg3 "regex" pat
+        -- A synthetic whole CSV record to match against. Note, this can be
+        -- different from the original CSV data:
+        -- - any whitespace surrounding field values is preserved
+        -- - any quotes enclosing field values are removed
+        -- - and the field separator is always comma
+        -- which means that a field containing a comma will look like two fields.
+        wholecsvline = dbg3 "wholecsvline" $ intercalate "," record -- $ map strip record ?
     matcherMatches (FieldMatcher csvfieldref pat) = regexMatchesCI pat csvfieldvalue
       where
         -- the value of the referenced CSV field to match against.
@@ -545,9 +545,12 @@ REGEX
 REGEX is a case-insensitive regular expression which tries to match anywhere within the CSV record.
 It is a POSIX extended regular expressions with some additions (see
 [Regular expressions](https://hledger.org/hledger.html#regular-expressions) in the hledger manual).
-Note: the "CSV record" it is matched against is not the original record, but a synthetic one,
-with enclosing double quotes or whitespace removed, and always comma-separated.
-(Eg, an SSV record `2020-01-01; "Acme, Inc."; 1,000` appears to REGEX as `2020-01-01,Acme, Inc.,1,000`).
+
+Important note: the record that is matched is not the original record, but a synthetic one,
+with any enclosing double quotes (but not enclosing whitespace) removed, and always comma-separated
+(which means that a field containing a comma will appear like two fields).
+Eg, if the original record is `2020-01-01; "Acme, Inc."; 1,000`,
+the REGEX will actually see `2020-01-01,Acme, Inc., 1,000`).
 
 Or, MATCHER can be a field matcher, like this:
 ```rules
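The behaviour documented in this commit can be illustrated with a small standalone sketch. It is not hledger's code: `CsvRecord`, `syntheticLine`, `exampleRecord`, `exampleLine` and `apparentFieldCount` are hypothetical names, and the parsed field values are an assumption about how the SSV example would be split (quotes removed, the space before the unquoted third field kept). Only `intercalate ","` is taken from the diff.

```haskell
import Data.List (intercalate)

-- Hypothetical stand-in for a parsed record: one String per field,
-- with any enclosing double quotes already removed by the parser.
type CsvRecord = [String]

-- Join the fields with commas, leaving surrounding whitespace untouched,
-- as the `intercalate "," record` expression in the diff does.
syntheticLine :: CsvRecord -> String
syntheticLine = intercalate ","

-- Assumed parse of the SSV record  2020-01-01; "Acme, Inc."; 1,000
exampleRecord :: CsvRecord
exampleRecord = ["2020-01-01", "Acme, Inc.", " 1,000"]

-- The text a record matcher's regex would be tested against.
exampleLine :: String
exampleLine = syntheticLine exampleRecord
-- "2020-01-01,Acme, Inc., 1,000"

-- Number of comma-separated pieces a pattern would appear to see.
apparentFieldCount :: String -> Int
apparentFieldCount = (+ 1) . length . filter (== ',')
```

Under these assumptions the three original fields appear as five comma-separated pieces, so a pattern that relies on commas as field boundaries (or on the absence of a leading space) may not match where expected. A field matcher avoids the ambiguity by testing a single field value.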