stripAnsi is called many times during rendering (by strWidth), so
should be fast. It was originally a regex replacement, and more
recently a custom parser. The parser was slower, particularly the one
in 1.19.1. See #1350, and this rough test:
time118ish = timeIt $ print $ length $ concat $ map (fromRight undefined . regexReplace (toRegex' "\ESC\\[([0-9]+;)*([0-9]+)?[ABCDHJKfmsu]") "") testdata
time119 = timeparser (many (takeWhile1P Nothing (/='\ESC') <|> "" <$ ansi))
time1191 = timeparser (many ("" <$ try ansi <|> pure <$> anySingle))
timeparser p = timeIt $ print $ length $ concat $ map (concat . fromJust . parseMaybe p) testdata
testdata = concat $ replicate 10000
[ "2008-01-01 income assets🏦checking $1 $1"
, "2008-06-01 gift assets🏦checking $1 $2"
, "2008-06-02 save assets🏦saving $1 $3"
, " assets🏦checking ..m$-1\ESC[m\ESC[m $2"
, "2008-06-03 eat & shop assets:cash ..m$-2\ESC[m\ESC[m 0"
, "2008-12-31 pay off assets🏦checking ..m$-1\ESC[m\ESC[m ..m$-1\ESC[m\ESC[m"
]
ghci> time118ish
4560000
CPU time: 0.17s
ghci> time119
4560000
CPU time: 0.91s
ghci> time1191
4560000
CPU time: 2.76s
Possibly a more careful parser could beat regexReplace. Note the
latter does memoisation, which could be faster and/or could also use
more resident memory in some situations.
Ideally we would calculate all widths before adding ANSI colour codes,
so we wouldn't have to wastefully strip them.
This PR #1330, addressing #1312 (parseQuery is partial) and #1245
(internal server error).
User-visible changes:
- hledger-web now handles malformed regular expressions
(eg, a query consisting of the single character `?`) gracefully,
showing a tidy error message instead "internal server error".
API/internal changes:
- The Regex type alias has been replaced by the Regexp ADT, which
contains both the compiled regular expression (so is guaranteed to
be usable at runtime) and the original string (so can be serialised,
printed, compared, etc.) A Regexp also knows whether is it case
sensitive or case insensitive. The Hledger.Utils.Regex api has changed.
- Typeable and Data instances are no longer derived for hledger's
data types; they were redundant/no longer needed
- NFData instances are no longer derived for hledger's data types.
This speeds up a full build by roughly 7%. But it means we can't
deep-evaluate hledger values, or time hledger code with Criterion.
https://github.com/simonmichael/hledger/pull/1330#issuecomment-684075129
has some ideas on this.
- Query no longer has a custom Show instance
- Some internal use of regexps was replaced by text replacement or
parsers.
- Hledger.Utils.String: quoteIfNeeded now actually escapes quotes in
strings; dropped escapeQuotes
- Hledger.Utils.Tree: dropped some old utilities
- dropped some obsolete code for the old --display option
Merge branch 'regexp' into master
matchesAccount_
matchesAmount_
matchesCommodity_
matchesPosting_
matchesPriceDirective_
matchesTags_
matchesTransaction_
These don't yet have tests of their own, but were converted
mechanically from the originals which should help.