44 Commits

Author SHA1 Message Date
Simon Michael
ee29893040 dev: fix Hledger.Utils.String import 2023-03-16 14:48:59 -10:00
Simon Michael
d3e4f8547c imp: lib: Hledger.Utils.String: more string strippers
added:
strip1Char
stripBy
strip1By

Not used in hledger right now, but useful to offer in our scripting prelude.
2023-03-16 14:35:37 -10:00
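
Note (illustrative only, not the code from this commit): the three strippers named above could look roughly like the sketch below. The type signatures and exact behaviour are assumptions inferred from the function names; the real definitions in Hledger.Utils.String may differ.

```haskell
import Data.List (dropWhileEnd)

-- Assumed behaviour: strip ALL leading and trailing characters
-- that satisfy the predicate.
stripBy :: (Char -> Bool) -> String -> String
stripBy p = dropWhileEnd p . dropWhile p

-- Assumed behaviour: strip exactly one character from each end,
-- and only when both ends satisfy the predicate.
strip1By :: (Char -> Bool) -> String -> String
strip1By p s = case s of
  (c:cs) | p c, not (null cs), p (last cs) -> init cs
  _                                        -> s

-- Assumed behaviour: strip one enclosing pair, given the opening
-- character b and the closing character e.
strip1Char :: Char -> Char -> String -> String
strip1Char b e s = case s of
  (c:cs) | c == b, not (null cs), last cs == e -> init cs
  _                                            -> s
```

Under these assumptions, strip1Char '(' ')' "(x)" gives "x", while an unbalanced "(x" is returned unchanged.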
Simon Michael
442ef9361c feat: api: quoteForCommandLine: some very shady CLI escaping 2022-07-31 08:26:30 +01:00
Stephen Morgan
ff0132df28 dev: Use realLength from doclayout instead of strWidth and textWidth. (#895)
This gives us more accurate string length calculations. In particular,
it handles emoji and other scripts properly.
2021-11-11 18:29:50 -10:00
Stephen Morgan
1c211f8ab8 cln: hlint: Fix redundant return warning. 2021-08-26 21:00:35 -10:00
Stephen Morgan
063aaf35b5 cln: hlint: Rename pattern variables to avoid hlint parsing errors. 2021-08-25 20:44:36 -10:00
Stephen Morgan
e80bb37b1c lib: Remove unused String utility functions. 2021-06-03 23:23:54 -10:00
Stephen Morgan
0e59fee251 lib,cli: Export Text.Tabular from Text.Tabular.AsciiWide, clean up import lists. 2021-06-03 23:23:54 -10:00
Stephen Morgan
217bfc5e74 lib: Rename alignCell to textCell, minor cleanups. 2021-01-15 12:56:48 -08:00
Stephen Morgan
e4e533eb9f lib,cli,ui: Replace some uses of String with Text, get rid of some unpacks, clean up showMixed options. 2021-01-02 15:08:09 +11:00
Stephen Morgan
07a7c3d3a8 lib: Use Text and Text builder only in postingAsLines. 2021-01-02 15:08:09 +11:00
Stephen Morgan
13c111da73 lib,cli,ui: Use WideBuilder for Tabular.AsciiWide.
Move WideBuilder to Text.WideString.
2021-01-02 15:08:09 +11:00
Stephen Morgan
462a13cad7 lib,cli: Use Text Builder for Balance commands. 2021-01-02 15:08:09 +11:00
Simon Michael
31ea37a785 ;check: accounts, commodities, payees, ordereddates: improve errors
Error messages for these four are now a bit fancier and more
consistent. But not yet optimised for machine readability.
Cf #1436.

Added to hledger-lib: chomp1, linesPrepend[2].
2020-12-30 18:13:34 -08:00
Stephen Morgan
6d7bd9e475 lib: Implement concat(Top|Bottom)Padded in terms of renderRow, allowing them to be width aware. 2020-11-04 14:25:21 +11:00
Stephen Morgan
a2b7a03fc4 lib,cli: bal uses new amount display functions, no longer needs to strip ansi. 2020-11-04 14:25:20 +11:00
Stephen Morgan
97545018f4 lib: quoteIfNeeded should not escape the backslashes in unicode code points. 2020-10-18 21:08:25 -07:00
Simon Michael
f78dc639a5 fix a slowdown with report rendering in 1.19.1 (#1350)
stripAnsi is called many times during rendering (by strWidth), so
should be fast. It was originally a regex replacement, and more
recently a custom parser. The parser was slower, particularly the one
in 1.19.1. See #1350, and this rough test:

time118ish = timeIt $ print $ length $ concat $ map (fromRight undefined . regexReplace (toRegex' "\ESC\\[([0-9]+;)*([0-9]+)?[ABCDHJKfmsu]") "") testdata
time119    = timeparser (many (takeWhile1P Nothing (/='\ESC') <|> "" <$ ansi))
time1191   = timeparser (many ("" <$ try ansi <|> pure <$> anySingle))
timeparser p = timeIt $ print $ length $ concat $ map (concat . fromJust . parseMaybe p) testdata
testdata = concat $ replicate 10000
    [ "2008-01-01 income               assets:bank:checking            $1            $1"
    , "2008-06-01 gift                 assets:bank:checking            $1            $2"
    , "2008-06-02 save                 assets:bank:saving              $1            $3"
    , "                                assets:bank:checking  ..m$-1\ESC[m\ESC[m            $2"
    , "2008-06-03 eat & shop           assets:cash           ..m$-2\ESC[m\ESC[m             0"
    , "2008-12-31 pay off              assets:bank:checking  ..m$-1\ESC[m\ESC[m  ..m$-1\ESC[m\ESC[m"
    ]

ghci> time118ish
4560000
CPU time:   0.17s
ghci> time119
4560000
CPU time:   0.91s
ghci> time1191
4560000
CPU time:   2.76s

Possibly a more careful parser could beat regexReplace. Note the
latter does memoisation, which could be faster and/or could also use
more resident memory in some situations.

Ideally we would calculate all widths before adding ANSI colour codes,
so we wouldn't have to wastefully strip them.
2020-09-10 18:07:40 -07:00
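
Note (illustrative only, not the code from this commit): the message above compares a regex replacement with two parser-based strippers. A third option, sketched here, is a hand-written list scanner using only base. The name stripAnsiSketch is hypothetical, and the scanner is slightly more liberal than the regex shown above (it accepts any run of digits and semicolons as parameters). No performance claim is made for it.

```haskell
import Data.Char (isDigit)

-- Remove sequences of the form ESC '[' params final, where params is
-- a run of digits/semicolons and final is one of "ABCDHJKfmsu".
-- An ESC that does not start such a sequence is kept as-is.
stripAnsiSketch :: String -> String
stripAnsiSketch [] = []
stripAnsiSketch ('\ESC':'[':rest) =
  case dropWhile (\c -> isDigit c || c == ';') rest of
    (c:after) | c `elem` "ABCDHJKfmsu" -> stripAnsiSketch after
    _                                  -> '\ESC' : stripAnsiSketch ('[' : rest)
stripAnsiSketch (c:cs) = c : stripAnsiSketch cs
```

For example, stripAnsiSketch "\ESC[1;31mred\ESC[0m" yields "red".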
Stephen Morgan
600dab3976 lib: Correctly strip ansi sequences with no numbers/semicolons. 2020-09-06 19:11:28 -07:00
Stephen Morgan
7d1e6d7d12 lib: Fix quoteIfNeeded so it actually escapes quotes. 2020-09-01 11:41:55 +10:00
Stephen Morgan
b91b391d08 lib: Replace some regex functions with parsers. 2020-08-31 22:44:41 +10:00
Stephen Morgan
e5371d5a6a lib,cli,ui,web: Make Regexp a wrapper for Regex. 2020-08-31 12:04:45 +10:00
Stephen Morgan
081ee390ab lib: Change skipMany spacenonewline to takeWhileP Nothing isNonNewlineSpace. 2020-07-22 14:58:53 -07:00
Stephen Morgan
ed99aea7d5 lib: Introduce takeEnd to get rid of some reverse . take n . reverse. 2020-07-16 10:03:25 -07:00
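
Note (illustrative only, not necessarily the code from this commit): one common way to write a takeEnd that avoids the double reverse is to walk the list with a second pointer that starts n elements ahead. The definition below is a sketch under that assumption.

```haskell
-- Return the last n elements of a list in a single pass.
-- 'drop n xs' runs n elements ahead of xs; when it is exhausted,
-- what remains of xs is exactly the last n elements.
takeEnd :: Int -> [a] -> [a]
takeEnd n xs = go (drop n xs) xs
  where
    go (_:as) (_:bs) = go as bs
    go _      bs     = bs
```

It agrees with reverse . take n . reverse on finite lists, including when n is zero, negative, or larger than the list length.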
Simon Michael
ce5eccfbc0 ;spelling fix
[ci skip]
2020-01-04 21:17:50 -08:00
Jacek Generowicz
29211868bb Fix issue 457
Issue #457 pointed out that commands such as

    hledger ui 'amt:>200'

failed. This was because the process of dispatching from `hledger ui`
to `hledger-ui` (note addition of `-`) lost the quotes around
`amt:>200` and the `>` character was interpreted as a shell redirection
operator, rather than as part of the argument.

The machinery for quoting or escaping arguments which contain
characters which require quoting or escaping (thus far whitespace and
quotes) already existed. This solution simply adds shell stdio
redirection characters to this set.

Fixes #457
2019-12-08 18:33:43 +01:00
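
Note (illustrative only, not the code from this commit): the fix described above amounts to widening the set of characters that trigger quoting. The sketch below uses a hypothetical name, quoteIfNeededSketch, and an assumed character set (whitespace, quotes, and the redirection characters < and >). The real function in hledger-lib may quote and escape differently.

```haskell
-- Wrap an argument in double quotes (escaping embedded double quotes)
-- if it contains any character the shell would treat specially.
quoteIfNeededSketch :: String -> String
quoteIfNeededSketch s
  | any (`elem` needsQuoting) s = "\"" ++ concatMap escape s ++ "\""
  | otherwise                   = s
  where
    -- Assumed set: whitespace, quotes, and stdio redirection characters.
    needsQuoting = " \t\n\r" ++ "'\"" ++ "<>"
    escape '"' = "\\\""
    escape c   = [c]
```

With this sketch, the argument amt:>200 becomes "amt:>200" (with the double quotes included), so the > survives re-dispatch through a shell, while a plain word such as assets is passed through untouched.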
Dmitry Astapov
24bba96ea2 lib: more robust multi-line joining in csv parser 2019-11-05 21:16:42 +00:00
Dmitry Astapov
e4add6df83 lib: fix for multiline descriptions in csv (fixes #841, #416) 2019-11-05 21:16:42 +00:00
Caleb Maclennan
11d9e5eb6a code: Strip extraneous trailing whitespace from Haskell sources 2019-07-15 16:40:49 +01:00
Alex Chen
b245ec7b3d lib: remove the megaparsec compatibility module 2018-05-22 12:16:46 -07:00
Moritz Kiefer
d7b68fbd7d Use skipMany/skipSome for parsing spacenonewline
This avoids allocating the list of space characters only to then
discard it.
2018-03-25 22:59:05 +01:00
Simon Michael
d7d5f8a064 add support for megaparsec 6 (fixes #594)
Older megaparsec is still supported.
Also cleans up our custom parser types,
and some text (un)packing is done in different places
(possible performance impact).
2017-07-27 19:20:46 -07:00
Simon Michael
9a86c9ee52 lib: begin supporting colour
Add some basic helpers for working with ANSI colour codes,
and make strWidth and the various string layout functions aware of them.
2017-04-25 18:27:25 -07:00
Shubham Lagwankar
37b7ebfe22 use isSpace in lstrip (#441) 2016-12-20 09:29:12 -08:00
Moritz Kiefer
4141067428 Replace Parsec with Megaparsec (see #289) (#366)
* Replace Parsec with Megaparsec (see #289)

This builds upon PR #289 by @rasendubi

* Revert renaming of parseWithState to parseWithCtx

* Fix doctests

* Update for Megaparsec 5

* Specialize parser to improve performance

* Pretty print errors

* Swap StateT and ParsecT

This is necessary to get the correct backtracking behavior, i.e. discard
state changes if the parsing fails.
2016-07-29 08:57:10 -07:00
Simon Michael
2538d14ea7 lib: textification begins! account names
The first of several conversions from String to (strict) Text, hopefully
reducing space and time usage.

This one shows a small improvement, with GHC 7.10.3 and text-1.2.2.1:

hledger -f data/100x100x10.journal stats
string: <<ghc: 39471064 bytes, 77 GCs, 198421/275048 avg/max bytes residency (3 samples), 2M in use, 0.000 INIT (0.001 elapsed), 0.015 MUT (0.020 elapsed), 0.010 GC (0.014 elapsed) :ghc>>
text:   <<ghc: 39268024 bytes, 77 GCs, 197018/270840 avg/max bytes residency (3 samples), 2M in use, 0.000 INIT (0.002 elapsed), 0.016 MUT (0.022 elapsed), 0.009 GC (0.011 elapsed) :ghc>>

hledger -f data/1000x100x10.journal stats
string: <<ghc: 318555920 bytes, 617 GCs, 2178997/7134472 avg/max bytes residency (7 samples), 16M in use, 0.000 INIT (0.001 elapsed), 0.129 MUT (0.136 elapsed), 0.067 GC (0.077 elapsed) :ghc>>
text:   <<ghc: 314248496 bytes, 612 GCs, 2074045/6617960 avg/max bytes residency (7 samples), 16M in use, 0.000 INIT (0.003 elapsed), 0.137 MUT (0.145 elapsed), 0.067 GC (0.079 elapsed) :ghc>>

hledger -f data/10000x100x10.journal stats
string: <<ghc: 3114763608 bytes, 6026 GCs, 18858950/75552024 avg/max bytes residency (11 samples), 201M in use, 0.000 INIT (0.000 elapsed), 1.331 MUT (1.372 elapsed), 0.699 GC (0.812 elapsed) :ghc>>
text:   <<ghc: 3071468920 bytes, 5968 GCs, 14120344/62951360 avg/max bytes residency (9 samples), 124M in use, 0.000 INIT (0.003 elapsed), 1.272 MUT (1.349 elapsed), 0.513 GC (0.578 elapsed) :ghc>>

hledger -f data/100000x100x10.journal stats
string: <<ghc: 31186579432 bytes, 60278 GCs, 135332581/740228992 avg/max bytes residency (13 samples), 1697M in use, 0.000 INIT (0.008 elapsed), 14.677 MUT (15.508 elapsed), 7.081 GC (8.074 elapsed) :ghc>>
text:   <<ghc: 30753427672 bytes, 59763 GCs, 117595958/666457240 avg/max bytes residency (14 samples), 1588M in use, 0.000 INIT (0.008 elapsed), 13.713 MUT (13.966 elapsed), 6.220 GC (7.108 elapsed) :ghc>>
2016-05-24 19:00:49 -07:00
Simon Michael
76ab5df833 lib: credit pandoc for the charWidth function 2015-10-29 09:19:20 -07:00
Simon Michael
155722d7ee make strWidth aware of multi-line strings (#242) 2015-10-10 15:08:28 -07:00
Simon Michael
3b40edba9c print: fix wide char support, add tests (#242)
The print command wasn't lining up amounts with wide chars in account
names, fixed it properly this time. Transaction and Posting's Show instances
should also be wide-char-aware now.
2015-10-10 11:53:28 -07:00
Simon Michael
ef27e5c427 string utils cleanup 2015-09-28 18:47:05 -10:00
Simon Michael
42e2da4bb6 balance, print; more wide char support (#242)
Simple (non-multicolumn) balance reports containing wide characters
should now align correctly (in apps and fonts that show wide chars as
double width). Likewise, the print command.
2015-09-28 18:33:18 -10:00
Simon Michael
5b5e5eeaf4 register: wide-character-aware layout (#242)
Wide characters, e.g. Chinese/Japanese/Korean characters, are typically
rendered wider than Latin characters. In some applications (e.g. GNOME
Terminal or OS X Terminal) and fonts (e.g. Monaco) they are exactly double
width. This is a start at making hledger aware of this. A register
report containing wide characters (in descriptions, account names, or
commodity symbols) should now align its columns correctly, when viewed
with a suitable font and application.
2015-09-28 16:12:20 -10:00
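
Note (illustrative only, not the code from this commit): width-aware layout means padding by display columns rather than by character count. The sketch below uses hypothetical names and a deliberately coarse set of code point ranges treated as double width. It is an approximation for illustration, not the table hledger actually uses.

```haskell
-- Approximate display width of one character: 2 for a few common
-- East Asian wide ranges, 1 otherwise. The ranges are a rough guess.
charWidthSketch :: Char -> Int
charWidthSketch c
  | between 0x1100 0x115F = 2  -- Hangul Jamo
  | between 0x2E80 0xA4CF = 2  -- CJK radicals through Yi (approximate)
  | between 0xAC00 0xD7A3 = 2  -- Hangul syllables
  | between 0xF900 0xFAFF = 2  -- CJK compatibility ideographs
  | between 0xFF00 0xFF60 = 2  -- fullwidth forms
  | between 0xFFE0 0xFFE6 = 2  -- fullwidth signs
  | otherwise             = 1
  where
    n = fromEnum c
    between lo hi = n >= lo && n <= hi

-- Display width of a single-line string.
strWidthSketch :: String -> Int
strWidthSketch = sum . map charWidthSketch

-- Pad on the right to a target display width (no truncation).
padRightSketch :: Int -> String -> String
padRightSketch w s = s ++ replicate (w - strWidthSketch s) ' '
```

Padding with padRightSketch instead of a length-based pad is what keeps later columns aligned when a description or account name contains wide characters.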
Simon Michael
cc98ee39f7 balance, lib: --format/StringFormat improvements
The balance command's --format option (in single-column mode) can now
adjust the rendering of multi-line strings, such as amounts with multiple
commodities. To control this, begin the format string with one of:

 %_  - renders on multiple lines, bottom-aligned (the default)
 %^  - renders on multiple lines, top-aligned
 %,  - renders on one line, comma-separated

Also the final total (and the line above it) now adapt themselves to a
custom format.
2015-08-19 20:53:51 -07:00
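
Note (illustrative only, not the code from this commit): the three format prefixes above can be thought of as selecting a layout for a multi-line amount next to a single-line label. All names below (Layout, parseLayout, renderRowSketch) are hypothetical, and the two-space column gap is an arbitrary choice for the sketch.

```haskell
import Data.List (intercalate)

data Layout = BottomAligned | TopAligned | OneLine
  deriving (Eq, Show)

-- Read an optional layout prefix from a format string;
-- bottom-aligned is the default, as described above.
parseLayout :: String -> (Layout, String)
parseLayout ('%':'_':rest) = (BottomAligned, rest)
parseLayout ('%':'^':rest) = (TopAligned, rest)
parseLayout ('%':',':rest) = (OneLine, rest)
parseLayout fmt            = (BottomAligned, fmt)

-- Lay out a list of amount lines beside one label.
renderRowSketch :: Layout -> String -> [String] -> [String]
renderRowSketch OneLine label amts =
  [intercalate ", " amts ++ "  " ++ label]
renderRowSketch TopAligned label amts =
  zipWith (\a l -> a ++ "  " ++ l) amts (label : repeat "")
renderRowSketch BottomAligned label amts =
  zipWith (\a l -> a ++ "  " ++ l) amts (replicate (length amts - 1) "" ++ [label])
```

For instance, with amounts ["$1", "2 EUR"] and label "assets", the bottom-aligned layout puts the label on the second line, the top-aligned layout puts it on the first, and the one-line layout joins the amounts with a comma.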
Simon Michael
7aecbac851 lib: split up Utils more 2015-08-19 20:53:50 -07:00