When we don't know a file's format, instead of choosing a subset of
readers based on content sniffing, now we just try them all.
Also, LedgerReader is now used only as a last resort,
as it's not yet competitive with JournalReader.
This reader is used by default for files with suffix .ledger or .l,
and tried along with the other readers for files of unknown type.
Currently only the bare minimum of the raw parsed data is used:
transaction dates/descriptions and posting accounts/amounts,
with the rest being ignored.
Amounts are parsed the same way as in the hledger journal format.
Malformed amounts might be ignored instead of error-reported.
The journal/timeclock/timedot parsers, instead of constructing (opaque)
journal update functions which are later applied to build the journal,
now construct the journal directly (by modifying the parser state). This
is easier to understand and debug. It also removes any possibility of
the journal updates being a space leak. (They weren't, in fact memory
usage is now slightly higher, but that will be addressed in other ways.)
Also:
Journal data and journal parse info have been merged into one type (for
now), and field names are more consistent.
The ParsedJournal type alias has been added to distinguish being-parsed
and finalised journals.
Journal is now a monoid.
stats: fixed an issue with ordering of include files
journal: fixed an issue with ordering of included same-date transactions
timeclock: sessions can no longer span file boundaries (unclocked-out
sessions will be auto-closed at the end of the file).
expandPath now throws a proper IO error (and requires the IO monad).
When multiple files are specified with multiple -f options, we now
parse each one individually, rather than just concatenating them, so
they can have different formats.
Directives (like default year or account aliases) no longer carry over
from one file to the next. Limitation or feature ?
journal files can now include journal, timeclock or timedot files (but
not yet CSV files). Also timeclock/timedot files no longer support
default year directives.
The Hledger.Read.* modules have been reorganised for better reuse.
Hledger.Read.Utils has been renamed Hledger.Read.Common and holds
low-level parsers & utilities; high-level read utilities have moved to
Hledger.Read.
Timedot is a plain text format for logging dated, categorised
quantities (eg time), supported by hledger. It is convenient for
approximate and retroactive time logging, eg when the real-time
clock-in/out required with a timeclock file is too precise or too
interruptive. It can be formatted like a bar chart, making clear at a
glance where time was spent.
The regex account aliases added in 0.24 trip up people switching between
hledger and Ledger. (Also they are currently slow).
This change makes the old non-regex aliases the default; they are
unsurprising, useful, and pretty close in functionality to Ledger's.
The new regex aliases are also available; they must be enclosed in
forward slashes. Ledger effectively ignores these, which is ok.
Also clarify docs, refactor, and use the same parser for alias
directives and alias options
Can be helpful when reading Ledger files, where assertions may have
different semantics; or for getting some answers from your journal
to help you fix your assertions.
Could be called --no-assertions, but this might create surprise when it
has an effect contrary to --no-new-accounts.
I had to add another flag throughout the parsers & journal read
functions, ok for now.
- The CSV reader no longer writes a "(stdin).rules" file when reading
from stdin.
- Selection of reader(s) is now smarter when input is coming from stdin.
Previously, all readers were considered applicable for stdin. This
meant that when reading a CSV file from stdin, the journal and timelog
readers were always tried first, and if the CSV file was unparseable,
you'd see the first (journal) reader's error instead of the CSV
reader's. Now, the readers do some basic content sniffing when
reading stdin, so it generally tries only the one right reader and
we'll see the right errors.
- The read system now has more debug output.
We should now read all text in universal newline mode, so eg journal
files with DOS/windows line endings are fine.
This also deprecates and disables our IO encoding compatibility layer,
which prevented many encoding-related problems with certain platforms
and GHC versions. With modern GHC (7.x) this is now hopefully totally
unnecessary, but the module remains in place just in case.
- use new query system for command line too, filterspec is no more
- move unit tests near the code they test, run them in bottom up order, add more
- more precise Show instances, used for debugging not ui