lib, cli: allow a READER: prefix on data file paths

This provides a way to override the file format detection logic,
which is useful eg for files with the wrong extension or for standard input.
@ -11,12 +11,14 @@ to import modules below this one.
module Hledger.Read (
-- * Journal files
-- * Journal parsing
@ -33,6 +35,7 @@ module Hledger.Read (
) where
import Control.Applicative ((<|>))
import qualified Control.Exception as C
import Control.Monad.Except
import Data.List
@ -61,6 +64,10 @@ import Prelude hiding (getContents, writeFile)
import Hledger.Utils.UTF8IOCompat (writeFile)
journalEnvVar = "LEDGER_FILE"
journalEnvVar2 = "LEDGER"
journalDefaultFilename = ".hledger.journal"
-- The available journal readers, each one handling a particular data format.
readers :: [Reader]
readers = [
@ -71,9 +78,12 @@ readers = [
journalEnvVar = "LEDGER_FILE"
journalEnvVar2 = "LEDGER"
journalDefaultFilename = ".hledger.journal"
readerNames :: [String]
readerNames = map rFormat readers
-- | A file path optionally prefixed by a reader name and colon
-- (journal:, csv:, timedot:, etc.).
type PrefixedFilePath = FilePath
-- | Read the default journal file specified by the environment, or raise an error.
defaultJournal :: IO Journal
@ -99,34 +109,58 @@ defaultJournalPath = do
home <- getHomeDirectory `C.catch` (\(_::C.IOException) -> return "")
return $ home </> journalDefaultFilename
-- | @readJournalFiles mformat mrulesfile assrt fs@
-- | @readJournalFiles mformat mrulesfile assrt prefixedfiles@
-- Call readJournalFile on each specified file path, and combine the
-- resulting journals into one. If there are any errors, the first is
-- returned, otherwise they are combined per Journal's monoid instance
-- (concatenated, basically). Parse context (eg directives & aliases)
-- is not maintained across file boundaries, it resets at the start of
-- each file (though the final parse state saved in the resulting
-- journal is the combination of parse states from all files).
readJournalFiles :: Maybe StorageFormat -> Maybe FilePath -> Bool -> [FilePath] -> IO (Either String Journal)
readJournalFiles mformat mrulesfile assrt fs = do
-- Read a Journal from each specified file path and combine them into one.
-- Or, return the first error message.
-- Combining Journals means concatenating them, basically.
-- The parse state resets at the start of each file, which means that
-- directives & aliases do not cross file boundaries.
-- (The final parse state saved in the Journal does span all files, however.)
-- As with readJournalFile,
-- file paths can optionally have a READER: prefix,
-- and the @mformat@, @mrulesfile@, and @assrt@ arguments are supported
-- (and these are applied to all files).
readJournalFiles :: Maybe StorageFormat -> Maybe FilePath -> Bool -> [PrefixedFilePath] -> IO (Either String Journal)
readJournalFiles mformat mrulesfile assrt prefixedfiles = do
(either Left (Right . mconcat) . sequence)
<$> mapM (readJournalFile mformat mrulesfile assrt) fs
<$> mapM (readJournalFile mformat mrulesfile assrt) prefixedfiles
-- | @readJournalFile mformat mrulesfile assrt f@
-- | @readJournalFile mformat mrulesfile assrt prefixedfile@
-- Read a Journal from this file (or stdin if the file path is -).
-- Assume the specified data format, or a format identified from the file path,
-- or try all readers.
-- A CSV conversion rules file may be specified for better conversion of CSV.
-- Also optionally check any balance assertions in the journal.
-- If parsing or balance assertions fail, return an error message instead.
readJournalFile :: Maybe StorageFormat -> Maybe FilePath -> Bool -> FilePath -> IO (Either String Journal)
readJournalFile mformat mrulesfile assrt f = do
-- Read a Journal from this file, or from stdin if the file path is -,
-- or return an error message. The file path can have a READER: prefix.
-- The reader (data format) is chosen based on (in priority order):
-- the @mformat@ argument;
-- the file path's READER: prefix, if any;
-- a recognised file name extension (in readJournal);
-- if none of these identify a known reader, all built-in readers are tried in turn.
-- A CSV conversion rules file (@mrulesfile@) can be specified to help convert CSV data.
-- Optionally, any balance assertions in the journal can be checked (@assrt@).
readJournalFile :: Maybe StorageFormat -> Maybe FilePath -> Bool -> PrefixedFilePath -> IO (Either String Journal)
readJournalFile mformat mrulesfile assrt prefixedfile = do
let
(mprefixformat, f) = splitReaderPrefix prefixedfile
mfmt = mformat <|> mprefixformat
requireJournalFileExists f
readFileOrStdinAnyLineEnding f >>= readJournal mformat mrulesfile assrt (Just f)
readFileOrStdinAnyLineEnding f >>= readJournal mfmt mrulesfile assrt (Just f)
-- | If the specified journal file does not exist, give a helpful error and quit.
-- | If a filepath is prefixed by one of the reader names and a colon,
-- split that off. Eg "csv:-" -> (Just "csv", "-").
splitReaderPrefix :: PrefixedFilePath -> (Maybe String, FilePath)
splitReaderPrefix f =
headDef (Nothing, f)
[(Just r, drop (length r + 1) f) | r <- readerNames, (r++":") `isPrefixOf` f]
-- | If the specified journal file does not exist (and is not "-"),
-- give a helpful error and quit.
requireJournalFileExists :: FilePath -> IO ()
requireJournalFileExists "-" = return ()
requireJournalFileExists f = do
@ -153,7 +187,7 @@ newJournalContent = do
d <- getCurrentDay
return $ printf "; journal created %s by hledger\n" (show d)
-- | Read a journal from the given text, trying all known formats, or simply throw an error.
-- | Read a Journal from the given text trying all readers in turn, or throw an error.
readJournal' :: Text -> IO Journal
readJournal' t = readJournal Nothing Nothing True Nothing t >>= either error' return
@ -163,30 +197,42 @@ tests_readJournal' = [
assertBool "" True
-- | @readJournal mformat mrulesfile assrt mpath t@
-- | @readJournal mformat mrulesfile assrt mfile txt@
-- Read a Journal from some text, or return an error message.
-- The reader (data format) is chosen based on (in priority order):
-- the @mformat@ argument;
-- a recognised file name extension in @mfile@ (if provided).
-- If none of these identify a known reader, all built-in readers are tried in turn
-- (returning the first one's error message if none of them succeed).
-- A CSV conversion rules file (@mrulesfile@) can be specified to help convert CSV data.
-- Optionally, any balance assertions in the journal can be checked (@assrt@).
-- Try to read a Journal from some text.
-- If a format is specified (mformat), try only that reader.
-- Otherwise if the file path is provided (mpath), and it specifies a format, try only that reader.
-- Otherwise try all readers in turn until one succeeds, or return the first error if none of them succeed.
-- A CSV conversion rules file may be specified (mrulesfile) for use by the CSV reader.
-- If the assrt flag is true, also check and enforce balance assertions in the journal.
readJournal :: Maybe StorageFormat -> Maybe FilePath -> Bool -> Maybe FilePath -> Text -> IO (Either String Journal)
readJournal mformat mrulesfile assrt mpath t =
let rs = maybe readers (:[]) $ findReader mformat mpath
in tryReaders rs mrulesfile assrt mpath t
readJournal mformat mrulesfile assrt mfile txt =
let rs = maybe readers (:[]) $ findReader mformat mfile
in tryReaders rs mrulesfile assrt mfile txt
-- | @findReader mformat mpath@
-- Find the reader for the given format (mformat), if any.
-- Or if no format is provided, find the first reader that handles the
-- file name's extension, if any.
-- Find the reader named by @mformat@, if provided.
-- Or, if a file path is provided, find the first reader that handles
-- its file extension, if any.
findReader :: Maybe StorageFormat -> Maybe FilePath -> Maybe Reader
findReader Nothing Nothing = Nothing
findReader (Just fmt) _ = headMay [r | r <- readers, fmt == rFormat r]
findReader Nothing (Just path) = headMay [r | r <- readers, ext `elem` rExtensions r]
findReader (Just fmt) _ = headMay [r | r <- readers, rFormat r == fmt]
findReader Nothing (Just path) =
case prefix of
Just fmt -> headMay [r | r <- readers, rFormat r == fmt]
Nothing -> headMay [r | r <- readers, ext `elem` rExtensions r]
where
ext = drop 1 $ takeExtension path
(prefix,path') = splitReaderPrefix path
ext = drop 1 $ takeExtension path'
-- | @tryReaders readers mrulesfile assrt path t@
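
Review note: the reader selection order described in the comments above (explicit format argument, then READER: prefix, then file extension) can be sketched in isolation as follows. This is a minimal, self-contained sketch and not hledger's actual code: `readerTable`, `formatFromExtension` and `chooseFormat` are hypothetical names, and the extension table is copied from the manual's format table rather than from the real `Reader` records.

```haskell
import Control.Applicative ((<|>))
import Data.List (isPrefixOf)
import Data.Maybe (listToMaybe)
import System.FilePath (takeExtension)

-- Hypothetical stand-in for hledger's readers: (reader name, file extensions).
readerTable :: [(String, [String])]
readerTable =
  [ ("journal",   ["journal", "j", "hledger", "ledger"])
  , ("timeclock", ["timeclock"])
  , ("timedot",   ["timedot"])
  , ("csv",       ["csv"])
  ]

-- Split a leading "READER:" off a path, if READER is a known reader name.
splitReaderPrefix :: FilePath -> (Maybe String, FilePath)
splitReaderPrefix f =
  case [ (Just r, drop (length r + 1) f)
       | (r, _) <- readerTable, (r ++ ":") `isPrefixOf` f ] of
    (x:_) -> x
    []    -> (Nothing, f)

-- Find the first reader that handles the path's file extension, if any.
formatFromExtension :: FilePath -> Maybe String
formatFromExtension f =
  listToMaybe [ r | (r, exts) <- readerTable, ext `elem` exts ]
  where ext = drop 1 (takeExtension f)

-- Choose a format in priority order: explicit argument, prefix, extension.
-- Nothing means "no format identified, try all readers in turn".
-- Also returns the path with any prefix removed.
chooseFormat :: Maybe String -> FilePath -> (Maybe String, FilePath)
chooseFormat mformat prefixedfile =
  (mformat <|> mprefix <|> formatFromExtension f, f)
  where (mprefix, f) = splitReaderPrefix prefixedfile
```

For example, `splitReaderPrefix "csv:-"` evaluates to `(Just "csv", "-")`, matching the example in the doc comment, while a Windows-style path such as `C:/Users/USER/.hledger.journal` is left unsplit because `C` is not a reader name.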

@ -397,14 +397,22 @@ aliasesFromOpts = map (\a -> fromparse $ runParser accountaliasp ("--alias "++qu
-- 1. options, 2. an environment variable, or 3. the default.
-- Actually, returns one or more file paths. There will be more
-- than one if multiple -f options were provided.
-- File paths can have a READER: prefix naming a reader/data format.
journalFilePathFromOpts :: CliOpts -> IO [String]
journalFilePathFromOpts opts = do
f <- defaultJournalPath
d <- getCurrentDirectory
mapM (expandPath d) $ ifEmpty (file_ opts) [f]
ifEmpty [] d = d
ifEmpty l _ = l
case file_ opts of
[] -> return [f]
fs -> mapM (expandPathPreservingPrefix d) fs
expandPathPreservingPrefix :: FilePath -> PrefixedFilePath -> IO PrefixedFilePath
expandPathPreservingPrefix d prefixedf = do
let (p,f) = splitReaderPrefix prefixedf
f' <- expandPath d f
return $ case p of
Just p -> p ++ ":" ++ f'
Nothing -> f'
-- | Get the expanded, absolute output file path from options,
-- or the default (-, meaning stdout).
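
Review note: the prefix-preserving path expansion added above can be sketched purely, without IO. This is only an illustration under stated assumptions: `absolutize` is a hypothetical, simplified stand-in for hledger's `expandPath` (it is assumed here to leave `-` alone and to make relative paths absolute, and it does no tilde expansion), and `readerNames` is a hard-coded list rather than the one derived from the real readers.

```haskell
import Data.List (isPrefixOf)
import Data.Maybe (fromMaybe, listToMaybe)
import System.FilePath ((</>), isRelative)

-- Hypothetical stand-in for the names of hledger's readers.
readerNames :: [String]
readerNames = ["journal", "timeclock", "timedot", "csv"]

-- Split a leading "READER:" off a path, if READER is a known reader name.
splitReaderPrefix :: FilePath -> (Maybe String, FilePath)
splitReaderPrefix f =
  fromMaybe (Nothing, f) $ listToMaybe
    [ (Just r, drop (length r + 1) f)
    | r <- readerNames, (r ++ ":") `isPrefixOf` f ]

-- Simplified, pure stand-in for expandPath (an assumption, not the real one):
-- "-" (stdin) is kept as is, relative paths are joined onto the given directory.
absolutize :: FilePath -> FilePath -> FilePath
absolutize _ "-" = "-"
absolutize dir f
  | isRelative f = dir </> f
  | otherwise    = f

-- Expand only the file path part, then put any reader prefix back.
expandPreservingPrefix :: FilePath -> FilePath -> FilePath
expandPreservingPrefix dir prefixedf =
  case splitReaderPrefix prefixedf of
    (Just r, f)  -> r ++ ":" ++ absolutize dir f
    (Nothing, f) -> absolutize dir f
```

The point of splitting first is that a prefixed path such as `csv:data/file.dat` would otherwise be expanded as if `csv:data` were a directory name; with the split, only `data/file.dat` is made absolute and the `csv:` prefix is reattached afterwards.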

@ -403,10 +403,6 @@ Eg \-p jan \-p feb is equivalent to \-p feb.
hledger reads transactions from a data file (and the add command writes
to it).
Usually this is in hledger\[aq]s journal format, but it can also be one
of the other supported file types, such as timeclock, timedot, CSV, or a
C++ Ledger journal (partial support).
By default this file is \f[C]$HOME/.hledger.journal\f[] (or on Windows,
something like \f[C]C:/Users/USER/.hledger.journal\f[]).
You can override this with the \f[C]$LEDGER_FILE\f[] environment
@ -423,51 +419,10 @@ or with the \f[C]\-f/\-\-file\f[] option:
$\ hledger\ \-f\ some/file.ext\ stats
$\ hledger\ \-f\ /some/file\ stats
hledger tries to identify the file format based on the file extension,
as follows:
l l.
File extension:
Use format:
\f[C]\&.journal\f[], \f[C]\&.j\f[], \f[C]\&.hledger\f[],
If the file name has some other extension, or none, hledger tries each
of these formats in turn.
(Plus one more: the experimental "ledger" format, an alternate parser
for C++ Ledger journals, which we try only as a last resort as it\[aq]s
new and hledger\[aq]s journal parser works better for now.)
The file name \f[C]\-\f[] (hyphen) means standard input, as usual:
@ -476,6 +431,89 @@ $\ cat\ some.journal\ |\ hledger\ \-f\-
Usually this file is in hledger\[aq]s journal format, but it can also be
one of several other formats, shown below.
hledger tries to identify the format based on the file extension, as
follows:
l l l.
File extensions:
hledger\[aq]s journal format
\f[C]\&.journal\f[], \f[C]\&.j\f[], \f[C]\&.hledger\f[],
timeclock files (precise time logging)
timedot files (approximate time logging)
comma\-separated values (data interchange)
hledger identifies the format based on the file extension if possible.
If that does not identify a known format, it tries each format in turn.
If needed, eg to ensure correct error messages, you can force a specific
format by prepending it to the file path with a colon.
$\ hledger\ \-f\ csv:/some/csv\-file.dat\ stats
$\ echo\ \[aq]i\ 2009/13/1\ 08:00:00\[aq]\ |\ hledger\ print\ \-ftimeclock:\-
Some other experimental formats are available but not yet used by
default:
l l l.
File extensions:
Ledger\[aq]s journal format (incomplete)
You can specify multiple \f[C]\-f\f[] options, to read multiple files as
one big journal.
Directives in one file will not affect subsequent files in this case (if

@ -324,13 +324,9 @@ File:, Node: Input files, Next: Depth limiting, Prev: Reportin
hledger reads transactions from a data file (and the add command writes
to it). Usually this is in hledger's journal format, but it can also be
one of the other supported file types, such as timeclock, timedot, CSV,
or a C++ Ledger journal (partial support).
By default this file is `$HOME/.hledger.journal' (or on Windows,
something like `C:/Users/USER/.hledger.journal'). You can override this
with the `$LEDGER_FILE' environment variable:
to it). By default this file is `$HOME/.hledger.journal' (or on
Windows, something like `C:/Users/USER/.hledger.journal'). You can
override this with the `$LEDGER_FILE' environment variable:
$ setenv LEDGER_FILE ~/finance/2016.journal
@ -339,29 +335,43 @@ $ hledger stats
or with the `-f/--file' option:
$ hledger -f some/file.ext stats
hledger tries to identify the file format based on the file
extension, as follows:
File extension: Use format:
`.journal', `.j', `.hledger', `.ledger' journal
`.timeclock' timeclock
`.timedot' timedot
`.csv' CSV
If the file name has some other extension, or none, hledger tries
each of these formats in turn. (Plus one more: the experimental "ledger"
format, an alternate parser for C++ Ledger journals, which we try only
as a last resort as it's new and hledger's journal parser works better
for now.)
$ hledger -f /some/file stats
The file name `-' (hyphen) means standard input, as usual:
$ cat some.journal | hledger -f-
Usually this file is in hledger's journal format, but it can also be
one of several other formats, shown below. hledger tries to identify the
format based on the file extension, as follows:
Format: Description: File extensions:
journal hledger's journal format `.journal', `.j', `.hledger', `.ledger'
timeclock timeclock files (precise time logging) `.timeclock'
timedot timedot files (approximate time logging) `.timedot'
CSV comma-separated values (data interchange) `.csv'
hledger identifies the format based on the file extension if
possible. If that does not identify a known format, it tries each
format in turn.
If needed, eg to ensure correct error messages, you can force a
specific format by prepending it to the file path with a colon.
$ hledger -f csv:/some/csv-file.dat stats
$ echo 'i 2009/13/1 08:00:00' | hledger print -ftimeclock:-
Some other experimental formats are available but not yet used by
default:
Format: Description: File extensions:
ledger Ledger's journal format (incomplete)
You can specify multiple `-f' options, to read multiple files as one
big journal. Directives in one file will not affect subsequent files in
this case (if you need that, use the include directive instead).
@ -75,13 +75,6 @@ Eg -p jan -p feb is equivalent to -p feb.
## Input files
hledger reads transactions from a data file (and the add command writes to it).
Usually this is in hledger's journal format,
but it can also be one of the other supported file types, such as
timeclock, timedot, CSV,
or a C++ Ledger journal (partial support).
By default this file is `$HOME/.hledger.journal`
(or on Windows, something like `C:/Users/USER/.hledger.journal`).
You can override this with the `$LEDGER_FILE` environment variable:
@ -91,30 +84,41 @@ $ hledger stats
or with the `-f/--file` option:
$ hledger -f some/file.ext stats
$ hledger -f /some/file stats
hledger tries to identify the file format based on the file extension,
as follows:
| File extension: | Use format:
| `.journal`, `.j`, `.hledger`, `.ledger` | journal
| `.timeclock` | timeclock
| `.timedot` | timedot
| `.csv` | CSV
If the file name has some other extension, or none,
hledger tries each of these formats in turn.
(Plus one more: the experimental "ledger" format, an alternate
parser for C++ Ledger journals, which we try only as a last resort
as it's new and hledger's journal parser works better for now.)
The file name `-` (hyphen) means standard input, as usual:
$ cat some.journal | hledger -f-
Usually this file is in hledger's journal format,
but it can also be one of several other formats, shown below.
hledger tries to identify the format based on the file extension, as follows:
| Format: | Description: | File extensions:
| journal | hledger's journal format | `.journal`, `.j`, `.hledger`, `.ledger`
| timeclock | timeclock files (precise time logging) | `.timeclock`
| timedot | timedot files (approximate time logging) | `.timedot`
| CSV | comma-separated values (data interchange) | `.csv`
hledger identifies the format based on the file extension if possible.
If that does not identify a known format, it tries each format in turn.
If needed, eg to ensure correct error messages, you can force a specific format
by prepending it to the file path with a colon. Examples:
$ hledger -f csv:/some/csv-file.dat stats
$ echo 'i 2009/13/1 08:00:00' | hledger print -ftimeclock:-
Some other experimental formats are available but not yet used by default:
| Format: | Description: | File extensions:
| ledger | Ledger's journal format (incomplete) |
You can specify multiple `-f` options, to read multiple files as one big journal.
Directives in one file will not affect subsequent files in this case (if you need that,
use the [include directive](#including-other-files) instead).
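
Review note: on the library side, reading multiple files as one big journal comes down to combining per-file results, as `readJournalFiles` does with `either Left (Right . mconcat) . sequence`. The sketch below shows that combination step on its own. The `Journal` type here is a hypothetical stand-in (a list of entry descriptions), not hledger's real `Journal`, and `combineResults` is an illustrative name.

```haskell
-- Hypothetical stand-in for hledger's Journal: just a list of entries.
newtype Journal = Journal [String]
  deriving (Eq, Show)

-- Combining journals concatenates their entries.
instance Semigroup Journal where
  Journal a <> Journal b = Journal (a ++ b)

instance Monoid Journal where
  mempty = Journal []

-- Combine per-file results: the first error message if any file failed,
-- otherwise all journals concatenated in order.
combineResults :: [Either String Journal] -> Either String Journal
combineResults = either Left (Right . mconcat) . sequence
```

Because `sequence` over a list of `Either` values stops at the first `Left`, only the first failing file's error message is reported, in the order the files were given.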