hledger/hledger-lib/hledger_csv.5
2019-01-19 15:31:22 -08:00

335 lines
9.2 KiB
Groff

.TH "hledger_csv" "5" "January 2019" "hledger 1.12.99" "hledger User Manuals"
.SH NAME
.PP
CSV \- how hledger reads CSV data, and the CSV rules file format
.SH DESCRIPTION
.PP
hledger can read CSV (comma\-separated value) files as if they were
journal files, automatically converting each CSV record into a
transaction.
(To learn about \f[I]writing\f[] CSV, see CSV output.)
.PP
Converting CSV to transactions requires some special conversion rules.
These do several things:
.IP \[bu] 2
they describe the layout and format of the CSV data
.IP \[bu] 2
they can customize the generated journal entries using a simple
templating language
.IP \[bu] 2
they can add refinements based on patterns in the CSV data, eg
categorizing transactions with more detailed account names.
.PP
When reading a CSV file named \f[C]FILE.csv\f[], hledger looks for a
conversion rules file named \f[C]FILE.csv.rules\f[] in the same
directory.
You can override this with the \f[C]\-\-rules\-file\f[] option.
If the rules file does not exist, hledger will auto\-create one with
some example rules, which you'll need to adjust.
.PP
At minimum, the rules file must identify the \f[C]date\f[] and
\f[C]amount\f[] fields.
It may also be necessary to specify the date format, and the number of
header lines to skip.
Eg:
.IP
.nf
\f[C]
fields\ date,\ _,\ _,\ amount
date\-format\ \ %d/%m/%Y
skip\ 1
\f[]
.fi
.PP
A more complete example:
.IP
.nf
\f[C]
#\ hledger\ CSV\ rules\ for\ amazon.com\ order\ history
#\ sample:
#\ "Date","Type","To/From","Name","Status","Amount","Fees","Transaction\ ID"
#\ "Jul\ 29,\ 2012","Payment","To","Adapteva,\ Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"
#\ skip\ one\ header\ line
skip\ 1
#\ name\ the\ csv\ fields\ (and\ assign\ the\ transaction\[aq]s\ date,\ amount\ and\ code)
fields\ date,\ _,\ toorfrom,\ name,\ amzstatus,\ amount,\ fees,\ code
#\ how\ to\ parse\ the\ date
date\-format\ %b\ %\-d,\ %Y
#\ combine\ two\ fields\ to\ make\ the\ description
description\ %toorfrom\ %name
#\ save\ these\ fields\ as\ tags
comment\ \ \ \ \ status:%amzstatus,\ fees:%fees
#\ set\ the\ base\ account\ for\ all\ transactions
account1\ \ \ \ assets:amazon
#\ flip\ the\ sign\ on\ the\ amount
amount\ \ \ \ \ \ \-%amount
\f[]
.fi
.PP
For more examples, see Convert CSV files.
.SH CSV RULES
.PP
The following seven kinds of rule can appear in the rules file, in any
order.
Blank lines and lines beginning with \f[C]#\f[] or \f[C];\f[] are
ignored.
.SS skip
.PP
\f[C]skip\f[]\f[I]\f[CI]N\f[I]\f[]
.PP
Skip this number of CSV records at the beginning.
You'll need this whenever your CSV data contains header lines.
Eg:
.IP
.nf
\f[C]
#\ ignore\ the\ first\ CSV\ line
skip\ 1
\f[]
.fi
.SS date\-format
.PP
\f[C]date\-format\f[]\f[I]\f[CI]DATEFMT\f[I]\f[]
.PP
When your CSV date fields are not formatted like \f[C]YYYY/MM/DD\f[] (or
\f[C]YYYY\-MM\-DD\f[] or \f[C]YYYY.MM.DD\f[]), you'll need to specify
the format.
DATEFMT is a strptime\-like date parsing pattern, which must parse the
date field values completely.
Examples:
.IP
.nf
\f[C]
#\ for\ dates\ like\ "11/06/2013":
date\-format\ %m/%d/%Y
\f[]
.fi
.IP
.nf
\f[C]
#\ for\ dates\ like\ "6/11/2013"\ (note\ the\ \-\ to\ make\ leading\ zeros\ optional):
date\-format\ %\-d/%\-m/%Y
\f[]
.fi
.IP
.nf
\f[C]
#\ for\ dates\ like\ "2013\-Nov\-06":
date\-format\ %Y\-%h\-%d
\f[]
.fi
.IP
.nf
\f[C]
#\ for\ dates\ like\ "11/6/2013\ 11:32\ PM":
date\-format\ %\-m/%\-d/%Y\ %l:%M\ %p
\f[]
.fi
.SS field list
.PP
\f[C]fields\f[]\f[I]\f[CI]FIELDNAME1\f[I]\f[],
\f[I]\f[CI]FIELDNAME2\f[I]\f[]\&...
.PP
This (a) names the CSV fields, in order (names may not contain
whitespace; uninteresting names may be left blank), and (b) assigns them
to journal entry fields if you use any of these standard field names:
\f[C]date\f[], \f[C]date2\f[], \f[C]status\f[], \f[C]code\f[],
\f[C]description\f[], \f[C]comment\f[], \f[C]account1\f[],
\f[C]account2\f[], \f[C]amount\f[], \f[C]amount\-in\f[],
\f[C]amount\-out\f[], \f[C]currency\f[], \f[C]balance\f[].
Eg:
.IP
.nf
\f[C]
#\ use\ the\ 1st,\ 2nd\ and\ 4th\ CSV\ fields\ as\ the\ entry\[aq]s\ date,\ description\ and\ amount,
#\ and\ give\ the\ 7th\ and\ 8th\ fields\ meaningful\ names\ for\ later\ reference:
#
#\ CSV\ field:
#\ \ \ \ \ \ 1\ \ \ \ \ 2\ \ \ \ \ \ \ \ \ \ \ \ 3\ 4\ \ \ \ \ \ \ 5\ 6\ 7\ \ \ \ \ \ \ \ \ \ 8
#\ entry\ field:
fields\ date,\ description,\ ,\ amount,\ ,\ ,\ somefield,\ anotherfield
\f[]
.fi
.SS field assignment
.PP
\f[I]\f[CI]ENTRYFIELDNAME\f[I]\f[] \f[I]\f[CI]FIELDVALUE\f[I]\f[]
.PP
This sets a journal entry field (one of the standard names above) to the
given text value, which can include CSV field values interpolated by
name (\f[C]%CSVFIELDNAME\f[]) or 1\-based position (\f[C]%N\f[]).
Eg:
.IP
.nf
\f[C]
#\ set\ the\ amount\ to\ the\ 4th\ CSV\ field\ with\ "USD\ "\ prepended
amount\ USD\ %4
\f[]
.fi
.IP
.nf
\f[C]
#\ combine\ three\ fields\ to\ make\ a\ comment\ (containing\ two\ tags)
comment\ note:\ %somefield\ \-\ %anotherfield,\ date:\ %1
\f[]
.fi
.PP
Field assignments can be used instead of or in addition to a field list.
.SS conditional block
.PP
\f[C]if\f[] \f[I]\f[CI]PATTERN\f[I]\f[]
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[]\&...
.PP
\f[C]if\f[]
.PD 0
.P
.PD
\f[I]\f[CI]PATTERN\f[I]\f[]
.PD 0
.P
.PD
\f[I]\f[CI]PATTERN\f[I]\f[]\&...
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[]\&...
.PP
This applies one or more field assignments, only to those CSV records
matched by one of the PATTERNs.
The patterns are case\-insensitive regular expressions which match
anywhere within the whole CSV record (it's not yet possible to match
within a specific field).
When there are multiple patterns they can be written on separate lines,
unindented.
The field assignments are on separate lines indented by at least one
space.
Examples:
.IP
.nf
\f[C]
#\ if\ the\ CSV\ record\ contains\ "groceries",\ set\ account2\ to\ "expenses:groceries"
if\ groceries
\ account2\ expenses:groceries
\f[]
.fi
.IP
.nf
\f[C]
#\ if\ the\ CSV\ record\ contains\ any\ of\ these\ patterns,\ set\ account2\ and\ comment\ as\ shown
if
monthly\ service\ fee
atm\ transaction\ fee
banking\ thru\ software
\ account2\ expenses:business:banking
\ comment\ \ XXX\ deductible\ ?\ check\ it
\f[]
.fi
.SS include
.PP
\f[C]include\f[]\f[I]\f[CI]RULESFILE\f[I]\f[]
.PP
Include another rules file at this point.
\f[C]RULESFILE\f[] is either an absolute file path or a path relative to
the current file's directory.
Eg:
.IP
.nf
\f[C]
#\ rules\ reused\ with\ several\ CSV\ files
include\ common.rules
\f[]
.fi
.SS newest\-first
.PP
\f[C]newest\-first\f[]
.PP
Consider adding this rule if all of the following are true: you might be
processing just one day of data, your CSV records are in reverse
chronological order (newest first), and you care about preserving the
order of same\-day transactions.
It usually isn't needed, because hledger autodetects the CSV order, but
when all CSV records have the same date it will assume they are oldest
first.
.SH CSV TIPS
.SS CSV ordering
.PP
The generated journal entries will be sorted by date.
The order of same\-day entries will be preserved (except in the special
case where you might need \f[C]newest\-first\f[], see above).
.SS CSV accounts
.PP
Each journal entry will have two postings, to \f[C]account1\f[] and
\f[C]account2\f[] respectively.
It's not yet possible to generate entries with more than two postings.
It's conventional and recommended to use \f[C]account1\f[] for the
account whose CSV we are reading.
.SS CSV amounts
.PP
The \f[C]amount\f[] field sets the amount of the \f[C]account1\f[]
posting.
.PP
If the CSV has debit/credit amounts in separate fields, assign to the
\f[C]amount\-in\f[] and \f[C]amount\-out\f[] pseudo fields instead.
(Whichever one has a value will be used, with appropriate sign.
If both contain a value, it may not work so well.)
.PP
If an amount value is parenthesised, it will be de\-parenthesised and
sign\-flipped.
.PP
If an amount value begins with a double minus sign, those will cancel
out and be removed.
.PP
If the CSV has the currency symbol in a separate field, assign that to
the \f[C]currency\f[] pseudo field to have it prepended to the amount.
Or, you can use a field assignment to \f[C]amount\f[] that interpolates
both CSV fields (giving more control, eg to put the currency symbol on
the right).
.SS CSV balance assertions
.PP
If the CSV includes a running balance, you can assign that to the
\f[C]balance\f[] pseudo field; whenever the running balance value is
non\-empty, it will be asserted as the balance after the
\f[C]account1\f[] posting.
.SS Reading multiple CSV files
.PP
You can read multiple CSV files at once using multiple \f[C]\-f\f[]
arguments on the command line, and hledger will look for a
correspondingly\-named rules file for each.
Note if you use the \f[C]\-\-rules\-file\f[] option, this one rules file
will be used for all the CSV files being read.
.SH "REPORTING BUGS"
Report bugs at http://bugs.hledger.org
(or on the #hledger IRC channel or hledger mail list)
.SH AUTHORS
Simon Michael <simon@joyful.com> and contributors
.SH COPYRIGHT
Copyright (C) 2007-2016 Simon Michael.
.br
Released under GNU GPL v3 or later.
.SH SEE ALSO
hledger(1), hledger\-ui(1), hledger\-web(1), hledger\-api(1),
hledger_csv(5), hledger_journal(5), hledger_timeclock(5), hledger_timedot(5),
ledger(1)
http://hledger.org