mirror of
https://github.com/simonmichael/hledger.git
synced 2024-11-09 00:15:48 +03:00
b1859769ac
[ci skip]
371 lines
10 KiB
Groff
371 lines
10 KiB
Groff
|
|
.TH "hledger_csv" "5" "August 2019" "hledger 1.15" "hledger User Manuals"
|
|
|
|
|
|
|
|
.SH NAME
|
|
.PP
|
|
CSV - how hledger reads CSV data, and the CSV rules file format
|
|
.SH DESCRIPTION
|
|
.PP
|
|
hledger can read CSV (comma-separated value) files as if they were
|
|
journal files, automatically converting each CSV record into a
|
|
transaction.
|
|
(To learn about \f[I]writing\f[R] CSV, see CSV output.)
|
|
.PP
|
|
Converting CSV to transactions requires some special conversion rules.
|
|
These do several things:
|
|
.IP \[bu] 2
|
|
they describe the layout and format of the CSV data
|
|
.IP \[bu] 2
|
|
they can customize the generated journal entries using a simple
|
|
templating language
|
|
.IP \[bu] 2
|
|
they can add refinements based on patterns in the CSV data, eg
|
|
categorizing transactions with more detailed account names.
|
|
.PP
|
|
When reading a CSV file named \f[C]FILE.csv\f[R], hledger looks for a
|
|
conversion rules file named \f[C]FILE.csv.rules\f[R] in the same
|
|
directory.
|
|
You can override this with the \f[C]--rules-file\f[R] option.
|
|
If the rules file does not exist, hledger will auto-create one with some
|
|
example rules, which you\[aq]ll need to adjust.
|
|
.PP
|
|
At minimum, the rules file must identify the date and amount fields.
|
|
It\[aq]s often necessary to specify the date format, and the number of
|
|
header lines to skip, also.
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
fields date, _, _, amount
|
|
date-format %d/%m/%Y
|
|
skip 1
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
A more complete example:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# hledger CSV rules for amazon.com order history
|
|
|
|
# sample:
|
|
# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
|
|
# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]
|
|
|
|
# skip one header line
|
|
skip 1
|
|
|
|
# name the csv fields (and assign the transaction\[aq]s date, amount and code)
|
|
fields date, _, toorfrom, name, amzstatus, amount, fees, code
|
|
|
|
# how to parse the date
|
|
date-format %b %-d, %Y
|
|
|
|
# combine two fields to make the description
|
|
description %toorfrom %name
|
|
|
|
# save these fields as tags
|
|
comment status:%amzstatus, fees:%fees
|
|
|
|
# set the base account for all transactions
|
|
account1 assets:amazon
|
|
|
|
# flip the sign on the amount
|
|
amount -%amount
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
For more examples, see Convert CSV files.
|
|
.SH CSV RULES
|
|
.PP
|
|
The following seven kinds of rule can appear in the rules file, in any
|
|
order.
|
|
Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
|
|
ignored.
|
|
.SS skip
|
|
.PP
|
|
\f[C]skip\f[R]\f[I]\f[CI]N\f[I]\f[R]
|
|
.PP
|
|
Skip this number of CSV records at the beginning.
|
|
You\[aq]ll need this whenever your CSV data contains header lines.
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# ignore the first CSV line
|
|
skip 1
|
|
\f[R]
|
|
.fi
|
|
.SS date-format
|
|
.PP
|
|
\f[C]date-format\f[R]\f[I]\f[CI]DATEFMT\f[I]\f[R]
|
|
.PP
|
|
When your CSV date fields are not formatted like \f[C]YYYY/MM/DD\f[R]
|
|
(or \f[C]YYYY-MM-DD\f[R] or \f[C]YYYY.MM.DD\f[R]), you\[aq]ll need to
|
|
specify the format.
|
|
DATEFMT is a strptime-like date parsing pattern, which must parse the
|
|
date field values completely.
|
|
Examples:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# for dates like \[dq]11/06/2013\[dq]:
|
|
date-format %m/%d/%Y
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# for dates like \[dq]6/11/2013\[dq] (note the - to make leading zeros optional):
|
|
date-format %-d/%-m/%Y
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# for dates like \[dq]2013-Nov-06\[dq]:
|
|
date-format %Y-%h-%d
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# for dates like \[dq]11/6/2013 11:32 PM\[dq]:
|
|
date-format %-m/%-d/%Y %l:%M %p
|
|
\f[R]
|
|
.fi
|
|
.SS field list
|
|
.PP
|
|
\f[C]fields\f[R]\f[I]\f[CI]FIELDNAME1\f[I]\f[R],
|
|
\f[I]\f[CI]FIELDNAME2\f[I]\f[R]...
|
|
.PP
|
|
This (a) names the CSV fields, in order (names may not contain
|
|
whitespace; uninteresting names may be left blank), and (b) assigns them
|
|
to journal entry fields if you use any of these standard field names:
|
|
\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
|
|
\f[C]description\f[R], \f[C]comment\f[R], \f[C]account1\f[R],
|
|
\f[C]account2\f[R], \f[C]amount\f[R], \f[C]amount-in\f[R],
|
|
\f[C]amount-out\f[R], \f[C]currency\f[R], \f[C]balance\f[R],
|
|
\f[C]balance1\f[R], \f[C]balance2\f[R].
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# use the 1st, 2nd and 4th CSV fields as the entry\[aq]s date, description and amount,
|
|
# and give the 7th and 8th fields meaningful names for later reference:
|
|
#
|
|
# CSV field:
|
|
# 1 2 3 4 5 6 7 8
|
|
# entry field:
|
|
fields date, description, , amount, , , somefield, anotherfield
|
|
\f[R]
|
|
.fi
|
|
.SS field assignment
|
|
.PP
|
|
\f[I]\f[CI]ENTRYFIELDNAME\f[I]\f[R] \f[I]\f[CI]FIELDVALUE\f[I]\f[R]
|
|
.PP
|
|
This sets a journal entry field (one of the standard names above) to the
|
|
given text value, which can include CSV field values interpolated by
|
|
name (\f[C]%CSVFIELDNAME\f[R]) or 1-based position (\f[C]%N\f[R]).
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# set the amount to the 4th CSV field with \[dq]USD \[dq] prepended
|
|
amount USD %4
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# combine three fields to make a comment (containing two tags)
|
|
comment note: %somefield - %anotherfield, date: %1
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Field assignments can be used instead of or in addition to a field list.
|
|
.PP
|
|
Note, interpolation strips any outer whitespace, so a CSV value like
|
|
\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
|
|
.SS conditional block
|
|
.PP
|
|
\f[C]if\f[R] \f[I]\f[CI]PATTERN\f[I]\f[R]
|
|
.PD 0
|
|
.P
|
|
.PD
|
|
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
|
|
.PP
|
|
\f[C]if\f[R]
|
|
.PD 0
|
|
.P
|
|
.PD
|
|
\f[I]\f[CI]PATTERN\f[I]\f[R]
|
|
.PD 0
|
|
.P
|
|
.PD
|
|
\f[I]\f[CI]PATTERN\f[I]\f[R]...
|
|
.PD 0
|
|
.P
|
|
.PD
|
|
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
|
|
.PP
|
|
This applies one or more field assignments, only to those CSV records
|
|
matched by one of the PATTERNs.
|
|
The patterns are case-insensitive regular expressions which match
|
|
anywhere within the whole CSV record (it\[aq]s not yet possible to match
|
|
within a specific field).
|
|
When there are multiple patterns they can be written on separate lines,
|
|
unindented.
|
|
The field assignments are on separate lines indented by at least one
|
|
space.
|
|
Examples:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# if the CSV record contains \[dq]groceries\[dq], set account2 to \[dq]expenses:groceries\[dq]
|
|
if groceries
|
|
account2 expenses:groceries
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# if the CSV record contains any of these patterns, set account2 and comment as shown
|
|
if
|
|
monthly service fee
|
|
atm transaction fee
|
|
banking thru software
|
|
account2 expenses:business:banking
|
|
comment XXX deductible ? check it
|
|
\f[R]
|
|
.fi
|
|
.SS include
|
|
.PP
|
|
\f[C]include\f[R]\f[I]\f[CI]RULESFILE\f[I]\f[R]
|
|
.PP
|
|
Include another rules file at this point.
|
|
\f[C]RULESFILE\f[R] is either an absolute file path or a path relative
|
|
to the current file\[aq]s directory.
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# rules reused with several CSV files
|
|
include common.rules
|
|
\f[R]
|
|
.fi
|
|
.SS newest-first
|
|
.PP
|
|
\f[C]newest-first\f[R]
|
|
.PP
|
|
Consider adding this rule if all of the following are true: you might be
|
|
processing just one day of data, your CSV records are in reverse
|
|
chronological order (newest first), and you care about preserving the
|
|
order of same-day transactions.
|
|
It usually isn\[aq]t needed, because hledger autodetects the CSV order,
|
|
but when all CSV records have the same date it will assume they are
|
|
oldest first.
|
|
.SH CSV TIPS
|
|
.SS CSV ordering
|
|
.PP
|
|
The generated journal entries will be sorted by date.
|
|
The order of same-day entries will be preserved (except in the special
|
|
case where you might need \f[C]newest-first\f[R], see above).
|
|
.SS CSV accounts
|
|
.PP
|
|
Each journal entry will have two postings, to \f[C]account1\f[R] and
|
|
\f[C]account2\f[R] respectively.
|
|
It\[aq]s not yet possible to generate entries with more than two
|
|
postings.
|
|
It\[aq]s conventional and recommended to use \f[C]account1\f[R] for the
|
|
account whose CSV we are reading.
|
|
.SS CSV amounts
|
|
.PP
|
|
A transaction amount must be set, in one of these ways:
|
|
.IP \[bu] 2
|
|
with an \f[C]amount\f[R] field assignment, which sets the first
|
|
posting\[aq]s amount
|
|
.IP \[bu] 2
|
|
(When the CSV has debit and credit amounts in separate fields:)
|
|
.PD 0
|
|
.P
|
|
.PD
|
|
with field assignments for the \f[C]amount-in\f[R] and
|
|
\f[C]amount-out\f[R] pseudo fields (both of them).
|
|
Whichever one has a value will be used, with appropriate sign.
|
|
If both contain a value, it might not work so well.
|
|
.IP \[bu] 2
|
|
or implicitly by means of a balance assignment (see below).
|
|
.PP
|
|
There is some special handling for sign in amounts:
|
|
.IP \[bu] 2
|
|
If an amount value is parenthesised, it will be de-parenthesised and
|
|
sign-flipped.
|
|
.IP \[bu] 2
|
|
If an amount value begins with a double minus sign, those will cancel
|
|
out and be removed.
|
|
.PP
|
|
If the currency/commodity symbol is provided as a separate CSV field,
|
|
assign it to the \f[C]currency\f[R] pseudo field; the symbol will be
|
|
prepended to the amount (TODO: when there is an amount).
|
|
Or, you can use an \f[C]amount\f[R] field assignment for more control,
|
|
eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
fields date,description,currency,amount
|
|
amount %amount %currency
|
|
\f[R]
|
|
.fi
|
|
.SS CSV balance assertions/assignments
|
|
.PP
|
|
If the CSV includes a running balance, you can assign that to one of the
|
|
pseudo fields \f[C]balance\f[R] (or \f[C]balance1\f[R]) or
|
|
\f[C]balance2\f[R].
|
|
This will generate a balance assertion (or if the amount is left empty,
|
|
a balance assignment), on the first or second posting, whenever the
|
|
running balance field is non-empty.
|
|
(TODO: #1000)
|
|
.SS Reading multiple CSV files
|
|
.PP
|
|
You can read multiple CSV files at once using multiple \f[C]-f\f[R]
|
|
arguments on the command line, and hledger will look for a
|
|
correspondingly-named rules file for each.
|
|
Note if you use the \f[C]--rules-file\f[R] option, this one rules file
|
|
will be used for all the CSV files being read.
|
|
.SS Valid CSV
|
|
.PP
|
|
hledger follows RFC 4180, with the addition of a customisable separator
|
|
character.
|
|
.PP
|
|
Some things to note:
|
|
.PP
|
|
When quoting fields,
|
|
.IP \[bu] 2
|
|
you must use double quotes, not single quotes
|
|
.IP \[bu] 2
|
|
spaces outside the quotes are not allowed.
|
|
|
|
|
|
.SH "REPORTING BUGS"
|
|
Report bugs at http://bugs.hledger.org
|
|
(or on the #hledger IRC channel or hledger mail list)
|
|
|
|
.SH AUTHORS
|
|
Simon Michael <simon@joyful.com> and contributors
|
|
|
|
.SH COPYRIGHT
|
|
|
|
Copyright (C) 2007-2016 Simon Michael.
|
|
.br
|
|
Released under GNU GPL v3 or later.
|
|
|
|
.SH SEE ALSO
|
|
hledger(1), hledger\-ui(1), hledger\-web(1), hledger\-api(1),
|
|
hledger_csv(5), hledger_journal(5), hledger_timeclock(5), hledger_timedot(5),
|
|
ledger(1)
|
|
|
|
http://hledger.org
|