hledger_csv(5)                hledger User Manuals                hledger_csv(5)

NAME
       CSV - how hledger reads CSV data, and the CSV rules file format

DESCRIPTION
       hledger can read CSV (comma-separated value) files as if they were
       journal files, automatically converting each CSV record into a
       transaction.  (To learn about writing CSV, see CSV output.)

       Converting CSV to transactions requires some special conversion rules.
       These do several things:

       o they describe the layout and format of the CSV data

       o they can customize the generated journal entries using a simple
         templating language

       o they can add refinements based on patterns in the CSV data, eg
         categorizing transactions with more detailed account names.

       When reading a CSV file named FILE.csv, hledger looks for a conversion
       rules file named FILE.csv.rules in the same directory.  You can
       override this with the --rules-file option.  If the rules file does
       not exist, hledger will auto-create one with some example rules, which
       you'll need to adjust.

       At minimum, the rules file must identify the date and amount fields.
       It's often also necessary to specify the date format and the number of
       header lines to skip.  Eg:

           fields date, _, _, amount
           date-format %d/%m/%Y
           skip 1

       A more complete example:

           # hledger CSV rules for amazon.com order history

           # sample:
           # "Date","Type","To/From","Name","Status","Amount","Fees","Transaction ID"
           # "Jul 29, 2012","Payment","To","Adapteva, Inc.","Completed","$25.00","$0.00","17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL"

           # skip one header line
           skip 1

           # name the csv fields (and assign the transaction's date, amount and code)
           fields date, _, toorfrom, name, amzstatus, amount, fees, code

           # how to parse the date
           date-format %b %-d, %Y

           # combine two fields to make the description
           description %toorfrom %name

           # save these fields as tags
           comment status:%amzstatus, fees:%fees

           # set the base account for all transactions
           account1 assets:amazon

           # flip the sign on the amount
           amount -%amount

       For more examples, see Convert CSV files.

CSV RULES
       The following seven kinds of rule can appear in the rules file, in any
       order.  Blank lines and lines beginning with # or ; are ignored.

   skip
       skip N

       Skip this number of CSV records at the beginning.  You'll need this
       whenever your CSV data contains header lines.  Eg:

           # ignore the first CSV line
           skip 1

   date-format
       date-format DATEFMT

       When your CSV date fields are not formatted like YYYY/MM/DD (or
       YYYY-MM-DD or YYYY.MM.DD), you'll need to specify the format.  DATEFMT
       is a strptime-like date parsing pattern, which must parse the date
       field values completely.  Examples:

           # for dates like "11/06/2013":
           date-format %m/%d/%Y

           # for dates like "6/11/2013" (note the - to make leading zeros optional):
           date-format %-d/%-m/%Y

           # for dates like "2013-Nov-06":
           date-format %Y-%h-%d

           # for dates like "11/6/2013 11:32 PM":
           date-format %-m/%-d/%Y %l:%M %p

   field list
       fields FIELDNAME1, FIELDNAME2...

       This (a) names the CSV fields, in order (names may not contain
       whitespace; uninteresting names may be left blank), and (b) assigns
       them to journal entry fields if you use any of these standard field
       names: date, date2, status, code, description, comment, account1,
       account2, amount, amount-in, amount-out, currency, balance, balance1,
       balance2.  Eg:

           # use the 1st, 2nd and 4th CSV fields as the entry's date, description and amount,
           # and give the 7th and 8th fields meaningful names for later reference:
           #
           # CSV field:
           #      1     2            3 4       5 6 7          8
           # entry field:
           fields date, description, , amount, , , somefield, anotherfield

   field assignment
       ENTRYFIELDNAME FIELDVALUE

       This sets a journal entry field (one of the standard names above) to
       the given text value, which can include CSV field values interpolated
       by name (%CSVFIELDNAME) or 1-based position (%N).  Eg:

           # set the amount to the 4th CSV field with "USD " prepended
           amount USD %4

           # combine three fields to make a comment (containing two tags)
           comment note: %somefield - %anotherfield, date: %1

       Field assignments can be used instead of or in addition to a field
       list.
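
       As a minimal sketch of using field assignments without a field list,
       the rules below refer to CSV fields purely by position.  The column
       layout (date in field 1, description in field 2, amount in field 4)
       and the single header line are assumptions made for illustration:

           # hypothetical layout: date, description, (unused), amount
           skip 1
           date %1
           description %2
           amount %4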

   conditional block
       if PATTERN
           FIELDASSIGNMENTS...

       if
       PATTERN
       PATTERN...
           FIELDASSIGNMENTS...

       This applies one or more field assignments, only to those CSV records
       matched by one of the PATTERNs.  The patterns are case-insensitive
       regular expressions which match anywhere within the whole CSV record
       (it's not yet possible to match within a specific field).  When there
       are multiple patterns they can be written on separate lines,
       unindented.  The field assignments are on separate lines indented by
       at least one space.  Examples:

           # if the CSV record contains "groceries", set account2 to "expenses:groceries"
           if groceries
            account2 expenses:groceries

           # if the CSV record contains any of these patterns, set account2 and comment as shown
           if
           monthly service fee
           atm transaction fee
           banking thru software
            account2 expenses:business:banking
            comment XXX deductible ? check it
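
       Because patterns are regular expressions, a single pattern can cover
       several spellings.  A sketch, assuming the hypothetical payee text
       "amazon" or "amzn" appears somewhere in the record, and assuming the
       alternation operator | is available in the regular expression syntax;
       the account name is also made up:

           # would match "AMAZON.COM" or "Amzn Mktp", since matching is case-insensitive
           if amazon|amzn
            account2 expenses:shopping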

   include
       include RULESFILE

       Include another rules file at this point.  RULESFILE is either an
       absolute file path or a path relative to the current file's directory.
       Eg:

           # rules reused with several CSV files
           include common.rules
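
       One way to use this, sketched with hypothetical file and account
       names: keep the shared rules in common.rules, and let each CSV file's
       own rules file add only what is specific to that file:

           # checking.csv.rules (hypothetical), used when reading checking.csv
           include common.rules
           account1 assets:bank:checking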

   newest-first
       newest-first

       Consider adding this rule if all of the following are true: you might
       be processing just one day of data, your CSV records are in reverse
       chronological order (newest first), and you care about preserving the
       order of same-day transactions.  It usually isn't needed, because
       hledger autodetects the CSV order, but when all CSV records have the
       same date it will assume they are oldest first.
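
       As the synopsis shows, the rule is written on a line by itself.  A
       sketch of a rules file for a hypothetical export that lists the
       latest records first (the field layout is assumed):

           # records in this CSV are in reverse chronological order
           newest-first
           skip 1
           fields date, description, amount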

CSV TIPS
   CSV ordering
       The generated journal entries will be sorted by date.  The order of
       same-day entries will be preserved (except in the special case where
       you might need newest-first, see above).

   CSV accounts
       Each journal entry will have two postings, to account1 and account2
       respectively.  It's not yet possible to generate entries with more
       than two postings.  It's conventional and recommended to use account1
       for the account whose CSV we are reading.

   CSV amounts
       A transaction amount must be set, in one of these ways:

       o with an amount field assignment, which sets the first posting's
         amount

       o (when the CSV has debit and credit amounts in separate fields) with
         field assignments for the amount-in and amount-out pseudo fields
         (both of them).  Whichever one has a value will be used, with
         appropriate sign.  If both contain a value, the conversion may not
         work as intended.

       o or implicitly by means of a balance assignment (see below).
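
       For the second case, a sketch assuming a hypothetical CSV with
       separate "paid in" and "paid out" columns.  Naming those columns
       amount-in and amount-out in the field list assigns both pseudo
       fields:

           # sample (hypothetical):
           # "Date","Description","Paid in","Paid out"
           # "2019/03/01","salary","1000.00",""
           # "2019/03/02","groceries","","45.10"
           skip 1
           fields date, description, amount-in, amount-out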

       There is some special handling for sign in amounts:

       o If an amount value is parenthesised, it will be de-parenthesised and
         sign-flipped.

       o If an amount value begins with a double minus sign, those will
         cancel out and be removed.
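
       One consequence of the double minus rule, sketched here assuming a
       field list has named a CSV field "amount": a sign-flipping assignment
       can be applied to values that are already negative.

           # a CSV value of 25.00 becomes -25.00
           # a CSV value of -25.00 becomes --25.00, which is read as 25.00
           amount -%amount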

       If the currency/commodity symbol is provided as a separate CSV field,
       assign it to the currency pseudo field; the symbol will be prepended
       to the amount (TODO: when there is an amount).  Or, you can use an
       amount field assignment for more control, eg:

           fields date,description,currency,amount
           amount %amount %currency

   CSV balance assertions/assignments
       If the CSV includes a running balance, you can assign that to one of
       the pseudo fields balance (or balance1) or balance2.  This will
       generate a balance assertion (or if the amount is left empty, a
       balance assignment), on the first or second posting, whenever the
       running balance field is non-empty.  (TODO: #1000)
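
       A sketch, assuming a hypothetical CSV whose fourth column is the
       running balance of the account being read (the sample data and
       account name are made up):

           # sample (hypothetical):
           # "Date","Description","Amount","Balance"
           # "2019/03/02","groceries","-45.10","954.90"
           skip 1
           fields date, description, amount, balance
           account1 assets:bank:checking

       With rules like these, each record with a non-empty balance value
       should get a balance assertion on its first posting.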

   Reading multiple CSV files
       You can read multiple CSV files at once using multiple -f arguments on
       the command line, and hledger will look for a correspondingly-named
       rules file for each.  Note that if you use the --rules-file option,
       that one rules file will be used for all the CSV files being read.

REPORTING BUGS
       Report bugs at http://bugs.hledger.org (or on the #hledger IRC channel
       or hledger mail list)

AUTHORS
       Simon Michael <simon@joyful.com> and contributors

COPYRIGHT
       Copyright (C) 2007-2016 Simon Michael.
       Released under GNU GPL v3 or later.

SEE ALSO
       hledger(1), hledger-ui(1), hledger-web(1), hledger-api(1),
       hledger_csv(5), hledger_journal(5), hledger_timeclock(5),
       hledger_timedot(5), ledger(1)

       http://hledger.org

hledger 1.14.99                    March 2019                     hledger_csv(5)