mirror of
https://github.com/simonmichael/hledger.git
synced 2024-11-08 07:09:28 +03:00
1301 lines
39 KiB
Groff
1301 lines
39 KiB
Groff
.\"t
|
|
|
|
.TH "hledger_csv" "5" "December 2020" "hledger 1.20.99" "hledger User Manuals"
|
|
|
|
|
|
|
|
.SH NAME
|
|
.PP
|
|
CSV - how hledger reads CSV data, and the CSV rules file format
|
|
.SH DESCRIPTION
|
|
.PP
|
|
hledger can read CSV files (Character Separated Value - usually comma,
|
|
semicolon, or tab) containing dated records as if they were journal
|
|
files, automatically converting each CSV record into a transaction.
|
|
.PP
|
|
(To learn about \f[I]writing\f[R] CSV, see CSV output.)
|
|
.PP
|
|
We describe each CSV file\[aq]s format with a corresponding \f[I]rules
|
|
file\f[R].
|
|
By default this is named like the CSV file with a \f[C].rules\f[R]
|
|
extension added.
|
|
Eg when reading \f[C]FILE.csv\f[R], hledger also looks for
|
|
\f[C]FILE.csv.rules\f[R] in the same directory as \f[C]FILE.csv\f[R].
|
|
You can specify a different rules file with the \f[C]--rules-file\f[R]
|
|
option.
|
|
If a rules file is not found, hledger will create a sample rules file,
|
|
which you\[aq]ll need to adjust.
|
|
.PP
|
|
This file contains rules describing the CSV data (header line, fields
|
|
layout, date format etc.), and how to construct hledger journal entries
|
|
(transactions) from it.
|
|
Often there will also be a list of conditional rules for categorising
|
|
transactions based on their descriptions.
|
|
Here\[aq]s an overview of the CSV rules; these are described more fully
|
|
below, after the examples:
|
|
.PP
|
|
.TS
|
|
tab(@);
|
|
lw(30.1n) lw(39.9n).
|
|
T{
|
|
\f[B]\f[CB]skip\f[B]\f[R]
|
|
T}@T{
|
|
skip one or more header lines or matched CSV records
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]fields\f[B]\f[R]
|
|
T}@T{
|
|
name CSV fields, assign them to hledger fields
|
|
T}
|
|
T{
|
|
\f[B]field assignment\f[R]
|
|
T}@T{
|
|
assign a value to one hledger field, with interpolation
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]separator\f[B]\f[R]
|
|
T}@T{
|
|
a custom field separator
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]if\f[B] block\f[R]
|
|
T}@T{
|
|
apply some rules to CSV records matched by patterns
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]if\f[B] table\f[R]
|
|
T}@T{
|
|
apply some rules to CSV records matched by patterns, alternate syntax
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]end\f[B]\f[R]
|
|
T}@T{
|
|
skip the remaining CSV records
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]date-format\f[B]\f[R]
|
|
T}@T{
|
|
how to parse dates in CSV records
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]decimal-mark\f[B]\f[R]
|
|
T}@T{
|
|
the decimal mark used in CSV amounts, if ambiguous
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]newest-first\f[B]\f[R]
|
|
T}@T{
|
|
disambiguate record order when there\[aq]s only one date
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]include\f[B]\f[R]
|
|
T}@T{
|
|
inline another CSV rules file
|
|
T}
|
|
T{
|
|
\f[B]\f[CB]balance-type\f[B]\f[R]
|
|
T}@T{
|
|
choose which type of balance assignments to use
|
|
T}
|
|
.TE
|
|
.PP
|
|
Note, for best error messages when reading CSV files, use a
|
|
\f[C].csv\f[R], \f[C].tsv\f[R] or \f[C].ssv\f[R] file extension or file
|
|
prefix - see File Extension below.
|
|
.PP
|
|
There\[aq]s an introductory Convert CSV files tutorial on hledger.org.
|
|
.SH EXAMPLES
|
|
.PP
|
|
Here are some sample hledger CSV rules files.
|
|
See also the full collection at:
|
|
.PD 0
|
|
.P
|
|
.PD
|
|
https://github.com/simonmichael/hledger/tree/master/examples/csv
|
|
.SS Basic
|
|
.PP
|
|
At minimum, the rules file must identify the date and amount fields, and
|
|
often it also specifies the date format and how many header lines there
|
|
are.
|
|
Here\[aq]s a simple CSV file and a rules file for it:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
Date, Description, Id, Amount
|
|
12/11/2019, Foo, 123, 10.23
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# basic.csv.rules
|
|
skip 1
|
|
fields date, description, _, amount
|
|
date-format %d/%m/%Y
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
$ hledger print -f basic.csv
|
|
2019-11-12 Foo
|
|
expenses:unknown 10.23
|
|
income:unknown -10.23
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Default account names are chosen, since we didn\[aq]t set them.
|
|
.SS Bank of Ireland
|
|
.PP
|
|
Here\[aq]s a CSV with two amount fields (Debit and Credit), and a
|
|
balance field, which we can use to add balance assertions, which is not
|
|
necessary but provides extra error checking:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
Date,Details,Debit,Credit,Balance
|
|
07/12/2012,LODGMENT 529898,,10.0,131.21
|
|
07/12/2012,PAYMENT,5,,126
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# bankofireland-checking.csv.rules
|
|
|
|
# skip the header line
|
|
skip
|
|
|
|
# name the csv fields, and assign some of them as journal entry fields
|
|
fields date, description, amount-out, amount-in, balance
|
|
|
|
# We generate balance assertions by assigning to \[dq]balance\[dq]
|
|
# above, but you may sometimes need to remove these because:
|
|
#
|
|
# - the CSV balance differs from the true balance,
|
|
# by up to 0.0000000000005 in my experience
|
|
#
|
|
# - it is sometimes calculated based on non-chronological ordering,
|
|
# eg when multiple transactions clear on the same day
|
|
|
|
# date is in UK/Ireland format
|
|
date-format %d/%m/%Y
|
|
|
|
# set the currency
|
|
currency EUR
|
|
|
|
# set the base account for all txns
|
|
account1 assets:bank:boi:checking
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
$ hledger -f bankofireland-checking.csv print
|
|
2012-12-07 LODGMENT 529898
|
|
assets:bank:boi:checking EUR10.0 = EUR131.2
|
|
income:unknown EUR-10.0
|
|
|
|
2012-12-07 PAYMENT
|
|
assets:bank:boi:checking EUR-5.0 = EUR126.0
|
|
expenses:unknown EUR5.0
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
The balance assertions don\[aq]t raise an error above, because we\[aq]re
|
|
reading directly from CSV, but they will be checked if these entries are
|
|
imported into a journal file.
|
|
.SS Amazon
|
|
.PP
|
|
Here we convert amazon.com order history, and use an if block to
|
|
generate a third posting if there\[aq]s a fee.
|
|
(In practice you\[aq]d probably get this data from your bank instead,
|
|
but it\[aq]s an example.)
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
\[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
|
|
\[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Foo.\[dq],\[dq]Completed\[dq],\[dq]$20.00\[dq],\[dq]$0.00\[dq],\[dq]16000000000000DGLNJPI1P9B8DKPVHL\[dq]
|
|
\[dq]Jul 30, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$1.00\[dq],\[dq]17LA58JSKRD4HDGLNJPI1P9B8DKPVHL\[dq]
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# amazon-orders.csv.rules
|
|
|
|
# skip one header line
|
|
skip 1
|
|
|
|
# name the csv fields, and assign the transaction\[aq]s date, amount and code.
|
|
# Avoided the \[dq]status\[dq] and \[dq]amount\[dq] hledger field names to prevent confusion.
|
|
fields date, _, toorfrom, name, amzstatus, amzamount, fees, code
|
|
|
|
# how to parse the date
|
|
date-format %b %-d, %Y
|
|
|
|
# combine two fields to make the description
|
|
description %toorfrom %name
|
|
|
|
# save the status as a tag
|
|
comment status:%amzstatus
|
|
|
|
# set the base account for all transactions
|
|
account1 assets:amazon
|
|
# leave amount1 blank so it can balance the other(s).
|
|
# I\[aq]m assuming amzamount excludes the fees, don\[aq]t remember
|
|
|
|
# set a generic account2
|
|
account2 expenses:misc
|
|
amount2 %amzamount
|
|
# and maybe refine it further:
|
|
#include categorisation.rules
|
|
|
|
# add a third posting for fees, but only if they are non-zero.
|
|
if %fees [1-9]
|
|
account3 expenses:fees
|
|
amount3 %fees
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
$ hledger -f amazon-orders.csv print
|
|
2012-07-29 (16000000000000DGLNJPI1P9B8DKPVHL) To Foo. ; status:Completed
|
|
assets:amazon
|
|
expenses:misc $20.00
|
|
|
|
2012-07-30 (17LA58JSKRD4HDGLNJPI1P9B8DKPVHL) To Adapteva, Inc. ; status:Completed
|
|
assets:amazon
|
|
expenses:misc $25.00
|
|
expenses:fees $1.00
|
|
\f[R]
|
|
.fi
|
|
.SS Paypal
|
|
.PP
|
|
Here\[aq]s a real-world rules file for (customised) Paypal CSV, with
|
|
some Paypal-specific rules, and a second rules file included:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
\[dq]Date\[dq],\[dq]Time\[dq],\[dq]TimeZone\[dq],\[dq]Name\[dq],\[dq]Type\[dq],\[dq]Status\[dq],\[dq]Currency\[dq],\[dq]Gross\[dq],\[dq]Fee\[dq],\[dq]Net\[dq],\[dq]From Email Address\[dq],\[dq]To Email Address\[dq],\[dq]Transaction ID\[dq],\[dq]Item Title\[dq],\[dq]Item ID\[dq],\[dq]Reference Txn ID\[dq],\[dq]Receipt ID\[dq],\[dq]Balance\[dq],\[dq]Note\[dq]
|
|
\[dq]10/01/2019\[dq],\[dq]03:46:20\[dq],\[dq]PDT\[dq],\[dq]Calm Radio\[dq],\[dq]Subscription Payment\[dq],\[dq]Completed\[dq],\[dq]USD\[dq],\[dq]-6.99\[dq],\[dq]0.00\[dq],\[dq]-6.99\[dq],\[dq]simon\[at]joyful.com\[dq],\[dq]memberships\[at]calmradio.com\[dq],\[dq]60P57143A8206782E\[dq],\[dq]MONTHLY - $1 for the first 2 Months: Me - Order 99309. Item total: $1.00 USD first 2 months, then $6.99 / Month\[dq],\[dq]\[dq],\[dq]I-R8YLY094FJYR\[dq],\[dq]\[dq],\[dq]-6.99\[dq],\[dq]\[dq]
|
|
\[dq]10/01/2019\[dq],\[dq]03:46:20\[dq],\[dq]PDT\[dq],\[dq]\[dq],\[dq]Bank Deposit to PP Account \[dq],\[dq]Pending\[dq],\[dq]USD\[dq],\[dq]6.99\[dq],\[dq]0.00\[dq],\[dq]6.99\[dq],\[dq]\[dq],\[dq]simon\[at]joyful.com\[dq],\[dq]0TU1544T080463733\[dq],\[dq]\[dq],\[dq]\[dq],\[dq]60P57143A8206782E\[dq],\[dq]\[dq],\[dq]0.00\[dq],\[dq]\[dq]
|
|
\[dq]10/01/2019\[dq],\[dq]08:57:01\[dq],\[dq]PDT\[dq],\[dq]Patreon\[dq],\[dq]PreApproved Payment Bill User Payment\[dq],\[dq]Completed\[dq],\[dq]USD\[dq],\[dq]-7.00\[dq],\[dq]0.00\[dq],\[dq]-7.00\[dq],\[dq]simon\[at]joyful.com\[dq],\[dq]support\[at]patreon.com\[dq],\[dq]2722394R5F586712G\[dq],\[dq]Patreon* Membership\[dq],\[dq]\[dq],\[dq]B-0PG93074E7M86381M\[dq],\[dq]\[dq],\[dq]-7.00\[dq],\[dq]\[dq]
|
|
\[dq]10/01/2019\[dq],\[dq]08:57:01\[dq],\[dq]PDT\[dq],\[dq]\[dq],\[dq]Bank Deposit to PP Account \[dq],\[dq]Pending\[dq],\[dq]USD\[dq],\[dq]7.00\[dq],\[dq]0.00\[dq],\[dq]7.00\[dq],\[dq]\[dq],\[dq]simon\[at]joyful.com\[dq],\[dq]71854087RG994194F\[dq],\[dq]Patreon* Membership\[dq],\[dq]\[dq],\[dq]2722394R5F586712G\[dq],\[dq]\[dq],\[dq]0.00\[dq],\[dq]\[dq]
|
|
\[dq]10/19/2019\[dq],\[dq]03:02:12\[dq],\[dq]PDT\[dq],\[dq]Wikimedia Foundation, Inc.\[dq],\[dq]Subscription Payment\[dq],\[dq]Completed\[dq],\[dq]USD\[dq],\[dq]-2.00\[dq],\[dq]0.00\[dq],\[dq]-2.00\[dq],\[dq]simon\[at]joyful.com\[dq],\[dq]tle\[at]wikimedia.org\[dq],\[dq]K9U43044RY432050M\[dq],\[dq]Monthly donation to the Wikimedia Foundation\[dq],\[dq]\[dq],\[dq]I-R5C3YUS3285L\[dq],\[dq]\[dq],\[dq]-2.00\[dq],\[dq]\[dq]
|
|
\[dq]10/19/2019\[dq],\[dq]03:02:12\[dq],\[dq]PDT\[dq],\[dq]\[dq],\[dq]Bank Deposit to PP Account \[dq],\[dq]Pending\[dq],\[dq]USD\[dq],\[dq]2.00\[dq],\[dq]0.00\[dq],\[dq]2.00\[dq],\[dq]\[dq],\[dq]simon\[at]joyful.com\[dq],\[dq]3XJ107139A851061F\[dq],\[dq]\[dq],\[dq]\[dq],\[dq]K9U43044RY432050M\[dq],\[dq]\[dq],\[dq]0.00\[dq],\[dq]\[dq]
|
|
\[dq]10/22/2019\[dq],\[dq]05:07:06\[dq],\[dq]PDT\[dq],\[dq]Noble Benefactor\[dq],\[dq]Subscription Payment\[dq],\[dq]Completed\[dq],\[dq]USD\[dq],\[dq]10.00\[dq],\[dq]-0.59\[dq],\[dq]9.41\[dq],\[dq]noble\[at]bene.fac.tor\[dq],\[dq]simon\[at]joyful.com\[dq],\[dq]6L8L1662YP1334033\[dq],\[dq]Joyful Systems\[dq],\[dq]\[dq],\[dq]I-KC9VBGY2GWDB\[dq],\[dq]\[dq],\[dq]9.41\[dq],\[dq]\[dq]
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# paypal-custom.csv.rules
|
|
|
|
# Tips:
|
|
# Export from Activity -> Statements -> Custom -> Activity download
|
|
# Suggested transaction type: \[dq]Balance affecting\[dq]
|
|
# Paypal\[aq]s default fields in 2018 were:
|
|
# \[dq]Date\[dq],\[dq]Time\[dq],\[dq]TimeZone\[dq],\[dq]Name\[dq],\[dq]Type\[dq],\[dq]Status\[dq],\[dq]Currency\[dq],\[dq]Gross\[dq],\[dq]Fee\[dq],\[dq]Net\[dq],\[dq]From Email Address\[dq],\[dq]To Email Address\[dq],\[dq]Transaction ID\[dq],\[dq]Shipping Address\[dq],\[dq]Address Status\[dq],\[dq]Item Title\[dq],\[dq]Item ID\[dq],\[dq]Shipping and Handling Amount\[dq],\[dq]Insurance Amount\[dq],\[dq]Sales Tax\[dq],\[dq]Option 1 Name\[dq],\[dq]Option 1 Value\[dq],\[dq]Option 2 Name\[dq],\[dq]Option 2 Value\[dq],\[dq]Reference Txn ID\[dq],\[dq]Invoice Number\[dq],\[dq]Custom Number\[dq],\[dq]Quantity\[dq],\[dq]Receipt ID\[dq],\[dq]Balance\[dq],\[dq]Address Line 1\[dq],\[dq]Address Line 2/District/Neighborhood\[dq],\[dq]Town/City\[dq],\[dq]State/Province/Region/County/Territory/Prefecture/Republic\[dq],\[dq]Zip/Postal Code\[dq],\[dq]Country\[dq],\[dq]Contact Phone Number\[dq],\[dq]Subject\[dq],\[dq]Note\[dq],\[dq]Country Code\[dq],\[dq]Balance Impact\[dq]
|
|
# This rules file assumes the following more detailed fields, configured in \[dq]Customize report fields\[dq]:
|
|
# \[dq]Date\[dq],\[dq]Time\[dq],\[dq]TimeZone\[dq],\[dq]Name\[dq],\[dq]Type\[dq],\[dq]Status\[dq],\[dq]Currency\[dq],\[dq]Gross\[dq],\[dq]Fee\[dq],\[dq]Net\[dq],\[dq]From Email Address\[dq],\[dq]To Email Address\[dq],\[dq]Transaction ID\[dq],\[dq]Item Title\[dq],\[dq]Item ID\[dq],\[dq]Reference Txn ID\[dq],\[dq]Receipt ID\[dq],\[dq]Balance\[dq],\[dq]Note\[dq]
|
|
|
|
fields date, time, timezone, description_, type, status_, currency, grossamount, feeamount, netamount, fromemail, toemail, code, itemtitle, itemid, referencetxnid, receiptid, balance, note
|
|
|
|
skip 1
|
|
|
|
date-format %-m/%-d/%Y
|
|
|
|
# ignore some paypal events
|
|
if
|
|
In Progress
|
|
Temporary Hold
|
|
Update to
|
|
skip
|
|
|
|
# add more fields to the description
|
|
description %description_ %itemtitle
|
|
|
|
# save some other fields as tags
|
|
comment itemid:%itemid, fromemail:%fromemail, toemail:%toemail, time:%time, type:%type, status:%status_
|
|
|
|
# convert to short currency symbols
|
|
if %currency USD
|
|
currency $
|
|
if %currency EUR
|
|
currency E
|
|
if %currency GBP
|
|
currency P
|
|
|
|
# generate postings
|
|
|
|
# the first posting will be the money leaving/entering my paypal account
|
|
# (negative means leaving my account, in all amount fields)
|
|
account1 assets:online:paypal
|
|
amount1 %netamount
|
|
|
|
# the second posting will be money sent to/received from other party
|
|
# (account2 is set below)
|
|
amount2 -%grossamount
|
|
|
|
# if there\[aq]s a fee, add a third posting for the money taken by paypal.
|
|
if %feeamount [1-9]
|
|
account3 expenses:banking:paypal
|
|
amount3 -%feeamount
|
|
comment3 business:
|
|
|
|
# choose an account for the second posting
|
|
|
|
# override the default account names:
|
|
# if the amount is positive, it\[aq]s income (a debit)
|
|
if %grossamount \[ha][\[ha]-]
|
|
account2 income:unknown
|
|
# if negative, it\[aq]s an expense (a credit)
|
|
if %grossamount \[ha]-
|
|
account2 expenses:unknown
|
|
|
|
# apply common rules for setting account2 & other tweaks
|
|
include common.rules
|
|
|
|
# apply some overrides specific to this csv
|
|
|
|
# Transfers from/to bank. These are usually marked Pending,
|
|
# which can be disregarded in this case.
|
|
if
|
|
Bank Account
|
|
Bank Deposit to PP Account
|
|
description %type for %referencetxnid %itemtitle
|
|
account2 assets:bank:wf:pchecking
|
|
account1 assets:online:paypal
|
|
|
|
# Currency conversions
|
|
if Currency Conversion
|
|
account2 equity:currency conversion
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# common.rules
|
|
|
|
if
|
|
darcs
|
|
noble benefactor
|
|
account2 revenues:foss donations:darcshub
|
|
comment2 business:
|
|
|
|
if
|
|
Calm Radio
|
|
account2 expenses:online:apps
|
|
|
|
if
|
|
electronic frontier foundation
|
|
Patreon
|
|
wikimedia
|
|
Advent of Code
|
|
account2 expenses:dues
|
|
|
|
if Google
|
|
account2 expenses:online:apps
|
|
description google | music
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
$ hledger -f paypal-custom.csv print
|
|
2019-10-01 (60P57143A8206782E) Calm Radio MONTHLY - $1 for the first 2 Months: Me - Order 99309. Item total: $1.00 USD first 2 months, then $6.99 / Month ; itemid:, fromemail:simon\[at]joyful.com, toemail:memberships\[at]calmradio.com, time:03:46:20, type:Subscription Payment, status:Completed
|
|
assets:online:paypal $-6.99 = $-6.99
|
|
expenses:online:apps $6.99
|
|
|
|
2019-10-01 (0TU1544T080463733) Bank Deposit to PP Account for 60P57143A8206782E ; itemid:, fromemail:, toemail:simon\[at]joyful.com, time:03:46:20, type:Bank Deposit to PP Account, status:Pending
|
|
assets:online:paypal $6.99 = $0.00
|
|
assets:bank:wf:pchecking $-6.99
|
|
|
|
2019-10-01 (2722394R5F586712G) Patreon Patreon* Membership ; itemid:, fromemail:simon\[at]joyful.com, toemail:support\[at]patreon.com, time:08:57:01, type:PreApproved Payment Bill User Payment, status:Completed
|
|
assets:online:paypal $-7.00 = $-7.00
|
|
expenses:dues $7.00
|
|
|
|
2019-10-01 (71854087RG994194F) Bank Deposit to PP Account for 2722394R5F586712G Patreon* Membership ; itemid:, fromemail:, toemail:simon\[at]joyful.com, time:08:57:01, type:Bank Deposit to PP Account, status:Pending
|
|
assets:online:paypal $7.00 = $0.00
|
|
assets:bank:wf:pchecking $-7.00
|
|
|
|
2019-10-19 (K9U43044RY432050M) Wikimedia Foundation, Inc. Monthly donation to the Wikimedia Foundation ; itemid:, fromemail:simon\[at]joyful.com, toemail:tle\[at]wikimedia.org, time:03:02:12, type:Subscription Payment, status:Completed
|
|
assets:online:paypal $-2.00 = $-2.00
|
|
expenses:dues $2.00
|
|
expenses:banking:paypal ; business:
|
|
|
|
2019-10-19 (3XJ107139A851061F) Bank Deposit to PP Account for K9U43044RY432050M ; itemid:, fromemail:, toemail:simon\[at]joyful.com, time:03:02:12, type:Bank Deposit to PP Account, status:Pending
|
|
assets:online:paypal $2.00 = $0.00
|
|
assets:bank:wf:pchecking $-2.00
|
|
|
|
2019-10-22 (6L8L1662YP1334033) Noble Benefactor Joyful Systems ; itemid:, fromemail:noble\[at]bene.fac.tor, toemail:simon\[at]joyful.com, time:05:07:06, type:Subscription Payment, status:Completed
|
|
assets:online:paypal $9.41 = $9.41
|
|
revenues:foss donations:darcshub $-10.00 ; business:
|
|
expenses:banking:paypal $0.59 ; business:
|
|
\f[R]
|
|
.fi
|
|
.SH CSV RULES
|
|
.PP
|
|
The following kinds of rule can appear in the rules file, in any order.
|
|
Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are
|
|
ignored.
|
|
.SS \f[C]skip\f[R]
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
skip N
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
The word \[dq]skip\[dq] followed by a number (or no number, meaning 1)
|
|
tells hledger to ignore this many non-empty lines preceding the CSV
|
|
data.
|
|
(Empty/blank lines are skipped automatically.) You\[aq]ll need this
|
|
whenever your CSV data contains header lines.
|
|
.PP
|
|
It also has a second purpose: it can be used inside if blocks to ignore
|
|
certain CSV records (described below).
|
|
.SS \f[C]fields\f[R]
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
fields FIELDNAME1, FIELDNAME2, ...
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
A fields list (the word \[dq]fields\[dq] followed by comma-separated
|
|
field names) is the quick way to assign CSV field values to hledger
|
|
fields.
|
|
It does two things:
|
|
.IP "1." 3
|
|
it names the CSV fields.
|
|
This is optional, but can be convenient later for interpolating them.
|
|
.IP "2." 3
|
|
when you use a standard hledger field name, it assigns the CSV value to
|
|
that part of the hledger transaction.
|
|
.PP
|
|
Here\[aq]s an example that says \[dq]use the 1st, 2nd and 4th fields as
|
|
the transaction\[aq]s date, description and amount; name the last two
|
|
fields for later reference; and ignore the others\[dq]:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
fields date, description, , amount, , , somefield, anotherfield
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Field names may not contain whitespace.
|
|
Fields you don\[aq]t care about can be left unnamed.
|
|
Currently there must be least two items (there must be at least one
|
|
comma).
|
|
.PP
|
|
Note, always use comma in the fields list, even if your CSV uses another
|
|
separator character.
|
|
.PP
|
|
Here are the standard hledger field/pseudo-field names.
|
|
For more about the transaction parts they refer to, see the manual for
|
|
hledger\[aq]s journal format.
|
|
.SS Transaction field names
|
|
.PP
|
|
\f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R],
|
|
\f[C]description\f[R], \f[C]comment\f[R] can be used to form the
|
|
transaction\[aq]s first line.
|
|
.SS Posting field names
|
|
.SS account
|
|
.PP
|
|
\f[C]accountN\f[R], where N is 1 to 99, causes a posting to be
|
|
generated, with that account name.
|
|
.PP
|
|
Most often there are two postings, so you\[aq]ll want to set
|
|
\f[C]account1\f[R] and \f[C]account2\f[R].
|
|
Typically \f[C]account1\f[R] is associated with the CSV file, and is set
|
|
once with a top-level assignment, while \f[C]account2\f[R] is set based
|
|
on each transaction\[aq]s description, and in conditional blocks.
|
|
.PP
|
|
If a posting\[aq]s account name is left unset but its amount is set (see
|
|
below), a default account name will be chosen (like
|
|
\[dq]expenses:unknown\[dq] or \[dq]income:unknown\[dq]).
|
|
.SS amount
|
|
.PP
|
|
\f[C]amountN\f[R] sets posting N\[aq]s amount.
|
|
If the CSV uses separate fields for inflows and outflows, you can use
|
|
\f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] instead.
|
|
By assigning to \f[C]amount1\f[R], \f[C]amount2\f[R], ...
|
|
etc.
|
|
you can generate anywhere from 0 to 99 postings.
|
|
.PP
|
|
There is also an older, unnumbered form of these names, suitable for
|
|
2-posting transactions, which sets both posting 1\[aq]s and (negated)
|
|
posting 2\[aq]s amount: \f[C]amount\f[R], or \f[C]amount-in\f[R] and
|
|
\f[C]amount-out\f[R].
|
|
This is still supported because it keeps pre-hledger-1.17 csv rules
|
|
files working, and because it can be more succinct, and because it
|
|
converts posting 2\[aq]s amount to cost if there\[aq]s a transaction
|
|
price, which can be useful.
|
|
.PP
|
|
If you have an existing rules file using the unnumbered form, you might
|
|
want to use the numbered form in certain conditional blocks, without
|
|
having to update and retest all the old rules.
|
|
To facilitate this, posting 1 ignores
|
|
\f[C]amount\f[R]/\f[C]amount-in\f[R]/\f[C]amount-out\f[R] if any of
|
|
\f[C]amount1\f[R]/\f[C]amount1-in\f[R]/\f[C]amount1-out\f[R] are
|
|
assigned, and posting 2 ignores them if any of
|
|
\f[C]amount2\f[R]/\f[C]amount2-in\f[R]/\f[C]amount2-out\f[R] are
|
|
assigned, avoiding conflicts.
|
|
.SS currency
|
|
.PP
|
|
If the CSV has the currency symbol in a separate field (ie, not part of
|
|
the amount field), you can use \f[C]currencyN\f[R] to prepend it to
|
|
posting N\[aq]s amount.
|
|
Or, \f[C]currency\f[R] with no number affects all postings.
|
|
.SS balance
|
|
.PP
|
|
\f[C]balanceN\f[R] sets a balance assertion amount (or if the posting
|
|
amount is left empty, a balance assignment) on posting N.
|
|
.PP
|
|
Also, for compatibility with hledger <1.17: \f[C]balance\f[R] with no
|
|
number is equivalent to \f[C]balance1\f[R].
|
|
.PP
|
|
You can adjust the type of assertion/assignment with the
|
|
\f[C]balance-type\f[R] rule (see below).
|
|
.SS comment
|
|
.PP
|
|
Finally, \f[C]commentN\f[R] sets a comment on the Nth posting.
|
|
Comments can also contain tags, as usual.
|
|
.PP
|
|
See TIPS below for more about setting amounts and currency.
|
|
.SS field assignment
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
HLEDGERFIELDNAME FIELDVALUE
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Instead of or in addition to a fields list, you can use a \[dq]field
|
|
assignment\[dq] rule to set the value of a single hledger field, by
|
|
writing its name (any of the standard hledger field names above)
|
|
followed by a text value.
|
|
The value may contain interpolated CSV fields, referenced by their
|
|
1-based position in the CSV record (\f[C]%N\f[R]), or by the name they
|
|
were given in the fields list (\f[C]%CSVFIELDNAME\f[R]).
|
|
Some examples:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# set the amount to the 4th CSV field, with \[dq] USD\[dq] appended
|
|
amount %4 USD
|
|
|
|
# combine three fields to make a comment, containing note: and date: tags
|
|
comment note: %somefield - %anotherfield, date: %1
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Interpolation strips outer whitespace (so a CSV value like
|
|
\f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated) (#1051).
|
|
See TIPS below for more about referencing other fields.
|
|
.SS \f[C]separator\f[R]
|
|
.PP
|
|
You can use the \f[C]separator\f[R] rule to read other kinds of
|
|
character-separated data.
|
|
The argument is any single separator character, or the words
|
|
\f[C]tab\f[R] or \f[C]space\f[R] (case insensitive).
|
|
Eg, for comma-separated values (CSV):
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
separator ,
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
or for semicolon-separated values (SSV):
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
separator ;
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
or for tab-separated values (TSV):
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
separator TAB
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
If the input file has a \f[C].csv\f[R], \f[C].ssv\f[R] or \f[C].tsv\f[R]
|
|
file extension (or a \f[C]csv:\f[R], \f[C]ssv:\f[R], \f[C]tsv:\f[R]
|
|
prefix), the appropriate separator will be inferred automatically, and
|
|
you won\[aq]t need this rule.
|
|
.SS \f[C]if\f[R] block
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
if MATCHER
|
|
RULE
|
|
|
|
if
|
|
MATCHER
|
|
MATCHER
|
|
MATCHER
|
|
RULE
|
|
RULE
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Conditional blocks (\[dq]if blocks\[dq]) are a block of rules that are
|
|
applied only to CSV records which match certain patterns.
|
|
They are often used for customising account names based on transaction
|
|
descriptions.
|
|
.SS Matching the whole record
|
|
.PP
|
|
Each MATCHER can be a record matcher, which looks like this:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
REGEX
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
REGEX is a case-insensitive regular expression which tries to match
|
|
anywhere within the CSV record.
|
|
It is a POSIX ERE (extended regular expression) that also supports GNU
|
|
word boundaries (\f[C]\[rs]b\f[R], \f[C]\[rs]B\f[R], \f[C]\[rs]<\f[R],
|
|
\f[C]\[rs]>\f[R]), and nothing else.
|
|
If you have trouble, be sure to check our
|
|
https://hledger.org/hledger.html#regular-expressions doc.
|
|
.PP
|
|
Important note: the record that is matched is not the original record,
|
|
but a synthetic one, with any enclosing double quotes (but not enclosing
|
|
whitespace) removed, and always comma-separated (which means that a
|
|
field containing a comma will appear like two fields).
|
|
Eg, if the original record is
|
|
\f[C]2020-01-01; \[dq]Acme, Inc.\[dq]; 1,000\f[R], the REGEX will
|
|
actually see \f[C]2020-01-01,Acme, Inc., 1,000\f[R]).
|
|
.SS Matching individual fields
|
|
.PP
|
|
Or, MATCHER can be a field matcher, like this:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
%CSVFIELD REGEX
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
which matches just the content of a particular CSV field.
|
|
CSVFIELD is a percent sign followed by the field\[aq]s name or column
|
|
number, like \f[C]%date\f[R] or \f[C]%1\f[R].
|
|
.SS Combining matchers
|
|
.PP
|
|
A single matcher can be written on the same line as the \[dq]if\[dq]; or
|
|
multiple matchers can be written on the following lines, non-indented.
|
|
Multiple matchers are OR\[aq]d (any one of them can match), unless one
|
|
begins with an \f[C]&\f[R] symbol, in which case it is AND\[aq]ed with
|
|
the previous matcher.
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
if
|
|
MATCHER
|
|
& MATCHER
|
|
RULE
|
|
\f[R]
|
|
.fi
|
|
.SS Rules applied on successful match
|
|
.PP
|
|
After the patterns there should be one or more rules to apply, all
|
|
indented by at least one space.
|
|
Three kinds of rule are allowed in conditional blocks:
|
|
.IP \[bu] 2
|
|
field assignments (to set a hledger field)
|
|
.IP \[bu] 2
|
|
skip (to skip the matched CSV record)
|
|
.IP \[bu] 2
|
|
end (to skip all remaining CSV records).
|
|
.PP
|
|
Examples:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# if the CSV record contains \[dq]groceries\[dq], set account2 to \[dq]expenses:groceries\[dq]
|
|
if groceries
|
|
account2 expenses:groceries
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# if the CSV record contains any of these patterns, set account2 and comment as shown
|
|
if
|
|
monthly service fee
|
|
atm transaction fee
|
|
banking thru software
|
|
account2 expenses:business:banking
|
|
comment XXX deductible ? check it
|
|
\f[R]
|
|
.fi
|
|
.SS \f[C]if\f[R] table
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
if,CSVFIELDNAME1,CSVFIELDNAME2,...,CSVFIELDNAMEn
|
|
MATCHER1,VALUE11,VALUE12,...,VALUE1n
|
|
MATCHER2,VALUE21,VALUE22,...,VALUE2n
|
|
MATCHER3,VALUE31,VALUE32,...,VALUE3n
|
|
<empty line>
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Conditional tables (\[dq]if tables\[dq]) are a different syntax to
|
|
specify field assignments that will be applied only to CSV records which
|
|
match certain patterns.
|
|
.PP
|
|
MATCHER could be either field or record matcher, as described above.
|
|
When MATCHER matches, values from that row would be assigned to the CSV
|
|
fields named on the \f[C]if\f[R] line, in the same order.
|
|
.PP
|
|
Therefore \f[C]if\f[R] table is exactly equivalent to a sequence of of
|
|
\f[C]if\f[R] blocks:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
if MATCHER1
|
|
CSVFIELDNAME1 VALUE11
|
|
CSVFIELDNAME2 VALUE12
|
|
...
|
|
CSVFIELDNAMEn VALUE1n
|
|
|
|
if MATCHER2
|
|
CSVFIELDNAME1 VALUE21
|
|
CSVFIELDNAME2 VALUE22
|
|
...
|
|
CSVFIELDNAMEn VALUE2n
|
|
|
|
if MATCHER3
|
|
CSVFIELDNAME1 VALUE31
|
|
CSVFIELDNAME2 VALUE32
|
|
...
|
|
CSVFIELDNAMEn VALUE3n
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Each line starting with MATCHER should contain enough (possibly empty)
|
|
values for all the listed fields.
|
|
.PP
|
|
Rules would be checked and applied in the order they are listed in the
|
|
table and, like with \f[C]if\f[R] blocks, later rules (in the same or
|
|
another table) or \f[C]if\f[R] blocks could override the effect of any
|
|
rule.
|
|
.PP
|
|
Instead of \[aq],\[aq] you can use a variety of other non-alphanumeric
|
|
characters as a separator.
|
|
First character after \f[C]if\f[R] is taken to be the separator for the
|
|
rest of the table.
|
|
It is the responsibility of the user to ensure that separator does not
|
|
occur inside MATCHERs and values - there is no way to escape separator.
|
|
.PP
|
|
Example:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
if,account2,comment
|
|
atm transaction fee,expenses:business:banking,deductible? check it
|
|
%description groceries,expenses:groceries,
|
|
2020/01/12.*Plumbing LLC,expenses:house:upkeep,emergency plumbing call-out
|
|
\f[R]
|
|
.fi
|
|
.SS \f[C]end\f[R]
|
|
.PP
|
|
This rule can be used inside if blocks (only), to make hledger stop
|
|
reading this CSV file and move on to the next input file, or to command
|
|
execution.
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# ignore everything following the first empty record
|
|
if ,,,,
|
|
end
|
|
\f[R]
|
|
.fi
|
|
.SS \f[C]date-format\f[R]
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
date-format DATEFMT
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
This is a helper for the \f[C]date\f[R] (and \f[C]date2\f[R]) fields.
|
|
If your CSV dates are not formatted like \f[C]YYYY-MM-DD\f[R],
|
|
\f[C]YYYY/MM/DD\f[R] or \f[C]YYYY.MM.DD\f[R], you\[aq]ll need to add a
|
|
date-format rule describing them with a strptime date parsing pattern,
|
|
which must parse the CSV date value completely.
|
|
Some examples:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# MM/DD/YY
|
|
date-format %m/%d/%y
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# D/M/YYYY
|
|
# The - makes leading zeros optional.
|
|
date-format %-d/%-m/%Y
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# YYYY-Mmm-DD
|
|
date-format %Y-%h-%d
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# M/D/YYYY HH:MM AM some other junk
|
|
# Note the time and junk must be fully parsed, though only the date is used.
|
|
date-format %-m/%-d/%Y %l:%M %p some other junk
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
For the supported strptime syntax, see:
|
|
.PD 0
|
|
.P
|
|
.PD
|
|
https://hackage.haskell.org/package/time/docs/Data-Time-Format.html#v:formatTime
|
|
.SS \f[C]decimal-mark\f[R]
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
decimal-mark .
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
or:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
decimal-mark ,
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
hledger automatically accepts either period or comma as a decimal mark
|
|
when parsing numbers (cf Amounts).
|
|
However if any numbers in the CSV contain digit group marks, such as
|
|
thousand-separating commas, you should declare the decimal mark
|
|
explicitly with this rule, to avoid misparsed numbers.
|
|
.SS \f[C]newest-first\f[R]
|
|
.PP
|
|
hledger always sorts the generated transactions by date.
|
|
Transactions on the same date should appear in the same order as their
|
|
CSV records, as hledger can usually auto-detect whether the CSV\[aq]s
|
|
normal order is oldest first or newest first.
|
|
But if all of the following are true:
|
|
.IP \[bu] 2
|
|
the CSV might sometimes contain just one day of data (all records having
|
|
the same date)
|
|
.IP \[bu] 2
|
|
the CSV records are normally in reverse chronological order (newest at
|
|
the top)
|
|
.IP \[bu] 2
|
|
and you care about preserving the order of same-day transactions
|
|
.PP
|
|
then, you should add the \f[C]newest-first\f[R] rule as a hint.
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# tell hledger explicitly that the CSV is normally newest first
|
|
newest-first
|
|
\f[R]
|
|
.fi
|
|
.SS \f[C]include\f[R]
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
include RULESFILE
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
This includes the contents of another CSV rules file at this point.
|
|
\f[C]RULESFILE\f[R] is an absolute file path or a path relative to the
|
|
current file\[aq]s directory.
|
|
This can be useful for sharing common rules between several rules files,
|
|
eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# someaccount.csv.rules
|
|
|
|
## someaccount-specific rules
|
|
fields date,description,amount
|
|
account1 assets:someaccount
|
|
account2 expenses:misc
|
|
|
|
## common rules
|
|
include categorisation.rules
|
|
\f[R]
|
|
.fi
|
|
.SS \f[C]balance-type\f[R]
|
|
.PP
|
|
Balance assertions generated by assigning to balanceN are of the simple
|
|
\f[C]=\f[R] type by default, which is a single-commodity,
|
|
subaccount-excluding assertion.
|
|
You may find the subaccount-including variants more useful, eg if you
|
|
have created some virtual subaccounts of checking to help with
|
|
budgeting.
|
|
You can select a different type of assertion with the
|
|
\f[C]balance-type\f[R] rule:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# balance assertions will consider all commodities and all subaccounts
|
|
balance-type ==*
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Here are the balance assertion types for quick reference:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
= single commodity, exclude subaccounts
|
|
=* single commodity, include subaccounts
|
|
== multi commodity, exclude subaccounts
|
|
==* multi commodity, include subaccounts
|
|
\f[R]
|
|
.fi
|
|
.SH TIPS
|
|
.SS Rapid feedback
|
|
.PP
|
|
It\[aq]s a good idea to get rapid feedback while
|
|
creating/troubleshooting CSV rules.
|
|
Here\[aq]s a good way, using entr from http://eradman.com/entrproject :
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
$ ls foo.csv* | entr bash -c \[aq]echo ----; hledger -f foo.csv print desc:SOMEDESC\[aq]
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
A desc: query (eg) is used to select just one, or a few, transactions of
|
|
interest.
|
|
\[dq]bash -c\[dq] is used to run multiple commands, so we can echo a
|
|
separator each time the command re-runs, making it easier to read the
|
|
output.
|
|
.SS Valid CSV
|
|
.PP
|
|
hledger accepts CSV conforming to RFC 4180.
|
|
When CSV values are enclosed in quotes, note:
|
|
.IP \[bu] 2
|
|
they must be double quotes (not single quotes)
|
|
.IP \[bu] 2
|
|
spaces outside the quotes are not allowed
|
|
.SS File Extension
|
|
.PP
|
|
To help hledger identify the format and show the right error messages,
|
|
CSV/SSV/TSV files should normally be named with a \f[C].csv\f[R],
|
|
\f[C].ssv\f[R] or \f[C].tsv\f[R] filename extension.
|
|
Or, the file path should be prefixed with \f[C]csv:\f[R], \f[C]ssv:\f[R]
|
|
or \f[C]tsv:\f[R].
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
$ hledger -f foo.ssv print
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
or:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
$ cat foo | hledger -f ssv:- foo
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
You can override the file extension with a separator rule if needed.
|
|
See also: Input files in the hledger manual.
|
|
.SS Reading multiple CSV files
|
|
.PP
|
|
If you use multiple \f[C]-f\f[R] options to read multiple CSV files at
|
|
once, hledger will look for a correspondingly-named rules file for each
|
|
CSV file.
|
|
But if you use the \f[C]--rules-file\f[R] option, that rules file will
|
|
be used for all the CSV files.
|
|
.SS Valid transactions
|
|
.PP
|
|
After reading a CSV file, hledger post-processes and validates the
|
|
generated journal entries as it would for a journal file - balancing
|
|
them, applying balance assignments, and canonicalising amount styles.
|
|
Any errors at this stage will be reported in the usual way, displaying
|
|
the problem entry.
|
|
.PP
|
|
There is one exception: balance assertions, if you have generated them,
|
|
will not be checked, since normally these will work only when the CSV
|
|
data is part of the main journal.
|
|
If you do need to check balance assertions generated from CSV right
|
|
away, pipe into another hledger:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
$ hledger -f file.csv print | hledger -f- print
|
|
\f[R]
|
|
.fi
|
|
.SS Deduplicating, importing
|
|
.PP
|
|
When you download a CSV file periodically, eg to get your latest bank
|
|
transactions, the new file may overlap with the old one, containing some
|
|
of the same records.
|
|
.PP
|
|
The import command will (a) detect the new transactions, and (b) append
|
|
just those transactions to your main journal.
|
|
It is idempotent, so you don\[aq]t have to remember how many times you
|
|
ran it or with which version of the CSV.
|
|
(It keeps state in a hidden \f[C].latest.FILE.csv\f[R] file.) This is
|
|
the easiest way to import CSV data.
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# download the latest CSV files, then run this command.
|
|
# Note, no -f flags needed here.
|
|
$ hledger import *.csv [--dry]
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
This method works for most CSV files.
|
|
(Where records have a stable chronological order, and new records appear
|
|
only at the new end.)
|
|
.PP
|
|
A number of other tools and workflows, hledger-specific and otherwise,
|
|
exist for converting, deduplicating, classifying and managing CSV data.
|
|
See:
|
|
.IP \[bu] 2
|
|
https://hledger.org -> sidebar -> real world setups
|
|
.IP \[bu] 2
|
|
https://plaintextaccounting.org -> data import/conversion
|
|
.SS Setting amounts
|
|
.PP
|
|
A posting amount can be set in one of these ways:
|
|
.IP \[bu] 2
|
|
by assigning (with a fields list or field assignment) to
|
|
\f[C]amountN\f[R] (posting N\[aq]s amount) or \f[C]amount\f[R] (posting
|
|
1\[aq]s amount)
|
|
.IP \[bu] 2
|
|
by assigning to \f[C]amountN-in\f[R] and \f[C]amountN-out\f[R] (or
|
|
\f[C]amount-in\f[R] and \f[C]amount-out\f[R]).
|
|
For each CSV record, whichever of these has a non-zero value will be
|
|
used, with appropriate sign.
|
|
If both contain a non-zero value, this may not work.
|
|
.IP \[bu] 2
|
|
by assigning to \f[C]balanceN\f[R] (or \f[C]balance\f[R]) instead of the
|
|
above, setting the amount indirectly via a balance assignment.
|
|
If you do this the default account name may be wrong, so you should set
|
|
that explicitly.
|
|
.PP
|
|
There is some special handling for an amount\[aq]s sign:
|
|
.IP \[bu] 2
|
|
If an amount value is parenthesised, it will be de-parenthesised and
|
|
sign-flipped.
|
|
.IP \[bu] 2
|
|
If an amount value begins with a double minus sign, those cancel out and
|
|
are removed.
|
|
.IP \[bu] 2
|
|
If an amount value begins with a plus sign, that will be removed
|
|
.SS Setting currency/commodity
|
|
.PP
|
|
If the currency/commodity symbol is included in the CSV\[aq]s amount
|
|
field(s):
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
2020-01-01,foo,$123.00
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
you don\[aq]t have to do anything special for the commodity symbol, it
|
|
will be assigned as part of the amount.
|
|
Eg:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
fields date,description,amount
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
2020-01-01 foo
|
|
expenses:unknown $123.00
|
|
income:unknown $-123.00
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
If the currency is provided as a separate CSV field:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
2020-01-01,foo,USD,123.00
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
You can assign that to the \f[C]currency\f[R] pseudo-field, which has
|
|
the special effect of prepending itself to every amount in the
|
|
transaction (on the left, with no separating space):
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
fields date,description,currency,amount
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
2020-01-01 foo
|
|
expenses:unknown USD123.00
|
|
income:unknown USD-123.00
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Or, you can use a field assignment to construct the amount yourself,
|
|
with more control.
|
|
Eg to put the symbol on the right, and separated by a space:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
fields date,description,cur,amt
|
|
amount %amt %cur
|
|
\f[R]
|
|
.fi
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
2020-01-01 foo
|
|
expenses:unknown 123.00 USD
|
|
income:unknown -123.00 USD
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Note we used a temporary field name (\f[C]cur\f[R]) that is not
|
|
\f[C]currency\f[R] - that would trigger the prepending effect, which we
|
|
don\[aq]t want here.
|
|
.SS Referencing other fields
|
|
.PP
|
|
In field assignments, you can interpolate only CSV fields, not hledger
|
|
fields.
|
|
In the example below, there\[aq]s both a CSV field and a hledger field
|
|
named amount1, but %amount1 always means the CSV field, not the hledger
|
|
field:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
# Name the third CSV field \[dq]amount1\[dq]
|
|
fields date,description,amount1
|
|
|
|
# Set hledger\[aq]s amount1 to the CSV amount1 field followed by USD
|
|
amount1 %amount1 USD
|
|
|
|
# Set comment to the CSV amount1 (not the amount1 assigned above)
|
|
comment %amount1
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
Here, since there\[aq]s no CSV amount1 field, %amount1 will produce a
|
|
literal \[dq]amount1\[dq]:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
fields date,description,csvamount
|
|
amount1 %csvamount USD
|
|
# Can\[aq]t interpolate amount1 here
|
|
comment %amount1
|
|
\f[R]
|
|
.fi
|
|
.PP
|
|
When there are multiple field assignments to the same hledger field,
|
|
only the last one takes effect.
|
|
Here, comment\[aq]s value will be be B, or C if \[dq]something\[dq] is
|
|
matched, but never A:
|
|
.IP
|
|
.nf
|
|
\f[C]
|
|
comment A
|
|
comment B
|
|
if something
|
|
comment C
|
|
\f[R]
|
|
.fi
|
|
.SS How CSV rules are evaluated
|
|
.PP
|
|
Here\[aq]s how to think of CSV rules being evaluated (if you really need
|
|
to).
|
|
First,
|
|
.IP \[bu] 2
|
|
\f[C]include\f[R] - all includes are inlined, from top to bottom, depth
|
|
first.
|
|
(At each include point the file is inlined and scanned for further
|
|
includes, recursively, before proceeding.)
|
|
.PP
|
|
Then \[dq]global\[dq] rules are evaluated, top to bottom.
|
|
If a rule is repeated, the last one wins:
|
|
.IP \[bu] 2
|
|
\f[C]skip\f[R] (at top level)
|
|
.IP \[bu] 2
|
|
\f[C]date-format\f[R]
|
|
.IP \[bu] 2
|
|
\f[C]newest-first\f[R]
|
|
.IP \[bu] 2
|
|
\f[C]fields\f[R] - names the CSV fields, optionally sets up initial
|
|
assignments to hledger fields
|
|
.PP
|
|
Then for each CSV record in turn:
|
|
.IP \[bu] 2
|
|
test all \f[C]if\f[R] blocks.
|
|
If any of them contain a \f[C]end\f[R] rule, skip all remaining CSV
|
|
records.
|
|
Otherwise if any of them contain a \f[C]skip\f[R] rule, skip that many
|
|
CSV records.
|
|
If there are multiple matched \f[C]skip\f[R] rules, the first one wins.
|
|
.IP \[bu] 2
|
|
collect all field assignments at top level and in matched \f[C]if\f[R]
|
|
blocks.
|
|
When there are multiple assignments for a field, keep only the last one.
|
|
.IP \[bu] 2
|
|
compute a value for each hledger field - either the one that was
|
|
assigned to it (and interpolate the %CSVFIELDNAME references), or a
|
|
default
|
|
.IP \[bu] 2
|
|
generate a synthetic hledger transaction from these values.
|
|
.PP
|
|
This is all part of the CSV reader, one of several readers hledger can
|
|
use to parse input files.
|
|
When all files have been read successfully, the transactions are passed
|
|
as input to whichever hledger command the user specified.
|
|
|
|
|
|
.SH "REPORTING BUGS"
|
|
Report bugs at http://bugs.hledger.org
|
|
(or on the #hledger IRC channel or hledger mail list)
|
|
|
|
.SH AUTHORS
|
|
Simon Michael <simon@joyful.com> and contributors
|
|
|
|
.SH COPYRIGHT
|
|
|
|
Copyright (C) 2007-2020 Simon Michael.
|
|
.br
|
|
Released under GNU GPL v3 or later.
|
|
|
|
.SH SEE ALSO
|
|
hledger(1), hledger\-ui(1), hledger\-web(1),
|
|
hledger_csv(5), hledger_journal(5), hledger_timeclock(5), hledger_timedot(5),
|
|
ledger(1)
|