.TH "hledger_csv" "5" "September 2019" "hledger 1.15.99" "hledger User Manuals"
.SH NAME
.PP
CSV - how hledger reads CSV data, and the CSV rules file format
.SH DESCRIPTION
.PP
hledger can read CSV (comma-separated value) files as if they were journal files, automatically converting each CSV record into a transaction.
(To learn about \f[I]writing\f[R] CSV, see CSV output.)
.PP
Converting CSV to transactions requires some special conversion rules.
These do several things:
.IP \[bu] 2
they describe the layout and format of the CSV data
.IP \[bu] 2
they can customize the generated journal entries using a simple templating language
.IP \[bu] 2
they can add refinements based on patterns in the CSV data, eg categorizing transactions with more detailed account names.
.PP
When reading a CSV file named \f[C]FILE.csv\f[R], hledger looks for a conversion rules file named \f[C]FILE.csv.rules\f[R] in the same directory.
You can override this with the \f[C]--rules-file\f[R] option.
If the rules file does not exist, hledger will auto-create one with some example rules, which you\[aq]ll need to adjust.
.PP
At minimum, the rules file must identify the date and amount fields.
It\[aq]s often also necessary to specify the date format and the number of header lines to skip.
Eg:
.IP
.nf
\f[C]
fields date, _, _, amount
date-format %d/%m/%Y
skip 1
\f[R]
.fi
.PP
A more complete example:
.IP
.nf
\f[C]
# hledger CSV rules for amazon.com order history

# sample:
# \[dq]Date\[dq],\[dq]Type\[dq],\[dq]To/From\[dq],\[dq]Name\[dq],\[dq]Status\[dq],\[dq]Amount\[dq],\[dq]Fees\[dq],\[dq]Transaction ID\[dq]
# \[dq]Jul 29, 2012\[dq],\[dq]Payment\[dq],\[dq]To\[dq],\[dq]Adapteva, Inc.\[dq],\[dq]Completed\[dq],\[dq]$25.00\[dq],\[dq]$0.00\[dq],\[dq]17LA58JSK6PRD4HDGLNJQPI1PB9N8DKPVHL\[dq]

# skip one header line
skip 1

# name the csv fields (and assign the transaction\[aq]s date, amount and code)
fields date, _, toorfrom, name, amzstatus, amount, fees, code

# how to parse the date
date-format %b %-d, %Y

# combine two fields to make the description
description %toorfrom %name

# save these fields as tags
comment status:%amzstatus, fees:%fees

# set the base account for all transactions
account1 assets:amazon

# flip the sign on the amount
amount -%amount
\f[R]
.fi
.PP
For more examples, see Convert CSV files.
.SH CSV RULES
.PP
The following seven kinds of rule can appear in the rules file, in any order.
Blank lines and lines beginning with \f[C]#\f[R] or \f[C];\f[R] are ignored.
.SS skip
.PP
\f[C]skip\f[R] \f[I]\f[CI]N\f[I]\f[R]
.PP
Skip this number of CSV records at the beginning.
You\[aq]ll need this whenever your CSV data contains header lines.
Eg:
.IP
.nf
\f[C]
# ignore the first CSV line
skip 1
\f[R]
.fi
.SS date-format
.PP
\f[C]date-format\f[R] \f[I]\f[CI]DATEFMT\f[I]\f[R]
.PP
When your CSV date fields are not formatted like \f[C]YYYY/MM/DD\f[R] (or \f[C]YYYY-MM-DD\f[R] or \f[C]YYYY.MM.DD\f[R]), you\[aq]ll need to specify the format.
DATEFMT is a strptime-like date parsing pattern, which must parse the date field values completely.
Examples:
.IP
.nf
\f[C]
# for dates like \[dq]11/06/2013\[dq]:
date-format %m/%d/%Y
\f[R]
.fi
.IP
.nf
\f[C]
# for dates like \[dq]6/11/2013\[dq] (note the - to make leading zeros optional):
date-format %-d/%-m/%Y
\f[R]
.fi
.IP
.nf
\f[C]
# for dates like \[dq]2013-Nov-06\[dq]:
date-format %Y-%h-%d
\f[R]
.fi
.IP
.nf
\f[C]
# for dates like \[dq]11/6/2013 11:32 PM\[dq]:
date-format %-m/%-d/%Y %l:%M %p
\f[R]
.fi
.SS field list
.PP
\f[C]fields\f[R] \f[I]\f[CI]FIELDNAME1\f[I]\f[R], \f[I]\f[CI]FIELDNAME2\f[I]\f[R]...
.PP
This (a) names the CSV fields, in order (names may not contain whitespace; uninteresting names may be left blank), and (b) assigns them to journal entry fields if you use any of these standard field names: \f[C]date\f[R], \f[C]date2\f[R], \f[C]status\f[R], \f[C]code\f[R], \f[C]description\f[R], \f[C]comment\f[R], \f[C]account1\f[R], \f[C]account2\f[R], \f[C]amount\f[R], \f[C]amount-in\f[R], \f[C]amount-out\f[R], \f[C]currency\f[R], \f[C]balance\f[R], \f[C]balance1\f[R], \f[C]balance2\f[R].
Eg:
.IP
.nf
\f[C]
# use the 1st, 2nd and 4th CSV fields as the entry\[aq]s date, description and amount,
# and give the 7th and 8th fields meaningful names for later reference:
#
# CSV field:
#      1     2            3 4       5 6 7          8
# entry field:
fields date, description, , amount, , , somefield, anotherfield
\f[R]
.fi
.SS field assignment
.PP
\f[I]\f[CI]ENTRYFIELDNAME\f[I]\f[R] \f[I]\f[CI]FIELDVALUE\f[I]\f[R]
.PP
This sets a journal entry field (one of the standard names above) to the given text value, which can include CSV field values interpolated by name (\f[C]%CSVFIELDNAME\f[R]) or 1-based position (\f[C]%N\f[R]).
Eg:
.IP
.nf
\f[C]
# set the amount to the 4th CSV field with \[dq]USD \[dq] prepended
amount USD %4
\f[R]
.fi
.IP
.nf
\f[C]
# combine three fields to make a comment (containing two tags)
comment note: %somefield - %anotherfield, date: %1
\f[R]
.fi
.PP
Field assignments can be used instead of or in addition to a field list.
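.PP
As a minimal sketch of using field assignments without any field list, the rules below set each entry field by CSV position.
They assume a CSV file with one header line, whose 1st, 2nd and 4th fields hold the date (already formatted like \f[C]YYYY-MM-DD\f[R]), the description and the amount; the account name is hypothetical.
.IP
.nf
\f[C]
# no field list: entry fields are set by 1-based CSV position
skip 1
date %1
description %2
amount %4
# hypothetical account name, replace with your own
account1 assets:bank:checking
\f[R]
.fi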
.PP
Note, interpolation strips any outer whitespace, so a CSV value like \f[C]\[dq] 1 \[dq]\f[R] becomes \f[C]1\f[R] when interpolated (#1051).
.SS conditional block
.PP
\f[C]if\f[R] \f[I]\f[CI]PATTERN\f[I]\f[R]
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
.PP
\f[C]if\f[R]
.PD 0
.P
.PD
\f[I]\f[CI]PATTERN\f[I]\f[R]
.PD 0
.P
.PD
\f[I]\f[CI]PATTERN\f[I]\f[R]...
.PD 0
.P
.PD
\ \ \ \ \f[I]\f[CI]FIELDASSIGNMENTS\f[I]\f[R]...
.PP
This applies one or more field assignments, only to those CSV records matched by one of the PATTERNs.
The patterns are case-insensitive regular expressions which match anywhere within the whole CSV record (it\[aq]s not yet possible to match within a specific field).
When there are multiple patterns they can be written on separate lines, unindented.
The field assignments are on separate lines indented by at least one space.
Examples:
.IP
.nf
\f[C]
# if the CSV record contains \[dq]groceries\[dq], set account2 to \[dq]expenses:groceries\[dq]
if groceries
 account2 expenses:groceries
\f[R]
.fi
.IP
.nf
\f[C]
# if the CSV record contains any of these patterns, set account2 and comment as shown
if
monthly service fee
atm transaction fee
banking thru software
 account2 expenses:business:banking
 comment XXX deductible ? check it
\f[R]
.fi
.SS include
.PP
\f[C]include\f[R] \f[I]\f[CI]RULESFILE\f[I]\f[R]
.PP
Include another rules file at this point.
\f[C]RULESFILE\f[R] is either an absolute file path or a path relative to the current file\[aq]s directory.
Eg:
.IP
.nf
\f[C]
# rules reused with several CSV files
include common.rules
\f[R]
.fi
.SS newest-first
.PP
\f[C]newest-first\f[R]
.PP
Consider adding this rule if all of the following are true: you might be processing just one day of data, your CSV records are in reverse chronological order (newest first), and you care about preserving the order of same-day transactions.
It usually isn\[aq]t needed, because hledger autodetects the CSV order, but when all CSV records have the same date it will assume they are oldest first.
.SH CSV TIPS
.SS CSV ordering
.PP
The generated journal entries will be sorted by date.
The order of same-day entries will be preserved (except in the special case where you might need \f[C]newest-first\f[R], see above).
.SS CSV accounts
.PP
Each journal entry will have two postings, to \f[C]account1\f[R] and \f[C]account2\f[R] respectively.
It\[aq]s not yet possible to generate entries with more than two postings.
It\[aq]s conventional and recommended to use \f[C]account1\f[R] for the account whose CSV we are reading.
.SS CSV amounts
.PP
A transaction amount must be set, in one of these ways:
.IP \[bu] 2
with an \f[C]amount\f[R] field assignment, which sets the first posting\[aq]s amount
.IP \[bu] 2
(When the CSV has debit and credit amounts in separate fields:)
.PD 0
.P
.PD
with field assignments for the \f[C]amount-in\f[R] and \f[C]amount-out\f[R] pseudo fields (both of them).
Whichever one has a value will be used, with appropriate sign.
If both contain a value, the result may not be what you expect.
.IP \[bu] 2
or implicitly by means of a balance assignment (see below).
.PP
There is some special handling for sign in amounts:
.IP \[bu] 2
If an amount value is parenthesised, it will be de-parenthesised and sign-flipped.
.IP \[bu] 2
If an amount value begins with a double minus sign, those will cancel out and be removed.
.PP
If the currency/commodity symbol is provided as a separate CSV field, assign it to the \f[C]currency\f[R] pseudo field; the symbol will be prepended to the amount (TODO: when there is an amount).
Or, you can use an \f[C]amount\f[R] field assignment for more control, eg:
.IP
.nf
\f[C]
fields date,description,currency,amount
amount %amount %currency
\f[R]
.fi
.SS CSV balance assertions/assignments
.PP
If the CSV includes a running balance, you can assign that to one of the pseudo fields \f[C]balance\f[R] (or \f[C]balance1\f[R]) or \f[C]balance2\f[R].
This will generate a balance assertion (or if the amount is left empty, a balance assignment), on the first or second posting, whenever the running balance field is non-empty.
(TODO: #1000)
.SS Reading multiple CSV files
.PP
You can read multiple CSV files at once using multiple \f[C]-f\f[R] arguments on the command line, and hledger will look for a correspondingly-named rules file for each.
Note if you use the \f[C]--rules-file\f[R] option, this one rules file will be used for all the CSV files being read.
.SS Valid CSV
.PP
hledger follows RFC 4180, with the addition of a customisable separator character.
.PP
Some things to note:
.PP
When quoting fields,
.IP \[bu] 2
you must use double quotes, not single quotes
.IP \[bu] 2
spaces outside the quotes are not allowed.
.SH "REPORTING BUGS"
Report bugs at http://bugs.hledger.org
(or on the #hledger IRC channel or hledger mail list)
.SH AUTHORS
Simon Michael and contributors
.SH COPYRIGHT
Copyright (C) 2007-2019 Simon Michael.
.br
Released under GNU GPL v3 or later.
.SH SEE ALSO
hledger(1), hledger\-ui(1), hledger\-web(1), hledger\-api(1), hledger_csv(5), hledger_journal(5), hledger_timeclock(5), hledger_timedot(5), ledger(1)
.PP
http://hledger.org