journal: re-add non-regex aliases, as default (#252)

The regex account aliases added in 0.24 trip up people switching between
hledger and Ledger. (They are also currently slow.)

This change makes the old non-regex aliases the default; they are
unsurprising, useful, and pretty close in functionality to Ledger's.
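
A minimal sketch of what a basic (non-regex) alias does, following the behaviour the manual text in this commit describes: the old name is replaced when it is the whole account name or a leading run of whole components. The standalone `BasicAlias` type and the names `isWholePrefixOf`, `applyBasic` and `applyAll` are illustrative assumptions, not hledger's API; the commit's real logic lives in `aliasReplace`.

```haskell
import Data.List (foldl', isPrefixOf)

type AccountName = String

-- Simplified stand-in for the commit's AccountAlias type:
-- only the basic (non-regex) variant is modelled here.
data BasicAlias = BasicAlias AccountName AccountName

-- True if `old` is the whole account name, or a leading run of
-- whole colon-separated components of it.
isWholePrefixOf :: AccountName -> AccountName -> Bool
old `isWholePrefixOf` a = old == a || (old ++ ":") `isPrefixOf` a

-- Apply one basic alias: swap the matching prefix, keep the rest.
applyBasic :: BasicAlias -> AccountName -> AccountName
applyBasic (BasicAlias old new) a
  | old `isWholePrefixOf` a = new ++ drop (length old) a
  | otherwise               = a

-- Apply aliases in sequence; each one sees the previous result.
applyAll :: [BasicAlias] -> AccountName -> AccountName
applyAll aliases a = foldl' (flip applyBasic) a aliases

main :: IO ()
main = do
  let aliases =
        [ BasicAlias "checking" "assets:bank:checking"
        , BasicAlias "a:b" "A:B"
        ]
  mapM_ (putStrLn . applyAll aliases) ["checking:a", "a:b:c", "a:bb:d"]
  -- prints:
  --   assets:bank:checking:a
  --   A:B:c
  --   a:bb:d
```

The last input is left alone because `a:b` is not a whole-component prefix of `a:bb:d`, which mirrors the "matches whole account name components only" test added in this commit.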

The new regex aliases are also available; they must be enclosed in
forward slashes. Ledger effectively ignores these, which is ok.

Also clarify docs, refactor, and use the same parser for alias
directives and alias options.
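
The idea of one parser serving both the `alias` directive and the `--alias` option can be sketched with plain string functions. This is only an illustration: `parseAlias` and `strip` are hypothetical helpers, whereas the commit itself uses Parsec parsers (`accountaliasp`, `regexaliasp`, `basicaliasp`). Unlike those, this sketch trims whitespace on both sides of each part and returns `Nothing` rather than a parse error.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Same shape as the AccountAlias type introduced in this commit,
-- with String standing in for Regexp and Replacement.
data AccountAlias
  = BasicAlias String String
  | RegexAlias String String
  deriving (Eq, Show)

-- Trim leading and trailing whitespace (hypothetical helper).
strip :: String -> String
strip = dropWhileEnd isSpace . dropWhile isSpace

-- Parse the text after "alias " or after "--alias ".
-- A leading slash selects the regex form; anything else is basic.
parseAlias :: String -> Maybe AccountAlias
parseAlias s = case strip s of
  '/' : rest ->
    case break (== '/') rest of
      (re@(_ : _), '/' : afterRe) ->
        case dropWhile isSpace afterRe of
          '=' : repl -> Just (RegexAlias re (strip repl))
          _          -> Nothing
      _ -> Nothing
  other ->
    case break (== '=') other of
      (old@(_ : _), '=' : new) -> Just (BasicAlias (strip old) (strip new))
      _                        -> Nothing

main :: IO ()
main = mapM_ (print . parseAlias)
  [ "checking = assets:bank:checking"  -- journal directive style
  , "/^a/=B"                           -- command-line option style
  , "noequals"                         -- rejected: no '=' separator
  ]
```

Because both entry points would call the same function, a directive and an option with the same text yield the same alias value.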
Simon Michael 2015-05-14 12:50:32 -07:00
parent 70d87613f2
commit 077e3c6a02
8 changed files with 167 additions and 86 deletions

@ -460,55 +460,77 @@ In [tag queries](manual#queries), remember the tag name must match exactly, whil
##### Account aliases
You can define account aliases to rewrite account names. For a quick example,
see [How to use account aliases](how-to-use-account-aliases.html).
You can define aliases which rewrite your account names (after reading the journal,
before generating reports). hledger's account aliases can be useful for:
In hledger, this feature is quite powerful and requires a little care.
It can be used for
- expanding shorthand account names to their full form, allowing easier data entry and a less verbose journal
- adapting old journals to your current chart of accounts
- experimenting with new account organisations, like a new hierarchy or combining two accounts into one
- customising reports
- expanding shorthand account names to their full form, so your entries require less typing
- adjusting old data to match your current chart of accounts, which tends to change over time
- experimenting with new account organisations
- massaging reports, both cosmetic changes and deeper ones ("combine these separate accounts into one")
See also [How to use account aliases](how-to-use-account-aliases.html).
An account alias can be defined on the command line:
```shell
$ hledger --alias 'REGEX=REPLACEMENT' balance
```
or with a directive in the journal file:
```
alias REGEX = REPLACEMENT
```
###### Basic aliases
To set an account alias, use the `alias` directive in your journal file.
This affects all subsequent journal entries in the current file or its
[included files](#including-other-files).
The spaces around the = are optional:
alias OLD = NEW
Or, you can use the `--alias` option on the command line.
This affects all entries. It's useful for trying out aliases interactively:
--alias 'OLD=NEW'
OLD and NEW are full account names.
hledger will replace any occurrence of the old account name with the
new one. Subaccounts are also affected. Eg:
alias checking = assets:bank:wells fargo:checking
# rewrites "checking" to "assets:bank:wells fargo:checking", or "checking:a" to "assets:bank:wells fargo:checking:a"
###### Regex aliases
There is also a more powerful variant that uses a regular expression,
indicated by the forward slashes. (This was the default behaviour in hledger 0.24-0.25):
alias /REGEX/ = REPLACEMENT
or:
--alias '/REGEX/=REPLACEMENT'
<!-- (Can also be written `'/REGEX/REPLACEMENT/'`). -->
REGEX is a case-insensitive regular expression. Anywhere it matches
inside an account name, the matched part will be replaced by
REPLACEMENT.
If REGEX contains parenthesised match groups, these can be referenced
by the usual numeric backreferences in REPLACEMENT.
Note, currently regular expression aliases may cause noticeable slow-downs.
(And if you use Ledger on your hledger file, they will be ignored.)
Eg:
alias ^expenses = equity:draw:personal
alias /^(.+):bank:([^:]+)(.*)/ = \1:\2 \3
# rewrites "assets:bank:wells fargo:checking" to "assets:wells fargo checking"
Spaces around the = are optional and ignored.
You can define as many aliases as you like.
###### Multiple aliases
Each alias is tested against each account name as those are read from the journal.
When REGEX (a case-insensitive regular expression) matches
anywhere within the account name, the matched part is replaced by
REPLACEMENT.
An alias can replace multiple matches in one account name.
REGEX can contain parenthesised match groups, and REPLACEMENT can
include these with a numeric backreference (like `\1`).
You can define as many aliases as you like using directives or command-line options.
Aliases are recursive - each alias sees the result of applying previous ones.
(This is different from Ledger, where aliases are non-recursive by default).
Aliases are applied in the following order:
An alias becomes active when it is read, and affects all entries
read after it. It will also affect the entries of any files [included](#including-other-files)
after it. It will not affect a parent file (aliases do not "leak"
upward). To forget all aliases defined to this point, use this
directive:
1. alias directives, most recently seen first (recent directives take precedence over earlier ones; directives not yet seen are ignored)
2. alias options, in the order they appear on the command line
###### end aliases
You can clear (forget) all currently defined aliases with the `end aliases` directive:
end aliases
Active aliases are applied in the order they were defined, and are
cumulative (each alias sees the result of applying the previous ones).
Account aliases changed significantly in hledger 0.24 and are
currently somewhat incompatible with Ledger's aliases, which do not
use regular expressions. They can also hurt performance.
##### Default commodity
You can set a default commodity, to be used for amounts without one.

@ -392,10 +392,10 @@ journalApplyAliases aliases j@Journal{jtxns=ts} =
-- else (dbgtrace $
-- "applying additional command-line aliases:\n"
-- ++ chomp (unlines $ map (" "++) $ lines $ ppShow aliases))) $
j{jtxns=map fixtransaction ts}
j{jtxns=map dotransaction ts}
where
fixtransaction t@Transaction{tpostings=ps} = t{tpostings=map fixposting ps}
fixposting p@Posting{paccount=a} = p{paccount=accountNameApplyAliases aliases a}
dotransaction t@Transaction{tpostings=ps} = t{tpostings=map doposting ps}
doposting p@Posting{paccount=a} = p{paccount= accountNameApplyAliases aliases a}
-- | Do post-parse processing on a journal to make it ready for use: check
-- all transactions balance, canonicalise amount formats, close any open

@ -37,7 +37,6 @@ module Hledger.Data.Posting (
joinAccountNames,
concatAccountNames,
accountNameApplyAliases,
accountNameApplyOneAlias,
-- * arithmetic
sumPostings,
-- * rendering
@ -219,22 +218,26 @@ concatAccountNames :: [AccountName] -> AccountName
concatAccountNames as = accountNameWithPostingType t $ intercalate ":" $ map accountNameWithoutPostingType as
where t = headDef RegularPosting $ filter (/= RegularPosting) $ map accountNamePostingType as
-- | Rewrite an account name using all applicable aliases from the given list, in sequence.
-- | Rewrite an account name using all matching aliases from the given list, in sequence.
-- Each alias sees the result of applying the previous aliases.
accountNameApplyAliases :: [AccountAlias] -> AccountName -> AccountName
accountNameApplyAliases aliases a = accountNameWithPostingType atype aname'
where
(aname,atype) = (accountNameWithoutPostingType a, accountNamePostingType a)
matchingaliases = filter (\(re,_) -> regexMatchesCI re aname) aliases
aname' = foldl (flip (uncurry regexReplaceCI)) aname matchingaliases
aname' = foldl
(\acct alias -> dbg6 "got" $ aliasReplace (dbg6 "alias" alias) acct)
aname
aliases
-- aliasMatches :: AccountAlias -> AccountName -> Bool
-- aliasMatches (BasicAlias old _) a = old `isAccountNamePrefixOf` a
-- aliasMatches (RegexAlias re _) a = regexMatchesCI re a
aliasReplace :: AccountAlias -> AccountName -> AccountName
aliasReplace (BasicAlias old new) a | old `isAccountNamePrefixOf` a = new ++ drop (length old) a
| otherwise = a
aliasReplace (RegexAlias re repl) a = regexReplaceCI re repl a
-- | Rewrite an account name using the first applicable alias from the given list, if any.
accountNameApplyOneAlias :: [AccountAlias] -> AccountName -> AccountName
accountNameApplyOneAlias aliases a = accountNameWithPostingType atype aname'
where
(aname,atype) = (accountNameWithoutPostingType a, accountNamePostingType a)
firstmatchingalias = headDef Nothing $ map Just $ filter (\(re,_) -> regexMatchesCI re aname) aliases
applyAlias = uncurry regexReplaceCI
aname' = maybe id applyAlias firstmatchingalias $ aname
tests_Hledger_Data_Posting = TestList [

@ -48,7 +48,16 @@ data Interval = NoInterval
type AccountName = String
type AccountAlias = (Regexp,Replacement)
data AccountAlias = BasicAlias AccountName AccountName
| RegexAlias Regexp Replacement
deriving (
Eq
,Read
,Show
,Ord
,Data
,Typeable
)
data Side = L | R deriving (Eq,Show,Read,Ord,Typeable,Data)

@ -25,6 +25,7 @@ module Hledger.Read (
mamountp',
numberp,
codep,
accountaliasp,
-- * Tests
samplejournal,
tests_Hledger_Read,

@ -37,7 +37,8 @@ module Hledger.Read.JournalReader (
mamountp',
numberp,
emptyorcommentlinep,
followingcommentp
followingcommentp,
accountaliasp
#ifdef TESTS
-- * Tests
-- disabled by default, HTF not available on windows
@ -243,13 +244,34 @@ aliasdirective :: ParsecT [Char] JournalContext (ExceptT String IO) JournalUpdat
aliasdirective = do
string "alias"
many1 spacenonewline
orig <- many1 $ noneOf "="
char '='
alias <- restofline
addAccountAlias (accountNameWithoutPostingType $ strip orig
,accountNameWithoutPostingType $ strip alias)
alias <- accountaliasp
addAccountAlias alias
return $ return id
accountaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
accountaliasp = regexaliasp <|> basicaliasp
basicaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
basicaliasp = do
-- pdbg 0 "basicaliasp"
old <- rstrip <$> (many1 $ noneOf "=")
char '='
many spacenonewline
new <- rstrip <$> anyChar `manyTill` eolof -- don't require a final newline, good for cli options
return $ BasicAlias old new
regexaliasp :: Stream [Char] m Char => ParsecT [Char] st m AccountAlias
regexaliasp = do
-- pdbg 0 "regexaliasp"
char '/'
re <- many1 $ noneOf "/\n\r" -- paranoid: don't try to read past line end
char '/'
many spacenonewline
char '='
many spacenonewline
repl <- rstrip <$> anyChar `manyTill` eolof
return $ RegexAlias re repl
endaliasesdirective :: ParsecT [Char] JournalContext (ExceptT String IO) JournalUpdate
endaliasesdirective = do
string "end aliases"

@ -360,17 +360,9 @@ getCliOpts mode' = do
-- CliOpts accessors
-- | Get the account name aliases from options, if any.
aliasesFromOpts :: CliOpts -> [(AccountName,AccountName)]
aliasesFromOpts = map parseAlias . alias_
where
-- similar to ledgerAlias
parseAlias :: String -> (AccountName,AccountName)
parseAlias s = (accountNameWithoutPostingType $ strip orig
,accountNameWithoutPostingType $ strip alias')
where
(orig, alias) = break (=='=') s
alias' = case alias of ('=':rest) -> rest
_ -> orig
aliasesFromOpts :: CliOpts -> [AccountAlias]
aliasesFromOpts = map (\a -> fromparse $ runParser accountaliasp () ("--alias "++quoteIfNeeded a) a)
. alias_
-- | Get the (tilde-expanded, absolute) journal file path from
-- 1. options, 2. an environment variable, or 3. the default.

@ -1,22 +1,54 @@
# alias-related tests
# 1. alias directive. The pattern is a case-insensitive regular
# expression matching anywhere in the account name. All matching
# aliases will be applied to an account name in turn, most recently
# declared first. The replacement can replace multiple matches within
# the account name. The replacement pattern supports numeric
# backreferences.
# . simple alias directive
hledgerdev -f- accounts
<<<
alias checking = assets:bank:checking
1/1
(checking:a) 1
>>>
assets:bank:checking:a
>>>=0
# . simple alias matches whole account name components only
hledgerdev -f- accounts
<<<
alias a:b = A:B
1/1
(a:b:c) 1 ; should match this
1/1
(a:bb:d) 1 ; should not match this
>>>
A:B:c
a:bb:d
>>>=0
# . regex alias directive
hledgerdev -f- accounts
<<<
alias /^(.+):bank:([^:]+):?(.*)/ = \1:\2 \3
1/1
(assets:bank:B:checking:a) 1
>>>
assets:B checking:a
>>>=0
# . regex alias pattern is a case-insensitive regular expression
# matching anywhere in the account name. All matching aliases are
# applied to an account name in turn, most recently seen first. The
# replacement can replace multiple matches within the account name.
# The replacement pattern supports numeric backreferences.
#
hledgerdev -f- print
<<<
alias a=b
alias /a/ = b
2011/01/01
A a 1
a a 2
c
alias A (.)=\1
alias /A (.)/=\1
2011/01/01
A a 1
@ -36,10 +68,10 @@ alias A (.)=\1
>>>=0
# 2. command-line --alias option. These are applied in the order
# written. Spaces are allowed if quoted.
# . --alias command-line options are applied in the order written.
# Spaces are allowed if quoted.
#
hledgerdev -f- print --alias 'A (.)=a' --alias a=b
hledgerdev -f- print --alias '/A (.)/=a' --alias /a/=b
<<<
2011/01/01
a a 1
@ -54,12 +86,12 @@ hledgerdev -f- print --alias 'A (.)=a' --alias a=b
>>>=0
# 3. Alias options run after alias directives.
# . alias options are applied after alias directives.
#
hledgerdev -f- print --alias a=A --alias B=C --alias B=D --alias C=D
hledgerdev -f- print --alias /a/=A --alias /B/=C --alias /B/=D --alias /C/=D
<<<
alias ^a=B
alias ^a=E
alias /^a/=B
alias /^a/=E
alias E=F
2011/01/01