Automatically repopulate the tokenizer/ directory with the PHP stub
and the postgresql module, when the directory is missing. This allows
to switch working directories and in particular run the service
from a different maschine then where it was installed.
Users still need to make sure that .env files are set up correctly
or they will shoot themselves in the foot.
See #2515.
An expression of the form 'SELECT (func()).*' will be expanded
by Postgresql _before_ execution with the result that the function
will be called as many times as there are fields in the record.
This is not what we want. The function call needs to go into
the FROM clause instead.
This lays the groundwork for adding variants for housenumbers.
When analysis is enabled, then the 'word' field in the word table
is used as usual, so that variants can be created. There will be
only one analyser allowed which must have the fixed name
'@housenumber'.
When changing something in the default configuration of the sanatizers
that refers to an analyzer that is not yet loaded, there shouldn't be
any errors.
This gives the analyzer more flexibility in choosing the normalized
form. In particular, an analyzer creating different variants can choose
the variant that will be used as the canonical form.
Mutations are regular-expression-based replacements that are applied
after variants have been computed. They are meant to be used for
variations on character level.
Add spelling variations for German umlauts.
Only one addr: tag can be processed currently, so make
sure it is the one without suffixes to not get odd data.
addr:street is the exception because it uses a different
matching mechanism.
Using partial names turned out to not work well because there are
often similarly named streets next to each other. It also
prevents us from being able to take into account all addr:street:*
tags.
This change gets all the full term tokens for the addr:street tags
from the DB. As they are used for matching only, we can assume that
the term must already be there or there will be no match. This
avoid creating unused full name tags.
Check if the API script exists on the expected location before
running php-cli. This way we can add a useful hint about the
project directory.
Fixes#2513.