Sarah Hoffmann
4c66c35ed6
reinit the tokenizer directory on website refresh
...
This means the project directory is usable again once refresh --website
has been run.
2022-03-20 17:49:22 +01:00
Sarah Hoffmann
54db1d8915
docs: copying project dir no longer necessary
2022-03-20 16:01:27 +01:00
Sarah Hoffmann
a0ed80d821
restore the tokenizer directory when missing
...
Automatically repopulate the tokenizer/ directory with the PHP stub
and the postgresql module when the directory is missing. This makes it
possible to switch working directories and, in particular, to run the
service from a different machine than the one where it was installed.
Users still need to make sure that .env files are set up correctly
or they will shoot themselves in the foot.
See #2515 .
2022-03-20 11:31:42 +01:00
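The restore step described in the commit above can be illustrated with a short sketch. This is not the project's actual code: the function name `ensure_tokenizer_dir` and the idea of passing the files to copy as a list are assumptions made for illustration; only the `tokenizer/` directory name comes from the commit message.

```python
import shutil
from pathlib import Path
from typing import Iterable


def ensure_tokenizer_dir(project_dir: Path, sources: Iterable[Path]) -> bool:
    """Recreate project_dir/tokenizer from `sources` if it is missing.

    Returns True when the directory had to be restored, False when it
    already existed and was left untouched. (Hypothetical helper.)
    """
    target = project_dir / "tokenizer"
    if target.is_dir():
        return False

    target.mkdir(parents=True)
    for src in sources:
        # Copy each support file (e.g. a PHP stub) under its own name.
        shutil.copy(src, target / src.name)
    return True
```

Because an existing directory is never overwritten, calling such a helper on every start-up would be harmless; it only acts when the working directory has been moved or freshly created.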
Sarah Hoffmann
e65913d376
cache loaded configuration
...
Reading the YAML files is fairly expensive and slows down the BDD tests
significantly. Therefore cache the results from reading the file.
2022-03-20 11:30:03 +01:00
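A minimal sketch of the caching idea from the commit above, assuming a loader keyed by file path. The function name `load_config` is hypothetical, and JSON stands in for YAML here only to keep the example free of third-party dependencies.

```python
import functools
import json


@functools.lru_cache(maxsize=None)
def load_config(path: str) -> dict:
    """Parse the file on the first call; later calls for the same path
    return the cached result without touching the disk again."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

# Caveat: the cached dict is shared between callers, so it should be
# treated as read-only (or copied before modification).
```

A test suite that loads the same configuration for every scenario then pays the parsing cost once per file rather than once per scenario.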
Sarah Hoffmann
2f266d946b
Merge pull request #2639 from lonvia/remove-operator
...
No longer use operator tag as a name
2022-03-18 16:42:18 +01:00
Tareq Al-Ahdal
b6ac4ad837
fix linting error
2022-03-18 21:05:47 +08:00
Sarah Hoffmann
42f0282f14
remove special case for operator names
...
The OSM data has been sufficiently cleaned up by now that
the operator no longer needs to be considered a name tag.
Use 'brand' as the searchable alternative.
2022-03-18 10:48:53 +01:00
Tareq Al-Ahdal
af739d2f57
modify logic of _include_key function
2022-03-18 06:52:16 +08:00
Tareq Al-Ahdal
fa2aca1cbc
adding prefix to keys is now more configurable
2022-03-18 06:20:00 +08:00
Tareq Al-Ahdal
943e5fe699
Revert the removal of new line at the end of the file
2022-03-18 06:07:48 +08:00
Tareq Al-Ahdal
d09670d208
modify logic to prepend 'name:' to keys
2022-03-18 06:01:25 +08:00
Tareq Al-Ahdal
83b4b8d9c1
reattach 'name:' prefix to keys
2022-03-18 05:46:23 +08:00
Tareq Al-Ahdal
d32a7c1888
initialize an empty dictionary for nested name key
2022-03-18 02:50:33 +08:00
Tareq Al-Ahdal
d0c1b73fb3
remove duplicate values
2022-03-18 02:43:42 +08:00
Tareq Al-Ahdal
90ac15748e
fix comment
2022-03-18 02:38:04 +08:00
Tareq Al-Ahdal
6be2077d92
Merge branch 'master' into country-names-yaml-configuration
2022-03-18 02:36:12 +08:00
Tareq Al-Ahdal
456d439e97
Reformatting of country keys
2022-03-18 02:23:11 +08:00
Sarah Hoffmann
2723553593
Merge pull request #2637 from lonvia/keep-linked-place-names
...
Introduce separation of names from linked places
2022-03-17 16:39:30 +01:00
Sarah Hoffmann
23de4c7aca
adapt ParameterParser tests to new key list
2022-03-17 11:45:05 +01:00
Sarah Hoffmann
ce14964943
fix linting
2022-03-17 11:05:32 +01:00
Sarah Hoffmann
e133476c35
merge linked names correctly into namedetails
...
Convert the '_place_*' entries back to normal entries before
returning them in the 'namedetails' section. If the name field is
duplicated, keep the '_place_*' notation. This preserves the behaviour
from before _place_ names were introduced but adds the additional
names from the linked place for reference.
2022-03-17 11:02:02 +01:00
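The conversion described in the commit above might look roughly like the following. This is an illustrative sketch, not the project's implementation: the function name `to_namedetails` is hypothetical, and names are modelled as a plain dict.

```python
PREFIX = "_place_"


def to_namedetails(names: dict) -> dict:
    """Fold '_place_*' entries back into normal keys for output.

    A linked name keeps its '_place_' prefix only when the plain key is
    already taken by the object's own name.
    """
    result = {k: v for k, v in names.items() if not k.startswith(PREFIX)}
    for key, value in names.items():
        if not key.startswith(PREFIX):
            continue
        plain = key[len(PREFIX):]
        if plain in result:
            result[key] = value      # duplicate: keep '_place_' notation
        else:
            result[plain] = value    # no clash: expose as a normal entry
    return result
```

For example, `{"name": "A", "_place_name": "B", "_place_name:de": "C"}` would yield `name`, `_place_name` and `name:de` entries.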
Sarah Hoffmann
524dc64ab7
make sure outputs take into account linked place names
2022-03-16 21:44:52 +01:00
Sarah Hoffmann
17da5f45be
fix return code for PHP exceptions
...
These have returned a 0 until now.
2022-03-16 21:44:02 +01:00
Sarah Hoffmann
42cd021d04
save differing linked place names in extra fields
...
This keeps the names traceable and ensures that all names are searchable
when they differ. Do not keep names when they are exactly the same
to save some space. Linked names are cleaned out before relinking.
2022-03-16 16:38:52 +01:00
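As a rough sketch of the rule in the commit above: linked names are stored under a prefixed key only when they differ from the object's own name, and stale linked names are dropped first. The function name `merge_linked_names` is hypothetical; the `_place_` prefix is taken from the follow-up commit, and treating a key that is absent from the object's own names as "differing" is an assumption.

```python
PREFIX = "_place_"


def merge_linked_names(names: dict, linked: dict) -> dict:
    """Return `names` with differing names from `linked` added as extras."""
    # Clean out linked names left over from a previous linking run.
    result = {k: v for k, v in names.items() if not k.startswith(PREFIX)}
    for key, value in linked.items():
        # Identical names are skipped to save space.
        if result.get(key) != value:
            result[PREFIX + key] = value
    return result
```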
Sarah Hoffmann
433d2f4c7d
Merge pull request #2633 from lonvia/fix-reverse-single-interpolation-point
...
Correctly handle single-point interpolations in reverse
2022-03-16 14:22:59 +01:00
Sarah Hoffmann
be8f5778a1
use https protocol for cloning from github
...
Does not need authentication.
2022-03-16 12:05:58 +01:00
Sarah Hoffmann
ef98a85b05
correctly handle single-point interpolations in reverse
...
Lookup in location_property_osmline needs to be special cased
for startnumber = endnumber. Also adds tests for the case.
Fixes #2680 .
2022-03-16 11:19:09 +01:00
Tareq Al-Ahdal
b4bd4ff67d
fix linting error
2022-03-15 19:14:04 +08:00
Sarah Hoffmann
930a5cd12a
Merge pull request #2632 from nslxndr/fix-log-typo
...
Fix typo in log message on replication initialisation
2022-03-15 11:01:57 +01:00
Sandor Nagy
7e3701b64a
Fix typo in log message on replication initialisation
2022-03-15 07:50:47 +01:00
Tareq Al-Ahdal
165d17f7f7
reintroduce 'name:' prefix to country name keys
2022-03-13 18:58:27 +08:00
Tareq Al-Ahdal
3939cb614e
Remove country.sql from CMakeLists.txt
2022-03-13 18:56:19 +08:00
Tareq Al-Ahdal
377cf36be3
modify data import logic to load country names from yaml
2022-03-12 15:20:57 +08:00
Tareq Al-Ahdal
8b6652a40b
move default country names into yaml configuration
2022-03-12 15:17:01 +08:00
Sarah Hoffmann
479d726774
Merge pull request #2627 from mtmail/location-of-osm2pgsql
...
documentation: clarify osm2pgsql isn't in project directory by default
2022-03-10 15:39:10 +01:00
Marc Tobias
1fcc9717bb
documentation: clarify osm2pgsql isn't in project directory by default
2022-03-10 14:16:12 +01:00
Sarah Hoffmann
c35b3ea5c7
Merge pull request #2621 from lonvia/housenumber-analyzer
...
Introduce optional token analysis for housenumbers
2022-03-01 15:19:07 +01:00
Sarah Hoffmann
15beeef6ce
do not expand records in select list
...
An expression of the form 'SELECT (func()).*' will be expanded
by Postgresql _before_ execution with the result that the function
will be called as many times as there are fields in the record.
This is not what we want. The function call needs to go into
the FROM clause instead.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
92bc3cd0a7
fix linting issue
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
0a9f971e44
add tests for new analyzed housenumbers
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
4a3bbd0319
adapt housenumber cleanup to new word table structure
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
89e1446131
bdd: disable some housenumber tests for legacy
...
Optional spaces in housenumbers are not supported by the legacy
tokenizer, so disable those tests.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
b694a97edf
add documentation for housenumber analyzer
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
13ed184efd
housenumber analyzer: avoid creating too many variants
...
Housenumber fields with lots of text are likely bad data. So is
data with many changes from letter to digit. Exclude them from adding
optional spaces.
2022-03-01 09:34:32 +01:00
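The guard described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the analyser's real code: the function names, the use of string length as a proxy for "lots of text", and both thresholds are hypothetical.

```python
MAX_LENGTH = 10       # hypothetical cut-off for "lots of text"
MAX_TRANSITIONS = 2   # hypothetical cut-off for letter/digit changes


def count_transitions(text: str) -> int:
    """Count switches between a digit and a letter, ignoring spaces."""
    compact = text.replace(" ", "")
    return sum(
        1
        for a, b in zip(compact, compact[1:])
        if (a.isdigit() and b.isalpha()) or (a.isalpha() and b.isdigit())
    )


def housenumber_variants(hnr: str) -> list:
    """Return the normalised housenumber plus, for sane-looking input,
    a variant with the spaces removed."""
    norm = " ".join(hnr.split())
    variants = [norm]
    looks_sane = (len(norm) <= MAX_LENGTH
                  and count_transitions(norm) <= MAX_TRANSITIONS)
    if " " in norm and looks_sane:
        variants.append(norm.replace(" ", ""))
    return variants
```

With these thresholds, "12 a" gains the extra variant "12a", while an input such as "1 a 2 b 3 c" is kept as-is because it alternates between digits and letters too often.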
Sarah Hoffmann
f03a05f6bb
add new analyser for housenumbers
...
This analyser makes spaces optional.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
a6903651fc
add framework for analysing housenumbers
...
This lays the groundwork for adding variants for housenumbers.
When analysis is enabled, then the 'word' field in the word table
is used as usual, so that variants can be created. There will be
only one analyser allowed which must have the fixed name
'@housenumber'.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
b8c544cc98
icu: move token deduplication into TokenInfo
...
Puts collection into one common place.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
243725aae1
icu: move housenumber token computation out of TokenInfo
...
This was the last function to use the cache. There is a cleaner
separation of responsibility now.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
0bb59b2e22
handle unknown analyzer
...
When changing something in the default configuration of the sanitizers
that refers to an analyzer that is not yet loaded, there shouldn't be
any errors.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
837d44391c
move generation of normalized token form to analyzer
...
This gives the analyzer more flexibility in choosing the normalized
form. In particular, an analyzer creating different variants can choose
the variant that will be used as the canonical form.
2022-03-01 09:34:32 +01:00