Commit Graph

3607 Commits

Author SHA1 Message Date
Sarah Hoffmann
4c66c35ed6 reinit the tokenizer directory on website refresh
This means the project directory is usable again, once refresh --website
was run.
2022-03-20 17:49:22 +01:00
Sarah Hoffmann
54db1d8915 docs: copying project dir no longer necessary 2022-03-20 16:01:27 +01:00
Sarah Hoffmann
a0ed80d821 restore the tokenizer directory when missing
Automatically repopulate the tokenizer/ directory with the PHP stub
and the postgresql module, when the directory is missing. This allows
to switch working directories and in particular run the service
from a different maschine then where it was installed.
Users still need to make sure that .env files are set up correctly
or they will shoot themselves in the foot.

See #2515.
2022-03-20 11:31:42 +01:00
Sarah Hoffmann
e65913d376 cache loaded configuration
Reading the YAML files is fairly expensive and slows down the BDD tests
significantly. Therefore cache the results from reading the file.
2022-03-20 11:30:03 +01:00
Sarah Hoffmann
2f266d946b
Merge pull request #2639 from lonvia/remove-operator
No longer use operator tag as a name
2022-03-18 16:42:18 +01:00
Sarah Hoffmann
42f0282f14 remove special case for operator names
The OSM data has been sufficiently cleaned up by now that
the operator no longer needs to be considered a name tag.
Use 'brand' as the searchable alternative.
2022-03-18 10:48:53 +01:00
Sarah Hoffmann
2723553593
Merge pull request #2637 from lonvia/keep-linked-place-names
Introduce separation of names from linked places
2022-03-17 16:39:30 +01:00
Sarah Hoffmann
23de4c7aca adapt ParameterParser tests to new key list 2022-03-17 11:45:05 +01:00
Sarah Hoffmann
ce14964943 fix linting 2022-03-17 11:05:32 +01:00
Sarah Hoffmann
e133476c35 merge linked names correctly into namedetails
Convert the '_place_*' entries back to normal entries before
returning them in the 'namedetails' section. If the name field is
duplicated, kept the '_place_*' notation. This preserves the previous
behaviour before _place_ names were introduces but adds the additional
names from the linked place for reference.
2022-03-17 11:02:02 +01:00
Sarah Hoffmann
524dc64ab7 make sure outputs take into account linked place names 2022-03-16 21:44:52 +01:00
Sarah Hoffmann
17da5f45be fix return code for PHP exceptions
These have returned a 0 until now.
2022-03-16 21:44:02 +01:00
Sarah Hoffmann
42cd021d04 save differing linked polace names in extra fields
This keeps the names tracable and ensures that all names are searchable
when they differ. Do not keep names when they are exactly the same
to save some space. Linked names are cleaned out before relinking.
2022-03-16 16:38:52 +01:00
Sarah Hoffmann
433d2f4c7d
Merge pull request #2633 from lonvia/fix-reverse-single-interpolation-point
Correctly handle single-point interpolations in reverse
2022-03-16 14:22:59 +01:00
Sarah Hoffmann
be8f5778a1 use https protocol for cloning from github
Does not need authentication.
2022-03-16 12:05:58 +01:00
Sarah Hoffmann
ef98a85b05 correctly handle single-point interpolations in reverse
Lookup in location_property_osmline needs to be special cased
for startnumber = endnumber. Also adds tests for the case.

Fixes #2680.
2022-03-16 11:19:09 +01:00
Sarah Hoffmann
930a5cd12a
Merge pull request #2632 from nslxndr/fix-log-typo
Fix typo in log message on replication initialisation
2022-03-15 11:01:57 +01:00
Sandor Nagy
7e3701b64a Fix typo in log message on replication initialisation 2022-03-15 07:50:47 +01:00
Sarah Hoffmann
479d726774
Merge pull request #2627 from mtmail/location-of-osm2pgsql
documentation: clarify osm2pgsql isnt in project directory by default
2022-03-10 15:39:10 +01:00
Marc Tobias
1fcc9717bb documentation: clarify osm2pgsql isnt in project directory by default 2022-03-10 14:16:12 +01:00
Sarah Hoffmann
c35b3ea5c7
Merge pull request #2621 from lonvia/housenumber-analyzer
Introduce optional token analysis for housenumbers
2022-03-01 15:19:07 +01:00
Sarah Hoffmann
15beeef6ce do not expand records in select list
An expression of the form 'SELECT (func()).*' will be expanded
by Postgresql _before_ execution with the result that the function
will be called as many times as there are fields in the record.
This is not what we want. The function call needs to go into
the FROM clause instead.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
92bc3cd0a7 fix linting issue 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
0a9f971e44 add tests for new analyzed housenumbers 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
4a3bbd0319 adapt housenumber cleanup to new word table structure 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
89e1446131 bdd: disable some housenumber tests for legacy
Optional spaces in housenumbers are not supported by legacy tokenizer,
so disable those tests.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
b694a97edf add documentation for housenumber analyzer 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
13ed184efd housenumber analyzer: avoid creating too many variants
Housenumber fields with lots of text are likely bad data. So is
data with many changes from letter to digit. Exclude them from adding
optional spaces.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
f03a05f6bb add new analyser for houenumbers
This analyser makes spaces optional.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
a6903651fc add framework for analysing housenumbers
This lays the groundwork for adding variants for housenumbers.
When analysis is enabled, then the 'word' field in the word table
is used as usual, so that variants can be created. There will be
only one analyser allowed which must have the fixed name
'@housenumber'.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
b8c544cc98 icu: move token deduplication into TokenInfo
Puts collection into one common place.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
243725aae1 icu: move housenumber token computation out of TokenInfo
This was the last function to use the cache. There is a more clean
separation of responsibility now.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
0bb59b2e22 handle unknown analyzer
When changing something in the default configuration of the sanatizers
that refers to an analyzer that is not yet loaded, there shouldn't be
any errors.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
837d44391c move generation of normalized token form to analyzer
This gives the analyzer more flexibility in choosing the normalized
form. In particular, an analyzer creating different variants can choose
the variant that will be used as the canonical form.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
691ec08586
Merge pull request #2614 from lonvia/reorganise-country-names
Reorganise handling of country names imported from OSM
2022-02-25 09:46:20 +01:00
Sarah Hoffmann
5425394654 add migration to add new derived_names column 2022-02-24 20:50:33 +01:00
Sarah Hoffmann
1d82569f6d add tests for country updates 2022-02-24 16:18:49 +01:00
Sarah Hoffmann
f74228830d bdd: run full import on tests
This uncovered a couple of outdated/wrong tests which have been
fixed, too.
2022-02-24 14:27:51 +01:00
Sarah Hoffmann
a9e3329c39 country_name: use separate columns for names from OSM
This allows us to distinguish between base names and imported ones
and consiquently removing imported ones if necessary.
2022-02-23 09:23:06 +01:00
Sarah Hoffmann
a3e4e8e5cd delete unused country name tokens 2022-02-23 09:23:06 +01:00
Sarah Hoffmann
898febcec5 update supported versions 2022-02-23 09:22:17 +01:00
Sarah Hoffmann
855909b4e9 add 'healthcare' as main tag
Given that the tag is most of the time duplicated by an amenity
tag which is already imported, only import it as a fallback when
there is no name.

Fixes #2609.
2022-02-21 11:52:17 +01:00
Sarah Hoffmann
85d65a2fd2 create idx_place_interpolations for import already
It is needed to look up if a node is part of an interpolation.

Fixes #2608.
2022-02-18 11:11:22 +01:00
Sarah Hoffmann
cd9b0c9a20
Merge pull request #2603 from lonvia/one-step-housenumber-search
One step housenumber search
2022-02-10 17:27:56 +01:00
Sarah Hoffmann
0e11ca9b76 add test that interpolations are found by odd/even 2022-02-10 11:23:51 +01:00
Sarah Hoffmann
fd38dd02ce make sure step is taken into account for interpolations 2022-02-09 21:42:28 +01:00
Sarah Hoffmann
474418f03c include houseumber search in name query
The name query already looks for the existence of housenumbers and
may as well retrive them. Saves up to threee additional lookups.
It also means that we can lift the restriction on looking
for existance of housenumbers for simple queries only.
2022-02-08 22:35:12 +01:00
Sarah Hoffmann
6b9fea6f1a disable debug message in interpolation processing 2022-02-07 23:30:25 +01:00
Sarah Hoffmann
02894ca4a4
Merge pull request #2602 from lonvia/filter-bad-housenumbers
Handle mistagged housenumbers like names
2022-02-07 16:27:04 +01:00
Sarah Hoffmann
7d19209fa1 liniting: disable too-many-ancestors
This is triggered by UserDict which is meant of deriving.
2022-02-07 11:49:18 +01:00