484 Commits

Author SHA1 Message Date
Sarah Hoffmann
042e314589 remove the language parameter in the SPWikiLoader
Languages must always be configured through config or environment.
Also use monkeypatched environment in tests.
2022-05-30 10:26:20 +02:00
Sarah Hoffmann
61d813bfef add get_str_list() for config
Converts a config value written as a comma-separated list into
a Python list of strings.
2022-05-29 13:53:50 +02:00
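
A minimal sketch of the conversion described in the commit above. This is not the actual Nominatim implementation: the standalone function name `parse_str_list`, the whitespace stripping, and the empty-list result for empty input are all assumptions made for illustration.

```python
from typing import List


def parse_str_list(raw: str) -> List[str]:
    """Split a comma-separated config value into a list of strings.

    Hypothetical helper: strips surrounding whitespace from each item
    and drops empty items (assumed behaviour, not taken from the source).
    """
    if not raw:
        return []
    return [item.strip() for item in raw.split(',') if item.strip()]


if __name__ == '__main__':
    print(parse_str_list('en, de,fr'))
```

Dropping empty items means that a trailing comma in the configuration value does not produce an empty string in the result.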
Sarah Hoffmann
dc6c4bf22e add offline import mode
In offline mode no attempts are made to download data from the internet.
At the moment this only concerns the computation of the database date,
which otherwise contacts the main API to get the date.
2022-05-11 15:03:02 +02:00
Sarah Hoffmann
3ba975466c fix spacing
Some versions of pylint are oddly picky.
2022-05-11 10:36:09 +02:00
Sarah Hoffmann
d14a585cc9 pylint: disable no-self-use check
This checker encourages bad behaviour (namely changing the static
status of a function during inheritance) and will be made optional
in upcoming versions of pylint.
2022-05-11 10:25:00 +02:00
Sarah Hoffmann
7f7a7df3a2 solve assorted issues with newer pylint versions
Includes more use of 'with', adding encodings to open statements
and a couple of issues with parameter renaming.
2022-05-11 10:22:14 +02:00
Sarah Hoffmann
5d5f40a82f use context management when processing Tiger data 2022-05-11 09:48:56 +02:00
Sarah Hoffmann
ae6b029543 remove redundant 'u' prefixes for unicode strings 2022-05-11 09:48:56 +02:00
Sarah Hoffmann
bb2bd76f91 pylint: avoid explicit use of format() function
Use psycopg2 SQL formatters for SQL and formatted string literals
everywhere else.
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
4e1e166c6a add a function to return a formatted version
Replaces the various repeated format strings throughout the code.
2022-05-11 09:01:24 +02:00
Sarah Hoffmann
7e70e5f503 always state encoding when opening files in text mode
Also applies to Path.write_text().
2022-05-10 15:36:29 +02:00
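
The pattern named in the commit above can be illustrated as follows. This is a sketch using only the standard library; the file name and the helper `roundtrip` are made up for the example and do not come from the repository.

```python
import tempfile
from pathlib import Path


def roundtrip(text: str) -> str:
    """Write text to a temporary file and read it back, always stating
    the encoding instead of relying on the platform default."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / 'example.txt'  # hypothetical file name
        # Path.write_text() accepts an encoding, just like open().
        path.write_text(text, encoding='utf-8')
        with open(path, 'r', encoding='utf-8') as fd:
            return fd.read()


if __name__ == '__main__':
    print(roundtrip('Zürich'))
```

Without an explicit encoding, text-mode I/O falls back to a locale-dependent default, so the same code can behave differently across systems.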
Sarah Hoffmann
ed6fda6968
Merge pull request #2702 from lonvia/move-country-names-into-includes
Clean up country name settings
2022-05-10 09:21:16 +02:00
Marc Tobias
821dabb138 add git commit hash to --version output 2022-05-09 23:56:13 +02:00
Sarah Hoffmann
9d468f6da0 support arbitrary prefixes in country name list
This means we can now get rid of the last special cases for names.
2022-05-09 11:55:26 +02:00
Marc Tobias
0de83c4a51 fix typos of name Nominatim 2022-05-05 01:04:47 +02:00
Marc Tobias
a79ab41782 new nominatim --version CLI argument 2022-05-04 01:33:25 +02:00
Sarah Hoffmann
3d58254462 skip wikipedia table test on reverse-only installations
Wikipedia importances are not imported on reverse-only imports.
2022-04-29 14:12:55 +02:00
Sarah Hoffmann
8bcdba1a14 add check for wikipedia importance data
Adds a new check level WARNING because missing wikipedia importances
are not necessarily an error. If the database is run for reverse
requests only, then it is fine to go without them.
2022-04-29 12:14:53 +02:00
Sarah Hoffmann
4f59644cc2 add tests for new data invalidation functions 2022-04-14 14:52:13 +02:00
Sarah Hoffmann
c3f1d34b71 add new commands for forced invalidation before indexing 2022-04-14 11:05:43 +02:00
Sarah Hoffmann
126cabacb8 support new ReplicationServer as contextmanager 2022-04-07 17:58:04 +02:00
Sarah Hoffmann
fd4ab3f262
Merge pull request #2629 from tareqpi/country-names-yaml-configuration
Move default country names into yaml configuration
2022-04-04 09:04:25 +02:00
Tareq Al-Ahdal
cfbd3652ef fix linting error 2022-04-02 00:14:18 +08:00
Tareq Al-Ahdal
e9c14979a4 remove the conversion to json for name 2022-04-01 22:54:14 +08:00
Sarah Hoffmann
36a1560117 add migration to mark internal country names 2022-03-31 15:55:20 +02:00
Tareq Al-Ahdal
afef83b1c6 fix edge case handling when 'names' is not there 2022-03-25 22:25:55 +08:00
Tareq Al-Ahdal
9a1f891998 fix linting error 2022-03-24 13:27:24 +08:00
Tareq Al-Ahdal
4fc61d260f clean up 2022-03-24 13:16:59 +08:00
Tareq Al-Ahdal
1ceb6926b7 merge of insert query + modularity enhancements 2022-03-24 13:13:38 +08:00
Sarah Hoffmann
4c66c35ed6 reinit the tokenizer directory on website refresh
This means the project directory is usable again, once refresh --website
was run.
2022-03-20 17:49:22 +01:00
Sarah Hoffmann
a0ed80d821 restore the tokenizer directory when missing
Automatically repopulate the tokenizer/ directory with the PHP stub
and the postgresql module, when the directory is missing. This makes it
possible to switch working directories and, in particular, to run the
service from a different machine than the one where it was installed.
Users still need to make sure that .env files are set up correctly
or they will shoot themselves in the foot.

See #2515.
2022-03-20 11:31:42 +01:00
Sarah Hoffmann
e65913d376 cache loaded configuration
Reading the YAML files is fairly expensive and slows down the BDD tests
significantly. Therefore cache the results from reading the file.
2022-03-20 11:30:03 +01:00
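
One common way to cache the result of reading a configuration file is memoisation with `functools.lru_cache`. The sketch below is not the project's code: it uses JSON from the standard library as a stand-in for YAML, and `load_config_file`, `READS` and `demo_cached_load` are hypothetical names.

```python
import functools
import json
import tempfile
from pathlib import Path

# Counts how often a file is really read from disk (for the demo only).
READS = {'count': 0}


@functools.lru_cache(maxsize=None)
def load_config_file(path: str):
    """Parse a configuration file once per path and reuse the result."""
    READS['count'] += 1
    return json.loads(Path(path).read_text(encoding='utf-8'))


def demo_cached_load() -> int:
    """Load the same file twice and return the number of disk reads."""
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / 'settings.json'
        cfg.write_text('{"languages": "en,de"}', encoding='utf-8')
        before = READS['count']
        first = load_config_file(str(cfg))
        second = load_config_file(str(cfg))
        assert first is second  # the second call is served from the cache
        return READS['count'] - before


if __name__ == '__main__':
    print(demo_cached_load())
```

Note that a cache like this hands out the same object on every call, so callers should treat the loaded configuration as read-only.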
Tareq Al-Ahdal
b6ac4ad837 fix linting error 2022-03-18 21:05:47 +08:00
Tareq Al-Ahdal
af739d2f57 modify logic of _include_key function 2022-03-18 06:52:16 +08:00
Tareq Al-Ahdal
fa2aca1cbc adding prefix to keys is now more configurable 2022-03-18 06:20:00 +08:00
Tareq Al-Ahdal
d09670d208 modify logic to prepend 'name:' to keys 2022-03-18 06:01:25 +08:00
Tareq Al-Ahdal
d32a7c1888 initialize an empty dictionary for nested name key 2022-03-18 02:50:33 +08:00
Tareq Al-Ahdal
6be2077d92
Merge branch 'master' into country-names-yaml-configuration 2022-03-18 02:36:12 +08:00
Tareq Al-Ahdal
456d439e97 Reformatting of country keys 2022-03-18 02:23:11 +08:00
Tareq Al-Ahdal
b4bd4ff67d fix linting error 2022-03-15 19:14:04 +08:00
Sandor Nagy
7e3701b64a Fix typo in log message on replication initialisation 2022-03-15 07:50:47 +01:00
Tareq Al-Ahdal
165d17f7f7 reintroduce 'name:' prefix to country name keys 2022-03-13 18:58:27 +08:00
Tareq Al-Ahdal
377cf36be3 modify data import logic to load country names from yaml 2022-03-12 15:20:57 +08:00
Sarah Hoffmann
15beeef6ce do not expand records in select list
An expression of the form 'SELECT (func()).*' will be expanded
by PostgreSQL _before_ execution with the result that the function
will be called as many times as there are fields in the record.
This is not what we want. The function call needs to go into
the FROM clause instead.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
92bc3cd0a7 fix linting issue 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
4a3bbd0319 adapt housenumber cleanup to new word table structure 2022-03-01 09:34:32 +01:00
Sarah Hoffmann
13ed184efd housenumber analyzer: avoid creating too many variants
Housenumber fields with lots of text are likely bad data. So is
data with many changes from letter to digit. Exclude them from adding
optional spaces.
2022-03-01 09:34:32 +01:00
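
The idea of optional spaces with a guard against suspicious input could look roughly like the sketch below. It is an illustration only, not the analyser from the repository: the function name `housenumber_variants`, the tokenisation, and the thresholds `max_length` and `max_switches` are assumptions.

```python
import re
from itertools import product
from typing import List


def housenumber_variants(hnr: str, max_length: int = 10,
                         max_switches: int = 2) -> List[str]:
    """Return spelling variants where spaces between digit and
    non-digit parts are optional (hypothetical rules and limits)."""
    norm = ' '.join(hnr.split())
    # Split into runs of digits and runs of other non-space characters.
    tokens = re.findall(r'\d+|[^\d\s]+', norm)

    # For each gap decide whether the space may be dropped. Only gaps
    # between a digit run and a non-digit run get an optional space.
    gaps = []
    for left, right in zip(tokens, tokens[1:]):
        if left[0].isdecimal() != right[0].isdecimal():
            gaps.append(('', ' '))
        else:
            gaps.append((' ',))

    switches = sum(1 for gap in gaps if len(gap) == 2)
    # Long texts or many digit/letter changes are likely bad data:
    # do not create variants for them.
    if len(tokens) < 2 or len(norm) > max_length or switches > max_switches:
        return [norm]

    variants = set()
    for seps in product(*gaps):
        variant = tokens[0]
        for sep, token in zip(seps, tokens[1:]):
            variant += sep + token
        variants.add(variant)
    return sorted(variants)


if __name__ == '__main__':
    print(housenumber_variants('12 a'))
```

Because every optional gap doubles the number of variants, capping the number of digit/letter changes keeps the variant count small.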
Sarah Hoffmann
f03a05f6bb add new analyser for housenumbers
This analyser makes spaces optional.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
a6903651fc add framework for analysing housenumbers
This lays the groundwork for adding variants for housenumbers.
When analysis is enabled, then the 'word' field in the word table
is used as usual, so that variants can be created. There will be
only one analyser allowed which must have the fixed name
'@housenumber'.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
b8c544cc98 icu: move token deduplication into TokenInfo
Puts collection into one common place.
2022-03-01 09:34:32 +01:00