Sarah Hoffmann
042e314589
remove the language parameter in the SPWikiLoader
...
Languages must always be configured through config or environment.
Also use monkeypatched environment in tests.
2022-05-30 10:26:20 +02:00
Sarah Hoffmann
61d813bfef
add get_str_list() for config
...
Converts a config value written as a comma-separated list into
a Python list of strings.
2022-05-29 13:53:50 +02:00
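A minimal sketch of the conversion described in this commit, assuming surrounding whitespace is stripped and empty items are dropped. The standalone function name `parse_str_list` is hypothetical and is not the project's actual `get_str_list()` method.

```python
from typing import List


def parse_str_list(raw: str) -> List[str]:
    """Split a comma-separated config value into a list of strings.

    Assumption: whitespace around items is ignored and empty items
    (for example from a trailing comma) are dropped.
    """
    return [part.strip() for part in raw.split(',') if part.strip()]
```

Under these assumptions, a value such as `en, de ,fr` becomes a list of three language codes, and an empty value becomes an empty list.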
Sarah Hoffmann
dc6c4bf22e
add offline import mode
...
In offline mode no attempts are made to download data from the internet.
At the moment that only concerns the computation of the database date,
which contacts the main API to get the date.
2022-05-11 15:03:02 +02:00
Sarah Hoffmann
3ba975466c
fix spacing
...
Some versions of pylint are oddly picky.
2022-05-11 10:36:09 +02:00
Sarah Hoffmann
d14a585cc9
pylint: disable no-self-use check
...
This checker encourages bad behaviour (namely changing the static
status of a function during inheritance) and will be made optional
in upcoming versions of pylint.
2022-05-11 10:25:00 +02:00
Sarah Hoffmann
7f7a7df3a2
solve assorted issues with newer pylint versions
...
Includes more use of 'with', adding encodings to open statements
and a couple of issues with parameter renaming.
2022-05-11 10:22:14 +02:00
Sarah Hoffmann
5d5f40a82f
use context management when processing Tiger data
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
ae6b029543
remove redundant 'u' prefixes for unicode strings
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
bb2bd76f91
pylint: avoid explicit use of format() function
...
Use psycopg2 SQL formatters for SQL and formatted string literals
everywhere else.
2022-05-11 09:48:56 +02:00
Sarah Hoffmann
4e1e166c6a
add a function to return a formatted version
...
Replaces the various repeated format strings throughout the code.
2022-05-11 09:01:24 +02:00
Sarah Hoffmann
7e70e5f503
always state encoding when opening files in text mode
...
Also applies to Path.write_text().
2022-05-10 15:36:29 +02:00
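The rule from this commit can be illustrated as follows. This is only a sketch: the helper name `write_and_read` and the file name are made up for the example, and UTF-8 is assumed as the encoding.

```python
from pathlib import Path


def write_and_read(directory: Path, text: str) -> str:
    """Write text to a file and read it back, naming the encoding both times."""
    target = directory / 'example.txt'  # hypothetical file name

    # Path.write_text() falls back to the locale encoding unless told otherwise.
    target.write_text(text, encoding='utf-8')

    # The same applies to open() in text mode.
    with open(target, 'r', encoding='utf-8') as fd:
        return fd.read()
```

Stating the encoding explicitly keeps the result independent of the locale of the machine that runs the code.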
Sarah Hoffmann
ed6fda6968
Merge pull request #2702 from lonvia/move-country-names-into-includes
...
Clean up country name settings
2022-05-10 09:21:16 +02:00
Marc Tobias
821dabb138
add git commit hash to --version output
2022-05-09 23:56:13 +02:00
Sarah Hoffmann
9d468f6da0
support arbitrary prefixes in country name list
...
This means we can now get rid of the last special cases for names.
2022-05-09 11:55:26 +02:00
Marc Tobias
0de83c4a51
fix typos of name Nominatim
2022-05-05 01:04:47 +02:00
Marc Tobias
a79ab41782
new nominatim --version CLI argument
2022-05-04 01:33:25 +02:00
Sarah Hoffmann
3d58254462
skip wikipedia table test on reverse-only installations
...
Wikipedia importances are not imported on reverse-only imports.
2022-04-29 14:12:55 +02:00
Sarah Hoffmann
8bcdba1a14
add check for wikipedia importance data
...
Adds a new check level WARNING because missing wikipedia importances
are not necessarily an error. If the database is run for reverse
requests only, then it is fine to go without them.
2022-04-29 12:14:53 +02:00
Sarah Hoffmann
4f59644cc2
add tests for new data invalidation functions
2022-04-14 14:52:13 +02:00
Sarah Hoffmann
c3f1d34b71
add new commands for forced invalidation before indexing
2022-04-14 11:05:43 +02:00
Sarah Hoffmann
126cabacb8
support new ReplicationServer as contextmanager
2022-04-07 17:58:04 +02:00
Sarah Hoffmann
fd4ab3f262
Merge pull request #2629 from tareqpi/country-names-yaml-configuration
...
Move default country names into yaml configuration
2022-04-04 09:04:25 +02:00
Tareq Al-Ahdal
cfbd3652ef
fix linting error
2022-04-02 00:14:18 +08:00
Tareq Al-Ahdal
e9c14979a4
remove the conversion to json for name
2022-04-01 22:54:14 +08:00
Sarah Hoffmann
36a1560117
add migration to mark internal country names
2022-03-31 15:55:20 +02:00
Tareq Al-Ahdal
afef83b1c6
fix edge case handling when 'names' is not there
2022-03-25 22:25:55 +08:00
Tareq Al-Ahdal
9a1f891998
fix linting error
2022-03-24 13:27:24 +08:00
Tareq Al-Ahdal
4fc61d260f
clean up
2022-03-24 13:16:59 +08:00
Tareq Al-Ahdal
1ceb6926b7
merge of insert query + modularity enhancements
2022-03-24 13:13:38 +08:00
Sarah Hoffmann
4c66c35ed6
reinit the tokenizer directory on website refresh
...
This means the project directory is usable again, once refresh --website
was run.
2022-03-20 17:49:22 +01:00
Sarah Hoffmann
a0ed80d821
restore the tokenizer directory when missing
...
Automatically repopulate the tokenizer/ directory with the PHP stub
and the postgresql module, when the directory is missing. This makes it
possible to switch working directories and, in particular, to run the
service from a different machine than the one where it was installed.
Users still need to make sure that .env files are set up correctly
or they will shoot themselves in the foot.
See #2515.
2022-03-20 11:31:42 +01:00
Sarah Hoffmann
e65913d376
cache loaded configuration
...
Reading the YAML files is fairly expensive and slows down the BDD tests
significantly. Therefore cache the results from reading the file.
2022-03-20 11:30:03 +01:00
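The caching idea from this commit might look roughly like the sketch below. It uses JSON from the standard library instead of YAML so that it runs without extra packages. The names `_CACHE` and `load_cached` are hypothetical and not taken from the project.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Module-level cache, keyed by the resolved file path (assumption: the
# files do not change while the process is running).
_CACHE: Dict[Path, Any] = {}


def load_cached(path: Path) -> Any:
    """Parse a configuration file once and reuse the result afterwards."""
    key = path.resolve()
    if key not in _CACHE:
        _CACHE[key] = json.loads(key.read_text(encoding='utf-8'))
    return _CACHE[key]
```

The trade-off is that later changes to the file on disk are not picked up, which is presumably acceptable for a test run that reads the same files repeatedly.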
Tareq Al-Ahdal
b6ac4ad837
fix linting error
2022-03-18 21:05:47 +08:00
Tareq Al-Ahdal
af739d2f57
modify logic of _include_key function
2022-03-18 06:52:16 +08:00
Tareq Al-Ahdal
fa2aca1cbc
adding prefix to keys is now more configurable
2022-03-18 06:20:00 +08:00
Tareq Al-Ahdal
d09670d208
modify logic to prepend 'name:' to keys
2022-03-18 06:01:25 +08:00
Tareq Al-Ahdal
d32a7c1888
initialize an empty dictionary for nested name key
2022-03-18 02:50:33 +08:00
Tareq Al-Ahdal
6be2077d92
Merge branch 'master' into country-names-yaml-configuration
2022-03-18 02:36:12 +08:00
Tareq Al-Ahdal
456d439e97
Reformatting of country keys
2022-03-18 02:23:11 +08:00
Tareq Al-Ahdal
b4bd4ff67d
fix linting error
2022-03-15 19:14:04 +08:00
Sandor Nagy
7e3701b64a
Fix typo in log message on replication initialisation
2022-03-15 07:50:47 +01:00
Tareq Al-Ahdal
165d17f7f7
reintroduce 'name:' prefix to country name keys
2022-03-13 18:58:27 +08:00
Tareq Al-Ahdal
377cf36be3
modify data import logic to load country names from yaml
2022-03-12 15:20:57 +08:00
Sarah Hoffmann
15beeef6ce
do not expand records in select list
...
An expression of the form 'SELECT (func()).*' will be expanded
by PostgreSQL _before_ execution, with the result that the function
will be called as many times as there are fields in the record.
This is not what we want. The function call needs to go into
the FROM clause instead.
2022-03-01 09:34:32 +01:00
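The two query shapes contrasted in this commit can be written out as below. The function name `get_items()` is purely hypothetical, and the queries are held in Python strings for illustration only; nothing here is executed against a database.

```python
# Form to avoid: the record expansion happens before execution, so the
# function is evaluated once for every field of the returned record.
EXPANDED_IN_SELECT = "SELECT (get_items()).*"

# Preferred form: the function call sits in the FROM clause and is
# evaluated once, with its fields available as ordinary columns.
CALLED_IN_FROM = "SELECT * FROM get_items()"
```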
Sarah Hoffmann
92bc3cd0a7
fix linting issue
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
4a3bbd0319
adapt housenumber cleanup to new word table structure
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
13ed184efd
housenumber analyzer: avoid creating too many variants
...
Housenumber fields with lots of text are likely bad data. So is
data with many changes from letter to digit. Exclude them from adding
optional spaces.
2022-03-01 09:34:32 +01:00
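A rough sketch of what making spaces optional while limiting the number of variants could look like. It is a simplification: the cap here is on the number of spaces, not on text length or letter/digit changes as the commit describes, and the name `space_variants` and the default limit are hypothetical.

```python
import itertools
from typing import List


def space_variants(housenumber: str, max_spaces: int = 3) -> List[str]:
    """Return spellings of a housenumber with each space kept or removed.

    Assumption: inputs with more than `max_spaces` gaps are treated as
    likely bad data and returned unchanged (whitespace normalised only).
    """
    parts = housenumber.split()
    gaps = len(parts) - 1
    if gaps < 1:
        return [housenumber.strip()]
    if gaps > max_spaces:
        return [' '.join(parts)]

    variants = []
    # Every gap is either a space or nothing: 2**gaps combinations.
    for seps in itertools.product((' ', ''), repeat=gaps):
        text = parts[0]
        for sep, part in zip(seps, parts[1:]):
            text += sep + part
        variants.append(text)
    return variants
```

Because the number of variants doubles with every space, some form of cap is needed to keep long free-text values from producing a large number of entries.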
Sarah Hoffmann
f03a05f6bb
add new analyser for housenumbers
...
This analyser makes spaces optional.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
a6903651fc
add framework for analysing housenumbers
...
This lays the groundwork for adding variants for housenumbers.
When analysis is enabled, then the 'word' field in the word table
is used as usual, so that variants can be created. There will be
only one analyser allowed which must have the fixed name
'@housenumber'.
2022-03-01 09:34:32 +01:00
Sarah Hoffmann
b8c544cc98
icu: move token deduplication into TokenInfo
...
Puts collection into one common place.
2022-03-01 09:34:32 +01:00