Commit Graph

3808 Commits

Author SHA1 Message Date
Sarah Hoffmann
4b12d52ef5 convert admin --analyse-indexing to new indexing method
A proper run of indexing requires the place information from the
analyzer. Add the pre-processing of place data, so the right
information is handed into the update function.
2022-07-07 16:20:08 +02:00
Sarah Hoffmann
300612c5a8
Merge pull request #2760 from lonvia/reorganize-data-classes
Code cleanup: move some common code into the data submodule
2022-07-07 16:12:11 +02:00
Sarah Hoffmann
856925d19b remove analyze() from PlaceInfo class
The function creates circular dependencies.
2022-07-07 12:06:58 +02:00
Sarah Hoffmann
cbbcbb1fd7 move country_info into data submodule 2022-07-06 11:08:36 +02:00
Sarah Hoffmann
bce93d60bd move PlaceInfo into data submodule
This data structure is shared between indexer and tokenizer.
2022-07-06 10:54:47 +02:00
Sarah Hoffmann
69e51aebab test: avoid column names with upper-case letters
This may cause problems when the column names get quoted.
2022-07-05 09:12:55 +02:00
Sarah Hoffmann
8ac133f2ee CI: remove unneed stuff to make space for DB 2022-07-03 16:42:57 +02:00
Sarah Hoffmann
67996929e0
Merge pull request #2706 from mtmail/php-fixes-php7-vs-php8
PHP 8 behaves slightly different with in_array and usort
2022-07-03 11:28:52 +02:00
Marc Tobias
ccf119206d PHP 8 behaves slightly different with in_array and usort 2022-07-03 10:55:34 +02:00
Sarah Hoffmann
bc63f10057 fix syntax error with tablespaces 2022-06-30 09:19:16 +02:00
Sarah Hoffmann
6f15306766 docs: replace deprecated pages option
Fixes #2661.
2022-06-29 20:30:28 +02:00
Sarah Hoffmann
161d83af5b fix handling of zero importance
To avoid importance becoming zero and cancelling out other weights,
df008d99f5 introduced a minimum value
for importance. That broke importances for interpolated addresses,
which are less than zero.

Instead of setting a minimum, set zero importances to a very small
value.

Fixes #2753.
2022-06-29 17:54:30 +02:00
Sarah Hoffmann
3bf3b894ea
Merge pull request #2757 from lonvia/filter-postcodes
Add filtering, normalisation and variants for postcodes
2022-06-24 21:09:41 +02:00
Sarah Hoffmann
536f08f33a ignore 5+ postcodes in the US for now
Hierarchical postcodes need a different treatment.
2022-06-24 19:24:22 +02:00
Sarah Hoffmann
3dd7410bb7 bdd: correctly skip postcode tests for legacy 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
93d5be097a bdd: do not expect legacy word table to be without empty tokens
It can happen for bogus names and this will not get fixed anymore.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
6eb9044353 adapt search algorithm to new postcode format in word 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
612d34930b handle postcodes properly on word table updates
update_postcodes_from_db() needs to do the full postcode treatment
in order to derive the correct word table entries.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
5be320368c add documentation for postcode customization 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
7f2ad4ac7e fix linting issue 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
0f00f4968c fix up BDD tests for postcode changes
Includes smaller code fixes found by the tests.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
37b2c6a830 port legacy tokenizer to new postcode handling
Also documents the changes to the SQL functions of the tokenizer.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
e86db3001f fix postcode pattern for Mozambique
Optional groups are not implemented yet.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
7b6ec4fc6c add tests for discarding bad postcodes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
67dfa38e60 fix liniting problems 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
2eca9fc8af cache postcode normalization 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
b5e5efc131 only add well-formatted postcodes to location table 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
80ea13437d move postcode matcher in a separate file 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
bf86b45178 move postcode centroid computation to Python 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
4885fdf0f9 add class for online centroid computation 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
b7704833e4 icu: switch postcodes to using the pre-formatted one 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
ca7b46511d introduce and use analyzer for postcodes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
18864afa8a postcodes: introduce a default pattern for countries without postcodes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
5ba75df507 postcode: generate a generic form 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
9cf700e85d add postcodes for most of the remaining countries
Now includes all postcodes that have optional parts.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
9172696324 postcodes: add support for optional spaces 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
49626ba709 add postcode formats with optional country code
If the country code is not part of the mandatory output, the
country code filter will do the correct handling.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
baee6f3de0 postcodes: strip leading country codes 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
28ab2f6048 add postcodes patterns without optional spaces 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
90d4d339db initial postcode cleaner for simple patterns
Moves postcodes that are either in countries without a postcode
system or don't correspond to the local pattern for postcodes into
a field for a normal address part. Makes them searchable but not as
a special address. This has two consequences: they are no longer a
skippable part of the address and the postcodes cannot be searched
on their own.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
6e0014e138 add postcode patterns for numeric postcodes
Adds patterns for countries that have simple numeric-only postcodes.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
8080625747 remove postcodes from countries that don't have them
The postcodes will only be removed as a 'computed postcode' they
are still searchable for the given object.
2022-06-23 23:42:31 +02:00
Sarah Hoffmann
21fb501699 add info about countries without a postcode 2022-06-23 23:42:31 +02:00
Sarah Hoffmann
0cd3a1b9bd avoid near searches in very large areas
At some point the contains call becomes too expensive.
2022-06-23 23:42:09 +02:00
Sarah Hoffmann
8de483a45b
Merge pull request #2755 from Luflosi/fix-typo
Fix typo
2022-06-20 22:23:36 +02:00
Luflosi
3ea87169ac
Fix typo 2022-06-20 20:41:00 +02:00
Sarah Hoffmann
42d16d8296
Merge pull request #2751 from mtmail/issue-2750
Documentation fix: should be "nominatim refresh"
2022-06-20 10:21:06 +02:00
marc tobias
adf3ae004f Documentation fix: should be "nominatim refresh" 2022-06-20 02:32:23 +02:00
Sarah Hoffmann
fced1172c4
Merge pull request #2746 from bgo-eiu/patch-2
Added additional languages for Pakistan in country settings
2022-06-18 09:40:47 +02:00
Sarah Hoffmann
299e98776e
Merge pull request #2749 from stefkiourk/patch-1
Typos and syntax on Reverse.md
2022-06-17 22:11:55 +02:00