Sarah Hoffmann
344a2bfc1a
add new command for cleaning word tokens
...
Just pulls outdated housenumbers for the moment.
2022-01-20 20:05:15 +01:00
Sarah Hoffmann
86588419fb
Merge pull request #2588 from lonvia/housenumber-sanitizer
...
Move housenumber parsing into sanitizer
2022-01-20 17:44:24 +01:00
Sarah Hoffmann
d09db09849
adapt ICU tests to new housenumber sanitizer
...
Restrict tests to making sure that handing in multiple housenumbers
works.
2022-01-20 16:05:49 +01:00
Sarah Hoffmann
1e5a8561c0
fix linting issues
2022-01-20 16:00:23 +01:00
Sarah Hoffmann
f3c9578bca
complete documentation for new clean-housenumbers sanitizer
2022-01-20 15:49:32 +01:00
Sarah Hoffmann
3741afa6dc
generalize filter-kind parameter for sanitizers
...
Now behaves the same for tag_analyzer_by_language and
clean_housenumbers. Adds tests.
2022-01-20 15:42:42 +01:00
Sarah Hoffmann
560a006892
add pytest config
...
We are using custom marks now which need to be registered to avoid
warnings.
2022-01-20 15:38:02 +01:00
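As an illustration of what registering a custom mark involves, here is a minimal sketch using the `pytest_configure` hook in a `conftest.py`. The commit itself adds a pytest config file, so this is an equivalent alternative rather than the change that was made, and the mark name `sanitizer_params` is used purely as an example.

```python
# conftest.py (sketch): register a custom mark so pytest does not warn about it.
# The mark name and description below are illustrative assumptions.

def pytest_configure(config):
    """Hook called by pytest at startup; adds one line to the 'markers' ini option."""
    config.addinivalue_line(
        "markers",
        "sanitizer_params: keyword arguments handed to the sanitizer under test",
    )
```

With the mark registered, tests decorated with `@pytest.mark.sanitizer_params(...)` no longer trigger an unknown-mark warning.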
Sarah Hoffmann
4774e45218
clean_housenumbers: make kinds and delimiters configurable
...
Also adds unit tests for various options.
2022-01-20 12:07:12 +01:00
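A rough sketch of splitting a housenumber value on configurable delimiters might look as follows. The function name, the parameter name, and the default delimiter set are assumptions made for this example and are not taken from the sanitizer's code.

```python
import re


def split_housenumbers(value, delimiters=",;"):
    """Split a raw housenumber string on any of the delimiter characters.

    Hypothetical helper: surrounding whitespace is stripped and empty
    parts are dropped.
    """
    if not delimiters:
        stripped = value.strip()
        return [stripped] if stripped else []

    # Build a character class such as "[,;]" from the configured delimiters.
    pattern = "[" + re.escape(delimiters) + "]"
    parts = (part.strip() for part in re.split(pattern, value))
    return [part for part in parts if part]
```

For example, `split_housenumbers("3, 5;7")` yields three separate housenumbers, while a value without any delimiter comes back as a single-element list.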
Sarah Hoffmann
206ee87188
factor out housenumber splitting into sanitizer
2022-01-19 17:27:50 +01:00
Sarah Hoffmann
a7e048484b
Merge pull request #2585 from lonvia/name-mutations
...
Introduce character mutations to token analysis
2022-01-19 17:09:36 +01:00
Sarah Hoffmann
d6b5f2f5da
docs: add pointer to caddy deployment discussion
2022-01-19 15:28:01 +01:00
Sarah Hoffmann
3df560ea38
fix linting error
2022-01-18 11:09:21 +01:00
Sarah Hoffmann
adbaf700cd
move parsing of mutation config to setup phase
2022-01-18 11:09:21 +01:00
Sarah Hoffmann
4a41bff3ab
add documentation for new mutation feature
2022-01-18 11:09:21 +01:00
Sarah Hoffmann
b453b0ea95
introduce mutation variants to generic token analyser
...
Mutations are regular-expression-based replacements that are applied
after variants have been computed. They are meant to be used for
variations on character level.
Add spelling variations for German umlauts.
2022-01-18 11:09:21 +01:00
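The idea of regular-expression-based mutations applied after variant computation can be sketched as below. This is an illustration under stated assumptions, not the token analyser's implementation: the function name and the `(pattern, replacements)` data shape are invented for the example, and patterns are assumed to contain no capture groups.

```python
import itertools
import re


def apply_mutations(variants, mutations):
    """Expand each variant with every combination of replacements.

    `mutations` is a list of (pattern, replacements) pairs. Every match of
    `pattern` may be replaced by any string in `replacements`, so a variant
    with n matches expands into len(replacements) ** n strings.
    """
    for pattern, replacements in mutations:
        regex = re.compile(pattern)
        mutated = []
        for variant in variants:
            # Splitting on the pattern leaves the text between the matches.
            parts = regex.split(variant)
            for choice in itertools.product(replacements, repeat=len(parts) - 1):
                pieces = [parts[0]]
                for replacement, part in zip(choice, parts[1:]):
                    pieces.append(replacement)
                    pieces.append(part)
                mutated.append("".join(pieces))
        variants = mutated
    return variants
```

For German umlauts, a mutation such as `("ä", ["ä", "ae"])` keeps the original spelling and adds the transcribed one. A variant without a match passes through unchanged.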
Sarah Hoffmann
0192a7af96
move variant configuration reading into separate file
2022-01-18 11:09:21 +01:00
Sarah Hoffmann
630ad38a67
refactor variant production to use generators
2022-01-18 11:09:21 +01:00
Sarah Hoffmann
21156fc2a2
Merge pull request #2578 from lonvia/iso-3166-2
...
Make ISO3166-2 references searchable
2022-01-13 14:54:35 +01:00
Sarah Hoffmann
fa99f5bc03
Merge pull request #2579 from geofabrik/doc-update-typo
...
Fix typo in name of service. The rest of the docs call it nominatim-updateS
2022-01-13 14:01:57 +01:00
Amanda McCann
09aa1e7af4
Fix typo in name of service. The rest of the docs call it nominatim-updateS
2022-01-13 13:14:17 +01:00
Sarah Hoffmann
2034ed387b
make ISO3166-2 references searchable
2022-01-13 09:44:42 +01:00
Sarah Hoffmann
d6140d6d54
Merge pull request #2571 from lonvia/ukrainian-apostrophe
...
Consider "modifier letter apostrophe" to be punctuation
2022-01-11 09:41:07 +01:00
Sarah Hoffmann
fb54bd3fcf
consider "modifier letter apostrophe" to be punctuation
...
While technically being a letter, the apostrophe is often replaced
with a normal apostrophe in writing which is a punctuation mark.
This makes sure that the modifier letter apostrophe yields the same
normalization results and thus is really interchangeable.
Only has an effect after the next reimport.
Fixes #2569.
2022-01-10 17:40:03 +01:00
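The effect can be illustrated with a small sketch in plain Python. The actual change concerns the tokenizer's normalization rules, so the function below is only a stand-in: its name is hypothetical, and dropping punctuation outright is an assumption about what normalization does with it.

```python
import unicodedata

MODIFIER_LETTER_APOSTROPHE = "\u02bc"  # Unicode category Lm, i.e. a letter


def normalize(text):
    """Lower-case the text and drop punctuation (illustrative stand-in)."""
    # Map the modifier letter apostrophe onto the ASCII apostrophe first,
    # so that it is subsequently handled like any other punctuation mark.
    text = text.replace(MODIFIER_LETTER_APOSTROPHE, "'")
    return "".join(
        ch for ch in text.lower()
        if not unicodedata.category(ch).startswith("P")
    )
```

With this mapping, a name written with U+02BC and the same name written with a plain apostrophe produce identical normalized strings.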
Sarah Hoffmann
a486ee347a
Merge pull request #2570 from woodpeck/patch-3
...
Fix typos
2022-01-10 14:21:48 +01:00
Frederik Ramm
5fb3582b31
Fix typos
2022-01-10 13:38:53 +01:00
Sarah Hoffmann
8b0b9db31e
Merge pull request #2565 from lonvia/swap-wordset-order
...
Swap order of query interpretation
2022-01-06 09:02:46 +01:00
Sarah Hoffmann
f9889f81d6
swap order of query interpretation
...
A forward interpretation of the form 'street, city, country' is
much more frequent than the reverse form 'country, city, street'.
Thus swap the order of interpretations so that the forward order comes
first.
2022-01-05 15:21:14 +01:00
Sarah Hoffmann
efafa52719
Merge pull request #2562 from lonvia/copyright-headers
...
Add consistent copyright headers
2022-01-04 23:10:37 +01:00
Sarah Hoffmann
c3788d765e
add consistent SPDX copyright headers
2022-01-03 16:23:58 +01:00
Sarah Hoffmann
e407558f76
Merge pull request #2559 from lonvia/disable-jit-in-queries
...
Disable JIT and parallel workers on search frontend
2022-01-03 15:13:57 +01:00
Sarah Hoffmann
042df4198a
disable JIT and parallel workers on search frontend
...
Bad query planning now also interferes with queries for search and
reverse.
2021-12-22 10:47:54 +01:00
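A sketch of switching off both features for a single database session is given below. It assumes a DB-API style cursor and the standard PostgreSQL settings `jit` (available from PostgreSQL 11) and `max_parallel_workers_per_gather`. The search frontend touched by this commit is not written in Python, and the exact statements it issues are not shown in the log, so the helper name and statements here are illustrative only.

```python
# Session-level settings; both are standard PostgreSQL parameters.
# Which statements the frontend really sends is an assumption.
FRONTEND_SESSION_SETTINGS = (
    "SET jit = off",
    "SET max_parallel_workers_per_gather = 0",
)


def disable_jit_and_parallel_workers(cursor):
    """Run the session settings on an open DB-API cursor (hypothetical helper)."""
    for statement in FRONTEND_SESSION_SETTINGS:
        cursor.execute(statement)
```

The settings only affect the current connection, so they would need to be issued each time the frontend opens one.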
Sarah Hoffmann
ab6f35d83a
Merge pull request #2553 from lonvia/revert-street-matching-to-full-names
...
Revert street matching to full names
2021-12-14 15:52:34 +01:00
Sarah Hoffmann
f9b56a8581
correctly match abbreviated addr:street
...
This only works when addr:street is abbreviated and the street
name isn't. It does not work the other way around.
2021-12-08 21:58:43 +01:00
Sarah Hoffmann
fedc8ed474
Merge pull request #2542 from lonvia/update-phpunit
...
Update PHPUnit use to 9.5
2021-12-07 15:44:45 +01:00
Sarah Hoffmann
79aeb31088
restrict PHPUnit to 9.5 version
...
There are so many breaking changes with PHPUnit that it is
impossible to give any other guarantees.
2021-12-07 14:49:31 +01:00
Sarah Hoffmann
04857d32cd
enable PHPUnit 9 for coverage
...
A couple of functions have been renamed.
2021-12-07 12:07:17 +01:00
Sarah Hoffmann
109cdce92c
php unit: replace deprecated regex assert
...
The regEx assertion has been renamed in PHPUnit 9.5
and causes deprecation warnings.
2021-12-07 11:34:21 +01:00
Sarah Hoffmann
b7554d9ed8
php unit: don't enforce a name on the test database
...
Also gets rid of a PHPUnit deprecation warning.
2021-12-07 11:31:45 +01:00
Sarah Hoffmann
6106f1a32e
php test: class must be called like the file
2021-12-07 11:20:38 +01:00
Sarah Hoffmann
f2a8307bb6
disable codecov
...
Not working.
2021-12-07 11:13:30 +01:00
Sarah Hoffmann
470ee7aef9
Merge pull request #2540 from lonvia/remove-support-for-centos7
...
Remove installation instructions for CentOS 7
2021-12-07 09:17:29 +01:00
Sarah Hoffmann
aefca48e78
remove installation instructions for CentOS 7
...
This ends official support for CentOS 7.
2021-12-06 16:05:27 +01:00
Sarah Hoffmann
5e792078b3
remove some odd variants of addr:street from the styles
...
Some import has added names in partial tags which confuse the
street name matching.
2021-12-06 15:17:00 +01:00
Sarah Hoffmann
7f7d2fd5b3
skip most addr: tags with suffixes
...
Only one addr: tag can be processed currently, so make
sure it is the one without suffixes to not get odd data.
addr:street is the exception because it uses a different
matching mechanism.
2021-12-06 14:55:10 +01:00
Sarah Hoffmann
5e435b41ba
ICU: matching any street name will do again
2021-12-06 14:26:08 +01:00
Sarah Hoffmann
44cfce1ca4
revert to using full names for street name matching
...
Using partial names turned out to not work well because there are
often similarly named streets next to each other. It also
prevents us from being able to take into account all addr:street:*
tags.
This change gets all the full term tokens for the addr:street tags
from the DB. As they are used for matching only, we can assume that
the term must already be there or there will be no match. This
avoids creating unused full name tags.
2021-12-06 11:38:38 +01:00
Sarah Hoffmann
bb175cc958
Merge pull request #2539 from lonvia/clean-up-python-tests
...
Restructure and extend python unit tests
2021-12-03 17:08:25 +01:00
Sarah Hoffmann
5a9fb6eaf7
specify text type in test SQL
...
Older versions of PostgreSQL fail otherwise.
2021-12-03 13:56:23 +01:00
Sarah Hoffmann
54d35ddfe9
split cli tests by subcommand and extend coverage
2021-12-02 23:45:48 +01:00
Sarah Hoffmann
7beccb7997
remove unnecessary pass statements
2021-12-02 15:54:24 +01:00