Commit Graph

3446 Commits

Author SHA1 Message Date
Sarah Hoffmann
641f261495
Merge pull request #2525 from lonvia/fix-replication-indexer
Fix instantiation of indexer for replication
2021-11-19 16:16:30 +01:00
Sarah Hoffmann
10e979e841 only instantiate indexer once for replication
Also makes sure that indexer object exists everywhere were needed.

See #2518.
2021-11-19 14:48:58 +01:00
Sarah Hoffmann
8dc1441635
Merge pull request #2517 from lonvia/transliteration-special-chars
ICU: avoid non-alphanumerical characters in transliteration
2021-11-11 07:42:42 +01:00
Sarah Hoffmann
c79dcfad9a make sure housenumbers are properly quoted 2021-11-10 20:44:28 +01:00
Sarah Hoffmann
1886952666 avoid special characters in word tokens
Transliteration should only consist of ASCII letters
and numbers. Avoid any other characters.
2021-11-10 17:14:13 +01:00
Sarah Hoffmann
7326b246b7
Merge pull request #2516 from lonvia/test-for-website-dir
Better error reporting when API script does not exist
2021-11-10 13:27:09 +01:00
Sarah Hoffmann
345c812e43 better error reporting when API script does not exist
Check if the API script exists on the expected location before
running php-cli. This way we can add a useful hint about the
project directory.

Fixes #2513.
2021-11-10 11:58:20 +01:00
Sarah Hoffmann
fd4ba3989e
Merge pull request #2511 from lonvia/fix-combination-error-needs-address
Fix boolean combination of NeedsAddress flag
2021-11-06 12:11:55 +01:00
Sarah Hoffmann
e2d2571ad0 fix combination of NeedsAddress flag
When dealing with multiple partial terms, only keep the
flag, when all partial terms are so frequent as to need
an address.

Fixes #2510.
2021-11-05 22:18:37 +01:00
Sarah Hoffmann
d479a0585d prepare release 4.0.0 2021-11-02 20:27:55 +01:00
Sarah Hoffmann
addfae31b6 fix typo 2021-11-02 11:09:17 +01:00
Sarah Hoffmann
ccf61db726
Merge pull request #2502 from lonvia/improve-development-documentation
Extend developer's documentation
2021-11-01 16:12:23 +01:00
Sarah Hoffmann
5b86b2078a docs: add overview over indexing 2021-11-01 11:04:03 +01:00
Sarah Hoffmann
a069479340 docs: section about database layout
Replaces the import description which basically was
table layout only now.
2021-10-29 12:03:22 +02:00
Sarah Hoffmann
d11bf9288e
Merge pull request #2498 from lonvia/ordering-for-unlisted-place-results
Include unlisted places in ordering by housenumber
2021-10-28 15:28:47 +02:00
Sarah Hoffmann
86eeb4d2ed
Merge pull request #2497 from lonvia/docs-maintenance
docs: add new maintenance section
2021-10-28 11:33:34 +02:00
Sarah Hoffmann
2275fe59ab include unlisted places in ordering by housenumber
When ordering results by the fact that they have a housenumber,
also take cases into account where the housenumber is on the
place itself. This may happen when the search includes the name
of the place and the housenumber or for addr:place addresses
where the place is unlisted.
2021-10-28 11:27:31 +02:00
Sarah Hoffmann
48be8c33ba docs: add new maintenance section
currently used for postcode updates, word count updates and
deleted relations.
2021-10-28 09:22:37 +02:00
Sarah Hoffmann
d3d07128b2
Merge pull request #2495 from lonvia/fix-normalization-in-php
ICU: use correct normalization during search
2021-10-27 14:40:42 +02:00
Sarah Hoffmann
37eeccbf4c ICU: use normalization from config in PHP
The TERM_NORMALIZATION config option is no longer applicable.
That was already documented but not yet implemented.
2021-10-27 11:32:44 +02:00
Sarah Hoffmann
1722fc537f bdd: add tests for non-latin scripts 2021-10-26 17:29:03 +02:00
Sarah Hoffmann
b240b182cb
Merge pull request #2493 from lonvia/handle-frequent-partials
Tune search queries with frequent partial words
2021-10-26 17:00:43 +02:00
Sarah Hoffmann
c0f347fc8c adapt BDD tests to stricter partial search 2021-10-26 15:52:57 +02:00
Sarah Hoffmann
53dbe58ada do not count words when in reverse-only mode 2021-10-26 12:00:13 +02:00
Sarah Hoffmann
2c4b798f9b further refactor setup to keep function small 2021-10-26 12:00:13 +02:00
Sarah Hoffmann
1cf14a8e94 searches for house numbers must have an address 2021-10-26 12:00:13 +02:00
Sarah Hoffmann
4864bf1509 disallow search for partials without address
Very frequent partial terms take too long to look up and
do not return any valuable results unless the search is
further narrowed down by an address.
2021-10-26 12:00:13 +02:00
Sarah Hoffmann
9934421442 make word count computation part of the import
Accurate word counts are now essential when using
the ICU tokenizer and don't hurt for the legacy one.

Adds about an hour import time.
2021-10-26 12:00:13 +02:00
Sarah Hoffmann
d7267c1603 actions: move ICU tests into its own run 2021-10-26 11:59:13 +02:00
Sarah Hoffmann
5c778c6d32
Merge pull request #2486 from lonvia/fix-special-phrases
Fix parsing of operator in special phrases
2021-10-25 21:45:08 +02:00
Sarah Hoffmann
85797acf1e ICU: add an index over word_ids
Needed for keyword lookup in the details response.
2021-10-25 21:33:27 +02:00
Sarah Hoffmann
c4f5c11a4e be case-insensitve about special phrase operator 2021-10-25 19:51:20 +02:00
Sarah Hoffmann
5a1c3dbea3 fix parsing of operator in special phrases
Because of unstripped input, the operators wouldn't match.
2021-10-25 19:46:30 +02:00
Sarah Hoffmann
8e439d3dd9
Merge pull request #2484 from lonvia/fix-index-use
Reverse: add index hints
2021-10-25 17:20:42 +02:00
Sarah Hoffmann
9ebf921c53
Merge pull request #2483 from lonvia/fix-warming
Fix warming for ICU tokenizer
2021-10-25 16:21:36 +02:00
Sarah Hoffmann
7bd9094aaa reverse: add index hints
The fairly complex where condition of idx_placex_geometry_placenode
won't always be matched by the query planner if the condition
part doesn't appear verbatim in the query.

Fixes #2480.
2021-10-25 15:01:03 +02:00
Sarah Hoffmann
16cc395f78 fix warming for ICU tokenizer
Running the warm-up search requests requires querying
the most frequent words. This must be done via the tokenizer
to honor the different formats of the word table.
2021-10-25 13:08:16 +02:00
Sarah Hoffmann
13e7398566 allow relative paths for log files 2021-10-25 10:26:05 +02:00
Sarah Hoffmann
8b90ee4364
Merge pull request #2476 from lonvia/harmonize-configuration-file-settings
Standardize handling of file names in configuration values
2021-10-24 10:57:48 +02:00
Sarah Hoffmann
1098ab732f allow relative paths for flatnode file 2021-10-22 17:32:51 +02:00
Sarah Hoffmann
507fdd4f40 switch IMPORT_STYLE to use generic file search
Allows relative paths wrt project directory.
2021-10-22 16:49:57 +02:00
Sarah Hoffmann
0ae8d7ac08 have ADDRESS_LEVEL_CONFIG use load_sub_configuration
This means that relative paths now are looked up in the
project directory.
2021-10-22 16:36:52 +02:00
Sarah Hoffmann
c77df2d1eb replace NOMINATIM_PHRASE_CONFIG with command line option 2021-10-22 14:41:14 +02:00
Sarah Hoffmann
cefae021db doc: clarify relative paths for tokenizer config 2021-10-21 16:38:06 +02:00
Sarah Hoffmann
771aee8cd8
Merge pull request #2475 from lonvia/catchup-mode
Add catch-up mode to replication and extend documentation for updating
2021-10-21 16:21:58 +02:00
Sarah Hoffmann
2d13d8b3b6 extend documentation for updating database
Explains the different modes and adds hints for
setting up a systemd job.
2021-10-21 12:14:47 +02:00
Sarah Hoffmann
c1fa70639b add new replication mode catch-up
This mode gets updates until the server reports no new diffs
anymore.

Also adds additional indexing, when the main indexing step left
a couple of objects to process. This happens only when the
next update is expected to be more than 40min away.
2021-10-20 22:05:15 +02:00
Sarah Hoffmann
12643c5986 run Tiger import with parallel threads per default 2021-10-19 15:00:26 +02:00
Sarah Hoffmann
a0f5613a23
Merge pull request #2472 from lonvia/word-count-computation
Fix word count computation for ICU tokenizer
2021-10-19 14:58:57 +02:00
Sarah Hoffmann
824562357b adapt tests for new word count mechanism 2021-10-19 12:03:48 +02:00