Sarah Hoffmann
1098ab732f
allow relative paths for flatnode file
2021-10-22 17:32:51 +02:00
Sarah Hoffmann
507fdd4f40
switch IMPORT_STYLE to use generic file search
...
Allows relative paths with respect to the project directory.
2021-10-22 16:49:57 +02:00
Sarah Hoffmann
0ae8d7ac08
have ADDRESS_LEVEL_CONFIG use load_sub_configuration
...
This means that relative paths are now looked up in the
project directory.
2021-10-22 16:36:52 +02:00
Sarah Hoffmann
c77df2d1eb
replace NOMINATIM_PHRASE_CONFIG with command line option
2021-10-22 14:41:14 +02:00
Sarah Hoffmann
cefae021db
doc: clarify relative paths for tokenizer config
2021-10-21 16:38:06 +02:00
Sarah Hoffmann
771aee8cd8
Merge pull request #2475 from lonvia/catchup-mode
...
Add catch-up mode to replication and extend documentation for updating
2021-10-21 16:21:58 +02:00
Sarah Hoffmann
2d13d8b3b6
extend documentation for updating database
...
Explains the different modes and adds hints for
setting up a systemd job.
2021-10-21 12:14:47 +02:00
Sarah Hoffmann
c1fa70639b
add new replication mode catch-up
...
This mode gets updates until the server reports no more new
diffs.
Also adds an additional indexing run when the main indexing step
left a couple of objects to process. This happens only when the
next update is expected to be more than 40 minutes away.
2021-10-20 22:05:15 +02:00
Sarah Hoffmann
12643c5986
run Tiger import with parallel threads by default
2021-10-19 15:00:26 +02:00
Sarah Hoffmann
a0f5613a23
Merge pull request #2472 from lonvia/word-count-computation
...
Fix word count computation for ICU tokenizer
2021-10-19 14:58:57 +02:00
Sarah Hoffmann
824562357b
adapt tests for new word count mechanism
2021-10-19 12:03:48 +02:00
Sarah Hoffmann
ec7184c533
icu: no longer precompute terms
...
The ICU analyzer no longer drops frequent partials, so it is no
longer necessary to know the frequencies in advance.
2021-10-19 11:52:28 +02:00
Sarah Hoffmann
e8e2502e2f
make word recount a tokenizer-specific function
2021-10-19 11:21:16 +02:00
Sarah Hoffmann
c86cfefc48
Merge pull request #2471 from lonvia/update-install-rules
...
Reorganise, update and extend documentation
2021-10-19 09:11:16 +02:00
Sarah Hoffmann
2635fe8b4c
docs: fix more links
2021-10-18 17:26:14 +02:00
Sarah Hoffmann
632436d54d
docs: refer to our new Settings chapter in the import instructions
2021-10-18 17:02:52 +02:00
Sarah Hoffmann
74be6828dd
check and fix all links in documentation
2021-10-18 16:53:24 +02:00
Sarah Hoffmann
f4acfed48f
add extended documentation of settings
2021-10-18 16:30:52 +02:00
Sarah Hoffmann
91e1c1bea8
docs: update overview pages
2021-10-18 09:04:06 +02:00
Sarah Hoffmann
bbb9a41ea4
docs: move place ranking into customization part
2021-10-18 09:04:06 +02:00
Sarah Hoffmann
f6418887b2
docs: nominatim-ui has a new place for custom config
2021-10-18 09:04:06 +02:00
Sarah Hoffmann
a3f8a097a1
docs: move import style description to customize section
2021-10-18 09:04:06 +02:00
Sarah Hoffmann
751563644f
docs: make customization chapter a separate section
2021-10-18 09:04:01 +02:00
Sarah Hoffmann
e52b801cd0
fix typo
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
445a6428a6
docs: remove the development warning for ICU tokenizer
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
d59b26dad7
docs: add a warning about using --no-updates with TIGER data
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
47417d1871
update and extend man page
...
Provide extended descriptions for most subcommands.
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
381aecb952
rename manual directory to man
...
Avoids confusion between 'docs' and 'manual'.
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
45344575c6
add munin scripts and ICU subrules to installation
2021-10-18 09:03:07 +02:00
Sarah Hoffmann
83381625bd
Merge pull request #2469 from lonvia/fix-tablespace-assignment
...
Fix template expressions for tablespaces
2021-10-15 18:20:43 +02:00
Sarah Hoffmann
552fb16cb2
fix template expressions for tablespaces
2021-10-15 15:11:09 +02:00
Sarah Hoffmann
75c631f080
Merge pull request #2450 from mtmail/tiger-data-2021
...
US TIGER data 2021 released
2021-10-11 19:22:15 +02:00
Sarah Hoffmann
e2464fdf62
Merge pull request #2465 from lonvia/use-spgist-index
...
Use SP-GIST for building index
2021-10-11 10:48:44 +02:00
Sarah Hoffmann
9ff98073db
remove outdated country_languages.php
2021-10-10 21:58:43 +02:00
Sarah Hoffmann
98ee5def37
add recommendation for Postgis 3+
2021-10-10 21:55:38 +02:00
Sarah Hoffmann
3649487f5e
use SP-GIST index for building index where available
...
Point-in-polygon queries are much faster with an SP-GIST geometry
index, so use that for the index used to check if a housenumber
is inside a building.
Only available with Postgis 3. There is an automatic fallback to
GIST for Postgis 2.
2021-10-10 21:55:38 +02:00
Sarah Hoffmann
4b007ae740
Merge pull request #2460 from lonvia/multiple-analyzers
...
Add support for multiple token analyzers
2021-10-09 14:41:09 +02:00
Sarah Hoffmann
6c79a60e19
add documentation for new configuration of ICU tokenizer
2021-10-07 11:55:53 +02:00
Sarah Hoffmann
2a94bfc703
fix argument description for check_database
2021-10-07 09:49:13 +02:00
Sarah Hoffmann
299934fd2a
reorganize and complete tests around generic token analysis
2021-10-06 17:03:37 +02:00
Sarah Hoffmann
b18d042832
add tests for sanitizer tagging language
2021-10-06 12:29:25 +02:00
Sarah Hoffmann
97a10ec218
apply variants by languages
...
Adds a tagger for names by language so that the analyzer of that
language is used. Thus variants are now only applied to names
in the specific language and only to name tags, no longer to
reference-like tags.
2021-10-06 11:09:54 +02:00
Sarah Hoffmann
d35400a7d7
use analyzer provided in the 'analyzer' property
...
Implements per-name choice of analyzer. If a non-default
analyzer is chosen, then the 'word' identifier is extended
with the name of the analyzer, so that we still have unique
items.
2021-10-05 14:10:32 +02:00
Sarah Hoffmann
92f6ec2328
remove support for properties on variants
...
Those are not going to be used in the near future, so no need to
carry that code around just now.
2021-10-05 10:29:36 +02:00
Sarah Hoffmann
9ba2019470
precompute replacements while loading configuration
2021-10-05 10:20:08 +02:00
Sarah Hoffmann
c171d88194
move parsing of token analysis config to analyzer
...
Adds a second callback for the analyzer which is responsible
for parsing the configuration rules and converting them to
whatever format is necessary. This way, each analyzer implementation
can define its own configuration rules.
2021-10-04 18:31:58 +02:00
Sarah Hoffmann
7cfcbacfc7
make token analyzers configurable modules
...
Adds a mandatory section 'analyzer' to the token-analysis entries
which defines which analyzer to use. Currently there is exactly
one, generic, which implements the former ICUNameProcessor.
2021-10-04 17:37:34 +02:00
Sarah Hoffmann
52847b61a3
extend ICU config to accommodate multiple analyzers
...
Adds parsing of multiple variant lists from the configuration.
Every entry except one must have a unique 'id' parameter to
distinguish the entries. The entry without id is considered
the default. Currently only the list without an id is used
for analysis.
2021-10-04 16:40:28 +02:00
Sarah Hoffmann
5a36559834
move flatten_config_list into config module
...
For general usage by other modules.
2021-10-04 11:56:54 +02:00
Sarah Hoffmann
19d4e047f6
Merge pull request #2458 from lonvia/add-tokenizer-preprocessing
...
Add a "sanitation" step for name and address tags before token processing
2021-10-01 21:53:34 +02:00