Commit Graph

29 Commits

Author SHA1 Message Date
Sarah Hoffmann
8e1d4818ac use yaml config loader for country info 2021-09-04 00:22:55 +02:00
Sarah Hoffmann
7e7dd769fd remove language and partition from name import 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
79da96b369 read partition and languages from config file 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
78fcabade8 move country name generation to country_info module 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
284645f505 move generation of country tables in own module 2021-09-02 14:41:11 +02:00
Sarah Hoffmann
75a5c7013f split up large setup function 2021-08-15 12:24:13 +02:00
Sarah Hoffmann
87dedde5d6 allow multiple files for the import command
The files are forwarded to osm2pgsql which is now able to merge
them correctly.
2021-08-14 21:42:21 +02:00
Sarah Hoffmann
72625dc72a call freeze after running and non-updateable import
Some of the tables will have already been removed but
the tables for indexing are still there and should be
dropped.
2021-06-02 11:08:48 +02:00
Darkshredder
e5ffc59cd5 feat: Added reverse-only-search validation 2021-05-14 02:36:21 +05:30
Sarah Hoffmann
a4aba23a83 move filling of postcode table to python
The Python code now takes care of reading postcodes from placex,
enhancing them with potentially existing external postcodes and
updating location_postcodes accordingly. The initial setup and
updates use exactly the same function.

External postcode handling has been generalized. External postcodes
for any country are now accepted. The format of the external postcode
file has changed. We now expect CSV, potentially gzipped. The
postcodes are no longer saved in the database.
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
388ebcbae2 move index creation for word table to tokenizer
This introduces a finalization routing for the tokenizer
where it can post-process the import if necessary.
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
bef300305e move default country name creation to tokenizer
The new function is also used, when a country us updated. All SQL
function related to country names have been removed.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
ffc2d82b0e move postcode normalization into tokenizer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
e1c5673ac3 require tokeinzer for indexer 2021-04-30 11:30:51 +02:00
Sarah Hoffmann
fbbdd31399 move word table and normalisation SQL into tokenizer
Creating and populating the word table is now the responsibility
of the tokenizer.

The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
296a66558f move module installation to legacy tokenizer 2021-04-30 11:29:57 +02:00
Sarah Hoffmann
af968d4903 introduce tokenizer modules
This adds the boilerplate for selecting configurable tokenizers.
A tokenizer can be chosen at import time and will then install
itself such that it is fixed for the given database import even
when the software itself is updated.

The legacy tokenizer implements Nominatim's traditional algorithms.
2021-04-30 11:29:57 +02:00
Sarah Hoffmann
79d55357e8 simplify sql and website creation functions 2021-04-19 10:53:30 +02:00
Sarah Hoffmann
d74ae669e3 add support index when continuing import at index phase
Indexing scans the placex table sequentially during indexing
on the initial import. That is okay because we know that all
rows need to be processed anywhere. When continuing the import,
however, a large part might already be indexed, so that the
process spends a lot of time going through rows that are no
longer of interest. Create a supporting index for all unindexed
rows to speed up the scan. This is the same index as used later
for updates.
2021-04-17 11:07:04 +02:00
Sarah Hoffmann
886a01c796 port function to compute initial postcodes to Python 2021-04-16 16:11:20 +02:00
Sarah Hoffmann
76b1885595 use absolute imports in Python code
Relative imports are no longer officially recommended.
2021-04-16 14:20:09 +02:00
Darkshredder
21b1b75b08 Rebase with master 2021-03-29 14:00:45 +05:30
Darkshredder
7a874d5b97 Ported createCountryNames() to python and added tests 2021-03-12 10:28:41 +05:30
Sarah Hoffmann
09f4d767e4 port index creation to python
Also switches to jinja-based preprocessing, which allows to
simplify the SQL files. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
2021-03-04 11:11:47 +01:00
Sarah Hoffmann
eacabb0e96 move table creation to jinja-based preprocessing 2021-03-03 22:07:51 +01:00
Sarah Hoffmann
3a0a4b9175 save software version in the database
The version represents the software version that was used to
import the data.
2021-03-01 20:35:15 +01:00
Sarah Hoffmann
b4f64aa770 make sure that calls to PHP legacy scripts are fatal on error 2021-03-01 16:10:45 +01:00
Sarah Hoffmann
dd03aeb966 bdd: use python library where possible
Replace calls to PHP scripts with direct calls into the
nominatim Python library where possible. This speed up
tests quite a bit.
2021-02-26 16:14:29 +01:00
Sarah Hoffmann
15b5906790 move setup function to python
There are still back-calls to PHP for some of the sub-steps.
These needs some larger refactoring to be moved to Python.
2021-02-26 15:02:39 +01:00