AntoJvlt
3676310efe
Improved performance of the postcodes query and some code cleaning
2021-06-12 15:46:08 +02:00
AntoJvlt
a4733eed90
Use place instead of placex to compute postcodes
2021-06-09 09:31:32 +02:00
Sarah Hoffmann
72625dc72a
call freeze after running a non-updateable import
...
Some of the tables will have already been removed but
the tables for indexing are still there and should be
dropped.
2021-06-02 11:08:48 +02:00
Sarah Hoffmann
cc2f152d70
commit changes to replication log table
...
Fixes #2350.
2021-05-26 11:47:08 +02:00
Sarah Hoffmann
a0e85cc17c
only initialise tokenizer for refresh functions where needed
...
Fixes #2347.
2021-05-25 19:16:22 +02:00
AntoJvlt
3206bf59df
Resolve conflicts
2021-05-17 13:52:35 +02:00
AntoJvlt
8b8dfc46eb
Added --no-replace command for special phrases importation and added corresponding tests
2021-05-17 13:25:06 +02:00
AntoJvlt
06aab389ed
Code cleaning and SPLoader deleted
2021-05-16 16:59:12 +02:00
Darkshredder
e5ffc59cd5
feat: Added reverse-only-search validation
2021-05-14 02:36:21 +05:30
Sarah Hoffmann
bf864b2c54
index postcodes after refreshing
2021-05-13 14:15:42 +02:00
Sarah Hoffmann
a4aba23a83
move filling of postcode table to python
...
The Python code now takes care of reading postcodes from placex,
enhancing them with potentially existing external postcodes and
updating location_postcodes accordingly. The initial setup and
updates use exactly the same function.
External postcode handling has been generalized. External postcodes
for any country are now accepted. The format of the external postcode
file has changed. We now expect CSV, potentially gzipped. The
postcodes are no longer saved in the database.
2021-05-13 14:15:42 +02:00
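A CSV file that may or may not be gzipped, as described in the commit above, can be read with the Python standard library alone. The following is a minimal sketch, not the project's actual loader: the helper name `read_postcode_rows` is hypothetical, and detecting compression by the `.gz` suffix is an assumption. The log does not state the column layout, so the sketch simply returns whatever the header row defines.

```python
import csv
import gzip
from pathlib import Path


def read_postcode_rows(path):
    """Yield one dict per CSV row from a plain or gzipped file.

    Hypothetical helper for illustration only.
    """
    path = Path(path)
    # Assumption: gzipped files are recognised by their '.gz' suffix.
    opener = gzip.open if path.suffix == '.gz' else open
    # Text mode with newline='' is what the csv module expects.
    with opener(path, 'rt', encoding='utf-8', newline='') as fd:
        yield from csv.DictReader(fd)
```

Because both branches open the file in text mode, the caller does not need to know which variant it was given.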
AntoJvlt
9d83da830f
Introduction of SPCsvLoader to load special phrases from a csv file
2021-05-10 23:26:39 +02:00
AntoJvlt
00959fac57
Refactoring loading of external special phrases and importation process by introducing SPLoader and SPWikiLoader
2021-05-10 21:49:31 +02:00
Sarah Hoffmann
ced8f0f4a2
fix linting issues
2021-04-30 17:59:50 +02:00
Sarah Hoffmann
388ebcbae2
move index creation for word table to tokenizer
...
This introduces a finalization routine for the tokenizer
where it can post-process the import if necessary.
2021-04-30 17:41:08 +02:00
Sarah Hoffmann
7cb7cf848d
move amenity creation to tokenizer
...
The BDD tests still use the old-style amenity creation scripts
because we don't have simple means to import a hand-crafted
test file of special phrases right now.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
bef300305e
move default country name creation to tokenizer
...
The new function is also used when a country is updated. All SQL
functions related to country names have been removed.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
ffc2d82b0e
move postcode normalization into tokenizer
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
e1c5673ac3
require tokenizer for indexer
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
fbbdd31399
move word table and normalisation SQL into tokenizer
...
Creating and populating the word table is now the responsibility
of the tokenizer.
The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
296a66558f
move module installation to legacy tokenizer
2021-04-30 11:29:57 +02:00
Sarah Hoffmann
af968d4903
introduce tokenizer modules
...
This adds the boilerplate for selecting configurable tokenizers.
A tokenizer can be chosen at import time and will then install
itself such that it is fixed for the given database import even
when the software itself is updated.
The legacy tokenizer implements Nominatim's traditional algorithms.
2021-04-30 11:29:57 +02:00
AntoJvlt
1b68152fb2
reorganization of folder/file for the special phrases importer
2021-04-25 17:57:42 +02:00
Sarah Hoffmann
89c90bedb9
pylint: disable check too-few-public-methods
2021-04-24 11:39:44 +02:00
Sarah Hoffmann
79d55357e8
simplify sql and website creation functions
2021-04-19 10:53:30 +02:00
Sarah Hoffmann
d74ae669e3
add support index when continuing import at index phase
...
Indexing scans the placex table sequentially
on the initial import. That is okay because we know that all
rows need to be processed anyway. When continuing the import,
however, a large part might already be indexed, so that the
process spends a lot of time going through rows that are no
longer of interest. Create a supporting index for all unindexed
rows to speed up the scan. This is the same index as used later
for updates.
2021-04-17 11:07:04 +02:00
Sarah Hoffmann
da98a2102a
remove transition functions from Python
2021-04-16 18:41:14 +02:00
Sarah Hoffmann
886a01c796
port function to compute initial postcodes to Python
2021-04-16 16:11:20 +02:00
Sarah Hoffmann
76b1885595
use absolute imports in Python code
...
Relative imports are no longer officially recommended.
2021-04-16 14:20:09 +02:00
Darkshredder
21b1b75b08
Rebase with master
2021-03-29 14:00:45 +05:30
Sarah Hoffmann
09b2510219
Merge pull request #2228 from AntoJvlt/import-special-phrases-porting-python
...
Import special phrases porting python
2021-03-29 09:49:35 +02:00
AntoJvlt
57ce75eb67
Change command 'import-special-phrases --from-wiki' to 'special-phrases --import-from-wiki'.
2021-03-26 02:22:38 +01:00
AntoJvlt
2c19bd5ea3
Encapsulation of tools/special_phrases.py into SpecialPhrasesImporter class and add new tests.
2021-03-25 21:13:57 +01:00
AntoJvlt
6d56cbb3e8
Changed phrase_settings.py to phrase-settings.json and added migration function for old php settings file.
2021-03-23 23:30:39 +01:00
marc tobias
87d5883ddb
nominatim -h was printing wrong text for lookup and details
2021-03-21 16:06:41 +01:00
AntoJvlt
17cb59efbd
Ported functions for the import of special phrases from php to python.
...
- the command is now --import-special-phrases
- the output is no longer an SQL file; data is imported directly into the database.
- the short part of the documentation (section data import) has been updated.
2021-03-20 19:11:50 +01:00
Darkshredder
7a874d5b97
Ported createCountryNames() to python and added tests
2021-03-12 10:28:41 +05:30
Sarah Hoffmann
9086a794a1
Merge pull request #2204 from darkshredder/tiger-data
...
Ported tiger-data-import to Python and Added Tarball Support
2021-03-11 22:48:38 +01:00
Darkshredder
122c4618b9
Linting fixes
2021-03-08 22:59:51 +05:30
Darkshredder
2af82975cd
Ported tiger-data-import to python and Added Tarball Support
2021-03-08 21:57:56 +05:30
Sarah Hoffmann
764a41b973
automatic migration from 3.6 release
...
Adds an 'admin --migrate' command that checks for the current
database version and runs any necessary migrations. Also
has migrations going back to 3.6.
2021-03-06 16:36:57 +01:00
Sarah Hoffmann
09f4d767e4
port index creation to python
...
Also switches to jinja-based preprocessing, which makes it possible to
simplify the SQL files. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
2021-03-04 11:11:47 +01:00
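The point about 'if not exists' in the commit above is that index creation becomes idempotent, so rerunning the step only creates what is missing. The following is a small sketch of that idea. It uses an in-memory SQLite database purely as a stand-in so the example runs without a server; the table, index names and helper functions are all hypothetical and are not taken from the project's SQL files.

```python
import sqlite3

# Hypothetical statements for illustration; not the project's real SQL.
INDEX_SQL = [
    "CREATE INDEX IF NOT EXISTS idx_demo_rank ON demo_place (rank)",
    "CREATE INDEX IF NOT EXISTS idx_demo_name ON demo_place (name)",
]


def create_indexes(conn):
    """Run every statement; those for already existing indexes do nothing."""
    for sql in INDEX_SQL:
        conn.execute(sql)
    conn.commit()


def list_indexes(conn):
    """Return the names of all indexes in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_place (name TEXT, rank INTEGER)")

create_indexes(conn)                      # first run creates both indexes
conn.execute("DROP INDEX idx_demo_name")  # simulate a missing index
create_indexes(conn)                      # rerun restores it without errors
```

Without the `IF NOT EXISTS` clause the second call would fail on the index that is still present, and the step could not simply be repeated.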
Sarah Hoffmann
eacabb0e96
move table creation to jinja-based preprocessing
2021-03-03 22:07:51 +01:00
Sarah Hoffmann
3a0a4b9175
save software version in the database
...
The version represents the software version that was used to
import the data.
2021-03-01 20:35:15 +01:00
Sarah Hoffmann
b4f64aa770
make sure that calls to PHP legacy scripts are fatal on error
2021-03-01 16:10:45 +01:00
Sarah Hoffmann
dd03aeb966
bdd: use python library where possible
...
Replace calls to PHP scripts with direct calls into the
nominatim Python library where possible. This speeds up
tests quite a bit.
2021-02-26 16:14:29 +01:00
Sarah Hoffmann
15b5906790
move setup function to python
...
There are still back-calls to PHP for some of the sub-steps.
These need some larger refactoring to be moved to Python.
2021-02-26 15:02:39 +01:00
Sarah Hoffmann
57db5819ef
port load-data function to python
2021-02-25 21:32:40 +01:00
Sarah Hoffmann
3c186f8030
add a function for the initial indexing run
...
Also moves postcodes to fully parallel indexing.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
c7fd0a7af4
port wikipedia importance functions to python
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
32683f73c7
move import-data option to native python
...
This adds a new dependency on the Python psutil package.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
7222235579
introduce custom object for cmdline arguments
...
Allows defining special functions over the arguments.
Also splits CLI tests in two files as they have become too many.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
f6e894a53a
port database setup function to python
...
Hide the former PHP functions in a transition command until
they are removed.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
e520613362
convert connect() into a context manager
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
389138abfe
port setup-website to python
2021-02-19 17:51:06 +01:00
Sarah Hoffmann
b169e4c88c
port check-database function to python
...
This change also adapts the hints to use the nominatim tool.
Slightly changed checks, so that they are just as effective on
a frozen database.
2021-02-18 17:32:30 +01:00
Sarah Hoffmann
101a1f895d
port freeze function to python
2021-02-17 21:43:15 +01:00
Sarah Hoffmann
7cc4c53adb
always return 0 for updates unless there is an error
...
This is more in line with previous behaviour than returning
a status code when no updates are available.
2021-02-11 10:33:49 +01:00
Sarah Hoffmann
de37dc9300
forgot to replace one occurrence of sql_dir
2021-02-09 19:32:05 +01:00
Sarah Hoffmann
b9517c99ae
rename sql directory to lib-sql
...
Also introduces a separate constant for the sql directory, so that
it can be put separately from the rest of the data if required.
2021-02-09 15:26:56 +01:00
Sarah Hoffmann
d81e152804
integrate analyse of indexing into nominatim tool
2021-02-08 22:22:49 +01:00
Sarah Hoffmann
0cbf98c020
consolidate warm and db-check into single admin command
2021-02-08 21:05:06 +01:00
Sarah Hoffmann
195f9f5ef3
split cli.py by subcommands
...
Reduces file size below 1000 lines.
2021-02-08 17:23:05 +01:00