44 Commits

Author SHA1 Message Date
Sarah Hoffmann
eb6814d74e convert word info column to json before copying 2021-07-28 11:31:47 +02:00
Sarah Hoffmann
2c8242c8df remove special code for pre9.5 postgresql
9.5 is now the minimum requirement.
2021-07-19 10:24:57 +02:00
Sarah Hoffmann
6f6681ce67 add helper function for execute_values
Make psycopg2's convenience function accessible through
the cursor.
2021-07-12 21:08:20 +02:00
Sarah Hoffmann
06602b4ec0 provide wrapper function for DROP TABLE
Use psycopg2 formatting to ensure correct quoting.
2021-07-12 20:32:46 +02:00
Sarah Hoffmann
cf98cff2a1 more formatting fixes
Found by flake8.
2021-07-12 17:45:42 +02:00
Sarah Hoffmann
f8b5a63de3 factor out connection reset code 2021-07-12 14:58:44 +02:00
Sarah Hoffmann
a08ef43e40 simplify if statements 2021-07-12 11:28:47 +02:00
Sarah Hoffmann
a0a7b05c9f correctly quote strings when copying in data
Encapsulate the copy string in a class that ensures that
copy lines are written with correct quoting.
2021-07-04 10:28:20 +02:00
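The quoting described in a0a7b05c9f can be illustrated with a minimal sketch. The class name `CopyBuffer` and its methods are hypothetical and not taken from the commit. The escaping rules are those of PostgreSQL's COPY text format (tab-separated columns, `\N` for NULL).

```python
import io


class CopyBuffer:
    """Hypothetical helper that collects rows in COPY text format."""

    # Characters that must be backslash-escaped in COPY text format.
    _ESCAPES = str.maketrans({
        '\\': '\\\\',
        '\t': '\\t',
        '\n': '\\n',
        '\r': '\\r',
    })

    def __init__(self):
        self.buffer = io.StringIO()

    def add(self, *columns):
        """Append one row; None is written as the NULL marker \\N."""
        fields = ('\\N' if col is None else str(col).translate(self._ESCAPES)
                  for col in columns)
        self.buffer.write('\t'.join(fields) + '\n')

    def getvalue(self):
        """Return everything written so far as a single string."""
        return self.buffer.getvalue()
```

After rewinding `buffer` with `seek(0)`, such an object could be handed to psycopg2's `cursor.copy_from()`, which expects exactly this tab-separated format by default.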
Sarah Hoffmann
72625dc72a call freeze after running a non-updateable import
Some of the tables will have already been removed but
the tables for indexing are still there and should be
dropped.
2021-06-02 11:08:48 +02:00
Sarah Hoffmann
5feece64c1 use WorkerPool for Tiger data import
Requires adding an option to ignore SQL errors.
2021-05-13 20:36:50 +02:00
Sarah Hoffmann
b9a09129fa move WorkerPool into db module
The pool is independent of the indexer and may also be used
by other parts of the software.
2021-05-13 17:11:17 +02:00
Sarah Hoffmann
6ce6f62b8e fetch place info asynchronously 2021-04-30 17:41:08 +02:00
Sarah Hoffmann
fbbdd31399 move word table and normalisation SQL into tokenizer
Creating and populating the word table is now the responsibility
of the tokenizer.

The get_maxwordfreq() function has been replaced with a
simple template parameter to the SQL during function installation.
The number is taken from the parameter list in the database to
ensure that it is not changed after installation.
2021-04-30 11:30:51 +02:00
Sarah Hoffmann
89c90bedb9 pylint: disable check too-few-public-methods 2021-04-24 11:39:44 +02:00
Sarah Hoffmann
91d2fb6b1c use group() for regex matches
Needed for compatibility with Python 3.5.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
3a642d50a4 use more generic ImportError to check for module
ModuleNotFoundError was only introduced in Python 3.6.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
9685c68e30 replace usages of fromisoformat() with strptime()
fromisoformat was only introduced with Python 3.7 while we
still support Python 3.5.

Fixes #2292.
2021-04-23 22:50:08 +02:00
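A minimal sketch of the kind of replacement 9685c68e30 describes, assuming the input is a timestamp with a fixed `+00:00` suffix (the actual format used in the code base is not shown in the commit message). The function name `parse_iso_date` is hypothetical.

```python
from datetime import datetime, timezone


def parse_iso_date(value):
    """Parse 'YYYY-MM-DDTHH:MM:SS+00:00' without datetime.fromisoformat()."""
    # Before Python 3.7, %z does not accept a colon in the offset,
    # so the fixed UTC suffix is matched literally instead.
    parsed = datetime.strptime(value, '%Y-%m-%dT%H:%M:%S+00:00')
    return parsed.replace(tzinfo=timezone.utc)
```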
Sarah Hoffmann
4fa6c0ad53 simplify constructor for SQL preprocessor
Use sql path from config.
2021-04-19 10:26:25 +02:00
Sarah Hoffmann
76b1885595 use absolute imports in Python code
Relative imports are no longer officially recommended.
2021-04-16 14:20:09 +02:00
Sarah Hoffmann
8d8b1d4307 use non-key index to speed up housenumber search
On PostgreSQL versions 11+ add an index to speed up the lookup
of housenumbers for terms found in search_name. This is really
just a band-aid around the query planner's interpretation of the
query.
2021-04-01 17:10:44 +02:00
Sarah Hoffmann
09f4d767e4 port index creation to python
Also switches to jinja-based preprocessing, which allows
simplifying the SQL files. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
2021-03-04 11:11:47 +01:00
Sarah Hoffmann
eacabb0e96 move table creation to jinja-based preprocessing 2021-03-03 22:07:51 +01:00
Sarah Hoffmann
d2bd6aa78d introduce jinja2 for preprocessing SQL
Replaces various hand-crafted replacements of varying format with
a single Jinja2 templating mechanism. Allows full access to
configuration if necessary.
2021-03-03 17:51:08 +01:00
Sarah Hoffmann
3a0a4b9175 save software version in the database
The version represents the software version that was used to
import the data.
2021-03-01 20:35:15 +01:00
Sarah Hoffmann
b46adbad22 make sure psql always finishes
If an exception is raised by other means, we still have to close
the stdin pipe to psql to make sure that it exits and releases its
connection to the database.
2021-02-27 10:24:40 +01:00
Sarah Hoffmann
c7f40e3cee fix verbose flag for PHP wrapper scripts
The flag must come after the command.
2021-02-26 16:49:32 +01:00
Sarah Hoffmann
c7fd0a7af4 port wikipedia importance functions to python 2021-02-25 18:42:54 +01:00
Sarah Hoffmann
32683f73c7 move import-data option to native python
This adds a new dependency on the Python psutil package.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
f6e894a53a port database setup function to python
Hide the former PHP functions in a transition command until
they are removed.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
b93ec2522e use psql for executing sql files
This allows running larger files without needing to keep
them in memory.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
af7226393a add function to set up libpq environment
Instead of parsing the DSN for each external libpq program we
are going to execute, provide a function that feeds them all
necessary parameters through the environment.

osm2pgsql is the first user.
2021-02-25 18:42:54 +01:00
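The idea in af7226393a can be sketched as follows. The function name `libpq_env` and the simple `keyword=value` DSN parsing are assumptions for illustration (quoted values containing spaces are not handled). The environment variable names are the standard ones read by libpq.

```python
import os

# DSN keywords and the libpq environment variables that carry them.
_DSN_TO_ENV = {
    'dbname': 'PGDATABASE',
    'host': 'PGHOST',
    'port': 'PGPORT',
    'user': 'PGUSER',
    'password': 'PGPASSWORD',
}


def libpq_env(dsn, base_env=None):
    """Return a copy of the environment extended with libpq settings."""
    env = dict(os.environ if base_env is None else base_env)
    for part in dsn.split():
        key, _, value = part.partition('=')
        if key in _DSN_TO_ENV:
            env[_DSN_TO_ENV[key]] = value
    return env
```

The returned dictionary could be passed as the `env` argument of `subprocess.run()` when starting an external program such as osm2pgsql or psql.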
Sarah Hoffmann
e520613362 convert connect() into a context manager 2021-02-25 18:42:54 +01:00
Sarah Hoffmann
a1f0fc1a10 improve deadlock detection for various versions of psycopg2
Psycopg2 has changed the kind of exception that is emitted on
deadlocks between versions 2.7 and 2.8. The code was already
trying to catch both kinds of errors but because the
psycopg2.errors package is unknown in 2.7 and below, the
code would throw an exception on anything but a deadlock error.

This commit wraps the deadlock handling into a context manager
to avoid code duplication and uses module imports to detect if
the new error codes are available.

Also sets the required psycopg2 version to 2.7 or later as
versions below are difficult to test.
2021-02-25 18:11:16 +01:00
Sarah Hoffmann
b169e4c88c port check-database function to python
This change also adapts the hints to use the nominatim tool.
Slightly changed checks, so that they are just as effective on
a frozen database.
2021-02-18 17:32:30 +01:00
Sarah Hoffmann
c9838a02ce disable JIT and parallel execution for osm2pgsql updates again
The gazetteer output doesn't disable these functions when
writing to the place table but the triggers may contain
operations that cause misplanning for the query planner.
2021-02-16 18:23:47 +01:00
Sarah Hoffmann
fbe7be760b ignore failure to get replication date 2021-02-14 12:17:30 +01:00
Sarah Hoffmann
e629a175ed introduce custom UsageError
This is an exception to be thrown when the error occurs because
of bad user data. We don't want to print a full stack trace in
these cases but just tell the user what went wrong.
2021-01-30 16:20:10 +01:00
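A minimal sketch of the pattern e629a175ed describes. Only the name `UsageError` comes from the commit. The `run` wrapper and the message format are hypothetical.

```python
import sys


class UsageError(Exception):
    """Raised when an error is caused by bad user data."""


def run(func, *args):
    """Call func and return a process exit code.

    A UsageError is reported as a one-line message without a stack trace.
    Any other exception propagates and produces the usual traceback.
    """
    try:
        func(*args)
    except UsageError as exc:
        print('ERROR: {}'.format(exc), file=sys.stderr)
        return 1
    return 0
```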
Sarah Hoffmann
4cb6dc01f3 port replication update function to python 2021-01-30 15:50:34 +01:00
Sarah Hoffmann
8f0885f6cb port check-for-update function to python 2021-01-28 14:50:14 +01:00
Sarah Hoffmann
d78f0ba804 port replication initialisation to Python 2021-01-26 22:50:54 +01:00
Sarah Hoffmann
94fa7162be port address level computation to Python
Also adds simple tests for correct table creation.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
e6c2842b66 move update code for postcode and word count to Python
Also adds tests for the new function to execute a SQL script.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
c77877a934 implementation of 'nominatim index' 2021-01-18 15:43:27 +01:00
Sarah Hoffmann
27977411e9 move indexing function into its own Python module
This makes it now a standard function of our new Python
library instead of a stand-alone program.
2021-01-18 15:43:27 +01:00