Commit Graph

612 Commits

Author SHA1 Message Date
AntoJvlt
1b68152fb2 reorganization of folder/file for the special phrases importer 2021-04-25 17:57:42 +02:00
Sarah Hoffmann
9685c68e30 replace usages of fromisoformat() with strptime()
fromisoformat was only introduced with Python 3.7 while we
still support Python 3.5.

Fixes #2292.
2021-04-23 22:50:08 +02:00
Sarah Hoffmann
788baafa26 bdd tests: fix place dependen ranking tests
The ranks of places may differ for some countries. Force the
place nodes in the test on null island which always uses the
default ranking.
2021-04-22 17:31:00 +02:00
Sarah Hoffmann
50b6d7298c factor out async connection handling into separate class
Also adds a test for reconnecting regularly while indexing.
2021-04-20 14:08:37 +02:00
Sarah Hoffmann
b88b952f56 simplify token precomputation
Rename function to reflect that it is only used for precomputation.
The token IDs are not really needed, so don't bother to compute
the array of tokens.
2021-04-19 17:24:19 +02:00
Darkshredder
1f898405a6 Fix: tiger-data tarfile test 2021-04-19 16:02:52 +05:30
Sarah Hoffmann
79d55357e8 simplify sql and website creation functions 2021-04-19 10:53:30 +02:00
Sarah Hoffmann
4fa6c0ad53 simplify constructor for SQL preprocessor
Use sql path from config.
2021-04-19 10:26:25 +02:00
Sarah Hoffmann
8f63f9516b simplify interface for adding tiger data
Also simplifies tests using existing fixtures.
2021-04-19 10:26:25 +02:00
AntoJvlt
b2ae715699 Only log a warning if a wrong input is detected on the wiki while importing special phrases 2021-04-17 20:19:39 +02:00
AntoJvlt
ec859e41c6 Cleaned tests and add database cleaning tests on test_import_from_wiki 2021-04-17 19:23:33 +02:00
Sarah Hoffmann
2ca11ccc6b add tests for continuing import 2021-04-17 11:10:36 +02:00
Sarah Hoffmann
0f11e311c4 add test for new postcode import function 2021-04-16 16:11:20 +02:00
Sarah Hoffmann
c64193f839
Merge pull request #2263 from AntoJvlt/special-phrases-autoupdate
Implemented auto update of special phrases while importing them
2021-04-15 10:13:25 +02:00
Darkshredder
49ee7505ed Fix: Removed error if endstatement is wrong and improved tests 2021-04-13 15:44:12 +05:30
AntoJvlt
ae2b2cb9a5 Tests added for the auto update of special phrases during import 2021-04-12 14:35:29 +02:00
Sarah Hoffmann
16a66b5326 move transliteration of housenumbers into indexing
Housenumbers are now saved in transliterated form in the housenumber
column. This saves the transliteration step during lookup.
2021-04-04 15:26:47 +02:00
Sarah Hoffmann
3590e76a1c tests for finding non-ascii housenumbers 2021-04-04 15:26:47 +02:00
Darkshredder
0f9df32d11 Added Test for TokenSpecialTerm 2021-04-02 04:49:05 +05:30
AntoJvlt
e82de99e5a Cleaned tests of exceptions and fix phrase_settings.json test file name. 2021-03-29 22:07:29 +02:00
Sarah Hoffmann
09b2510219
Merge pull request #2228 from AntoJvlt/import-special-phrases-porting-python
Import special phrases porting python
2021-03-29 09:49:35 +02:00
AntoJvlt
57ce75eb67 Change command 'import-special-phrases --from-wiki' to 'special-phrases --import-from-wiki'. 2021-03-26 02:22:38 +01:00
AntoJvlt
cde9389e75 Errors fixes, Cleaning code, Improvement and addition of tests 2021-03-26 01:53:33 +01:00
AntoJvlt
2c19bd5ea3 Encapsulation of tools/special_phrases.py into SpecialPhrasesImporter class and add new tests. 2021-03-25 21:13:57 +01:00
AntoJvlt
ff34198569 Code cleaning, tests simplification and use of python3-icu package 2021-03-23 23:56:39 +01:00
AntoJvlt
1ce8b530cd Introduction of PyICU for transliteration in python. Reversed changes in normalization.sql. 2021-03-23 23:34:16 +01:00
AntoJvlt
9d1c23e4f5 Updated specialphrases_testdb.sql 2021-03-20 19:17:03 +01:00
AntoJvlt
17cb59efbd Ported functions for the import of special phrases from php to python.
- the command is now --import-special-phrases
- the output is not an sql file anymore, data are directly imported to the database.
- the little part on the documentation (section data import) has been modified.
2021-03-20 19:11:50 +01:00
Sarah Hoffmann
118befd7d7 bdd tests: make indexing less verbose
Do not print progress info for indexing when there is an error
in the BDD tests.
2021-03-20 10:39:29 +01:00
Sarah Hoffmann
0d9fe6e49c
Merge pull request #2219 from lonvia/bdd-test-remove-php
BDD tests: run all setup via nominatim Python library
2021-03-17 11:40:34 +01:00
Sarah Hoffmann
ebae3553e0 bdd: run all setup via nominatim Python library
Drops all calls to PHP utility functions. nominatim cli functions
are used where possible, to stay as close to the final code as
possible with the tests.

By removing the PHP calls, the test code now only uses osm2pgsql and
the database module from the build directory.
2021-03-16 22:20:41 +01:00
Sarah Hoffmann
4d7c5ec089 reverse: do not prefer interpolations over closer housenumbers
Always look up the closest housenumber before looking up
interpolations. This ensures that closer housenumbers are
preferred over interpolations.

Fixes #2214.
2021-03-15 10:50:04 +01:00
Darkshredder
077a8c1f95 refactored tests and made changes to code for easy readibility 2021-03-12 18:23:20 +05:30
Darkshredder
7a874d5b97 Ported createCountryNames() to python and added tests 2021-03-12 10:28:41 +05:30
Darkshredder
e5719de657 Added fixture for sql_preprocessor and fixed some issues 2021-03-11 15:39:17 +05:30
Darkshredder
8486a83cf5 Added test for tarfile 2021-03-10 18:14:17 +05:30
Darkshredder
ccfad57fca Added test and removed runlegacyscript 2021-03-10 17:18:12 +05:30
Sarah Hoffmann
09f4d767e4 port index creation to python
Also switches to jinja-based preprocessing, which allows to
simplify the SQL files. Use 'if not exists' where possible
so that the step can be rerun to fix missing indexes.
2021-03-04 11:11:47 +01:00
Sarah Hoffmann
eacabb0e96 move table creation to jinja-based preprocessing 2021-03-03 22:07:51 +01:00
Sarah Hoffmann
d2bd6aa78d introduce jinja2 for preprocessing SQL
Replaces various hand-crafted replacements of varying format with
a single Jinja2 templating mechanism. Allows full access to
configuration if necessary.
2021-03-03 17:51:08 +01:00
Sarah Hoffmann
7ae9c3a9f0 add database_version setting to tests 2021-03-01 21:49:33 +01:00
Sarah Hoffmann
3a0a4b9175 save software version in the database
The version represents the software version that was used to
import the data.
2021-03-01 20:35:15 +01:00
Sarah Hoffmann
db663dd92f remove unused import 2021-03-01 09:26:08 +01:00
Sarah Hoffmann
90a5d23016 use tmp_path fixture in config tests 2021-03-01 09:24:04 +01:00
Sarah Hoffmann
afabbeb546 older versions of Postgresql need explicit return type 2021-02-27 09:46:42 +01:00
Sarah Hoffmann
dd03aeb966 bdd: use python library where possible
Replace calls to PHP scripts with direct calls into the
nominatim Python library where possible. This speed up
tests quite a bit.
2021-02-26 16:14:29 +01:00
Sarah Hoffmann
15b5906790 move setup function to python
There are still back-calls to PHP for some of the sub-steps.
These needs some larger refactoring to be moved to Python.
2021-02-26 15:02:39 +01:00
Sarah Hoffmann
3c186f8030 add a function for the intial indexing run
Also moves postcodes to fully parallel indexing.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
c7fd0a7af4 port wikipedia importance functions to python 2021-02-25 18:42:54 +01:00
Sarah Hoffmann
32683f73c7 move import-data option to native python
This adds a new dependecy to the Python psutil package.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
7222235579 introduce custom object for cmdline arguments
Allows to define special functions over the arguments.

Also splits CLI tests in two files as they have become too many.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
f6e894a53a port database setup function to python
Hide the former PHP functions in a transition command until
they are removed.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
b93ec2522e use psql for executing sql files
This allows to run larger files without needing to keep
them in memory.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
af7226393a add function to set up libpq environment
Instead of parsing the DSN for each external libpq program we
are going to execute, provide a function that feeds them all
necessary parameters through the environment.

osm2pgsql is the first user.
2021-02-25 18:42:54 +01:00
Sarah Hoffmann
e520613362 convert connect() into a context manager 2021-02-25 18:42:54 +01:00
Sarah Hoffmann
204fe20b4b
Merge pull request #2185 from lonvia/fix-deadlock-handling-for-psycopg27
Improve deadlock detection for various versions of psycopg2
2021-02-25 18:39:40 +01:00
Sarah Hoffmann
a1f0fc1a10 improve deadlock detection for various versions of psycopg2
Psycopg2 has changed the kind of exception that is emitted on
deadlocks between versions 2.7 and 2.8. The code was already
trying to catch both kind of errors but because the
psycopg2.errors package is unknown in 2.7 and below, the
code would throw an exception on anything but a deadlock error.

This commit wraps the deadlock handling into a context manager
to avoid code duplication and uses module imports to detect if
the new error codes are available.

Also sets the required psycopg2 version to 2.7 or bigger as
versions below are difficult to test.
2021-02-25 18:11:16 +01:00
Sarah Hoffmann
5b7483ada5 return 404 for details when no bject is found in database
Fixes #2157.
2021-02-22 16:28:29 +01:00
Sarah Hoffmann
72b01148d2
Merge pull request #2181 from lonvia/port-more-tool-functions-to-python
Port more tool functions to python
2021-02-22 16:11:21 +01:00
Sarah Hoffmann
f08078ccca bdd tests: directly call python code for setup-website 2021-02-19 18:20:55 +01:00
Sarah Hoffmann
389138abfe port setup-website to python 2021-02-19 17:51:06 +01:00
Sarah Hoffmann
a0ae4945cd add unit tests for new check_database code 2021-02-18 20:36:11 +01:00
Sarah Hoffmann
b169e4c88c port check-database function to python
This change also adapts the hints to use the nominatim tool.
Slightly changed checks, so that they are just as effective on
a frozen database.
2021-02-18 17:32:30 +01:00
Sarah Hoffmann
a60c34bded use a frozen DB for API tests
This way we also test that dropping does the right thing.
2021-02-17 22:35:27 +01:00
Sarah Hoffmann
153dbb71b8 remove unused code 2021-02-17 22:25:23 +01:00
Sarah Hoffmann
101a1f895d port freeze function to python 2021-02-17 21:43:15 +01:00
Sarah Hoffmann
7ebcf602ac add simple test for result splitting with multiple ranks 2021-02-16 17:59:12 +01:00
Sarah Hoffmann
fbe7be760b ignore failure to get replication date 2021-02-14 12:17:30 +01:00
Sarah Hoffmann
7cc4c53adb always return 0 for updates unless there is an error
This is more in line with previous behavioru than returning
a status code when no updates are available.
2021-02-11 10:33:49 +01:00
Sarah Hoffmann
0e0e9a6809 need test database for analysing cli test 2021-02-10 16:19:51 +01:00
Sarah Hoffmann
c60a0784ea adapt unit tests to new directory structure 2021-02-09 20:13:00 +01:00
Sarah Hoffmann
3cb6f3e460 use DataDir constant for data only
So far the data directory constant has pointed to the source
directory to be usable with different subdirectories. Now only
the data subdirectory itself is being used with the constant,
so point to the directory directly.
2021-02-09 20:04:08 +01:00
Sarah Hoffmann
8ffd7d9243 remove unused BINDIR constant 2021-02-09 19:30:31 +01:00
Sarah Hoffmann
298ed11261 introduce constant for configuration directory
This replaces {data_dir}/settings throughout the code, so that
the configuration may be placed somewhere else in the directory
structure (e.g. in /etc).
2021-02-09 18:45:45 +01:00
Sarah Hoffmann
b9517c99ae rename sql directory to lib-sql
Also introduces a separate constant for the sql directory, so that
it can be put separately from the rest of the data if required.
2021-02-09 15:26:56 +01:00
Sarah Hoffmann
db3ced17bb rename lib to lib-php 2021-02-09 11:52:07 +01:00
Sarah Hoffmann
d81e152804 integrate analyse of indexing into nominatim tool 2021-02-08 22:22:49 +01:00
Sarah Hoffmann
0cbf98c020 consolidate warm and db-check into single admin command 2021-02-08 21:05:06 +01:00
Sarah Hoffmann
195f9f5ef3 split cli.py by subcommands
Reduces file size below 1000 lines.
2021-02-08 17:23:05 +01:00
Sarah Hoffmann
0b2abfb115 replace make serve with nominatim serve command
With the website directory now tied to the project directory instead
of the build directory, it is no longer possible to use make for
running the web server.
2021-02-03 16:34:31 +01:00
Sarah Hoffmann
cb06d1f4ca do not overwrite custom set module paths
Given that the module is now copied to the project directory
when no module path is set, we need the information that the
module path is empty. Therefore hand in the default module path
in a separate variable.
2021-02-02 18:31:25 +01:00
Sarah Hoffmann
5f63d4ca1f print nice summary after updates 2021-02-01 10:34:31 +01:00
Sarah Hoffmann
e629a175ed introduce custom UsageError
This is a exception to be thrown when the error occures because
of bad user data. We don't want to print a full stack trace in
these cases but just tell the user what went wrong.
2021-01-30 16:20:10 +01:00
Sarah Hoffmann
4cb6dc01f3 port replication update function to python 2021-01-30 15:50:34 +01:00
Sarah Hoffmann
8f0885f6cb port check-for-update function to python 2021-01-28 14:50:14 +01:00
Sarah Hoffmann
d78f0ba804 port replication initialisation to Python 2021-01-26 22:50:54 +01:00
Sarah Hoffmann
5b46fcad8e convert functon creation to python
The new functions always creates normal and partitioned functions.
Also adds specialised connection and cursor classes for adding
frequently used helper functions.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
94fa7162be port address level computation to Python
Also adds simple tests for correct table creation.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
e6c2842b66 move update code for postcode and word count to Python
Adds also tests for the new function to execute a SQL script.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
e6d9485c4a cli: import python modules for commands on demand
Given that only one command will be executed in the end, it is
not necessary to import what amounts to the whole library. This
becomes in particular important for update functions that have
a dependency on pyosmium. The dependency can remain optional for
people not using updates.
2021-01-26 22:50:54 +01:00
Sarah Hoffmann
063a4cb403 cli indexer tests need a fake database
The Indexer constructor opens a connection to the given database.
2021-01-20 21:30:27 +01:00
Sarah Hoffmann
42ec67f63c add more tests for CLI parameter parser 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
8c02786820 add tests for indexer 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
c26f323bf5 add simple tests for CLI parsing 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
bfa6580ad5 use pytest mocking functions for manipulating os.environ 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
52b76d1d01 add tests for Python exec_utils 2021-01-20 21:30:27 +01:00
Sarah Hoffmann
504922ffbe remove old nominatim.py in favour of 'nominatim index'
The PHP scripts need to know the position of the nominatim
tool in order to call it. This is handed in as environment
variable, so it can be set by the Python script.
2021-01-18 15:43:27 +01:00
Sarah Hoffmann
b79c79fa73 add function to get a DSN for psycopg
Converts the PHP DSN syntax into psycopg syntax when necessary.
2021-01-18 15:43:27 +01:00
Sarah Hoffmann
340e7f7210 bdd: complete coverage for API tests
Also removes some functions that are no longer used and
fixes debug output where the tests found an issue.
2021-01-17 16:12:06 +01:00
Sarah Hoffmann
f9c43137c9 remove unused output formatting functions 2021-01-16 17:39:49 +01:00
Sarah Hoffmann
171ed36e36 bdd: remove duplicated tests 2021-01-16 16:57:28 +01:00
Sarah Hoffmann
c6c907d451 bdd: clean up and extend API tests for details
- remove duplicates created by replacing HTML tests
  with JSON tests
- add tests for newer functions for returning geometries
  and hierarchies
2021-01-16 12:04:13 +01:00
Sarah Hoffmann
19ab038724 collect coverage for /website directory as well 2021-01-15 20:27:14 +01:00
Sarah Hoffmann
eb3b789855 add initial pytest test for Configuration 2021-01-15 14:42:03 +01:00
Sarah Hoffmann
2f73bb3643 bdd: directly call utility scripts in lib
This removes the dependency on php-symfony-dotenv for the tests.
2021-01-14 18:19:22 +01:00
Sarah Hoffmann
0495dbe756 bdd: add new API test data
Make all data necessary for API tests directly available in the
repository.
2021-01-09 17:01:33 +01:00
Sarah Hoffmann
5d656891ba bdd: convert API tests to smaller test db
Changes BDD API tests to restrict themselves to
Liechtenstein. One test moved to DB as no appropriate
data is available.
2021-01-09 16:59:46 +01:00
Sarah Hoffmann
74122dc965 bdd: improve assert output for API query checks
Adds wrapper function for checking address parts and
more explanation strings to asserts.
2021-01-09 16:58:37 +01:00
Sarah Hoffmann
ee18a511c6 bdd: import API test DB as part of step setup
In the future, the BDD tests will simply set up the required
test database themselves. Like with the template database, it
is not reimported when it already exists unless that is explicitly
forced.

Makes most of the API tests currently fail because they still
point to old test data.
2021-01-07 11:51:38 +01:00
Sarah Hoffmann
da20881096
Merge pull request #2129 from lonvia/cleanup-bdd-tests
Clean up Python support code for BDD tests
2021-01-07 09:10:40 +01:00
Sarah Hoffmann
49142eb6e5 use relative dir for sources for phpunit 2021-01-07 08:55:15 +01:00
Sarah Hoffmann
73cbb6eb9a bdd: clean up DB ops steps
Adds comments and modernizes code.
2021-01-06 16:37:32 +01:00
Sarah Hoffmann
1f29475fa5 bdd: move column comparison in separate file
Introduces a new class DBRow that encapsulates the comparison
functions. This also is responsible for formatting more informative
assert messages. place and placex steps are unified.
2021-01-06 12:28:09 +01:00
Sarah Hoffmann
d586b95ff1 bdd: move nominitim id reader to separate file 2021-01-05 16:00:48 +01:00
Sarah Hoffmann
25557e5f14 bdd: factor out reindexing on updates 2021-01-05 15:17:46 +01:00
Sarah Hoffmann
197870e67a bdd: move place table inserter into separate file
Also simplifies usage by implementing a function that inserts
a complete table row.
2021-01-05 12:12:59 +01:00
Sarah Hoffmann
b8e39d2dde bdd: move scene setup to OSM data steps
The step has nothing to do with the database.
2021-01-05 11:42:28 +01:00
Sarah Hoffmann
5dfa76a610 bdd: switch to auto commit mode
Put the connection to the test database into auto-commit mode
and get rid of the explicit commits. Also use cursors always in
context managers and unify the two implementations that copy
data from the place table.
2021-01-05 11:42:28 +01:00
Sarah Hoffmann
58c471c627 bdd: remove class for lazy formatting
assert in combination with format() does the right thing and calls
the __str__() method only when an assertion hits.
2021-01-05 10:39:44 +01:00
Sarah Hoffmann
213bf7d19d bdd: rename db_ops steps
Now all files implementing steps are called steps_*.py.
2021-01-05 10:20:00 +01:00
Sarah Hoffmann
12ae8a4ed3 bdd: move output format computation into response 2021-01-05 10:17:59 +01:00
Sarah Hoffmann
8a93f8ed94 bdd: move Response classes in own file and simplify
Removes most of the duplicated parse functions, introduces
a common assert_field function with a more expressive error
message.
2021-01-05 10:03:47 +01:00
Sarah Hoffmann
2712c5f90e bdd: rename and clean up osm_data steps
Move common OPL creation code into a function and remove
unused imports.
2021-01-04 20:17:17 +01:00
Sarah Hoffmann
72587b08fa bdd: move external process execution in separate func 2021-01-04 19:58:59 +01:00
Sarah Hoffmann
faa85ded50 bdd: move NominatimEnvironment into separate file
Also cleans up and modernizes the code and adds documentation.
2021-01-04 17:54:51 +01:00
Sarah Hoffmann
14e5bc7a17 bdd: move grid generation code into geometry factory 2021-01-04 17:04:47 +01:00
Sarah Hoffmann
f727620859 bdd: move geoemtry creation into separate file
Also renames the OsmDataFactory in the more appropriate
GeometryFactory and modernizes code for python3.
2021-01-04 16:34:40 +01:00
Sarah Hoffmann
843d3a137c remove stale code for python2 2021-01-04 14:14:34 +01:00
Sarah Hoffmann
4aba70caee create a temporary project dir for tests
The project directory contains the website script as
configured through the test configuration. This means
that tests are now completely independet of any
configuration that may be contained in the build
directory.

Also removes the hack to inject additional settings via
a environment variable.
2021-01-04 11:39:45 +01:00
Sarah Hoffmann
4ca7197826 replace nose assertions with simple asserts 2021-01-03 17:21:24 +01:00
Sarah Hoffmann
33b038ce6f tests: always create the config file
There is also one database test that uses the API functions.
2020-12-19 17:55:46 +01:00
Sarah Hoffmann
f62c65e9d9 adapt php tests to new directory constants 2020-12-19 14:33:04 +01:00
Sarah Hoffmann
d97aed8741 adapt tests to new dotenv environment
DB tests now can simply set the environment to change configuration
variables. API tests still rely on a configuration file.

Also, query.php needs to set up the CONST_* variables to work with
the query scripts. That is a tiny bit messy and duplicates code
but this part will need to be reworked later.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
b5480f6e36 reorganise path settings in config
CONST_BasePath is split into separate configuration variables
for binaries, libraries and data. These variables as well as
the installation path are now set in the executable directly and
no longer configurable via project settings.

This is the first step towards an installable software. The
executables should know per installation where to find their
necessary data to execute. Project configuration needs to be
restricted to settings that really concern the specific Nominatim
installation.
2020-12-19 14:33:04 +01:00
Sarah Hoffmann
b59d01fe85 update country names
Copies all name:xx country names that are in OSM as of today
into the country name fallback table.
2020-12-09 17:52:25 +01:00
Sarah Hoffmann
65d8770b28 update country_names from OSM data
Update names in the coutry_names table on the fly from incomming
OSM country data. Adding a small sanity check that the country
must be an OSM relation and within the area where we expect the
country to be.
2020-12-09 11:38:19 +01:00
Sarah Hoffmann
987d60ccda place nodes can only be linked once against boundaries
If a place node is already linked against a boundary, it should not
be used for linking again. It is usually a sign of a mapping error,
when there are multiple boundary candidates. This change just avoids
inconsistent data in the database, it does not guarantee that the
linking is against the more correct boundary.
2020-12-02 15:31:02 +01:00
Sarah Hoffmann
63544db8f9 null entries need to be typed 2020-12-01 14:54:42 +01:00
Sarah Hoffmann
7295cad715 compute address parts for rank 30 objects on the fly
Rank 30 objects usually use the address parts of their parent.
When the parent has address parts that are areas but not marked
as isaddress, then the parent might go through multiple administrative
areas. In that case recheck if the right area has been choosen
for the object in question instead of relying on isaddress.
Note that we really only have to do the recomputation in the
case of 'isarea = True and isaddress = False' which hopefully
keeps the number of additional geometric operations we have to do
to a minimum.

There is one more special case to be taken into account here: a
street may go through two administrative areas and a house along
that street is placed in one of the area while the addr:* tags
says it belongs to the other. In that case we must not switch
the isaddress to the one it is situated. To avoid that recheck
the address names against the name of the ara. That is not perfect
but should cover most cases.

Fixes #328.
2020-12-01 11:58:25 +01:00
Sarah Hoffmann
c5d98effc0
Merge pull request #2074 from lonvia/add-housenumber-to-unknown-places
Improve finding addresses that have their own search_name entry because of unknown addr:* parts
2020-11-25 16:57:09 +01:00
Sarah Hoffmann
0f87da017f improve handling of multi-word partials in SearchDescription
Multi-word partial terms had an undue advantage over separate partial
terms because they only need to pay the penalty once. This changes
the behaviour by setting the penalty according to the number of
words in the token. This should get rid of search interpretations
with low chance of matching.

This also fixes handling of exact term matching. We now match against
all exact terms of the query, not just a couple of them collected
while building the interpretations.

Also adds a penalty to very short postcodes.
2020-11-25 12:07:04 +01:00
Sarah Hoffmann
22800d7d59 Search housenumbers with unknown address parts by housenumber term
House numbers need special handling because they may appear after
the street term. That means we canot just use them as the main name
for searches where the address has its own search term entries.
Doing this right now, we are able to find '40, Main St, Town' but not
'Main St 40, Town'.

This switches to using the housenumber token as the name term instead.
House number tokens can get special handling when building the search
query that covers the case where they come after the street.

The main disadvantage is that this once more increases the numbers
of possible search interpretation of which we have already too many.

no penalty for housenumber searches
2020-11-25 11:36:10 +01:00
Sarah Hoffmann
b4b50eef15 search rank 30 must always go with address rank 30 2020-11-24 17:57:28 +01:00
Sarah Hoffmann
49083c2597
Merge pull request #2058 from lonvia/split-address-words
Split addr:* tags into words before adding to the search index
2020-11-18 08:58:17 +01:00
Sarah Hoffmann
ffb2c93ba3 POIs with unknown addr:place must add parent name to address
The previous behaviour was a left-over from a former version
where such POIs parented to the street. Now that they parent to
places, it should be included.
2020-11-17 19:44:43 +01:00
Sarah Hoffmann
30a6b6bdac split addr: tags into words before adding to the search index
Address parts are only matched by single partial words. If
the addr: names are not split, then multi-word names cannot
be found.
2020-11-17 18:03:33 +01:00
Sarah Hoffmann
9ede048769 disallow linking for postcode areas 2020-11-17 10:53:26 +01:00
Sarah Hoffmann
885dc0a8e1 more tests for absense of additional addressline entries 2020-11-16 15:28:01 +01:00
Sarah Hoffmann
7324431b12 get additional addresses for rank 30 objects
get_addressdata() now also checks if the place itself has entries
in the place_addressline table and merges them into the results.

Also restrict checking for address tag places to cases where the
name cannot be found in the parent's address search terms. Looking
up all address tags is just too slow.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
021f2bef4c get address terms from address tags for rank 30
For rank 30 objects add extra elements into the place_addressline
table.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
6260fef2e8 add test for placex from addr tags 2020-11-16 15:28:01 +01:00
Sarah Hoffmann
c7472662a6 lookup places for address tags for rank < 30
While previously the content of addr:* tags was only added
to the list of address search keywords, we now really look up
the matching place. This has the advantage that we pull in all
potential translations from the place, just like all the other
address terms that are looked up by neighbourhood search.

If no place can be found for a given name, the content of the
addr:* tag is still added to the search keywords as before.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
928c6245c9
Merge pull request #2038 from lonvia/addresses-for-large-areas
Improve addresses for large areas
2020-11-03 08:49:01 +01:00
Sarah Hoffmann
33378dcf6e remove tests for icon attribute
The icon attribute requires the CONST_MapIcon_URL to be present
which we cannot guarantee for the tests.
2020-11-02 16:46:29 +01:00
Sarah Hoffmann
b2ebf4b4b7 adapt tests to rank changes of natural 2020-11-02 11:42:10 +01:00
Sarah Hoffmann
d86cf6801f remove tests for HTML output 2020-10-29 11:13:32 +01:00
Sarah Hoffmann
95f83b90d2 minor fixes for geometry compuation during boundary ranking
Go back to using centroid when determining if one admin level
is within another. There are cases where boundaries are slightly
misaligned due to mapping errors (not using the same ways in the
relations).

Only declare boundaries the same when they have the same wikidata
tag _and_ have exactly the same geometry. This works around tagging
errors with the wikidata tag, which happen because of automated
edits to the wikidata tag.
2020-10-28 10:49:26 +01:00
Sarah Hoffmann
7a16909219 detect and remove admin boundary duplicates
The Polish community maps admin boundaries that span multiple
levels by duplicating the boundary relations. Detect this situation
by looking out for matching wikidata tags. The higher ranked
duplicates are then thrown out from the address pool by setting
their address rank to 0.
2020-10-28 10:49:26 +01:00
Sarah Hoffmann
b0ef84caae add tests for rank computation 2020-10-17 17:51:22 +02:00
Sarah Hoffmann
64899ef54b add tests for address computation 2020-10-16 11:07:17 +02:00
Sarah Hoffmann
ca680fc9fc make housenumber interpolation tests more lenient 2020-10-11 12:04:53 +02:00
Sarah Hoffmann
a40684162a Revert "adapt tests to rank_search removal"
This reverts commit 2a717da850.
2020-10-06 13:59:50 +02:00
Sarah Hoffmann
2a717da850 adapt tests to rank_search removal 2020-09-26 09:10:37 +02:00
Sarah Hoffmann
c84e7e72f1 add unknown addr:place to address output
When a POI has no addr:street but an addr:place that is not
contained in the name list of the parent place, then remember
this situation and merge the content of addr:place into the
address output.

We don't need to care about translations in this case because
it is obvious that no object with translations exists if the
parent isn't the object named in addr:place.
2020-09-23 11:55:18 +02:00
Sarah Hoffmann
248d6b413a remove test for is_in 2020-09-22 21:36:49 +02:00
Sarah Hoffmann
a8dfbcef44 always bind addr:place to place instead of street
If an addr:place is given but no addr:street tag, then bind
the rank 30 object always to a <=25 object, even when there
is none found with the same name.
2020-09-21 10:15:14 +02:00
Sarah Hoffmann
caea14d035 merge addr tags into search_name table
When a place of rank 30 has addr tags that are not covered by the
search terms of the parent, add a separate entry for the POI in
the search_name table that includes the addr tags. We can only
do that with named places. For POIs without a name the housenumber
is used as name. If that is not available either, searching still
won't work.
2020-09-21 10:15:14 +02:00
Sarah Hoffmann
b219374d36 remove special casing for rank 25 postcodes
They can be computed like any other place.
2020-09-18 16:18:02 +02:00
Sarah Hoffmann
4c9cfe2532 remove postcodes entirely from indexing
place=postcode places are artificial places that collect addr:postcode
points for aggration. They should neither show up in the address nor
be searchable. That means that there is no need to index them at all.
Only let boundary=postal_code through which define correct areas for
postcodes.
2020-09-18 15:09:35 +02:00
Sarah Hoffmann
fe250d3ee8
Merge pull request #1961 from lonvia/set-place-type-for-result-in-address
Use place type of for result object in address parts
2020-09-17 21:23:40 +02:00
Sarah Hoffmann
6f55c67d16
Merge pull request #1960 from lonvia/fix-postcodes-duplicated-by-normalization
Make sure that all postcodes have an entry in the word table
2020-09-17 21:23:23 +02:00
Sarah Hoffmann
fe8566928e use place type of for result object in address parts
Boundaries shound derive the address part type from the
linked place if possible. This was already implemented
for the address objects but not for the address information
from the address itself.

Fixes #1949.
2020-09-17 18:17:01 +02:00
Sarah Hoffmann
3600709116 make sure that all postcodes have an entry in word
It may happen that two different postcodes normalize to exactly
the same token. In that case we still need two different entries
in the word table. Token lookup will then make sure that the correct
one is choosen.

Fixes #1953.
2020-09-17 17:11:22 +02:00
Sarah Hoffmann
2b11a47a2f restructure developer's manual
Add a section on setting up the development environment which now
also includes the former chapter on recreating the documentation.
Move the README from test/ into the manual as the new Testing
chapter.
2020-09-17 09:54:46 +02:00
Sarah Hoffmann
b6078de6f8 adapt tests to ranking changes 2020-09-01 18:03:17 +02:00
Sarah Hoffmann
6e4b7eb966 do not block deletion of large highway areas
Deletion of areas should only e blocked for addressable features.
Streets and POIs do not have a large impact on updates.
2020-08-28 09:49:21 +02:00
Sarah Hoffmann
be6ecc388c add support for place=square
Squares are now addressable (on address level 25) and thus can
be attached to a house number via addr:place. Needed to increase
the rank range for matching up addr:place to 25.
2020-08-26 12:12:52 +02:00
Sarah Hoffmann
d730e179bf tests: use larger grid to avoid rouding errors 2020-08-22 16:04:24 +02:00
Sarah Hoffmann
d6ff7475f1 make sure that addr:* tags can always be searched for
Always add contents of addr:* tags into address part of the search
table, even when there is no corresponding other name. This keeps
search tolerant to the kind of tagging where parts show up in the
address that have no corresponding object in the database or where
it is only an unaddressable object.
2020-08-19 11:44:10 +02:00
Sarah Hoffmann
e21a707166 remove linked_place from extratags when updating
Before updating an admin boundary we need to make sure that any
artificially generated 'linked_place' entry is removed from the
extratags column. This ensures that the place designation does
not linger when a linked place disappears and that it is updated
when the linking changes.
2020-08-13 16:59:11 +02:00
Sarah Hoffmann
06aa0f0b76 use address rank for address forming when available 2020-08-12 22:22:24 +02:00
Sarah Hoffmann
fb8bb30144 boundary address ranks must not go above 25
Fixes #1914.
2020-08-12 22:22:24 +02:00
Sarah Hoffmann
7429a33818 add simple tests for address rank computation 2020-08-12 22:22:24 +02:00
Sarah Hoffmann
f29dc7d7ac
Merge pull request #1865 from mtmail/how-to-import-test-db
test/README.md - more instructions how to import test db
2020-08-04 14:31:19 +02:00
Sarah Hoffmann
1347abb1e7 be more strict what areas make up an address
Exclude boundaries that touch a line in only one point and
that touch areas only along the boundary.

Fixes #1900.
2020-08-04 12:08:50 +02:00
Sarah Hoffmann
2cb85e48b4 adapt test results to new ranking 2020-08-03 16:57:22 +02:00
marc tobias
01b009ff24 test/README.md - more instructions how to import test db 2020-07-31 16:50:27 +02:00
Sarah Hoffmann
9a204f6284 test: make road really cross the boundary 2020-07-26 15:57:07 +02:00
Sarah Hoffmann
6e4ee160ee adapt tests to new search ranks 2020-06-17 10:53:11 +02:00
Sarah Hoffmann
8218da27b3 adapt tests to new ranks 2020-05-23 19:40:41 +02:00
Sarah Hoffmann
aa4bd00631 Adapt boundary labels for Sweden and Norway
This also gives us the correct labels for address output in
json and xml.
2020-05-23 16:19:27 +02:00
Sarah Hoffmann
cadbdaff18 fix style 2020-05-18 22:20:36 +02:00
Sarah Hoffmann
57510f517a adapt tests to modified address types 2020-05-17 16:53:33 +02:00
Sarah Hoffmann
528fe6553f adapt php tests
Also fixes some errors found by the tests.
2020-05-17 16:46:45 +02:00
Simon Will
14dba39157 Use assertEqualsWithDelta for float comparisons
PHPUnit 7.3 introduced the functions assertEqualsWithDelta for comparing
floats with a delta. The old four-argument version of assertEquals is
deprecated in PHPUnit 8 and removed in PHPUnit 9.

This commit means that the tests will fail with PHPUnit < 7.3 because
assertEqualsWithDelta is not defined there.
2020-05-05 23:43:09 +02:00
Simon Will
43fd2a7423 Declare return type of testcase setUp method
PHPUnit 7 changed the signature of the TestCase methods to include the
return type.
2020-05-05 23:40:18 +02:00
Sarah Hoffmann
65ee7a8002
Merge pull request #1754 from mtmail/nominatim-db-tests-against-postgres
Nominatim::DB tests against separate postgresql database
2020-04-26 10:20:30 +02:00
marc tobias
a5d0657d9b lonvia PR feedback 2020-04-26 03:33:15 +02:00
Sarah Hoffmann
0b0349f746
Merge pull request #1752 from mtmail/new-oo-shell-class
new PHP Nominatim\Shell class to wrap shell escaping
2020-04-25 16:48:04 +02:00
Sarah Hoffmann
207efe700f highway:construction should appear as 'road' in the address list
Fixes #1763.
2020-04-22 09:08:33 +02:00
marc tobias
38c21de0ee Nominatim::DB tests against separate postgresql database 2020-04-13 18:01:37 +02:00
marc tobias
fc40939775 new PHP Nominatim\Shell class to wrap shell escaping 2020-04-12 03:50:40 +02:00
Sarah Hoffmann
ef47515420 make admin levels 3 and 7 distinct ones in addresses
There really is no need to conflate these two levels as they
are in use in various countries.

Also adds province as a distinct place.

Fixes #1736.
2020-04-10 22:58:11 +02:00
Sarah Hoffmann
98be5bf637 adapt tests to geocodejson format adaptions 2020-04-08 11:19:43 +02:00
Rahul
eb2d816f2a Added test cases for whitespaces in LatLon 2020-04-04 00:53:40 +05:30
Sarah Hoffmann
0d189ac5df
Merge pull request #1733 from krahulreddy/whitespaces-considered-as-single-space
Support whitespace characters(x09-x0d) as single space
2020-04-03 18:01:47 +02:00
K Rahul Reddy
7aa2df5389 Support whitespace characters(x09-x0d) as single space 2020-04-02 05:04:40 +05:30
Sarah Hoffmann
975ef0b305 re-add district to geocodejson 2020-04-01 21:24:42 +02:00
Sarah Hoffmann
8150c3602b add tests for geocodejson address fields 2020-04-01 11:14:48 +02:00
Sarah Hoffmann
19948c378a adapt tests to new borough ranking 2020-03-30 23:04:20 +02:00
marc tobias
7a94872413 remove polygon=1 (polypoints) feature 2020-03-29 21:58:11 +02:00
Sarah Hoffmann
d56c69dd01 adapt API tests to place linkage changes
The missing district is due to a data error for wikidata tags.
2020-03-25 11:38:31 +01:00
Sarah Hoffmann
78526a33b4 Remove linkees from search_name
Fixes #722
2020-03-04 11:36:39 +01:00
Sarah Hoffmann
6d431aebb7 linked centroids must always be within geometry
When using a linked place as centroid for a boundary, check
first that it is really within the area. If it is outside,
just keep the computed centroid because a centroid outside the
area just causes havok.

Fixes #1352.
2020-03-04 09:59:57 +01:00
Sarah Hoffmann
acd8ca2ebd add testing for rank adaption while linking 2020-02-28 15:22:48 +01:00
Sarah Hoffmann
06fdfad89e link against place nodes by place type
If a boundary relation has no label member preferably
link against a place node with the same place type.

Also inherit the rank_address from the place node (only
has an effect when linking via lable member or place type).
2020-02-28 15:22:48 +01:00
Sarah Hoffmann
00ca493f33 move linked place type into linked_place extratags
Using linked_place means that we don't overwrite any
place tags on the boundary. This is important when we
wanto to use the information for linking.
2020-02-28 15:22:48 +01:00
Sarah Hoffmann
5220a92be4 adapt API tests 2020-02-22 16:46:03 +01:00
marc tobias
7fd9d0eeef unit tests for ParameterParser::hasSetAny 2020-02-19 16:55:17 +01:00
Sarah Hoffmann
6073d948e6 fix duplicate keys in tests
The tests suddenly failed because the unique key constraint
is more strict and does no longer include the type.
2020-02-12 11:29:33 +01:00
marc tobias
932ac23f18 document how to extract subset of TIGER data needed for API tests 2020-02-11 18:50:27 +01:00
Sarah Hoffmann
3a3f9b3496 fix formatting 2020-02-09 16:57:55 +01:00
Sarah Hoffmann
c36fd72f99 use detailsPermaLink function on main website as well 2020-02-09 16:05:22 +01:00
Sarah Hoffmann
57ae3d03a1 return place_id link to details when not an OSM object
Stop-gap solution to find the right object for Tiger and
interpolation objects.
2020-02-09 15:45:38 +01:00
Sarah Hoffmann
4856f56d61 adapt test to change in hamlet classification 2020-01-23 22:26:47 +01:00
Sarah Hoffmann
d732dc3bb2 update place address levels
Adds province and allotments and downgrades hamlet.
2020-01-08 23:53:03 +01:00
Sarah Hoffmann
20d541af06 remove osm2pgsql tag tests
These tests are now part of the osm2pgsql test suite.
2020-01-04 16:23:29 +01:00
Sarah Hoffmann
f8bd4f5133 add test for finding housenumber 0 2019-12-01 20:36:59 +01:00
Sarah Hoffmann
bfe92ea191 bdd tests: enforce use of full import style 2019-12-01 16:25:39 +01:00
Sarah Hoffmann
9fed91a47f adapt tests for new wikipedia tables 2019-11-20 09:57:40 +01:00
Sarah Hoffmann
e0de838b13 adapt tests to short_name demotion 2019-10-28 22:53:41 +01:00
marc tobias
3af1520461 lookup endpoint returns boundingbox 2019-08-05 23:32:46 +02:00
Sarah Hoffmann
2bbe5017d4 use bbox of geometry when searching for attached streets
As we are doing a distance search, this improves results for
large places like airports.

Fixes #1442.
2019-07-28 13:28:27 +02:00
marc tobias
1560685020 lookup endpoint supports jsonv2 and geocodejson output now 2019-07-21 23:20:48 +02:00
Sarah Hoffmann
4c1793b4e3 recreate interpolations when one of their support nodes changes
A simple update is not enough because the interpolation splits
might change as well as the housenumbers.

Fixes #1360.
2019-07-03 23:15:54 +02:00
Sarah Hoffmann
cdc7d0fe0e remove visibility modifier from constants again
Only supported on PHP >= 7.1.
2019-07-02 23:24:49 +02:00
Sarah Hoffmann
e164d53fcc adapt tests to new place address ranks 2019-06-30 23:09:43 +02:00
Sarah Hoffmann
38a99856c0 Rework word set computation
Switch from an recursive algorithm for computing the word sets
to an iterative one that benefits from caching intermediate
results. This considerably reduces the amount of memory needed,
so that the depth restriction can be dropped. To ensure that
the number of word sets remains manageable, only sets up to
a certain length are accepted and only a certain number of
total word sets. If word sets need to be dropped, we drop
the ones with more words per word set first.

To further reduce the number of potential word sets, the valid
tokens are looked up first and then only word sets containing
valid tokens are computed.

Fixes #1403, #1404 and #654.
2019-06-29 18:22:31 +02:00
Sarah Hoffmann
2c21cbb5e6 update osm2pgsql (downgrading unnamed places)
Also adds tests for updating unnamed places.
2019-06-10 18:22:11 +02:00
Sarah Hoffmann
3bc4b4bf9f update osm2pgsql (import special tags) 2019-06-09 13:58:05 +02:00
Sarah Hoffmann
b612b99421
Merge pull request #1321 from mtmail/interpolating-0-housenumbers
Support housenumber=0 in interpolations
2019-04-19 18:29:43 +02:00
marc tobias
7d9dbd62c7 Support housenumber=0 in interpolations 2019-04-02 15:13:45 +02:00
marc tobias
c9a6350894 On postcode searches observe given bounded viewbox 2019-04-02 14:49:31 +02:00
Sarah Hoffmann
2a4198f94d add test for issue #1343
Keyword details for countries (which don't have address details).
2019-03-26 21:49:44 +01:00
marc tobias
890d415e1f Nominatim::DB support input variables, custom error messages 2019-03-10 16:56:36 +01:00
marc tobias
d4b633bfc5 replace database abstraction DB with PDO 2019-03-09 00:18:15 +01:00
Sarah Hoffmann
bdd64093e5
Merge pull request #1295 from mtmail/move-searchrank-labels-to-php
Remove get_addressrank_label. Move get_searchrank_label to PHP
2019-02-10 17:22:49 +01:00
marc tobias
3be797c759 BDD: support for DB_PORT environment variable 2019-02-09 20:54:18 +01:00
marc tobias
853b536394 Remove get_addressrank_label. Move get_searchrank_label to PHP 2019-02-09 20:38:36 +01:00
marc tobias
b56f7e8ad2 remove phpunit config key deprecated since version 3.5 2019-02-09 00:37:11 +01:00