Nominatim

mirror of https://github.com/osm-search/Nominatim.git synced 2024-11-26 13:27:52 +03:00

Author	SHA1	Message	Date
Sarah Hoffmann	612d34930b	handle postcodes properly on word table updates update_postcodes_from_db() needs to do the full postcode treatment in order to derive the correct word table entries.	2022-06-23 23:42:31 +02:00
Sarah Hoffmann	0f00f4968c	fix up BDD tests for postcode changes Includes smaller code fixes found by the tests.	2022-06-23 23:42:31 +02:00
Sarah Hoffmann	8080625747	remove postcodes from countries that don't have them The postcodes will only be removed as a 'computed postcode' they are still searchable for the given object.	2022-06-23 23:42:31 +02:00
Sarah Hoffmann	d8623d6818	bdd: remove support for scenes Only keep support for the special point geometry 'country:xx'.	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	6c58a4c46c	bdd: move query tests from scene to grid description	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	19f67e167c	bdd: remove step for scene setup	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	00d8df6fc3	bdd: move update tests from scenes to grid descriptions	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	02068aec7f	bdd: move import tests from scenes to grid descriptions	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	3493d317e4	bdd: clear log buffer after a successful import run	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	a2b486a5b0	bdd: allow to set an origin of the grid	2022-06-17 11:54:18 +02:00
Sarah Hoffmann	df0142678a	improve address ordering with mixes of place and admin areas Resolves a couple of situations where a mixed use of place areas and administrative boundaries would result in a hierarchy that did not properly respect the contains relation.	2022-06-16 10:44:16 +02:00
Sarah Hoffmann	15cf7dd416	add testcase for #2551 This test proves that places that are linked need to be reindexed.	2022-06-05 21:39:17 +02:00
Sarah Hoffmann	bd0e157b91	fix order when searching for addr:* components When matching addr:* components the preference was given to matches that do not intersect with the place.	2022-05-31 16:57:37 +02:00
Sarah Hoffmann	1d203fdb3c	fix bug with keeping linking on updates When moving the finding of linked places to the precomputation stage, it was also moved before the statement where the linked_place_id was removed from the linkee. The result was that the current linkee was excluded when looking for a linked place on updates because it was still linked to the boundary to be updated. Fixed by allowing to either keep the linkage or change to an unlinked place.	2022-05-23 10:55:10 +02:00
Sarah Hoffmann	f314abcfe1	bdd: restrict imports to four languages This mainly restricts the number of country names that are loaded.	2022-05-11 16:40:53 +02:00
Sarah Hoffmann	e74e577029	bdd: recreate functions on template DB Avoids calling function refresh on every scenario. The content won't change between runs.	2022-05-11 15:50:22 +02:00
Sarah Hoffmann	aa0ae610c6	avoid calling OSM servers during bdd tests	2022-05-11 15:33:01 +02:00
Sarah Hoffmann	adeebec32a	switch tests to ICU tokenizer as default	2022-05-10 14:54:50 +02:00
Sarah Hoffmann	372874e89a	accept any OSM type in street member of associatedStreet This is needed for pedestrian areas mapped as multipolygons and consequently as relations. The lookup in placex guarantees that the referenced OSM object is indeed a street. Fixes #2669.	2022-05-02 09:48:51 +02:00
Sarah Hoffmann	3c68b12176	keep inherited address parts after indexing The inherited housenumber is needed for display output. We can't take the one from the housenumber field because it is already normalized. Remove the inherited address only when reindexing. Fixes #2683.	2022-04-28 21:38:00 +02:00
Artem Ziablytskyi	d1479072ae	fix bdd tests and docs	2022-04-07 16:37:51 +02:00
Artem Ziablytskyi	9a56e53d50	use ISO3166-2-lvl<admin_level> instead of typeLabel prefix	2022-04-07 16:37:51 +02:00
Sarah Hoffmann	36a1560117	add migration to mark internal country names	2022-03-31 15:55:20 +02:00
Sarah Hoffmann	a0ed80d821	restore the tokenizer directory when missing Automatically repopulate the tokenizer/ directory with the PHP stub and the postgresql module, when the directory is missing. This allows switching working directories and in particular running the service from a different machine than where it was installed. Users still need to make sure that .env files are set up correctly or they will shoot themselves in the foot. See #2515.	2022-03-20 11:31:42 +01:00
Sarah Hoffmann	e133476c35	merge linked names correctly into namedetails Convert the '_place_' entries back to normal entries before returning them in the 'namedetails' section. If the name field is duplicated, keep the '_place_' notation. This preserves the previous behaviour before _place_ names were introduced but adds the additional names from the linked place for reference.	2022-03-17 11:02:02 +01:00
Sarah Hoffmann	524dc64ab7	make sure outputs take into account linked place names	2022-03-16 21:44:52 +01:00
Sarah Hoffmann	42cd021d04	save differing linked place names in extra fields This keeps the names traceable and ensures that all names are searchable when they differ. Do not keep names when they are exactly the same to save some space. Linked names are cleaned out before relinking.	2022-03-16 16:38:52 +01:00
Sarah Hoffmann	ef98a85b05	correctly handle single-point interpolations in reverse Lookup in location_property_osmline needs to be special cased for startnumber = endnumber. Also adds tests for the case. Fixes #2680.	2022-03-16 11:19:09 +01:00
Sarah Hoffmann	89e1446131	bdd: disable some housenumber tests for legacy Optional spaces in housenumbers are not supported by legacy tokenizer, so disable those tests.	2022-03-01 09:34:32 +01:00
Sarah Hoffmann	f03a05f6bb	add new analyser for housenumbers This analyser makes spaces optional.	2022-03-01 09:34:32 +01:00
Sarah Hoffmann	1d82569f6d	add tests for country updates	2022-02-24 16:18:49 +01:00
Sarah Hoffmann	f74228830d	bdd: run full import on tests This uncovered a couple of outdated/wrong tests which have been fixed, too.	2022-02-24 14:27:51 +01:00
Sarah Hoffmann	0e11ca9b76	add test that interpolations are found by odd/even	2022-02-10 11:23:51 +01:00
Sarah Hoffmann	a79a3210e6	implement is-a-name option for housenumbers	2022-02-07 09:27:11 +01:00
Sarah Hoffmann	64abc90d30	use new tiger step column for queries	2022-01-27 14:08:08 +01:00
Sarah Hoffmann	6b89624f33	adapt frontend to new interpolation table layout	2022-01-27 11:14:55 +01:00
Sarah Hoffmann	4b28b4fed4	adapt BDD tests for new interpolation style	2022-01-27 11:14:55 +01:00
Sarah Hoffmann	206ee87188	factor out housenumber splitting into sanitizer	2022-01-19 17:27:50 +01:00
Sarah Hoffmann	b453b0ea95	introduce mutation variants to generic token analyser Mutations are regular-expression-based replacements that are applied after variants have been computed. They are meant to be used for variations on character level. Add spelling variations for German umlauts.	2022-01-18 11:09:21 +01:00
Sarah Hoffmann	c3788d765e	add consistent SPDX copyright headers	2022-01-03 16:23:58 +01:00
Sarah Hoffmann	ab6f35d83a	Merge pull request #2553 from lonvia/revert-street-matching-to-full-names Revert street matching to full names	2021-12-14 15:52:34 +01:00
Sarah Hoffmann	f9b56a8581	correctly match abbreviated addr:street This only works when addr:street is abbreviated and the street name isn't. It does not work the other way around.	2021-12-08 21:58:43 +01:00
Sarah Hoffmann	04857d32cd	enable PHPUnit 9 for coverage A couple of functions have been renamed.	2021-12-07 12:07:17 +01:00
Sarah Hoffmann	5e435b41ba	ICU: matching any street name will do again	2021-12-06 14:26:08 +01:00
Sarah Hoffmann	80e0a3cce4	change default rank for highway objects to 30 The highway key is being used more and more for non-ways these days. This clashes with Nominatim's assumption that essentially everything that has a highway tag can be used as the street part of the address. Change the default rank of highway objects to 30 to avoid this. Only the known values for streets keep the rank 26 and are now listed explicitly.	2021-11-24 22:10:40 +01:00
Sarah Hoffmann	1722fc537f	bdd: add tests for non-latin scripts	2021-10-26 17:29:03 +02:00
Sarah Hoffmann	c0f347fc8c	adapt BDD tests to stricter partial search	2021-10-26 15:52:57 +02:00
Sarah Hoffmann	97a10ec218	apply variants by languages Adds a tagger for names by language so that the analyzer of that language is used. Thus variants are now only applied to names in the specific language and only tag name tags, no longer to reference-like tags.	2021-10-06 11:09:54 +02:00
Sarah Hoffmann	40f9d52ad8	Merge pull request #2454 from lonvia/sort-out-token-assignment-in-sql ICU tokenizer: switch match method to using partial terms	2021-09-28 09:45:15 +02:00
Sarah Hoffmann	bd7c7ddad0	icu tokenizer: switch to matching against partial names When matching address parts from addr:* tags against place names, the address names were so far converted to full names and then compared to the place names. This can become problematic with the new ICU tokenizer once we introduce creation of different variants depending on the place name context. It wouldn't be clear which variant to produce to get a match, so we would have to create all of them. To work around this issue, switch to using the partial terms for matching. This introduces a larger fuzziness between matches but that shouldn't be a problem because matching is always geographically restricted. The search terms created for address parts have a different problem: they are already created before we even know if they are going to be used. This can lead to spurious entries in the word table, which slows down searching. This problem can also be circumvented by using only partial terms for the search terms. In terms of searching that means that the address terms would not get the full-word boost, but given that the case where an address part does not exist as an OSM object should be the exception, this is likely acceptable.	2021-09-27 11:36:19 +02:00
Sarah Hoffmann	6d7c067461	force update on rank30 children when place name changes Name changes may have an effect on parenting. Don't update surrounding rank30 objects with addr:place tags as this is potentially too expensive.	2021-09-27 11:04:17 +02:00
Sarah Hoffmann	316205e455	force update of surrounding houses when street name changes When the street changes its name then this may cause changes in the parenting of rank-30 objects with an addr:street tag. Fixes #2242.	2021-09-27 10:22:41 +02:00
Sarah Hoffmann	56124546a6	fix dynamic assignment of address parts A boolean check for dynamic changes of address parts is not sufficient. The order of choice should be: 1. an addr:* part matches the name 2. the address part surrounds the object 3. the address part was declared as isaddress The implementation uses a slightly different ordering to avoid geometry checks unless strictly necessary (isaddress is false and no matching address). See #2446.	2021-09-19 12:34:39 +02:00
Sarah Hoffmann	28ee3d0949	move linking of places to the preparation stage Linked places may bring in extra names. These names need to be processed by the tokenizer. That means that the linking needs to be done before the data is handed to the tokenizer. Move finding the linked place into the preparation stage and update the name fields. Everything else is still done in the indexing stage.	2021-08-20 22:44:17 +02:00
Sarah Hoffmann	118858a55e	rename legacy_icu tokenizer to icu tokenizer The new icu tokenizer is now no longer compatible with the old legacy tokenizer in terms of data structures. Therefore there is also no longer a need to refer to the legacy tokenizer in the name.	2021-08-17 23:11:47 +02:00
Sarah Hoffmann	5f2b9e317a	add tests for US state hacks IL, AS and LA are replaced with the US state in Geocode because the old tokenizer would simply remove the abbreviations otherwise.	2021-08-17 10:49:07 +02:00
Sarah Hoffmann	1db098c05d	reinstate word column in icu word table Postgresql is very bad at creating statistics for jsonb columns. The result is that the query planner tends to use JIT for queries with a where over 'info' even when there is an index.	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	324b1b5575	bdd tests: do not query word table directly The BDD tests cannot make assumptions about the structure of the word table anymore because it depends on the tokenizer. Use more abstract descriptions instead that ask for specific kinds of tokens.	2021-07-28 11:31:47 +02:00
Sarah Hoffmann	f70930b1a0	make compound decomposition pure import feature Compound decomposition now creates a full name variant on import just like abbreviations. This simplifies query time normalization and opens a path for changing abbreviation and compound decomposition lists for an existing database.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	2e3c5d4c5b	adapt tests for ICU tokenizer	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	e7b4fc70e7	make sure old data gets deleted on place type change When changing from some other place type to place=postcode make sure that the old place type entry in the place table is deleted.	2021-06-18 10:58:41 +02:00
Sarah Hoffmann	457982e1d2	update postcode in place if it already exists	2021-06-18 00:28:52 +02:00
Sarah Hoffmann	fe11d3cbbd	do not return POIs when dropping house number in query We've previously added searching through rank 30 in a house number search to enable searches for house number+name. This had the unintended side effect that rank 30 objects are also returned in a search that dropped the house number from the query. This is wrong because POIs cannot function as a parent to a house number. This fix drops all rank 30 objects from the results for a house number search if they do not match the requested house number.	2021-06-17 14:21:20 +02:00
Sarah Hoffmann	3aac51c81f	switch BDD tests to always use search API	2021-06-06 15:27:52 +02:00
Sarah Hoffmann	4f4d15c28a	reorganize keyword creation for legacy tokenizer - only save partial words without internal spaces - consider comma and semicolon a separator of full words - consider parts before an opening bracket a full word (but not the part after the bracket) Fixes #244.	2021-05-24 10:41:42 +02:00
Sarah Hoffmann	00094c43d1	enable Tiger BDD API test for legacy_icu	2021-05-21 22:39:56 +02:00
AntoJvlt	3206bf59df	Resolve conflicts	2021-05-17 13:52:35 +02:00
AntoJvlt	fb0ebb5bf0	Add tests for the new SPWikiLoader and SPCsvLoader	2021-05-16 16:10:06 +02:00
Darkshredder	e5ffc59cd5	feat: Added reverse-only-search validation	2021-05-14 02:36:21 +05:30
Sarah Hoffmann	1ccd4360b4	correctly handle removing all postcodes for country	2021-05-13 14:15:42 +02:00
Sarah Hoffmann	b2c6eca2c8	add missing transliterations The ICU library only offers transliterations for a limited set of scripts. Add transliterations for missing scripts from the PostgreSQL module. This means that the same selection of scripts is supported as with the old module.	2021-05-05 21:16:55 +02:00
Sarah Hoffmann	a263e54b94	enable BDD tests for different tokenizers The tokenizer to be used can be chosen with -DTOKENIZER. Adapt all tests, so that they work with legacy_icu tokenizer. Move lookup in word table to a function in the tokenizer. Special phrases are temporarily imported from the wiki until we have an implementation that can import from file. TIGER tests do not work yet.	2021-05-05 10:31:51 +02:00
Sarah Hoffmann	be6262c6ce	move status test to tokenizer The availability of the module is now tested by the tokenizer.	2021-04-30 17:41:08 +02:00
Sarah Hoffmann	3eb4d88057	boilerplate for PHP code of tokenizer This adds an installation step for PHP code for the tokenizer. The PHP code is split in two parts. The updateable code is found in lib-php. The tokenizer installs an additional script in the project directory which then includes the code from lib-php and defines all settings that are static to the database. The website code then always includes the PHP from the project directory.	2021-04-30 11:31:52 +02:00
Sarah Hoffmann	e1c5673ac3	require tokenizer for indexer	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	9397bf54b8	introduce external processing in indexer Indexing is now split into three parts: first a preparation step that collects the necessary information from the database and returns it to Python. In a second step the data is transformed within Python as necessary and then returned to the database through the usual UPDATE which now not only sets the indexed_status but also other fields. The third step comprises the address computation which is still done inside the update trigger in the database. The second processing step doesn't do anything useful yet.	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	1fd483643b	add tests for different scripts	2021-04-26 23:01:06 +02:00
Sarah Hoffmann	788baafa26	bdd tests: fix place dependent ranking tests The ranks of places may differ for some countries. Force the place nodes in the test on null island which always uses the default ranking.	2021-04-22 17:31:00 +02:00
Sarah Hoffmann	79d55357e8	simplify sql and website creation functions	2021-04-19 10:53:30 +02:00
Sarah Hoffmann	16a66b5326	move transliteration of housenumbers into indexing Housenumbers are now saved in transliterated form in the housenumber column. This saves the transliteration step during lookup.	2021-04-04 15:26:47 +02:00
Sarah Hoffmann	3590e76a1c	tests for finding non-ascii housenumbers	2021-04-04 15:26:47 +02:00
Sarah Hoffmann	118befd7d7	bdd tests: make indexing less verbose Do not print progress info for indexing when there is an error in the BDD tests.	2021-03-20 10:39:29 +01:00
Sarah Hoffmann	0d9fe6e49c	Merge pull request #2219 from lonvia/bdd-test-remove-php BDD tests: run all setup via nominatim Python library	2021-03-17 11:40:34 +01:00
Sarah Hoffmann	ebae3553e0	bdd: run all setup via nominatim Python library Drops all calls to PHP utility functions. nominatim cli functions are used where possible, to stay as close to the final code as possible with the tests. By removing the PHP calls, the test code now only uses osm2pgsql and the database module from the build directory.	2021-03-16 22:20:41 +01:00
Sarah Hoffmann	4d7c5ec089	reverse: do not prefer interpolations over closer housenumbers Always look up the closest housenumber before looking up interpolations. This ensures that closer housenumbers are preferred over interpolations. Fixes #2214.	2021-03-15 10:50:04 +01:00
Sarah Hoffmann	dd03aeb966	bdd: use python library where possible Replace calls to PHP scripts with direct calls into the nominatim Python library where possible. This speeds up tests quite a bit.	2021-02-26 16:14:29 +01:00
Sarah Hoffmann	5b7483ada5	return 404 for details when no object is found in database Fixes #2157.	2021-02-22 16:28:29 +01:00
Sarah Hoffmann	f08078ccca	bdd tests: directly call python code for setup-website	2021-02-19 18:20:55 +01:00
Sarah Hoffmann	a60c34bded	use a frozen DB for API tests This way we also test that dropping does the right thing.	2021-02-17 22:35:27 +01:00
Sarah Hoffmann	3cb6f3e460	use DataDir constant for data only So far the data directory constant has pointed to the source directory to be usable with different subdirectories. Now only the data subdirectory itself is being used with the constant, so point to the directory directly.	2021-02-09 20:04:08 +01:00
Sarah Hoffmann	8ffd7d9243	remove unused BINDIR constant	2021-02-09 19:30:31 +01:00
Sarah Hoffmann	298ed11261	introduce constant for configuration directory This replaces {data_dir}/settings throughout the code, so that the configuration may be placed somewhere else in the directory structure (e.g. in /etc).	2021-02-09 18:45:45 +01:00
Sarah Hoffmann	b9517c99ae	rename sql directory to lib-sql Also introduces a separate constant for the sql directory, so that it can be put separately from the rest of the data if required.	2021-02-09 15:26:56 +01:00
Sarah Hoffmann	db3ced17bb	rename lib to lib-php	2021-02-09 11:52:07 +01:00
Sarah Hoffmann	504922ffbe	remove old nominatim.py in favour of 'nominatim index' The PHP scripts need to know the position of the nominatim tool in order to call it. This is handed in as environment variable, so it can be set by the Python script.	2021-01-18 15:43:27 +01:00
Sarah Hoffmann	340e7f7210	bdd: complete coverage for API tests Also removes some functions that are no longer used and fixes debug output where the tests found an issue.	2021-01-17 16:12:06 +01:00
Sarah Hoffmann	171ed36e36	bdd: remove duplicated tests	2021-01-16 16:57:28 +01:00
Sarah Hoffmann	c6c907d451	bdd: clean up and extend API tests for details - remove duplicates created by replacing HTML tests with JSON tests - add tests for newer functions for returning geometries and hierarchies	2021-01-16 12:04:13 +01:00
Sarah Hoffmann	19ab038724	collect coverage for /website directory as well	2021-01-15 20:27:14 +01:00
Sarah Hoffmann	2f73bb3643	bdd: directly call utility scripts in lib This removes the dependency on php-symfony-dotenv for the tests.	2021-01-14 18:19:22 +01:00
Sarah Hoffmann	5d656891ba	bdd: convert API tests to smaller test db Changes BDD API tests to restrict themselves to Liechtenstein. One test moved to DB as no appropriate data is available.	2021-01-09 16:59:46 +01:00
Sarah Hoffmann	74122dc965	bdd: improve assert output for API query checks Adds wrapper function for checking address parts and more explanation strings to asserts.	2021-01-09 16:58:37 +01:00
Sarah Hoffmann	ee18a511c6	bdd: import API test DB as part of step setup In the future, the BDD tests will simply set up the required test database themselves. Like with the template database, it is not reimported when it already exists unless that is explicitly forced. Makes most of the API tests currently fail because they still point to old test data.	2021-01-07 11:51:38 +01:00
Sarah Hoffmann	73cbb6eb9a	bdd: clean up DB ops steps Adds comments and modernizes code.	2021-01-06 16:37:32 +01:00
Sarah Hoffmann	1f29475fa5	bdd: move column comparison in separate file Introduces a new class DBRow that encapsulates the comparison functions. This also is responsible for formatting more informative assert messages. place and placex steps are unified.	2021-01-06 12:28:09 +01:00
Sarah Hoffmann	d586b95ff1	bdd: move nominatim id reader to separate file	2021-01-05 16:00:48 +01:00
Sarah Hoffmann	25557e5f14	bdd: factor out reindexing on updates	2021-01-05 15:17:46 +01:00
Sarah Hoffmann	197870e67a	bdd: move place table inserter into separate file Also simplifies usage by implementing a function that inserts a complete table row.	2021-01-05 12:12:59 +01:00
Sarah Hoffmann	b8e39d2dde	bdd: move scene setup to OSM data steps The step has nothing to do with the database.	2021-01-05 11:42:28 +01:00
Sarah Hoffmann	5dfa76a610	bdd: switch to auto commit mode Put the connection to the test database into auto-commit mode and get rid of the explicit commits. Also use cursors always in context managers and unify the two implementations that copy data from the place table.	2021-01-05 11:42:28 +01:00
Sarah Hoffmann	58c471c627	bdd: remove class for lazy formatting assert in combination with format() does the right thing and calls the __str__() method only when an assertion hits.	2021-01-05 10:39:44 +01:00
Sarah Hoffmann	213bf7d19d	bdd: rename db_ops steps Now all files implementing steps are called steps_*.py.	2021-01-05 10:20:00 +01:00
Sarah Hoffmann	12ae8a4ed3	bdd: move output format computation into response	2021-01-05 10:17:59 +01:00
Sarah Hoffmann	8a93f8ed94	bdd: move Response classes in own file and simplify Removes most of the duplicated parse functions, introduces a common assert_field function with a more expressive error message.	2021-01-05 10:03:47 +01:00
Sarah Hoffmann	2712c5f90e	bdd: rename and clean up osm_data steps Move common OPL creation code into a function and remove unused imports.	2021-01-04 20:17:17 +01:00
Sarah Hoffmann	72587b08fa	bdd: move external process execution in separate func	2021-01-04 19:58:59 +01:00
Sarah Hoffmann	faa85ded50	bdd: move NominatimEnvironment into separate file Also cleans up and modernizes the code and adds documentation.	2021-01-04 17:54:51 +01:00
Sarah Hoffmann	14e5bc7a17	bdd: move grid generation code into geometry factory	2021-01-04 17:04:47 +01:00
Sarah Hoffmann	f727620859	bdd: move geometry creation into separate file Also renames the OsmDataFactory to the more appropriate GeometryFactory and modernizes code for python3.	2021-01-04 16:34:40 +01:00
Sarah Hoffmann	843d3a137c	remove stale code for python2	2021-01-04 14:14:34 +01:00
Sarah Hoffmann	4aba70caee	create a temporary project dir for tests The project directory contains the website script as configured through the test configuration. This means that tests are now completely independent of any configuration that may be contained in the build directory. Also removes the hack to inject additional settings via an environment variable.	2021-01-04 11:39:45 +01:00
Sarah Hoffmann	4ca7197826	replace nose assertions with simple asserts	2021-01-03 17:21:24 +01:00
Sarah Hoffmann	33b038ce6f	tests: always create the config file There is also one database test that uses the API functions.	2020-12-19 17:55:46 +01:00
Sarah Hoffmann	d97aed8741	adapt tests to new dotenv environment DB tests now can simply set the environment to change configuration variables. API tests still rely on a configuration file. Also, query.php needs to set up the CONST_* variables to work with the query scripts. That is a tiny bit messy and duplicates code but this part will need to be reworked later.	2020-12-19 14:33:04 +01:00
Sarah Hoffmann	b5480f6e36	reorganise path settings in config CONST_BasePath is split into separate configuration variables for binaries, libraries and data. These variables as well as the installation path are now set in the executable directly and no longer configurable via project settings. This is the first step towards an installable software. The executables should know per installation where to find their necessary data to execute. Project configuration needs to be restricted to settings that really concern the specific Nominatim installation.	2020-12-19 14:33:04 +01:00
Sarah Hoffmann	b59d01fe85	update country names Copies all name:xx country names that are in OSM as of today into the country name fallback table.	2020-12-09 17:52:25 +01:00
Sarah Hoffmann	65d8770b28	update country_names from OSM data Update names in the country_names table on the fly from incoming OSM country data. Adding a small sanity check that the country must be an OSM relation and within the area where we expect the country to be.	2020-12-09 11:38:19 +01:00
Sarah Hoffmann	987d60ccda	place nodes can only be linked once against boundaries If a place node is already linked against a boundary, it should not be used for linking again. It is usually a sign of a mapping error, when there are multiple boundary candidates. This change just avoids inconsistent data in the database, it does not guarantee that the linking is against the more correct boundary.	2020-12-02 15:31:02 +01:00
Sarah Hoffmann	63544db8f9	null entries need to be typed	2020-12-01 14:54:42 +01:00
Sarah Hoffmann	7295cad715	compute address parts for rank 30 objects on the fly Rank 30 objects usually use the address parts of their parent. When the parent has address parts that are areas but not marked as isaddress, then the parent might go through multiple administrative areas. In that case recheck if the right area has been chosen for the object in question instead of relying on isaddress. Note that we really only have to do the recomputation in the case of 'isarea = True and isaddress = False' which hopefully keeps the number of additional geometric operations we have to do to a minimum. There is one more special case to be taken into account here: a street may go through two administrative areas and a house along that street is placed in one of the areas while the addr:* tags say it belongs to the other. In that case we must not switch the isaddress to the one it is situated in. To avoid that, recheck the address names against the name of the area. That is not perfect but should cover most cases. Fixes #328.	2020-12-01 11:58:25 +01:00
Sarah Hoffmann	22800d7d59	Search housenumbers with unknown address parts by housenumber term House numbers need special handling because they may appear after the street term. That means we cannot just use them as the main name for searches where the address has its own search term entries. Doing this right now, we are able to find '40, Main St, Town' but not 'Main St 40, Town'. This switches to using the housenumber token as the name term instead. House number tokens can get special handling when building the search query that covers the case where they come after the street. The main disadvantage is that this once more increases the number of possible search interpretations of which we have already too many. no penalty for housenumber searches	2020-11-25 11:36:10 +01:00
Sarah Hoffmann	b4b50eef15	search rank 30 must always go with address rank 30	2020-11-24 17:57:28 +01:00
Sarah Hoffmann	49083c2597	Merge pull request #2058 from lonvia/split-address-words Split addr:* tags into words before adding to the search index	2020-11-18 08:58:17 +01:00
Sarah Hoffmann	ffb2c93ba3	POIs with unknown addr:place must add parent name to address The previous behaviour was a left-over from a former version where such POIs parented to the street. Now that they parent to places, it should be included.	2020-11-17 19:44:43 +01:00
Sarah Hoffmann	30a6b6bdac	split addr: tags into words before adding to the search index Address parts are only matched by single partial words. If the addr: names are not split, then multi-word names cannot be found.	2020-11-17 18:03:33 +01:00
Sarah Hoffmann	9ede048769	disallow linking for postcode areas	2020-11-17 10:53:26 +01:00
Sarah Hoffmann	885dc0a8e1	more tests for absence of additional addressline entries	2020-11-16 15:28:01 +01:00
Sarah Hoffmann	7324431b12	get additional addresses for rank 30 objects get_addressdata() now also checks if the place itself has entries in the place_addressline table and merges them into the results. Also restrict checking for address tag places to cases where the name cannot be found in the parent's address search terms. Looking up all address tags is just too slow.	2020-11-16 15:28:01 +01:00
Sarah Hoffmann	021f2bef4c	get address terms from address tags for rank 30 For rank 30 objects add extra elements into the place_addressline table.	2020-11-16 15:28:01 +01:00
Sarah Hoffmann	6260fef2e8	add test for placex from addr tags	2020-11-16 15:28:01 +01:00
Sarah Hoffmann	c7472662a6	lookup places for address tags for rank < 30 While previously the content of addr:* tags was only added to the list of address search keywords, we now really look up the matching place. This has the advantage that we pull in all potential translations from the place, just like all the other address terms that are looked up by neighbourhood search. If no place can be found for a given name, the content of the addr:* tag is still added to the search keywords as before.	2020-11-16 15:28:01 +01:00
Sarah Hoffmann	928c6245c9	Merge pull request #2038 from lonvia/addresses-for-large-areas Improve addresses for large areas	2020-11-03 08:49:01 +01:00
Sarah Hoffmann	33378dcf6e	remove tests for icon attribute The icon attribute requires the CONST_MapIcon_URL to be present which we cannot guarantee for the tests.	2020-11-02 16:46:29 +01:00
Sarah Hoffmann	b2ebf4b4b7	adapt tests to rank changes of natural	2020-11-02 11:42:10 +01:00
Sarah Hoffmann	d86cf6801f	remove tests for HTML output	2020-10-29 11:13:32 +01:00
Sarah Hoffmann	95f83b90d2	minor fixes for geometry computation during boundary ranking Go back to using centroid when determining if one admin level is within another. There are cases where boundaries are slightly misaligned due to mapping errors (not using the same ways in the relations). Only declare boundaries the same when they have the same wikidata tag _and_ have exactly the same geometry. This works around tagging errors with the wikidata tag, which happen because of automated edits to the wikidata tag.	2020-10-28 10:49:26 +01:00
Sarah Hoffmann	7a16909219	detect and remove admin boundary duplicates The Polish community maps admin boundaries that span multiple levels by duplicating the boundary relations. Detect this situation by looking out for matching wikidata tags. The higher ranked duplicates are then thrown out from the address pool by setting their address rank to 0.	2020-10-28 10:49:26 +01:00
Sarah Hoffmann	b0ef84caae	add tests for rank computation	2020-10-17 17:51:22 +02:00
Sarah Hoffmann	64899ef54b	add tests for address computation	2020-10-16 11:07:17 +02:00
Sarah Hoffmann	ca680fc9fc	make housenumber interpolation tests more lenient	2020-10-11 12:04:53 +02:00


425 Commits