Nominatim

mirror of https://github.com/osm-search/Nominatim.git synced 2024-09-20 15:37:49 +03:00

Author	SHA1	Message	Date
Sarah Hoffmann	500c61685b	remove unused variables As reported by sonarqube.	2021-07-09 16:36:42 +02:00
Sarah Hoffmann	106d960f84	fix bad use of echo in PHP output	2021-07-09 12:50:35 +02:00
Sarah Hoffmann	322fa19ceb	Merge pull request #2390 from lonvia/responsible-disclosure Add security issue disclosure policy	2021-07-09 12:32:37 +02:00
Sarah Hoffmann	5bea0b6086	add security issue disclosure policy	2021-07-09 11:36:59 +02:00
Sarah Hoffmann	a5970d7548	Merge pull request #2384 from lonvia/actions-add-icu-tokenizer CI: run tests on Ubuntu 18	2021-07-07 14:39:53 +02:00
Sarah Hoffmann	c216144dd1	add missing pyyaml requirement	2021-07-07 11:29:33 +02:00
Sarah Hoffmann	42e08da7ca	enable PHP 7.2 for Ubuntu 18 CI	2021-07-07 11:29:33 +02:00
Sarah Hoffmann	a2edbbf78a	cannot use capture_output in subprocess.run Only available since Python 3.7.	2021-07-06 22:57:42 +02:00
Sarah Hoffmann	1e86dc1d93	remove default parameter for namedtuple This is only available in Python 3.7.	2021-07-06 22:57:42 +02:00
Sarah Hoffmann	54f295be52	CI: run tests on older Ubuntu version as well	2021-07-06 22:57:42 +02:00
Sarah Hoffmann	8bc3c0a07c	Merge pull request #2382 from lonvia/remove-json-config Remove outdated ICU tokenizer JSON config	2021-07-05 12:34:34 +02:00
Sarah Hoffmann	d75bc20174	Merge pull request #2383 from lonvia/remove-more-names Exclude name:etymology and name:signed	2021-07-05 12:34:16 +02:00
Sarah Hoffmann	fd8751658f	exclude name:etymology and name:signed name:etymology contains a description of the name origin and is thus more informative than search-worthy. name:signed basically indicates that the feature does not have a name.	2021-07-05 11:04:16 +02:00
Sarah Hoffmann	4db5a1a0b8	remove outdated ICU tokenizer JSON config	2021-07-05 11:01:35 +02:00
Sarah Hoffmann	4c52777ef0	Merge pull request #2371 from lonvia/increase-python-version Increase minimum required Python version to 3.6	2021-07-05 10:32:38 +02:00
Sarah Hoffmann	d4c7bf20a2	Merge pull request #2381 from lonvia/reorganise-abbreviations Reorganise abbreviation handling	2021-07-05 10:32:16 +02:00
Sarah Hoffmann	affe1300d9	add warning about experimental nature of ICU tokenizer	2021-07-04 10:44:58 +02:00
Sarah Hoffmann	62d5984b1b	limit the number of variants that can be produced	2021-07-04 10:28:28 +02:00
Sarah Hoffmann	c32551b4e0	restrict partial word counting to names of reasonable length The partial word count does not split names to save a bit of time. The result is that it might encounter unreasonably long names which in truth consist of multiple words. No accurate statistics are needed so simply restrict the count to words shorter than 75 characters.	2021-07-04 10:28:28 +02:00
Sarah Hoffmann	e85f7e7aa9	fix subsequent replacements Two replacement words directly following each other did not work as expected because each expects a space at the beginning/end while there was only one space available. Also forbid composing a word after a space was added in the end by a previous replacement.	2021-07-04 10:28:28 +02:00
Sarah Hoffmann	7b0f6b7905	leave ICU variant properties empty for now Saving unused properties causes unnecessary duplicates.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	0894ce9dc3	import abbreviations from OSM Wiki Replaces the variant rules with a slightly cleaned-up version of the abbreviation lists at https://wiki.openstreetmap.org/wiki/Name_finder:Abbreviations	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	4fd2e961b6	improve normalization Make sure all special symbols are removed during normalization already. Those won't be interpreted in any way because they are unlikely to be searched for.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	b9fbfeff67	only consider partials in multi-words for initial count This ensures that it is less likely that we exclude meaningful words like 'hauptstrasse' just because they are frequent.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	5dd24b3ef0	add documentation for ICU tokenizer configuration	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	62828fc5c1	switch to a more flexible variant description format The new format combines compound splitting and abbreviation. It also allows restricting rules to additional conditions (like language or region). This latter ability is not used yet.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	a6aa6360e0	use yaml tag syntax to mark include files	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	c4f6c06f44	add dependency on datrie	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	0d80a9b897	tests for composing decomposed suffixes	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	f70930b1a0	make compound decomposition pure import feature Compound decomposition now creates a full name variant on import just like abbreviations. This simplifies query time normalization and opens a path for changing abbreviation and compound decomposition lists for an existing database.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	9ff4f66f55	complete tests for icu tokenizer	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	32ca631b74	fix full term token in special phrases	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	2e81084f35	complete tests for rule loader	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	a0a7b05c9f	correctly quote strings when copying in data Encapsulate the copy string in a class that ensures that copy lines are written with correct quoting.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	2f6e4edcdb	update unit tests for adapted abbreviation code	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	1bd9f455fc	add abbreviations from legacy tokenizer These abbreviations are not a perfect fit anymore because abbreviation replacement is now applied before transliteration.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	2e3c5d4c5b	adapt tests for ICU tokenizer	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	8413075249	move abbreviation computation into import phase This adds precomputation of abbreviated terms for names and removes abbreviation of terms in the query. Basic import works but still needs some thorough testing as well as speed improvements during import. New dependency for python library datrie.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	6ba00e6aee	icu tokenizer: move transliteration rules into a separate file The tokenizer configuration has become difficult to handle due to the additional manual transliteration rules. Allow a separate rule file that is given to the ICU library as is.	2021-07-04 10:28:20 +02:00
Sarah Hoffmann	de4fac33dc	docs: nominatim-ui should be installed from the release The development version does not provide the pre-packaged dist directory anymore.	2021-07-03 21:16:52 +02:00
Sarah Hoffmann	c9984669a7	Merge pull request #2373 from lonvia/tweak-search-cost Further tweaking of search cost	2021-06-26 16:21:08 +02:00
Sarah Hoffmann	63755c31ff	remove penalty for full words in address Now that multi-word partials no longer exist, multi-word full words need to be used to search in addresses and therefore no longer should have a penalty. Also changes the condition when a full word is included into the address. It is no longer relevant if an equivalent partial exists but only if the term consists of more than one word.	2021-06-26 11:37:15 +02:00
Sarah Hoffmann	161f5f5cee	adjust penalty for housenumber-in-name searches When searching for house numbers in the name (for place-only terms) then the same penalties need to apply as for the regular house number search. Change the code to first compute the penalties and then create the new search variants.	2021-06-26 11:37:15 +02:00
Sarah Hoffmann	c7073a1fc0	increase minimum Python to 3.6 Python 3.6 introduces formatted string literals and flag enums as well as a much faster dict implementation. These changes make the code so much simpler as to warrant dropping Python 3.5 support. Affected distributions are Ubuntu 16.04 and Debian Stretch.	2021-06-21 18:37:37 +02:00
Sarah Hoffmann	e7b4fc70e7	make sure old data gets deleted on place type change When changing from some other place type to place=postcode make sure that the old place type entry in the place table is deleted.	2021-06-18 10:58:41 +02:00
Sarah Hoffmann	457982e1d2	update postcode in place if it already exists	2021-06-18 00:28:52 +02:00
Sarah Hoffmann	aa558e6080	Merge pull request #2369 from lonvia/exclude-poi-from-housenumber-search Do not return POIs when dropping house number in query	2021-06-17 15:30:05 +02:00
Sarah Hoffmann	fe11d3cbbd	do not return POIs when dropping house number in query We've previously added searching through rank 30 in a house number search to enable searches for house number+name. This had the unintended side effect that rank 30 objects are also returned in a search that dropped the house number from the query. This is wrong because POIs cannot function as a parent to a house number. This fix drops all rank 30 objects from the results for a house number search if they do not match the requested house number.	2021-06-17 14:21:20 +02:00
Sarah Hoffmann	1ce223a83b	Merge pull request #2360 from AntoJvlt/postcodes-place-table Use place instead of placex to compute postcodes	2021-06-16 11:45:07 +02:00
AntoJvlt	3676310efe	Improved performance of the postcodes query and some code cleaning	2021-06-12 15:46:08 +02:00

3370 Commits