Nominatim

mirror of https://github.com/osm-search/Nominatim.git synced 2024-11-25 19:35:02 +03:00

Author	SHA1	Message	Date
Sarah Hoffmann	8f3845660f	add full tokens to addresses This is now needed to weigh results.	2024-05-02 11:47:35 +02:00
Sarah Hoffmann	07b7fd1dbb	add address counts to tokens	2024-03-18 11:25:48 +01:00
Sarah Hoffmann	bb5de9b955	extend word statistics to address index Word frequency in names is not sufficient to interpolate word frequency in the address because names of towns, states etc. are much more frequently used than, say street names.	2024-03-18 11:25:48 +01:00
marc tobias	7205491b84	Correct some typos	2024-02-26 18:13:30 +01:00
Sarah Hoffmann	05fad607ff	make Python frontend default and PHP optional	2024-02-19 18:39:01 +01:00
Sarah Hoffmann	bc51378aee	properly grant rights to read-only user when switching out word table	2024-02-06 17:30:01 +01:00
Sarah Hoffmann	81eed0680c	recreate word table when refreshing counts The counting touches a large part of the word table, leaving bloated tables and indexes. Thus recreate the table instead and swap it in.	2024-02-04 21:35:10 +01:00
Sarah Hoffmann	cafd8e2b1e	fix typos and grammar issues	2023-08-29 12:14:44 +02:00
Sarah Hoffmann	d3372e69ec	update to modern mkdocstrings python handler	2023-08-25 21:40:20 +02:00
Sarah Hoffmann	252fe42612	Merge pull request #3122 from miku0/sanitizer-final Adds sanitizer for Japanese addresses to correspond to block address	2023-08-01 10:38:58 +02:00
miku0	67e1c7dc72	Moved KANJI_MAP to icu-rules	2023-07-31 11:57:49 +00:00
miku0	2350018106	Fixed cosmetic issues	2023-07-31 02:39:04 +00:00
miku0	fac8c32cda	Moved KANJI_MAP to global variable	2023-07-26 21:43:22 +00:00
Sarah Hoffmann	8cba65809c	older version of Postgres cannot convert jsonb to int	2023-07-26 17:45:21 +02:00
miku0	848e5ac5de	Correction to PR's comment	2023-07-26 09:50:25 +00:00
miku0	0722495434	add japanese sanitizer	2023-07-26 07:54:58 +00:00
Sarah Hoffmann	faeee7528f	move warm script to python code	2023-07-25 21:39:23 +02:00
Sarah Hoffmann	d7a3039c2a	also switch legacy tokenizer to new street/place choice behaviour	2023-06-30 17:03:17 +02:00
Sarah Hoffmann	645ea5a057	use information from tokenizer to determine street vs. place address So far the SQL logic used the information from the address field to determine if an address is attached to a street or place. This changes the logic to use the information provided in the token_info. This allows sanitizers to enforce a certain parenting without changing the visible address information.	2023-06-30 11:08:25 +02:00
biswajit-k	8f03c80ce8	generalize filter for sanitizers	2023-04-01 19:24:09 +05:30
Sarah Hoffmann	9a5f75dba7	Merge pull request #2993 from biswajit-k/delete-tags Adds sanitizer for preventing certain tags to enter search index based on parameters	2023-03-09 14:31:45 +01:00
biswajit-k	ca149fb796	Adds sanitizer for preventing certain tags to enter search index based on parameters fix: pylint error added docs for delete tags sanitizer fixed typos in docs and code comments fix: python typechecking error fixed rank address type Revert "fixed typos in docs and code comments" This reverts commit 6839eea755a87f557895f30524fb5c03dd983d60. added default parameters and refactored code added test for all parameters	2023-03-09 14:18:39 +05:30
biswajit-k	36388cafe9	fixed typos in docs and code comments	2023-03-06 17:09:38 +05:30
Sarah Hoffmann	2abe9e6fd9	use data paths from new nominatim.paths	2022-11-27 12:15:41 +01:00
Sarah Hoffmann	fd3dec8efe	add sanitizer for TIGER tags Currently only takes over cleaning the tiger:county data. This was done by the import until now.	2022-11-23 10:37:27 +01:00
Sarah Hoffmann	8d082c13e0	adapt to new type annotations from typeshed Some more functions from psycopg are now properly annotated. No ignoring necessary anymore.	2022-08-09 11:06:54 +02:00
Sarah Hoffmann	9864b191b1	fix various typos	2022-07-31 17:10:35 +02:00
Sarah Hoffmann	51b6d16dc6	overhaul the token analysis interface The functional split between the two functions is now that the first one creates the ID that is used in the word table and the second one creates the variants. There no longer is a requirement that the ID is the normalized version. We might later reintroduce the requirement that a normalized version be available but it doesn't necessarily need to be through the ID. The function that creates the ID now gets the full PlaceName. That way it might take into account attributes that were set by the sanitizers. Finally rename both functions to something more sane.	2022-07-29 15:14:11 +02:00
Sarah Hoffmann	34d27ed45c	move PlaceName into the generic data module	2022-07-29 11:42:20 +02:00
Sarah Hoffmann	094100bbf6	harmonize spelling Stick with the American spelling of Analyze.	2022-07-29 10:52:01 +02:00
Sarah Hoffmann	c8873d34af	harmonize interface of token analysis module The configure() function now receives a Transliterator object instead of the ICU rules. This harmonizes the parameters with the create function.	2022-07-29 10:43:07 +02:00
Sarah Hoffmann	f0d640961a	add documentation for custom token analysis	2022-07-29 09:41:28 +02:00
Sarah Hoffmann	3746befd88	add documentation for sanitizer interface Also switches mkdocstrings to 0.18 with the rather unfortunate consequence that now mkdocstrings-python-legacy is needed as well.	2022-07-28 22:00:29 +02:00
Sarah Hoffmann	d819036daa	add support for external token analysis modules	2022-07-25 16:27:22 +02:00
Sarah Hoffmann	6d41046b15	add support for external sanitizer modules	2022-07-25 16:10:19 +02:00
Kian-Meng Ang	f5e52e748f	docs: fix typos	2022-07-20 22:05:31 +08:00
Sarah Hoffmann	83054af46f	remove typing_extensions requirement The typing_extensions package is only necessary now when running mypy. It won't be used at runtime anymore.	2022-07-18 09:55:58 +02:00
Sarah Hoffmann	9963261d8d	add type annotations to special phrase importer	2022-07-18 09:54:29 +02:00
Sarah Hoffmann	6c6bbe5747	add type annotations for ICU tokenizer	2022-07-18 09:47:57 +02:00
Sarah Hoffmann	18b16e06ca	add type annotations for legacy tokenizer	2022-07-18 09:47:57 +02:00
Sarah Hoffmann	e37cfc64d2	add type annotations to ICU tokenizer helper modules	2022-07-18 09:47:57 +02:00
Sarah Hoffmann	d35e3c25b6	add type annotations for token analysis No annotations for ICU types yet.	2022-07-18 09:47:57 +02:00
Sarah Hoffmann	62eedbb8f6	add type hints for sanitizers	2022-07-18 09:47:57 +02:00
Sarah Hoffmann	d0c44431d0	add typing information for place_info and country_info	2022-07-18 09:47:57 +02:00
Sarah Hoffmann	cbbcbb1fd7	move country_info into data submodule	2022-07-06 11:08:36 +02:00
Sarah Hoffmann	bce93d60bd	move PlaceInfo into data submodule This data structure is shared between indexer and tokenizer.	2022-07-06 10:54:47 +02:00
Sarah Hoffmann	612d34930b	handle postcodes properly on word table updates update_postcodes_from_db() needs to do the full postcode treatment in order to derive the correct word table entries.	2022-06-23 23:42:31 +02:00
Sarah Hoffmann	5be320368c	add documentation for postcode customization	2022-06-23 23:42:31 +02:00
Sarah Hoffmann	7f2ad4ac7e	fix linting issue	2022-06-23 23:42:31 +02:00
Sarah Hoffmann	0f00f4968c	fix up BDD tests for postcode changes Includes smaller code fixes found by the tests.	2022-06-23 23:42:31 +02:00


195 Commits