Nominatim

mirror of https://github.com/osm-search/Nominatim.git synced 2024-12-26 06:22:13 +03:00

Author	SHA1	Message	Date
Sarah Hoffmann	c3788d765e	add consistent SPDX copyright headers	2022-01-03 16:23:58 +01:00
Sarah Hoffmann	be65c8303f	export more data for the tokenizer name preparation Adds class, type, country and rank to the exported information and removes the rather odd hack for countries. Whether a place represents a country boundary can now be computed by the tokenizer.	2021-09-29 11:54:14 +02:00
Sarah Hoffmann	59fe74ddf6	move name matching into tokenizer module Instead of requesting the match tokens from the tokenizer when looking for parent streets/places and address parts, hand in the saved tokens and ask if they match. This gives the tokenizer more freedom to decide how name matching should be done.	2021-09-27 11:36:19 +02:00
Sarah Hoffmann	28ee3d0949	move linking of places to the preparation stage Linked places may bring in extra names. These names need to be processed by the tokenizer. That means that the linking needs to be done before the data is handed to the tokenizer. Move finding the linked place into the preparation stage and update the name fields. Everything else is still done in the indexing stage.	2021-08-20 22:44:17 +02:00
Sarah Hoffmann	f74dc38766	always compute guessed postcode for POIs from centroid When guessing postcodes from the area, only postcodes within that area are accepted. For POIs that is usually not what we want as the postcode would have to be within a house for example. Fixes #2301.	2021-05-26 11:15:13 +02:00
Sarah Hoffmann	0da481f207	remove debug code	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	d75a235c1f	use address tokens in SQL	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	ffc2d82b0e	move postcode normalization into tokenizer	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	d8ed1bfc60	move housenumber handling to tokenizer Normalization and token computation are now done in the tokenizer. The tokenizer keeps a cache of the hundred most used house numbers to keep the number of calls to the database low.	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	d711f5a81e	move name token creation into tokenizer Name tokens are now handed in via token_info and used from there. Also moves the generic search name insertion function back to placex_triggers.sql.	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	1b1ed820c3	introduce index for finding surrounding buildings	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	9397bf54b8	introduce external processing in indexer Indexing is now split into three parts: first a preparation step that collects the necessary information from the database and returns it to Python. In a second step the data is transformed within Python as necessary and then returned to the database through the usual UPDATE which now not only sets the indexed_status but also other fields. The third step comprises the address computation which is still done inside the update trigger in the database. The second processing step doesn't do anything useful yet.	2021-04-30 11:30:51 +02:00
Sarah Hoffmann	e7266b52ae	simplify name matching between boundary and place node Instead of normalising the names simply compare them in lower case. This removes the dependency on the tokenizer for linking boundaries and nodes. When looking up the linked places by place type also allow that one name is simply contained in the other. This catches the frequent case where one of the names has an addendum (e.g. Newport vs. City of Newport). Drops the special index for the name lookup and instead relies on a slightly extended version of the geometry index used for reverse lookup. Saves around 100MB on a planet.	2021-04-14 17:52:59 +02:00
Sarah Hoffmann	6cbef84cad	use new transliteration in initial housenumber word computation The new create_housenumber_id() function splits housenumber lists correctly. Otherwise there is no difference.	2021-04-04 15:26:47 +02:00
Sarah Hoffmann	55fcc44c8c	correctly handle housenumber lists Lists are now standardised to use a semicolon separator.	2021-04-04 15:26:47 +02:00
Sarah Hoffmann	16a66b5326	move transliteration of housenumbers into indexing Housenumbers are now saved in transliterated form in the housenumber column. This saves the transliteration step during lookup.	2021-04-04 15:26:47 +02:00
Sarah Hoffmann	d2bd6aa78d	introduce jinja2 for preprocessing SQL Replaces various hand-crafted replacements of varying format with a single Jinja2 templating mechanism. Allows full access to configuration if necessary.	2021-03-03 17:51:08 +01:00
Sarah Hoffmann	b9517c99ae	rename sql directory to lib-sql Also introduces a separate constant for the sql directory, so that it can be put separately from the rest of the data if required.	2021-02-09 15:26:56 +01:00

18 Commits