Commit Graph

2600 Commits

Author SHA1 Message Date
Sarah Hoffmann
df12954312 fix use of term count in partial terms
Term count for partial words is one less than the actual number
of words. Take that into account when adding to the search rank.

Fixes #2081.
2020-12-01 17:21:01 +01:00
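A minimal sketch of the off-by-one described in df12954312, written in Python for illustration and not taken from the repository. The function names and the rule of one rank point per word are assumptions; only the fact that the stored term count of a partial word is one less than its real word count comes from the commit message.

```python
def words_in_partial(term_count: int) -> int:
    """Return the real number of words behind a partial term.

    Per the commit message, the stored term count of a partial term
    is one less than the number of words it contains.
    """
    return term_count + 1


def add_partial_to_rank(search_rank: int, term_count: int) -> int:
    """Hypothetical rank update: add one point per actual word."""
    return search_rank + words_in_partial(term_count)
```

With this correction a single-word partial (stored count 0) still adds one point to the rank instead of none.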
Sarah Hoffmann
a9357b4dce
Merge pull request #2082 from lonvia/compute-address-on-the-fly-II
Compute address for POIs on the fly
2020-12-01 16:41:31 +01:00
Sarah Hoffmann
63544db8f9 null entries need to be typed 2020-12-01 14:54:42 +01:00
Sarah Hoffmann
7295cad715 compute address parts for rank 30 objects on the fly
Rank 30 objects usually use the address parts of their parent.
When the parent has address parts that are areas but not marked
as isaddress, then the parent might go through multiple administrative
areas. In that case recheck if the right area has been chosen
for the object in question instead of relying on isaddress.
Note that we really only have to do the recomputation in the
case of 'isarea = True and isaddress = False' which hopefully
keeps the number of additional geometric operations we have to do
to a minimum.

There is one more special case to be taken into account here: a
street may go through two administrative areas and a house along
that street is placed in one of the areas while the addr:* tags
say it belongs to the other. In that case we must not switch
the isaddress to the one it is situated in. To avoid that, recheck
the address names against the name of the area. That is not perfect
but should cover most cases.

Fixes #328.
2020-12-01 11:58:25 +01:00
Sarah Hoffmann
ff85da0a31 cleanup get_addressdata
Save location data in a ROW instead of using separate variables
for each value.
2020-11-30 22:54:36 +01:00
Sarah Hoffmann
75b2d7ca99
Merge pull request #2080 from donalhunt/fix-Migration.md-typos
Migration.md: fix typos, improve style consistency and readability.
2020-11-30 16:21:35 +01:00
Donal Hunt
3c9eeb11fa Migration.md: fix typos, improve style consistency and readability. 2020-11-30 11:59:10 +00:00
Sarah Hoffmann
63bacaee2e
Merge pull request #2079 from lonvia/improve-progress-logging
Improve progress logging during indexing
2020-11-30 11:42:08 +01:00
Sarah Hoffmann
5016eace34 improve progress logging during indexing
Wait for 2 seconds before logging the first progress, so that we
have numbers that are a bit more reliable statistically speaking.

Also provides an actual implementation for the log_interval
parameter and fixes some small style issues.
2020-11-30 10:59:29 +01:00
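The logging behaviour described in 5016eace34 can be sketched roughly as follows. This is an illustrative Python class, not the project's implementation: the class name, the `initial_delay` and `clock` parameters, and the message format are assumptions. Only the 2-second wait before the first report and the existence of a `log_interval` parameter come from the commit message.

```python
import time


class ProgressLogger:
    """Collect progress messages, delaying the first one (hypothetical sketch)."""

    def __init__(self, total, log_interval=1.0, initial_delay=2.0,
                 clock=time.monotonic):
        self.total = total
        self.log_interval = log_interval
        self.clock = clock            # injectable so the sketch can be tested
        self.start = clock()
        self.done = 0
        # Do not report before initial_delay seconds have passed, so the
        # first rate estimate is based on more than a handful of items.
        self.next_log = self.start + initial_delay
        self.messages = []

    def add(self, num=1):
        """Record num finished items and log if the next report is due."""
        self.done += num
        now = self.clock()
        if now < self.next_log:
            return
        elapsed = now - self.start
        rate = self.done / elapsed if elapsed > 0 else 0.0
        self.messages.append(
            f"Done {self.done}/{self.total} in {elapsed:.0f}s ({rate:.1f}/s)")
        self.next_log = now + self.log_interval
```

Messages are stored in a list here only to keep the sketch free of side effects; a real logger would hand them to the logging framework.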
Sarah Hoffmann
2e5c8b5cd3
Merge pull request #2077 from lonvia/optimize-large-rank-0-areas
Restrict size of features that get a full address search
2020-11-26 14:40:54 +01:00
Sarah Hoffmann
bf0f81adcb
Merge pull request #2076 from lonvia/search-name-index-migration
Docs: add migration for search_name_* tables
2020-11-26 12:01:38 +01:00
Sarah Hoffmann
2db751700e restrict size of features that get a full address search
It would be nice to always compute addresses for rank 0 objects
over the complete geometry, so that they can be found via all
the admin boundaries that they intersect. However, there are a
couple of extremely large boundaries in OSM (like timezones)
where this results in thousands of possible address candidates
that need to be checked. Fall back to getting the address of the
centroid for them.
2020-11-26 11:53:58 +01:00
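The fallback described in 2db751700e can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: geometries are reduced to bounding boxes, the size test uses the box area, and the threshold `MAX_AREA` is an invented value. The commit message only states that very large features get their address from the centroid instead of the full geometry.

```python
from typing import Tuple, Union

BBox = Tuple[float, float, float, float]   # min_x, min_y, max_x, max_y
Point = Tuple[float, float]

MAX_AREA = 2.0  # hypothetical size limit in square degrees


def bbox_area(bbox: BBox) -> float:
    """Area of the bounding box."""
    return (bbox[2] - bbox[0]) * (bbox[3] - bbox[1])


def bbox_centroid(bbox: BBox) -> Point:
    """Centre point of the bounding box."""
    return ((bbox[0] + bbox[2]) / 2, (bbox[1] + bbox[3]) / 2)


def geometry_for_address_search(bbox: BBox,
                                max_area: float = MAX_AREA) -> Union[BBox, Point]:
    """Use the full extent for small features, the centroid for huge ones."""
    if bbox_area(bbox) > max_area:
        return bbox_centroid(bbox)
    return bbox
```

The trade-off is that a very large feature can then only be found via the areas containing its centroid, in exchange for avoiding thousands of candidate checks.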
Sarah Hoffmann
62bee4ed37 docs: add migration for search_name_* tables 2020-11-26 09:18:33 +01:00
Sarah Hoffmann
1f07d63dc5
Merge pull request #2075 from lonvia/filter-postcodes-from-location-area-large
Filter postcodes by search rank when adding to address list
2020-11-25 21:42:27 +01:00
Sarah Hoffmann
cc1af99dbd filter postcodes by search rank when adding to address list
The post codes are the last part that does not fit the new
address ranking scheme. In particular, the search rank is still
relevant for choosing if a postcode should be included into
the address terms. Filter out irrelevant postcodes in
getNearFeatures() already, to avoid having to check for
geometry relation.
2020-11-25 21:01:33 +01:00
Sarah Hoffmann
c5d98effc0
Merge pull request #2074 from lonvia/add-housenumber-to-unknown-places
Improve finding addresses that have their own search_name entry because of unknown addr:* parts
2020-11-25 16:57:09 +01:00
Sarah Hoffmann
b68b2ff6b8
Merge pull request #2073 from lonvia/multi-word-partial-terms-in-search-description
Improve handling of multi-word partials in SearchDescription
2020-11-25 16:57:00 +01:00
Sarah Hoffmann
57f0d55c2e make phpcs happy 2020-11-25 16:14:31 +01:00
Sarah Hoffmann
3cf763475f do not use artificial housenumbers as names
If they are artificial they cannot have a search_name entry.
2020-11-25 16:11:32 +01:00
Sarah Hoffmann
0f87da017f improve handling of multi-word partials in SearchDescription
Multi-word partial terms had an undue advantage over separate partial
terms because they only need to pay the penalty once. This changes
the behaviour by setting the penalty according to the number of
words in the token. This should get rid of search interpretations
with low chance of matching.

This also fixes handling of exact term matching. We now match against
all exact terms of the query, not just a couple of them collected
while building the interpretations.

Also adds a penalty to very short postcodes.
2020-11-25 12:07:04 +01:00
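The penalty change in 0f87da017f can be sketched as below. This is a Python illustration and not the project's PHP code: the function name and the per-word penalty of 1 are assumptions. The point taken from the commit message is that the penalty of a partial token scales with the number of words in it.

```python
def partial_penalty(token: str, penalty_per_word: int = 1) -> int:
    """Penalty for a partial token, proportional to its word count."""
    return penalty_per_word * len(token.split())
```

Under this rule the multi-word token `"main street"` costs the same as the two separate partials `"main"` and `"street"`, so it no longer has an advantage from paying the penalty only once.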
Sarah Hoffmann
22800d7d59 Search housenumbers with unknown address parts by housenumber term
House numbers need special handling because they may appear after
the street term. That means we cannot just use them as the main name
for searches where the address has its own search term entries.
Doing this right now, we are able to find '40, Main St, Town' but not
'Main St 40, Town'.

This switches to using the housenumber token as the name term instead.
House number tokens can get special handling when building the search
query that covers the case where they come after the street.

The main disadvantage is that this once more increases the number
of possible search interpretations, of which we already have too many.

no penalty for housenumber searches
2020-11-25 11:36:10 +01:00
Sarah Hoffmann
f21853ea9d
Merge pull request #2071 from lonvia/fix-more-ranks
Search rank 30 must always go with address rank 30
2020-11-24 21:45:30 +01:00
Sarah Hoffmann
1e76d668bd
Merge pull request #2070 from lonvia/unlisted-places-to-rank-25
Move unlisted places to address rank 25
2020-11-24 21:45:16 +01:00
Sarah Hoffmann
b4b50eef15 search rank 30 must always go with address rank 30 2020-11-24 17:57:28 +01:00
Sarah Hoffmann
a9ad390b9e move unlisted places to address rank 25
Unlisted places are derived from addr:place and as such are
still places not streets.
2020-11-24 17:54:00 +01:00
Sarah Hoffmann
2e9e961fff
Merge pull request #2068 from lonvia/fix-reverse-only
Do not create POI search terms in reverse-only mode
2020-11-24 08:22:48 +01:00
Sarah Hoffmann
13180989d9 Test --reverse-only with CI 2020-11-23 22:36:28 +01:00
Sarah Hoffmann
a4f1e40b72 do not create POI search terms on reverse-only
Fixes #2067.
2020-11-23 19:55:36 +01:00
Sarah Hoffmann
04d485c550
Merge pull request #2065 from rustycamper/patch-1
viewbox arguments are no longer accepted "in any order"
2020-11-23 09:55:29 +01:00
Pietro
a92bd1e2db
viewbox arguments are no longer accepted "in any order"
Order should be longitude, then latitude
2020-11-23 10:40:43 +02:00
Sarah Hoffmann
f89e71a861 make sure that admin levels in NL are kept in order 2020-11-19 09:44:02 +01:00
Hendrik Morée
dcc075b34b Admin levels 8 and 10 of the Netherlands are municipal / city 2020-11-18 11:30:24 +01:00
Sarah Hoffmann
49083c2597
Merge pull request #2058 from lonvia/split-address-words
Split addr:* tags into words before adding to the search index
2020-11-18 08:58:17 +01:00
Sarah Hoffmann
29785ba166
Merge pull request #2059 from lonvia/include-parent-name-for-unknown-places
POIs with unknown addr:place must add parent name to address
2020-11-18 08:58:03 +01:00
Sarah Hoffmann
ffb2c93ba3 POIs with unknown addr:place must add parent name to address
The previous behaviour was a left-over from a former version
where such POIs parented to the street. Now that they parent to
places, it should be included.
2020-11-17 19:44:43 +01:00
Sarah Hoffmann
30a6b6bdac split addr: tags into words before adding to the search index
Address parts are only matched by single partial words. If
the addr: names are not split, then multi-word names cannot
be found.
2020-11-17 18:03:33 +01:00
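A rough Python sketch of the idea behind 30a6b6bdac. The function name, the lowercasing, and splitting on whitespace are assumptions for illustration; the actual normalisation used by the project is not shown in this log. What the commit message states is that address parts are matched by single partial words, so multi-word `addr:*` values must be split before indexing.

```python
from typing import Dict, Set


def address_terms(tags: Dict[str, str]) -> Set[str]:
    """Collect single-word search terms from all addr:* tags."""
    terms: Set[str] = set()
    for key, value in tags.items():
        if key.startswith("addr:"):
            # Split multi-word names so each word can match on its own.
            terms.update(value.lower().split())
    return terms
```

For example, a tag `addr:city=New York` contributes the two terms `new` and `york` instead of one unsplit name that no single partial word could match.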
Sarah Hoffmann
cc345f531a
Merge pull request #2056 from lonvia/avoid-linking-postal-areas
Disallow linking for postcode areas
2020-11-17 11:15:56 +01:00
Sarah Hoffmann
9ede048769 disallow linking for postcode areas 2020-11-17 10:53:26 +01:00
Sarah Hoffmann
d23bf6e659
Merge pull request #2054 from lonvia/display-addr-terms
Merge places into address lists referred to by addr:* tags but not computed by Nominatim
2020-11-16 16:08:06 +01:00
Sarah Hoffmann
6b60f0ab03 use bool_or(ST_Intersects) instead of ST_Intersects(ST_Collect)
ST_Intersects segfaults on geometry collections for certain versions
of Postgis 3.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
aa9923bf07 fix typo 2020-11-16 15:28:01 +01:00
Sarah Hoffmann
9160cce6d8 remove unused columns in search_name_* and use right index
We only need the address rank these days, so get rid of
search rank. Also switch indexes to work on address rank.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
885dc0a8e1 more tests for absence of additional addressline entries 2020-11-16 15:28:01 +01:00
Sarah Hoffmann
7324431b12 get additional addresses for rank 30 objects
get_addressdata() now also checks if the place itself has entries
in the place_addressline table and merges them into the results.

Also restrict checking for address tag places to cases where the
name cannot be found in the parent's address search terms. Looking
up all address tags is just too slow.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
021f2bef4c get address terms from address tags for rank 30
For rank 30 objects add extra elements into the place_addressline
table.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
6260fef2e8 add test for placex from addr tags 2020-11-16 15:28:01 +01:00
Sarah Hoffmann
c7472662a6 lookup places for address tags for rank < 30
While previously the content of addr:* tags was only added
to the list of address search keywords, we now really look up
the matching place. This has the advantage that we pull in all
potential translations from the place, just like all the other
address terms that are looked up by neighbourhood search.

If no place can be found for a given name, the content of the
addr:* tag is still added to the search keywords as before.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
fecfe62fc6
Merge pull request #2055 from lonvia/fix-actions
Actions: update apt repo before installing software
2020-11-16 11:26:10 +01:00
Sarah Hoffmann
21b0430e46 actions: update apt repo before installing software 2020-11-16 10:14:38 +01:00
Sarah Hoffmann
66595c2d2b
Merge pull request #2046 from lonvia/less-parallel-ranking
Only index larger batches for rank 30
2020-11-06 09:39:07 +01:00