Commit Graph

99 Commits

Author SHA1 Message Date
Sarah Hoffmann
7324431b12 get additional addresses for rank 30 objects
get_addressdata() now also checks if the place itself has entries
in the place_addressline table and merges them into the results.

Also restrict checking for address tag places to cases where the
name cannot be found in the parent's address search terms. Looking
up all address tags is just too slow.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
021f2bef4c get address terms from address tags for rank 30
For rank 30 objects add extra elements into the place_addressline
table.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
c7472662a6 lookup places for address tags for rank < 30
While previously the content of addr:* tags was only added
to the list of address search keywords, we now really look up
the matching place. This has the advantage that we pull in all
potential translations from the place, just like all the other
address terms that are looked up by neighbourhood search.

If no place can be found for a given name, the content of the
addr:* tag is still added to the search keywords as before.
2020-11-16 15:28:01 +01:00
Sarah Hoffmann
fa574ae9fd use different area estimates for large countries 2020-11-02 14:21:30 +01:00
Sarah Hoffmann
0f5615b618 guess a base address level for address rank 0 objects
The guess is based on the area and mainly avoids odd
addresses for very large or small objects.
2020-11-02 11:42:10 +01:00
Sarah Hoffmann
95f83b90d2 minor fixes for geometry computation during boundary ranking
Go back to using centroid when determining if one admin level
is within another. There are cases where boundaries are slightly
misaligned due to mapping errors (not using the same ways in the
relations).

Only declare boundaries the same when they have the same wikidata
tag _and_ have exactly the same geometry. This works around tagging
errors with the wikidata tag, which happen because of automated
edits to the wikidata tag.
2020-10-28 10:49:26 +01:00
Sarah Hoffmann
7a16909219 detect and remove admin boundary duplicates
The Polish community maps admin boundaries that span multiple
levels by duplicating the boundary relations. Detect this situation
by looking out for matching wikidata tags. The higher ranked
duplicates are then thrown out from the address pool by setting
their address rank to 0.
2020-10-28 10:49:26 +01:00
Sarah Hoffmann
bf4d75458c add explicit bbox contains check
Now that the containment check uses ST_Relate, we need to add
a separate bbox contains check to ensure that Postgis does the
efficient check first. Note that we still cannot get rid of the
overlap(&&) check because then Postgis will use the wrong indexes.
2020-10-19 10:39:01 +02:00
Sarah Hoffmann
1064a9264e revert to && comparison for geometries
Postgis 3 picks the wrong index when using ~ or @.
2020-10-16 09:49:48 +02:00
Sarah Hoffmann
acfa7bec9c use computed centroid for location_area_large
The new address computation assumes that the centroid is inside
the area. Therefore we cannot use the centroid function. Use the
pre-computed centroid instead which has already been corrected to
be inside the area.
2020-10-15 17:30:52 +02:00
Sarah Hoffmann
62b94e838b correctly set from area column in place_addressline
This was always set to true, which raises the question of
whether it is still needed at all.
2020-10-15 12:06:53 +02:00
Sarah Hoffmann
5236e7a03e fix use of geometry operators
The @ operator means "is contained by" while ~ means "contains".
2020-10-15 12:06:18 +02:00
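The operator semantics behind commit 5236e7a03e can be illustrated with a minimal Python sketch. It models the bounding-box behaviour of the PostGIS operators ~, @ and && on plain rectangles; the BBox class and the function names are hypothetical and exist only for this illustration, they are not part of the code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def contains(a: BBox, b: BBox) -> bool:
    """Models 'a ~ b': the bounding box of a contains that of b."""
    return (a.xmin <= b.xmin and a.ymin <= b.ymin
            and a.xmax >= b.xmax and a.ymax >= b.ymax)


def contained_by(a: BBox, b: BBox) -> bool:
    """Models 'a @ b': the bounding box of a is contained by that of b."""
    return contains(b, a)


def overlaps(a: BBox, b: BBox) -> bool:
    """Models 'a && b': the two bounding boxes intersect."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


city = BBox(0, 0, 10, 10)
house = BBox(2, 2, 3, 3)
print(contains(city, house), contained_by(house, city), contains(house, city))
```

Swapping the two operators silently inverts the containment test, which is the kind of mistake this commit corrects.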
Sarah Hoffmann
7e9412a044 demote admin boundaries for place areas
Also demote the address rank of an admin boundary when there
is a place area of higher rank that completely contains the
area. This catches the case where city boundaries do not exactly
align with administrative units (see for example Moscow).
2020-10-14 11:33:47 +02:00
Sarah Hoffmann
e47c19beb9 exclude rank 25 when computing addresses of streets
Address rank 25 is used for squares which are address-wise on the
same level as streets.
2020-10-13 22:36:17 +02:00
Sarah Hoffmann
2fe3c654fc overhaul address computation
This is a complete rewrite of the selection of address parts to
be inserted into the place_addressline table.

The new algorithm selects for each rank:
* the boundary overlapping with the addressee and contained
  in the already selected boundaries of lower rank, or failing that
* the place node closest to the addressee that is contained in
  the already selected boundaries and in the influence radius
  of already selected place nodes of lower rank

Place nodes that are not contained in already selected boundaries
of lower rank are completely thrown away. All other candidates are
added as non-address parts.
2020-10-13 22:10:07 +02:00
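The selection rule described in commit 2fe3c654fc can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the actual implementation: boundaries are modelled as axis-aligned boxes, place nodes as points with an influence radius, and the addressee as a single point. The Candidate type and all function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate: a boundary (box set) or a place node (box None)."""
    name: str
    rank: int
    x: float
    y: float
    box: Optional[Box] = None
    radius: float = 0.0  # influence radius, only meaningful for place nodes


def in_box(box, x, y):
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]


def box_in_box(inner, outer):
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def select_address_parts(addressee, candidates):
    """Return (address parts, non-address parts), walking ranks in ascending order."""
    ax, ay = addressee
    boxes, nodes_sel = [], []      # already selected boundaries / place nodes
    address, non_address = [], []
    for rank in sorted({c.rank for c in candidates}):
        level = [c for c in candidates if c.rank == rank]
        # Boundaries covering the addressee and inside all selected boundaries.
        bounds = [c for c in level if c.box is not None
                  and in_box(c.box, ax, ay)
                  and all(box_in_box(c.box, b) for b in boxes)]
        # Place nodes inside all boundaries selected at lower ranks.
        nodes = [c for c in level if c.box is None
                 and all(in_box(b, c.x, c.y) for b in boxes)]
        chosen = None
        if bounds:
            chosen = bounds[0]
        else:
            # Must also lie in the influence radius of selected place nodes.
            near = [c for c in nodes
                    if all(math.dist((c.x, c.y), (n.x, n.y)) <= n.radius
                           for n in nodes_sel)]
            if near:
                chosen = min(near, key=lambda c: math.dist((c.x, c.y), (ax, ay)))
        for c in level:
            if c is chosen:
                continue
            if c.box is None and c not in nodes:
                continue  # node outside the selected boundaries: thrown away
            non_address.append(c)
        if chosen is not None:
            address.append(chosen)
            if chosen.box is not None:
                boxes.append(chosen.box)
            else:
                nodes_sel.append(chosen)
    return address, non_address


candidates = [
    Candidate("State", 8, 50, 50, box=(0, 0, 100, 100)),
    Candidate("City", 16, 25, 25, box=(10, 10, 40, 40)),
    Candidate("Town", 16, 90, 90, radius=5),
    Candidate("Near", 20, 20, 20, radius=3),
    Candidate("Far", 20, 60, 60, radius=3),
]
address, non_address = select_address_parts((21, 21), candidates)
print([c.name for c in address], [c.name for c in non_address])
```

In this example "Far" lies outside the selected city boundary and is dropped entirely, while "Town" stays inside the state boundary and is kept as a non-address part.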
Sarah Hoffmann
5ec48c66cb move ordering out of getNearFeatures
The two places where the function is called have different ordering
requirements.
2020-10-13 15:24:54 +02:00
Sarah Hoffmann
887ae7fcab increase radius of influence around city nodes
The current radius does not cover cities with more than a
million inhabitants well.
2020-10-12 14:17:37 +02:00
Sarah Hoffmann
ff47f6f65d when linking always check against original address rank 2020-10-11 12:29:49 +02:00
Sarah Hoffmann
b04463bb2d demote place nodes in admin areas
If a place node of city rank and above finds itself in an
administrative boundary of the same address rank, then
increase the address rank by 2. This catches the rather
frequent case where city suburbs are tagged for historical
reasons as towns or villages.
2020-10-11 12:04:53 +02:00
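The demotion rule from commit b04463bb2d might look like the following Python sketch. It assumes that "city rank and above" means an address rank numerically at or below a city rank of 16; that threshold and the function name are assumptions made for illustration and are not taken from the commit.

```python
CITY_RANK = 16  # assumed address rank of a city; not stated in the commit


def demote_place_node(node_rank, boundary_ranks, city_rank=CITY_RANK):
    """Return the adjusted address rank of a place node.

    boundary_ranks holds the address ranks of the administrative
    boundaries that contain the node.
    """
    if node_rank <= city_rank and node_rank in boundary_ranks:
        return node_rank + 2  # demote below the enclosing boundary
    return node_rank


print(demote_place_node(16, {8, 16}))
```

A node tagged as a town that sits inside a boundary of the same address rank then sorts below that boundary instead of competing with it.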
Sarah Hoffmann
f8694da3c9 Remove more rank_search usage from address computation
Fixes #1904.
2020-09-25 17:50:36 +02:00
Sarah Hoffmann
6625e93be6
Merge pull request #1975 from lonvia/simplify-parent-assignment-for-unlisted-places
Use closest containing place area for parent of unlisted addr:place
2020-09-23 19:10:42 +02:00
Sarah Hoffmann
d9325dc11a use rank_address when invalidating containing objects
Only rank_address is now relevant for determining if a place
could be part of an address.
2020-09-23 17:44:31 +02:00
Sarah Hoffmann
d3ca9dd3f7 remove ST_Covers check when also testing for ST_Intersects
Using both is slightly problematic because they have different
ways to use the index. Newer versions of Postgis exhibit a
query planner issue when both functions appear together.
As ST_Intersects includes ST_Covers, simply remove the latter.
2020-09-23 17:44:31 +02:00
Sarah Hoffmann
e552f6bce5 use closest containing place for unlisted addr:place
We can't use getNearFeatures() to determine the parent of a
place with an unlisted addr:place because this function
returns place nodes that are potentially outside the area
of interest. Doing the complete address computation is too
expensive, so simply use the area with the largest rank that
contains the feature instead.
2020-09-23 17:33:42 +02:00
Sarah Hoffmann
c84e7e72f1 add unknown addr:place to address output
When a POI has no addr:street but an addr:place that is not
contained in the name list of the parent place, then remember
this situation and merge the content of addr:place into the
address output.

We don't need to care about translations in this case because
it is obvious that no object with translations exists if the
parent isn't the object named in addr:place.
2020-09-23 11:55:18 +02:00
Sarah Hoffmann
f2ff351da4
Merge pull request #1971 from lonvia/drop-support-for-isin
Drop support for is_in tag
2020-09-23 09:20:35 +02:00
Sarah Hoffmann
c5c242d193
Merge pull request #1972 from lonvia/exclude-unnamed-highway-areas
Exclude unnamed highway areas
2020-09-23 09:20:16 +02:00
Sarah Hoffmann
72193a1c23 exclude unnamed highway areas
These are used to mark large paved areas. Sometimes they exist
together with named regular streets. In such cases the unnamed
area may overshadow the actual street when computing the address
parent. As unnamed highways are not very useful anyway, we
simply remove them from the database.
2020-09-22 21:42:13 +02:00
Sarah Hoffmann
d04e87fb80 drop support for is_in tag 2020-09-22 20:26:36 +02:00
Sarah Hoffmann
a8dfbcef44 always bind addr:place to place instead of street
If an addr:place is given but no addr:street tag, then bind
the rank 30 object always to a <=25 object, even when there
is none found with the same name.
2020-09-21 10:15:14 +02:00
Sarah Hoffmann
caea14d035 merge addr tags into search_name table
When a place of rank 30 has addr tags that are not covered by the
search terms of the parent, add a separate entry for the POI in
the search_name table that includes the addr tags. We can only
do that with named places. For POIs without a name the housenumber
is used as name. If that is not available either, searching still
won't work.
2020-09-21 10:15:14 +02:00
Sarah Hoffmann
731c620e31 ignore postcodes with colons
Colons are used as a delimiter in tiger:left and tiger:right tags
when multiple postcodes are given. Ignore those. This was already
done in the postcode update script. This change just makes the
two places where postcodes are added consistent.
2020-09-19 17:23:40 +02:00
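The filter described in commit 731c620e31 amounts to a one-line check. A minimal Python sketch, with a hypothetical function name, could be:

```python
def usable_postcode(value):
    """Return the postcode, or None when it must be ignored.

    A colon indicates a delimited list of several postcodes,
    which cannot be treated as a single postcode.
    """
    if value is None or ":" in value:
        return None
    return value


print(usable_postcode("12345"), usable_postcode("12345:12346"))
```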
Sarah Hoffmann
b219374d36 remove special casing for rank 25 postcodes
They can be computed like any other place.
2020-09-18 16:18:02 +02:00
Sarah Hoffmann
4c9cfe2532 remove postcodes entirely from indexing
place=postcode places are artificial places that collect addr:postcode
points for aggregation. They should neither show up in the address nor
be searchable. That means that there is no need to index them at all.
Only let boundary=postal_code through which define correct areas for
postcodes.
2020-09-18 15:09:35 +02:00
Sarah Hoffmann
fe250d3ee8
Merge pull request #1961 from lonvia/set-place-type-for-result-in-address
Use place type for result object in address parts
2020-09-17 21:23:40 +02:00
Sarah Hoffmann
fe8566928e use place type for result object in address parts
Boundaries should derive the address part type from the
linked place if possible. This was already implemented
for the address objects but not for the address information
from the address itself.

Fixes #1949.
2020-09-17 18:17:01 +02:00
Sarah Hoffmann
3600709116 make sure that all postcodes have an entry in word
It may happen that two different postcodes normalize to exactly
the same token. In that case we still need two different entries
in the word table. Token lookup will then make sure that the correct
one is chosen.

Fixes #1953.
2020-09-17 17:11:22 +02:00
Sarah Hoffmann
07430b0194 tweak size of large POIs
Further reduce the size above which POIs are no longer bound
to streets but only to larger objects. The reference point for the
largest POI that should still be bound to a street is JFK airport.
2020-09-01 18:00:40 +02:00
Sarah Hoffmann
fae02fab00 address rank adjustment for addressable boundaries only
Only administrative boundaries with an address rank need
to be adjusted. Otherwise just handle them like any other
object.
2020-09-01 17:59:26 +02:00
Sarah Hoffmann
6e4b7eb966 do not block deletion of large highway areas
Deletion of areas should only be blocked for addressable features.
Streets and POIs do not have a large impact on updates.
2020-08-28 09:49:21 +02:00
Sarah Hoffmann
559fe513fa increase splitting for large geometries
When computing the address parts for a geometry, we need to do
an ST_Relate lookup in the location_area_large_* tables. This is
potentially very expensive for geometries with many vertices.
There is already a function for splitting large areas to reduce the
impact. This commit reduces the minimum area of a split, effectively
increasing the number of splits.

The effect on database size is minimal (around 3% increase), while
the indexing speed for streets increases by a good 60%.
2020-08-20 16:37:33 +02:00
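The effect of the threshold changed in commit 559fe513fa can be shown with a small Python sketch. It is not the actual splitting function: it splits a plain rectangle in half along its longer side until each piece is at or below max_area, where split_box and max_area are hypothetical names and max_area must be positive.

```python
def split_box(box, max_area):
    """Recursively halve a box (xmin, ymin, xmax, ymax) until pieces are small enough."""
    xmin, ymin, xmax, ymax = box
    width, height = xmax - xmin, ymax - ymin
    if width * height <= max_area:
        return [box]
    if width >= height:
        # Split along the x axis.
        mid = (xmin + xmax) / 2
        halves = [(xmin, ymin, mid, ymax), (mid, ymin, xmax, ymax)]
    else:
        # Split along the y axis.
        mid = (ymin + ymax) / 2
        halves = [(xmin, ymin, xmax, mid), (xmin, mid, xmax, ymax)]
    return [piece for half in halves for piece in split_box(half, max_area)]


area = (0.0, 0.0, 4.0, 4.0)
print(len(split_box(area, 16)), len(split_box(area, 4)), len(split_box(area, 1)))
```

Lowering the threshold produces more but smaller pieces, so each lookup touches a geometry with fewer vertices at the cost of more rows.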
Sarah Hoffmann
d6ff7475f1 make sure that addr:* tags can always be searched for
Always add contents of addr:* tags into address part of the search
table, even when there is no corresponding other name. This keeps
search tolerant to the kind of tagging where parts show up in the
address that have no corresponding object in the database or where
it is only an unaddressable object.
2020-08-19 11:44:10 +02:00
Sarah Hoffmann
1529666232 use only centroid to get parent admin boundaries
Using the full geometry is far too expensive.
2020-08-18 15:17:09 +02:00
Sarah Hoffmann
e21a707166 remove linked_place from extratags when updating
Before updating an admin boundary we need to make sure that any
artificially generated 'linked_place' entry is removed from the
extratags column. This ensures that the place designation does
not linger when a linked place disappears and that it is updated
when the linking changes.
2020-08-13 16:59:11 +02:00
Sarah Hoffmann
06aa0f0b76 use address rank for address forming when available 2020-08-12 22:22:24 +02:00
Sarah Hoffmann
fb8bb30144 boundary address ranks must not go above 25
Fixes #1914.
2020-08-12 22:22:24 +02:00
Sarah Hoffmann
5b9f61cff8 also take place tags into account for address rank
An admin boundary might have a place tag but no matching place node.
We still should use the place value as indicator for the address
rank in this case.
2020-08-12 22:22:24 +02:00
Sarah Hoffmann
83b2b4970d Make SQL debug statements execute again
There were some old variable names used that are no longer valid.
Either fix them or remove the statement completely.

Fixes #1907.
2020-08-06 09:29:19 +02:00
Sarah Hoffmann
4e1f245331 make house number reappear in display name on named POIs
After 6cc6cf950c, names and house numbers
of POIs got mingled into a single item when creating the display name.
Add the house number as extra information without place_id to avoid
later mangling.
2020-07-30 23:39:55 +02:00
Sarah Hoffmann
6a3eb7edf2 preserve admin level hierarchy between admin boundaries
When the address rank of an admin boundary is changed because
of an attached place type, it may happen that the admin_level
hierarchy gets inverted. Avoid that by adjusting the address
rank if an inversion is detected.
2020-07-28 22:15:25 +02:00