Update names in the coutry_names table on the fly from incomming
OSM country data. Adding a small sanity check that the country
must be an OSM relation and within the area where we expect the
country to be.
If a place node is already linked against a boundary, it should not
be used for linking again. It is usually a sign of a mapping error,
when there are multiple boundary candidates. This change just avoids
inconsistent data in the database, it does not guarantee that the
linking is against the more correct boundary.
Rank 30 objects usually use the address parts of their parent.
When the parent has address parts that are areas but not marked
as isaddress, then the parent might go through multiple administrative
areas. In that case recheck if the right area has been choosen
for the object in question instead of relying on isaddress.
Note that we really only have to do the recomputation in the
case of 'isarea = True and isaddress = False' which hopefully
keeps the number of additional geometric operations we have to do
to a minimum.
There is one more special case to be taken into account here: a
street may go through two administrative areas and a house along
that street is placed in one of the area while the addr:* tags
says it belongs to the other. In that case we must not switch
the isaddress to the one it is situated. To avoid that recheck
the address names against the name of the ara. That is not perfect
but should cover most cases.
Fixes#328.
It would be nice to always compute addresses for rank 0 objects
over the complete geometry, so that they can be found via all
the admin boundaries that they intersect. However, there are a
couple of extramely large boundaries in OSM (like timezones)
where this results in thousands of possible address candidates
that need to be checked. Fall back to getting the address of the
centroid for them.
The post codes are the last part that does not fit the new
address ranking scheme. In particular, the search rank is still
relevant for choosing if a postcode should be included into
the address terms. Filter out irrelevant postcodes in
getNearFeatures() already, to avoid having to check for
geometry relation.
House numbers need special handling because they may appear after
the street term. That means we canot just use them as the main name
for searches where the address has its own search term entries.
Doing this right now, we are able to find '40, Main St, Town' but not
'Main St 40, Town'.
This switches to using the housenumber token as the name term instead.
House number tokens can get special handling when building the search
query that covers the case where they come after the street.
The main disadvantage is that this once more increases the numbers
of possible search interpretation of which we have already too many.
no penalty for housenumber searches
The previous behaviour was a left-over from a former version
where such POIs parented to the street. Now that they parent to
places, it should be included.
get_addressdata() now also checks if the place itself has entries
in the place_addressline table and merges them into the results.
Also restrict checking for address tag places to cases where the
name cannot be found in the parent's address search terms. Looking
up all address tags is just too slow.
While previously the content of addr:* tags was only added
to the list of address search keywords, we now really look up
the matching place. This has the advantage that we pull in all
potential translations from the place, just like all the other
address terms that are looked up by neighbourhood search.
If no place can be found for a given name, the content of the
addr:* tag is still added to the search keywords as before.
Go back to using centroid when determining if one admin level
is within another. There are cases where boundaries are slightly
misaligned due to mapping errors (not using the same ways in the
relations).
Only declare boundaries the same when they have the same wikidata
tag _and_ have exactly the same geometry. This works around tagging
errors with the wikidata tag, which happen because of automated
edits to the wikidata tag.
The Polish community maps admin boundaries that span multiple
levels by duplicating the boundary relations. Detect this situation
by looking out for matching wikidata tags. The higher ranked
duplicates are then thrown out from the address pool by setting
their address rank to 0.
Now that the containment check uses ST_Relate, we need to add
a separate bbox contains check to ensure that Postgis does the
efficient check first. Note that we still cannot get rid of the
overlap(&&) check because then Postgis will use the wrong indexes.
The new address computation assumes that the centroid is inside
the area. Therefore we cannot use the centroid function. Use the
pre-computed centroid instead which has already been corrected to
be inside the area.
Also demote the address rank of an admin boundary when there
is a place area of higher rank that completely contains the
area. This catches the case where city boundaries do not exactly
align with administrative units (see for example Moscow).
This is a complete rewrite of the selection of address parts to
be inserted into the place_addressline table.
The new algorithm selects for each rank:
* the boundary overlapping with the addressee and contained
in the already selected boundaries of lower rank, or failing that
* the place node closest to the addressee that is contained in
the already selected boundaries and in the influence radius
of already selected place nodes of lower rank
Place nodes that are not contained in already selected boundaries
of lower rank are completely thrown away. All other candidates are
added as non-address parts.
If a place node of city rank and above finds itself in an
administrative boundary of the same address rank, then
increase the address rank by 2. This catches the rather
frequent case where city suburbs are tagged for historical
reasons as towns or villages.
The latest version of osm2pgsql no longer creates indexes on
the members of planet_osm_rels. So we do that ourselves.
Given that we only need to access associated street relations,
the index can become quite a bit smaller.
Using both is slightly problematic because they have different
ways to use the index. Newer versions of Postgis exhibit a
query planner issue when both functions appear together.
As ST_Intersects includes ST_Covers, simply remove the latter.
We can't use getNearFeatures() to determine the parent of a
place with an unlisted addr:place because this function
returns place nodes that are potentially outside the area
of interest. Doing the complete address computation is too
expensive, so simply use the area with the largest rank that
contains the feature instead.
When a POI has no addr:street but an addr:place that is not
contained in the name list of the parent place, then remember
this situation and merge the content of addr:place into the
address output.
We don't need to care about translations in this case because
it is obvious that no object with translations exists if the
parent isn't the object named in addr:place.