Polygons with rank_address = 0 are only used in search and (rarely)
for reverse lookup. Geometries do not need to be precise for that
because topology does not matter. OSM has some very large polygons
of natural features with sizes of more than 10MB. Simplify these
polygons to keep the database and indexes smaller.
So far the SQL logic used the information from the address field
to determine if an address is attached to a street or place.
This changes the logic to use the information provided in the
token_info. This allows sanitizers to enforce a certain parenting
without changing the visible address information.
When an associatedStreet relation has multiple street members,
always take the closest one. Avoid geometry operations in
the frequent case that there is only one street.
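
The selection rule can be sketched in Python roughly as follows. This is
an illustration only, not the SQL used in the import functions: the name
pick_street and the representation of a street as an (id, (x, y)) pair
with a single representative point are assumptions made for the example.

```python
from math import hypot


def pick_street(member_streets, point):
    """Return the id of the street member to use as parent, or None.

    member_streets: list of (street_id, (x, y)) pairs (hypothetical layout).
    point: (x, y) position of the address object.
    """
    if not member_streets:
        return None
    if len(member_streets) == 1:
        # Frequent case: only one street, no distance computation needed.
        return member_streets[0][0]

    # Several street members: take the one closest to the point.
    def distance(street):
        sx, sy = street[1]
        return hypot(sx - point[0], sy - point[1])

    return min(member_streets, key=distance)[0]
```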
The values in the raster are already normalized between 0 and 2**16,
so a simple conversion to [0, 1] will do.
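
A minimal sketch of such a conversion, assuming the upper bound is
exactly 2**16 as stated above. The function name and the choice of
divisor are assumptions for illustration; the actual SQL may differ.

```python
RASTER_MAX = 2**16  # upper bound of the raster values as described above


def raster_to_importance(value):
    """Scale a raster value from [0, 2**16] linearly to [0, 1]."""
    return value / RASTER_MAX
```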
Check for existence of the secondary_importance table statically when
creating the SQL function. For that to work, importance tables need
to be created before the functions.
Adds partial indexes for all geometry queries used during import.
A full index is not necessary anymore at that point. Still create
the index afterwards for use in queries.
Also adds documentation for all indexes on where they are used.
When a boundary or place changes its address rank, all places where
it participates in the address potentially need to be reindexed.
Also use the computed rank when testing place nodes against
boundaries. Boundaries are computed earlier.
Fixes #2794.
When shifting address ranks, the evaluation is always done against
unshifted address ranks on import because the objects we compare against
have not been indexed yet. This changes for updates when the objects have
been touched in the meantime. To ensure consistent behaviour across
imports and updates, always use the unshifted address ranks.
Resolves a couple of situations where a mixed use of place areas and
administrative boundaries would result in a hierarchy that did not
properly respect the contains relation.
When taking over the address rank from a linked place, it needs
to be the originally computed rank, not the one that might have
been adjusted in the meantime. The adjustment was made under the
assumption that the node is not linked.
When moving the finding of linked places to the precomputation stage,
it was also moved before the statement where the linked_place_id was
removed from the linkee. The result was that the current linkee was
excluded when looking for a linked place on updates because it was
still linked to the boundary to be updated.
Fixed by allowing the lookup to either keep the existing linkage or
change to an unlinked place.
This is needed for pedestrian areas mapped as multipolygons
and consequently as relations. The lookup in placex guarantees
that the referenced OSM object is indeed a street.
Fixes #2669.
The inherited housenumber is needed for display output. We can't
take the one from the housenumber field because it is already
normalized. Remove the inherited address only when reindexing.
Fixes #2683.
Instead of computing the distance to the centroid of the area
compute the distance of the area to the centroid of the feature.
This means we give preference to the area that covers the centroid.
It's still a heuristic, but one that is a bit less random.
This keeps the names traceable and ensures that all names are searchable
when they differ. Do not keep names when they are exactly the same
to save some space. Linked names are cleaned out before relinking.
An expression of the form 'SELECT (func()).*' will be expanded
by PostgreSQL _before_ execution with the result that the function
will be called as many times as there are fields in the record.
This is not what we want. The function call needs to go into
the FROM clause instead.
Nodes on an interpolation now only get the address tags of
interpolations and then compute their own parent from that. They no
longer inherit the parent directly.
Use the same update mechanism as for updates on the interpolations
themselves. Updates must solely happen in place_insert as this is
the place where actual changes of the data happen.
Adds class, type, country and rank to the exported information
and removes the rather odd hack for countries. Whether a place
represents a country boundary can now be computed by the tokenizer.
Instead of requesting the match tokens from the tokenizer
when looking for parent streets/places and address parts,
hand in the saved tokens and ask if they match. This gives
the tokenizer more freedom to decide how name matching
should be done.
Linked places may bring in extra names. These names need to be
processed by the tokenizer. That means that the linking needs
to be done before the data is handed to the tokenizer. Move finding
the linked place into the preparation stage and update the name
fields. Everything else is still done in the indexing stage.
When guessing postcodes from the area, only postcodes within
that area are accepted. For POIs that is usually not what we
want as the postcode would have to be within a house for
example.
Fixes #2301.
Normalization and token computation are now done in the tokenizer.
The tokenizer keeps a cache of the hundred most used house numbers
to keep the number of calls to the database low.
Indexing is now split into three parts: first a preparation step
that collects the necessary information from the database and
returns it to Python. In a second step the data is transformed
within Python as necessary and then returned to the database
through the usual UPDATE which now not only sets the indexed_status
but also other fields. The third step comprises the address
computation which is still done inside the update trigger in
the database.
The second processing step doesn't do anything useful yet.
Instead of normalising the names simply compare them in lower
case. This removes the dependency on the tokenizer for
linking boundaries and nodes. When looking up the linked places
by place type also allow that one name is simply contained in the
other. This catches the frequent case where one of the names has
an addendum (e.g. Newport vs. City of Newport).
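
The two comparisons might look like this in Python. This is a sketch
under the assumption that plain lower-casing is the only normalisation
applied; the function names are made up for the example and the real
logic lives in SQL.

```python
def names_equal(name_a, name_b):
    """Strict comparison: names are the same when lower-cased."""
    return name_a.lower() == name_b.lower()


def names_contained(name_a, name_b):
    """Loose comparison for lookups by place type: one name may be
    contained in the other. Empty names never match."""
    a, b = name_a.lower(), name_b.lower()
    if not a or not b:
        return False
    return a in b or b in a
```

With these definitions "Newport" and "City of Newport" fail the strict
comparison but pass the loose one.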
Drops the special index for the name lookup and instead relies
on a slightly extended version of the geometry index used for
reverse lookup. Saves around 100MB on a planet.