With an explicit cast to char(1) on the osm_type column,
postgres won't use the index in some cases.
Observed only on postgres 9.5 from the original Postgres
repositories.
Fixes #741.
Pyosmium comes with convenient functions for finding the
right state and does not require external files for
remembering the state. Updates can now conveniently be
set up by simply running ./utils/update.php --init-updates
and state is kept directly in the import_status table.
This change requires an update in the database schema.
Run the following to update:
ALTER TABLE import_status ADD COLUMN sequence_id integer;
ALTER TABLE import_status ADD COLUMN indexed boolean;
ALTER TABLE import_osmosis_log ADD COLUMN batchseq integer;
Rank 30 has some very large geometries (peninsulas, time zones,
etc.) for which a near feature search for the full geometry is
too expensive, so do the search on the centroid only.
Derived columns are not needed because parent information is
always computed from scratch. So the columns are just duplicate
information.
Also get address information on nodes from address columns. The
other columns are not necessarily reliable when the node has not
been indexed yet.
* ERROR: sequence "seq_postcodes" does not exist
* ERROR: table "import_polygon_error" does not exist
* ERROR: table "import_polygon_delete" does not exist
* ERROR: sequence "file" does not exist
When roads cross boundaries, both administrative entities should
be added to the address list, so that both entities can be used
for searching. Also allows in a second step to better sort out
addresses of POIs on such roads.
Fixes #121.
The interpolation computation needs information from the osm2pgsql
slim tables which may not be available when the data is inserted.
Insertion now only adds a line with basic address information to
location_property_osmline. The line is then split during the
indexing, leading to more lines (which are complete in that case)
being inserted.
Fixes #598.
If a country boundary has a country_code that is unknown to
Nominatim, it would delete all names because the coalescing
with country_name would not yield any result.
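The failure mode can be sketched in Python (the real code is PL/pgSQL).
The function name, the dict-based lookup table and the merge order are
assumptions made for illustration only: when the lookup yields nothing,
the names coming from OSM are kept instead of being replaced by NULL.

```python
def merge_country_names(place_names, country_code, country_name_table):
    """Return the names to store for a country boundary.

    Hypothetical helper: country_name_table stands in for the
    country_name table, keyed by country_code.
    """
    known = country_name_table.get(country_code)
    if known is None:
        # Unknown country_code: keep the OSM names rather than wiping them.
        return dict(place_names)
    # Known country: start from the stored names, let OSM names override
    # (the precedence chosen here is an assumption).
    merged = dict(known)
    merged.update(place_names)
    return merged


table = {"de": {"name": "Deutschland", "name:en": "Germany"}}
unknown = merge_country_names({"name": "Atlantis"}, "xx", table)
known = merge_country_names({"name": "Germany (OSM)"}, "de", table)
```

With an unknown code such as "xx" the input names come back unchanged.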
In placex_update we stop the indexing at the first parent if the
rank_search is above 27. We should do the same check in
get_addressdata, because places with a rank_address != 30 and a
rank_search > 27 will have only one parent.
https://github.com/twain47/Nominatim/issues/534
Interpolation lines may be missing in osmline when the interpolation
is broken, so we cannot conclude that a way is not in place, just
because there are no entries in location_property_osmline.
It is perfectly valid that interpolated addresses refer to
something else than a street.
Also gets rid of the maximum interpolation size. As we don't
expand, arbitrary sizes are fine.
Introduces two new settings CONST_Use_US_Tiger_Data and
CONST_Use_Aux_Location_data, which are disabled by default.
When false the corresponding tables are ignored in queries
and updates.
Aux and tiger tables are no longer created by default. This
has to be done by the corresponding import scripts. The former
aux table creation can be found in sql/aux_tables.sql for
reference.
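A rough Python sketch of how the two settings gate the optional tables.
The function name and the table name location_property_aux are
assumptions for illustration; only location_property_tiger and the two
setting names are taken from the text above.

```python
def housenumber_tables(use_us_tiger_data=False, use_aux_location_data=False):
    """List the tables a housenumber lookup would consult.

    Both flags default to False, mirroring the disabled-by-default
    settings CONST_Use_US_Tiger_Data and CONST_Use_Aux_Location_data.
    """
    tables = ["placex"]
    if use_us_tiger_data:
        tables.append("location_property_tiger")
    if use_aux_location_data:
        # Table name assumed for this sketch.
        tables.append("location_property_aux")
    return tables


default_tables = housenumber_tables()
tiger_tables = housenumber_tables(use_us_tiger_data=True)
```

With both flags off, the optional tables are never referenced, so they
do not need to exist.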
- remove query_log table, keeping only new_query_log
- drop unused import_npi_log table
- disable DB logging by default
- use file logging structure from osm.org
Get the version from the database where necessary or simply
probe for existence of features. Fake hstore_to_json when
necessary.
Bumps the minimum required versions for postgres to 9.1 and
for postgis to 2.0.
When linked the place may not be in the search index,
so it must be reindexed when being unlinked. The status
change will only have an effect during the subsequent
update, so change tests to that effect.
Avoids the occasional rounding problem which might occur when splitting
a line anywhere but on a support point, see postgis doc for ST_Split.
Fixes #253
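Why splitting on a support point is safe, as a minimal Python sketch
(not the PostGIS code; the function name is hypothetical): cutting at an
existing vertex only copies coordinates, so no new point has to be
computed and nothing can be rounded.

```python
def split_at_support_point(line, index):
    """Split a polyline [(x, y), ...] at an interior vertex.

    Both halves share the vertex at `index` exactly, because its
    coordinates are reused rather than recomputed.
    """
    if not 0 < index < len(line) - 1:
        raise ValueError("index must refer to an interior vertex")
    return line[: index + 1], line[index:]


way = [(0.1, 0.2), (0.3, 0.7), (1.1, 0.9), (1.4, 1.3)]
first, second = split_at_support_point(way, 2)
```

Here first ends with (1.1, 0.9) and second starts with the very same
coordinates.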
Mainly there to avoid having many duplicated postcode entries
in place_addressline from nodes which have tags where only
addr:postcode is recognized by Nominatim (e.g. fire hydrants).
This allows address interpolations to work correctly when flatnode storage
is used for node coordinates.
To fix interpolations in an existing database, follow these steps:
* invalidate all interpolations (in psql):
`UPDATE placex SET indexed_status=2 WHERE rank_search = 28`
* disable updates:
./utils/setup.php --create-functions --create-partition-functions
* reindex the whole lot:
./utils/update.php --index --index-instances <number of your cpus>
* enable updates again:
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
Avoids to a certain extent propagation of misassignment of
partitions when a country is expanded so far into another
that the centroid ends up in this other country.
Also added a sanity check to ensure that accidental removal of admin_level
tags on large areas doesn't cause huge reindexing load. That can be disabled
by setting CONST_Limit_Reindexing to false.
Name of function was changed in postgis 2.1 and now prints ugly
deprecation warnings. For older versions of postgis, function
will be renamed to the new name during the setup of the DB.
To update existing databases with postgis < 2.1 run:
ALTER FUNCTION st_line_interpolate_point(geometry, double precision) RENAME TO ST_LineInterpolatePoint
and then reinstall the SQL functions:
./utils/setup.php --create-functions --enable-diff-updates --create-partition-functions
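Whether the rename is needed depends only on the postgis version. A
small Python sketch of that check, assuming a version string of the form
"major.minor[.patch]" (the function name is hypothetical and not part of
the setup scripts):

```python
def needs_function_rename(postgis_version):
    """True if st_line_interpolate_point must be renamed (postgis < 2.1)."""
    major, minor = (int(part) for part in postgis_version.split(".")[:2])
    # Compare as integer tuples so that "2.10" is not treated as older than "2.9".
    return (major, minor) < (2, 1)


old = needs_function_rename("2.0.3")
new = needs_function_rename("2.1.0")
```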
This might happen for nameless landuse/natural objects that are added to place
during initial import but then dropped when being copied to placex.
If they later receive a name, thus becoming valid, then place_insert should
delete the orphan object in place and reinsert it. If they are large enough,
the place_delete trigger prevents them from being removed. The additional
update fools the delete trigger.
Removes 'trigram' and 'location' from word.
Removes 'address', 'importance' and 'country_code' from search_name_*.
Use full geometry in centroid column of search_name_*.
Requires migration of existing tables. For more info see pull request
https://github.com/twain47/Nominatim/pull/45
Country is already covered by the country_name entries in the
word table, so removing the country from the address vector will
not change results but reduce the size of search_name significantly.
Patch in names from OSM into the word table
to make sure we have complete coverage. Note that bad entries
still need to be removed by hand.
Empty relations may indeed appear, if the members of a relation
have been deleted but the tags have been retained. That is
detected as accidental error and the old geometry is retained
in placex while the slim tables contain the new version without members.
Adds word counts from a full planet to the word table. There is a
new configuration option CONST_Max_Word_Frequency which allows the
word count to be taken into account: the value that was set on import
is used to determine if a word is added to the search_name table.
The value during runtime determines if a single term should be
used for partial search or simply be ignored.
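The two uses of the threshold, sketched in Python. The function names,
the example values and the strict "less than" comparison are assumptions
for illustration; the text above only states that the import-time value
and the runtime value are applied at different stages.

```python
def add_to_search_name(word_count, max_freq_at_import):
    """Import time: only sufficiently rare words go into search_name."""
    return word_count < max_freq_at_import


def use_for_partial_search(word_count, max_freq_at_runtime):
    """Query time: a too-frequent single term is ignored for partial search."""
    return word_count < max_freq_at_runtime


# Hypothetical values; the runtime setting may differ from the import one.
IMPORT_MAX = 50000
RUNTIME_MAX = 20000

rare = (add_to_search_name(120, IMPORT_MAX), use_for_partial_search(120, RUNTIME_MAX))
common = (add_to_search_name(30000, IMPORT_MAX), use_for_partial_search(30000, RUNTIME_MAX))
```

A word with a count of 30000 would be stored at import time in this
example, yet skipped as a partial term at runtime.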
Import TIGER data into a temporary table first that later replaces
the current location_property_tiger table. This way index creation
on the table can be delayed until after the import which should
speed up the import and result in significantly smaller indexes.
Also removed index on parent_place_id as it is covered by
idx_location_property_tiger_housenumber_parent_place_id.
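The load-then-swap pattern, as a runnable sketch using Python's sqlite3
module in place of postgres. The staging table name, the reduced column
list and the index columns are assumptions for illustration; only the
table name location_property_tiger and the index name come from the
text above.

```python
import sqlite3


def import_tiger(conn, rows):
    """Bulk-load rows into a staging table, then swap it in and index it."""
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS location_property_tiger_import")
    cur.execute(
        "CREATE TABLE location_property_tiger_import "
        "(place_id INTEGER, parent_place_id INTEGER, housenumber INTEGER)"
    )
    # No index exists yet, so the bulk insert does not pay for index upkeep.
    cur.executemany(
        "INSERT INTO location_property_tiger_import VALUES (?, ?, ?)", rows
    )
    # Replace the current table (and its indexes) with the freshly loaded one.
    cur.execute("DROP TABLE IF EXISTS location_property_tiger")
    cur.execute(
        "ALTER TABLE location_property_tiger_import "
        "RENAME TO location_property_tiger"
    )
    # Build the index once, on the complete data set.
    cur.execute(
        "CREATE INDEX idx_location_property_tiger_housenumber_parent_place_id "
        "ON location_property_tiger (parent_place_id, housenumber)"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
import_tiger(conn, [(1, 10, 12)])
import_tiger(conn, [(2, 20, 4), (3, 20, 6)])
row_count = conn.execute("SELECT count(*) FROM location_property_tiger").fetchone()[0]
```

Running the import a second time replaces the table wholesale, so
row_count reflects only the latest load.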
Slightly changes the logic which decides if a guessed place
(i.e. a place node) is included in an address: it will be
part of the address only if it is inside the next lower
available boundary. This fixes problematic cases where
neighbouring entities have additional admin levels.
This solves a bug with updating large invalid geometries. These
geometries have an entry in place but not in placex. Thus, place_insert
tries to delete the place entry and reinsert it on update. Deletion would
fail because self-intersecting polygons still have an area and large
areas are not deleted.
function changes:
-----------------
Move to ST_PointOnSurface from ST_Centroid in various places to avoid looking up a point outside the polygon
Move to ST_Covers from ST_Contains to include points on admin boundaries
Re-order preference for get_country_code now that our data is better. country_osm_grid is now the preferred source.
Fix code to calculate country code in placex_insert, rank_search test was too early
Add extra field to placex 'calculated_country_code' to improve structure of code
Move split_geometery function out of add_location into its own function
Rewrite split_geometery to be more efficient.
Change place_insert to do more updates and less delete/inserts (delete is slow)
Include wikipedia links in details.php output
Cleanup no longer used geometry validation (adding overhead)
Include debug statements in functions.sql (--DEBUG: ) and add flag to setup.php to turn them on
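Why ST_Centroid can return a point outside the polygon, shown with a
plain Python sketch instead of PostGIS. The helper names and the
C-shaped test polygon are made up for illustration: the area-weighted
centroid of a concave shape lands in its notch, which is the case
ST_PointOnSurface avoids.

```python
def centroid(ring):
    """Area-weighted centroid of a simple polygon [(x, y), ...]."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def contains(ring, x, y):
    """Ray-casting point-in-polygon test (points on the boundary are not handled)."""
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# C-shaped polygon: a 4x4 square with a notch cut in from the right.
c_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (4, 3), (4, 4), (0, 4)]
cx, cy = centroid(c_shape)
centroid_inside = contains(c_shape, cx, cy)
```

For this shape the centroid falls inside the notch, so centroid_inside
is False even though the point is surrounded by the polygon.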
setup.php:
----------
add flag --disable-token-precalc to speed up debugging
add flag --index-noanalyse to disable analysing the DB at rank 4 and 26 (previously removed, but on my local DB it seems to be required)
add flag --enable-diff-updates (modifier to --create-functions) to turn on the code required for diff updates without having to modify functions.sql
add flag --enable-debug-statements (modifier to --create-functions) to turn on debug warning statements
update.php:
-----------
added flag --no-index to import osmosis changes without indexing them
extend the hack to allow import of JOSM generated osm files
country_grid.sql - reference copy of the sql used to generate the country_osm_grid table, needs cleanup