From f1fe70656f88c558a6f4bf6a09adde4ef59487fb Mon Sep 17 00:00:00 2001
From: Sarah Hoffmann
Date: Thu, 3 Jan 2019 21:08:38 +0100
Subject: [PATCH] more style docs

---
 docs/admin/Import-and-Update.md | 57 +++++++++++++++++++++++++--------
 docs/develop/Import.md          |  2 +-
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/docs/admin/Import-and-Update.md b/docs/admin/Import-and-Update.md
index bebc3645..4cd46ffb 100644
--- a/docs/admin/Import-and-Update.md
+++ b/docs/admin/Import-and-Update.md
@@ -60,7 +60,7 @@ Nominatim can use postcodes from an external source to improve searches that inv
 
 ## Choosing the Data to Import
 
-In its default setup, Nominatim is configured to import the full OSM data
+In its default setup Nominatim is configured to import the full OSM data
 set for the entire planet. Such a setup requires a powerful machine with
 at least 32GB of RAM and around 800GB of SSD hard disks. Depending on your
 use case there are various ways to reduce the amount of data imported. This
@@ -69,15 +69,31 @@ section discusses these methods. They can also be combined.
 ### Using an extract
 
 If you only need geocoding for a smaller region, then precomputed extracts
-are a good way to reduce the database size.
-[Geofabrik](https://download.geofabrik.de) offers extracts for most countries
-and also daily updates which can be used with the update process described
+are a good way to reduce the database size and import time.
+[Geofabrik](https://download.geofabrik.de) offers extracts for most countries.
+They even have daily updates which can be used with the update process described
 below. There are also
-[other providers for extracts](https://wiki.openstreetmap.org/wiki/Planet.osm#Downloading)
+[other providers for extracts](https://wiki.openstreetmap.org/wiki/Planet.osm#Downloading).
 
-Please be aware that some extracts are not exactly cut along the country
-boundaries. As a result some parts of the boundary may be missing and
-Nominatim cannot compute the areas as a result.
+Please be aware that some extracts are not cut exactly along the country
+boundaries. As a result some parts of the boundary may be missing which means
+that Nominatim cannot compute the areas for some administrative areas.
+
+### Dropping Data Required for Dynamic Updates
+
+About half of the data in Nominatim's database is not really used for serving
+the API. It is only there to allow the data to be updated from the latest
+changes from OSM. For many uses these dynamic updates are not really required.
+If you don't plan to apply updates, the dynamic part of the database can be
+safely dropped using the following command:
+
+```
+./utils/setup.php --drop
+```
+
+Note that you still need to provide for sufficient disk space for the initial
+import. So this option is particularly interesting if you plan to transfer the
+database or reuse the space later.
 
 ### Reverse-only Imports
 
@@ -94,13 +110,28 @@ Nominatim normally sets up a full search database containing administrative
 boundaries, places, streets, addresses and POI data. There are also other
 import styles available which only read selected data:
 
+* **settings/import-admin.style**
+  Only import administrative boundaries and places.
+* **settings/import-street.style**
+  Like the admin style but also adds streets.
+* **settings/import-address.style**
+  Import all data necessary to compute addresses down to house number level.
+* **settings/import-full.style**
+  Default style that also includes points of interest.
 
+The style can be changed with the configuration `CONST_Import_Style`.
 
-style Import time DB size after drop
-admin 5h 189 GB 20 GB
-street
-address
-full
+To give you an idea of the impact of using the different styles, the table
+below gives rough estimates of the final database size after import of a
+2018 planet and after using the `--drop` option. It also shows the time
+needed for the import on a machine with 32GB RAM, 4 CPUs and SSDs.
+
+style     | Import time  | DB size    | after drop
+----------|--------------|------------|------------
+admin     | 5h           | 189 GB     | 20 GB
+street    | 42h          | 396 GB     | 174 GB
+address   | 59h          | 497 GB     | 251 GB
+full      | 80h          | GB         | GB
 
 You can also customize the styles further. For an description of the style
 format see [the developement section](../develop/Import.md).
diff --git a/docs/develop/Import.md b/docs/develop/Import.md
index ddddb1a8..e733c6fd 100644
--- a/docs/develop/Import.md
+++ b/docs/develop/Import.md
@@ -22,7 +22,7 @@ The gazetteer style produces a single table `place` with the following rows:
 A single OSM object may appear multiple times in this table when it is tagged
 with multiple tags that may constitute a principal tag. Take for example a
 motorway bridge. In OSM, this would be a way which is tagged with
-`highway=motoway` and `bridge=yes`. This way would appear in the `place` table
+`highway=motorway` and `bridge=yes`. This way would appear in the `place` table
 once with `class` of `highway` and once with a `class` of `bridge`. Thus the
 *uique key* for `place` is (`osm_type`, `osm_id`, `class`).
 