# Importing the Database

The following instructions explain how to create a Nominatim database
from an OSM planet file. It is assumed that you have already successfully
installed the Nominatim software itself and the `nominatim` tool can be found
in your `PATH`. If this is not the case, return to the
[installation page](Installation.md).

## Creating the project directory

Before you start the import, you should create a project directory for your
new database installation. This directory receives all data that is related
to a single Nominatim setup: configuration, extra data, etc. Create a project
directory apart from the Nominatim software and change into the directory:

```
mkdir ~/nominatim-project
cd ~/nominatim-project
```

In the following, we refer to the project directory as `$PROJECT_DIR`. To be
able to copy and paste the instructions, you can export the appropriate variable:

```
export PROJECT_DIR=~/nominatim-project
```

The Nominatim tool assumes by default that the current working directory is
the project directory, but you may explicitly state a different directory using
the `--project-dir` parameter. The following instructions assume that you run
all commands from the project directory.

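The setup steps above can be combined into one small script. This is a minimal sketch: it reuses an already exported `PROJECT_DIR` and otherwise falls back to the example path `~/nominatim-project`; the fallback logic is an addition for convenience, not something Nominatim requires.

```sh
# Use an already exported PROJECT_DIR, or fall back to the example path.
PROJECT_DIR="${PROJECT_DIR:-$HOME/nominatim-project}"
export PROJECT_DIR

# -p makes the command safe to repeat if the directory already exists.
mkdir -p "$PROJECT_DIR"

# All further commands are expected to run from inside the project directory.
cd "$PROJECT_DIR" || exit 1
echo "Nominatim project directory: $(pwd)"
```

If you need to run a command from somewhere else, pass `--project-dir "$PROJECT_DIR"` to it instead of changing directories.
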
!!! tip "Migration Tip"

    Nominatim used to be run directly from the build directory until version 3.6.
    Essentially, the build directory functioned as the project directory
    for the database installation. This setup still works and can be useful for
    development purposes. It is not recommended anymore for production setups.
    Create a project directory that is separate from the Nominatim software.

### Configuration setup in `.env`

The Nominatim server can be customized via an `.env` configuration file in the
project directory. This is a file in [dotenv](https://github.com/theskumar/python-dotenv)
format which looks the same as variable settings in a standard shell environment.
You can also set the same configuration via environment variables. All
settings have a `NOMINATIM_` prefix to avoid conflicts with other environment
variables.

There are lots of configuration settings you can tweak. A full reference
can be found in the chapter [Configuration Settings](../customize/Settings.md).
Most settings have a sensible default.

#### Flatnode files

If you plan to import a large dataset (e.g. Europe, North America, planet),
you should also enable flatnode storage of node locations. With this
setting enabled, node coordinates are stored in a simple file instead
of the database. This will save you import time and disk storage.
Add to your `.env`:

    NOMINATIM_FLATNODE_FILE="/path/to/flatnode.file"

Replace the value with a suitable path on your system and make sure
the directory exists. There should be at least 75GB of free space.

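Before pointing the setting at a directory, it can help to check that the directory exists and has enough room. The following is a rough sketch, not part of Nominatim: `FLATNODE_DIR` and the helper `has_free_space` are hypothetical names, and the check relies on the POSIX output format of `df`.

```sh
# Hypothetical location for the flatnode file; adjust to your system.
FLATNODE_DIR="${FLATNODE_DIR:-$HOME/nominatim-flatnode}"
REQUIRED_GB=75

# Succeeds if the file system holding directory $1 has at least $2 GB free.
has_free_space() {
    # df -P prints one data line per file system; -k reports 1024-byte blocks.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

mkdir -p "$FLATNODE_DIR"
if has_free_space "$FLATNODE_DIR" "$REQUIRED_GB"; then
    echo "Enough space for a flatnode file in $FLATNODE_DIR"
else
    echo "Less than ${REQUIRED_GB}GB free in $FLATNODE_DIR" >&2
fi
```
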
## Downloading additional data

### Wikipedia/Wikidata rankings

Wikipedia can be used as an optional auxiliary data source to help indicate
the importance of OSM features. Nominatim will work without this information
but it will improve the quality of the results if this is installed.
This data is available as a binary download. Put it into your project directory:

    cd $PROJECT_DIR
    wget https://nominatim.org/data/wikimedia-importance.csv.gz
    wget -O secondary_importance.sql.gz https://nominatim.org/data/wikimedia-secondary-importance.sql.gz

The files are about 400MB and add around 4GB to the Nominatim database. For
more information about importance,
see [Importance Customization](../customize/Importance.md).

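Large downloads are sometimes cut short. As an optional sanity check (a sketch, not a step Nominatim requires), the hypothetical helper `check_downloads` below confirms that each file exists and is a complete gzip archive before you start the import.

```sh
# Succeeds only if every named file exists and is a complete gzip archive.
check_downloads() {
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        # gzip -t tests archive integrity without unpacking anything.
        if ! gzip -t "$f" 2>/dev/null; then
            echo "corrupt or incomplete: $f" >&2
            return 1
        fi
    done
}

# Check the two files named in the download commands above.
check_downloads wikimedia-importance.csv.gz secondary_importance.sql.gz \
    || echo "Download the files again before importing." >&2
```
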
!!! tip

    If you forgot to download the Wikipedia rankings, you can
    also add importances after the import. Download the files, then
    run `nominatim refresh --wiki-data --secondary-importance --importance`.
    Updating importances for a planet will take a couple of hours.

### External postcodes

Nominatim can use postcodes from an external source to improve searching with
postcodes. We provide precomputed postcode sets for the US (using TIGER data)
and the UK (using the [CodePoint OpenData set](https://osdatahub.os.uk/downloads/open/CodePointOpen)).
This data can be optionally downloaded into the project directory:

    cd $PROJECT_DIR
    wget https://nominatim.org/data/gb_postcodes.csv.gz
    wget https://nominatim.org/data/us_postcodes.csv.gz

You can also add your own custom postcode sources, see
[Customization of postcodes](../customize/Postcodes.md).

## Choosing the data to import

In its default setup Nominatim is configured to import the full OSM data
set for the entire planet. Such a setup requires a powerful machine with
at least 64GB of RAM and around 900GB of SSD storage. Depending on your
use case there are various ways to reduce the amount of data imported. This
section discusses these methods. They can also be combined.

### Using an extract

If you only need geocoding for a smaller region, then precomputed OSM extracts
are a good way to reduce the database size and import time.
[Geofabrik](https://download.geofabrik.de) offers extracts for most countries.
They even have daily updates which can be used with the update process described
[in the next section](Update.md). There are also
[other providers for extracts](https://wiki.openstreetmap.org/wiki/Planet.osm#Downloading).

Please be aware that some extracts are not cut exactly along the country
boundaries. As a result some parts of the boundary may be missing which means
that Nominatim cannot compute the areas for some administrative areas.

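As an illustration of fetching an extract, the sketch below builds a download URL for a small region. The region `europe/monaco` and the `<region>-latest.osm.pbf` naming are assumptions about how Geofabrik lays out its download server; check the Geofabrik site for the exact link of the region you need. The download only runs when `DOWNLOAD=1` is set, so the script can be dry-run first.

```sh
# Region path as it appears on the download server (example value).
REGION="${REGION:-europe/monaco}"
EXTRACT_URL="https://download.geofabrik.de/${REGION}-latest.osm.pbf"
# Local file name: everything after the last slash of the URL.
EXTRACT_FILE="${EXTRACT_URL##*/}"

echo "Extract URL: $EXTRACT_URL"
echo "Local file:  $EXTRACT_FILE"

# Only download when explicitly requested.
if [ "${DOWNLOAD:-0}" = "1" ]; then
    wget -O "$EXTRACT_FILE" "$EXTRACT_URL"
fi
```
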
### Dropping Data Required for Dynamic Updates

About half of the data in Nominatim's database is not really used for serving
the API. It is only there to allow the data to be updated from the latest
changes from OSM. For many uses these dynamic updates are not really required.
If you don't plan to apply updates, you can run the import with the
`--no-updates` parameter. This will drop the dynamic part of the database as
soon as it is not required anymore.

You can also drop the dynamic part later using the following command:

```
nominatim freeze
```

Note that you still need to provide sufficient disk space for the initial
import. So this option is particularly interesting if you plan to transfer the
database or reuse the space later.

!!! warning

    The data structures for updates are also required when adding additional data
    after the import, for example [TIGER housenumber data](../customize/Tiger.md).
    If you plan to use those, you must not use the `--no-updates` parameter.
    Do a normal import, add the external data and once you are done with
    everything run `nominatim freeze`.

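The order of operations described in the warning can be sketched as a shell function. `import_then_freeze` is a hypothetical helper name; the arguments after the OSM file are handed to `nominatim add-data` unchanged, so take the exact options from the page for the data you are adding (for TIGER data, see the linked page).

```sh
# Import, add external data, then freeze.
# $1 is the OSM file; any further arguments go to 'nominatim add-data'.
import_then_freeze() {
    osm_file=$1
    shift
    # Normal import without --no-updates, so the update tables are kept for now.
    nominatim import --osm-file "$osm_file" || return 1
    # Add external data while the update tables still exist.
    if [ "$#" -gt 0 ]; then
        nominatim add-data "$@" || return 1
    fi
    # Everything is in place: drop the data that is only needed for updates.
    nominatim freeze
}
```

Each step only runs if the previous one succeeded, so a failed import never reaches `nominatim freeze`.
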
### Reverse-only Imports

If you only want to use the Nominatim database for reverse lookups or
if you plan to use the installation only for exports to a
[photon](https://photon.komoot.io/) database, then you can set up a database
without search indexes. Add `--reverse-only` to your `nominatim import` command.

This saves about 5% of disk space, but the import won't be significantly faster.

### Filtering Imported Data

Nominatim normally sets up a full search database containing administrative
boundaries, places, streets, addresses and POI data. There are also other
import styles available which only read selected data:

* **admin**
    Only import administrative boundaries and places.
* **street**
    Like the admin style but also adds streets.
* **address**
    Import all data necessary to compute addresses down to house number level.
* **full**
    Default style that also includes points of interest.
* **extratags**
    Like the full style but also adds most of the OSM tags into the extratags
    column.

The style can be changed with the configuration `NOMINATIM_IMPORT_STYLE`.

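One way to set the style before the import is to write it into the `.env` file of the project directory. The helper below is a hypothetical convenience function, not part of Nominatim: it replaces an existing line for the key or appends a new one, so running it twice does not leave duplicate entries.

```sh
# Set KEY=VALUE in the .env file of the current directory, replacing any
# earlier line for the same key.
set_env_setting() {
    key=$1
    value=$2
    touch .env
    # Keep every line that does not assign this key ...
    grep -v "^${key}=" .env > .env.tmp || true
    # ... then append the new assignment.
    printf '%s="%s"\n' "$key" "$value" >> .env.tmp
    mv .env.tmp .env
}

# Run from the project directory: select the 'address' style.
set_env_setting NOMINATIM_IMPORT_STYLE address
```

Exporting `NOMINATIM_IMPORT_STYLE=address` in the shell before running the import has the same effect, since all settings can also be given as environment variables.
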
To give you an idea of the impact of using the different styles, the table
below gives rough estimates of the final database size after import of a
2020 planet and after dropping the data required for dynamic updates as
described above. It also shows the time
needed for the import on a machine with 64GB RAM, 4 CPUs and NVMe disks.
Note that the given sizes are just an estimate meant for comparison of
style requirements. Your planet import is likely to be larger as the
OSM data grows with time.

style     | Import time  |  DB size   |  after drop
----------|--------------|------------|------------
admin     |    4h        |  215 GB    |   20 GB
street    |   22h        |  440 GB    |  185 GB
address   |   36h        |  545 GB    |  260 GB
full      |   54h        |  640 GB    |  330 GB
extratags |   54h        |  650 GB    |  340 GB

You can also customize the styles further.
A [description of the style format](../customize/Import-Styles.md)
can be found in the customization guide.

## Initial import of the data

!!! danger "Important"

    First try the import with a small extract, for example from
    [Geofabrik](https://download.geofabrik.de).

Download the data to import. Then issue the following command
from the **project directory** to start the import:

```sh
nominatim import --osm-file <data file> 2>&1 | tee setup.log
```

The **project directory** is the one that you have set up at the beginning.
See [creating the project directory](#creating-the-project-directory).

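A long import is annoying to lose to a typo in the file name. The sketch below wraps the import command in a hypothetical helper, `run_import`, that first checks that the data file exists and that `nominatim` can be found, and only then starts the import with the same logging as the command above.

```sh
# Start an import of OSM file $1 after two basic pre-flight checks.
run_import() {
    osm_file=$1
    if [ ! -f "$osm_file" ]; then
        echo "OSM file not found: $osm_file" >&2
        return 1
    fi
    if ! command -v nominatim >/dev/null 2>&1; then
        echo "nominatim is not in your PATH" >&2
        return 1
    fi
    # Same command as above: show the output and keep a copy in setup.log.
    nominatim import --osm-file "$osm_file" 2>&1 | tee setup.log
}
```

Call it from the project directory, for example as `run_import monaco-latest.osm.pbf`. Note that because of the pipe, the function returns the exit status of `tee`, not of the import, so check `setup.log` for errors.
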
### Notes on full planet imports

Even on a perfectly configured machine
the import of a full planet takes around 2 days. Once you see messages
with `Rank .. ETA` appear, the indexing process has started. This part takes
the most time. There are 30 ranks to process. Ranks 26 and 30 are the most complex.
Each of them takes about a third of the total import time. If you have not reached
rank 26 after two days of import, it is worth revisiting your system
configuration as it may not be optimal for the import.

### Notes on memory usage

In the first step of the import Nominatim uses [osm2pgsql](https://osm2pgsql.org)
to load the OSM data into the PostgreSQL database. This step is very demanding
in terms of RAM usage. osm2pgsql and PostgreSQL are running in parallel at
this point. PostgreSQL blocks at least the part of RAM that has been configured
with the `shared_buffers` parameter during
[PostgreSQL tuning](Installation.md#tuning-the-postgresql-database)
and needs some memory on top of that. osm2pgsql needs at least 2GB of RAM for
its internal data structures, potentially more when it has to process very large
relations. In addition it needs to maintain a cache for node locations. The size
of this cache can be configured with the parameter `--osm2pgsql-cache`.

When importing with a flatnode file, it is best to disable the node cache
completely and leave the memory for the flatnode file. Nominatim will do this
by default, so you do not need to configure anything in this case.

For imports without a flatnode file, set `--osm2pgsql-cache` approximately to
the size of the OSM pbf file you are importing. The size needs to be given in
MB. Make sure you leave enough RAM for PostgreSQL and osm2pgsql as mentioned
above. If the system starts swapping or you are getting out-of-memory errors,
reduce the cache size or even consider using a flatnode file.

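The rule of thumb for imports without a flatnode file can be turned into a small calculation. The hypothetical helper `cache_size_mb` below prints the size of a file in MB, rounded up, treating 1 MB as 1048576 bytes; since the recommendation is only approximate, the exact unit convention should not matter much.

```sh
# Print the size of file $1 in MB (1048576 bytes), rounded up.
cache_size_mb() {
    [ -f "$1" ] || return 1
    bytes=$(wc -c < "$1") || return 1
    # Adding 1048575 before the integer division rounds up.
    echo $(( (bytes + 1048575) / 1048576 ))
}
```

The result can be passed straight to the import, for example with `--osm2pgsql-cache "$(cache_size_mb "$OSM_FILE")"`, where `OSM_FILE` is a placeholder for your pbf file. Reduce the value if it does not leave enough RAM for PostgreSQL and osm2pgsql.
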
### Testing the installation

Run this command to verify that all required tables and indices got created
successfully.

```sh
nominatim admin --check-database
```

If you have installed the `nominatim-api` package, then you can try out
your installation by executing a simple query on the command line:

``` sh
nominatim search --query Berlin
```

or, when you have a reverse-only installation:

``` sh
nominatim reverse --lat 51 --lon 45
```

If you want to run Nominatim as a service, make sure you have installed
the right packages as per [Installation](Installation.md#software).

#### Testing the Python frontend

To run the test server against the Python frontend, you must choose a
web framework to use, either Starlette or Falcon. Make sure the appropriate
packages are installed. Then run

``` sh
nominatim serve
```

or, if you prefer to use Starlette instead of Falcon as webserver,

``` sh
nominatim serve --engine starlette
```

Go to `http://localhost:8088/status` and you should see the message `OK`.
You can also run a search query, e.g. `http://localhost:8088/search?q=Berlin`
or, for reverse-only installations, a reverse query,
e.g. `http://localhost:8088/reverse?lat=27.1750090510034&lon=78.04209025`.

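The status check can also be done from the command line. The following sketch assumes `curl` is installed and that the test server is listening on the address shown above; `check_status` is a hypothetical helper name.

```sh
# Succeeds if the status endpoint below base URL $1 answers with "OK".
check_status() {
    # -f: fail on HTTP errors, -s: no progress meter, -S: still print errors.
    reply=$(curl -fsS "$1/status") || return 1
    [ "$reply" = "OK" ]
}

if check_status "http://localhost:8088" 2>/dev/null; then
    echo "Nominatim test server is up"
else
    echo "No OK from the test server; is 'nominatim serve' running?" >&2
fi
```
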
Do not use this test server in production.
To run Nominatim via webservers like Apache or nginx, please continue reading
[Deploy the Python frontend](Deployment-Python.md).

## Enabling search by category phrases

To be able to search for places by their type using
[special phrases](https://wiki.openstreetmap.org/wiki/Nominatim/Special_Phrases)
you also need to import these key phrases like this:

```sh
nominatim special-phrases --import-from-wiki
```

Note that this command downloads the phrases from the wiki link above. You
need internet access for this step.

You can also import special phrases from a CSV file. For more
information, please see the [Customization part](../customize/Special-Phrases.md).