An initial script to mass import cities. #326

Dustin Carlino 2020-10-26 17:46:53 -07:00
parent 004ca95842
commit b4c39a0850
3 changed files with 78 additions and 0 deletions

@@ -15,6 +15,7 @@
- [Misc developer tricks](dev/misc_tricks.md)
- [API](dev/api.md)
- [Testing](dev/testing.md)
- [Importing many maps](dev/mass_import.md)
- [Map model](map/README.md)
- [Details](map/details.md)
- [Importing](map/importing/README.md)

@@ -0,0 +1,42 @@
# Mass importing many maps

For <https://github.com/dabreegster/abstreet/issues/326>, I'm starting to figure
out how to import hundreds of maps into A/B Street. There are many issues with
scaling up the number of supported maps. This document just focuses on
importing.

## The current approach

<https://download.bbbike.org/> conveniently has 200 OSM extracts for major
cities worldwide. The `data/bbike.sh` script downloads these. Then
`data/mass_import.sh` attempts to import them into A/B Street.

The bbbike extracts, however, cover huge areas surrounding major cities.
Importing such large areas is slow, and the result is too large to work well in
A/B Street or the OSM viewer. Ideally, we want just the area concentrated around
the "core" of each city.

<https://github.com/dabreegster/abstreet/blob/master/convert_osm/src/bin/extract_cities.rs>
transforms a huge `.osm` file into smaller pieces, each focusing on one city
core. This tool looks for administrative boundary relations tagged as cities,
produces a clipping polygon covering the city, and uses `osmconvert` to produce
a smaller `.osm` file. The tool has two strategies for generating clipping
polygons. One is to locate the `admin_centre` or `label` node for the region,
then generate a circle of fixed radius around that point. Usually this node is
located in the city core, so this works reasonably well, except for "narrow"
cities along a coast. The other strategy glues together the relation's
multipolygon boundary, then simplifies the shape (usually with thousands of
points) using a convex hull. This strategy tends to produce results that are
too large, because city limits often extend far beyond the urban core.
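
As a rough illustration of the first strategy, the sketch below builds a
circular clipping polygon around a label point and serializes it in the `.poly`
format that `osmconvert` accepts via `-B=`. This is not the code in
`extract_cities`; the function names, the 69-miles-per-degree approximation, and
the number of points on the circle are all assumptions made for the example.

```rust
use std::f64::consts::PI;

/// Assumption: roughly 69 miles per degree of latitude (spherical Earth).
const MILES_PER_DEGREE_LAT: f64 = 69.0;

/// Hypothetical helper: approximate a circle of `radius_miles` around a
/// point as a ring of (lon, lat) pairs. The first point is repeated at the
/// end to close the ring.
fn circle_around(
    center_lon: f64,
    center_lat: f64,
    radius_miles: f64,
    num_points: usize,
) -> Vec<(f64, f64)> {
    assert!(num_points >= 3, "a polygon needs at least 3 points");
    let dlat = radius_miles / MILES_PER_DEGREE_LAT;
    // A degree of longitude covers less distance away from the equator,
    // so widen the longitude offset by 1 / cos(latitude).
    let dlon = dlat / center_lat.to_radians().cos();
    let mut ring: Vec<(f64, f64)> = (0..num_points)
        .map(|i| {
            let theta = 2.0 * PI * (i as f64) / (num_points as f64);
            (
                center_lon + dlon * theta.cos(),
                center_lat + dlat * theta.sin(),
            )
        })
        .collect();
    let first = ring[0];
    ring.push(first);
    ring
}

/// Hypothetical helper: serialize one ring in the Osmosis polygon filter
/// (.poly) format: a name line, a section header, "lon lat" lines, and
/// two END markers.
fn to_poly(name: &str, ring: &[(f64, f64)]) -> String {
    let mut out = format!("{}\n1\n", name);
    for (lon, lat) in ring {
        out.push_str(&format!("    {:.6}    {:.6}\n", lon, lat));
    }
    out.push_str("END\nEND\n");
    out
}
```

For example, `to_poly("seattle", &circle_around(-122.33, 47.61, 6.0, 32))`
yields a 33-point ring around an approximate downtown Seattle coordinate. The
6-mile radius mirrors the `--radius_around_label_miles=6` flag used in
`data/mass_import.sh`. A fixed circle like this ignores coastlines, which is why
it fits "narrow" coastal cities poorly.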
## Problems
- Outside the US, administrative boundaries don't always have a "city" defined.
In Tokyo in particular, this name isn't used. I'm not sure which boundary
level to use yet.
- The tool assumes driving on the right everywhere. OSM has
<https://wiki.openstreetmap.org/wiki/Key:driving_side>, but this is usually
tagged at the country level, which isn't included in the bbike extracts.
- The resulting maps are all "flattened" in A/B Street's list, so you can't see
any hierarchy of areas. Two cities with the same name from different areas
will arbitrarily collide.
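
The name-collision problem in the last item can be made concrete with a small
sketch. It detects city names claimed by more than one source extract and shows
one possible way to disambiguate them, by prefixing the extract name. Neither
helper exists in A/B Street; the names, the `__` separator, and the sample data
are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical helper: build a filename stem that includes the source
/// extract, replacing spaces with underscores.
fn namespaced_stem(extract: &str, city: &str) -> String {
    let clean = |s: &str| s.trim().replace(' ', "_");
    format!("{}__{}", clean(extract), clean(city))
}

/// Hypothetical helper: given (extract, city) pairs, return the city names
/// that appear under more than one pair, sorted alphabetically.
fn colliding_names(cities: &[(&str, &str)]) -> Vec<String> {
    let mut by_name: HashMap<&str, Vec<&str>> = HashMap::new();
    for (extract, city) in cities {
        by_name.entry(*city).or_default().push(*extract);
    }
    let mut dupes: Vec<String> = by_name
        .into_iter()
        .filter(|(_, extracts)| extracts.len() > 1)
        .map(|(city, _)| city.to_string())
        .collect();
    dupes.sort();
    dupes
}
```

With the made-up input `[("ExtractA", "Springfield"), ("ExtractB", "Springfield")]`,
`colliding_names` reports `Springfield`, while `namespaced_stem` gives the two
outputs distinct stems, `ExtractA__Springfield` and `ExtractB__Springfield`.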

data/mass_import.sh Executable file

@@ -0,0 +1,35 @@
#!/bin/bash
# This assumes you previously ran bbike.sh and tries to extract and import all
# cities from there. You'll also need https://www.gnu.org/software/parallel/.
#
# Be warned, running this eats CPU and disk space.
set -e
# Dump lots of temporary output here
mkdir -p mass_import
cd mass_import
# First extract all "cities" from the huge bbike files. If two names collide,
# the .osm and .poly might mix between the two arbitrarily!
# Don't parallelize (-j1); I think osmconvert must eat CPUs, because my system
# lags heavily with -j4 here.
for raw_extract in ~/bbike_extracts/*.osm; do
	raw_extract=$(basename -s .osm "$raw_extract")
	echo "cargo run --release --bin extract_cities -- $HOME/bbike_extracts/$raw_extract.osm --radius_around_label_miles=6 > extract_$raw_extract.log 2>&1"
done | parallel --bar -j1
# Spaces in filenames will mess stuff up, so replace them with underscores.
# With nullglob set, the loop is skipped when no filenames contain spaces.
shopt -s nullglob
for f in *\ *; do
	mv "$f" "${f// /_}"
done
shopt -u nullglob
# Then import each smaller .osm
cd ..
for name in mass_import/*.osm; do
	name=$(basename -s .osm "$name")
	echo "./import.sh --oneshot=mass_import/$name.osm --oneshot_clip=mass_import/$name.poly --skip_ch > mass_import/import_$name.log 2>&1"
done | parallel --bar -j4