

Martin


Martin is a PostGIS vector tiles server suitable for large databases. Martin is written in Rust using the Actix Web framework.


Requirements

Martin requires PostGIS >= 3.0.0

Installation

You can download martin from the GitHub releases page.

| Platform | Downloads (latest) |
|----------|--------------------|
| Linux    | 64-bit             |
| macOS    | 64-bit             |
| Windows  | 64-bit             |

If you are using macOS and Homebrew, you can install martin using the Homebrew tap:

brew tap urbica/tap
brew install martin

You can also use the official Docker image:

docker run -p 3000:3000 -e DATABASE_URL=postgres://postgres@localhost/db maplibre/martin

Usage

Martin requires a database connection string. It can be passed as a command-line argument or in the DATABASE_URL environment variable.

martin postgres://postgres@localhost/db

Martin provides a TileJSON endpoint for each geospatial-enabled table in your database.

API

When started, martin will go through all spatial tables and functions with an appropriate signature in the database. These tables and functions will be available as HTTP endpoints, which you can use to query Mapbox vector tiles.

| Method | URL | Description |
|--------|-----|-------------|
| GET | / | Status text (will eventually show a web UI) |
| GET | /catalog | List of all sources |
| GET | /{name} | Source TileJSON |
| GET | /{name}/{z}/{x}/{y} | Source tiles |
| GET | /{name1},...,{nameN} | Composite Source TileJSON |
| GET | /{name1},...,{nameN}/{z}/{x}/{y} | Composite Source tiles |
| GET | /health | Martin server health check: returns 200 OK |
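As an illustration of how the URL patterns above compose, here is a minimal sketch in Python using hypothetical source names and a local host; this is plain string-building, not part of Martin:

```python
def tilejson_url(base: str, *sources: str) -> str:
    """TileJSON endpoint: one source, or a composite of several joined by commas."""
    return f"{base}/{','.join(sources)}"

def tile_url_template(base: str, *sources: str) -> str:
    """Tile endpoint template for the same source list."""
    return f"{tilejson_url(base, *sources)}/{{z}}/{{x}}/{{y}}"

print(tilejson_url("http://localhost:3000", "points"))
# http://localhost:3000/points
print(tile_url_template("http://localhost:3000", "points1", "points2"))
# http://localhost:3000/points1,points2/{z}/{x}/{y}
```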

Using with MapLibre

MapLibre is an open-source JavaScript library for showing maps on a website. MapLibre can accept MVT vector tiles generated by Martin and apply a style to them to draw a map using WebGL.

You can add a layer to the map and specify the Martin TileJSON endpoint as a vector source URL. You should also specify a source-layer property. For Table Sources it is {table_name} by default.

map.addLayer({
  id: 'points',
  type: 'circle',
  source: {
    type: 'vector',
    url: 'http://localhost:3000/points'
  },
  'source-layer': 'points',
  paint: {
    'circle-color': 'red'
  },
});
map.addSource('rpc', {
  type: 'vector',
  url: `http://localhost:3000/function_zxy_query`
});
map.addLayer({
  id: 'function-points', // layer ids must be unique within the map
  type: 'circle',
  source: 'rpc',
  'source-layer': 'function_zxy_query',
  paint: {
    'circle-color': 'blue'
  },
});

You can also combine multiple sources into one source with Composite Sources. Each source in a composite source can be accessed with its {source_name} as a source-layer property.

map.addSource('points', {
  type: 'vector',
  url: `http://localhost:3000/points1,points2`
});

map.addLayer({
  id: 'red_points',
  type: 'circle',
  source: 'points',
  'source-layer': 'points1',
  paint: {
    'circle-color': 'red'
  }
});

map.addLayer({
  id: 'blue_points',
  type: 'circle',
  source: 'points',
  'source-layer': 'points2',
  paint: {
    'circle-color': 'blue'
  }
});

Using with Leaflet

Leaflet is the leading open-source JavaScript library for mobile-friendly interactive maps.

You can add vector tiles using the Leaflet.VectorGrid plugin. You must initialize a VectorGrid.Protobuf with a URL template, just like in L.TileLayer. The difference is that you should define the styling for all the features.

L.vectorGrid
  .protobuf('http://localhost:3000/points/{z}/{x}/{y}', {
    vectorTileLayerStyles: {
      'points': {
        color: 'red',
        fill: true
      }
    }
  })
  .addTo(map);

Using with deck.gl

deck.gl is a WebGL-powered framework for visual exploratory data analysis of large datasets.

You can add vector tiles using MVTLayer. The MVTLayer data property defines the remote data for the MVT layer. It can be:

  • String: either a URL template or a TileJSON URL.
  • Array: an array of URL templates. This allows balancing requests across different tile endpoints; for example, with an array of 4 URLs and 16 tiles to load, each endpoint is responsible for serving 16/4 = 4 tiles.
  • JSON: a valid TileJSON object.

const pointsLayer = new MVTLayer({
  data: 'http://localhost:3000/points', // 'http://localhost:3000/table_source/{z}/{x}/{y}'
  pointRadiusUnits: 'pixels',
  getRadius: 5,
  getFillColor: [230, 0, 0]
});

const deckgl = new DeckGL({
  container: 'map',
  mapStyle: 'https://basemaps.cartocdn.com/gl/dark-matter-gl-style/style.json',
  initialViewState: {
    latitude: 0,
    longitude: 0,
    zoom: 1
  },
  layers: [pointsLayer]
});

Using with Mapbox

Mapbox GL JS is a JavaScript library for interactive, customizable vector maps on the web. Mapbox GL JS v1.x was open source and was forked as MapLibre (see above), so using Martin with Mapbox is similar to using it with MapLibre. Mapbox GL JS can accept MVT vector tiles generated by Martin and apply a style to them to draw a map using WebGL.

You can add a layer to the map and specify the Martin TileJSON endpoint as a vector source URL. You should also specify a source-layer property. For Table Sources it is {table_name} by default.

map.addLayer({
  id: 'points',
  type: 'circle',
  source: {
    type: 'vector',
    url: 'http://localhost:3000/points'
  },
  'source-layer': 'points',
  paint: {
    'circle-color': 'red'
  }
});

Source List

A list of all available sources is available from the catalog endpoint:

curl localhost:3000/catalog | jq

[
  {
    "id": "function_zxy_query",
    "name": "public.function_zxy_query"
  },
  {
    "id": "points1",
    "name": "public.points1.geom"
  },
  // ...
]

Table Sources

A Table Source is a database table that can be used to query vector tiles. When started, martin will go through all spatial tables in the database and build a list of table sources. A table should have at least one geometry column with a non-zero SRID. All other table columns will be represented as properties of a vector tile feature.

Table Source TileJSON

Table Source TileJSON endpoint is available at /{table_name}.

For example, a points table will be available at /points, unless there is another source with the same name or the table has multiple geometry columns, in which case it will be available at /points.1, /points.2, etc.

curl localhost:3000/points

Table Source Tiles

Table Source tiles endpoint is available at /{table_name}/{z}/{x}/{y}

For example, points table will be available at /points/{z}/{x}/{y}

curl localhost:3000/points/0/0/0

If you have multiple geometry columns in the table and want to access a particular geometry column in the vector tile, you should also specify the geometry column in the table source name:

curl localhost:3000/points.geom/0/0/0

Composite Sources

Composite Sources allow combining multiple sources into one. A composite source consists of multiple sources separated by commas: {source1},...,{sourceN}

Each source in a composite source can be accessed with its {source_name} as a source-layer property.

Composite Source TileJSON

Composite Source TileJSON endpoint is available at /{source1},...,{sourceN}.

For example, composite source for points and lines tables will be available at /points,lines

curl localhost:3000/points,lines

Composite Source Tiles

Composite Source tiles endpoint is available at /{source1},...,{sourceN}/{z}/{x}/{y}

For example, composite source for points and lines tables will be available at /points,lines/{z}/{x}/{y}

curl localhost:3000/points,lines/0/0/0

Function Sources

A Function Source is a database function that can be used to query vector tiles. When started, martin will look for functions with a suitable signature. A function that takes z integer (or zoom integer), x integer, y integer, and an optional query json, and returns bytea, can be used as a Function Source. Alternatively, the function can return a record with a single bytea field, or a record with two fields of types bytea and text, where the text field is an etag key (e.g. an MD5 hash).

| Argument | Type | Description |
|----------|------|-------------|
| z (or zoom) | integer | Tile zoom parameter |
| x | integer | Tile x parameter |
| y | integer | Tile y parameter |
| query (optional, any name) | json | Query string parameters |
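The z, x, and y arguments address a square in the standard Web Mercator tile grid; ST_TileEnvelope(z, x, y) in the SQL examples below returns that tile's bounding box in EPSG:3857. A rough Python sketch of the same computation (illustrative only, not part of Martin or PostGIS):

```python
# Half the extent of the Web Mercator (EPSG:3857) world, in meters
EXTENT = 20037508.342789244

def tile_envelope(z: int, x: int, y: int):
    """Bounding box (xmin, ymin, xmax, ymax) of tile z/x/y in EPSG:3857,
    mirroring what PostGIS ST_TileEnvelope returns by default."""
    size = 2 * EXTENT / (1 << z)  # edge length of one tile at zoom z
    xmin = -EXTENT + x * size
    ymax = EXTENT - y * size      # tile row 0 is the northernmost
    return (xmin, ymax - size, xmin + size, ymax)

print(tile_envelope(0, 0, 0))  # zoom 0: the whole world in a single tile
```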

For example, if you have a table table_source in WGS 84 (SRID 4326), you can use this function as a Function Source:

CREATE OR REPLACE FUNCTION function_zxy_query(z integer, x integer, y integer) RETURNS bytea AS $$
DECLARE
  mvt bytea;
BEGIN
  SELECT INTO mvt ST_AsMVT(tile, 'function_zxy_query', 4096, 'geom') FROM (
    SELECT
      ST_AsMVTGeom(ST_Transform(ST_CurveToLine(geom), 3857), ST_TileEnvelope(z, x, y), 4096, 64, true) AS geom
    FROM table_source
    WHERE geom && ST_Transform(ST_TileEnvelope(z, x, y), 4326)
  ) as tile WHERE geom IS NOT NULL;

  RETURN mvt;
END
$$ LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;

CREATE OR REPLACE FUNCTION function_zxy_query(z integer, x integer, y integer, query_params json) RETURNS bytea AS $$
DECLARE
  mvt bytea;
BEGIN
  SELECT INTO mvt ST_AsMVT(tile, 'function_zxy_query', 4096, 'geom') FROM (
    SELECT
      ST_AsMVTGeom(ST_Transform(ST_CurveToLine(geom), 3857), ST_TileEnvelope(z, x, y), 4096, 64, true) AS geom
    FROM table_source
    WHERE geom && ST_Transform(ST_TileEnvelope(z, x, y), 4326)
  ) as tile WHERE geom IS NOT NULL;

  RETURN mvt;
END
$$ LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;

The query_params argument is a JSON representation of the tile request's query params. For example, if a user requests a tile with URL-encoded params:

curl \
  --data-urlencode 'arrayParam=[1, 2, 3]' \
  --data-urlencode 'numberParam=42' \
  --data-urlencode 'stringParam=value' \
  --data-urlencode 'booleanParam=true' \
  --data-urlencode 'objectParam={"answer" : 42}' \
  --get localhost:3000/function_zxy_query/0/0/0

then query_params will be parsed as:

{
  "arrayParam": [1, 2, 3],
  "numberParam": 42,
  "stringParam": "value",
  "booleanParam": true,
  "objectParam": { "answer": 42 }
}

You can access these params using JSON operators:

...WHERE answer = (query_params->'objectParam'->>'answer')::int;
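The string-to-JSON coercion described above can be approximated in Python; this is an illustrative sketch of the behavior, not Martin's actual implementation:

```python
import json
from urllib.parse import parse_qsl

def query_to_json(query_string: str) -> dict:
    """Approximate how URL query params become a typed JSON object:
    values that parse as JSON (numbers, booleans, arrays, objects)
    keep that type; anything else stays a string."""
    result = {}
    for key, value in parse_qsl(query_string):
        try:
            result[key] = json.loads(value)
        except json.JSONDecodeError:
            result[key] = value
    return result

print(query_to_json("numberParam=42&stringParam=value&booleanParam=true"))
# {'numberParam': 42, 'stringParam': 'value', 'booleanParam': True}
```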

Function Source TileJSON

Function Source TileJSON endpoint is available at /{function_name}

For example, points function will be available at /points

curl localhost:3000/points

Function Source Tiles

Function Source tiles endpoint is available at /{function_name}/{z}/{x}/{y}

For example, points function will be available at /points/{z}/{x}/{y}

curl localhost:3000/points/0/0/0

Command-line Interface

You can configure martin using the command-line interface:

Usage: martin [OPTIONS] [CONNECTION]

Arguments:
  [CONNECTION]  Database connection string

Options:
  -c, --config <CONFIG>
          Path to config file
  -k, --keep-alive <KEEP_ALIVE>
          Connection keep alive timeout. [DEFAULT: 75]
  -l, --listen-addresses <LISTEN_ADDRESSES>
          The socket address to bind. [DEFAULT: 0.0.0.0:3000]
  -W, --workers <WORKERS>
          Number of web server workers
      --ca-root-file <CA_ROOT_FILE>
          Loads trusted root certificates from a file. The file should contain a sequence of PEM-formatted CA certificates
      --danger-accept-invalid-certs
          Trust invalid certificates. This introduces significant vulnerabilities, and should only be used as a last resort
  -d, --default-srid <DEFAULT_SRID>
          If a spatial table has SRID 0, then this default SRID will be used as a fallback
  -p, --pool-size <POOL_SIZE>
          Maximum connections pool size [DEFAULT: 20]
  -h, --help
          Print help information
  -V, --version
          Print version information

Environment Variables

You can also configure martin using environment variables:

| Environment variable | Example | Description |
|----------------------|---------|-------------|
| DATABASE_URL | postgres://postgres@localhost/db | Postgres database connection |
| CA_ROOT_FILE | ./ca-certificate.crt | Loads trusted root certificates from a file |
| DEFAULT_SRID | 4326 | Fallback SRID |
| DANGER_ACCEPT_INVALID_CERTS | false | Trust invalid certificates |

Configuration File

If you don't want to expose all of your tables and functions, you can list your sources in a configuration file. To start martin with a configuration file, pass its path with the --config argument.

martin --config config.yaml

You can find an example of a configuration file here.

# Database connection string
connection_string: 'postgres://postgres@localhost:5432/db'

# Trust invalid certificates. This introduces significant vulnerabilities, and should only be used as a last resort.
danger_accept_invalid_certs: false

# If a spatial table has SRID 0, then this SRID will be used as a fallback
default_srid: 4326

# Connection keep alive timeout [default: 75]
keep_alive: 75

# The socket address to bind [default: 0.0.0.0:3000]
listen_addresses: '0.0.0.0:3000'

# Maximum connections pool size [default: 20]
pool_size: 20

# Number of web server workers
worker_processes: 8

# Associative arrays of table sources
tables:
  table_source_id:
    # Table schema (required)
    schema: public

    # Table name (required)
    table: table_source

    # Geometry SRID (required)
    srid: 4326

    # Geometry column name (required)
    geometry_column: geom

    # Feature id column name
    id_column: ~

    # An integer specifying the minimum zoom level
    minzoom: 0

    # An integer specifying the maximum zoom level. MUST be >= minzoom
    maxzoom: 30

    # The maximum extent of available map tiles. Bounds MUST define an area
    # covered by all zoom levels. The bounds are represented in WGS 84
    # latitude and longitude values, in the order left, bottom, right, top.
    # Values may be integers or floating point numbers.
    bounds: [-180.0, -90.0, 180.0, 90.0]

    # Tile extent in tile coordinate space
    extent: 4096

    # Buffer distance in tile coordinate space to optionally clip geometries
    buffer: 64

    # Boolean to control if geometries should be clipped or encoded as is
    clip_geom: true

    # Geometry type
    geometry_type: GEOMETRY

    # List of columns that should be encoded as tile properties (required)
    properties:
      gid: int4

# Associative arrays of function sources
functions:
  function_source_id:
    # Schema name (required)
    schema: public

    # Function name (required)
    function: function_zxy_query

    # An integer specifying the minimum zoom level
    minzoom: 0

    # An integer specifying the maximum zoom level. MUST be >= minzoom
    maxzoom: 30

    # The maximum extent of available map tiles. Bounds MUST define an area
    # covered by all zoom levels. The bounds are represented in WGS 84
    # latitude and longitude values, in the order left, bottom, right, top.
    # Values may be integers or floating point numbers.
    bounds: [-180.0, -90.0, 180.0, 90.0]

Using with Docker

You can use official Docker image maplibre/martin

docker run \
  -p 3000:3000 \
  -e DATABASE_URL=postgres://postgres@localhost/db \
  maplibre/martin

If you are running a PostgreSQL instance on localhost, you have to change the network settings to allow the Docker container to access the localhost network.

For Linux, add the --net=host flag to access the localhost PostgreSQL service.

docker run \
  --net=host \
  -p 3000:3000 \
  -e DATABASE_URL=postgres://postgres@localhost/db \
  maplibre/martin

For macOS, use host.docker.internal as hostname to access the localhost PostgreSQL service.

docker run \
  -p 3000:3000 \
  -e DATABASE_URL=postgres://postgres@host.docker.internal/db \
  maplibre/martin

For Windows, use docker.for.win.localhost as hostname to access the localhost PostgreSQL service.

docker run \
  -p 3000:3000 \
  -e DATABASE_URL=postgres://postgres@docker.for.win.localhost/db \
  maplibre/martin

Using with Docker Compose

You can use the example docker-compose.yml file as a reference:

version: '3'

services:
  martin:
    image: maplibre/martin:v0.6.2
    restart: unless-stopped
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://postgres:password@db/db
    depends_on:
      - db

  db:
    image: postgis/postgis:14-3.3-alpine
    restart: unless-stopped
    environment:
      - POSTGRES_DB=db
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    volumes:
      - ./pg_data:/var/lib/postgresql/data

First, you need to start the db service:

docker-compose up -d db

Then, once the db service is ready to accept connections, you can start martin:

docker-compose up -d martin

By default, martin will be available at localhost:3000

Using with Nginx

You can run martin behind an Nginx proxy to cache frequently accessed tiles and reduce unnecessary pressure on the database.

version: '3'

services:
  nginx:
    image: nginx:alpine
    restart: unless-stopped
    ports:
      - "80:80"
    volumes:
      - ./cache:/var/cache/nginx
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - martin

  martin:
    image: maplibre/martin:v0.6.2
    restart: unless-stopped
    environment:
      - DATABASE_URL=postgres://postgres:password@db/db
    depends_on:
      - db

  db:
    image: postgis/postgis:14-3.3-alpine
    restart: unless-stopped
    environment:
      - POSTGRES_DB=db
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    volumes:
      - ./pg_data:/var/lib/postgresql/data

You can find an example Nginx configuration file here.

Rewriting URLs

If you are running martin behind an Nginx proxy, you may want to rewrite the request URL so that tile URLs in TileJSON endpoints are generated correctly.

location ~ /tiles/(?<fwd_path>.*) {
    proxy_set_header  X-Rewrite-URL $uri;
    proxy_set_header  X-Forwarded-Host $host:$server_port;
    proxy_set_header  X-Forwarded-Proto $scheme;
    proxy_redirect    off;

    proxy_pass        http://martin:3000/$fwd_path$is_args$args;
}

Caching tiles

You can also use Nginx to cache tiles. In the example, the maximum cache size is set to 10 GB, and the caching time is set to 1 hour for responses with codes 200, 204, and 302, and to 1 minute for responses with code 404.

http {
  ...
  proxy_cache_path  /var/cache/nginx/
                    levels=1:2
                    max_size=10g
                    use_temp_path=off
                    keys_zone=tiles_cache:10m;

  server {
    ...
    location ~ /tiles/(?<fwd_path>.*) {
        proxy_set_header        X-Rewrite-URL $uri;
        proxy_set_header        X-Forwarded-Host $host:$server_port;
        proxy_set_header        X-Forwarded-Proto $scheme;
        proxy_redirect          off;

        proxy_cache             tiles_cache;
        proxy_cache_lock        on;
        proxy_cache_revalidate  on;

        # Set caching time for responses
        proxy_cache_valid       200 204 302 1h;
        proxy_cache_valid       404 1m;

        proxy_cache_use_stale   error timeout http_500 http_502 http_503 http_504;
        add_header              X-Cache-Status $upstream_cache_status;

        proxy_pass              http://martin:3000/$fwd_path$is_args$args;
    }
  }
}


Building from Source

You can clone the repository and build martin using the cargo package manager.

git clone git@github.com:maplibre/martin.git
cd martin
cargo build --release

The binary will be available at ./target/release/martin.

cd ./target/release/
./martin postgres://postgres@localhost/db

Debugging

Log levels are controlled on a per-module basis; by default, all logging is disabled except for errors. Logging is controlled via the RUST_LOG environment variable, whose value is a comma-separated list of logging directives.

This will enable debug logging for all modules:

export RUST_LOG=debug
martin postgres://postgres@localhost/db

While this will enable info-level logging for the actix_web module and debug-level logging for the martin and tokio_postgres modules:

export RUST_LOG=actix_web=info,martin=debug,tokio_postgres=debug
martin postgres://postgres@localhost/db

Development

  • Clone Martin
  • Install docker, docker-compose, and Just (an improved makefile processor)
  • Run just to see available commands:

git clone git@github.com:maplibre/martin.git
cd martin
just

Available recipes:
    run *ARGS             # Start Martin server and a test database
    debug-page *ARGS      # Start Martin server and open a test page
    psql *ARGS            # Run PSQL utility against the test database
    clean                 # Perform  cargo clean  to delete all build files
    clean-test            # Delete test output files
    start-db              # Start a test database
    start-legacy          # Start a legacy test database
    docker-up name        # Start a specific test database, e.g. db or db-legacy
    stop                  # Stop the test database
    bench                 # Run benchmark tests
    test                  # Run all tests using a test database
    test-unit *ARGS       # Run Rust unit and doc tests (cargo test)
    test-int              # Run integration tests
    test-int-legacy       # Run integration tests using legacy database
    test-integration name # Run integration tests with the given docker compose target
    docker-build          # Build martin docker image
    docker-run *ARGS      # Build and run martin docker image
    git *ARGS             # Do any git command, ensuring that the testing environment is set up. Accepts the same arguments as git.
    git-pre-push          # These steps automatically run before git push via a git hook

Other useful commands

# Start db service
just start-db

# Run Martin server
DATABASE_URL=postgres://postgres@localhost/db cargo run

Open tests/debug.html for debugging. By default, martin will be available at localhost:3000

Make your changes and check that all the tests pass:

DATABASE_URL=postgres://postgres@localhost/db cargo test

You can also run benchmarks with

DATABASE_URL=postgres://postgres@localhost/db cargo bench

An HTML report displaying the results of the benchmark will be generated under target/criterion/report/index.html

Recipes

Using with DigitalOcean PostgreSQL

You can use martin with Managed PostgreSQL from DigitalOcean with the PostGIS extension.

First, you need to download the CA certificate and get your cluster connection string from the dashboard. After that, you can use the connection string and the CA certificate to connect to the database:

martin --ca-root-file ./ca-certificate.crt 'postgres://user:password@host:port/db?sslmode=require'

Using with Heroku PostgreSQL

You can use martin with Managed PostgreSQL from Heroku with the PostGIS extension:

heroku pg:psql -a APP_NAME -c 'create extension postgis'

Heroku's database certificate cannot be validated against the default trust store, so you may need to disable certificate validation with either the DANGER_ACCEPT_INVALID_CERTS environment variable:

DATABASE_URL=$(heroku config:get DATABASE_URL -a APP_NAME) DANGER_ACCEPT_INVALID_CERTS=true martin

or the --danger-accept-invalid-certs command-line argument:

martin --danger-accept-invalid-certs $(heroku config:get DATABASE_URL -a APP_NAME)