I've built a server for generating OpenStreetMap vector tiles on demand from a GeoDesk database, which is barely larger than an .osm.pbf (100GB vs. 80GB for the current planet.osm.pbf) - much smaller than a PostGIS instance: https://github.com/styluslabs/geodesk-tiles
another option that would have been interesting to see here is serving a PostGIS GeoJSON export -> tippecanoe encode. Tippecanoe is super fast, parallelizes well, and is built solely for generating vector tile data (with lots of configurable options that PostGIS lacks)
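For anyone curious what that pipeline looks like, here's a rough sketch of the encode step. The file and layer names are placeholders; the flags used (`-o`, `-zg`, `--drop-densest-as-needed`, `-l`) are documented tippecanoe options:

```python
# Sketch of the GeoJSON -> tippecanoe step of such a pipeline.
# File names and the layer name are made up for illustration.

def tippecanoe_cmd(geojson_in: str, mbtiles_out: str, layer: str) -> list[str]:
    """Build a tippecanoe invocation that encodes GeoJSON into vector tiles."""
    return [
        "tippecanoe",
        "-o", mbtiles_out,            # output .mbtiles file
        "-zg",                        # guess an appropriate max zoom from the data
        "--drop-densest-as-needed",   # thin dense features so tiles stay small
        "-l", layer,                  # layer name inside the tiles
        geojson_in,
    ]

cmd = tippecanoe_cmd("export.geojson", "tiles.mbtiles", "roads")
print(" ".join(cmd))
```

You'd run that once after each PostGIS export and then serve the resulting .mbtiles statically, which is the pre-generated model rather than the on-the-fly model benchmarked here.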
The term "serving" is a bit misleading here. Most of the time, vector tile servers serve pre-generated tiles, which is extremely fast. This analysis is about generating tiles on the fly from PostGIS through a custom web server.
Yup, but it's super impressive just how much faster Martin was than all the competition, by significant margins, with BBOX (Rust) and Tegola (Go) trailing at ~2-4x slower. That suggests the authors of Martin really optimized the data structures and algorithms to reach a new Pareto frontier. Neat - it would be nice to have an accessible summary of the tricks that make it so fast and that the competitors are missing.
The trick that makes Martin so fast is not doing any geospatial processing itself, and just focusing on making quick, non-blocking requests to Postgres. All geospatial processing is done by PostGIS, which essentially wraps the C++ GEOS library (by far the most comprehensive and well-optimized geospatial processing library).
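Concretely, servers in this category mostly just template a query around PostGIS functions like ST_TileEnvelope and ST_AsMVT and stream the result back. The envelope math PostGIS does is trivial; here's a sketch of what ST_TileEnvelope computes (the table/layer names in the query are made up, and this is a simplified version of what any given server actually sends):

```python
# Web Mercator (EPSG:3857) bounds of a z/x/y tile, mirroring what
# PostGIS's ST_TileEnvelope(z, x, y) returns. y is counted from the top.

HALF_EXTENT = 20037508.342789244  # half the Web Mercator world width, in meters

def tile_envelope(z: int, x: int, y: int) -> tuple[float, float, float, float]:
    """Return (xmin, ymin, xmax, ymax) for the given tile."""
    size = 2 * HALF_EXTENT / (1 << z)   # side length of one tile at zoom z
    xmin = -HALF_EXTENT + x * size
    ymax = HALF_EXTENT - y * size
    return (xmin, ymax - size, xmin + size, ymax)

# A tile server's query then looks roughly like this (illustrative only):
MVT_QUERY = """
SELECT ST_AsMVT(q, 'roads') FROM (
  SELECT ST_AsMVTGeom(geom, ST_TileEnvelope(%(z)s, %(x)s, %(y)s)) AS geom, name
  FROM roads
  WHERE geom && ST_TileEnvelope(%(z)s, %(x)s, %(y)s)
) AS q;
"""
```

So the server's own job is little more than parsing /z/x/y from the URL, running that query, and writing the bytes back - which is why connection handling and request scheduling dominate the differences.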
Martin has an in-memory tile cache, which probably makes a difference: https://github.com/maplibre/martin/pull/1105. BBOX caches to a file instead.
The benchmarking repository has config files used for the test, and they did not use the tile cache feature.
It defaults to 512MiB unless explicitly configured to 0, which it isn't in the repository.
Oh that’s interesting, and I’m actually kinda peeved by any database-connected system that caches responses by default. Caching should be reserved as a performance optimization with serious correctness tradeoffs.
Regardless, I don’t think caching came into play here, at least according to how I’m reading this repository. I would expect cold caches for everything.
Yeah, that's definitely interesting - I'm surprised there is so much room for variation considering PostGIS is (if I'm not mistaken) doing most of the work.
I couldn't find any description of what test 1, 2, 3 etc actually are though.
Not sure I agree; it sounds like the vector tiles are generated in advance of testing the servers. This description is from the linked GitHub:
> six open-source vector tiles servers (BBOX, Ldproxy, Martin, pg_tileserv, Tegola, and TiPg) are set up and configured using Docker in a public cloud. Vector tiles are created for each server from the vector data of the PostGIS database. Various test scenarios with Apache JMeter are used to determine which server can deliver the vector tiles the fastest.
They all seem to connect to PostGIS: https://github.com/FabianRechsteiner/vector-tiles-benchmark/...
Yes, true. I had the impression that the tiles themselves were being stored as geometric data in the postgres DB, then fetched and served. But I might have been confused by the article starting "Once you have created your vector tiles...". The GitHub page is a little ambiguous tbh.
>The GitHub page is a little ambiguous tbh.
Agreed.
okay, do they mean vectors, or tiles, because that's like saying "serving PNG JPEGs" or "serving JPEG PNGs". Some servers chuck back /a picture/, some servers chuck back /an SVG/ or line data.
They mean vector tiles: tiles of vector geometry, usually map data (or other geographic data). They're so named because they're a vectorized replacement for raster tiles, which were PNGs. If the server chucks back a picture, it's not a vector tile server.
In GIS world, a vector tile is a chunk of geographic data (the vectors) limited to a geographic region (the tile boundaries which fit into the projected checkerboard of your map).
You use a vector tile instead of a png or jpeg tile because you don’t want an image representation of the data, you want the raw “vector” data so you can style it, search it, and do other things with it on client devices.
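The "projected checkerboard" mentioned above is the standard slippy-map scheme: at zoom z the world is a 2^z by 2^z grid, and a lon/lat maps to a tile index with a well-known formula (this is the generic scheme, not anything specific to the servers benchmarked here):

```python
import math

def lonlat_to_tile(lon: float, lat: float, z: int) -> tuple[int, int]:
    """Map a WGS84 lon/lat to slippy-map tile indices at zoom z.

    x runs west -> east, y runs north -> south (tile 0,0 is top-left).
    """
    n = 1 << z  # number of tiles per axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y
```

For example, central London at zoom 10 lands in tile 511/340, so a client asks the tile server for /10/511/340 and gets back the vector data for just that patch of the map.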