Skip to content

Commit

Permalink
lots of docs updates
Browse files Browse the repository at this point in the history
  • Loading branch information
nyurik committed Oct 31, 2023
1 parent 8b3ef58 commit 429d44b
Show file tree
Hide file tree
Showing 10 changed files with 219 additions and 141 deletions.
3 changes: 0 additions & 3 deletions docs/src/50-mbtiles.md

This file was deleted.

12 changes: 12 additions & 0 deletions docs/src/50-tools.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
# CLI Tools

Martin project contains additional tooling to help manage the data servable with Martin tile server.

## `mbtiles`
`mbtiles` is a small utility to interact with the `*.mbtiles` files from the command line. It allows users to examine, copy, validate, compare, and apply diffs between them.

Use `mbtiles --help` to see a list of available commands, and `mbtiles <command> --help` to see help for a specific command.

This tool can be installed by compiling the latest released version with `cargo install mbtiles`, or by downloading a pre-built binary from the [releases page](https://github.com/maplibre/martin/releases/latest).

The `mbtiles` utility builds on top of the [MBTiles specification](https://github.com/mapbox/mbtiles-spec). It adds a few additional conventions to ensure that the content of the tile data is valid, and can be used for reliable diffing and patching of the tilesets.
22 changes: 22 additions & 0 deletions docs/src/51-mbtiles-meta.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# `mbtiles` Metadata Access

## meta-all
Print all metadata values to stdout, as well as the results of tile detection. The format of the values printed is not stable, and should only be used for visual inspection.

```shell
mbtiles meta-all my_file.mbtiles
```

## meta-get
Retrieve raw metadata value by its name. The value is printed to stdout without any modifications. For example, to get the `description` value from an mbtiles file:

```shell
mbtiles meta-get my_file.mbtiles description
```

## meta-set
Set metadata value by its name, or delete the key if no value is supplied. For example, to set the `description` value to `A vector tile dataset`:

```shell
mbtiles meta-set my_file.mbtiles description "A vector tile dataset"
```
83 changes: 0 additions & 83 deletions docs/src/51-tools.md

This file was deleted.

57 changes: 57 additions & 0 deletions docs/src/52-mbtiles-copy.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
# Copying, Diffing, and Patching MBTiles

## `mbtiles copy`
Copy command copies an mbtiles file, optionally filtering its content by zoom levels.

```shell
mbtiles copy src_file.mbtiles dst_file.mbtiles \
--min-zoom 0 --max-zoom 10
```

This command can also be used to generate files of different [supported schema](##supported-schema).

```shell
mbtiles copy normalized.mbtiles dst.mbtiles \
--dst-mbttype flat-with-hash
```

## `mbtiles copy --diff-with-file`
Copy command can also be used to compare two mbtiles files and generate a delta (diff) file. The diff file can be applied to the `src_file.mbtiles` elsewhere, to avoid copying/transmitting the entire modified dataset. The delta file will contain all tiles that are different between the two files (modifications, insertions, and deletions as `NULL` values), for both the tile and metadata tables.

There is one exception: `agg_tiles_hash` metadata value will be renamed to `agg_tiles_hash_in_diff`, and a new `agg_tiles_hash` will be generated for the diff file itself. This is done to avoid confusion when applying the diff file to the original file, as the `agg_tiles_hash` value will be different after the diff is applied. The `apply-diff` command will automatically rename the `agg_tiles_hash_in_diff` value back to `agg_tiles_hash` when applying the diff.

```shell
mbtiles copy src_file.mbtiles diff_file.mbtiles \
--diff-with-file modified_file.mbtiles
```

## `mbtiles copy --apply-patch`

Copy a source file to destination while also applying the diff file generated by `copy --diff-with-file` command above to the destination mbtiles file. This allows safer application of the diff file, as the source file is not modified.

```shell
mbtiles copy src_file.mbtiles dst_file.mbtiles \
--apply-patch diff_file.mbtiles
```

## `mbtiles apply-patch`

Apply the diff file generated from `copy` command above to an mbtiles file. The diff file can be applied to the `src_file.mbtiles` elsewhere, to avoid copying/transmitting the entire modified dataset.

Note that the `agg_tiles_hash_in_diff` metadata value will be renamed to `agg_tiles_hash` when applying the diff. This is done to avoid confusion when applying the diff file to the original file, as the `agg_tiles_hash` value will be different after the diff is applied.

```shell
mbtiles apply_diff src_file.mbtiles diff_file.mbtiles
```

#### Applying diff with SQLite
Another way to apply the diff is to use the `sqlite3` command line tool directly. This SQL will delete all tiles from `src_file.mbtiles` that are set to `NULL` in `diff_file.mbtiles`, and then insert or update all new tiles from `diff_file.mbtiles` into `src_file.mbtiles`, where both files are of `flat` type. The name of the diff file is passed as a query parameter to the sqlite3 command line tool, and then used in the SQL statements. Note that this does not update the `agg_tiles_hash` metadata value, so it will be incorrect after the diff is applied.

```shell
sqlite3 src_file.mbtiles \
-bail \
-cmd ".parameter set @diffDbFilename diff_file.mbtiles" \
"ATTACH DATABASE @diffDbFilename AS diffDb;" \
"DELETE FROM tiles WHERE (zoom_level, tile_column, tile_row) IN (SELECT zoom_level, tile_column, tile_row FROM diffDb.tiles WHERE tile_data ISNULL);" \
"INSERT OR REPLACE INTO tiles (zoom_level, tile_column, tile_row, tile_data) SELECT * FROM diffDb.tiles WHERE tile_data NOTNULL;"
```
51 changes: 0 additions & 51 deletions docs/src/52-schemas.md

This file was deleted.

38 changes: 38 additions & 0 deletions docs/src/53-mbtiles-validation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
# MBTiles Validation

The original [MBTiles specification](https://github.com/mapbox/mbtiles-spec#readme) does not provide any guarantees for the content of the tile data in MBTiles. `mbtiles validate` assumes a few additional conventions and uses them to ensure that the content of the tile data is valid performing several validation steps. If the file is not valid, the command will print an error message and exit with a non-zero exit code.

```shell
mbtiles validate src_file.mbtiles
```

## SQLite Integrity check
The `validate` command will run `PRAGMA integrity_check` on the file, and will fail if the result is not `ok`. The `--integrity-check` flag can be used to disable this check, or to make it more thorow with `full` value. Default is `quick`.

## Schema check
The `validate` command will verify that the `tiles` table/view exists, and that it has the expected columns and indexes. It will also verify that the `metadata` table/view exists, and that it has the expected columns and indexes.

## Per-tile validation
If the `.mbtiles` file uses [flat_with_hash](54-mbtiles-schema.md#flat-with-hash) or [normalized](54-mbtiles-schema.md#normalized) schema, the `validate` command will verify that the MD5 hash of the `tile_data` column matches the `tile_hash` or `tile_id` columns (depending on the schema).

A typical Normalized schema generated by tools like [tilelive-copy](https://github.com/mapbox/TileLive#bintilelive-copy) use MD5 hash in the `tile_id` column. The Martin's `mbtiles` tool can use this hash to verify the content of each tile. We also define a new [flat-with-hash](54-mbtiles-schema.md#flat-with-hash) schema that stores the hash and tile data in the same table, allowing per-tile validation without the multiple table layout.

Per-tile validation is not available for the `flat` schema, and will be skipped.

## Aggregate Content Validation

Per-tile validation will catch individual tile corruption, but it will not detect overall datastore corruption such as missing tiles, tiles that should not exist, or tiles with incorrect z/x/y values. For that, the `mbtiles` tool defines a new metadata value called `agg_tiles_hash`.

The value is computed by hashing the combined value for all rows in the `tiles` table/view, ordered by z,x,y. The value is computed using the following SQL expression, which uses a custom `md5_concat_hex` function from [sqlite-hashes crate](https://crates.io/crates/sqlite-hashes):

```sql, ignore
md5_concat_hex(
CAST(zoom_level AS TEXT),
CAST(tile_column AS TEXT),
CAST(tile_row AS TEXT),
tile_data)
```

In case there are no rows or all are NULL, the hash value of an empty string is used. Note that SQLite allows any value type to be stored as in any column, so if `tile_data` accidentally contains non-blob/text/null value, validation will fail.

The `mbtiles` tool will compute `agg_tiles_hash` value when copying or validating mbtiles files. Use `--update-agg-tiles-hash` to force the value to be updated, even if it is incorrect or does not exist.
80 changes: 80 additions & 0 deletions docs/src/54-mbtiles-schema.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# MBTiles Schemas
The `mbtiles` tool builds on top of the original [MBTiles specification](https://github.com/mapbox/mbtiles-spec#readme) by specifying three different kinds of schema for `tiles` data: `flat`, `flat-with-hash`, and `normalized`. The `mbtiles` tool can convert between these schemas, and can also generate a diff between two files of any schemas, as well as merge multiple schema files into one file.

## flat
Flat schema is the closest to the original MBTiles specification. It stores all tiles in a single table. This schema is the most efficient when the tileset contains no duplicate tiles.

```sql, ignore
CREATE TABLE tiles (
zoom_level INTEGER,
tile_column INTEGER,
tile_row INTEGER,
tile_data BLOB);
CREATE UNIQUE INDEX tile_index on tiles (
zoom_level, tile_column, tile_row);
```

## flat-with-hash
Similar to the `flat` schema, but also includes a `tile_hash` column that contains a hash value of the `tile_data` column. Use this schema when the tileset has no duplicate tiles, but you still want to be able to validate the content of each tile individually.

```sql, ignore
CREATE TABLE tiles_with_hash (
zoom_level INTEGER NOT NULL,
tile_column INTEGER NOT NULL,
tile_row INTEGER NOT NULL,
tile_data BLOB,
tile_hash TEXT);
CREATE UNIQUE INDEX tiles_with_hash_index on tiles_with_hash (
zoom_level, tile_column, tile_row);
CREATE VIEW tiles AS
SELECT zoom_level, tile_column, tile_row, tile_data
FROM tiles_with_hash;
```

## normalized
Normalized schema is the most efficient when the tileset contains duplicate tiles. It stores all tile blobs in the `images` table, and stores the tile Z,X,Y coordinates in a `map` table. The `map` table contains a `tile_id` column that is a foreign key to the `images` table. The `tile_id` column is a hash of the `tile_data` column, making it possible to both validate each individual tile like in the `flat-with-hash` schema, and also to optimize storage by storing each unique tile only once.

```sql, ignore
CREATE TABLE map (
zoom_level INTEGER,
tile_column INTEGER,
tile_row INTEGER,
tile_id TEXT);
CREATE TABLE images (
tile_id TEXT,
tile_data BLOB);
CREATE UNIQUE INDEX map_index ON map (
zoom_level, tile_column, tile_row);
CREATE UNIQUE INDEX images_id ON images (
tile_id);
CREATE VIEW tiles AS
SELECT
map.zoom_level AS zoom_level,
map.tile_column AS tile_column,
map.tile_row AS tile_row,
images.tile_data AS tile_data
FROM
map JOIN images
ON images.tile_id = map.tile_id;
```

Optionally, `.mbtiles` files with `normalized` schema can include a `tiles_with_hash` view. All `normalized` files created by the `mbtiles` tool will contain this view.

```sql, ignore
CREATE VIEW tiles_with_hash AS
SELECT
map.zoom_level AS zoom_level,
map.tile_column AS tile_column,
map.tile_row AS tile_row,
images.tile_data AS tile_data,
images.tile_id AS tile_hash
FROM
map JOIN images
ON map.tile_id = images.tile_id;
```
8 changes: 5 additions & 3 deletions docs/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,9 @@
- [Using with deck.gl](43-using-with-deck-gl.md)
- [Using with Mapbox](44-using-with-mapbox.md)
- [Recipes](45-recipes.md)
- [MBTiles](50-mbtiles.md)
- [Tools](51-tools.md)
- [Supported Schemas](52-schemas.md)
- [Tools](50-tools.md)
- [MBTiles Metadata](51-mbtiles-meta.md)
- [MBTiles Copying / Diffing](52-mbtiles-copy.md)
- [MBTiles Validation](53-mbtiles-validation.md)
- [MBTiles Schemas](54-mbtiles-schema.md)
- [Development](60-development.md)
6 changes: 5 additions & 1 deletion justfile
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,11 @@ export CARGO_TERM_COLOR := "always"

# Start Martin server
run *ARGS:
cargo run -- {{ ARGS }}
cargo run -p martin -- {{ ARGS }}

# Run mbtiles command
mbtiles *ARGS:
cargo run -p mbtiles -- {{ ARGS }}

# Start release-compiled Martin server and a test database
run-release *ARGS: start
Expand Down

0 comments on commit 429d44b

Please sign in to comment.