Add --accumulate-attribute to tippecanoe-overzoom (#189)
* Starting to factor out attribute accumulation into its own file

* Continuing to factor out attribute accumulation

* Reduce duplicate code

* Plumbing the accumulate-attribute option around

* Call the attribute accumulator

* Test that accumulation works

* Add missing #includes

* Don't sort within individual multiplier clusters

Doing so throws off the spatial distribution of the low zooms

* Docs and changelog

* Add comments
e-n-f authored Jan 23, 2024
1 parent f957f30 commit cbf2227
Showing 17 changed files with 481 additions and 419 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,8 @@
+# 2.41.2
+
+* Add --accumulate-attribute to tippecanoe-overzoom
+* Go back to ordering features within each multiplier cluster spatially, not in the order specified for tile feature order
+
# 2.41.1

* Make --preserve-input-order, --order-by, --order-descending-by, --order-smallest-first, and --order-largest-first cooperate with --retain-points-multiplier. The clusters will be ordered by their lead feature in the specified sequence. The other features in each cluster will continue to be physically near the lead feature, but ordered as specified within the cluster.
11 changes: 8 additions & 3 deletions Makefile
@@ -58,7 +58,7 @@ C = $(wildcard *.c) $(wildcard *.cpp)
INCLUDES = -I/usr/local/include -I.
LIBS = -L/usr/local/lib

-tippecanoe: geojson.o jsonpull/jsonpull.o tile.o pool.o mbtiles.o geometry.o projection.o memfile.o mvt.o serial.o main.o text.o dirtiles.o pmtiles_file.o plugin.o read_json.o write_json.o geobuf.o flatgeobuf.o evaluator.o geocsv.o csv.o geojson-loop.o json_logger.o visvalingam.o compression.o clip.o sort.o
+tippecanoe: geojson.o jsonpull/jsonpull.o tile.o pool.o mbtiles.o geometry.o projection.o memfile.o mvt.o serial.o main.o text.o dirtiles.o pmtiles_file.o plugin.o read_json.o write_json.o geobuf.o flatgeobuf.o evaluator.o geocsv.o csv.o geojson-loop.o json_logger.o visvalingam.o compression.o clip.o sort.o attribute.o
$(CXX) $(PG) $(LIBS) $(FINAL_FLAGS) $(CXXFLAGS) -o $@ $^ $(LDFLAGS) -lm -lz -lsqlite3 -lpthread

tippecanoe-enumerate: enumerate.o
@@ -67,7 +67,7 @@ tippecanoe-enumerate: enumerate.o
tippecanoe-decode: decode.o projection.o mvt.o write_json.o text.o jsonpull/jsonpull.o dirtiles.o pmtiles_file.o
$(CXX) $(PG) $(LIBS) $(FINAL_FLAGS) $(CXXFLAGS) -o $@ $^ $(LDFLAGS) -lm -lz -lsqlite3

-tile-join: tile-join.o projection.o mbtiles.o mvt.o memfile.o dirtiles.o jsonpull/jsonpull.o text.o evaluator.o csv.o write_json.o pmtiles_file.o clip.o
+tile-join: tile-join.o projection.o mbtiles.o mvt.o memfile.o dirtiles.o jsonpull/jsonpull.o text.o evaluator.o csv.o write_json.o pmtiles_file.o clip.o attribute.o
$(CXX) $(PG) $(LIBS) $(FINAL_FLAGS) $(CXXFLAGS) -o $@ $^ $(LDFLAGS) -lm -lz -lsqlite3 -lpthread

tippecanoe-json-tool: jsontool.o jsonpull/jsonpull.o csv.o text.o geojson-loop.o
@@ -76,7 +76,7 @@ tippecanoe-json-tool: jsontool.o jsonpull/jsonpull.o csv.o text.o geojson-loop.o
unit: unit.o text.o sort.o mvt.o
$(CXX) $(PG) $(LIBS) $(FINAL_FLAGS) $(CXXFLAGS) -o $@ $^ $(LDFLAGS) -lm -lz -lsqlite3 -lpthread

-tippecanoe-overzoom: overzoom.o mvt.o clip.o evaluator.o jsonpull/jsonpull.o text.o
+tippecanoe-overzoom: overzoom.o mvt.o clip.o evaluator.o jsonpull/jsonpull.o text.o attribute.o
$(CXX) $(PG) $(LIBS) $(FINAL_FLAGS) $(CXXFLAGS) -o $@ $^ $(LDFLAGS) -lm -lz -lsqlite3 -lpthread

-include $(wildcard *.d)
@@ -296,6 +296,11 @@ overzoom-test: tippecanoe-overzoom
./tippecanoe-decode tests/pbf/0-0-0-pop-filtered.pbf 0 0 0 > tests/pbf/0-0-0-pop-filtered.pbf.json.check
cmp tests/pbf/0-0-0-pop-filtered.pbf.json.check tests/pbf/0-0-0-pop-filtered.pbf.json
rm tests/pbf/0-0-0-pop-filtered.pbf tests/pbf/0-0-0-pop-filtered.pbf.json.check
+# Thinning with accumulation
+./tippecanoe-overzoom -y NAME -m --accumulate-attribute NAME:comma -o tests/pbf/0-0-0-pop-accum.pbf tests/pbf/0-0-0-pop.pbf 0/0/0 0/0/0
+./tippecanoe-decode tests/pbf/0-0-0-pop-accum.pbf 0 0 0 > tests/pbf/0-0-0-pop-accum.pbf.json.check
+cmp tests/pbf/0-0-0-pop-accum.pbf.json.check tests/pbf/0-0-0-pop-accum.pbf.json
+rm tests/pbf/0-0-0-pop-accum.pbf tests/pbf/0-0-0-pop-accum.pbf.json.check
# Filtering
# 243 features in the source tile tests/pbf/0-0-0-pop.pbf
# 27 of them match the filter and are retained
1 change: 1 addition & 0 deletions README.md
@@ -997,3 +997,4 @@ reads tile `inz/inx/iny` of `in.mvt.gz` and produces tile `outz/outx/outy` of `o
* `-j` *filter*: Filter features using the same expression syntax as in tippecanoe.
* `-m`: If a tile was created with the `--retain-points-multiplier` option, thin the tile back down to its normal feature count during overzooming. The first feature from each cluster will be retained, unless `-j` is used to specify a filter, in which case the first matching filter from each cluster will be retained instead.
* `--preserve-input-order`: Restore a set of filtered features to its original input order
+ * `--accumulate-attribute`: Behaves as in `tippecanoe` to sum attributes from the features of a multiplier cluster that are not included in the final output. The attributes from features that are filtered away with `-j` are *not* accumulated onto the output feature.
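The option takes its argument in `name:method` form, as in the `NAME:comma` test added to the Makefile. The following is a minimal sketch of that split, assuming the simplest case only (no JSON object form). The function names `split_accumulate_spec` and `is_known_method` are hypothetical and are not part of the commit; the list of methods is taken from the error message in `attribute.cpp`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: split "NAME:comma" at the first ':' into an
// attribute name and a method name. Returns false if there is no ':'.
bool split_accumulate_spec(const std::string &arg, std::string &name, std::string &method) {
    std::string::size_type colon = arg.find(':');
    if (colon == std::string::npos) {
        return false;  // not in name:method form
    }
    name = arg.substr(0, colon);
    method = arg.substr(colon + 1);
    return true;
}

// Hypothetical helper: true if the method is one of the names that
// the commit's error message lists as valid.
bool is_known_method(const std::string &method) {
    static const char *known[] = {"sum", "product", "mean", "max", "min", "concat", "comma"};
    for (const char *k : known) {
        if (method == k) {
            return true;
        }
    }
    return false;
}
```

Splitting at the first colon matches the `strchr(arg, ':')` call in the commit, so any later colons end up in the method string and would be rejected as an unknown method.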
146 changes: 146 additions & 0 deletions attribute.cpp
@@ -0,0 +1,146 @@
#include <string>
#include <map>
#include "attribute.hpp"
#include "errors.hpp"
#include "serial.hpp"
#include "jsonpull/jsonpull.h"
#include "milo/dtoa_milo.h"

void set_attribute_accum(std::map<std::string, attribute_op> &attribute_accum, std::string name, std::string type) {
attribute_op t;

if (type == "sum") {
t = op_sum;
} else if (type == "product") {
t = op_product;
} else if (type == "mean") {
t = op_mean;
} else if (type == "max") {
t = op_max;
} else if (type == "min") {
t = op_min;
} else if (type == "concat") {
t = op_concat;
} else if (type == "comma") {
t = op_comma;
} else {
fprintf(stderr, "Attribute method (%s) must be sum, product, mean, max, min, concat, or comma\n", type.c_str());
exit(EXIT_ARGS);
}

attribute_accum.insert(std::pair<std::string, attribute_op>(name, t));
}

void set_attribute_accum(std::map<std::string, attribute_op> &attribute_accum, const char *arg, char **argv) {
if (*arg == '{') {
json_pull *jp = json_begin_string(arg);
json_object *o = json_read_tree(jp);

if (o == NULL) {
fprintf(stderr, "%s: -E%s: %s\n", *argv, arg, jp->error);
exit(EXIT_JSON);
}

if (o->type != JSON_HASH) {
fprintf(stderr, "%s: -E%s: not a JSON object\n", *argv, arg);
exit(EXIT_JSON);
}

for (size_t i = 0; i < o->value.object.length; i++) {
json_object *k = o->value.object.keys[i];
json_object *v = o->value.object.values[i];

if (k->type != JSON_STRING) {
fprintf(stderr, "%s: -E%s: key %zu not a string\n", *argv, arg, i);
exit(EXIT_JSON);
}
if (v->type != JSON_STRING) {
fprintf(stderr, "%s: -E%s: value %zu not a string\n", *argv, arg, i);
exit(EXIT_JSON);
}

set_attribute_accum(attribute_accum, k->value.string.string, v->value.string.string);
}

json_free(o);
json_end(jp);
return;
}

const char *s = strchr(arg, ':');
if (s == NULL) {
fprintf(stderr, "-E%s option must be in the form -Ename:method\n", arg);
exit(EXIT_ARGS);
}

std::string name = std::string(arg, s - arg);
std::string type = std::string(s + 1);

set_attribute_accum(attribute_accum, name, type);
}

void preserve_attribute(attribute_op op, std::string &key, serial_val &val, std::vector<std::string> &full_keys, std::vector<serial_val> &full_values, std::map<std::string, accum_state> &attribute_accum_state) {
for (size_t i = 0; i < full_keys.size(); i++) {
if (key == full_keys[i]) {
switch (op) {
case op_sum:
full_values[i].s = milo::dtoa_milo(atof(full_values[i].s.c_str()) + atof(val.s.c_str()));
full_values[i].type = mvt_double;
break;

case op_product:
full_values[i].s = milo::dtoa_milo(atof(full_values[i].s.c_str()) * atof(val.s.c_str()));
full_values[i].type = mvt_double;
break;

case op_max: {
double existing = atof(full_values[i].s.c_str());
double maybe = atof(val.s.c_str());
if (maybe > existing) {
full_values[i].s = val.s.c_str();
full_values[i].type = mvt_double;
}
break;
}

case op_min: {
double existing = atof(full_values[i].s.c_str());
double maybe = atof(val.s.c_str());
if (maybe < existing) {
full_values[i].s = val.s.c_str();
full_values[i].type = mvt_double;
}
break;
}

case op_mean: {
auto state = attribute_accum_state.find(key);
if (state == attribute_accum_state.end()) {
accum_state s;
s.sum = atof(full_values[i].s.c_str()) + atof(val.s.c_str());
s.count = 2;
attribute_accum_state.insert(std::pair<std::string, accum_state>(key, s));

full_values[i].s = milo::dtoa_milo(s.sum / s.count);
} else {
state->second.sum += atof(val.s.c_str());
state->second.count += 1;

full_values[i].s = milo::dtoa_milo(state->second.sum / state->second.count);
}
break;
}

case op_concat:
full_values[i].s += val.s;
full_values[i].type = mvt_string;
break;

case op_comma:
full_values[i].s += std::string(",") + val.s;
full_values[i].type = mvt_string;
break;
}
}
}
}
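To illustrate what `preserve_attribute` does across a whole multiplier cluster, here is a simplified sketch of the `comma` and `mean` cases. It assumes the attribute values arrive as plain strings with the retained feature first; `RunningMean`, `accumulate_comma`, and `accumulate_mean` are hypothetical names standing in for the commit's `accum_state` and its `switch` cases. Like the commit, it parses numbers with `atof`, so a non-numeric string counts as 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for the commit's accum_state: a running sum and count.
struct RunningMean {
    double sum = 0;
    double count = 0;
};

// "comma": append each dropped feature's value to the retained feature's
// value (values[0]), separated by commas.
std::string accumulate_comma(const std::vector<std::string> &values) {
    assert(!values.empty());
    std::string out = values[0];
    for (std::size_t i = 1; i < values.size(); i++) {
        out += "," + values[i];
    }
    return out;
}

// "mean": keep a running sum and count so the mean can be updated
// each time another feature of the cluster is folded in.
double accumulate_mean(const std::vector<std::string> &values) {
    assert(!values.empty());
    RunningMean state;
    state.sum = std::atof(values[0].c_str());
    state.count = 1;
    for (std::size_t i = 1; i < values.size(); i++) {
        state.sum += std::atof(values[i].c_str());
        state.count += 1;
    }
    return state.sum / state.count;
}
```

The commit creates its state lazily on the second value (with `count = 2`) rather than seeding it from the first, but the resulting mean should be the same.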
28 changes: 28 additions & 0 deletions attribute.hpp
@@ -0,0 +1,28 @@
#ifndef ATTRIBUTE_HPP
#define ATTRIBUTE_HPP

#include <vector>
#include <map>

enum attribute_op {
op_sum,
op_product,
op_mean,
op_concat,
op_comma,
op_max,
op_min,
};

struct accum_state {
double sum = 0;
double count = 0;
};

struct serial_val;

void set_attribute_accum(std::map<std::string, attribute_op> &attribute_accum, std::string name, std::string type);
void set_attribute_accum(std::map<std::string, attribute_op> &attribute_accum, const char *arg, char **argv);
void preserve_attribute(attribute_op op, std::string &key, serial_val &val, std::vector<std::string> &full_keys, std::vector<serial_val> &full_values, std::map<std::string, accum_state> &attribute_accum_state);

#endif
72 changes: 57 additions & 15 deletions clip.cpp
@@ -9,6 +9,7 @@
#include "mvt.hpp"
#include "evaluator.hpp"
#include "serial.hpp"
+#include "attribute.hpp"

static std::vector<std::pair<double, double>> clip_poly1(std::vector<std::pair<double, double>> &geom,
long long minx, long long miny, long long maxx, long long maxy,
@@ -757,7 +758,7 @@ static std::vector<std::pair<double, double>> clip_poly1(std::vector<std::pair<d
std::string overzoom(std::string s, int oz, int ox, int oy, int nz, int nx, int ny,
int detail, int buffer, std::set<std::string> const &keep, bool do_compress,
std::vector<std::pair<unsigned, unsigned>> *next_overzoomed_tiles,
-bool demultiply, json_object *filter, bool preserve_input_order) {
+bool demultiply, json_object *filter, bool preserve_input_order, std::map<std::string, attribute_op> const &attribute_accum) {
mvt_tile tile;

try {
@@ -771,7 +772,7 @@ std::string overzoom(std::string s, int oz, int ox, int oy, int nz, int nx, int
exit(EXIT_PROTOBUF);
}

-return overzoom(tile, oz, ox, oy, nz, nx, ny, detail, buffer, keep, do_compress, next_overzoomed_tiles, demultiply, filter, preserve_input_order);
+return overzoom(tile, oz, ox, oy, nz, nx, ny, detail, buffer, keep, do_compress, next_overzoomed_tiles, demultiply, filter, preserve_input_order, attribute_accum);
}

struct tile_feature {
@@ -784,28 +785,69 @@ struct tile_feature {
size_t seq = 0;
};

-void feature_out(tile_feature const &feature, mvt_layer &outlayer, std::set<std::string> const &keep) {
+static void feature_out(std::vector<tile_feature> const &features, mvt_layer &outlayer, std::set<std::string> const &keep, std::map<std::string, attribute_op> const &attribute_accum) {
// Add geometry to output feature

mvt_feature outfeature;
-outfeature.type = feature.t;
-for (auto const &g : feature.geom) {
+outfeature.type = features[0].t;
+for (auto const &g : features[0].geom) {
outfeature.geometry.emplace_back(g.op, g.x, g.y);
}

// ID and attributes, if it didn't get clipped away

if (outfeature.geometry.size() > 0) {
-if (feature.has_id) {
+if (features[0].has_id) {
outfeature.has_id = true;
-outfeature.id = feature.id;
+outfeature.id = features[0].id;
}

-outfeature.seq = feature.seq;
+outfeature.seq = features[0].seq;

-for (size_t i = 0; i + 1 < feature.tags.size(); i += 2) {
-if (keep.size() == 0 || keep.find(feature.layer->keys[feature.tags[i]]) != keep.end()) {
-outlayer.tag(outfeature, feature.layer->keys[feature.tags[i]], feature.layer->values[feature.tags[i + 1]]);
+if (attribute_accum.size() > 0) {
+// convert the attributes of the output feature
+// from mvt_value to serial_val so they can have
+// attributes from the other features of the
+// multiplier cluster accumulated onto them
+
+std::map<std::string, accum_state> attribute_accum_state;
+std::vector<std::string> full_keys;
+std::vector<serial_val> full_values;
+
+for (size_t i = 0; i + 1 < features[0].tags.size(); i += 2) {
+full_keys.push_back(features[0].layer->keys[features[0].tags[i]]);
+full_values.push_back(mvt_value_to_serial_val(features[0].layer->values[features[0].tags[i + 1]]));
+}
+
+// accumulate whatever attributes are specified to be accumulated
+// onto the feature that will survive into the output, from the
+// features that will not
+
+for (size_t i = 1; i < features.size(); i++) {
+for (size_t j = 0; j + 1 < features[i].tags.size(); j += 2) {
+std::string key = features[i].layer->keys[features[i].tags[j]];
+
+auto f = attribute_accum.find(key);
+if (f != attribute_accum.end()) {
+serial_val val = mvt_value_to_serial_val(features[i].layer->values[features[i].tags[j + 1]]);
+preserve_attribute(f->second, key, val, full_keys, full_values, attribute_accum_state);
+}
+}
+}
+
+// convert the final attributes back to mvt_value
+// and tag them onto the output feature
+
+for (size_t i = 0; i < full_keys.size(); i++) {
+if (keep.size() == 0 || keep.find(full_keys[i]) != keep.end()) {
+outlayer.tag(outfeature, full_keys[i], stringified_to_mvt_value(full_values[i].type, full_values[i].s.c_str()));
+}
+}
+} else {
+for (size_t i = 0; i + 1 < features[0].tags.size(); i += 2) {
+if (keep.size() == 0 || keep.find(features[0].layer->keys[features[0].tags[i]]) != keep.end()) {
+outlayer.tag(outfeature, features[0].layer->keys[features[0].tags[i]], features[0].layer->values[features[0].tags[i + 1]]);
+}
}
}

@@ -822,7 +864,7 @@ static struct preservecmp {
std::string overzoom(mvt_tile tile, int oz, int ox, int oy, int nz, int nx, int ny,
int detail, int buffer, std::set<std::string> const &keep, bool do_compress,
std::vector<std::pair<unsigned, unsigned>> *next_overzoomed_tiles,
-bool demultiply, json_object *filter, bool preserve_input_order) {
+bool demultiply, json_object *filter, bool preserve_input_order, std::map<std::string, attribute_op> const &attribute_accum) {
mvt_tile outtile;

for (auto const &layer : tile.layers) {
@@ -863,7 +905,7 @@ std::string overzoom(mvt_tile tile, int oz, int ox, int oy, int nz, int nx, int

if (flush_multiplier_cluster) {
if (pending_tile_features.size() > 0) {
-feature_out(pending_tile_features[0], outlayer, keep);
+feature_out(pending_tile_features, outlayer, keep, attribute_accum);
pending_tile_features.clear();
}
}
@@ -958,7 +1000,7 @@ std::string overzoom(mvt_tile tile, int oz, int ox, int oy, int nz, int nx, int
}

if (pending_tile_features.size() > 0) {
-feature_out(pending_tile_features[0], outlayer, keep);
+feature_out(pending_tile_features, outlayer, keep, attribute_accum);
pending_tile_features.clear();
}

@@ -985,7 +1027,7 @@ std::string overzoom(mvt_tile tile, int oz, int ox, int oy, int nz, int nx, int
std::string child = overzoom(outtile, nz, nx, ny,
nz + 1, nx * 2 + x, ny * 2 + y,
detail, buffer, keep, false, NULL,
-demultiply, filter, preserve_input_order);
+demultiply, filter, preserve_input_order, attribute_accum);
if (child.size() > 0) {
next_overzoomed_tiles->emplace_back(nx * 2 + x, ny * 2 + y);
}
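The new `feature_out` applies the `keep` set only after accumulation, when the final attributes are tagged onto the output feature. An empty set means every attribute is kept. A minimal sketch of that selection step, assuming string-valued attributes in parallel key and value vectors, follows; `select_kept` is a hypothetical name and does not appear in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: return the (key, value) pairs that pass the keep set.
// An empty keep set keeps everything, mirroring the
// `keep.size() == 0 || keep.find(...) != keep.end()` test in the diff.
std::vector<std::pair<std::string, std::string>> select_kept(
    const std::vector<std::string> &keys,
    const std::vector<std::string> &values,
    const std::set<std::string> &keep) {
    assert(keys.size() == values.size());
    std::vector<std::pair<std::string, std::string>> out;
    for (std::size_t i = 0; i < keys.size(); i++) {
        if (keep.empty() || keep.count(keys[i]) > 0) {
            out.emplace_back(keys[i], values[i]);
        }
    }
    return out;
}
```

In the commit, the accumulation loop itself does not consult `keep`, so an attribute that is excluded from the output can still be accumulated before it is dropped at this step.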
7 changes: 5 additions & 2 deletions geometry.hpp
@@ -9,6 +9,7 @@
#include <stdio.h>
#include <mvt.hpp>
#include "jsonpull/jsonpull.h"
+#include "attribute.hpp"

#define VT_POINT 1
#define VT_LINE 2
@@ -102,11 +103,13 @@ double distance_from_line(long long point_x, long long point_y, long long segA_x
std::string overzoom(mvt_tile tile, int oz, int ox, int oy, int nz, int nx, int ny,
int detail, int buffer, std::set<std::string> const &keep, bool do_compress,
std::vector<std::pair<unsigned, unsigned>> *next_overzoomed_tiles,
-bool demultiply, json_object *filter, bool preserve_input_order);
+bool demultiply, json_object *filter, bool preserve_input_order,
+std::map<std::string, attribute_op> const &attribute_accum);

std::string overzoom(std::string s, int oz, int ox, int oy, int nz, int nx, int ny,
int detail, int buffer, std::set<std::string> const &keep, bool do_compress,
std::vector<std::pair<unsigned, unsigned>> *next_overzoomed_tiles,
-bool demultiply, json_object *filter, bool preserve_input_order);
+bool demultiply, json_object *filter, bool preserve_input_order,
+std::map<std::string, attribute_op> const &attribute_accum);

#endif