Convert GPX/KML to geojson #61
Per https://github.com/mapbox/unpacker/issues/907, this PR just needs to run mapnik-omnivore and add the output as a JSON file to the bundle (for both KML/GPX preprocessor and geojson-index preprocessor). Happy to jump on this tomorrow. |
The togeojson KML tests fail 50% of the time locally with the following error:
Going to split each fixture into its own tape test as a first step. Wonder if a previous test in the for loop is erroring and an assert.end() is lingering. |
Looks like the original error that was causing the .end() callback error is:
I logged the above error to the console from within the indexing logic. I suspect mapnik doesn't like some of the geojson files we are feeding it for indexing, or there is some kind of race condition when indexing each layer? |
Per chat with @springmeyer , looks like the indexing error is pointing to something else going on in the converted geojson layers. I took a look at the resulting converted geojson layers for the tests that are failing, and looks as if they are producing incomplete objects (invalid json). Note: not all resulting geojson layers are invalid. Going to keep 👀 |
One step further with debugging the failing test. I have proof now, that something is still holding on to Unfortunately it's a pain to find out: running the test from the console the exception occurs every time.
|
I think we have a good ole fashioned control flow problem here (when I remove createIndex locally, tests no longer fail). Looks like createIndex can potentially trigger before the forEach loops are finished. This is because createIndex is called outside of the forEach loops and does not rely on any callback or other way to determine when the layers are actually ready to be indexed.
I like your idea from yesterday that we should index after all layers are converted. We can also refactor the main function a bit to better manage flow and modularize a bit. Found a great example of how we can try untangling some of this:
var fs = require('fs');
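function read_directory(path, next) {
  // List the directory that was passed in (the original listed "." instead).
  fs.readdir(path, function (err, files) {
    // Nothing to read: report an empty result instead of never calling back.
    if (err || files.length === 0) return next({});
    var count = files.length,
        results = {};
    files.forEach(function (filename) {
      // Node callbacks receive the error first, then the data.
      fs.readFile(path + '/' + filename, function (err, data) {
        results[filename] = data;
        count--;
        // Only continue once every read has reported back.
        if (count <= 0) {
          next(results);
        }
      });
    });
  });
}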
function read_directories(paths, next) {
  var count = paths.length,
      data = {};
  paths.forEach(function (path) {
    read_directory(path, function (results) {
      data[path] = results;
      count--;
      if (count <= 0) {
        next(data);
      }
    });
  });
}
read_directories(['articles', 'authors', 'skin'], function (data) {
  // Do something
}); |
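Applied to the problem described above, the same counter pattern could look roughly like this. This is only a sketch: `convertAll`, `convertLayer` and `fakeConvert` are hypothetical names made up for illustration, not functions from this repo, and the real createIndex call would go where the comment indicates.

```javascript
// Hypothetical helper: converts every layer, then calls `done` exactly once,
// either with the first error or with all converted layers.
function convertAll(layers, convertLayer, done) {
  var pending = layers.length;
  var results = {};
  var finished = false;

  // No layers at all: report back immediately.
  if (pending === 0) return done(null, results);

  layers.forEach(function (layer) {
    convertLayer(layer, function (err, geojson) {
      if (finished) return; // ignore callbacks that arrive after a failure
      if (err) {
        finished = true;
        return done(err);
      }
      results[layer] = geojson;
      pending--;
      // Only the last conversion to finish triggers the next step.
      if (pending === 0) {
        finished = true;
        done(null, results);
      }
    });
  });
}

// Stand-in for the real conversion (assumption: it is asynchronous).
function fakeConvert(layer, cb) {
  setImmediate(function () {
    cb(null, { type: 'FeatureCollection', name: layer, features: [] });
  });
}

convertAll(['waypoints', 'tracks'], fakeConvert, function (err, converted) {
  if (err) throw err;
  // Indexing (e.g. createIndex) would start here, after all layers are ready.
  console.log('ready to index:', Object.keys(converted).join(', '));
});
```

The `finished` flag also guards against calling the final callback twice, which is the kind of thing that shows up in tape as ".end() called twice".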
Nothing wrong with the control flow, just gdal has to be manually disposed to get rid of the file reference faster:
The 4 references to the dataset in the screenshot above came from
as each of them opens the dataset (maybe room for improvement?) and Working on the next failing test now:
|
Same with failing mbtiles test: resources are not properly disposed. Now all tests run through again, but 8 are failing. |
That's also why not ok 4 expected error message --- operator: equal expected: 'GPX does not contain any valid features.' actual: undefined ... not ok 5 .end() called twice --- operator: fail ... |
|
Node
This led me down a Google search on node GC, but I'm not sure that's the proper route at this point. Will keep pondering. |
Great find. Thanks.
The explanation above was the reason for not being able to delete the tmpfile, as node had not yet done its garbage collection. Sorry if I caused any confusion, but this should be fixed now and the file should not be immortal anymore.
Not necessary to spend more time on this. There are GC parameters you can pass to the node executable to fiddle with the internals of its GC. The takeaway here is: always dispose of resources properly as soon as you are done with them. Be it a file ( |
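To illustrate that takeaway with plain node `fs` calls (not gdal), a minimal sketch could look like the following. `withFile` is a hypothetical helper name used only for this example; the point is that the descriptor is closed explicitly in a `finally` block, so deleting the file afterwards does not depend on when garbage collection runs.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: open a file, hand the descriptor to `work`,
// and always close it before returning, even if `work` throws.
function withFile(file, flags, work) {
  var fd = fs.openSync(file, flags);
  try {
    return work(fd);
  } finally {
    fs.closeSync(fd);
  }
}

var tmpfile = path.join(os.tmpdir(), 'dispose-sketch-' + process.pid + '.tmp');

withFile(tmpfile, 'w', function (fd) {
  fs.writeSync(fd, 'scratch data');
});

// The handle is already released here, so the delete does not have to
// wait for anything else to let go of the file.
fs.unlinkSync(tmpfile);
console.log('tmpfile still exists:', fs.existsSync(tmpfile));
```

The same idea should carry over to any object that wraps a native handle: release it explicitly at the point where you are done with it rather than relying on the object going out of scope.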
Awesome work. Also agree @BergWerkGIS. This is a pretty classic case of Windows behavior being more strict and that helping highlight core issues of the code that should be written more robustly. 💯
That is neat/great that setting to |
@springmeyer Already looked at the code and I think this is purely on node's side (and our implementation) and there is nothing else gdal can do about it (although not familiar with node's GC internals).
if (this_dataset) {
  LOG("Disposing Dataset [%p]", this_dataset);
  ptr_manager.dispose(uid);
  LOG("Disposed Dataset [%p]", this_dataset);
  this_dataset = NULL;
}
Pseudo code of our implementation looks like this:
function one(){var ds = new gdal.ds; return two(ds);}
function two(ds){return three(ds);}
function three(ds){return four(ds);}
function four(ds){fs.unlink(ds);}
And with this implementation it absolutely makes sense for me that node hasn't fully released |
UPDATE on funky test failure Per chat with Jake, node-mapnik no longer supports builds for node 0.12 |
Cannot reproduce the The highlighted files are those left after |
Hmmm, looking into why tests are still failing here, so we can get this ready to rock. |
@@ -34,6 +34,7 @@ module.exports = function(infile, outdir, callback) {
.on('exit', function() {
// If error printed to --validate-features log
if (data.indexOf('Error') != -1) {
console.log(data);
debug logging?
👍 thank you
@GretaCB what's the plan of attack?
This has changed here (https://github.com/mapbox/preprocessorcerer/compare/stderr-fix#diff-39694ce4a586caad3e6ad813c295b454R32) too, as the latest mapnik-index is now returning an error code if --validate-features fails.
Need to wait to update to latest node-mapnik?
Plan of attack re: mapbox/mapnik-swoop#14
cc @mapsam
Working through a node-mapnik dev build so we can test this right now.
Per #19 track_points. Found that these are automatically inserted by gdal (most of the time). @BergWerkGIS did we decide on if possible to allow track_points produced by an actual GPS device?