Single-pass stats, pyramids, and histogram (#116)
* Bare bones of single-pass stats support, not yet complete
* Somewhat neater management of singlePassInfo object
* Closer to a working version
* Seems to be roughly working, but needs lots more testing to be sure
* Get maxval right
* Improve aggregationType handling. More to think about here. Fix array sub-sampling for pyramids
* Simplify aggType handling. Commented out band_ov_FlushCache(), as it seems to go Very Slowly
* Make a proper dummy njit decorator for when numba is not available. Remove attempt at flushing pyramid block from Python, as this works, but slows everything down. Will have to rely on the KEA C++ fix
* Change the default value of DEFAULT_OVERVIEWAGGREGRATIONTYPE to 'NEAREST', which it probably should have been all along
* Much simpler dummy njit
* singlePassMgr instead of singlePassInfo
* Check each output driver for pyramid writing support
* Add separate timings for pyramids/stats/histogram
* Cope with completely null output raster
* Don't write stats when whole raster is null
* Bit more explanation in SinglePassManager docstring
* Put 'closing' timer after the other things
* Remove 'closing' as a known timer name
* Re-factoring of addStatistics() routine, to allow better sharing of code with single-pass calculations. More to do, but so far seems to work
* Mostly completed re-factoring of old addStatistics() routine, and its use in imagewriter. Can now manage each of the three actions independently
* Include cache flush in 'writing' timer
* Simplify field names on HistogramParams class
* Improved docstrings for setApproxStats and setSinglePassHistogram. Add imagename= parameter for setApproxStats. Add check on use of approx stats on thematic outputs
* Improved docstrings. Fix backward compatibility of addStatistics()
* Fix teststats.py so it runs stand-alone
* Add workinggrid and singlePassMgr to ApplierReturn object
* Typo
* Beginnings of a non-numba histogram method. Does not yet work, but shows considerable promise
* Reverse arguments to updateHist. Fix calcMin for thematic GDAL histogram
* Remove remnants of numba-based histogram
* Better handling of counts for negative pixel values. Still not entirely right.
* Fix hist min/max for case with both positive and negative pixels
* Guard against attempting histogram when raster is all nulls
* Minor tidying
* For athematic large-int rasters, collapse the single-pass 'direct' histogram to a smaller 'linear' one, which matches how we have always done it with gdal's GetHistogram()
* Better name for int type sets
* Check on pyramids and approx stats
* More explicit name directPyramidsSupported
* Minor changes to make sure the new tests work
* New teststats.py, more comprehensive, and better aligned with single-pass stats calculation
* Only use RFC40 to write Histogram column if the layer is thematic, and when doing so, make allowance for HISTOMIN to be something other than zero
* Only use KEA for ratstats test, since RFC40 is not properly supported in HFA. Be more explicit about the null value, and set thematic before doing calcStats
* Add tests for whether single-pass happened when requested
* Match the old behaviour for thematic layers, forcing HISTOMIN to be zero. Add exception if there are negative values in a thematic layer
* Use int64 for counts, not uint64, to avoid overflow when subtracting totals. Fix minor discrepancy in linearHistFromDirect
* Add a more rigorous test on histogram individual counts, but so far only for 'direct' binned case
* Always write binFunction, even when writing histogram with RAT column
* Also apply hist test to athematic direct case
* Remove my paranoid test on linear hist total count. There is now no way it can fail
* Comments explaining the limits on using RAT to write histogram
* Update docstrings to remove reference to numba, and further explain default behaviour of single-pass stuff
* Typo
* Move the omit test outside the format/datatype loops
* Ensure omit check always happens
* Implement a test of histogram counts for linear-binned case
* Typos
* Remove old histLimits method, left over from earlier implementation
* Remove unused variable
* Use a more reliable check of whether to use RAT to read histogram back
* Use numpy.histogram for both direct and linear histograms. Seems fine, and simplifies the code
* Add a test with negative pixel values
* Fix test for null at end of counts
* Test with a null value other than zero
* Include a test with no null value
* Note in docstrings about assuming the null value(s) have already been set
* Allow genRampImageFile to set numRows/numCols
* Added explicit test of pyramid layers
* Cope with the case of a block of all nulls
* Add a test for an output of all nulls
* Simplify default aggregation type code
* Very old typo in docstring
* Tidying up a few docstrings
* Remove now-redundant variable
* Don't use KEA for the extra one-off tests, as it may not be present
* Cope with old GDAL that doesn't have Int64
* Revert to HFA for rat stats test, as KEA may not be present
* Include libgdal-kea in the linux-conda tests
* Close the test output, so that Windows will be able to delete it
* Put file close further down, after all possible uses, but before delete
* Use the more robust test for GDAL 64-bit support
* Another place where test outfile has to be closed before Windows tries to delete it
* Include a print statement so I can track test failure on github
* More debug printing
* Correct checks on 64-bit type support
* Remove debug print statements
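
For readers unfamiliar with the idea behind the histogram items above, here is a minimal sketch of accumulating a 'direct' histogram (one bin per integer pixel value) block by block with numpy.histogram, using int64 counts and coping with negative values and all-null blocks. It is not the code from this commit: the function name update_direct_hist, its parameters, and the example blocks are all hypothetical, chosen only for illustration.

```python
import numpy as np


def update_direct_hist(counts, block, min_val, null_val=None):
    """Add one block's pixel counts to a running 'direct' histogram.

    Hypothetical helper, not the RIOS implementation. counts[i] holds the
    number of pixels with value (min_val + i), so negative values are fine.
    """
    vals = block.ravel()
    if null_val is not None:
        vals = vals[vals != null_val]
    if vals.size == 0:
        # Block of all nulls: nothing to add
        return counts
    nbins = counts.shape[0]
    # Unit-width bins centred on each integer value
    block_counts, _ = np.histogram(
        vals, bins=nbins, range=(min_val - 0.5, min_val + nbins - 0.5))
    # Signed 64-bit counts, so later subtraction of totals cannot wrap around
    counts += block_counts.astype(np.int64)
    return counts


# Usage: three blocks processed one at a time, null value 0, values in -2..3
blocks = [
    np.array([[0, 1, 1], [3, 0, 2]], dtype=np.int16),
    np.array([[-2, -1, 0], [0, 0, 0]], dtype=np.int16),
    np.zeros((2, 3), dtype=np.int16),  # entirely null
]
counts = np.zeros(6, dtype=np.int64)
for blk in blocks:
    counts = update_direct_hist(counts, blk, min_val=-2, null_val=0)
print(counts.tolist())
```

The bin for the null value stays at zero because nulls are masked out before counting; whether the real code handles nulls this way is an assumption of the sketch.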