The {RDP} package contains an implementation of the Ramer–Douglas–Peucker algorithm for reducing the number of points in a curve.
There are other R packages that offer implementations of the Ramer–Douglas–Peucker algorithm, but {RDP} has only a single dependency and it executes pretty fast.
{RDP} is available on CRAN and can be installed with the command
install.packages("RDP")
{RDP} can also be installed from GitHub with the {remotes} package using this command:
remotes::install_github("robertdj/RDP")
There is a single function in the {RDP} package. Here is an example from the Wikipedia page linked to above with a description of the algorithm. The original line is black and the approximating line is red.
x <- seq(from = 0, to = 5, by = 0.01)
y <- exp(-x) * cos(2 * pi * x)
plot(x, y, type = "l")
lines(RDP::RamerDouglasPeucker(x, y, 0.06), col = "red", lty = 2)
Here is a comparison of the execution speed and memory usage of the {RDP} implementation and a pure R implementation from the {kmlShape} package:
x <- as.double(1:10^4)
y <- rnorm(length(x))
bench::mark(
RDP = RDP::RamerDouglasPeucker(x, y, epsilon = 0.1),
kmlShape = kmlShape::DouglasPeuckerEpsilon(x, y, epsilon = 0.1)
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 RDP 822.57µs 938.5µs 945. 5.28MB 4.00
#> 2 kmlShape 5.11s 5.11s 0.196 35.74MB 13.5
In this example we see from the {bench} package summary that {RDP} is several 1000 times faster and only use a fraction of the memory.
The C++ code was originally based on a Gist from TimSC, but over time it has been almost completely rewritten.
This package is almost exclusively C++ code that is linked against R. As explained in Rcpp's FAQ this leaves me no option but to use GPL.