
An R package to extract data from scatterplots and line plots


east-winds/digitize

 
 


digitize: a plot digitizer in R

This R package streamlines the digitization of plots. It has two modes: (1) manual point selection for scatterplots, forked from the digitize package, and (2) automatic extraction for well-behaved line plots, built on the magick library. It currently supports three bitmap image formats (jpeg, png, bmp) and detects the image type automatically using the readbitmap package.
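To illustrate how a file's format can be recognised without relying on its extension (the examples below write a PNG to an extension-less tempfile()), here is a minimal base R sketch. It is not readbitmap's implementation: guess_image_type is a hypothetical helper that only compares the leading bytes of a file with the well-known PNG, JPEG and BMP signatures.

```r
# Hypothetical helper (not part of digitize or readbitmap): guess a bitmap
# format from the first bytes of a file rather than from its extension.
guess_image_type <- function(path) {
  header <- readBin(path, what = "raw", n = 8)
  starts_with <- function(sig) {
    length(header) >= length(sig) && all(header[seq_along(sig)] == sig)
  }
  png_sig  <- as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A))
  jpeg_sig <- as.raw(c(0xFF, 0xD8, 0xFF))
  bmp_sig  <- charToRaw("BM")
  if (starts_with(png_sig))  return("png")
  if (starts_with(jpeg_sig)) return("jpeg")
  if (starts_with(bmp_sig))  return("bmp")
  NA_character_  # unknown or unsupported format
}

# Usage: a file that starts with the PNG signature but has no extension
tmp <- tempfile()
writeBin(as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)), tmp)
guess_image_type(tmp)
#> [1] "png"
```

In practice the package delegates this step to readbitmap, so a check like this is only useful for understanding or debugging.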

Installation

Installation requires the devtools package.

To install from GitHub:

# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("east-winds/digitize")

Or, to install locally:

# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install("path/to/digitize")

Example: Auto digitization of a line plot

library(digitize)

## make a temporary image
tmp <- tempfile()
png(tmp)
plot(x = 0:10, y = rnorm(11) + 0:10, xlab = "x", ylab = "y",
     xlim = c(0, 10), ylim = c(-1, 11), type = "l")
grid()  # add grid lines to the existing plot
dev.off()

## auto-digitize the figure using two calibration points,
## pre-specifying both the x-axis and y-axis values
## Select calibration points (0,0) and (10,10) in blue:
myfn <- digitize(tmp, x1 = 0, x2 = 10, y1 = 0, y2 = 10, twopoints = TRUE, auto = TRUE)
#> ...careful how you calibrate.
#> Click IN ORDER: x1y1, x2y2
#>
#>     Step 1 ----> Click on x1y1
#>     |
#>     |
#>     |
#>     y1
#>     |______x1____________________
#>      
#>     Step 2 ----> Click on x2y2
#>     |
#>     y2
#>     |
#>     |
#>     |_____________________x2_____
#>     
#>


#>
#>
#> .....AUTOMATED INPUT.....
#>
#> Attempting to use `magick` to extract curve

# Plot the returned spline function
x <- seq(0, 10, 0.1)
plot(x, myfn(x), type = "l", main = "Extracted data")

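Because the auto mode returns a function rather than a table of points, a common next step is to sample it on a grid and save the values. The sketch below assumes the returned function accepts a numeric vector, as in the myfn(x) call above. Since digitize() needs interactive clicks, a stand-in built with splinefun() takes its place here; the stand-in's data are made up.

```r
# Stand-in for the function returned by digitize(..., auto = TRUE);
# built with splinefun() on made-up data so the sketch runs non-interactively.
myfn <- splinefun(x = 0:10, y = (0:10)^2 / 10)

# Sample the curve on a regular grid and collect the values in a data frame
x <- seq(0, 10, by = 0.5)
extracted <- data.frame(x = x, y = myfn(x))

# Save the table for use outside R, then read it back as a quick check
out <- tempfile(fileext = ".csv")
write.csv(extracted, out, row.names = FALSE)
head(read.csv(out))
```

The grid spacing is a free choice: a finer grid gives a smoother table but adds no information beyond what was recovered from the image.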

Experimental: Extracting multiple lines

Calling digitize(..., auto=TRUE, lines=2) attempts to extract two lines from the same graph and returns a list of interpolation functions. It works best when the lines are red and blue.
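A list of interpolation functions can be evaluated on a shared grid and plotted together. The sketch below assumes the list holds one function per extracted line and that each accepts a numeric vector. The two functions here are stand-ins built with splinefun() on made-up data, not output from digitize().

```r
# Stand-ins for the list returned by digitize(..., auto = TRUE, lines = 2);
# real functions would come from the digitized image.
fns <- list(
  splinefun(x = 0:10, y = 0:10),       # e.g. the red line
  splinefun(x = 0:10, y = 10 - 0:10)   # e.g. the blue line
)

# Evaluate every function on the same grid: one matrix column per line
x <- seq(0, 10, by = 0.1)
curves <- sapply(fns, function(f) f(x))

# Overlay the extracted lines in a single plot
matplot(x, curves, type = "l", lty = 1, col = c("red", "blue"),
        xlab = "x", ylab = "y", main = "Extracted lines")
```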

Example: Manual digitization of a scatter plot

library(digitize)

## make a temporary image
tmp <- tempfile()
png(tmp)
plot(rnorm(10) + 1:10, xlab="x", ylab="y",
     xlim=c(0,10),ylim=c(0,10), xaxs="i", yaxs="i")
dev.off()
#> RStudioGD
#>         2

## manually digitize the figure,
## pre-specifying both the x-axis and y-axis values
mydata <- digitize(tmp, x1 = 0, x2 = 10, y1 = 0, y2 = 10, twopoints = TRUE)
#> ...careful how you calibrate.
#> Click IN ORDER: x1y1, x2y2
#>
#>     Step 1 ----> Click on x1y1
#>     |
#>     |
#>     |
#>     y1
#>     |______x1____________________
#>      
#>     Step 2 ----> Click on x2y2
#>     |
#>     y2
#>     |
#>     |
#>     |_____________________x2_____
#>     
#>


#>
#>
#> .....MANUAL INPUT.....
#>
#> Click all the data. (Do not hit ESC, close the window or press any mouse key.)
#>
#> Once you are done - exit:
#>
#>  - Windows: right click on the plot area and choose 'Stop'!
#>
#>  - X11: hit any mouse button other than the left one.
#>
#>  - quartz/OS X: hit ESC

# Plot returned points
plot(mydata$x, mydata$y, main = 'Extracted data')

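Both examples rely on two calibration clicks with known axis values (x1, y1) and (x2, y2). As a rough illustration of what such a calibration does, the sketch below maps click positions to data values by linear interpolation on each axis. This is an assumption about the general technique, not the package's actual code. make_axis_map and the pixel positions are hypothetical, and the approach only applies to linear axes.

```r
# Hypothetical helper: build a position-to-data converter for one axis from
# two calibration clicks (positions p1, p2 with known data values v1, v2).
# Assumes a linear axis; a log axis would need a transform first.
make_axis_map <- function(p1, p2, v1, v2) {
  stopifnot(p1 != p2)
  function(p) v1 + (p - p1) * (v2 - v1) / (p2 - p1)
}

# Made-up click positions for the calibration points (0,0) and (10,10).
# The y positions decrease as data values increase, as with image row indices.
px_to_x <- make_axis_map(p1 = 80,  p2 = 440, v1 = 0, v2 = 10)
px_to_y <- make_axis_map(p1 = 400, p2 = 60,  v1 = 0, v2 = 10)

# Convert one clicked point to data coordinates; both values come out as 5
c(x = px_to_x(260), y = px_to_y(230))
```

Because the mapping is fitted from only two clicks, any error in those clicks carries over to every extracted point, which is why the prompt warns to be careful when calibrating.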

Acknowledgement

If you use the auto digitization feature, please reference the GitHub repository: https://github.com/east-winds/digitize/

If you use the manual scatter plot features, please reference: https://github.com/tpoisot/digitize#citation

Contributions welcome.
