RHIPE: R and Hadoop Integrated Processing Environment

Analyze data with Hadoop tools (e.g. MapReduce) from within the R environment.

Documentation: http://saptarshiguha.github.com/RHIPE/

Binary downloads: http://ml.stat.purdue.edu/rhipebin/

Changes

Many! Better documentation is coming soon. Most importantly, rhmr has been removed and replaced by rhwatch. Although the warning says ‘use the same arguments’, this is not true. Here is a quick conversion guide.

The parameters ifolder, ofolder and inout have been removed.

  • For sequence input and sequence output, rhwatch(, input=path-to-input, output=path-to-output)
  • For text input and sequence output, rhwatch(, input=rhfmt(path-to-input, type='text'), output=path-to-output)
  • For sequence input and text output, rhwatch(, input=path-to-input, output=rhfmt(path-to-output, type='text'))
  • For the MapReduce equivalent of lapply(1:N, F), use rhwatch(map=rhmap({ rhcollect(k, F(k)) }), input=N)

For documentation on these, ask on the Google Groups mailing list.
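As a rough local illustration of the lapply equivalent above, the sketch below uses only base R (no RHIPE or Hadoop calls) to show the key/value pairs that the map expression rhcollect(k, F(k)) would be expected to emit for input=N. The names F, N and local_pairs are hypothetical and chosen for this example. The assumption that each of the N keys reaches the map as k comes from the item above and has not been checked against RHIPE itself.

```r
# Hypothetical stand-ins for illustration; none of this is RHIPE API.
F <- function(k) k^2
N <- 5

# Local emulation: build one (key, value) pair per k in 1:N,
# mirroring what rhcollect(k, F(k)) is assumed to emit in the map.
local_pairs <- lapply(seq_len(N), function(k) list(key = k, value = F(k)))

# Dropping the keys recovers the plain lapply(1:N, F) result.
values <- lapply(local_pairs, function(p) p$value)
stopifnot(identical(values, lapply(1:N, F)))
```

On a cluster the pairs would be written to the job output rather than returned as an in-memory list, so this is only a way to reason about what the job computes.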

There is also experimental support for reading from and writing to HBase.