Environment:
Hortonworks HDP 2.1 cluster integrated with Kerberos and Active Directory
R version: 3.1.3
Issue:
I am trying to run a simple MR job using R on a Kerberos-enabled Hadoop cluster. The R code is given below:
Sys.setenv(HADOOP_STREAMING = "/usr/lib/hadoop-mapreduce/hadoop-streaming-2.4.0.2.1.5.0-695.jar")  # streaming jar used by rmr2
Sys.setenv(HADOOP_CMD = "/usr/bin/hadoop")        # hadoop binary used by rhdfs and rmr2
Sys.setenv(HADOOP_CONF_DIR = "/etc/hadoop/conf")
library(rhdfs)
library(rmr2)
hdfs.init()                                       # initialize the connection to HDFS
ints <- to.dfs(1:100)                             # write 1..100 to a temporary dfs file
calc <- mapreduce(input = ints, map = function(k, v) cbind(v, 2 * v))  # emit (v, 2v) pairs
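As a sanity check on a Kerberos-enabled cluster, it is worth verifying that a valid ticket exists before calling hdfs.init() (a minimal sketch, assuming the MIT Kerberos client tools are on the PATH):

# Sketch: klist -s exits non-zero when the ticket cache is missing or expired.
if (system("klist -s") != 0) {
  stop("No valid Kerberos ticket found; run kinit for your AD principal first")
}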
The job launched by mapreduce() runs successfully, but when I try to access the results with the following command, an error is thrown:
from.dfs(calc)
The error is:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 1 did not have 8 elements
The same error is thrown when accessing the output of any MR job (wordcount, pi estimation).
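Since the job itself completes, the failure appears to happen while the output is read back. The raw listing that gets parsed can be reproduced outside of from.dfs (a minimal sketch; "/tmp/rmr-output" below is a placeholder, substitute the actual temporary output path printed by rmr2):

# Sketch: inspect the raw `hadoop fs -ls` output for the job output directory.
# "/tmp/rmr-output" is illustrative; use the real output path.
raw <- system2("/usr/bin/hadoop", c("fs", "-ls", "/tmp/rmr-output"), stdout = TRUE)
cat(raw, sep = "\n")  # any line without exactly 8 whitespace-separated fields will break the parse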
The traceback() function displays the following:
7: scan(file = file, what = what, sep = sep, quote = quote, dec = dec,
nmax = nrows, skip = 0, na.strings = na.strings, quiet = TRUE,
fill = fill, strip.white = strip.white, blank.lines.skip = blank.lines.skip,
multi.line = FALSE, comment.char = comment.char, allowEscapes = allowEscapes,
flush = flush, encoding = encoding, skipNul = skipNul)
6: read.table(textConnection(hdfs("ls", fname, intern = TRUE)),
skip = 1, col.names = c("permissions", "links", "owner",
"group", "size", "date", "time", "path"), stringsAsFactors = FALSE)
5: hdfs.ls(fname)
4: part.list(fname)
3: lapply(src, function(x) system(paste(hadoop.streaming(), "dumptb",
rmr.normalize.path(x), ">>", rmr.normalize.path(dest))))
2: dumptb(part.list(fname), tmp)
1: from.dfs(calc)
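The traceback suggests that hdfs.ls parses the plain-text output of hadoop fs -ls with read.table, expecting exactly 8 whitespace-separated columns per line; on this Kerberos/AD cluster the listing apparently contains at least one line that does not fit that shape (for example an extra header or warning line). As a stopgap, the listing can be parsed defensively (a sketch only, not a proper fix; it keeps only lines that start with a Unix-style permission string before parsing):

# Sketch: parse an HDFS listing defensively, keeping only file-entry rows.
# "/tmp/rmr-output" is illustrative; use the real output path.
ls_lines <- system2("/usr/bin/hadoop", c("fs", "-ls", "/tmp/rmr-output"), stdout = TRUE)
ls_lines <- grep("^[-d][rwx-]{9}", ls_lines, value = TRUE)  # drop "Found N items" and any warning lines
listing  <- read.table(textConnection(ls_lines),
                       col.names = c("permissions", "links", "owner", "group",
                                     "size", "date", "time", "path"),
                       stringsAsFactors = FALSE)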
Please let me know how to resolve this issue.