brec.sh genHistEvent error #14

Open
nattachai305 opened this issue May 8, 2018 · 6 comments

Comments

@nattachai305

nattachai305 commented May 8, 2018

Hi Pranab, I followed the tutorial through the Implicit scenario and got stuck at the genHistEvent step.
The tutorial gives the usage as ./brec.sh genHistEvent <item_count> <user_count> <average_event_count_per_user>.
I ran the following:
./brec.sh genHistEvent 100 100 9
and got this error:
./brec.sh: line 58: $5: ambiguous redirect
The schema I use is exactly engageEvent.json, and the variables I set in brec.sh are:
JAR_NAME=/etc/recomlib/sifarish-1.0.jar
CHOMBO_JAR_NAME=/etc/recomlib/chombo-1.0.jar
HDFS_BASE_DIR=/user/pranab/reco
PROP_FILE=/etc/git/sifarish/reco.properties
HDFS_META_BASE_DIR=/user/pranab/meta/imra
I have also created the paths for JAR_NAME, CHOMBO_JAR_NAME, PROP_FILE, HDFS_BASE_DIR and HDFS_META_BASE_DIR on the local filesystem and in HDFS accordingly.
I have downloaded all the required dependencies.
I have been trying to solve this for a long time without success, so I am asking for your help here and would appreciate your answer.
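
For readers who hit the same message: bash reports `ambiguous redirect` when an unquoted redirection target expands to nothing or to more than one word. The command above passes only four arguments, so `$5` is empty when line 58 runs. The sketch below is not the code of brec.sh; `write_events` and its usage string are hypothetical names that only illustrate the failure mode and one way to guard against it.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a script line that redirects output to "$5".
# This is NOT the code of brec.sh; it only illustrates the failure mode.
#
# In bash, an unquoted redirection target that expands to nothing fails:
#   set -- genHistEvent 100 100 9   # four positional arguments, $5 is unset
#   echo data > $5                  # bash: $5: ambiguous redirect

write_events() {
  # Check the argument count before redirecting, so a missing output
  # file produces a usage message instead of "ambiguous redirect".
  if [ "$#" -lt 5 ]; then
    echo "usage: write_events <cmd> <items> <users> <avg_events> <output_file>" >&2
    return 1
  fi
  # Quoting "$5" also keeps a path containing spaces as a single word.
  printf 'items=%s users=%s avg=%s\n' "$2" "$3" "$4" > "$5"
}
```

Called with a fifth argument, for example `write_events genHistEvent 100 100 9 /tmp/events.txt`, the function writes one line to that file; called with four arguments it returns a non-zero status and prints the usage message.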

@pranab
Owner

pranab commented May 8, 2018

I posted the fix in a comment on my blog (https://pkghosh.wordpress.com/2014/02/10/from-explicit-user-engagement-to-implicit-product-rating/#comment-5075). I think you posted the same issue there.

@pranab
Owner

pranab commented May 8, 2018

Regarding the uuid issue, which version of Ruby are you using?

@nattachai305
Author

I think it's ruby 1.8.7 (2013-06-27 patchlevel 374)

@nattachai305
Author

I have also tried 2.4.1, but it doesn't work either. Which version should I use, or which one does the project require?

@nattachai305
Author

nattachai305 commented May 9, 2018

Thanks for your time.
Aside from this problem, I ran into another one while going through the Explicit Rating Data Generation approach: I got stuck at step 5. The error is shown below.

`[root@quickstart resource]# sudo -u hdfs ./brec.sh correlation
running MR to generate item correlation from rating data
input /user/pranab/reco/crat output /user/pranab/reco/simi
rmr: DEPRECATED: Please use 'rm -r' instead.
rmr: `/user/pranab/reco/simi': No such file or directory
removed output dir /user/pranab/reco/simi
18/05/09 06:37:39 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/127.0.0.1:8032
18/05/09 06:37:40 INFO input.FileInputFormat: Total input paths to process : 1
18/05/09 06:37:40 INFO mapreduce.JobSubmitter: number of splits:1
18/05/09 06:37:40 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1525063506679_0062
18/05/09 06:37:40 INFO impl.YarnClientImpl: Submitted application application_1525063506679_0062
18/05/09 06:37:40 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1525063506679_0062/
18/05/09 06:37:40 INFO mapreduce.Job: Running job: job_1525063506679_0062
18/05/09 06:37:47 INFO mapreduce.Job: Job job_1525063506679_0062 running in uber mode : false
18/05/09 06:37:47 INFO mapreduce.Job:  map 0% reduce 0%
18/05/09 06:37:55 INFO mapreduce.Job:  map 100% reduce 0%
18/05/09 06:38:01 INFO mapreduce.Job: Task Id : attempt_1525063506679_0062_r_000000_0, Status : FAILED
Error: java.lang.NumberFormatException: For input string: "6Z31HNOXVGHM"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:492)
	at java.lang.Integer.parseInt(Integer.java:527)
	at org.sifarish.feature.CosineSimilarity.initVector(CosineSimilarity.java:83)
	at org.sifarish.feature.CosineSimilarity.findDistance(CosineSimilarity.java:45)
	at org.sifarish.common.ItemDynamicAttributeSimilarity$SimilarityReducer.reduce(ItemDynamicAttributeSimilarity.java:282)
	at org.sifarish.common.ItemDynamicAttributeSimilarity$SimilarityReducer.reduce(ItemDynamicAttributeSimilarity.java:164)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

18/05/09 06:38:08 INFO mapreduce.Job:  map 100% reduce 100%
18/05/09 06:38:08 INFO mapreduce.Job: Job job_1525063506679_0062 failed with state FAILED due to: Task failed task_1525063506679_0062_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1

18/05/09 06:38:08 INFO mapreduce.Job: Counters: 37
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=173166
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=4431
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=3
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters 
		Failed reduce tasks=2
		Launched map tasks=1
		Launched reduce tasks=2
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=2305536
		Total time spent by all reduces in occupied slots (ms)=3828224
		Total time spent by all map tasks (ms)=4503
		Total time spent by all reduce tasks (ms)=7477
		Total vcore-milliseconds taken by all map tasks=4503
		Total vcore-milliseconds taken by all reduce tasks=7477
		Total megabyte-milliseconds taken by all map tasks=2305536
		Total megabyte-milliseconds taken by all reduce tasks=3828224
	Map-Reduce Framework
		Map input records=100
		Map output records=1000
		Map output bytes=65000
		Map output materialized bytes=13391
		Input split bytes=131
		Combine input records=0
		Spilled Records=1000
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=169
		CPU time spent (ms)=1570
		Physical memory (bytes) snapshot=227512320
		Virtual memory (bytes) snapshot=980533248
		Total committed heap usage (bytes)=257425408
	File Input Format Counters 
		Bytes Read=4300
rmr: DEPRECATED: Please use 'rm -r' instead.
rmr: `/user/pranab/reco/simi/_logs': No such file or directory
rmr: DEPRECATED: Please use 'rm -r' instead.
rmr: `/user/pranab/reco/simi/_SUCCESS': No such file or directory`

According to the tutorial, I thought the format of the rate file was already correct, so I skipped step 3, copied rate from /reco to /reco/crat, and successfully ran ./brec.sh correlation at step 5. However, I am still stuck at step 6.3, with the error shown below.

running MR for rating predictor
input /user/pranab/reco/crat,/user/pranab/reco/simi output /user/pranab/reco/utpr
rmr: DEPRECATED: Please use 'rm -r' instead.
rmr: `/user/pranab/reco/utpr': No such file or directory
removed output dir /user/pranab/reco/utpr
18/05/09 06:53:49 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/127.0.0.1:8032
18/05/09 06:53:50 INFO input.FileInputFormat: Total input paths to process : 2
18/05/09 06:53:50 INFO mapreduce.JobSubmitter: number of splits:2
18/05/09 06:53:50 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1525063506679_0065
18/05/09 06:53:50 INFO impl.YarnClientImpl: Submitted application application_1525063506679_0065
18/05/09 06:53:50 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1525063506679_0065/
18/05/09 06:53:50 INFO mapreduce.Job: Running job: job_1525063506679_0065
18/05/09 06:53:58 INFO mapreduce.Job: Job job_1525063506679_0065 running in uber mode : false
18/05/09 06:53:58 INFO mapreduce.Job:  map 0% reduce 0%
18/05/09 06:54:04 INFO mapreduce.Job: Task Id : attempt_1525063506679_0065_m_000000_0, Status : FAILED
Error: java.lang.NumberFormatException: For input string: "MBFM6Q0Q1PR9:84"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:492)
	at java.lang.Integer.parseInt(Integer.java:527)
	at org.sifarish.common.UtilityPredictor$PredictionMapper.map(UtilityPredictor.java:201)
	at org.sifarish.common.UtilityPredictor$PredictionMapper.map(UtilityPredictor.java:90)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

18/05/09 06:54:05 INFO mapreduce.Job:  map 50% reduce 0%
18/05/09 06:54:10 INFO mapreduce.Job:  map 100% reduce 0%
18/05/09 06:54:11 INFO mapreduce.Job:  map 100% reduce 100%
18/05/09 06:54:11 INFO mapreduce.Job: Job job_1525063506679_0065 failed with state FAILED due to: Task failed task_1525063506679_0065_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

18/05/09 06:54:11 INFO mapreduce.Job: Counters: 35
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=178789
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=21319
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=3
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters 
		Failed map tasks=2
		Killed reduce tasks=1
		Launched map tasks=3
		Other local map tasks=1
		Data-local map tasks=2
		Total time spent by all maps in occupied slots (ms)=7176192
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=14016
		Total vcore-milliseconds taken by all map tasks=14016
		Total megabyte-milliseconds taken by all map tasks=7176192
	Map-Reduce Framework
		Map input records=757
		Map output records=1514
		Map output bytes=71158
		Map output materialized bytes=19041
		Input split bytes=131
		Combine input records=0
		Spilled Records=1514
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=112
		CPU time spent (ms)=1320
		Physical memory (bytes) snapshot=232357888
		Virtual memory (bytes) snapshot=981663744
		Total committed heap usage (bytes)=257425408
	File Input Format Counters 
		Bytes Read=21188
rmr: DEPRECATED: Please use 'rm -r' instead.
rmr: `/user/pranab/reco/utpr/_logs': No such file or directory
rmr: DEPRECATED: Please use 'rm -r' instead.
rmr: `/user/pranab/reco/utpr/_SUCCESS': No such file or directory

I have tried to solve this but cannot, so would you please enlighten me?

@pranab
Owner

pranab commented May 11, 2018

Please provide the sequence of steps you ran, using the step numbers from the tutorial. Your input to the MR job seems incorrect. Also provide 2 or 3 sample lines from the input to the failing MR job. The fact that you manually copied files tells me that you are not following the tutorial steps properly.
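
One possible way to collect the requested sample lines and see which fields `Integer.parseInt` would reject is sketched below. It assumes the input is comma-delimited text, which this thread does not confirm, and `find_non_integer_fields` is a hypothetical helper name, not part of sifarish or brec.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads delimited text on stdin and reports every
# field that is not a plain integer, which is the kind of value that
# makes Integer.parseInt throw NumberFormatException.
# The delimiter defaults to a comma (an assumption about the input format).
find_non_integer_fields() {
  awk -F"${1:-,}" '{
    for (i = 1; i <= NF; i++)
      if ($i !~ /^-?[0-9]+$/)
        printf "line %d field %d: %s\n", NR, i, $i
  }'
}

# Possible usage against the job input directory from the log above:
#   hadoop fs -cat '/user/pranab/reco/crat/*' | head -3
#   hadoop fs -cat '/user/pranab/reco/crat/*' | head -3 | find_non_integer_fields
```

Some fields, such as IDs, are expected to be non-numeric, so the report only narrows down which column the job may be reading as an integer; it does not by itself show that the data is wrong.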
