Wednesday 15 August 2012

Hadoop - MapReduce reduce stage reports the wrong size

I am writing a simple MapReduce program that counts how many times each row appears in the input. The input consists of two directories that are supposed to contain identical data, so in the reduce stage I expect each key to appear exactly twice (once from each input directory).
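To make the intent concrete, here is a minimal sketch of the counting logic in plain Python. It is not the Hadoop job itself, only a simulation of the map, shuffle, and reduce steps. The function names and the sample rows in `dir_a` and `dir_b` are hypothetical and stand in for the two input directories.

```python
from collections import defaultdict


def map_phase(rows):
    """Emit (row, 1) for every input row, like a counting mapper."""
    for row in rows:
        yield row, 1


def shuffle(pairs):
    """Group values by key, as the shuffle/sort step does before reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values of each key; one reduce call per distinct key."""
    return {key: sum(values) for key, values in groups.items()}


def count_rows(*directories):
    """Run the simulated map, shuffle, and reduce steps over all inputs."""
    pairs = (pair for rows in directories for pair in map_phase(rows))
    return reduce_phase(shuffle(pairs))


# Hypothetical sample data standing in for the two input directories.
dir_a = ["alpha", "beta", "gamma"]
dir_b = ["alpha", "beta", "delta"]

counts = count_rows(dir_a, dir_b)

# Rows that do not appear exactly twice are present in only one
# directory (or duplicated within one).
mismatched = sorted(key for key, count in counts.items() if count != 2)

print(counts)
print(mismatched)
```

In this sketch the reduce step runs once per distinct key (four keys here), not once per input row (six rows), so a count taken on the reduce side is expected to be smaller than the number of rows read by the mappers.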

This is my code -

I have not found the reason why the reduce stage shows a wrong number in the log. I think every reducer gets a number that is equal to the number of shuffle merges that were made.

Where am I going wrong?

