Tuesday 15 April 2014

java - Hadoop MapReduce Query


I am trying to use Hadoop MapReduce to calculate the sum of the incoming edge weights for each node in a graph. The input is in .tsv format and looks like this:

src  tgt  weight
X    102  1
X    200  1
X    123  5
Y    245  1
Y    101  1
Z    99   2
X    145  3
Y    24   1
A    21   5
...

The expected output is the sum of the weights grouped by source node:

src  SUM(weight)
X    10
Y    3
Z    2
A    5

(For example, X appears with weights 1, 1, 5, and 3, so its sum is 1 + 1 + 5 + 3 = 10.)
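For reference, outside of Hadoop this computation is just a grouped sum over the first and third columns. A plain-Java sketch (the class name and hard-coded sample data are made up from the table above) makes the intended result concrete:

  import java.util.LinkedHashMap;
  import java.util.Map;

  public class GroupedSum {
      public static void main(String[] args) {
          // Sample records mirroring the input above: src, tgt, weight
          String[] lines = {
              "X\t102\t1", "X\t200\t1", "X\t123\t5",
              "Y\t245\t1", "Y\t101\t1", "Z\t99\t2",
              "X\t145\t3", "Y\t24\t1",  "A\t21\t5"
          };
          Map<String, Integer> sums = new LinkedHashMap<>();
          for (String line : lines) {
              String[] cols = line.split("\t");
              sums.merge(cols[0], Integer.parseInt(cols[2]), Integer::sum);  // group by src, add weight
          }
          sums.forEach((src, sum) -> System.out.println(src + "\t" + sum));  // X 10, Y 3, Z 2, A 5
      }
  }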

I used Hadoop's WordCount example as a code reference and tried to adapt it, but all my efforts were in vain.

I am very new to Java and Hadoop. I have shared my code below; please help me understand what is wrong with it.

Thank you.

Code:

  import java.io.IOException;
  import java.util.*;

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.conf.*;
  import org.apache.hadoop.io.*;
  import org.apache.hadoop.mapred.*;
  import org.apache.hadoop.util.*;

  public class Task1 {

      public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
          // Leftover fields from the WordCount example (unused here)
          private final static IntWritable value_parsed = new IntWritable();
          private Text word = new Text();

          public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
              String line = value.toString();
              StringTokenizer tokenizer = new StringTokenizer(line);
              Text keys = new Text();
              int sum;
              while (tokenizer.hasMoreTokens()) {
                  tokenizer.nextToken();                          // first column (src)
                  keys.set(tokenizer.nextToken());                // second column (tgt) used as the key
                  sum = Integer.parseInt(tokenizer.nextToken());  // third column (weight)
                  output.collect(keys, new IntWritable(sum));
              }
          }
      }

      public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
          public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
              int sum = 0;
              while (values.hasNext()) {
                  sum += values.next().get();  // sum the weights for this key
              }
              output.collect(key, new IntWritable(sum));
          }
      }

      public static void main(String[] args) throws Exception {
          JobConf conf = new JobConf(Task1.class);
          conf.setJobName("Task1");

          conf.setOutputKeyClass(Text.class);
          conf.setOutputValueClass(IntWritable.class);

          conf.setMapperClass(Map.class);
          conf.setCombinerClass(Reduce.class);
          conf.setReducerClass(Reduce.class);

          conf.setInputFormat(TextInputFormat.class);
          conf.setOutputFormat(TextOutputFormat.class);

          FileInputFormat.setInputPaths(conf, new Path(args[0]));
          FileOutputFormat.setOutputPath(conf, new Path(args[1]));

          JobClient.runJob(conf);
      }
  }

You have to change your code a little bit.

  while (tokenizer.hasMoreTokens()) {
      tokenizer.nextToken();            // this is the first column (src)
      keys.set(tokenizer.nextToken());  // this is incorrect -- you have to set the
                                        // first column as the key, not this second
                                        // column (tgt)
      sum = Integer.parseInt(tokenizer.nextToken());  // here: the third column (weight)
      output.collect(keys, new IntWritable(sum));
  }
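Put together, a minimal sketch of the corrected map() (keeping the question's old mapred API and variable names) would be:

  public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      Text keys = new Text();
      int sum;
      while (tokenizer.hasMoreTokens()) {
          keys.set(tokenizer.nextToken());                // first column (src) becomes the key
          tokenizer.nextToken();                          // second column (tgt) is skipped
          sum = Integer.parseInt(tokenizer.nextToken());  // third column (weight) is the value
          output.collect(keys, new IntWritable(sum));
      }
  }

One caveat: if the input file really contains the header line "src tgt weight", Integer.parseInt will throw a NumberFormatException on it, so that line would need to be skipped or guarded as well.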

Hope this helps.
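If you want to sanity-check the column handling outside of Hadoop first, a tiny standalone program (the ColumnCheck class and the sample line are made up for illustration) exercises the same parsing:

  import java.util.StringTokenizer;

  public class ColumnCheck {
      public static void main(String[] args) {
          String line = "X\t102\t1";  // one made-up record: src, tgt, weight
          StringTokenizer tokenizer = new StringTokenizer(line);
          String src = tokenizer.nextToken();                    // first column: the intended key
          tokenizer.nextToken();                                 // second column (tgt), skipped
          int weight = Integer.parseInt(tokenizer.nextToken());  // third column: the value
          System.out.println(src + "\t" + weight);               // prints: X    1
      }
  }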

