Discover Hadoop
Sunday, January 15, 2012
Get Output Path of Job in Hadoop MapReduce Framework

FileOutputFormat.getOutputPath(context)
How to Avoid Sorting and Partitioning in Map only Job
We can define a MapReduce job with no reducer. In this case, each mapper writes its output directly under the specified job output directory, so there is no sorting and no partitioning.
Just set the number of reducers to 0:

job.setNumReduceTasks(0);
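A minimal map-only driver might look like the sketch below, written against the 0.20-era `org.apache.hadoop.mapreduce` API used elsewhere in this post. The class name, mapper, and path arguments are placeholders, not code from the original post.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapOnlyJob {

    // A trivial pass-through mapper; substitute your own map logic here.
    // Inheriting the default map() forwards each (key, value) pair unchanged.
    public static class PassThroughMapper
            extends Mapper<LongWritable, Text, LongWritable, Text> {
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "map-only example");
        job.setJarByClass(MapOnlyJob.class);
        job.setMapperClass(PassThroughMapper.class);
        job.setNumReduceTasks(0); // 0 reducers: mapper output goes straight to the output dir
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With zero reduce tasks, each mapper writes a part-m-NNNNN file directly under the job output directory.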
How to Overcome a Java Heap Space Error in the Hadoop MapReduce Framework
mapred.map.child.java.opts     // heap size for map tasks
mapred.reduce.child.java.opts  // heap size for reduce tasks
Configuration conf = new Configuration();
conf.set("mapred.map.child.java.opts", "-Xmx512m");
conf.set("mapred.reduce.child.java.opts", "-Xmx512m");

It will override any existing values.
Or
open your conf/mapred-site.xml and set these values:
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m</value>
  <description>heap size for map tasks</description>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m</value>
  <description>heap size for reduce tasks</description>
</property>
Make sure that ((num_of_maps * map_heap_size) + (num_of_reducers * reduce_heap_size)) is not larger than the memory available on the system. The maximum number of mappers and reducers can also be tuned based on available system resources.
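The constraint above can be sanity-checked with a quick calculation. The slot counts and heap sizes below are made-up example values for illustration:

```java
// Quick sanity check for the memory budget described above.
public class HeapBudgetCheck {

    // Total heap the task slots could demand at once, in megabytes.
    static long budgetMb(int numMaps, long mapHeapMb, int numReducers, long reduceHeapMb) {
        return (long) numMaps * mapHeapMb + (long) numReducers * reduceHeapMb;
    }

    public static void main(String[] args) {
        long availableMb = 16 * 1024;             // e.g. a 16 GB worker node
        long needed = budgetMb(8, 1024, 4, 1024); // 8 map + 4 reduce slots at -Xmx1024m
        System.out.println("needed=" + needed + "MB, fits=" + (needed <= availableMb));
    }
}
```

Leave headroom for the TaskTracker, DataNode, and OS as well; the task heaps are not the only consumers of memory on the node.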
How to Pass a Parameter to a Mapper in Hadoop MapReduce
In the main of your program:
String Input_Parameter_Value = "Value_send_to_Mapper";
Configuration conf = HBaseConfiguration.create();
conf.set("Input_Parameter_Name", Input_Parameter_Value);
In the Mapper, access this value in the setup function:
String value = context.getConfiguration().get("Input_Parameter_Name");
System.out.print("Value: " + value);
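Put together, a mapper that consumes the parameter might look like this sketch. The key "Input_Parameter_Name" matches the driver code above; the class name, types, and map logic are illustrative assumptions.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ParameterMapper extends Mapper<LongWritable, Text, Text, Text> {

    private String param;

    @Override
    protected void setup(Context context) {
        // Read the value the driver placed in the job configuration.
        // The second argument is a fallback if the key was never set.
        param = context.getConfiguration().get("Input_Parameter_Name", "default");
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Use the parameter however the job needs; here it tags every record.
        context.write(new Text(param), value);
    }
}
```

setup() runs once per task before any map() calls, so reading the configuration there avoids a lookup on every record.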
Thursday, January 12, 2012
Get the File Name Processed by the Mapper in Hadoop MapReduce
FileSplit fileSplit = (FileSplit) context.getInputSplit();
String filenameProcessed = fileSplit.getPath().getName();
System.out.println("File Name Processing " + filenameProcessed);
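In context, that snippet belongs inside a mapper, and since the split does not change during a task it is cheapest to look the name up once in setup(). A sketch, assuming a file-based input format (so the split really is a FileSplit) and placeholder types:

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class FileNameMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Text fileName = new Text();

    @Override
    protected void setup(Context context) {
        // Safe for file-based input formats; other input formats may
        // return a different InputSplit subclass and this cast would fail.
        FileSplit fileSplit = (FileSplit) context.getInputSplit();
        fileName.set(fileSplit.getPath().getName());
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit (file name, line) so downstream steps know each record's origin.
        context.write(fileName, value);
    }
}
```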