2020. 1. 3. 11:16
Counting words on a website
Using Java MapReduce
1. WordCount class
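import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Driver: configures and submits the word count job.
public class WordCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Expect exactly two arguments: the input path and the output path.
        if (args.length != 2) {
            System.err.println("Usage : WordCount <input> <output>");
            System.exit(2);
        }
        // Job.getInstance replaces the Job constructor, which is deprecated in Hadoop 2.x.
        Job job = Job.getInstance(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Exit with a non-zero status if the job fails.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}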
2. Mapper class
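import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    // Reused for every record so no new object is allocated per token.
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split the input line on whitespace and emit (token, 1) for each token.
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}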
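The tokenizing step in map() can be tried without a cluster. The following is a minimal plain-Java sketch, not part of the job itself; the class name TokenizeSketch and the helper tokens() are hypothetical names used only for illustration. It shows that StringTokenizer with its default delimiters splits only on whitespace, so case and punctuation are not normalized.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Plain-Java sketch (no Hadoop dependency) of the tokenizing step inside map().
class TokenizeSketch {

    // Returns the tokens the mapper would emit as keys for one input line.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        // Default delimiters: space, tab, newline, carriage return, form feed.
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            out.add(itr.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // "Hello" and "hello," remain distinct keys: nothing is lowercased or stripped.
        System.out.println(tokens("Hello hello, world\tworld"));
    }
}
```

If counts should ignore case or punctuation, the token would need to be normalized before word.set() is called; the mapper above does not do that.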
3. Reducer class
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum every count emitted for this word.
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        // Emit (word, total).
        result.set(sum);
        context.write(key, result);
    }
}
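To see how the mapper and reducer fit together, the whole flow can be imitated in memory. This is a rough sketch under stated assumptions, not how Hadoop actually executes the job: the class LocalWordCountSketch and its methods map(), reduce() and run() are hypothetical, and a TreeMap stands in for the framework's group-by-key step (its String ordering only loosely mirrors how Hadoop sorts Text keys).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// In-memory imitation of map -> group by key -> reduce; not Hadoop code.
class LocalWordCountSketch {

    // Map step: emit (token, 1) for every whitespace-separated token in the line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // Reduce step: sum all values grouped under one key.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int val : values) {
            sum += val;
        }
        return sum;
    }

    static Map<String, Integer> run(String... lines) {
        // Group step: collect the emitted 1s per key, with keys kept in sorted order.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        // Call reduce once per key, as the framework does for each group.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> e : run("a b a", "b c").entrySet()) {
            // TextOutputFormat separates key and value with a tab by default.
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

The input lines "a b a" and "b c" are made-up sample data. Each output line has the same word, tab, count shape that TextOutputFormat writes to the part files in the output directory.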
4. Execution result