Bug 10 - Sort Benchmarks
Status: CONFIRMED
Alias: None
Reported: 2023-06-13 02:50 UTC
Modified: 2023-06-13 02:50 UTC

Description 2023-06-13 02:50:03 UTC
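teragen: generates the input data. teragen writes the data row by row, 100 bytes per row.

To generate 100 MB of data, the required row count is 100*1024*1024/100 = 1048576. The generated data is stored in /teradata/100M-input: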
    hadoop jar /usr/hadoop-parafs/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teragen -Dmapred.map.tasks=10 1048576 /teradata/100M-input
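The row count passed to teragen can be derived from the target size rather than worked out by hand. A minimal sketch, assuming the 100-byte row size stated above; the function name rows_for_bytes and the variable names are hypothetical and only serve the illustration:

```shell
#!/bin/sh
# Each teragen row is 100 bytes (stated in the description above).
ROW_BYTES=100

# rows_for_bytes SIZE_IN_BYTES
# Prints the number of rows needed for that size.
# Integer division: any remainder smaller than one row is dropped.
rows_for_bytes() {
    echo $(( $1 / ROW_BYTES ))
}

# 100 MB expressed in bytes
SIZE_BYTES=$(( 100 * 1024 * 1024 ))
ROWS=$(rows_for_bytes "$SIZE_BYTES")
echo "$ROWS"
```

The printed value is the row-count argument used in the teragen command above. For a different data size, only SIZE_BYTES needs to change.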

Check that the data has been generated:
    hadoop fs -ls /teradata
	

	
terasort: sorts the data generated by teragen in /teradata/100M-input and stores the sorted result in /teradata/100M-output.
	
Sort the 100 MB of input data (1048576 rows) with 5 reduce tasks:
    hadoop jar /usr/hadoop-parafs/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar terasort -Dmapred.reduce.tasks=5 /teradata/100M-input /teradata/100M-output
	
	
teravalidate: verifies the sorted output of terasort and stores the validation result in /teradata/100M-validate.
	
	hadoop jar /usr/hadoop-parafs/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teravalidate /teradata/100M-output /teradata/100M-validate
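
Once teravalidate has finished, its report can be checked from a script. A rough sketch, under two assumptions that are not stated in this bug: that the report marks out-of-order records with rows whose key is "error" (a clean report then holds only a "checksum" row), and that the report lands in a file named part-r-00000. The function name report_is_clean is hypothetical.

```shell
#!/bin/sh
# report_is_clean: reads a teravalidate report on stdin.
# Exit status 0 when no row starts with "error", non-zero otherwise.
# ASSUMPTION: misordered records are reported as rows keyed "error".
report_is_clean() {
    ! grep -q '^error'
}

# Possible usage against the output directory above
# (ASSUMPTION: the report file is named part-r-00000):
#   hadoop fs -cat /teradata/100M-validate/part-r-00000 | report_is_clean \
#       && echo "output is sorted" || echo "validation reported errors"
```

If the report format differs on this Hadoop build, print the file with hadoop fs -cat and inspect it by hand instead.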