10 – Sort Benchmarks

Bug 10 - Sort Benchmarks

Status:	CONFIRMED
Alias:	None
Version:	1.0
Assignee:	chenxi

Reported:	2023-06-13 02:50 UTC by Admin
Modified:	2023-06-13 02:50 UTC

Description Admin 2023-06-13 02:50:03 UTC

teragen: generates the data. teragen writes the data row by row, 100 bytes per row.
	
	To generate 100 MB of data, the required row count is 100*1024*1024/100 = 1048576. The generated data is stored in /teradata/100M-input:
    hadoop jar /usr/hadoop-parafs/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teragen -Dmapred.map.tasks=10 1048576 /teradata/100M-input
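
The row-count arithmetic above can be sketched as a small shell helper, which may be handy when scaling the benchmark to other sizes. The function name rows_for_mib is hypothetical and not part of Hadoop; it only assumes the fixed 100-byte row size stated in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): number of 100-byte teragen rows
# needed to produce a data set of the given size in MiB.
rows_for_mib() {
    size_mib=$1
    row_bytes=100   # teragen writes fixed 100-byte rows
    # Integer division: any fractional row is rounded down.
    echo $(( size_mib * 1024 * 1024 / row_bytes ))
}

rows_for_mib 100     # 100 MiB -> 1048576 rows
rows_for_mib 1024    # 1 GiB   -> 10737418 rows (rounded down)
```

The printed value is what would be passed as the row-count argument to teragen, in place of the literal 1048576 used in this report.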

   Check that the data was generated:
    hadoop fs -ls /teradata
	

	
	terasort: sorts the data that teragen generated in /teradata/100M-input and stores the sorted result in /teradata/100M-output
	
    hadoop jar /usr/hadoop-parafs/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar terasort -Dmapred.reduce.tasks=5 /teradata/100M-input /teradata/100M-output
	
	
	teravalidate: validates the sorted output of terasort and stores the validation result in /teradata/100M-validate
	
	hadoop jar /usr/hadoop-parafs/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teravalidate /teradata/100M-output /teradata/100M-validate
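
Once teravalidate has finished, the report it wrote still has to be inspected. The following is a minimal sketch of such a check, not a definitive procedure. It assumes that the report is a text file whose problem records begin with the key "error" and that a clean run contains only a "checksum" record; the function name report_is_clean and the file name part-r-00000 are assumptions and should be verified against the actual output directory.

```shell
#!/bin/sh
# Hypothetical helper: reads a teravalidate report on stdin and succeeds only
# if it contains no records starting with "error".
# Assumption: misordered data is reported as lines beginning with "error".
report_is_clean() {
    if grep -q '^error'; then
        return 1    # at least one error record was found
    fi
    return 0        # no error records
}

# Intended use against the path from this report (requires a running cluster):
#   hadoop fs -cat /teradata/100M-validate/part-r-00000 | report_is_clean \
#       && echo "output is sorted"
```

Note that an empty report also passes this check, so it may be worth confirming with hadoop fs -ls that the validate directory actually contains output.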
