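In production, every time I finish setting up a Hadoop cluster I like to test its read and write speed. The official Hadoop distribution ships test JARs for this purpose, and they are run with the hadoop jar command.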
1. Test write speed (Hadoop version hadoop-2.6.0-cdh5.16.2, stable in production)
Write 10 files of 128 MB each to the HDFS cluster
Command:
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.16.2-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 128MB
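Throughput was 3.8 MB/s and the average IO rate was 4.8 MB/s. The large gap between the two indicates that run time varied considerably from one map task to another.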
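To see why the two figures can diverge: TestDFSIO computes throughput as the total data size divided by the summed task time, while the average IO rate is the mean of the individual tasks' rates, so a single slow map pulls throughput down more than it pulls the average down. A minimal sketch with made-up numbers (the sizes and times below are hypothetical, not measurements from this cluster; `dfsio_metrics` is an illustrative helper name):

```shell
# Input: one line per map task, "size_in_MB elapsed_seconds".
# The sample values are hypothetical: two fast maps and one slow map.
dfsio_metrics() {
  awk '
    { size += $1; time += $2; rate += $1 / $2; n++ }
    END {
      # throughput: total MB divided by total task seconds
      printf "throughput=%.2f MB/s\n", size / time
      # average IO rate: mean of the per-map rates
      printf "avg_io_rate=%.2f MB/s\n", rate / n
    }'
}

printf '128 16\n128 16\n128 64\n' | dfsio_metrics
```

If all three maps took the same time, both values would be identical, which is why a gap between them points to uneven map durations.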
2. Test read speed
Read 10 files of 128 MB each from the HDFS cluster (the files written in step 1)
Command:
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.16.2-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 128MB
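By default TestDFSIO appends a summary of every run to TestDFSIO_results.log in the local working directory. The following is a small sketch for pulling the two rate figures out of that log. `dfsio_summary` is a hypothetical helper, and the patterns assume the 2.x log labels (`Throughput mb/sec:` and `Average IO rate mb/sec:`), so check them against your own log before relying on it.

```shell
# Print the test type, throughput, and average IO rate for each run in a results log.
# Assumes lines such as "----- TestDFSIO ----- : write" and "Throughput mb/sec: 3.8".
dfsio_summary() {
  awk -F': *' '
    /TestDFSIO ----- :/        { type = $2 }
    /Throughput mb\/sec:/      { tput = $2 }
    /Average IO rate mb\/sec:/ { print type, "throughput=" tput, "avg_io_rate=" $2 }
  ' "${1:-TestDFSIO_results.log}"
}

# Only run if a results log exists in the current directory.
if [ -f TestDFSIO_results.log ]; then
  dfsio_summary TestDFSIO_results.log
fi
```

Running it after both the write and read tests gives one line per run, which makes the two easy to compare side by side.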
3. Delete the test data
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.16.2-tests.jar TestDFSIO -clean
4. Test MapReduce speed
1. Use RandomWriter to generate random data: each node runs 10 map tasks, and each map produces about 1 GB of random binary data
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.16.2.jar randomwriter random-data
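Since the volume grows with cluster size, it can help to estimate how much data this step will generate before running it. A rough sketch using the figures stated above (10 maps per node, about 1 GB per map); `randomwriter_total_gb` is a hypothetical helper and the 3-node cluster is only an example:

```shell
# Estimate the amount of data randomwriter will generate, in GB.
# Defaults follow the text above: 10 maps per node, about 1 GB per map.
randomwriter_total_gb() {
  nodes=$1
  maps_per_node=${2:-10}
  gb_per_map=${3:-1}
  echo $((nodes * maps_per_node * gb_per_map))
}

randomwriter_total_gb 3   # hypothetical 3-node cluster
```

Keep in mind that HDFS replication multiplies the raw disk usage, and the sort step in the next item writes an output of similar size.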
2. Run the Sort program
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.16.2.jar sort random-data sorted-data
3. Verify that the data really is sorted
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.16.2-tests.jar testmapredsort -sortInput random-data -sortOutput sorted-data
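These jobs do not report a single speed figure the way TestDFSIO does, so one simple option is to time each step by wall clock. A sketch, assuming `date +%s` is available on the machine; `timed` and `EXAMPLES_JAR` are hypothetical names, and the commented lines reuse the commands above:

```shell
# Run a command, then report its wall-clock duration in whole seconds on stderr.
# The exit status of the wrapped command is passed through unchanged.
timed() {
  label=$1
  shift
  start=$(date +%s)
  status=0
  "$@" || status=$?
  end=$(date +%s)
  echo "${label}: $((end - start))s (exit ${status})" >&2
  return "$status"
}

# Usage with the commands above (requires a running cluster):
# EXAMPLES_JAR=${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.16.2.jar
# timed randomwriter hadoop jar "$EXAMPLES_JAR" randomwriter random-data
# timed sort         hadoop jar "$EXAMPLES_JAR" sort random-data sorted-data
```

Comparing the timings across runs on the same cluster is more meaningful than reading any single number in isolation.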