Hadoop Study Notes 4: Building a YARN-Based MapReduce Cluster

For building a YARN-based MapReduce cluster with ResourceManager HA, see the official documentation: https://hadoop.apache.org/docs/r2.6.5/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html

Background

MapReduce
For MapReduce fundamentals, see: https://blog.csdn.net/luzhensmart/article/details/90202313
YARN
For YARN fundamentals, see: https://www.jianshu.com/p/3f406cf438be

服务器准备

This setup builds on the servers from the previous post, [Hadoop Study Notes 3: High-Availability Cluster Setup (Hadoop 2.x)] https://www.jianshu.com/p/666ff9bbf784. The server plan for the YARN-based MapReduce cluster is shown in the figure below.

(Figure: 基于Yarn的MapReduce集群.png — server plan for the YARN-based MapReduce cluster)

I. Passwordless Login

The two ResourceManager nodes play the same role for YARN that the HA NameNodes play for HDFS: active/standby failover can happen between them, so the two nodes need passwordless SSH to each other. Per the plan above, configure passwordless login between node03 and node04.


(Figure: rm-ha-overview.png — ResourceManager HA overview)

On node03, in the .ssh directory:

# Generate a DSA key pair with an empty passphrase
ssh-keygen -t dsa -P '' -f ./id_dsa
# Trust our own key, then ship the public key to node04
cat id_dsa.pub >> authorized_keys
scp id_dsa.pub node04:`pwd`/node03.pub

On node04, in the .ssh directory:

# Trust node03's key, then generate node04's own key pair
cat node03.pub >> authorized_keys
ssh-keygen -t dsa -P '' -f ./id_dsa
cat id_dsa.pub >> authorized_keys
scp id_dsa.pub node03:`pwd`/node04.pub

Back on node03, in the .ssh directory:

cat node04.pub >> authorized_keys
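The three passes above reduce to one pattern: each node appends the peer's public key to its own authorized_keys. The merge logic can be sanity-checked in a local sandbox — throwaway files under /tmp, placeholder key strings, no real SSH involved:

```shell
# Local sandbox: simulate merging a peer's public key into authorized_keys.
# The key strings below are placeholders, not real keys.
mkdir -p /tmp/ssh_demo
cd /tmp/ssh_demo
echo 'ssh-dss AAAA-placeholder-node04 node04' > authorized_keys  # node04 already trusts itself
echo 'ssh-dss AAAA-placeholder-node03 node03' > node03.pub       # the file scp'd over above
cat node03.pub >> authorized_keys                                # same append used on node04
grep -c '^ssh-dss' authorized_keys                               # prints 2: both keys trusted
```

On the real nodes, the equivalent check is that `ssh node04` from node03 (and vice versa) no longer prompts for a password.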

II. Configuration

1. mapred-site.xml

# Rename the template file first
mv mapred-site.xml.template mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
# Per the official Single Node Cluster guide for running MapReduce on YARN
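Optionally, mapred-site.xml can also carry the JobHistory server endpoints so finished jobs remain inspectable after their ApplicationMaster exits. A sketch only — placing the service on node01 is an assumption for this cluster, not part of the original plan:

```xml
<!-- Optional: JobHistory server; the node01 placement is an assumption -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>node01:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>node01:19888</value>
</property>
```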

2. yarn-site.xml

# Register the MapReduce shuffle as a YARN auxiliary service,
# per the official Single Node Cluster guide
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

# Minimal ResourceManager HA configuration from the official docs
 <property>
   <name>yarn.resourcemanager.ha.enabled</name>
   <value>true</value>
 </property>
 <property>
   <name>yarn.resourcemanager.cluster-id</name>
   <value>cluster1</value>
 </property>
 <property>
   <name>yarn.resourcemanager.ha.rm-ids</name>
   <value>rm1,rm2</value>
 </property>
 <property>
   <name>yarn.resourcemanager.hostname.rm1</name>
   <value>node03</value>
 </property>
 <property>
   <name>yarn.resourcemanager.hostname.rm2</name>
   <value>node04</value>
 </property>
 <property>
   <name>yarn.resourcemanager.zk-address</name>
   <value>node02:2181,node03:2181,node04:2181</value>
 </property>
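The minimal configuration above enables failover, but running applications are not recovered across it. The same ResourceManagerHA page also documents ZooKeeper-backed state recovery; an optional sketch that reuses the quorum already configured above:

```xml
<!-- Optional: recover application state after an RM failover
     (not part of the minimal setup above) -->
<property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
```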

3. Distribute both files to node02, node03, and node04

scp mapred-site.xml yarn-site.xml node02:`pwd`
scp mapred-site.xml yarn-site.xml node03:`pwd`
scp mapred-site.xml yarn-site.xml node04:`pwd`

III. Starting the Cluster

1. Start ZooKeeper

# Run on node02, node03, and node04
zkServer.sh start

2. Start HDFS

# Run on node01
start-dfs.sh
# Note: do not use the start-all.sh script; it is deprecated
# If nn1 and nn2 did not come up, start the NameNodes manually on node01 and node02:
hadoop-daemon.sh start namenode

3. Start YARN

# Start the NodeManagers (run on node01)
start-yarn.sh
# Start the ResourceManagers on node03 and node04
yarn-daemon.sh start resourcemanager

Verify in a browser: http://node03:8088 and http://node04:8088 (the standby ResourceManager redirects to the active one).

IV. Stopping the Cluster

# Run on node01
stop-dfs.sh

# Stop the NodeManagers from node01
stop-yarn.sh

# Stop the ResourceManagers on node03 and node04
yarn-daemon.sh stop resourcemanager

# Stop ZooKeeper on node02, node03, and node04
zkServer.sh stop

V. Testing the Cluster

# test.txt was uploaded to HDFS back in [Hadoop Study Notes 2: Fully Distributed Setup (Hadoop 1.x)]
# Note: the output directory /wordcount must not exist yet, or the job will fail
hadoop jar hadoop-mapreduce-examples-2.6.5.jar wordcount test.txt /wordcount

Job progress can be monitored in the browser at
http://node04:8088/cluster

19/12/01 04:40:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
19/12/01 04:40:47 INFO input.FileInputFormat: Total input paths to process : 1
19/12/01 04:40:47 INFO mapreduce.JobSubmitter: number of splits:2
19/12/01 04:40:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1575143975107_0004
19/12/01 04:40:49 INFO impl.YarnClientImpl: Submitted application application_1575143975107_0004
19/12/01 04:40:50 INFO mapreduce.Job: The url to track the job: http://node04:8088/proxy/application_1575143975107_0004/
19/12/01 04:40:50 INFO mapreduce.Job: Running job: job_1575143975107_0004
19/12/01 04:41:49 INFO mapreduce.Job: Job job_1575143975107_0004 running in uber mode : false
19/12/01 04:41:49 INFO mapreduce.Job:  map 0% reduce 0%
19/12/01 04:43:30 INFO mapreduce.Job:  map 33% reduce 0%
19/12/01 04:43:31 INFO mapreduce.Job:  map 50% reduce 0%
19/12/01 04:46:44 INFO mapreduce.Job:  map 50% reduce 17%
19/12/01 04:46:46 INFO mapreduce.Job:  map 100% reduce 17%
19/12/01 04:46:50 INFO mapreduce.Job:  map 100% reduce 43%
19/12/01 04:46:53 INFO mapreduce.Job:  map 100% reduce 100%
19/12/01 04:46:55 INFO mapreduce.Job: Job job_1575143975107_0004 completed successfully

Viewing the result:

hdfs dfs -cat /wordcount/part-r-00000
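For intuition about what that part-r-00000 file contains, wordcount's two phases can be mimicked with a plain local shell pipeline — a conceptual sketch only; Hadoop runs the same logic as distributed map and reduce tasks:

```shell
# Map-phase stand-in: split lines into words; reduce stand-in: count per word.
printf 'hadoop yarn\nhadoop mapreduce\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2 "\t" $1}'
# Output (word<TAB>count, the same shape as part-r-00000):
# hadoop	2
# mapreduce	1
# yarn	1
```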
