1、HDFS-HA Cluster Configuration

Official documentation: Apache Hadoop 3.3.4 – HDFS High Availability Using the Quorum Journal Manager

1.1、Environment Preparation

Before building the HA cluster, finish the usual preparation on every node: set the IP address, set the hostname and the hostname-to-IP mapping, disable the firewall, configure passwordless SSH login, and install the JDK and configure its environment variables. A sketch of these steps follows.
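As a rough illustration only (assuming CentOS 7; the hostnames, IP addresses, and JDK install path below are placeholders and should be adapted to the actual machines):

# Set the hostname (run the matching command on each node)
hostnamectl set-hostname linux121

# Hostname-to-IP mapping (append on every node; the IPs are examples)
cat >> /etc/hosts <<'EOF'
192.168.1.121 linux121
192.168.1.122 linux122
192.168.1.123 linux123
EOF

# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# Passwordless SSH: generate a key pair once, then copy it to every node
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for host in linux121 linux122 linux123; do ssh-copy-id $host; done

# JDK environment variables (the install path is an example)
cat >> /etc/profile <<'EOF'
export JAVA_HOME=/opt/lagou/servers/jdk1.8.0_231
export PATH=$PATH:$JAVA_HOME/bin
EOF
source /etc/profile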

1.2、Cluster Plan

| linux121    | linux122        | linux123    |
|-------------|-----------------|-------------|
| NameNode    | NameNode        |             |
| JournalNode | JournalNode     | JournalNode |
| DataNode    | DataNode        | DataNode    |
| ZK          | ZK              | ZK          |
|             | ResourceManager |             |
| NodeManager | NodeManager     | NodeManager |

1.3、Start the ZooKeeper Cluster

Start the ZooKeeper cluster:

zk.sh start

Check its status:

zk.sh status

Note: zk.sh here is a cluster-wide start script that I wrote; a sketch of what it might look like follows.
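The script itself is not shown in the original, so the following is only a guess at its shape: a minimal group-operation script, assuming ZooKeeper is installed at /opt/lagou/servers/zookeeper-3.4.14 on all three nodes (both the path and the script body are illustrative):

#!/bin/bash
# zk.sh - run a zkServer.sh subcommand (start/stop/status) on all three nodes
ZK_HOME=/opt/lagou/servers/zookeeper-3.4.14   # assumed install path
for host in linux121 linux122 linux123; do
    echo "---------- $host ----------"
    # source /etc/profile so the non-interactive shell picks up JAVA_HOME
    ssh $host "source /etc/profile; $ZK_HOME/bin/zkServer.sh $1"
done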

1.4、Configure the HDFS-HA Cluster

(1) Stop the original HDFS cluster

stop-dfs.sh

(2) On every node, create an ha directory under /opt/lagou/servers

mkdir /opt/lagou/servers/ha
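With the passwordless SSH set up earlier, this can be done for all nodes from a single machine (a small convenience sketch):

for host in linux121 linux122 linux123; do
    ssh $host "mkdir -p /opt/lagou/servers/ha"
done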

(3) Copy hadoop-2.9.2 from /opt/lagou/servers/ into the ha directory

cd /opt/lagou/servers
cp -r hadoop-2.9.2 ha

(4) Delete the data directory of the copied cluster

rm -rf /opt/lagou/servers/ha/hadoop-2.9.2/data

(5) Configure hdfs-site.xml (for this and all following configuration files, clear out the original property entries first)

<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>lagoucluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.lagoucluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.lagoucluster.nn1</name>
        <value>linux121:9000</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.lagoucluster.nn2</name>
        <value>linux122:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.lagoucluster.nn1</name>
        <value>linux121:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.lagoucluster.nn2</name>
        <value>linux122:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://linux121:8485;linux122:8485;linux123:8485/lagou</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.lagoucluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/journalnode</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>

(6) Configure core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://lagoucluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/lagou/servers/ha/hadoop-2.9.2/data/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>linux121:2181,linux122:2181,linux123:2181</value>
    </property>
</configuration>

(7) Copy the configured Hadoop installation to the other nodes, for example as sketched below.
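From linux121, rsync is one way to do this (scp -r works equally well; the choice of tool is not prescribed by the original):

for host in linux122 linux123; do
    rsync -av /opt/lagou/servers/ha/hadoop-2.9.2 $host:/opt/lagou/servers/ha/
done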

1.5、Start the HDFS-HA Cluster

(1) On each JournalNode node, start the journalnode service with the following command (use the scripts under the HA installation directory, not the ones found via the PATH environment variable)

/opt/lagou/servers/ha/hadoop-2.9.2/sbin/hadoop-daemon.sh start journalnode
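On each node, jps should then show a JournalNode process alongside the ZooKeeper server started earlier (a sketch of the expected output; the PIDs will differ):

jps
# 2345 JournalNode
# 1892 QuorumPeerMain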

(2) On [nn1], format the NameNode and start it

/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs namenode -format

/opt/lagou/servers/ha/hadoop-2.9.2/sbin/hadoop-daemon.sh start namenode

(3) On [nn2], sync nn1's metadata

/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs namenode -bootstrapStandby

(4) On [nn1], initialize the ZKFC state in ZooKeeper

/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs zkfc -formatZK
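If this succeeds, a znode for the nameservice is created in ZooKeeper, which can be confirmed with the zkCli.sh shell that ships with ZooKeeper:

zkCli.sh -server linux121:2181
ls /hadoop-ha
# [lagoucluster]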

(5) On [nn1], start the cluster

/opt/lagou/servers/ha/hadoop-2.9.2/sbin/start-dfs.sh
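Afterwards, jps on each node should roughly match the cluster plan in section 1.2 (a sketch of the expected process list, not verbatim output):

# linux121: NameNode, DataNode, JournalNode, DFSZKFailoverController, QuorumPeerMain
# linux122: NameNode, DataNode, JournalNode, DFSZKFailoverController, QuorumPeerMain
# linux123: DataNode, JournalNode, QuorumPeerMain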

(6) Verify

Kill the Active NameNode process:

kill -9 <NameNode process id>

The standby NameNode should then automatically take over as active.
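One way to drive and check the failover end to end with the stock haadmin tool (which of nn1/nn2 starts out active depends on startup order):

# Check which NameNode is currently active
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# On the active NameNode's host, find and kill its process
jps | grep NameNode
kill -9 <pid printed by jps>

# Shortly afterwards, the other NameNode should report "active"
hdfs haadmin -getServiceState nn1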

2、YARN-HA Configuration

2.1、How YARN-HA Works

Official documentation:

Apache Hadoop 3.3.4 – ResourceManager High Availability

[Figure: YARN-HA working mechanism]

2.2、Configure the YARN-HA Cluster

(1) Environment preparation

Set the IP address, set the hostname and the hostname-to-IP mapping, disable the firewall, configure passwordless SSH login, install the JDK and configure environment variables, and set up the ZooKeeper cluster. All of this was already done in section 1.

(2) Detailed configuration

(3) yarn-site.xml (clear out the old property entries first)

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster-yarn</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>linux122</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>linux123</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>linux121:2181,linux122:2181,linux123:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>

(4) Sync the updated configuration to the other nodes
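A sketch of one way to do this, assuming the file was edited on linux121 and the paths from section 1.4:

for host in linux122 linux123; do
    scp /opt/lagou/servers/ha/hadoop-2.9.2/etc/hadoop/yarn-site.xml \
        $host:/opt/lagou/servers/ha/hadoop-2.9.2/etc/hadoop/
done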

(5) Start YARN

/opt/lagou/servers/ha/hadoop-2.9.2/sbin/start-yarn.sh
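In Hadoop 2.x, start-yarn.sh only starts the ResourceManager on the local machine, so it should be run on linux122 (rm1); the standby ResourceManager on linux123 may need to be started by hand. The HA state of both can then be queried with the stock rmadmin tool:

# On linux123, if the second ResourceManager is not running yet
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/yarn-daemon.sh start resourcemanager

# Check which ResourceManager is active
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2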
