Set Up Passwordless SSH Login

1) Generate the public and private keys

[kfk@bigdata-pro01 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/kfk/.ssh/id_rsa):
/home/kfk/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/kfk/.ssh/id_rsa.
Your public key has been saved in /home/kfk/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:0LUJcD8XkH4jluGWhmFhJ9rw9uWe0ZM7+yP3hlcepcc kfk@bigdata-pro01.kfk.com
The key's randomart image is:
+---[RSA 2048]----+
(randomart image omitted)
+----[SHA256]-----+
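When scripting this step, the interactive prompts above can be avoided entirely. A minimal sketch, assuming a local demo output path (`./id_rsa_demo` is just an example, not the path used in the session above):

```shell
# Generate an RSA key pair non-interactively:
#   -N ""  sets an empty passphrase
#   -f     names the output file explicitly
#   -q     suppresses the banner and randomart image
ssh-keygen -t rsa -b 2048 -N "" -f ./id_rsa_demo -q
test -f ./id_rsa_demo && test -f ./id_rsa_demo.pub && echo "key pair created"
```

The same flags work for the real `/home/kfk/.ssh/id_rsa` path once you are happy with the result.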

2) Distribute the key

Use ssh-copy-id to distribute the key, as follows:

[kfk@bigdata-pro01 ~]$ ssh-copy-id bigdata-pro02
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/kfk/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
kfk@bigdata-pro02's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'bigdata-pro02'" and check to make sure that only the key(s) you wanted were added.

[kfk@bigdata-pro01 ~]$ ssh-copy-id bigdata-pro03
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/kfk/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
kfk@bigdata-pro03's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'bigdata-pro03'" and check to make sure that only the key(s) you wanted were added.

[kfk@bigdata-pro01 ~]$ ssh-copy-id bigdata-pro04
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/kfk/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
kfk@bigdata-pro04's password:
Permission denied, please try again.
kfk@bigdata-pro04's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'bigdata-pro04'" and check to make sure that only the key(s) you wanted were added.

[kfk@bigdata-pro01 ~]$ ssh-copy-id bigdata-pro05
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/kfk/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
kfk@bigdata-pro05's password:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'bigdata-pro05'" and check to make sure that only the key(s) you wanted were added.
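The four ssh-copy-id runs above can be collapsed into a single loop. A sketch, with the hostnames taken from the session above; the `echo` only prints each command, so remove it to actually execute them:

```shell
# Print the ssh-copy-id command for every worker host.
# Remove "echo" to run the commands for real (each will prompt for a password once).
for host in bigdata-pro02 bigdata-pro03 bigdata-pro04 bigdata-pro05; do
  echo "ssh-copy-id $host"
done
```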

Synchronize the Environment Variable Configuration

Add the following to /etc/bashrc:

export JAVA_HOME=/opt/modules/jdk-18.0.2.1
export PATH=$PATH:$JAVA_HOME/bin

Then reload the profile:

. /home/kfk/.bash_profile
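A quick sanity check that the exports took effect. A sketch assuming the JDK path from above (the directory does not need to exist for the PATH entry to be set):

```shell
# Re-declare the variables exactly as /etc/bashrc would, then confirm
# that the JDK bin directory actually landed on PATH.
export JAVA_HOME=/opt/modules/jdk-18.0.2.1
export PATH=$PATH:$JAVA_HOME/bin
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) echo "JAVA_HOME/bin is on PATH" ;;
  *)                    echo "JAVA_HOME/bin is missing" ;;
esac
```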

Create the /opt/modules directory on the other hosts

mkdir -p /opt/modules

Distribute the JDK

scp -r /opt/modules/jdk-18.0.2.1 root@bigdata-pro02:/opt/modules
scp -r /opt/modules/jdk-18.0.2.1 root@bigdata-pro03:/opt/modules
scp -r /opt/modules/jdk-18.0.2.1 root@bigdata-pro04:/opt/modules
scp -r /opt/modules/jdk-18.0.2.1 root@bigdata-pro05:/opt/modules

Distribute Hadoop

scp -r /home/kfk/hadoop-2.8.0 kfk@bigdata-pro02:/home/kfk/
scp -r /home/kfk/hadoop-2.8.0 kfk@bigdata-pro03:/home/kfk/
scp -r /home/kfk/hadoop-2.8.0 kfk@bigdata-pro04:/home/kfk/
scp -r /home/kfk/hadoop-2.8.0 kfk@bigdata-pro05:/home/kfk/

Configure the Cluster

Core configuration file

Configure core-site.xml:

[atguigu@hadoop102 hadoop]$ vi core-site.xml

Add the following configuration to the file:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop102:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>

HDFS configuration files

Configure hadoop-env.sh:

[atguigu@hadoop102 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144

Configure hdfs-site.xml:

[atguigu@hadoop102 hadoop]$ vi hdfs-site.xml

Add the following configuration to the file:

<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop104:50090</value>
</property>

YARN configuration files

Configure yarn-env.sh:

[atguigu@hadoop102 hadoop]$ vi yarn-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144

Configure yarn-site.xml:

[atguigu@hadoop102 hadoop]$ vi yarn-site.xml

Add the following configuration to the file:

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop103</value>
</property>

MapReduce configuration files

Configure mapred-env.sh:

[atguigu@hadoop102 hadoop]$ vi mapred-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144

Configure mapred-site.xml:

[atguigu@hadoop102 hadoop]$ cp mapred-site.xml.template mapred-site.xml

[atguigu@hadoop102 hadoop]$ vi mapred-site.xml

Add the following configuration to the file:

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

Distribute the finished Hadoop configuration files across the cluster

Sync them with scp or an rsync script.
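One way to build such an rsync script; a sketch under assumptions: the script name (xsync.sh) and the hard-coded host list are illustrative, not part of the original setup, and the `echo` prints each transfer command instead of running it:

```shell
#!/bin/sh
# Hypothetical distribution script (xsync.sh): sync a path to every other host.
# HOSTS and the default TARGET are assumptions for illustration.
HOSTS="hadoop103 hadoop104"
TARGET="${1:-/opt/module/hadoop-2.7.2/etc/hadoop}"
for host in $HOSTS; do
  # Remove "echo" to perform the actual transfer.
  echo "rsync -av $TARGET $host:$TARGET"
done
```

Invoked as `./xsync.sh /opt/module/hadoop-2.7.2/etc/hadoop`, it pushes the config directory to every listed host in one step.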

Starting Cluster Daemons One by One

(1) If this is the first time the cluster is started, format the NameNode:

[atguigu@hadoop102 hadoop-2.7.2]$ hadoop namenode -format

(2) Start the NameNode on hadoop102:

[atguigu@hadoop102 hadoop-2.7.2]$ hadoop-daemon.sh start namenode
[atguigu@hadoop102 hadoop-2.7.2]$ jps
3461 NameNode

(3) Start a DataNode on each of hadoop102, hadoop103, and hadoop104:

[atguigu@hadoop102 hadoop-2.7.2]$ hadoop-daemon.sh start datanode
[atguigu@hadoop102 hadoop-2.7.2]$ jps
3461 NameNode
3608 Jps
3561 DataNode
[atguigu@hadoop103 hadoop-2.7.2]$ hadoop-daemon.sh start datanode
[atguigu@hadoop103 hadoop-2.7.2]$ jps
3190 DataNode
3279 Jps
[atguigu@hadoop104 hadoop-2.7.2]$ hadoop-daemon.sh start datanode
[atguigu@hadoop104 hadoop-2.7.2]$ jps
3237 Jps
3163 DataNode

Starting the Whole Cluster at Once

Configure slaves

/opt/module/hadoop-2.7.2/etc/hadoop/slaves

[atguigu@hadoop102 hadoop]$ vi slaves

Add the following to the file:

hadoop102
hadoop103
hadoop104

Note: no trailing spaces are allowed at the end of these entries, and the file must not contain blank lines.
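Because a stray trailing space or blank line in slaves silently breaks the group start, the file is worth checking mechanically. A sketch that writes a sample slaves file locally just for the demonstration:

```shell
# Create a sample slaves file, then reject any line that ends in
# whitespace or is empty -- the two mistakes the note above warns about.
printf 'hadoop102\nhadoop103\nhadoop104\n' > slaves.sample
if grep -nE '[[:space:]]$|^$' slaves.sample; then
  echo "slaves file has problems"
else
  echo "slaves file looks clean"
fi
```

Point the `grep` at the real `/opt/module/hadoop-2.7.2/etc/hadoop/slaves` to check the live file.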

Start the cluster

(1) If this is the first time the cluster is started, format the NameNode:

[atguigu@hadoop102 hadoop-2.7.2]$ bin/hdfs namenode -format

(2) Start HDFS:

[atguigu@hadoop102 hadoop-2.7.2]$ sbin/start-dfs.sh
[atguigu@hadoop102 hadoop-2.7.2]$ jps
4166 NameNode
4482 Jps
4263 DataNode
[atguigu@hadoop103 hadoop-2.7.2]$ jps
3218 DataNode
3288 Jps

[atguigu@hadoop104 hadoop-2.7.2]$ jps
3221 DataNode
3283 SecondaryNameNode
3364 Jps

(3) Start YARN:

[atguigu@hadoop103 hadoop-2.7.2]$ sbin/start-yarn.sh

Note: if the NameNode and the ResourceManager are not on the same machine, do not start YARN on the NameNode; start YARN on the machine where the ResourceManager runs.

Basic Cluster Tests

Upload files to the cluster

Upload a small file

[atguigu@hadoop102 hadoop-2.7.2]$ hadoop fs -mkdir -p /user/atguigu/input
[atguigu@hadoop102 hadoop-2.7.2]$ hadoop fs -put wcinput/wc.input /user/atguigu/input

Upload a large file

[atguigu@hadoop102 hadoop-2.7.2]$ bin/hadoop fs -put /opt/software/hadoop-2.7.2.tar.gz /user/atguigu/input

After uploading, check where the files are stored

Check the HDFS file storage path

[atguigu@hadoop102 subdir0]$ pwd
/opt/module/hadoop-2.7.2/data/tmp/dfs/data/current/BP-938951106-192.168.10.107-1495462844069/current/finalized/subdir0/subdir0

Check the file content that HDFS stores on disk

[atguigu@hadoop102 subdir0]$ cat blk_1073741825
hadoop yarn
hadoop mapreduce
atguigu
atguigu

Concatenate the blocks

-rw-rw-r--. 1 atguigu atguigu 134217728 May 23 16:01 blk_1073741836
-rw-rw-r--. 1 atguigu atguigu   1048583 May 23 16:01 blk_1073741836_1012.meta
-rw-rw-r--. 1 atguigu atguigu  63439959 May 23 16:01 blk_1073741837
-rw-rw-r--. 1 atguigu atguigu    495635 May 23 16:01 blk_1073741837_1013.meta
[atguigu@hadoop102 subdir0]$ cat blk_1073741836 >> tmp.file
[atguigu@hadoop102 subdir0]$ cat blk_1073741837 >> tmp.file
[atguigu@hadoop102 subdir0]$ tar -zxvf tmp.file
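What makes this concatenation work is that HDFS stores a large file as ordered, fixed-size blocks, so writing the block files back-to-back reproduces the original bytes exactly. A small local sketch of the same idea, using split and cat in place of HDFS blocks:

```shell
# Split a file into fixed-size pieces (stand-ins for HDFS blocks),
# concatenate them in order, and verify the result matches the original.
printf 'hadoop stores files as ordered blocks' > original.dat
split -b 10 original.dat part_   # 10-byte "blocks": part_aa, part_ab, ...
cat part_* > rebuilt.dat         # the shell glob sorts names, preserving order
cmp -s original.dat rebuilt.dat && echo "files identical"
```

This is why `tar -zxvf tmp.file` succeeds above: tmp.file is byte-for-byte the original hadoop-2.7.2.tar.gz.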

Download

[atguigu@hadoop102 hadoop-2.7.2]$ bin/hadoop fs -get /user/atguigu/input/hadoop-2.7.2.tar.gz ./

Beyond this, you still need to configure cluster time synchronization.

Cluster Time Synchronization

The approach: pick one machine to act as the time server; all other machines synchronize their clocks against it on a schedule, for example once every ten minutes.
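On the client machines, that ten-minute schedule is typically expressed as a cron entry. A sketch under assumptions: hadoop102 serving as the time server and ntpdate being installed at /usr/sbin/ntpdate are both illustrative choices, not part of the original text:

```shell
# Print a crontab line that resynchronizes the clock every 10 minutes
# against the assumed time server hadoop102.
echo '*/10 * * * * /usr/sbin/ntpdate hadoop102'
# To install it, run "crontab -e" as root and paste the line above.
```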
