1. Cluster Planning

hadoop102: master, worker
hadoop103: worker
hadoop104: worker

2. Environment Preparation

        1) All three nodes need JDK (1.8+) installed, with the relevant environment variables configured.

        2) A database is required; MySQL (5.7+) or PostgreSQL (8.2.15+) is supported.

        3) ZooKeeper (3.4.6+) must be deployed.

        4) The process management package psmisc must be installed on all three nodes, as shown below (a quick prerequisite check follows the commands).

[root@hadoop102 ~]$ sudo yum install -y psmisc

[root@hadoop103 ~]$ sudo yum install -y psmisc

[root@hadoop104 ~]$ sudo yum install -y psmisc
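Before moving on, it may be worth confirming that the prerequisites are actually in place on each node. A minimal sketch (it assumes MySQL is the chosen database and that ZooKeeper's bin directory is on the PATH; adjust to your environment):

$ java -version        # should report 1.8 or later
$ mysql --version      # or psql --version if using PostgreSQL
$ zkServer.sh status   # ZooKeeper should report leader/follower/standalone
$ rpm -q psmisc        # confirms psmisc is installed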

3. Initialize the Database

         DolphinScheduler stores its metadata in a relational database, so the corresponding database and user need to be created.

Note: to avoid errors caused by an overly simple password, lower the password strength requirements first:

mysql> set global validate_password_length=4;

mysql> set global validate_password_policy=0; -- allows all-digit or all-letter passwords

Without the settings above, you may hit this error:

ERROR 1819 (HY000): Your password does not satisfy the current policy requirements

  The procedure has three steps: 1. create the dolphinscheduler database; 2. create the dolphinscheduler user; 3. grant the dolphinscheduler user full privileges on the dolphinscheduler database.

1. Create the dolphinscheduler database

mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

2. Create the dolphinscheduler user

mysql> CREATE USER 'dolphinscheduler'@'%' IDENTIFIED BY 'dolphinscheduler';

The % means the dolphinscheduler user can connect to MySQL from any host.

3. Grant the dolphinscheduler user full privileges on the dolphinscheduler database

mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%'; -- do not change this to lowercase

4. Flush privileges

mysql> flush privileges;
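As a quick sanity check before deployment (using the credentials created above; adjust the host to wherever MySQL actually runs), confirm the new user can log in and see its database:

$ mysql -h hadoop102 -udolphinscheduler -pdolphinscheduler -e "SHOW DATABASES LIKE 'dolphinscheduler';"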

4. Deployment

        1. Extract the archive

[root@hadoop102 ~]# cd /opt/software

[root@hadoop102 software]# tar -zxvf apache-dolphinscheduler-3.0.0-bin.tar.gz

[root@hadoop102 software]# ll

drwxrwxr-x 10 root root 4096 Sep 19 09:40 apache-dolphinscheduler-3.0.0-bin

-r-------- 1 root root 148680098 Sep 12 21:51 apache-dolphinscheduler-3.0.0-bin.tar.gz

        2. Enter the apache-dolphinscheduler-3.0.0-bin directory and edit the configuration

In total, three kinds of configuration files need to be modified:

1. /opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/install_env.sh

2、/opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/dolphinscheduler_env.sh

3. The common.properties settings in the four files below; all four are configured identically. The common.properties under worker-server does not need to be edited, because at startup the worker-server reads the common.properties under tools.

     /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/conf/common.properties

     /opt/software/apache-dolphinscheduler-3.0.0-bin/api-server/conf/common.properties

     /opt/software/apache-dolphinscheduler-3.0.0-bin/alert-server/conf/common.properties

     /opt/software/apache-dolphinscheduler-3.0.0-bin/master-server/conf/common.properties

Start configuring:

Configuration 1:

[root@hadoop102 env]# vim /opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/install_env.sh

Modify the following settings:

# 1. For now, everything is configured only on hadoop102. The scheduler learns about your cluster through ips; during installation the files are distributed to the hosts listed here (hadoop103, hadoop104).

ips="hadoop102,hadoop103,hadoop104"

# 2. SSH port; no need to change

sshPort="22"

# 3. Which host runs the master; must be one or more of the hosts in ips above

masters="hadoop102" # multiple masters: "hadoop102,hadoop103"

# 4. Worker hosts and the default worker group; the default group you see after logging into the UI refers to this. The hosts must come from ips above.

workers="hadoop102:default,hadoop103:default,hadoop104:default"

# 5. Hosts for the alertServer and apiServers

alertServer="hadoop102"

apiServers="hadoop102"

# 6. Installation path of DolphinScheduler; install to a path the deploy user has permissions on. Note: once the cluster is running, any later configuration change must be made under this path; changing the configuration under the extracted package path has no effect!

installPath="/opt/module/apache-dolphinscheduler-quality"

# 7. Deploy user: the user that starts DolphinScheduler; it needs sudo privileges and passwordless SSH

deployUser="hdfs"
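The deploy user in #7 needs passwordless SSH from hadoop102 to every host in ips (itself included). A minimal sketch, assuming the hdfs user already exists on all three nodes and these commands are run as that user on hadoop102:

$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa   # skip if a key pair already exists
$ ssh-copy-id hdfs@hadoop102
$ ssh-copy-id hdfs@hadoop103
$ ssh-copy-id hdfs@hadoop104
$ ssh hdfs@hadoop103 hostname                # should print hadoop103 without asking for a password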

 Configuration 2:

vim /opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/dolphinscheduler_env.sh

# 1. Set JAVA_HOME

export JAVA_HOME=/usr/local/jdk1.8.0_231

# 2. Configure the MySQL database created earlier

export DATABASE="mysql"

export SPRING_PROFILES_ACTIVE=${DATABASE}

export SPRING_DATASOURCE_URL="jdbc:mysql://172.24.140.181:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8"

export SPRING_DATASOURCE_USERNAME="dolphinscheduler"

export SPRING_DATASOURCE_PASSWORD="dolphinscheduler"

# 3. ZooKeeper settings

export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}

export REGISTRY_ZOOKEEPER_CONNECT_STRING="172.24.140.181:2181,172.24.140.182:2181,172.24.140.183:2181"

# 4. Hadoop and Hive environment variables. Note: if you need Python or DataX later, configure them here as well

export HADOOP_HOME=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hadoop

export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hadoop/etc/hadoop

export SPARK_HOME=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/spark

export HIVE_HOME=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive
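A quick way to catch typos in this file is to source it and probe a few of the values (a rough sketch; the database address is the one configured above):

$ source /opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/dolphinscheduler_env.sh
$ $JAVA_HOME/bin/java -version                      # JAVA_HOME resolves
$ ls $HADOOP_CONF_DIR/core-site.xml                 # Hadoop conf directory exists
$ mysql -h 172.24.140.181 -udolphinscheduler -pdolphinscheduler -e "SELECT 1;"   # metadata DB reachable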

Configuration 3:

vim /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/conf/common.properties

# 1. Path for temporary files

data.basedir.path=/tmp/dolphinschedulerquality

# 2. Resource storage type: HDFS, S3, or NONE. Resources are files such as scripts and jar packages; this decides where they are uploaded, and that is where tasks will look for them later.

resource.storage.type=HDFS

# 3. Root path on HDFS; resources are uploaded under this path

resource.upload.path=/dolphinscheduler

# 4. If Kerberos is not enabled, these can be left alone; if it is enabled, configure as follows

# 4.1 whether to startup kerberos

hadoop.security.authentication.startup.state=true

# 4.2 java.security.krb5.conf path

java.security.krb5.conf.path=/opt/krb5.conf

# 4.3 login user from keytab username

login.user.keytab.username=hdfs-mycluster@ESZ.COM

# 4.4 login user from keytab path

login.user.keytab.path=/opt/hdfs.headless.keytab

# 4.5 kerberos expire time, the unit is hour

kerberos.expire.time=2

# 6. The user used to operate HDFS; it must be an HDFS superuser (whoever starts the NameNode is the superuser; on CDH it is the hdfs user). If you are not sure what it is, look it up before filling this in.

hdfs.root.user=hdfs

# 7. If NameNode HA is enabled, copy core-site.xml and hdfs-site.xml into the conf directory and change this to the cluster name, e.g. hdfs://mycluster:8020 (a copy sketch follows after this configuration listing)

fs.defaultFS=hdfs://sd-140-181:8020

# 8. How DolphinScheduler knows whether a YARN job has finished: it queries YARN through the endpoint below; usually no change needed

resource.manager.httpaddress.port=8088

# 9. YARN ResourceManager address; two cases depending on whether HA is enabled

# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty

yarn.resourcemanager.ha.rm.ids=

# if resourcemanager HA is enabled or resourcemanager is not used, keep the default value; if there is a single resourcemanager, just replace the hostname with the actual resourcemanager hostname

yarn.application.status.address=http://sd-140-181:%s/ws/v1/cluster/apps/%s

# job history status url when application number threshold is reached (default 10000, maybe it was set to 1000)

yarn.job.history.status.address=http://sd-140-181:19888/ws/v1/history/mapreduce/jobs/%s
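Two follow-up checks for this file, sketched below with the CDH paths and hostnames used above ($HADOOP_CONF_DIR assumes the environment file from configuration 2 has been sourced). First, as noted in #7, with NameNode HA the Hadoop client configs must sit next to each common.properties; second, the ResourceManager REST endpoint from #8 should answer.

$ cp $HADOOP_CONF_DIR/core-site.xml $HADOOP_CONF_DIR/hdfs-site.xml /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/conf/
$ cp $HADOOP_CONF_DIR/core-site.xml $HADOOP_CONF_DIR/hdfs-site.xml /opt/software/apache-dolphinscheduler-3.0.0-bin/api-server/conf/
$ cp $HADOOP_CONF_DIR/core-site.xml $HADOOP_CONF_DIR/hdfs-site.xml /opt/software/apache-dolphinscheduler-3.0.0-bin/alert-server/conf/
$ cp $HADOOP_CONF_DIR/core-site.xml $HADOOP_CONF_DIR/hdfs-site.xml /opt/software/apache-dolphinscheduler-3.0.0-bin/master-server/conf/
$ curl http://sd-140-181:8088/ws/v1/cluster/info    # ResourceManager web service should return cluster info as JSON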

After finishing configuration 3, apply the same settings to the following files as well (one way to copy them is sketched after this list):

 vim /opt/software/apache-dolphinscheduler-3.0.0-bin/api-server/conf/common.properties

 vim /opt/software/apache-dolphinscheduler-3.0.0-bin/alert-server/conf/common.properties

 vim /opt/software/apache-dolphinscheduler-3.0.0-bin/master-server/conf/common.properties
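Rather than editing each of these by hand, one option is simply to copy the already-edited tools version over the other three, since all four are supposed to be identical:

$ cd /opt/software/apache-dolphinscheduler-3.0.0-bin
$ cp tools/conf/common.properties api-server/conf/common.properties
$ cp tools/conf/common.properties alert-server/conf/common.properties
$ cp tools/conf/common.properties master-server/conf/common.properties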

3. Place the MySQL driver jar

The official docs require version 8.0.16 or later; mine is 8.0.30. Do not take the driver straight from a MySQL 5.x install; if you do not have a suitable jar, download one (a version check follows the copy commands).

$cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/api-server/libs/

$cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/alert-server/libs

$cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/master-server/libs

$cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/worker-server/libs

$cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/libs
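To confirm the copied driver really is 8.0.16 or newer, the version can be read from the jar's manifest (a sketch using the tools copy; any of the five copies works):

$ unzip -p /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/libs/mysql-connector-java.jar META-INF/MANIFEST.MF | grep -i version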

4. Initialize and start the DolphinScheduler scheduler

   Initialize the database schema

$cd /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/bin

$sh upgrade-schema.sh

 Install and start

$ cd /opt/software/apache-dolphinscheduler-3.0.0-bin/bin/

$ sh install.sh
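install.sh distributes the package to the hosts in ips and starts all services. A quick way to verify (a sketch; the process names are the usual 3.0.x ones, and hadoop103/hadoop104 should each show a WorkerServer):

$ jps | grep -E "MasterServer|WorkerServer|AlertServer|ApiApplicationServer"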

Open http://test01:12345/dolphinscheduler/ui/login in a browser to log in to the system UI (use the host where the api-server runs).

The default username and password are admin / dolphinscheduler123.

A time zone issue after startup:

Add export SPRING_JACKSON_TIME_ZONE=Asia/Shanghai to bin/env/dolphinscheduler_env.sh and restart the services.

Alternatively, after logging in, the time zone can be changed from the top-right corner of the UI.
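A sketch of the environment-variable approach. Remember that after installation the effective copy lives under installPath, so edit that one; the start/stop scripts assumed here are the ones shipped in the installed bin directory:

$ echo 'export SPRING_JACKSON_TIME_ZONE=Asia/Shanghai' >> /opt/module/apache-dolphinscheduler-quality/bin/env/dolphinscheduler_env.sh
$ /opt/module/apache-dolphinscheduler-quality/bin/stop-all.sh
$ /opt/module/apache-dolphinscheduler-quality/bin/start-all.sh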

Feel free to leave a comment if anything is unclear; I check regularly.
