
Spark3.X分布式集群部署

1. Deployment Plan

For Hadoop high-availability cluster deployment, see: Hadoop 3.X Distributed High-Availability Cluster Deployment

1.1 Versions

Software   Version
OS         CentOS Linux release 7.8.2003 (Core)
Java       jdk-8u271-linux-x64
Hadoop     hadoop-3.3.1
Scala      scala-2.12.15
Spark      spark-3.1.2-bin-hadoop3.2
1.2 Cluster Plan

hostname      IP              Components
master        172.16.20.200   NameNode, Spark-Master
secondmaster  172.16.20.201   NameNode, Spark-Master
slave1        172.16.20.202   ZooKeeper, DataNode, NodeManager, Spark-Worker
slave2        172.16.20.203   ZooKeeper, DataNode, NodeManager, Spark-Worker
slave3        172.16.20.204   ZooKeeper, DataNode, NodeManager, Spark-Worker
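The plan above assumes each hostname resolves to its IP on every node. A minimal sketch that stages matching /etc/hosts entries (the staging file name hosts.cluster is only an example, chosen so the entries can be reviewed before touching /etc/hosts):

```shell
# Sketch: stage /etc/hosts entries matching the cluster plan above.
# Review hosts.cluster, then append it to /etc/hosts on every node.
cat > hosts.cluster << 'EOF'
172.16.20.200 master
172.16.20.201 secondmaster
172.16.20.202 slave1
172.16.20.203 slave2
172.16.20.204 slave3
EOF
wc -l hosts.cluster
```

After review, on each node: cat hosts.cluster >> /etc/hosts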
2. Environment Configuration

2.1 Configure the Scala Environment
  • Same steps on master, slave1, and slave2

Download and extract

Download URL: https://downloads.lightbend.com/scala/2.12.15/scala-2.12.15.tgz

tar -zxf scala-2.12.15.tgz
mv scala-2.12.15 /usr/local/scala

Configure environment variables

cat >> /etc/profile << 'EOF'
#SCALA
SCALA_HOME=/usr/local/scala
PATH=$SCALA_HOME/bin:$PATH
export PATH SCALA_HOME

EOF
source /etc/profile

Verify

scala -version
Scala code runner version 2.12.15 -- Copyright 2002-2021, LAMP/EPFL and Lightbend, Inc.
2.2 Passwordless SSH Between Nodes

Run the following on all nodes:

ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i /root/.ssh/id_rsa.pub root@slave2
ssh-copy-id -i /root/.ssh/id_rsa.pub root@slave3
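The three per-host commands above can also be generated from a single node list, which keeps the key-distribution step in sync if a node is added later. A sketch assuming root and the default key path used above (the file name sshcopy.cmds is only an example):

```shell
# Sketch: generate the ssh-copy-id commands from one node list.
# Assumes root and the default key path /root/.ssh/id_rsa.pub.
NODES="slave1 slave2 slave3"
for h in $NODES; do
  echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$h"
done > sshcopy.cmds
cat sshcopy.cmds
```

After review, execute the generated commands with: sh sshcopy.cmds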
3. Spark Cluster Deployment

3.1 Download and Extract

Download URL: http://spark.apache.org/downloads.html

tar -zxf spark-3.1.2-bin-hadoop3.2.tgz -C /opt/hadoop/
ln -s /opt/hadoop/spark-3.1.2-bin-hadoop3.2 /usr/local/spark

On each node, configure environment variables by appending the following to /etc/profile:

cat >> /etc/profile << 'EOF'
SPARK_HOME=/usr/local/spark
PATH=$SPARK_HOME/bin:$PATH
export PATH SPARK_HOME

EOF
source /etc/profile
3.2 Modify Configuration
cd $SPARK_HOME/conf

spark-env.sh

mkdir -pv /data/spark
cat > spark-env.sh << 'EOF'
export JAVA_HOME=/usr/java/jdk1.8/jdk1.8.0_271
export SCALA_HOME=/usr/local/scala
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=slave1:2181,slave2:2181,slave3:2181 -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_LOCAL_DIRS=/data/spark
export SPARK_DRIVER_MEMORY=4g
export SPARK_WORKER_CORES=4
EOF
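Master HA depends on both masters pointing at the same ZooKeeper ensemble, so it is worth confirming the recovery options name every ZooKeeper node. A small check sketch (the OPTS string mirrors the SPARK_DAEMON_JAVA_OPTS value written above):

```shell
# Sketch: verify the HA recovery options list all three ZooKeeper nodes.
OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=slave1:2181,slave2:2181,slave3:2181 -Dspark.deploy.zookeeper.dir=/spark"
missing=0
for zk in slave1 slave2 slave3; do
  case "$OPTS" in
    *"${zk}:2181"*) echo "ok: ${zk}" ;;
    *) echo "missing: ${zk}"; missing=1 ;;
  esac
done
```

Run the same check (with OPTS read from spark-env.sh) on both master and secondmaster to ensure their HA settings match.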

workers

cat > workers << EOF
slave1
slave2
slave3
EOF
3.3 Sync Configuration
rsync -av /opt/hadoop/spark-3.1.2-bin-hadoop3.2 root@secondmaster:/opt/hadoop/
rsync -av /opt/hadoop/spark-3.1.2-bin-hadoop3.2 root@slave1:/opt/hadoop/
rsync -av /opt/hadoop/spark-3.1.2-bin-hadoop3.2 root@slave2:/opt/hadoop/
rsync -av /opt/hadoop/spark-3.1.2-bin-hadoop3.2 root@slave3:/opt/hadoop/

Then create the symlink on each of those nodes:

ln -s /opt/hadoop/spark-3.1.2-bin-hadoop3.2 /usr/local/spark
3.4 Startup

On the master node, start/stop the whole Spark cluster:

$SPARK_HOME/sbin/start-all.sh
$SPARK_HOME/sbin/stop-all.sh

On the secondmaster node, start/stop the standby Master alone:

$SPARK_HOME/sbin/start-master.sh
$SPARK_HOME/sbin/stop-master.sh
4. Verify Startup Status

4.1 Command-Line Checks

Check the ZooKeeper data:

zkCli.sh
ls /spark
[leader_election, master_status]

Check with jps:

master node:

jps
15928 Master

slave节点

jps
11907 Worker
4.2 Web UI Check

Visit port 8080 on master and secondmaster to view the Spark master web UI:

master: Status: ALIVE

secondmaster: Status: STANDBY
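Besides the browser, the standalone Master's status can also be read from the UI's /json endpoint. A parsing sketch; the resp value below is a hypothetical sample response, and on a live cluster it would come from curl:

```shell
# Sketch: extract the Master status field from the standalone UI JSON.
# resp below is a hypothetical sample; on a live cluster use:
#   resp=$(curl -s http://master:8080/json/)
resp='{"url":"spark://master:7077","workers":[],"status":"ALIVE"}'
status=$(printf '%s' "$resp" | sed -n 's/.*"status"[ ]*:[ ]*"\([A-Z]*\)".*/\1/p')
echo "$status"
```

The same one-liner against secondmaster should print STANDBY while the primary Master is alive.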

5. High-Availability Verification

Stop the Master process on the master node, then open the Spark page on secondmaster and confirm its status switches to ALIVE.

Reprinted from www.051e.com. Original article: http://www.051e.com/it/274001.html