
For the Hadoop high-availability cluster deployment, see: Hadoop 3.X Distributed High-Availability Cluster Deployment
1.1 Version Description

| Software | Version |
|---|---|
| OS | CentOS Linux release 7.8.2003 (Core) |
| Java | jdk-8u271-linux-x64 |
| Hadoop | hadoop-3.3.1 |
| Scala | scala-2.12.15 |
| Spark | spark-3.1.2-bin-hadoop3.2 |
| hostname | IP | Components |
|---|---|---|
| master | 172.16.20.200 | NameNode, Spark-Master |
| secondmaster | 172.16.20.201 | NameNode, Spark-Master |
| slave1 | 172.16.20.202 | Zookeeper, DataNode, NodeManager, Spark-Worker |
| slave2 | 172.16.20.203 | Zookeeper, DataNode, NodeManager, Spark-Worker |
| slave3 | 172.16.20.204 | Zookeeper, DataNode, NodeManager, Spark-Worker |
2.1 Download and Extract

Download: https://downloads.lightbend.com/scala/2.12.15/scala-2.12.15.tgz

```shell
tar -zxf scala-2.12.15.tgz
mv scala-2.12.15 /usr/local/scala
```
Configure environment variables
```shell
cat >> /etc/profile << 'EOF'
#SCALA
SCALA_HOME=/usr/local/scala
PATH=$SCALA_HOME/bin:$PATH
export PATH SCALA_HOME
EOF
source /etc/profile
```
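The heredoc delimiter is quoted (`'EOF'`) on purpose: it stops the shell from expanding `$SCALA_HOME` and `$PATH` while writing the file, so the literal variable references land in /etc/profile and get expanded at login time instead. A minimal demo of the effect, using a throwaway file rather than /etc/profile:

```shell
# Quoted delimiter: $SCALA_HOME is written literally, not expanded now.
cat > /tmp/profile-demo << 'EOF'
SCALA_HOME=/usr/local/scala
PATH=$SCALA_HOME/bin:$PATH
EOF

# The literal reference survives in the file:
grep -c 'SCALA_HOME/bin' /tmp/profile-demo   # prints 1
```

With an unquoted `EOF`, `$SCALA_HOME` would be expanded (likely to an empty string) at write time, producing a broken PATH line.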
Verify
```shell
scala -version
Scala code runner version 2.12.15 -- Copyright 2002-2021, LAMP/EPFL and Lightbend, Inc.
```

2.2 Passwordless SSH Between Nodes
Run the following on all nodes:
```shell
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i /root/.ssh/id_rsa.pub root@slave2
ssh-copy-id -i /root/.ssh/id_rsa.pub root@slave3
```
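A quick way to confirm the key exchange worked is to run a non-interactive command on each target; `-o BatchMode=yes` makes ssh fail outright instead of prompting, so any leftover password prompt surfaces as an error. A sketch, assuming the hostnames above resolve:

```shell
# Each hop should print the remote hostname with no password prompt;
# BatchMode=yes turns a would-be prompt into an immediate failure.
for h in slave1 slave2 slave3; do
  ssh -o BatchMode=yes root@$h hostname
done
```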
3. Spark Cluster Deployment

3.1 Download and Extract

Download: http://spark.apache.org/downloads.html
```shell
tar -zxf spark-3.1.2-bin-hadoop3.2.tgz -C /opt/hadoop/
ln -s /opt/hadoop/spark-3.1.2-bin-hadoop3.2 /usr/local/spark
```
Configure environment variables on every node by adding the following to /etc/profile:
```shell
cat >> /etc/profile << 'EOF'
SPARK_HOME=/usr/local/spark
PATH=$SPARK_HOME/bin:$PATH
export PATH SPARK_HOME
EOF
source /etc/profile
```

3.2 Edit the Configuration
```shell
cd $SPARK_HOME/conf
```
spark-env.sh
```shell
mkdir -pv /data/spark
cat > spark-env.sh << 'EOF'
export JAVA_HOME=/usr/java/jdk1.8/jdk1.8.0_271
export SCALA_HOME=/usr/local/scala
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=slave1:2181,slave2:2181,slave3:2181 -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_LOCAL_DIRS=/data/spark
export SPARK_DRIVER_MEMORY=4g
export SPARK_WORKER_CORES=4
EOF
```
workers
```shell
cat > workers << EOF
slave1
slave2
slave3
EOF
```

3.3 Sync the Configuration
```shell
rsync -av /opt/hadoop/spark-3.1.2-bin-hadoop3.2 root@secondmaster:/opt/hadoop/
rsync -av /opt/hadoop/spark-3.1.2-bin-hadoop3.2 root@slave1:/opt/hadoop/
rsync -av /opt/hadoop/spark-3.1.2-bin-hadoop3.2 root@slave2:/opt/hadoop/
rsync -av /opt/hadoop/spark-3.1.2-bin-hadoop3.2 root@slave3:/opt/hadoop/
```
Then create the symlink on each of those nodes:
```shell
ln -s /opt/hadoop/spark-3.1.2-bin-hadoop3.2 /usr/local/spark
```

3.4 Start
On the master node, start/stop the whole Spark cluster:
```shell
$SPARK_HOME/sbin/start-all.sh
$SPARK_HOME/sbin/stop-all.sh
```
On the secondmaster node, start/stop the standby Master by itself:
```shell
$SPARK_HOME/sbin/start-master.sh
$SPARK_HOME/sbin/stop-master.sh
```
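With both Masters up, a quick smoke test is to submit the bundled SparkPi example. Note that for a ZooKeeper-backed standalone cluster the master URL lists both masters, comma-separated, so the driver can find whichever one is currently ALIVE (the jar path below assumes the stock spark-3.1.2-bin-hadoop3.2 layout):

```shell
# Submit the stock SparkPi example against the HA master pair.
$SPARK_HOME/bin/spark-submit \
  --master spark://master:7077,secondmaster:7077 \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_2.12-3.1.2.jar 100
```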
4. Verify Startup Status

4.1 Command-Line Checks

Check the ZooKeeper data:
```shell
zkCli.sh
ls /spark
[leader_election, master_status]
```
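For scripting, zkCli.sh can also take the server address and a single command on the command line instead of opening an interactive session; a sketch, assuming the ZooKeeper ensemble from spark-env.sh:

```shell
# Non-interactive: connect to one ensemble member and list Spark's znodes.
zkCli.sh -server slave1:2181 ls /spark
```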
Check with jps
On the master node:
```shell
# jps output
15928 Master
```
On the slave nodes:
```shell
# jps output
11907 Worker
```

4.2 Web UI Check
Open port 8080 on both master and secondmaster to view the Spark master page:
master: Status: ALIVE
secondmaster: Status: STANDBY
5. High-Availability Verification

Stop the Master process on the master node, then open the secondmaster Spark page and check whether its status has switched to ALIVE.
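The failover check can also be scripted. Spark's standalone Master serves a JSON view of its state alongside the web UI; the endpoint path below (`/json/`) is an assumption to verify against your Spark version:

```shell
# 1. On master: stop the ALIVE Master to force a failover.
$SPARK_HOME/sbin/stop-master.sh

# 2. Wait for the ZooKeeper-led election, then poll the standby's status.
sleep 15
curl -s http://secondmaster:8080/json/ | grep '"status"'   # expect ALIVE
```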