Prerequisites for Debian Hadoop High Availability (HA)
Before configuring HA, ensure you have:
- At least 3 Debian nodes (for NameNode, DataNode, JournalNode, ResourceManager roles) with static IPs and proper hostname/DNS resolution.
- Java (OpenJDK 8 or 11) installed on all nodes (`sudo apt install openjdk-11-jdk`).
- Hadoop (version 3.x recommended) downloaded and extracted on all nodes.
- Passwordless SSH configured between all nodes (using `ssh-keygen` and `ssh-copy-id`) for seamless communication.
- ZooKeeper cluster (3 or 5 nodes) set up for coordination (critical for automatic failover).
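The passwordless-SSH prerequisite can be scripted. A minimal sketch, saved to a file and only syntax-checked here since it prompts for passwords on a real cluster; the hostnames `node1 node2 node3` are placeholders for your own:

```shell
# Sketch: distribute one SSH public key to every node (hostnames are placeholders).
# Written to a file and syntax-checked only; run it by hand on a real cluster.
cat > /tmp/setup_ssh.sh <<'EOF'
#!/bin/bash
set -e
NODES="node1 node2 node3"   # replace with your node hostnames
# Generate a key pair once if one does not exist yet
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Copy the public key to every node (asks for each password once)
for host in $NODES; do
    ssh-copy-id "$host"
done
EOF
bash -n /tmp/setup_ssh.sh   # syntax check; prints nothing on success
```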
1. Configure ZooKeeper Cluster (Coordination Service)
ZooKeeper is essential for monitoring NameNode/ResourceManager health and triggering automatic failover.
- Install ZooKeeper: On each ZooKeeper node, run `sudo apt install zookeeper zookeeperd`.
- Configure ZooKeeper: Edit `/etc/zookeeper/conf/zoo.cfg` on all nodes to include the cluster members:

```
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zoo1:2888:3888  # Replace with your node hostnames
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
```

  Create a `myid` file in `/var/lib/zookeeper` on each node with a unique ID (e.g., `1` for zoo1, `2` for zoo2).
- Start ZooKeeper: Run `sudo systemctl start zookeeper` on all nodes and verify status with `sudo systemctl status zookeeper`.
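The `myid` step above is easy to get wrong by hand. A small sketch that derives each node's ID from the `server.N=` lines of `zoo.cfg`; it is sandboxed to a temp directory here, while on a real node you would write to `/var/lib/zookeeper/myid` and set `me` from `hostname -s`:

```shell
# Sketch: derive the ZooKeeper myid from zoo.cfg (run per node with its own hostname).
# Uses a temp dir instead of /var/lib/zookeeper so it is safe to run anywhere.
workdir=$(mktemp -d)
cat > "$workdir/zoo.cfg" <<'EOF'
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
EOF
me=zoo2   # on a real node: me=$(hostname -s)
# Extract the N from "server.N=<me>:..." and write it as myid
id=$(sed -n "s/^server\.\([0-9]*\)=$me:.*/\1/p" "$workdir/zoo.cfg")
echo "$id" > "$workdir/myid"
cat "$workdir/myid"   # → 2
```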
2. Configure HDFS High Availability (NameNode HA)
HDFS HA eliminates the single point of failure (SPOF) of the NameNode using Active/Standby nodes and JournalNodes for metadata synchronization.
- Modify `core-site.xml`: Define the HDFS namespace and the ZooKeeper quorum (used by the ZKFC failover controllers):

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zoo1:2181,zoo2:2181,zoo3:2181</value>
</property>
```

- Modify `hdfs-site.xml`: Configure NameNode roles, shared storage (JournalNodes), and failover settings:

```xml
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>namenode1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>namenode2:8020</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://journalnode1:8485;journalnode2:8485;journalnode3:8485/mycluster</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
```

- Start JournalNodes: On each JournalNode host, run `hadoop-daemon.sh start journalnode` (in Hadoop 3.x: `hdfs --daemon start journalnode`). Verify with `jps` (look for `JournalNode` processes).
- Format and Start NameNodes:
  - On the Active NameNode (e.g., namenode1), format it: `hdfs namenode -format`.
  - On the Standby NameNode (e.g., namenode2), copy the formatted metadata over: `hdfs namenode -bootstrapStandby`.
  - Initialize the failover state in ZooKeeper (once, from either NameNode): `hdfs zkfc -formatZK`.
  - Start HDFS on both nodes: `start-dfs.sh`.
  - Check NameNode status with `hdfs haadmin -getServiceState nn1` and `hdfs haadmin -getServiceState nn2` (one should report `active`, the other `standby`).
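The one-time bootstrap sequence above can be collected into a single script. A sketch, saved to a file and only syntax-checked here because it needs a live cluster; `nn1`/`nn2` are the logical NameNode IDs from `hdfs-site.xml`:

```shell
# Sketch: one-time HDFS HA bootstrap (run each piece on the host noted in its comment).
# Written to a file and syntax-checked only; it requires a running cluster.
cat > /tmp/bootstrap_hdfs_ha.sh <<'EOF'
#!/bin/bash
set -e
# On each JournalNode host first: hdfs --daemon start journalnode
hdfs namenode -format                 # on the first NameNode only
# On the second NameNode instead:    hdfs namenode -bootstrapStandby
hdfs zkfc -formatZK                   # once; creates the failover znode in ZooKeeper
start-dfs.sh                          # starts NameNodes, DataNodes, and ZKFCs
hdfs haadmin -getServiceState nn1     # expect "active" or "standby"
hdfs haadmin -getServiceState nn2
EOF
bash -n /tmp/bootstrap_hdfs_ha.sh     # syntax check; prints nothing on success
```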
3. Configure YARN High Availability (ResourceManager HA)
YARN HA ensures the ResourceManager (which schedules jobs) remains available even if one instance fails.
- Modify `yarn-site.xml`: Configure ResourceManager roles and ZooKeeper for state storage:

```xml
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yarn-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zoo1:2181,zoo2:2181,zoo3:2181</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm1</value> <!-- set to rm2 on the second ResourceManager -->
</property>
```

  Also set `yarn.resourcemanager.hostname.rm1` and `yarn.resourcemanager.hostname.rm2` to the two ResourceManager hosts.
- Start YARN: On the Active ResourceManager (e.g., resourcemanager1), run `start-yarn.sh`. The Standby ResourceManager (e.g., resourcemanager2) will recover state from ZooKeeper on failover.
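After start-up, `yarn rmadmin -getServiceState` reports which ResourceManager is active. A sketch, saved and syntax-checked only since it needs a live cluster; `rm1`/`rm2` are the IDs from `yarn-site.xml`:

```shell
# Sketch: print the HA state of both ResourceManagers (needs a running cluster).
cat > /tmp/check_rm_ha.sh <<'EOF'
#!/bin/bash
# rm1/rm2 are the yarn.resourcemanager.ha.rm-ids values
for id in rm1 rm2; do
    state=$(yarn rmadmin -getServiceState "$id")
    echo "$id: $state"
done
EOF
bash -n /tmp/check_rm_ha.sh   # syntax check only
```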
4. Validate High Availability
- Check NameNode Status: Run `hdfs haadmin -getServiceState nn1` (and `nn2`) to confirm one Active and one Standby NameNode.
- Test Failover:
  - Simulate an Active NameNode failure (e.g., `kill -9` the NameNode process on the active node).
  - Wait 30–60 seconds (ZooKeeper election time) and check the service states again; the Standby should become Active.
- Check ResourceManager Status: Run `yarn rmadmin -getServiceState rm1` (and `rm2`) to see which ResourceManager is Active, and `yarn node -list` to verify it is serving requests.
- Submit a Test Job: Run a simple MapReduce job (e.g., `hadoop jar hadoop-mapreduce-examples.jar pi 10 100`) to ensure the cluster functions during failover.
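The failover drill above can be scripted end to end. A sketch, saved and syntax-checked only since it kills a live process; the 60-second sleep matches the election window mentioned above:

```shell
# Sketch: failover drill — run on the currently active NameNode host.
# Written to a file and syntax-checked only; destructive on a real cluster.
cat > /tmp/failover_drill.sh <<'EOF'
#!/bin/bash
set -e
echo "before: nn1=$(hdfs haadmin -getServiceState nn1) nn2=$(hdfs haadmin -getServiceState nn2)"
# Simulate a crash of the local NameNode process (exact match, not SecondaryNameNode)
kill -9 "$(jps | awk '$2 == "NameNode" {print $1}')"
sleep 60   # ZooKeeper election window
echo "after:  nn1=$(hdfs haadmin -getServiceState nn1) nn2=$(hdfs haadmin -getServiceState nn2)"
EOF
bash -n /tmp/failover_drill.sh   # syntax check only
```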
Key Notes for Production
- Use an odd number of JournalNodes (3 or 5) so a write quorum can always be formed, avoiding split-brain scenarios.
- Secure ZooKeeper with authentication (e.g., SASL) in multi-tenant environments.
- Monitor cluster health with tools like Prometheus + Grafana or Ambari to detect issues early.
- Regularly back up NameNode metadata (the fsimage on the NameNodes; JournalNodes hold only recent edit logs) to prevent data loss.
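For the metadata backup note above, `hdfs dfsadmin -fetchImage` downloads the latest fsimage from the active NameNode into a local directory. A cron-style sketch, saved and syntax-checked only; the `/backup/hdfs` path is a placeholder:

```shell
# Sketch: fetch the latest fsimage into a dated backup directory (placeholder path).
# Written to a file and syntax-checked only; needs a running NameNode.
cat > /tmp/backup_fsimage.sh <<'EOF'
#!/bin/bash
set -e
dest="/backup/hdfs/$(date +%F)"   # placeholder backup location
mkdir -p "$dest"
hdfs dfsadmin -fetchImage "$dest"
EOF
bash -n /tmp/backup_fsimage.sh   # syntax check only
```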