Prerequisites
Before installing HDFS on Debian, ensure your system is up-to-date and install essential tools:
sudo apt update && sudo apt upgrade -y
sudo apt install wget ssh vim -y
These commands update package lists, upgrade installed packages, and install wget (for downloading Hadoop), ssh (for remote access), and vim (for configuration editing).
1. Install Java Environment
Hadoop requires Java 8 or higher. Install OpenJDK 11 (recommended for compatibility):
sudo apt install openjdk-11-jdk -y
Verify the installation:
java -version
You should see output indicating OpenJDK 11 is installed.
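Step 4 will need the Java installation path for JAVA_HOME. One quick way to discover it — a sketch assuming an apt-installed OpenJDK, where the `java` on PATH is a symlink chain into /usr/lib/jvm:

```shell
# Resolve the java binary through its symlinks, then strip the trailing
# /bin/java to get a JAVA_HOME candidate.
JAVA_BIN=$(readlink -f "$(command -v java)")
JAVA_HOME_CANDIDATE="${JAVA_BIN%/bin/java}"
echo "$JAVA_HOME_CANDIDATE"
```

On Debian with OpenJDK 11 this typically prints /usr/lib/jvm/java-11-openjdk-amd64, matching the value used later in ~/.bashrc.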
2. Create a Dedicated Hadoop User
For security and isolation, create a non-root user (e.g., hadoop) and add it to the sudo group:
sudo adduser hadoop
sudo usermod -aG sudo hadoop
Switch to the new user:
su - hadoop
This user will manage all Hadoop operations.
3. Download and Extract Hadoop
Download the latest stable Hadoop release (e.g., 3.3.6) from the Apache website:
wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
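Optionally, verify the download against the checksum Apache publishes alongside each release. A sketch — the .sha512 file name mirrors the tarball, but its exact format has varied between releases, so fall back to comparing the output of `sha512sum hadoop-3.3.6.tar.gz` by eye if `-c` rejects the file:

```shell
# Fetch the published SHA-512 checksum and check the tarball against it;
# sha512sum -c exits non-zero on a mismatch.
wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz.sha512
sha512sum -c hadoop-3.3.6.tar.gz.sha512
```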
Extract the archive to /usr/local/ and rename the directory for simplicity:
sudo tar -xzvf hadoop-3.3.6.tar.gz -C /usr/local/
sudo mv /usr/local/hadoop-3.3.6 /usr/local/hadoop
Change ownership of the Hadoop directory to the hadoop user:
sudo chown -R hadoop:hadoop /usr/local/hadoop
4. Configure Environment Variables
Set up Hadoop-specific environment variables in /etc/profile (system-wide) or ~/.bashrc (user-specific). Open the file with vim:
vim ~/.bashrc
Add the following lines at the end:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64 # Adjust if using a different Java version
Load the changes into the current session:
source ~/.bashrc
Verify the variables are set:
echo $HADOOP_HOME # Should output /usr/local/hadoop
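A slightly broader sanity check — confirm both variables are set and that the hadoop wrapper now resolves from PATH (just a sketch; `hadoop version` only works once the archive from step 3 is in place):

```shell
# Print each variable, flagging any that are unset.
for v in HADOOP_HOME JAVA_HOME; do
  eval "val=\$$v"
  if [ -n "$val" ]; then echo "$v=$val"; else echo "$v is unset"; fi
done
# The hadoop wrapper should now resolve from PATH.
command -v hadoop && hadoop version | head -n 1
```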
5. Configure SSH Passwordless Login
Hadoop requires passwordless SSH between the NameNode and DataNodes. Generate an SSH key pair:
ssh-keygen -t rsa -b 4096 -C "hadoop@debian"
Press Enter to accept default file locations and skip passphrase entry. Copy the public key to the local machine (for single-node clusters) or other cluster nodes:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
Test passwordless login:
ssh localhost
You should log in without entering a password.
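For scripts, the same check can be made non-interactive: BatchMode makes ssh fail outright instead of stopping at a password prompt, so a hung prompt can never be mistaken for success.

```shell
# Exits 0 only when key-based auth succeeds; accept-new auto-trusts the
# host key on first contact so the test needs no interaction.
if ssh -o BatchMode=yes -o StrictHostKeyChecking=accept-new localhost true; then
  echo "passwordless SSH OK"
else
  echo "passwordless SSH FAILED"
fi
```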
6. Configure Hadoop Core Files
Navigate to the Hadoop configuration directory:
cd $HADOOP_HOME/etc/hadoop
Edit the following files to define HDFS behavior:
- core-site.xml: Sets the default file system (HDFS) and the NameNode address. This guide builds a single-node cluster, so localhost is used here; in a multi-node cluster, use the NameNode's hostname instead.

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

- hdfs-site.xml: Configures the replication factor (for fault tolerance) and the data directories.

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop/hdfs/datanode</value>
  </property>
</configuration>

- mapred-site.xml: Specifies the MapReduce framework (YARN).

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

- yarn-site.xml: Configures YARN resource management.

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
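A stray or missing tag in any of these files makes the daemons fail at startup with a parse error, so it is worth checking well-formedness up front. A sketch, assuming xmllint from Debian's libxml2-utils package is installed:

```shell
# Validate each edited config file; --noout prints nothing on success
# and xmllint exits non-zero on malformed XML.
cd "$HADOOP_HOME/etc/hadoop"
for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
  xmllint --noout "$f" && echo "$f: well-formed"
done
```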
7. Create HDFS Data Directories
Create the directories specified in hdfs-site.xml for NameNode and DataNode storage:
sudo mkdir -p /opt/hadoop/hdfs/namenode
sudo mkdir -p /opt/hadoop/hdfs/datanode
sudo chown -R hadoop:hadoop /opt/hadoop # Change ownership to the hadoop user
8. Format the NameNode
The NameNode must be formatted once before starting HDFS. Run this command carefully (it will erase existing HDFS data):
hdfs namenode -format
You should see output indicating successful formatting.
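If the log output scrolled past, a successful format can also be confirmed on disk: formatting creates a current/ subdirectory with a VERSION file under the NameNode metadata directory configured in hdfs-site.xml.

```shell
# A freshly formatted NameNode leaves its metadata here (the path matches
# dfs.namenode.name.dir from hdfs-site.xml).
ls -l /opt/hadoop/hdfs/namenode/current/
cat /opt/hadoop/hdfs/namenode/current/VERSION
```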
9. Start HDFS Services
Start the HDFS daemons (NameNode and DataNode) using the start-dfs.sh script:
$HADOOP_HOME/sbin/start-dfs.sh
Check the status of HDFS processes with jps:
jps
You should see NameNode and DataNode running (along with other Java processes).
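For scripting, the same check can be made explicit. This loop reports each expected daemon by name (on a single-node setup started with start-dfs.sh you will usually also see a SecondaryNameNode):

```shell
# grep -q exits non-zero when the daemon name is absent from jps output.
for d in NameNode DataNode; do
  if jps | grep -q "$d"; then echo "$d: running"; else echo "$d: NOT running"; fi
done
```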
10. Verify HDFS Installation
Use HDFS commands to confirm the cluster is operational:
- List the root directory:
hdfs dfs -ls /
- Create a test directory:
hdfs dfs -mkdir -p /user/hadoop/input
- Upload a local file to HDFS:
echo "Hello, HDFS!" > test.txt
hdfs dfs -put test.txt /user/hadoop/input/
- Read the file from HDFS:
hdfs dfs -cat /user/hadoop/input/test.txt
You should see the output Hello, HDFS!
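Beyond the CLI, Hadoop 3.x also serves a NameNode status page on port 9870 by default, which gives a quick browserless health check:

```shell
# -s silences progress output, -f makes curl exit non-zero on HTTP errors.
if curl -sf http://localhost:9870/ > /dev/null; then
  echo "NameNode web UI reachable on port 9870"
else
  echo "NameNode web UI not reachable"
fi
```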
Troubleshooting Tips
- Port Conflicts: Ensure ports such as 9000 (NameNode RPC) and 9870 (NameNode web UI in Hadoop 3.x; older 2.x releases used 50070) are not blocked by your firewall.
- Java Issues: Verify JAVA_HOME is correctly set in $HADOOP_HOME/etc/hadoop/hadoop-env.sh.
- Permission Errors: Use chown to ensure the hadoop user owns all Hadoop-related directories.
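To check the port question concretely, list listening sockets and filter for the HDFS ports — a sketch using ss from iproute2, which is standard on Debian:

```shell
# Shows processes bound to the NameNode RPC (9000) and web UI (9870) ports;
# prints a hint if neither is listening.
ss -tlnp 2>/dev/null | grep -E ':(9000|9870)\s' || echo "no HDFS ports listening"
```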
That covers how to install HDFS on Debian.