【Hadoop】 Hadoop Getting Started 1: Installation and Configuration (Pseudo-Distributed Mode)

0 Preparation

Configure the Java environment on Linux: https://blog.csdn.net/Tiezhu_Wang/article/details/113822949
Disable the Linux firewall: https://blog.csdn.net/Tiezhu_Wang/article/details/113861262
Install Firefox: https://blog.csdn.net/Tiezhu_Wang/article/details/113385544

1 Download

Official website: https://hadoop.apache.org/releases.html

Or Baidu web disk: link: https://pan.baidu.com/s/1XHwHfBIu3fFSnqmtuH1p_A (extraction code: xysm)
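
If you prefer the command line, the release can also be fetched directly; this assumes the Apache archive still hosts the 3.2.1 tarball at its usual path:

wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz -P ~/Downloads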

2 Installation

Install Hadoop in the /usr/local directory:

sudo tar -zxf ~/Downloads/hadoop-3.2.1.tar.gz -C /usr/local

Switch to the directory to confirm that extraction completed, then change the ownership of the files (here hadoop is the system username):

cd /usr/local
sudo chown -R hadoop ./hadoop-3.2.1/
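
To double-check both steps, you can list the installation directory and its owner (a quick sanity check; the owner should now be your username):

ls -ld /usr/local/hadoop-3.2.1    # owner should be hadoop
ls /usr/local/hadoop-3.2.1        # bin, etc, sbin and other Hadoop directories should be listed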

3 Check if Hadoop is available

Hadoop is ready to use as soon as it is extracted. Check the Hadoop version with the following command:

/usr/local/hadoop-3.2.1/bin/hadoop version

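The first line of the version output identifies the release, roughly like this (the build details that follow vary by download):

Hadoop 3.2.1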

4 Configure Pseudo-Distributed Mode

4.1 Set the Hadoop environment variables

vim ~/.bashrc

Add the following environment variables:

export HADOOP_HOME=/usr/local/hadoop-3.2.1
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

After saving the changes, make the configuration take effect:

source ~/.bashrc
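
Optionally, confirm that the shell picked up the new variables (a quick check, assuming a bash shell):

echo $HADOOP_HOME    # should print /usr/local/hadoop-3.2.1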

Switch to any directory and check whether the environment variables were configured successfully:

cd
hadoop version

If the version information matches the output above, the configuration succeeded.

4.2 Edit the configuration files

Hadoop pseudo-distributed mode requires changes to two configuration files: core-site.xml and hdfs-site.xml.
core-site.xml:

cd /usr/local/hadoop-3.2.1/etc/hadoop/
gedit ./core-site.xml

Add the following configuration, then save the file:

<configuration>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>file:/usr/local/hadoop-3.2.1/tmp</value>
		<description>A base for other temporary directories.</description>
	</property>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://localhost:9000</value>
	</property>
</configuration>
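
Here fs.defaultFS sets the address clients use to reach HDFS, and hadoop.tmp.dir is the base path for Hadoop's working files. Hadoop creates this directory itself when the NameNode is formatted, but you can create it up front to confirm the permissions are right (optional):

mkdir -p /usr/local/hadoop-3.2.1/tmp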


hdfs-site.xml:

cd /usr/local/hadoop-3.2.1/etc/hadoop/
gedit ./hdfs-site.xml

Add the following configuration, then save the file:

<configuration>
	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>file:/usr/local/hadoop-3.2.1/tmp/dfs/name</value>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>file:/usr/local/hadoop-3.2.1/tmp/dfs/data</value>
	</property>
</configuration>
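
dfs.replication is 1 because a pseudo-distributed cluster has only one DataNode, and the two directory properties place NameNode and DataNode storage under the tmp directory configured above. If xmllint happens to be installed, you can sanity-check both files for XML syntax errors:

cd /usr/local/hadoop-3.2.1/etc/hadoop/
xmllint --noout ./core-site.xml ./hdfs-site.xml    # no output means the XML is well-formed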


5 Verify that the configuration was successful

Once the configuration is complete, format the NameNode:

hdfs namenode -format

If the output includes a message saying that the storage directory has been successfully formatted, the format succeeded.
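
Note that start-dfs.sh logs in to localhost over SSH, so passwordless SSH is normally required first. A minimal setup, assuming an SSH server is already installed and running:

mkdir -p ~/.ssh && chmod 700 ~/.ssh
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa          # generate a key with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                     # should now log in without a password; type exit to return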

Start the NameNode and DataNode daemons:

start-dfs.sh

Then use jps to check whether the daemons started successfully:

jps

You should see three Hadoop processes: NameNode, DataNode, and SecondaryNameNode.
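Typical jps output looks roughly like this (the process IDs will differ on your machine):

12001 NameNode
12135 DataNode
12322 SecondaryNameNode
12480 Jps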

You can also visit http://localhost:9870 to open the HDFS web interface. Once the page loads, you can browse the HDFS directory tree (Utilities > Browse the file system). If all of the above appears as described, the configuration is successful. Stop the daemons with the following command:

stop-dfs.sh
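
Running jps again should confirm that the daemons have shut down:

jps    # NameNode, DataNode and SecondaryNameNode should no longer be listed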
