Single Node Hadoop Cluster
Installation Guide
Follow the steps below to set up Hadoop in pseudo-distributed mode.
1. Download VMware Player
https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/5_0
https://www.vmware.com/tryvmware/?p=player
2. Download Ubuntu VM Image
http://www.momotrade.com/tool/vm/ubuntu1604t.html
4. Install SSH and set up a passwordless SSH connection
sudo apt-get install openssh-server
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
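With its default StrictModes setting, sshd usually ignores authorized_keys when the file or the .ssh directory is writable by other users, and the login then falls back to a password prompt. The sketch below tightens those permissions. The helper name fix_ssh_perms is hypothetical and not part of the original steps.

```shell
# Hypothetical helper (not part of the original guide): restrict permissions
# on the SSH directory and the authorized_keys file so sshd accepts the key.
fix_ssh_perms() {
  ssh_dir="${1:-$HOME/.ssh}"
  chmod 700 "$ssh_dir" && chmod 600 "$ssh_dir/authorized_keys"
}

# Typical use after the commands above (commented out so that pasting this
# block only defines the helper):
#   fix_ssh_perms
#   ssh localhost    # accept the host key once; no password prompt expected
```

If ssh localhost still asks for a password, the key or its permissions are the first things to check.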
5. Install Java 8 on Ubuntu
sudo apt-get install openjdk-8-jdk
6. Download Hadoop
Download the Hadoop release used in this guide (2.7.3)
http://apache.claz.org/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
wget http://apache.claz.org/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
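An interrupted download can leave a truncated archive that only fails partway through extraction. A minimal sketch of a pre-check, assuming a hypothetical helper named archive_ok that lists the archive contents without unpacking anything:

```shell
# Hypothetical helper: succeed only if the file is a readable gzip-compressed
# tar archive. Nothing is extracted; the listing is discarded.
archive_ok() {
  tar tzf "$1" > /dev/null 2>&1
}

# Typical use (commented out so that pasting this block only defines the helper):
#   archive_ok hadoop-2.7.3.tar.gz || echo "archive is damaged, download it again"
```

This only confirms that the file is structurally intact. It does not replace comparing the file against the checksum published with the release.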
tar xzf hadoop-2.7.3.tar.gz
7. Update .bashrc
--Set up JAVA_HOME
Open the .bashrc file and add the line below. On a 64-bit image the directory is typically java-8-openjdk-amd64; check with ls /usr/lib/jvm.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-i386
--Add Hadoop env vars to .bashrc
export HADOOP_HOME=/home/user/hadoop-2.7.3
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME
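A mistyped path in .bashrc usually surfaces much later as a confusing startup error. After reloading the file with source ~/.bashrc, a small check can confirm that each variable points at a real directory. This is a sketch only; check_env_dir is a hypothetical helper, not something Hadoop provides.

```shell
# Hypothetical helper: check_env_dir NAME
# Succeeds if the variable called NAME is set and names an existing directory.
check_env_dir() {
  name="$1"
  eval "value=\${$name:-}"   # indirect lookup that works in plain POSIX sh
  if [ -z "$value" ]; then
    echo "$name is not set"
    return 1
  fi
  if [ ! -d "$value" ]; then
    echo "$name=$value is not a directory"
    return 1
  fi
  echo "$name=$value OK"
}

# Typical use (commented out so that pasting this block only defines the helper):
#   check_env_dir JAVA_HOME
#   check_env_dir HADOOP_HOME
```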
8. Update the Hadoop configuration files: hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml
cd $HADOOP_HOME/etc/hadoop
vi /home/user/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
Add the line below to hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-i386
--1. Update core-site.xml
vi /home/user/hadoop-2.7.3/etc/hadoop/core-site.xml
Add the property below to core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
--2. hdfs-site.xml
vi /home/user/hadoop-2.7.3/etc/hadoop/hdfs-site.xml
Add the property below to hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
--3. yarn-site.xml
vi /home/user/hadoop-2.7.3/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
--4. mapred-site.xml
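Hadoop 2.7.3 ships only mapred-site.xml.template, so copy it first:
cp /home/user/hadoop-2.7.3/etc/hadoop/mapred-site.xml.template /home/user/hadoop-2.7.3/etc/hadoop/mapred-site.xml
vi /home/user/hadoop-2.7.3/etc/hadoop/mapred-site.xml
Add the property below to mapred-site.xml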
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
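Since the four XML files are edited by hand, it can help to read a value back before starting any daemon. The sketch below assumes a hypothetical helper named get_prop and the simple layout used in this guide, with each name and value element on its own line. It is a plain text match, not a real XML parser.

```shell
# Hypothetical helper: get_prop FILE NAME
# Prints the <value> that follows the matching <name> in a Hadoop config file.
get_prop() {
  awk -v key="$2" '
    /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
    /<value>/ { v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
                if (n == key) print v }
  ' "$1"
}

# Typical use (commented out so that pasting this block only defines the helper):
#   get_prop "$HADOOP_HOME/etc/hadoop/hdfs-site.xml" dfs.replication
#   get_prop "$HADOOP_HOME/etc/hadoop/mapred-site.xml" mapreduce.framework.name
```

An empty result means the property was not found in that file, which is the usual symptom of editing the wrong file.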
9. Format Namenode
hdfs namenode -format
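Formatting initialises the NameNode metadata directory and discards any existing HDFS metadata, so it is normally run once rather than before every start. A minimal sketch of a guard follows, assuming a hypothetical helper named namenode_formatted. It also assumes the configuration above leaves hadoop.tmp.dir at its default, which places the metadata under /tmp/hadoop-<user>/dfs/name.

```shell
# Hypothetical helper: report whether a NameNode directory has been formatted.
# Assumption: hadoop.tmp.dir is left at its default, so the metadata lives
# under /tmp/hadoop-<user>/dfs/name unless another directory is passed in.
namenode_formatted() {
  name_dir="${1:-/tmp/hadoop-$(id -un)/dfs/name}"
  [ -f "$name_dir/current/VERSION" ]
}

# Format only when no metadata exists yet (commented out; run it by hand):
#   namenode_formatted || hdfs namenode -format
```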
10. Start Daemons
Go to /home/user/hadoop-2.7.3/sbin folder
10.1 Start HDFS daemons
start-dfs.sh
--Use the jps command to list the Java daemons running on the Linux box
jps
ps -eaf | grep 'java'
10.2 Start YARN daemons
start-yarn.sh
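Once both start scripts have run, a pseudo-distributed node is expected to show five daemons in jps: NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The sketch below reads jps output and prints whichever of these is absent. The helper name missing_daemons is hypothetical.

```shell
# Hypothetical helper: read `jps` output on stdin and print the name of each
# expected Hadoop daemon that does not appear in it.
missing_daemons() {
  out=$(cat)
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <name>"; compare the name field exactly so that
    # NameNode is not satisfied by SecondaryNameNode.
    printf '%s\n' "$out" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' || echo "$d"
  done
}

# Typical use (commented out so that pasting this block only defines the helper):
#   jps | missing_daemons
```

No output means all five daemons are running. For a name that is printed, the matching log file under $HADOOP_HOME/logs is the usual place to look.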
10.3 Access Hadoop URLs
HDFS : http://localhost:50070/
RM UI: http://localhost:8088/