Installing Hadoop on Ubuntu 16
Enrique Davila Big Data Instructor enrique.davila@gmail.com
INSTALL OPEN JDK
10/24/2016
Install Java
Do I have Java? Type in the terminal:
    java -version
If the output reports that java is not installed, follow the instructions on the next slide.
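The same check can be scripted. The sketch below is an illustration and not part of the original slides: it uses the shell builtin command -v to test whether a program is on the PATH. The helper name have_cmd is made up for this example.

```shell
# have_cmd is a hypothetical helper name, not part of Java or Ubuntu.
# It succeeds (exit status 0) when the named program is found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    # Note: java prints its version information to stderr, not stdout.
    java -version
else
    echo "java not found: install it as described on the next slide"
fi
```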
Install Java
Type:
    sudo apt-get install openjdk-8-jdk
Type Y to continue the installation (it will take a while to complete).
Do I have Java?
To confirm that Java is installed on your Ubuntu system, type:
    java -version
The output shows the installed Java version.
Install OpenSSH
Installing the OpenSSH server is mandatory:
    sudo apt-get install openssh-server
Once the SSH server is installed, generate keys with the command below:
    ssh-keygen -t rsa
At the "Enter file" prompt, press Enter. At the "Enter passphrase" prompt, press Enter. At the "Enter same passphrase again" prompt, press Enter.
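For a scripted setup, the three prompts can be answered on the command line instead. This is a sketch under assumptions, not the method from the slides: ensure_ssh_key is a hypothetical helper, and the empty passphrase mirrors pressing Enter at both passphrase prompts.

```shell
# ensure_ssh_key is a hypothetical helper for this sketch.
# $1: private key path, e.g. "$HOME/.ssh/id_rsa" (ssh-keygen's default for RSA).
ensure_ssh_key() {
    if [ -f "$1" ]; then
        echo "Key already exists at $1, leaving it untouched."
        return 0
    fi
    mkdir -p "$(dirname "$1")" && chmod 700 "$(dirname "$1")" || return 1
    # -t rsa: key type used on the slide; -N "": empty passphrase;
    # -f: output file; -q: suppress the progress output.
    ssh-keygen -q -t rsa -N "" -f "$1"
}
```

Calling ensure_ssh_key "$HOME/.ssh/id_rsa" should give the same result as accepting the defaults interactively, and it leaves an existing key alone.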
SSH Keys
Now copy the key to the user and host. In my case the user is hadoop and the host is hadoopdev:
    ssh-copy-id hadoop@hadoopdev
Download and Install Hadoop
DOWNLOAD HADOOP FROM APACHE WEB PAGE
Download Apache Hadoop
Type the following command in the terminal to create a new folder within your Linux home folder, in this case /home/hadoop/:
    mkdir hadoop_install
Then go into the new folder:
    cd hadoop_install
And run the command below:
    wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
Download Apache Hadoop
You will see the progress of the download in the terminal.
Unzip Hadoop folder
Once the download is complete, type the following command:
    tar -xvf hadoop-2.7.3.tar.gz
You will now see 2 entries in hadoop_install; the new directory is called hadoop-2.7.3.
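An interrupted download leaves a truncated archive, so it can help to check that the file is readable before extracting it. A minimal sketch, assuming GNU tar as shipped with Ubuntu; extract_checked is a hypothetical helper name.

```shell
# extract_checked is a hypothetical helper for this sketch.
# $1: path to a .tar.gz archive, $2: directory to extract into.
extract_checked() {
    # tar -tzf lists the entries without extracting; it fails on a
    # truncated or corrupted gzip archive.
    if ! tar -tzf "$1" >/dev/null 2>&1; then
        echo "Archive $1 is unreadable, download it again." >&2
        return 1
    fi
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

From inside hadoop_install, extract_checked hadoop-2.7.3.tar.gz . would produce the same hadoop-2.7.3 directory as the tar command on the slide.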
Setup bashrc
The Java location (shown in the slide screenshot) is very important for the next steps.
Edit .bashrc. Type:
    gedit ~/.bashrc
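If the Java location is not known, it can usually be derived from the javac binary. The sketch below is an assumption-laden illustration, not taken from the slides: java_home_from is a hypothetical helper, and readlink -f is the GNU coreutils option available on Ubuntu. javac is used rather than java because, on OpenJDK 8 packages, java typically resolves into a jre/ subdirectory.

```shell
# java_home_from is a hypothetical helper for this sketch.
# $1: resolved path of a JDK binary, e.g. /usr/lib/jvm/<jdk>/bin/javac.
# Prints the JDK directory, i.e. the path with the trailing /bin/<name> removed.
java_home_from() {
    dirname "$(dirname "$1")"
}

# readlink -f follows the chain of symlinks behind /usr/bin/javac.
# Guarded so the snippet is harmless on a machine without a JDK.
if command -v javac >/dev/null 2>&1; then
    java_home_from "$(readlink -f "$(command -v javac)")"
fi
```

The printed directory is a candidate value for JAVA_HOME in the next step.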
Setup ~/.bashrc
Add these lines to .bashrc. Please note that the previous slide displays the Java path; JAVA_HOME needs to point to the actual Java path on your system. HADOOP_INSTALL points to the extracted hadoop-2.7.3 directory, which contains the bin and sbin folders.
    #HADOOP VARIABLES START
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
    export HADOOP_INSTALL=/home/hadoop/hadoop_install/hadoop-2.7.3
    export PATH=$PATH:$HADOOP_INSTALL/bin
    export PATH=$PATH:$HADOOP_INSTALL/sbin
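The variables can also be appended from the terminal instead of an editor. This is a sketch, not the author's method: add_hadoop_vars is a hypothetical helper, and it assumes HADOOP_INSTALL should be the extracted hadoop-2.7.3 directory, since that is where bin and sbin live. The marker check keeps repeated runs from adding the lines twice.

```shell
# add_hadoop_vars is a hypothetical helper for this sketch.
# $1: shell startup file to update, normally "$HOME/.bashrc".
add_hadoop_vars() {
    # Skip if the marker line is already present, so repeated runs
    # do not add duplicate PATH entries.
    if grep -q '^#HADOOP VARIABLES START' "$1" 2>/dev/null; then
        echo "Hadoop variables already present in $1"
        return 0
    fi
    # The quoted 'EOF' keeps $PATH and $HADOOP_INSTALL unexpanded in the file.
    cat >> "$1" <<'EOF'
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export HADOOP_INSTALL=/home/hadoop/hadoop_install/hadoop-2.7.3
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
EOF
}
```

Usage would be add_hadoop_vars "$HOME/.bashrc". Adjust the JAVA_HOME line first if Java lives elsewhere on your system.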
Testing hadoop installation
Type the following command to apply the ~/.bashrc changes (no need to restart):
    source ~/.bashrc
Then type the command below. If it prints the Hadoop version, you're doing well:
    hadoop version
Setup single node
Point your java to hadoop conf file
Go to the path:
    /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
Edit the file:
    gedit hadoop-env.sh
Modifying hadoop-env.sh
Modify the value of JAVA_HOME in the file hadoop-env.sh so that it points to the same Java path used in .bashrc.
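The same change can be made without an editor. The sketch below swaps the gedit step for sed and is only an illustration: set_java_home is a hypothetical helper, and it assumes hadoop-env.sh already contains a line beginning with "export JAVA_HOME=". If no such line exists, the file is left unchanged.

```shell
# set_java_home is a hypothetical helper for this sketch.
# $1: path to hadoop-env.sh, $2: JDK directory to use.
set_java_home() {
    # Rewrite the existing "export JAVA_HOME=..." line. Going through a
    # temporary file avoids relying on sed -i and keeps the file's permissions.
    sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$2|" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&
        rm -f "$1.tmp"
}
```

For this installation the call would be set_java_home /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop/hadoop-env.sh /usr/lib/jvm/java-1.8.0-openjdk-amd64.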
Modify core-site.xml
Create a folder called tmp in /home/hadoop/hadoop_install. Add the following text to core-site.xml; the file is on the path /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop_install/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system.</description>
  </property>
</configuration>
Modify mapred-site.xml
By default there is a file called mapred-site.xml.template. Rename it to mapred-site.xml and then add the code below. The file is on the path /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs at.</description>
  </property>
</configuration>
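The rename step can be done from the terminal. This sketch copies the template instead of renaming it, so the original stays available as a reference; that is a deliberate swap, not what the slide does. ensure_mapred_site is a hypothetical helper name.

```shell
# ensure_mapred_site is a hypothetical helper for this sketch.
# $1: Hadoop configuration directory (.../hadoop-2.7.3/etc/hadoop).
ensure_mapred_site() {
    if [ -f "$1/mapred-site.xml" ]; then
        echo "mapred-site.xml already exists, not overwriting it."
    else
        # cp instead of mv keeps the template available as a reference.
        cp "$1/mapred-site.xml.template" "$1/mapred-site.xml"
    fi
}
```

Usage: ensure_mapred_site /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop, then edit the new mapred-site.xml.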
Modify hdfs-site.xml
We need to create 2 new folders, which will hold the name node and data node data.
I placed these 2 folders in /home/hadoop/hadoop_install/
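A small sketch of the folder creation, assuming the folder names namenode and datanode that the hdfs-site.xml paths use; make_hdfs_dirs is a hypothetical helper name.

```shell
# make_hdfs_dirs is a hypothetical helper for this sketch.
# $1: base directory; the slides use /home/hadoop/hadoop_install.
make_hdfs_dirs() {
    # -p creates missing parents and does not fail if a folder already exists.
    mkdir -p "$1/namenode" "$1/datanode"
}
```

Usage: make_hdfs_dirs /home/hadoop/hadoop_install. Running it as the hadoop user keeps the folders writable by the Hadoop daemons.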
Modify hdfs-site.xml
Add the code below to the file hdfs-site.xml. The paths for namenode and datanode are the 2 new folders you created on the previous slide.
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/hadoop_install/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/hadoop_install/datanode</value>
  </property>
</configuration>
hdfs-site.xml is located on the path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
Format the namenode
Run the following command:
    hadoop namenode -format
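Formatting an existing name directory discards the HDFS metadata stored there, so a guard before the format command can be useful. A minimal sketch: namenode_is_formatted is a hypothetical helper, and it relies on the current/VERSION file that formatting writes into the name directory.

```shell
# namenode_is_formatted is a hypothetical helper for this sketch.
# $1: the directory configured as dfs.namenode.name.dir.
# Formatting writes a current/VERSION file there, so its presence
# indicates the directory has been formatted before.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}
```

One way to use it: namenode_is_formatted /home/hadoop/hadoop_install/namenode || hadoop namenode -format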
Format the namenode part 2
If everything is OK, the output includes a message saying that the storage directory has been successfully formatted.
Running Hadoop Single node
Run the command:
    start-all.sh
Then execute the command:
    jps
The output lists the running Java processes, including the Hadoop daemons.
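The jps check can be automated. The sketch below is an illustration under assumptions: missing_daemons is a hypothetical helper, and the expected daemon list (the HDFS and YARN processes that start-all.sh normally launches on Hadoop 2.x) is an assumption, since the slide's screenshot is not part of this transcript.

```shell
# missing_daemons is a hypothetical helper for this sketch.
# Reads jps output ("<pid> <class name>" per line) on stdin and prints
# each expected daemon that is absent. Returns non-zero if any is missing.
missing_daemons() {
    md_names=$(awk '{print $2}')
    md_rc=0
    for md_daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$md_names" | grep -qx "$md_daemon"; then
            echo "not running: $md_daemon"
            md_rc=1
        fi
    done
    return "$md_rc"
}
```

Usage: jps | missing_daemons. No output means every expected daemon was found.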
Web Interface: localhost:50070
In the browser go to: localhost:50070
Applies for:
This installation runs under:
Ubuntu 16
Hadoop 2.7.3
Virtual Machine: 2 processors, 2 GB RAM, 2 network interfaces (the 1st as Bridge, the 2nd as NAT)
You need help?
Contact name: Enrique Davila Gutierrez, enrique.davila@gmail.com