Page 1: CDH3 Single Node Installation Guide Dell Server ...loganbright.com/wp-content/uploads/2015/01/CDH3.pdf · Downloading Ubuntu 10.04.4 LTS Desktop 1. In order to run CDH3, we need an

CDH3 Single Node Installation Guide
Dell Server Configuration Guide
Ubuntu 10.04.4 LTS Desktop Installation Guide

Created: 01-12-2015
Author: Hyun Kim
Last Updated: 01-12-2015
Version Number: 0.1
Contact info: [email protected]
[email protected]


Downloading Ubuntu 10.04.4 LTS Desktop

1. In order to run CDH3, we need an operating system. For this
particular demonstration, we are going to use Ubuntu Desktop. No,
you don’t need to remove your current operating system. Ubuntu is
quite light and you can install it ON your current operating
system. The best part is, if you don’t like it, you can easily
remove it. No hard feelings. Sounds good? Let’s get started.

2. Before you do ANYTHING, and I mean ANYTHING, you need to check which
operating systems CDH3 supports. Our ultimate goal is to install
CDH3 on Ubuntu. You can check the requirements for CDH3 at
the link below:

http://www.cloudera.com/content/cloudera/en/documentation/archives/cdh3/v3u6/CDH3-Quick-Start/cdh3qs_topic_2.html

3. For this demonstration, we are going to install Ubuntu 10.04 LTS
(Lucid Lynx) Desktop 64-bit. CDH3 also supports 32-bit operating
systems, but according to Cloudera, “for production environments,
64-bit packages are recommended”, so be aware of that.

4. Ubuntu is a free operating system and you can download it from the
link below. Click the “64-bit PC (AMD64) desktop CD” link and the
download will start automatically.

http://old-releases.ubuntu.com/releases/lucid/

5. If you downloaded the Ubuntu disk image file with Google
Chrome like I did, it will be saved in your Downloads folder.
If you are unsure where the file was saved, click the down
arrow button next to the download icon and it will give you a list
of options. Select ‘Show in folder’, which opens the folder
where the Ubuntu disk image was downloaded.

6. Done!
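
Before writing the image to a USB stick, it may be worth checking that the download is intact. The sketch below is not part of the original steps: it assumes a machine with the md5sum tool available, and it assumes you copy the expected digest by hand from the MD5SUMS file published next to the image. The function name verify_md5 and the file name in the usage comment are my own illustrative choices.

```shell
# verify_md5 FILE EXPECTED_MD5
# Returns 0 when the MD5 digest of FILE equals EXPECTED_MD5, non-zero otherwise.
verify_md5() {
    # md5sum prints "<digest>  <file>"; keep only the digest column.
    actual=$(md5sum "$1" 2>/dev/null | cut -d ' ' -f 1)
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}

# Hypothetical usage (replace <digest> with the value from MD5SUMS):
#   verify_md5 ~/Downloads/ubuntu-10.04.4-desktop-amd64.iso "<digest>" && echo "image OK"
```

If the function reports a mismatch, downloading the image again is usually the simplest fix.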

Creating a bootable Ubuntu USB Flash Drive

1. We downloaded Ubuntu and now we need to install it. In this
tutorial, I’m installing Ubuntu on a server. There are
a couple of ways to do this. I have a laptop with
Windows 7 installed on it and I happen to have a 7 GB USB flash
drive. If you looked at the picture above, you know what I’m going
to do. We are going to create a bootable Ubuntu USB flash drive!
This is already well explained on the official Ubuntu website. I
will leave some links below.

Creating a bootable USB stick on Windows
http://www.ubuntu.com/download/desktop/create-a-usb-stick-on-windows

Creating a bootable USB stick on Ubuntu
http://www.ubuntu.com/download/desktop/create-a-usb-stick-on-ubuntu

Creating a bootable USB stick on OS X
http://www.ubuntu.com/download/desktop/create-a-usb-stick-on-mac-osx

Download Universal USB Installer
http://www.pendrivelinux.com/universal-usb-installer-easy-as-1-2-3/#button

2. Since I’m currently using Windows 7, I will click on the first link
and follow the instructions. However, Universal USB Installer failed
to recognize my USB flash drive, so I had to tick the
“Now Showing All Drives” checkbox in order to select it.
On my computer, F: is the USB flash drive.

3. If you have anything on your USB flash drive, enable the format
option (this erases everything on the drive). As a matter of fact,
my USB drive was already formatted, but just to be safe, I formatted
it again in the installer. Now you may click the “Create” button to
create a bootable Ubuntu USB flash drive.

4. Once the process finishes successfully, you will see what’s
shown in the picture below. You’ve created a bootable Ubuntu USB
flash drive.

Creating New Virtual Disk

1. In this tutorial, I’m using a Dell PowerEdge server. Turn on the
server and press Ctrl+R to run the configuration utility. You will see
the screen below.

2. Press ‘F2’ while ‘Controller 0’ is selected. Select “Create New VD”
and press Enter.

3. I’m going to set RAID Level: RAID-1. To select drives, use the
spacebar. Use the Tab key to go to “Basic Settings” and name the VD.
I left the “Advanced Settings” unchanged. You may configure the
“Advanced Settings” as you wish if your server allows it. Select OK
and press Enter to create the new virtual disk.

4. Let the virtual disk initialize. This may take a couple of hours,
but it’s better to get it done now than later. Once it’s done,
restart the server by using the “Ctrl+Alt+Delete” key command.

Installing Ubuntu

1. Press the F11 key to run the BIOS boot manager after you restart the
server. Insert the bootable Ubuntu USB flash drive into the server.

2. You will see the options shown in the picture below. Select
“Hard Drive C:” by using the down arrow key, then select the “From USB:”
option on the list. This will boot from your USB flash drive.

3. Select Install Ubuntu and press Enter.


4. Change the settings appropriately and press the “Continue” button until
the installation starts.

5. Wait until the installation is completed. Once the installation is

done, we are almost ready to install CDH3.

Download JDK and Install it

1. Download the JDK from the link below.

http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase6-419409.html#jdk-6u26-oth-JPR

2. Since “Cloudera recommends version 1.6.0_26”, we will be installing
that version of the JDK. To extract and install jdk-6u26-linux-x64.bin, open
Terminal and do the following.

3. Copy the file to /usr/local and run it by using the commands below.

$ cd Downloads
(Assuming that the JDK file is saved in the Downloads folder)

$ sudo cp jdk-6u26-linux-x64.bin /usr/local
(this copies the file to /usr/local)

$ cd /usr/local

$ sudo sh jdk-6u26-linux-x64.bin
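
The installer unpacks into a directory under the current working directory, which is why the later steps point JAVA_HOME at /usr/local/jdk1.6.0_26. As a small sketch of that naming pattern, the helper below (jdk_dir_from_installer is a name I made up for illustration) turns an installer file name into the directory name it is expected to create. It only covers names of the form jdk-<major>u<update>-...; anything else prints nothing.

```shell
# jdk_dir_from_installer FILE
# Maps an installer name such as jdk-6u26-linux-x64.bin to the directory
# name the self-extracting archive is expected to create (jdk1.6.0_26).
jdk_dir_from_installer() {
    # Strip any leading directory, then capture the major version and the
    # update number from "jdk-<major>u<update>-...".
    basename "$1" |
        sed -n 's/^jdk-\([0-9][0-9]*\)u\([0-9][0-9]*\)-.*$/jdk1.\1.0_\2/p'
}

# Hypothetical usage:
#   ls -d "/usr/local/$(jdk_dir_from_installer jdk-6u26-linux-x64.bin)"
```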

Download and installing CDH3 package

1. Now we are finally ready to install CDH3. Click the link below to
download the CDH3 package. We installed Ubuntu 10.04 (Lucid Lynx), so
pick the package for that release.

http://www.cloudera.com/content/cloudera/en/documentation/archives/cdh3/v3u6/CDH3-Installation-Guide/cdh3ig_topic_4_4.html

$ cd

$ sudo dpkg -i Downloads/cdh3-repository_1.0_all.deb

Install CDH3 on all hosts

$ sudo apt-get update

$ apt-cache search hadoop

$ sudo apt-get install hadoop-0.20 hadoop-0.20-native

(Press y and then enter to continue installation)

Install the daemon types you need. Since this is a tutorial for a
single-node cluster, I will be installing all of them.

$ sudo apt-get install hadoop-0.20-namenode

$ sudo apt-get install hadoop-0.20-datanode

$ sudo apt-get install hadoop-0.20-secondarynamenode

$ sudo apt-get install hadoop-0.20-tasktracker

$ sudo apt-get install hadoop-0.20-jobtracker
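
The five daemon installs above follow one pattern, so they can be sketched as a loop. This is just a convenience sketch, not what I ran: the function name install_cdh3_daemons and the dry-run default are my own additions. By default it only prints the commands; pass sudo as the first argument to actually run them.

```shell
# Daemon packages installed in this section, in the order shown above.
CDH3_DAEMONS="namenode datanode secondarynamenode tasktracker jobtracker"

# install_cdh3_daemons [RUNNER]
# RUNNER defaults to "echo", which only prints each command (dry run).
# Pass "sudo" as RUNNER to really install the packages.
install_cdh3_daemons() {
    runner=${1:-echo}
    for daemon in $CDH3_DAEMONS; do
        # -y answers the "Do you want to continue?" prompt automatically.
        "$runner" apt-get install -y "hadoop-0.20-$daemon" || return 1
    done
}

# Hypothetical usage:
#   install_cdh3_daemons        # print the commands only
#   install_cdh3_daemons sudo   # run them
```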

Add CDH3 Repository

Create a file by entering the command below:

$ sudo nano /etc/apt/sources.list.d/cloudera.list

Edit the file by adding the two lines below:

deb http://archive.cloudera.com/debian lucid-cdh3 contrib

deb-src http://archive.cloudera.com/debian lucid-cdh3 contrib

Press “Ctrl+X”, then “Y” and Enter, to save and exit the file.
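
If you would rather not type the two lines by hand in an editor, the same file can be produced from the shell. The sketch below is an alternative I am suggesting, not the method used above; write_cloudera_list and the scratch path in the usage comment are made-up names.

```shell
# write_cloudera_list TARGET
# Writes the two CDH3 repository lines for Ubuntu 10.04 (lucid) to TARGET.
write_cloudera_list() {
    cat > "$1" <<'EOF'
deb http://archive.cloudera.com/debian lucid-cdh3 contrib
deb-src http://archive.cloudera.com/debian lucid-cdh3 contrib
EOF
}

# Hypothetical usage: write to a scratch file, then copy it into place.
#   write_cloudera_list /tmp/cloudera.list
#   sudo cp /tmp/cloudera.list /etc/apt/sources.list.d/cloudera.list
```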

Add Repository Key

$ curl -s http://archive.cloudera.com/debian/archive.key | sudo apt-key add -

(This gave me an error saying that ‘curl’ needs to be installed. If you
see the same error, install curl with the command below and then run the
command above again.)

$ sudo apt-get install curl

Set JAVA_HOME and HADOOP_HOME

1. Now we’ve installed all the Hadoop packages. Are we done? No, not
quite yet. We need to set JAVA_HOME and HADOOP_HOME so that the
system can find what’s installed.

$ cd /usr/lib/hadoop-0.20/bin

$ nano ~/.bashrc

Once the file is open, copy and paste these lines at the bottom of
the file:

export HADOOP_HOME=/usr/lib/hadoop

export PATH=$PATH:/usr/lib/hadoop/bin

export JAVA_HOME=/usr/local/jdk1.6.0_26

export PATH=$PATH:/usr/local/jdk1.6.0_26/bin

Leave everything else unchanged.

Your Java path might be different if you are using a different Ubuntu
version or a different version of the JDK.

2. Run the following commands to see if JAVA_HOME and HADOOP_HOME are
set correctly.

3. If the commands output nothing, close the terminal, open a new one,
and try the commands again.

4. If you still don’t get any output from the commands, go back to the
previous step and check whether you misspelled anything or left an
extra comma when you edited the .bashrc file.

5. If you see what’s in the picture below, you’ve set JAVA_HOME and
HADOOP_HOME properly.
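
The commands for step 2 appear only in the original screenshots, so I can’t reproduce them exactly here. One plausible way to do the same check is sketched below. Both helper names (append_once and check_var) are my own: the first adds a line to a file only if it is not already there, so editing .bashrc twice does not duplicate the exports, and the second reports whether a variable is set in the current shell.

```shell
# append_once LINE FILE
# Appends LINE to FILE unless an identical line is already present.
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# check_var NAME
# Prints "NAME=value" and returns 0 when the variable is set and non-empty;
# prints a warning on stderr and returns 1 otherwise.
check_var() {
    eval "value=\${$1:-}"
    if [ -n "$value" ]; then
        printf '%s=%s\n' "$1" "$value"
    else
        printf '%s is not set\n' "$1" >&2
        return 1
    fi
}

# Hypothetical usage:
#   append_once 'export JAVA_HOME=/usr/local/jdk1.6.0_26' ~/.bashrc
#   check_var JAVA_HOME
#   check_var HADOOP_HOME
```

Remember that changes to .bashrc only take effect in a new terminal, or after running `. ~/.bashrc` in the current one.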

Hadoop and Java Version

1. $ hadoop version

$ java -version

If it prints out something similar to what’s shown in the picture

above, you’ve done everything correctly so far.

Edit hadoop-env.sh

$ sudo gedit /usr/lib/hadoop/conf/hadoop-env.sh

I didn’t delete anything. I just added these two lines and that’s

good enough.

export JAVA_HOME=/usr/local/jdk1.6.0_26

export HADOOP_HOME=/usr/lib/hadoop


Adding Dedicated users to Hadoop Group

$ sudo gpasswd -a hdfs hadoop

$ sudo gpasswd -a mapred hadoop

Edit core-site.xml

$ sudo gedit /usr/lib/hadoop/conf/core-site.xml

Add these properties between <configuration> and </configuration>:

<property>
<name>hadoop.tmp.dir</name>
<value>/usr/lib/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
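
All of the *-site.xml edits in this guide use the same name/value layout. As a small sketch, the helper below (hadoop_property is a made-up name) prints one property element so you can paste it between the configuration tags. It does no XML escaping, so it is only suitable for plain values like the ones used in this guide.

```shell
# hadoop_property NAME VALUE
# Prints one <property> element in the layout used by the *-site.xml files.
hadoop_property() {
    printf '<property>\n<name>%s</name>\n<value>%s</value>\n</property>\n' "$1" "$2"
}

# Hypothetical usage:
#   hadoop_property hadoop.tmp.dir /usr/lib/hadoop/tmp
#   hadoop_property fs.default.name hdfs://localhost:8020
```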

$ sudo mkdir /usr/lib/hadoop/tmp


$ cd /usr/lib/hadoop/

$ sudo chmod 750 /usr/lib/hadoop/tmp/

$ sudo chown hdfs:hadoop /usr/lib/hadoop/tmp/

See how the tmp folder is no longer owned by “root root”? Instead, it is
owned by “hdfs hadoop”, which is what we just set.

To see this on your machine, use the command below:

$ ls -la /usr/lib/hadoop/
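
If you only care about one directory, a shorter check than reading the whole listing is to ask for its mode and owner directly. This is a sketch assuming GNU stat (the one shipped with Ubuntu); describe_dir is a name I picked for illustration.

```shell
# describe_dir PATH
# Prints "<octal mode> <owner>:<group>" for PATH using GNU stat.
describe_dir() {
    stat -c '%a %U:%G' "$1"
}

# Hypothetical usage:
#   describe_dir /usr/lib/hadoop/tmp
# If the chmod and chown steps above succeeded, this should report
# mode 750 and owner hdfs:hadoop.
```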

hdfs-site.xml

$ sudo gedit /usr/lib/hadoop/conf/hdfs-site.xml

Add these properties between <configuration> and </configuration>:

<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/storage/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/storage/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>

Create the /storage directory that dfs.name.dir and dfs.data.dir point
to. Use the absolute path, so that the directory matches the values in
hdfs-site.xml and the chmod and chown commands below:

$ sudo mkdir /storage

$ sudo chmod 775 /storage/

$ sudo chown hdfs:hadoop /storage/

$ ls -ld /storage/

mapred-site.xml

$ sudo gedit /usr/lib/hadoop/conf/mapred-site.xml

Add these properties between <configuration> and </configuration>:

<property>

<name>mapred.job.tracker</name>

<value>hdfs://localhost:8021</value>

</property>

<property>

<name>mapred.system.dir</name>

<value>/home/cdh3/mapred/system</value>

</property>

<property>

<name>mapred.local.dir</name>

<value>/home/cdh3/mapred/local</value>

</property>

<property>

<name>mapred.temp.dir</name>

<value>/home/cdh3/mapred/temp</value>

</property>

$ cd

$ sudo mkdir -p /home/cdh3/mapred

$ sudo chmod 775 /home/cdh3/mapred

$ sudo chown mapred:hadoop /home/cdh3/mapred
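
The create, chmod, chown sequence shows up three times in this guide (hadoop.tmp.dir, the storage directory, and the mapred directory). It can be sketched as one helper, shown below. The name prepare_dir is my own, and because sudo cannot call a shell function directly, you would have to run it from a root shell for system paths.

```shell
# prepare_dir DIR MODE [OWNER:GROUP]
# Creates DIR (and any missing parents), sets its mode, and optionally
# changes its ownership. Needs root privileges for system paths.
prepare_dir() {
    mkdir -p "$1" || return 1
    chmod "$2" "$1" || return 1
    if [ -n "${3:-}" ]; then
        chown "$3" "$1" || return 1
    fi
}

# Hypothetical usage from a root shell:
#   prepare_dir /home/cdh3/mapred 775 mapred:hadoop
```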

User Assignment

Format namenode

Type and enter the commands below.

$ cd /usr/lib/hadoop/bin/

$ sudo -u hdfs hadoop namenode -format

When I tried to format the namenode, the error shown in the picture
occurred. In this case, we just need to edit a few things so that
hadoop-config.sh can find jdk1.6.0_26. No big deal.

First, open hadoop-config.sh:

$ cd /usr/lib/hadoop/bin/

$ sudo gedit hadoop-config.sh

This should fix the problem. Save and try again.

$ sudo -u hdfs hadoop namenode -format

This will give you the screen below.
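
Formatting wipes the namenode metadata, so before running the format command a second time it may be worth checking whether the name directory has already been formatted. The sketch below relies on the current/VERSION file that a format leaves in the name directory; the function name namenode_formatted is my own, and /storage/name is the dfs.name.dir value from hdfs-site.xml.

```shell
# namenode_formatted NAME_DIR
# Returns 0 when NAME_DIR already holds a formatted namenode image,
# detected by the presence of current/VERSION.
namenode_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Hypothetical usage (may need sudo to read the directory):
#   namenode_formatted /storage/name && echo "already formatted"
```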


Start Daemons

$ sudo /etc/init.d/hadoop-0.20-namenode start

$ sudo /etc/init.d/hadoop-0.20-secondarynamenode start

$ sudo /etc/init.d/hadoop-0.20-jobtracker start

$ sudo /etc/init.d/hadoop-0.20-datanode start

$ sudo /etc/init.d/hadoop-0.20-tasktracker start

$ netstat -ptlen
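
The netstat listing can be long, so here is a sketch of a filter that answers one question: is anything listening on a given port? It reads netstat output on standard input and looks at the local-address column. The function name port_listening is made up; the ports in the usage comment are the ones that appear in this guide (8020 and 8021 from the config files, 50030 and 50070 from the web UI section).

```shell
# port_listening PORT
# Reads "netstat -tln" style output on stdin and returns 0 when a socket
# in LISTEN state has PORT as its local port (4th column), 1 otherwise.
port_listening() {
    awk -v port="$1" '
        /LISTEN/ && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Hypothetical usage:
#   for p in 8020 8021 50030 50070; do
#       netstat -tln | port_listening "$p" && echo "port $p is listening"
#   done
```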

Checking UI

In your web browser, enter

“localhost:50070”

to open the “NameNode” page.

In your web browser, enter

“localhost:50030”

to open the “Map/Reduce administration” page.

If you see the pages above, you have