Setting Up Hadoop 2.X for MultiNode


Setting up hadoop 2.4.1 multi-node cluster in Ubuntu 14.04 64-bit

1. Follow my post to set up a single-node hadoop cluster (http://kishorer747.blogspot.in/2014/09/setting-up-single-node-hadoop-241_20.html) and set it up on all your slave computers.

2. One PC will be the master, from where everything is controlled. All other PCs are slaves. NOTE: We will assume mypc1 is the master and the other PCs are slaves.

3. Edit the hosts file to say at which IP address your computers (Master and all Slave PCs) are, and modify the following lines accordingly. NOTE: The hostname here can be different from the hostname on that PC:

sudo gedit /etc/hosts

For example, if PC1 is the name of a computer and its IP is 10.200.1.7, the hostname in the hosts file entry can be mypc2 or anything.

NOTE: Remove the line starting with 127.0.1.1 in the hosts file. And you should have the same PC name (as specified by the 127.0.0.1 line, in this case PC1) in the /etc/hosts file and the /etc/hostname file, or else you will get the error "host not found". Restart the system for the changes to take effect.

    127.0.0.1 localhost PC1

    10.200.1.7 mypc1

    10.200.1.8 mypc2

    10.200.1.9 mypc3

    10.200.1.10 mypc4
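Once the hosts file is saved on every machine, it is worth confirming that each alias actually resolves before moving on. A small sketch (hostnames taken from the listing above; not part of the original post):

for host in mypc1 mypc2 mypc3 mypc4; do
    ping -c 1 "$host" > /dev/null && echo "$host resolves OK" || echo "$host FAILED"
done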

4. Configurations to be done in both Master and Slave Computers

Replace the code in the core-site.xml file with the following code. Change mypc1 to the name of your Master PC. (First change directory using cd /usr/local/hadoop/etc/hadoop.)

sudo gedit core-site.xml

<property>
  <name>fs.default.name</name>
  <value>hdfs://mypc1:54310</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation. The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class. The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
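Whichever way you edit it, the property block has to sit inside the file's single <configuration> element, or Hadoop will fail to parse the file. A minimal sketch of writing the complete core-site.xml non-interactively (stock 2.4.1 header assumed; an alternative to gedit, not from the original post):

sudo tee /usr/local/hadoop/etc/hadoop/core-site.xml > /dev/null <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://mypc1:54310</value>
  </property>
</configuration>
EOF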


Replace the code in the hdfs-site.xml file with the following code. (The value of Replication should be equal to the no. of Slave computers; in this case, 4.)

sudo gedit hdfs-site.xml

<property>
  <name>dfs.replication</name>
  <value>4</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is
  created. The default is used if replication is not specified in create time.</description>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/usr/local/hadoop/hdfs</value>
  <description>Directory to store files in HDFS.
  This directory is not formatted when namenode is formatted.</description>
</property>
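The dfs.data.dir directory has to exist and be writable by the Hadoop user before the daemons start; a hedged sketch, assuming the hduser user and hadoop group from the single-node post:

sudo mkdir -p /usr/local/hadoop/hdfs
sudo chown -R hduser:hadoop /usr/local/hadoop/hdfs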

Replace the code in the mapred-site.xml file with the following code. (Modify mypc1 to the name of your Master PC.)

sudo gedit mapred-site.xml

<property>
  <name>mapred.job.tracker</name>
  <value>mypc1:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at. If "local", then jobs are run in-process as a single map
  and reduce task.</description>
</property>
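Note that mapred.job.tracker is the Hadoop 1 style setting; on Hadoop 2.x, MapReduce jobs are normally also pointed at YARN explicitly, or they may run with the local runner. The standard property for that (my addition, not in the original post) goes in the same mapred-site.xml:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>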

Replace the code in the yarn-site.xml file with the following code. (Replace mypc1 with your Master Node's name or IP address.)

sudo gedit yarn-site.xml

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>mypc1:8025</value>
</property>

<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>mypc1:8030</value>
</property>

<property>
  <name>yarn.resourcemanager.address</name>
  <value>mypc1:8050</value>
</property>
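Since step 4 applies to every node, the four edited files also have to end up on each slave. Once the passwordless SSH from steps 5 and 6 is in place, a hedged sketch for pushing them from the master (assumes identical install paths on all PCs):

for host in mypc2 mypc3 mypc4; do
    scp /usr/local/hadoop/etc/hadoop/{core,hdfs,mapred,yarn}-site.xml \
        hduser@"$host":/usr/local/hadoop/etc/hadoop/
done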


5. Configurations to be done only in Master Computer (NOTE: user should be hduser in the terminal). Redo this step if you change IPs later. First delete the .ssh folder in hduser and generate a new key:

sudo rm -r /home/hduser/.ssh && ssh-keygen -t rsa -P "" && cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

6. Enabling SSH access, so that the master can access all computers, including the master PC mypc1. (Use id_dsa.pub in case id_rsa.pub doesn't work.)

ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@mypc1
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@mypc2
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@mypc3
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@mypc4

Now test the SSH connection to all PCs using these commands. (ssh from your PC's hduser, i.e. type exit after you ssh into a PC, and then ssh again into a new one.)

ssh mypc1
ssh mypc2
ssh mypc3
ssh mypc4
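To check all four logins in one shot, without the manual exit-and-ssh-again dance, a quick sketch (a convenience, not from the original post):

for host in mypc1 mypc2 mypc3 mypc4; do
    # BatchMode makes ssh fail instead of prompting, so a missing key shows up immediately
    ssh -o BatchMode=yes "$host" hostname
done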

Create a masters file and a slaves file to specify which PCs are masters and slaves. First change directory, then paste the following lines respectively:

cd /usr/local/hadoop/etc/hadoop/

sudo gedit masters

mypc1

sudo gedit slaves

mypc1
mypc2
mypc3
mypc4


Finally, the output of the masters file should be (see using cat /usr/local/hadoop/etc/hadoop/masters):

mypc1

Output of the slaves file (also includes the master PC mypc1, as we want to run programs on the master PC too; see using cat /usr/local/hadoop/etc/hadoop/slaves):

mypc1
mypc2
mypc3
mypc4

7. Testing Time!! In Master PC (NOTE: user should be hduser in the terminal)

Change dir: cd /usr/local/hadoop/bin

Format Namenode: hadoop namenode -format

By starting the HDFS daemons (start-dfs.sh), the NameNode (and a DataNode) daemon is started on the Master PC and DataNode daemons are started on all nodes (Slave PCs). And by starting the YARN daemons (start-yarn.sh), a NodeManager daemon is started on all PCs. NOTE: Check whether the respective daemons are running by using jps.

cd /usr/local/hadoop/sbin
start-dfs.sh && start-yarn.sh
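As a rough guide to what jps should report (daemon names from stock Hadoop 2.4.1; the exact list depends on your configuration):

# on the master (mypc1): expect NameNode, SecondaryNameNode, ResourceManager,
# plus DataNode and NodeManager, because mypc1 is also listed in the slaves file
jps

# on each slave (mypc2..mypc4): expect DataNode and NodeManager
jps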

Warning of


1. Remove temp folders: sudo rm -r /tmp/*

2. Only in master: Check if ssh to all the PCs is working, without entering a password every time.

3. Kill all processes if they are already running (Don't do it if you have any jobs running):

sudo kill -9 $(lsof -ti:8088) && sudo kill -9 $(lsof -ti:8042) && sudo kill -9 $(lsof -ti:50070) && sudo kill -9 $(lsof -ti:50075) && sudo kill -9 $(lsof -ti:50090)

4. Format the namenode (it should not ask you to re-format the file-system, and the exit status should be 0):

hadoop namenode -format

5. From master, start all the daemons again.
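Once all the daemons are up again, a quick smoke test (hypothetical file and path, not from the original post) confirms the cluster actually accepts data:

hadoop fs -mkdir /smoketest
hadoop fs -put /etc/hostname /smoketest/
hadoop fs -ls /smoketest      # the file should appear, stored across the DataNodes
hadoop fs -rm -r /smoketest   # clean up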

    Comments are highly appreciated.