A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

download A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

of 12

Transcript of A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    1/12

    International Journal of Advance Foundation and Research in Computer (IJAFRC)

    Volume 2, Issue 12, Decemer ! 2"1#$I%%& 2' * #', Impact Factor * 1$'1+

    " - . 2"1#, IJAFRC All Ri/hts Reserved 000$ifarc$or/

    A Disparateness!A0are %chedulin/ usin/ !Centroids

    Clusterin/ and 3%4 5echni6ues in 7adoop ClusterDr. E. Laxmi Lydia1,Dr. M.Ben Swarup2 ,Dr. Challa Narsimham3

    1Assoia!e "ro#essor, Depar!men! o# Compu!er Siene and En$ineerin$, %i$nan&s 'ns!i!u!e (#'n#orma!ion )ehnolo$y, %isa*hapa!nam, Andhra "radesh, 'ndia.

    2"ro#essor, Compu!er Siene and En$ineerin$, %i$nan&s 'ns!i!u!e (# 'n#orma!ion )ehnolo$y,

    %isa*hapa!nam, Andhra "radesh, 'ndia.3"ro#essor, Compu!er Siene and En$ineerin$, %i$nan&s 'ns!i!u!e (# 'n#orma!ion )ehnolo$y,

    %isa*hapa!nam, Andhra "radesh, 'ndia.

    elaxmi2++2yahoo.om,-#or-en$mail.om

    A 8 % 5 R A C 5

    8i/ data stora/e mana/ement is one of the most challen/in/ issues for 7adoop clusterenvironments, since lar/e amount of data intensive applications fre6uentl9 involve a hi/h de/ree

    of data access localit9$ In traditional approaches hi/h!performance computin/ consists dedicated

    servers that are used to data stora/e and data replication$ 5herefore to solve the prolems of

    Disparateness amon/ the os and resources a :Disparateness!A0are %chedulin/ al/orithm; is

    proposed in the cluster environment$ In this research 0or< 0e represent !centroids clusterin/

    in i/ data mechanism for 7adoop cluster$ 5his approach is mainl9 focused on the ener/9

    consumption in 5he 7adoop cluster, 0hich helps to increase the s9stem reliailit9$ 5he 7adoop

    cluster consists of resources 0hich are cate/ori=ed for minimi=in/ the schedulin/ dela9 in the

    7adoop cluster usin/ the !Centroids clusterin/ al/orithm$ A novel provisionin/ mechanism is

    introduced alon/ 0ith the consideration of load, ener/9, and net0or< time$ 89 inte/ratin/ thesethree parameters, the optimi=ed fitness function is emplo9ed for 3article %0arm 4ptimi=ation

    (3%4) to select the computin/ node$ Failure ma9 occur after completion of the successful

    e>ecution in the net0orperimental results e>hiit etter schedulin/

    len/th, schedulin/ dela9, speed up, failure ratio, ener/9 consumption than the e>istin/ s9stems$

    e90ords? !Centroids Clusterin/, 8i/ data, 7adoop Cluster, data access localit9, data replication,

    s9stem reliailit9, particle s0arm optimi=ation

    I$ I&5R4D@C5I4&

    'n reen! years, -i$ da!a has rapidly de/eloped in!o a ho!spo! !ha! a!!ra!s $rea! a!!en!ion #rom aademia,

    indus!ry, and e/en $o/ernmen!s around !he world 012. Na!ure and Siene ha/e pu-lished speial

    issues dedia!ed !o disuss !he oppor!uni!ies and hallen$es -rou$h! -y -i$ da!a 03, . M4insey, !he

    well5*nown mana$emen! and onsul!in$ #irm, alle$ed !ha! -i$ da!a has pene!ra!ed in!o e/ery area o#

    !oday6s indus!ry and -usiness #un!ions and has -eome an impor!an! #a!or in produ!ion 07. 8sin$ and

    minin$ -i$ da!a heralds a new wa/e o# produ!i/i!y $row!h and onsumer impe!us. (69eilly Media e/en

    asser!ed !ha! :!he #u!ure -elon$s !o !he ompanies and people !ha! !urn da!a in!o produ!s; 0

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    2/12

    International Journal of Advance Foundation and Research in Computer (IJAFRC)

    Volume 2, Issue 12, Decemer ! 2"1#$I%%& 2' * #', Impact Factor * 1$'1+

    1 - . 2"1#, IJAFRC All Ri/hts Reserved 000$ifarc$or/

    =ha! is -i$ da!a> So #ar, !here is no uni/ersally aep!ed de#ini!ion. 'n =i*ipedia, -i$ da!a is de#ined as

    :an all5enompassin$ !erm #or any olle!ion o# da!a se!s so lar$e and omplex !ha! i! -eomes di##iul! !o

    proess usin$ !radi!ional da!a proessin$ applia!ions;0?. @rom a maro perspe!i/e, -i$ da!a an -e

    re$arded as a -ond !ha! su-!ly onne!s and in!e$ra!es !he physial world, !he human soie!y, and

    y-erspae. ere !he physial world has a re#le!ion in y-erspae, em-odied as -i$ da!a, !hrou$h'n!erne!, !he 'n!erne! o# )hin$s, and o!her in#orma!ion !ehnolo$ies, while human soie!y $enera!es i!s

    -i$ da!a5-ased mappin$ in y-erspae -y means o# mehanisms li*e humanompu!er in!er#aes, -rain

    mahine in!er#aes, and mo-ile 'n!erne! 0. 'n !his sense, -i$ da!a an -asially -e lassi#ied in!o !wo

    a!e$ories, namely, da!a #rom !he physial world, whih is usually o-!ained !hrou$h sensors, sien!i#i

    experimen!s and o-ser/a!ions suh as -iolo$ial da!a, neural da!a, as!ronomial da!a, and remo!e

    sensin$ da!a, and da!a #rom !he human soie!y, whih is o#!en auired #rom suh soures or domains as

    soial ne!wor*s, 'n!erne!, heal!h, #inane, eonomis, and !ranspor!a!ion.

    Apahe adoop is a so#!ware #ramewor* !ha! suppor!s da!a5in!ensi/e dis!ri-u!ed applia!ions under a

    #ree liense. '! has -een used -y many -i$ !ehnolo$y ompanies, suh as AmaFon, @ae-oo*, Gahoo and

    'BM. adoop 01 is -es! *nown #or Map9edue and i!s dis!ri-u!ed #ile sys!em D@S. Map9edue idea is

    men!ioned in a Hoo$le paper 011, !o -e simply !he !as* o# Map9edue is ano!her proessin$ o# di/ide and

    onur. adoop 0 is aimed a! pro-lems !ha! reuire examina!ion o# all !he a/aila-le da!a. @or example,

    !ex! analysis and ima$e proessin$ $enerally reuire !ha! e/ery sin$le reord -e read, and o#!en

    in!erpre!ed in !he on!ex! o# similar reords. adoop uses a !ehniue alled Map9edue !o arry ou! !his

    exhaus!i/e analysis ui*ly. D@S $i/es !he dis!ri-u!ed ompu!in$ s!ora$e pro/ides and suppor!. )hey

    are !he !wo main su-proIe!s #or adoop pla!#orm. .adoop se! @'@( al$ori!hm as i!s de#aul! al$ori!hm.

    Aordin$ !o our researh o# al$ori!hm #or adoop, we #ound !ha! i! una-le !o sa!is#y !he demand o#

    users. =e anno! only *eep !he idea o# #irs! ome #irs! ser/ed. =e need !o !hin* a-ou! !he reuiremen!

    #orm !ha! has !he hi$her priori!y, -u! a! !he same !ime we also an *eep !he #airness !o o!her users. )hen

    we announed 45en!roids Clus!erin$ al$ori!hm in Bi$ Da!a5adoop Clus!er.

    1$1 7AD443 A&D 7DF%4VRVIB

    adoop Dis!ri-u!ed @ile Sys!em D@S0 is !he primary s!ora$e sys!em used -y adoop applia!ions.

    )he adoop dis!ri-u!ed #ile sys!em is desi$ned !o handle lar$e #iles mul!i5HB wi!h seuen!ial

    readJwri!e opera!ion. Eah #ile is -ro*en in!o hun*s, and s!ored aross mul!iple da!a nodes as loal (S

    !ra* o# o/erall #ile dire!ory s!ru!ure and !he plaemen! o# hun*s. Da!aNode repor!s all i!s hun*s !o

    !he NameNode a! -oo!up. Eah hun* has a /ersion num-er whih will -e inreased #or all upda!e.

    )here#ore, !he NameNode *now i# any o# !he hun*s o# a Da!aNode is s!ale !hose s!ale hun*s will -e

    $ar-a$e olle!ed a! a la!er !ime. )o read a #ile, !he lien! A"' will alula!e !he hun* index -ased on !he

    o##se! o# !he #ile poin!er and ma*e a reues! !o !he NameNode. )he NameNode will reply whih

    Da!aNodes has a opy o# !ha! hun*. @rom !his poin!s, !he lien! on!a!s !he Da!aNode dire!ly wi!hou!

    $oin$ !hrou$h !he NameNode.

    )o wri!e a #ile, lien! A"' will #irs! on!a! !he NameNode who will desi$na!e one o# !he replia as !he

    primary -y $ran!in$ i! a lease. )he response o# !he NameNode on!ains who is !he primary and who are

    !he seondary replias. )hen !he lien! push i!s han$es !o all Da!aNodes in any order, -u! !his han$e is

    s!ored in a -u##er o# eah Da!aNode. A#!er han$es are -u##ered a! all Da!aNodes, !he lien! send a

    :ommi!; reues! !o !he primary, whih de!ermines an order !o upda!e and !hen push !his order !o all

    o!her seondaries. A#!er all seondaries omple!e !he ommi!, !he primary will response !o !he lien!

    a-ou! !he suess. All han$es o# hun* dis!ri-u!ion and me!ada!a han$es will -e wri!!en !o an opera!ion

    lo$ #ile a! !he NameNode. )his lo$ #ile main!ain an order lis! o# opera!ion whih is impor!an! #or !he

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    3/12

    International Journal of Advance Foundation and Research in Computer (IJAFRC)

    Volume 2, Issue 12, Decemer ! 2"1#$I%%& 2' * #', Impact Factor * 1$'1+

    2 - . 2"1#, IJAFRC All Ri/hts Reserved 000$ifarc$or/

    NameNode !o reo/er i!s /iew a#!er a rash. )he NameNode also main!ain i!s persis!en! s!a!e -y re$ularly

    he*5poin!in$ !o a #ile. 'n ase o# !he NameNode rash, a new NameNode will !a*e o/er a#!er res!orin$

    !he s!a!e #rom !he las! he*poin! #ile and replay !he opera!ion lo$.

    Figure 1: HDFS

    1$2 A3RD@C4VRVIB

    )he Map9edue #rame wor* 03 onsis!s o# a sin$le mas!er Ko-)ra*er and one sla/e )as*)ra*er per

    lus!er node. )he mas!er is responsi-le #or shedulin$ !he Io-s6 omponen! !as*s in !he sla/es,

    moni!orin$ !hem, and re5exeu!in$ any #ailed !as*s. )he sla/es exeu!ed !he !as*s as dire!ed -y !he

    mas!er. As men!ioned, Map9edue applia!ions are -ased on a mas!er5sla/e model 0

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    4/12

    International J

    Volume 2, Is

    ' - . 2"1#, IJAFRC All Ri/hts Re

    II$ RA5DB4R

    adoop&s Map9edue opera!ion is

    similar !o ordinary, non5preemp!i

    !as* is assi$ned. As wha! ' ha/e lea

    2$1 FIR%5 I& FIR%5 4@5 (FIF4)?

    )his expression desri-es !he prin

    -y orderin$ proess -y #irs! ome,

    in nex! wai!s un!il !he #irs! is #inish

    2$2 R4@&D R48I& (RR)?

    'n ompu!er opera!ion, one me!ho

    !he ompu!er is !o limi! eah proe

    ano!her proess a !urn or O!ime5sl

    2$' 7IE75 3RI4RI5 FIR%5 (73F)?

    )he al$ori!hm shedulin$ proess,

    "riori!y se!!in$ #or !he num-er o#

    o# rea!ion is -ased on !he ini!ial

    proess anno! -e han$ed durin$

    rea!e an ini!ial priori!y !o de!erm

    hara!eris!is han$e.

    2$ BIE75D R4@&D R48I& (BRR

    =ei$h!ed 9ound 9o-in is a shed

    ueue in a ne!wor* in!er#ae arin#ini!esimal amoun!s o# da!a #ro

    nonemp!y ueue.

    urnal of Advance Foundation and Researc

    ue 12, Decemer ! 2"1#$I%%& 2' * #'

    served

    Figure 2: MapReduce

    ini!ia!i/e reues!in$ #rom !as*!ra*er !o Io-

    e shedulin$ opera!in$ sys!em, whih is ann

    ned a-ou! !he adoop al$ori!hms, !here are #

    iple o# a ueue proessin$ !ehniue or ser/i

    #irs! ser/ed -eha/ior, wha! omes in #irs! is h

    d. )his is !he de#aul! al$ori!hm in adoop.

    o# ha/in$ di##eren! pro$ram proess !a*e !ur

    ss !o a er!ain shor! !ime period, !hen suspen

    eO. )his is o#!en desri-ed as round5ro-in p

    eah will -e assi$ned !o handle !he hi$hes!

    !a!i when i! an -e dynami. S!a!i priori!y

    hara!eris!is o# !he proess or user reuir

    opera!ion. Dynami priori!y num-er re#ers

    ne when !he #irs! num-er, a#!er runnin$ in !h

    ?

    ulin$ disipline. Eah pa*e! #low or onne

    . '! is !he simples! approxima!ion o# $eneraeah nonemp!y ueue, =99 ser/es a nu

    in Computer (IJAFRC)

    , Impact Factor * 1$'1+

    000$ifarc$or/

    !ra*er .)he priniple is

    o! -e in!errup! one !he

    ur lassi al$ori!hms.

    in$ on#li!in$ demands

    ndled #irs!, wha! omes

    s usin$ !he resoures o#

    in$ !ha! proess !o $i/e

    oess shedulin$.

    priori!y ready proess.

    um-er is in !he proess

    men!s iden!i#ied in !he

    !o !he proess and !hen

    e proess as !he proess

    !ion has i!s own pa*e!

    liFed. =hile H"S ser/es-er o# pa*e!s #or eah

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    5/12

    International Journal of Advance Foundation and Research in Computer (IJAFRC)

    Volume 2, Issue 12, Decemer ! 2"1#$I%%& 2' * #', Impact Factor * 1$'1+

    - . 2"1#, IJAFRC All Ri/hts Reserved 000$ifarc$or/

    III$ DI%3ARA5&%% ABAR %C7D@I&EA33R4AC7I&7AD443C@%5R&VIR4&&5

    )his se!ion explains !he o/erall #low desrip!ion o# !he proposed Dispara!eness5aware shedulin$

    approah in lus!er en/ironmen!. 'ni!ially, !he Dispara!eness lus!er en/ironmen! is rea!ed alon$ wi!h!he proper!ies o# resoure suh as resoure !ype, proessin$ speed, and !he memory. 'n order !o a/oid !he

    shedulin$ delay, !he sys!em needs !o #orm a lus!er usin$ !he 45en!roids lus!erin$. Dependin$ up on

    hi$her priori!ies, !he node will mo/e !o !he lus!er. @ur!hermore, !he osine similari!y is #indin$ ou! !o

    ompu!e !he lus!ers. A#!er aomplishin$ !he lus!er, !he #i!ness #un!ion is es!ima!ed wi!h !he

    onsidera!ion o# load, ener$y, and !ime #or eah lus!er. )hus, !he lus!ers are sheduled and !hen

    exeu!ed a#!er uploadin$ !he load. (ne any #ailure ours durin$ !he proess, !he /alue mus! reompu!ed

    usin$ "S( and predi!ed ano!her op!imal node. @i$ure 3 depi!s !he o/erall #low dia$ram o# !he proposed

    me!hodolo$y. )he maIor omponen!s o# !he proposed sys!em are -rie#ly disussed as #ollows

    %84 D%CRI35I4&(,) )ime reuired #or i

    !h lus!er omple!ion in r!h lus!er

    resoure

    (,) Load o# r!hlus!er resoure a! i!hlus!er su-mission(,) Ener$y reuired #or i!hlus!er in r!hlus!er resoure 'n/o*in$ )ime o# r!hlus!er resoure

    (,) Exeu!in$ )ime o# i!hlus!er in r!hlus!er resoure(,) 9e!rie/in$ )ime o# i!hlus!er in r!hlus!er resoure

    i!hClus!er SiFe

    Clus!er "roessin$ Speed o# r!hlus!er resoure

    I!hClus!er SiFe SiFe o# r!hClus!er 9esoure

    'n/o*e Ener$y o# r!hlus!er resoure(,) Exeu!ion Ener$y o# i!hlus!er in r!hlus!er resoure r!hClus!er 9esoure "roessin$ Ener$y

    Table 1. Symbols and its descriptions

    '$1 !C&5R4ID% C@%5RI&E

    )he well5*nown lus!erin$ pro-lem an -e sol/ed -y 45means, whih is one o# !he simples! unsuper/isedlearnin$ al$ori!hms. Assume !he 4 num-er o# lus!ers #or lassi#yin$ a $i/en lus!er proessor in a simple

    and easy way. 45means lus!erin$ does no! ha/e a $uaran!ee #or op!imal solu!ion as !he per#ormane is

    -ased on an ini!ial en!roids. )hus, !he proposed sys!em uses !he par!i!ionin$ lus!erin$, say, 45en!roids

    lus!erin$ as desri-ed in !he #ollowin$ al$ori!hm. )a-le ' shows !he no!a!ion and i!s desrip!ion !ha!

    employed in !he proposed sys!em

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    6/12

    International Journal of Advance Foundation and Research in Computer (IJAFRC)

    Volume 2, Issue 12, Decemer ! 2"1#$I%%& 2' * #', Impact Factor * 1$'1+

    # - . 2"1#, IJAFRC All Ri/hts Reserved 000$ifarc$or/

    Figure 3 An oerall !lo" diagram o! t#e proposed $%centroids clustering &n Hadoop 'luster.

    Al$ori!hm 1 45Cen!roids Clus!erin$

    'npu! Clus!er "roessorsC"n, * /alueBe$in

    'ni!ialiFe * en!roids C*

    @or i P +, 1, 2Qn !hen JJ Clus!er 9esoure "roper!y

    %1=9)i, C"Si, C"SiiR@or I P +, 1, 2Q* !hen JJ 45Cen!roid6s "roper!y

    %2= 9)I, C"SI, C"SiIRSimiI= CS(%1, %2)= %1 .%2

    |%1

    |.|%2

    |R

    End @or I

    Min'ndex P +R

    Min P Simi+Curr'ndex P +R

    =hile Curr'ndex * !hen

    '#SimiCurr_'ndex< !henMin = SimiCurr_'ndex

    Min'ndex P Curr'ndexR

    End '#

    Curr'ndexTTR

    End =hile

    Se!Clus!er'D Min'ndexR

    End #or iR

    Create Hadoop

    Cluster

    Get

    Compute Fitness function

    by Load Energy and Time

    CLUSTERING

    Compute K-

    Find cosine

    Compute

    Get Minimum value

    node in each cluster

    Schedule

    Upload

    Execute all

    Recomputed the nodes by

    Predict the Optimal

    If

    Failu

    re

    Get Resource

    Properties

    Resource

    Processing

    Memory

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    7/12

    International Journal of Advance Foundation and Research in Computer (IJAFRC)

    Volume 2, Issue 12, Decemer ! 2"1#$I%%& 2' * #', Impact Factor * 1$'1+

    - . 2"1#, IJAFRC All Ri/hts Reserved 000$ifarc$or/

    End Be$inR

    )he inpu!s !a*en #or !he a-o/e 45en!roids lus!erin$ al$ori!hm are !he Clus!er proessors and !he 4

    /alue. A! #irs!, !he 4 num-er o# en!roids are ini!ialiFed and !he /e!ors %1and %2 are de#ined wi!h

    respe! !o !he re!rie/in$ !ime, Clus!er proessin$ speed, and siFe o# Clus!er resoure. Dependin$ up on!he /e!ors, !he osine similari!y is es!ima!ed. )hen, !he similari!y measures /eri#ies #or !he minimum

    /alue o# !he urren! index. '# !he /alue is said !o -e less, !hen i! se!s as a minimum /alue. '! he*s un!il

    !he urren! index is less !han !he * num-er o# en!roids. )he minimum index is se! as !he lus!er 'D.

    @ur!her, !he #i!ness #un!ion is es!ima!ed un!il all Io-s are sheduled -y !he onsidera!ion o# load, ener$y,

    and !ime #or eah lus!er. '! is de#ined as

    () = (,)+ (,)+ (,) (1)

    '$1$15I C43@5A5I4&

    )he !ime ompu!a!ion #or i!h lus!er omple!ion in r!h Clus!er resoure is desri-ed as

    (,)= + (,)+ (,) (2)=here, !he Exeu!in$ )ime o# i!h lus!er in r!h Clus!er resoure and 9e!rie/in$ )ime o# i!h lus!er in r!h

    Clus!er resoure are $i/en respe!i/ely as in eua!ion 3 and eua!ion .

    (,)=

    (3)

    (,)=

    !+ "! (4)erein, ! is !he -andwid!h #or !he sheduler !o !he reei/er and "!men!ions !he delay !ha! ours-e!ween !he sheduler and !he reei/er #or ne!wor* ommunia!ion.

    '$2$2 4AD C43@5A5I4&)he load ompu!a!ion o# r!hClus!er resoure a! i!hlus!er su-mission is as #ollows

    (,)= # $%0 &'() (5)

    '$2$'&RE C43@5A5I4&

    An ener$y reuired #or i!h lus!er in r!h lus!er resoure is de#ined as

    (,) = + (,) (6)=here, !he Exeu!ion Ener$y o# i!h lus!er in r!h Clus!er resoure is $i/en as

    = *

    (7)

    Based on !he onsidera!ion o# !ime, load, and ener$y, !he proposed es!ima!ion o# #i!ness #un!ion has !he

    #ollowin$ ad/an!a$es

    )he shedulin$ delay an -e a/oided in !he ini!ial s!a$e.

    MinimiFe !he ompu!a!ion os!

    Derease !he exeu!ion !ime

    '$2$ 3%48A%D DI%3ARA5&%%!ABAR %C7D@I&E 4D

    )he heuris!i op!imiFa!ion al$ori!hms are widely used #or sol/in$ a wide /arie!y o# N"5omple!e

    pro-lems. "S( is onsidered as !he la!es! e/olu!ionary op!imiFa!ion !ehniues, whih has #ewer

    al$ori!hm parame!ers !han !he o!her approahes.Algorit#m 2: Disparateness%A"are Sc#eduling using (S)

    Input Clus!erLis! Kn

    , Clus!er 9esoure C9m

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    8/12

    International Journal of Advance Foundation and Research in Computer (IJAFRC)

    Volume 2, Issue 12, Decemer ! 2"1#$I%%& 2' * #', Impact Factor * 1$'1+

    + - . 2"1#, IJAFRC All Ri/hts Reserved 000$ifarc$or/

    4utput Alloa!ed Clus!er Lis! ALn

    8e/in

    Forx P +, 1, 2 Q n then

    )H9 P UR JJ )emporary Clus!er 9esoure lis!

    ForyP +, 1, 2 Q m thenIf K

    x. 9!ype(). eualsC9y. 9!ype()then)C9.add C9y)nd IFG

    nd ForyR

    SC9 P "S( )C9,Kx JJ Sele!ed Clus!er resoure

    Kx. se!lus!er9esoure(SC9)

    ALx- SH9nd ForxR

    nd 8e/inG

    )he inpu!s !a*en #or !his model are lus!er lis! and !he lus!er resoure. 'ni!ially, !he !emporary lus!er

    resoure lis! is emp!y and !he similari!y should -e #ur!her /eri#ied. '! he*s whe!her !he resoure !ype o#

    lus!er lis! is similar !o !he resoure !ype o# lus!er resoure. '# !he /eri#ia!ion is similar, !hen !he lus!er

    resoures are added !o !he !emporary lus!er resoure lis!. )he !ehniue o# "S( is applied #or !he

    !emporary lus!er resoure lis! and !he lus!er lis! !o aomplish !he sele!ed lus!er resoure. )he #inal

    ou!pu! o-!ained in !his shedulin$ model is !he alloa!ed lus!er resoure alon$ wi!h !he sele!ed lus!er

    resoure. =hen a #ailure ours durin$ !his proess, i! an -e reompu!ed -y "S( and !he o!her op!imal

    node will -e predi!ed. )he ad/an!a$es o/er !he he!ero$eneous5aware shedulin$ model are

    MinimiFe !he num-er o# #ailures

    'nreases !he resoure u!iliFa!ion

    IV$ 3RF4RA&CA&A%I%

    )his se!ion ompares !he per#ormane o# !he proposed Dispara!eness5Aware Shedulin$ DAS

    al$ori!hm wi!h !wo exis!in$ shedulin$ al$ori!hms 7ei/ht 3riorit9 First (73F)H1' and Bei/hted

    Round Roin (BRR)H1'.)he per#ormane me!ris used #or !he analysis are sys!em relia-ili!y,

    shedulin$ delay, shedulin$ len$!h, speed up, ener$y onsump!ion, and #ailure ra!io wi!h respe! !o !he

    num-er o# lus!ers. Due !o !he dynami resoure a/aila-ili!y, !he -eha/ior o# shedulin$ al$ori!hms on

    real lus!er pla!#orms is no! pra!ial. )he simula!ion is !he op!imum hoie #or !es!in$ and omparin$

    !he shedulin$ al$ori!hms, where !he experimen!s on real pla!#orms are o#!en non5reprodui-le. )hus, anex!ensi/e simula!ion en/ironmen! o# lus!er sys!em is -uil! as in )a-le ''.

    5RIC% 3ARA5R VA@

    Computin/

    &ode

    Num-er o# Clus!er 9esoure 7 5 7+

    Num-er o# os! per Clus!er 9esoure 7 5 1+

    "roessin$ Speed M'"S 1+++ 5 7+++

    Num-er o# 9esoure )ypes 2 5 7

    Communication Bandwid!h MB"S 1++ 5 7++La!enyms 7 5 1+

    Clusters Num-er o# Clus!ers 1+ 5 1++

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    9/12

    International J

    Volume 2, Is

    - . 2"1#, IJAFRC All Ri/hts Re

    $1%%5 RIA8II5 V%$&@8R

    )he relia-ili!y o# !he sys!em a

    ma!hema!ially de#ined as #ollows

    ere, /is !he dis!ri-u!ion o# !@i$.2 desri-es !he sys!em relia-ili

    !he proposed DAS model. )he sys

    approahes, where i!s /alue is $rad

    Figure * T#e result o!

    $2 %C7D@I&E &E57 V%$&@8

    Shedule len$!h is measured as

    =

    Figure +. T#e result o

    0.

    1.

    2.

    System

    reliability

    urnal of Advance Foundation and Researc

    ue 12, Decemer ! 2"1#$I%%& 2' * #'

    served

    iFe o# Clus!ers M'1+

    7+

    Table 2. Simulation parameters

    4F C@%5R%

    -e alula!ed -y !he a/era$e relia-ili!y

    =# /01%1

    (8)e relia-ili!y pro-a-ili!y o# lus!ers'1, and ni

    y in !erms o# !he num-er o# lus!ers #or !he ex

    !em relia-ili!y o# DAS model is $rea!er !ha

    ually dereased wi!h respe! !o !he num-er o#

    ystem reliability "it# respect to t#e numbe

    4F C@%5R%

    max('1), '2, 2 , '0 9

    sc#edule lengt# "it# respect to t#e number

    0

    5

    1

    5

    2

    5

    10 20 30 40 50 60 70 80 90Number of Clusters

    HPF WRR DAS

    in Computer (IJAFRC)

    , Impact Factor * 1$'1+

    000$ifarc$or/

    ++5

    ++

    o# all lus!ers and is

    s !he num-er o# lus!ers.

    is!in$ "@ and =99 and

    !he o!her !wo exis!in$

    lus!ers.

    r o! clusters

    o! 'lusters.

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    10/12

    International J

    Volume 2, Is

    - . 2"1#, IJAFRC All Ri/hts Re

    @i$. 7 depi!s !he resul! o# !he sh

    =99 and !he proposed DAS model.

    $' %3D @3 V%$&@8R 4F C@%5

    )he speed up is !he ra!io o# !he se

    is ompu!ed as #ollows

    @i$.< shows !he resul! o# speed up

    and !he proposed DAS model. )he s

    approahes, where i!s /alue is $rad

    Figure ,T#e resul

    $' %C7D@I&E DA V%$&@8R

    Figure - T#e result o! sc#e

    @i$.? shows !he resul! o# sheduli

    9DS2 and MCMS and !he propose!he o!her !wo shedulin$ al$ori!hm

    $&RE C4&%@35I4& V%$&@8

    urnal of Advance Foundation and Researc

    ue 12, Decemer ! 2"1#$I%%& 2' * #'

    served

    edule len$!h in !erms o# num-er o# lus!ers

    )he shedule len$!h is lower !han !he o!her !

    R%

    en!ial exeu!ion !ime !o !he shedule len$!h

    3

    # # 450

    46789././9.

    1+

    i!h respe! !o !he num-er o# lus!ers #or !he

    peed up o# AS model is hi$her !han !he o!he

    ually inreasin$ wi!h re$ard !o !he num-er o#

    t o! speed up "it# respect to t#e number o!

    4F C@%5R R%4@RC

    uling delay "it# respect to t#e number o! c

    $ delay wi!h respe! !o !he num-er lus!er

    d AS model. )he proposed sys!em has a lows.

    R 4F C@%5R R%4@RC

    in Computer (IJAFRC)

    , Impact Factor * 1$'1+

    000$ifarc$or/

    or !he exis!in$ "@ and

    o exis!in$ sys!ems.

    # !he ou!pu! shedule. '!

    xis!in$ "@ and =99

    !wo exis!in$

    lus!ers.

    lusters.

    uster resources$

    esoure #or !he exis!in$

    er shedulin$ delay !han

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    11/12

    International J

    Volume 2, Is

    +" - . 2"1#, IJAFRC All Ri/hts Re

    @i$. shows !he resul! o# ener$y

    exis!in$ "@ and =99 and !he pr

    !he o!her !wo shedulin$ al$ori!h

    resoure.

    Figure T#e result o! energ

    1$' FAI@R RA5I4 V%$&@8R 4F

    )he resul! o# !he shedulin$ delay

    =99 and !he proposed DAS model

    !he o!her !wo exis!in$ shedulin$ a

    Figure /T#e result o! !a

    V$ C4&C@%I4&A&DF@5@R

    )his paper proposes a :Dispara!e

    researh wor* we represen! 45C

    approah is mainly #oused on !he

    sys!em relia-ili!y. )he adoop l

    shedulin$ delay in !he adoop lmehanism is in!rodued alon$ wi

    !hese !hree parame!ers, !he op!i

    urnal of Advance Foundation and Researc

    ue 12, Decemer ! 2"1#$I%%& 2' * #'

    served

    onsump!ion wi!h respe! !o !he num-er

    posed DAS model. )he proposed sys!em o

    s. '!s /alue $radually inreases in re$ards !

    consumption "it# respect to t#e number o

    C@%5R R%4@RC

    wi!h respe! !o !he num-er lus!er resoure

    is shown in @i$.. )he proposed sys!em has

    l$ori!hms.

    lure ratio "it# respect to t#e number o! 'lu

    4R

    ess5Aware Shedulin$ al$ori!hm; in !he lus

    n!roids lus!erin$ in -i$ da!a mehanism #

    ner$y onsump!ion in !he adoop lus!er, w

    s!er onsis!s o# resoures whih are a!e$or

    s!er usin$ !he 45Cen!roids lus!erin$ al$ori!h !he onsidera!ion o# load, ener$y, and ne!

    iFed #i!ness #un!ion is employed #or "ar!i

    in Computer (IJAFRC)

    , Impact Factor * 1$'1+

    000$ifarc$or/

    lus!er resoure #or !he

    sumes less ener$y !han

    o !he num-er o# lus!er

    cluster resource.

    or !he exis!in$ "@ and

    lower #ailure ra!io !han

    ter resource.

    !er en/ironmen!. 'n !his

    r adoop lus!er. )his

    ih helps !o inrease !he

    iFed #or minimiFin$ !he

    m. A no/el pro/isionin$or* !ime. By in!e$ra!in$

    le Swarm (p!imiFa!ion

  • 7/25/2019 A Disparateness-Aware Scheduling using K-Centroids Clustering and PSO Techniques in Hadoop Cluster

    12/12

    International Journal of Advance Foundation and Research in Computer (IJAFRC)

    Volume 2, Issue 12, Decemer ! 2"1#$I%%& 2' * #', Impact Factor * 1$'1+

    +1 - . 2"1#, IJAFRC All Ri/hts Reserved 000$ifarc$or/

    "S( !o sele! !he ompu!in$ node. @ailure may our a#!er omple!ion o# !he suess#ul exeu!ion in !he

    ne!wor*. )o impro/e !he #aul! !olerane ser/ie, !he mi$ra!ion o# !he lus!er is #oused on !he par!iular

    #ailure node. )his an reompu!ed !he node -y "S( and !he orrespondin$ op!imal node is predi!ed. )he

    experimen!al resul!s exhi-i! -e!!er shedulin$ len$!h, shedulin$ delay, speed up, #ailure ra!io, ener$y

    onsump!ion !han !he exis!in$ sys!ems.

    VI$ RFR&C%

    01 %. Mayer5Shon-er$er, 4. Cu*ier, Bi$ Da!a A 9e/olu!ion )ha! =ill )rans#orm ow =e Li/e, =or*,

    and )hin*, ou$h!on Mi##lin arour!, 2+13.

    02 A. CuFForea, "ri/ay and seuri!y o# -i$ da!a urren! hallen$es and #u!ure researh

    perspe!i/es, in "roeedin$s o# !he @irs! 'n!erna!ional =or*shop on "ri/ay and Seuri!yo# Bi$

    Da!a, "SBD 61, 2+1.

    03 Bi$ da!a, Na!ure 77?2+V 2++ 113