Big Data Cis


  • 8/9/2019 Big Data Cis

    1/15

    Big Data Analytics

    Shreekant Kadam

    XMBA - 58

    2/15

    What are we going to understand

    What is Big Data?

    Why did we land up there?

    To whom does it matter?

    Are we ready to handle it?

    What are the concerns?

    Tools and Technologies

    3/15

    Simple to start

    What is the maximum file size you have dealt with so far?

    Movies/Files/Streaming video that you have used?

    What have you observed?

    What is the maximum download speed you get?

    Simple computation: How much time to just transfer.
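The simple computation above can be sketched as follows. The file size and link speed are illustrative assumptions, not figures from the slides, and the helper name `transfer_time_seconds` is hypothetical. The estimate ignores protocol overhead and congestion, so real transfers would take longer.

```python
def transfer_time_seconds(size_bytes: float, speed_bits_per_sec: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    if speed_bits_per_sec <= 0:
        raise ValueError("speed must be positive")
    # Link speeds are quoted in bits per second, file sizes in bytes.
    return size_bytes * 8 / speed_bits_per_sec


# Assumed example values: a 1 TB data set over a 100 Mbit/s connection.
ONE_TB = 10**12        # bytes (decimal terabyte)
LINK = 100 * 10**6     # bits per second

seconds = transfer_time_seconds(ONE_TB, LINK)
print(f"{seconds / 3600:.1f} hours")  # prints "22.2 hours"
```

Even before any processing starts, just moving one terabyte over such a link would take most of a day.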

    4/15

    What is big data?

    "Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few. This data is big data."

    5/15

    Huge amount of data

    There are huge volumes of data in the world:

    + From the beginning of recorded time until 2003, we created 5 billion gigabytes (exabytes) of data.

    + In 2011, the same amount was created every two days.

    + In 2013, the same amount of data is created every 10 minutes.
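To see how steep that growth is, the figures above can be turned into daily rates. This is only a back-of-the-envelope sketch: it assumes decimal units (1 exabyte = 10^18 bytes = 1 billion gigabytes), and the helper name `bytes_per_day` is hypothetical.

```python
EXABYTE = 10**18          # bytes, decimal definition (assumption)
BASELINE = 5 * EXABYTE    # the "5 billion gigabytes" quoted on the slide

MINUTES_PER_DAY = 24 * 60


def bytes_per_day(amount_bytes: int, period_minutes: int) -> float:
    """Daily rate if amount_bytes is produced once every period_minutes."""
    return amount_bytes * MINUTES_PER_DAY / period_minutes


rate_2011 = bytes_per_day(BASELINE, 2 * MINUTES_PER_DAY)  # every two days
rate_2013 = bytes_per_day(BASELINE, 10)                   # every 10 minutes

print(rate_2011 / EXABYTE)    # exabytes per day in 2011
print(rate_2013 / EXABYTE)    # exabytes per day in 2013
print(rate_2013 / rate_2011)  # growth factor between the two claims
```

Under these assumptions the two claims correspond to roughly 2.5 and 720 exabytes per day, a 288-fold increase in two years.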

    6/15

    Big data spans three dimensions: Volume, Velocity and Variety

    Volume: Enterprises are awash with ever-growing data of all types, easily amassing terabytes, even petabytes, of information.

    + Turn 12 terabytes of Tweets created each day into improved product sentiment analysis

    + Convert 350 billion annual meter readings to better predict power consumption

    Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as catching fraud, big data must be used as it streams into your enterprise in order to maximize its value.

    + Scrutinize 5 million trade events created each day to identify potential fraud

    + Analyze 500 million daily call detail records in real-time to predict customer churn faster

    + The latest I have heard is that a 10-nanosecond delay is too much.

    Variety: Big data is any type of data - structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more. New insights are found when analyzing these data types together.

    + Monitor 100s of live video feeds from surveillance cameras to target points of interest

    + Exploit the 80% data growth in images, video and documents to improve customer satisfaction
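The Velocity point, using data as it streams in rather than after it has been stored, can be illustrated with a small sketch. It is not a real fraud model: the rule (flag an amount that exceeds a multiple of the recent average), the function name `flag_outliers`, and the sample values are all assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def flag_outliers(
    amounts: Iterable[float], window: int = 3, factor: float = 3.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, amount) for events larger than factor times the
    mean of the previous `window` events, one event at a time."""
    recent = deque(maxlen=window)  # only the last `window` events are kept
    for i, amount in enumerate(amounts):
        if len(recent) == window and amount > factor * (sum(recent) / window):
            yield i, amount
        recent.append(amount)


# Hypothetical trade amounts arriving one by one.
events = [100.0, 120.0, 90.0, 1000.0, 110.0]
print(list(flag_outliers(events)))  # prints [(3, 1000.0)]
```

Because the generator keeps only a fixed-size window, each event is judged as it arrives and memory use stays constant regardless of how long the stream runs.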

    7/15

    Finally...

    Big Data is similar to "small data", but bigger.

    But bigger data requires different approaches:

    techniques, tools, architecture

    with an aim to solve new problems,

    or old problems in a better way.

    8/15

    To whom does it matter?

    Research Community

    Business Community - New tools, new capabilities, new infrastructure, new business models, etc.

    In sectors

    Financial Services..

    9/15

    The Social Layer in an Instrumented Interconnected World

    + 2+ billion people on the Web by end 2011

    + 30 billion RFID tags today (1.3B in 2005)

    + 4.6 billion camera phones worldwide

    + 100s of millions of GPS-enabled devices sold annually

    + 76 million smart meters in 2009, 200M by 2014

    + 12+ TBs of tweet data every day

    + 25+ TBs of log data every day

    + ? TBs of data every day

    10/15

    What does Big Data trigger?

    From "Big Data and the Web: Algorithms for Data Intensive Scalable Computing", Ph.D. Thesis, Gianmarco

    11/15

    Types of tools typically used in Big Data scenarios

    Where is the processing hosted?

    Distributed server/cloud

    Where is the data stored?

    Distributed storage (e.g., Amazon S3)

    What is the programming model?

    Distributed processing (MapReduce)

    How is the data stored and indexed?

    High-performance schema-free database

    What operations are performed on the data?

    Analytic/semantic processing (e.g., RDF/OWL)
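The MapReduce programming model mentioned above can be sketched in a single process. This is not Hadoop or any real framework; it only mimics the map, shuffle and reduce steps on an in-memory list, and all function names are hypothetical. In a real deployment the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


docs = ["big data is big", "data is everywhere"]
print(word_count(docs))
# prints {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

The appeal of the model is that `map_phase` and `reduce_phase` see only one document or one key at a time, so the work can be spread across servers without changing the logic.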

    12/15

    When dealing with Big Data is hard

    When the operations on data are complex:

    E.g., simple counting is not a complex problem.

    Modeling and reasoning with data of different kinds can get extremely complex.

    Good news with big data:

    Often, because of the vast amount of data, modeling techniques can get simpler (e.g., smart counting can replace complex model-based analytics),

    as long as we deal with the scale.
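One way to picture "smart counting" replacing a model is next-word prediction from plain frequency counts. The sketch below is an assumed illustration, not a method from the slides: the tiny corpus and the function names are hypothetical, and with so little data the counts are only a toy.

```python
from collections import Counter, defaultdict


def build_bigram_counts(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of word, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


corpus = ["big data is big", "big data is everywhere", "data is valuable"]
counts = build_bigram_counts(corpus)
print(most_likely_next(counts, "big"))  # prints "data"
```

No grammar or statistical model is fitted here. The prediction is only as good as the volume of text counted, which is why the approach depends on being able to handle the scale.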

    13/15

    Time to think

    14/15

    Why Big-Data?

    Key enablers for the appearance and growth of "Big-Data" are:

    + Increase in storage capabilities

    + Increase in processing power

    + Availability of data
