Implementation challenges in Big Data - Dr. Nilesh Karnik

Post on 09-May-2015

In today's competitive environment, companies face many kinds of challenges, and implementing Big Data is one of them. Dr. Nilesh Karnik takes us through some of these implementation challenges.

Transcript of Implementation challenges in Big Data - Dr. Nilesh Karnik

Aureus Claims Solution

Copyright 2013 RESTRICTED CIRCULATION

Implementation Challenges in Big Data Analytics

• Dr. Nilesh N. Karnik

What we will discuss

• The Challenge of BIG Data

• ADVANCED Analytics

• SOLUTIONS in the Pipeline

Big Data : Distributed Processing

OLD IDEA NEW IDEA

EXAMPLE 1: Task of storing books on a shelf

Simple, right?

Image source Flickr. Image copyright belongs with original artist.

And now?

EXAMPLE 2 : Summarizing a Report

SUMMER PROJECT REPORT

Simple, right?

And now?

EXAMPLE 3 : Baking a Cake

Simple, right?

And now?

Image source PINTEREST. Image copyright belongs with original artist.

Advanced Analytics

• Well-developed tool set for the “small data” environment

• Challenges in the Big Data environment

Advanced Analytics: MapReduce Difficulties

ITERATIVE ALGORITHMS NEED ONE JOB PER ITERATION
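
Plain MapReduce has no built-in loop, so an iterative algorithm is usually run as a chain of jobs, and each job reads the whole input again. The sketch below imitates that pattern in a single process with a one-dimensional k-means. The function names and the choice of k-means are assumptions made for illustration; they are not from the talk and this is not Hadoop code.

```python
from collections import defaultdict

def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every input point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, p

def reduce_phase(pairs, centroids):
    """Reduce: average the points grouped under each centroid index."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # A centroid that attracted no points keeps its previous position.
    return [sum(groups[i]) / len(groups[i]) if groups[i] else c
            for i, c in enumerate(centroids)]

def iterative_kmeans(points, centroids, max_iterations=10):
    """Run one map/reduce round per iteration; every round rescans all points."""
    full_scans = 0
    for _ in range(max_iterations):
        new_centroids = reduce_phase(map_phase(points, centroids), centroids)
        full_scans += 1
        if new_centroids == centroids:
            break  # converged: the last round changed nothing
        centroids = new_centroids
    return centroids, full_scans

if __name__ == "__main__":
    final, scans = iterative_kmeans([0, 1, 2, 9, 10, 11], [0.0, 5.0])
    print(final, "after", scans, "full scans of the data")
```

On a cluster, each of those scans would also pay for job start-up and for writing intermediate results to disk, which is the cost that loop-aware frameworks try to avoid.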

INCREMENTAL PROCESSING REQUIRES RESTART
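
The restart problem can be shown with word counts. A batch job that receives one new document has to recount everything, while an incremental approach keeps the previous result and folds in only the new document. This is a minimal single-process sketch; the function names are hypothetical and word counting is only a stand-in for a real analytics job.

```python
from collections import Counter

def batch_word_count(documents):
    """Batch style: recount every document from scratch."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.split())
    return counts

def incremental_word_count(counts, new_document):
    """Incremental style: fold only the new document into the saved counts."""
    updated = Counter(counts)  # copy so the saved state is not mutated
    updated.update(new_document.split())
    return updated

if __name__ == "__main__":
    old_docs = ["big data", "big cluster"]
    saved = batch_word_count(old_docs)

    # A new document arrives. The batch path rereads all three documents;
    # the incremental path reads only the new one.
    restarted = batch_word_count(old_docs + ["data data"])
    updated = incremental_word_count(saved, "data data")
    print(restarted == updated)
```

Both paths give the same counts. The difference is how much input each one has to read, and that difference grows with the size of the existing data.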

BATCH LEARNING SCANS ALL DATA IN ONE GO
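
To make the cost concrete, here is a small sketch of batch gradient descent for the one-parameter model y = w * x. Every single update averages the gradient over all records, so the number of records read is the number of steps times the size of the data set. The model, the names and the default values are assumptions chosen for illustration.

```python
def batch_gradient_step(data, w, learning_rate):
    """One batch update: the gradient is averaged over every record."""
    gradient = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - learning_rate * gradient

def batch_fit(data, learning_rate=0.01, steps=200):
    """Fit y = w * x and count how many records were read along the way."""
    w = 0.0
    records_read = 0
    for _ in range(steps):
        w = batch_gradient_step(data, w, learning_rate)
        records_read += len(data)  # each step is a full scan
    return w, records_read

if __name__ == "__main__":
    data = [(x, 2.0 * x) for x in range(1, 6)]  # noiseless line with slope 2
    w, records_read = batch_fit(data)
    print(round(w, 6), records_read)
```

With five records this is trivial. With billions of records spread over a cluster, a full scan per update is what makes batch learning expensive.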

Some Solutions Data Scientists are working on

New frameworks

• E.g., HaLoop*, PrIter# (extensions of Hadoop)

• Percolator$ (proprietary Google framework)

* Y. Bu, B. Howe, M. Balazinska, and M. Ernst, “HaLoop: Efficient iterative data processing on large clusters”, VLDB, 2010.

# Y. Zhang, Q. Gao, L. Gao, and C. Wang, “PrIter: A distributed framework for prioritized iterative computations”, SoCC, 2011.

$ D. Peng and F. Dabek, “Large-scale incremental processing using distributed transactions and notifications”, OSDI, 2010.

Smarter algorithms / Different implementations

• Random forest

• Parallelized Stochastic Gradient Descent
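
One common way to parallelize stochastic gradient descent is to split the data into shards, run ordinary SGD on each shard independently, and average the resulting parameters. Whether this is the variant the speaker had in mind is an assumption. The sketch below runs the workers one after another in a single process; the model y = w * x, the function names and the default values are all hypothetical.

```python
import random

def sgd_on_shard(shard, learning_rate=0.01, epochs=50, seed=0):
    """Plain SGD for the one-parameter model y = w * x on a single shard."""
    rng = random.Random(seed)
    data = list(shard)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the shard's records in random order
        for x, y in data:
            gradient = 2 * (w * x - y) * x  # derivative of (w * x - y) ** 2
            w -= learning_rate * gradient
    return w

def parallel_sgd(dataset, num_workers=4):
    """Train one model per shard, then average the learned parameters."""
    shards = [dataset[i::num_workers] for i in range(num_workers)]
    # On a cluster each call below would run on a different machine;
    # here the workers are simulated sequentially.
    weights = [sgd_on_shard(shard, seed=i) for i, shard in enumerate(shards)]
    return sum(weights) / len(weights)

if __name__ == "__main__":
    dataset = [(x, 3.0 * x) for x in range(1, 9)]  # noiseless line with slope 3
    print(parallel_sgd(dataset))  # close to 3.0
```

The workers never talk to each other until the final average, so no worker has to scan the full data set. Random forests parallelize for a similar reason: each tree can be trained independently of the others.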

@nilesh_karnik

Nilesh@aureusanalytics.com

SINGAPORE
Aureus Analytics Pte. Ltd.
17, Phillip Street, #05-01, Grand Building
Singapore (048695)

INDIA
Aureus Analytics Pvt. Ltd.
706, Powai Plaza
Hiranandani Gardens, Powai
Mumbai – 400076

info@aureusanalytics.com
www.aureusanalytics.com

Thank You!