Implementation challenges in Big Data - Dr. Nilesh Karnik

description

In today's competitive environment, companies face many kinds of challenges, and implementing Big Data is one of them. Dr. Nilesh Karnik takes us through some of the implementation challenges.

Transcript of Implementation challenges in Big Data - Dr. Nilesh Karnik

Page 1: Implementation challenges in Big Data - Dr. Nilesh Karnik

Aureus Claims Solution

Copyright 2013 RESTRICTED CIRCULATION

Implementation Challenges in Big Data Analytics

• Dr. Nilesh N. Karnik

Page 2: Implementation challenges in Big Data - Dr. Nilesh Karnik

What we will discuss

• The Challenge of BIG Data

• ADVANCED Analytics

• SOLUTIONS in the Pipeline

Page 3: Implementation challenges in Big Data - Dr. Nilesh Karnik

Big Data: Distributed Processing

OLD IDEA, NEW IDEA!

Page 4: Implementation challenges in Big Data - Dr. Nilesh Karnik

EXAMPLE 1: Task of storing books on a shelf

Simple, right?

Image source: Flickr. Image copyright belongs to the original artist.

Page 5: Implementation challenges in Big Data - Dr. Nilesh Karnik

EXAMPLE 1: Task of storing books on a shelf

And now?

Page 6: Implementation challenges in Big Data - Dr. Nilesh Karnik

Page 7: Implementation challenges in Big Data - Dr. Nilesh Karnik

EXAMPLE 2: Summarizing a Report

SUMMER PROJECT REPORT

Simple, right?

Page 8: Implementation challenges in Big Data - Dr. Nilesh Karnik

EXAMPLE 2: Summarizing a Report

And now?

Page 9: Implementation challenges in Big Data - Dr. Nilesh Karnik

EXAMPLE 3: Baking a Cake

Simple, right? And now?

Image source: Pinterest. Image copyright belongs to the original artist.

Page 10: Implementation challenges in Big Data - Dr. Nilesh Karnik

Advanced Analytics

• Well-developed tool set for the “small data” environment

• Challenges in the Big Data environment

Page 11: Implementation challenges in Big Data - Dr. Nilesh Karnik

Advanced Analytics: MapReduce Difficulties

ITERATIVE
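
To make the iterative difficulty concrete, here is a minimal sketch, assuming k-means on one-dimensional points as the example (the slide does not name an algorithm). Each pass is a complete map step plus reduce step, and a driver loop has to relaunch that pass until the centroids stop moving. All function names are hypothetical, and everything runs in one process rather than on Hadoop.

```python
from collections import defaultdict

def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, p

def reduce_phase(pairs, centroids):
    """Reduce: average the points assigned to each centroid."""
    groups = defaultdict(list)
    for key, p in pairs:
        groups[key].append(p)
    # Keep the old centroid if no point was assigned to it.
    return [sum(groups[i]) / len(groups[i]) if groups[i] else c
            for i, c in enumerate(centroids)]

def kmeans_driver(points, centroids, max_jobs=50, tol=1e-9):
    """Driver: every loop iteration stands in for one full MapReduce job."""
    jobs = 0
    for _ in range(max_jobs):
        new_centroids = reduce_phase(map_phase(points, centroids), centroids)
        jobs += 1
        if all(abs(a - b) <= tol for a, b in zip(new_centroids, centroids)):
            break
        centroids = new_centroids
    return new_centroids, jobs

if __name__ == "__main__":
    result, jobs = kmeans_driver([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
    print(result, "after", jobs, "jobs")
```

In a real cluster each of those jobs would typically re-read its input from distributed storage and pay job start-up cost again, which is the overhead the slide is pointing at.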

Page 12: Implementation challenges in Big Data - Dr. Nilesh Karnik

Advanced Analytics: MapReduce Difficulties

INCREMENTAL PROCESSING REQUIRES RESTART

Page 13: Implementation challenges in Big Data - Dr. Nilesh Karnik

Advanced Analytics: MapReduce Difficulties

BATCH LEARNING SCANS ALL DATA IN ONE GO
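
As a contrast, here is a minimal sketch of batch versus incremental fitting, assuming the "model" is nothing more than a mean and a variance. Welford's online algorithm is used as a stand-in illustration; it is not named on the slide, and the names batch_fit and OnlineFit are hypothetical.

```python
import statistics

def batch_fit(data):
    """Batch: scan the full data set in one go."""
    return statistics.fmean(data), statistics.pvariance(data)

class OnlineFit:
    """Incremental: constant work per new record (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def result(self):
        if self.n == 0:
            raise ValueError("no data seen yet")
        # Population variance, to match statistics.pvariance.
        return self.mean, self.m2 / self.n

if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    online = OnlineFit()
    for x in data:
        online.update(x)
    print(batch_fit(data), online.result())
```

When a new record arrives, the batch version has to rescan everything, while the online version only calls update once.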

Page 14: Implementation challenges in Big Data - Dr. Nilesh Karnik

Some Solutions Data Scientists are working on

New frameworks

• E.g., HaLoop*, PrIter# (extensions of Hadoop)

• Percolator$ (proprietary Google framework)

* Y. Bu, B. Howe, M. Balazinska, and M. Ernst, “HaLoop: Efficient iterative data processing on large clusters”, VLDB, 2010.

# Y. Zhang, Q. Gao, L. Gao, and C. Wang, “PrIter: A distributed framework for prioritized iterative computations”, SoCC, 2011.

$ D. Peng and F. Dabek, “Large-scale incremental processing using distributed transactions and notifications”, OSDI, 2010.

Page 15: Implementation challenges in Big Data - Dr. Nilesh Karnik

Some Solutions Data Scientists are working on

Smarter algorithms / Different implementations

• Random forest

• Parallelized Stochastic Gradient Descent
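
One way to parallelize stochastic gradient descent is parameter averaging: split the data into shards, run ordinary SGD on each shard independently, then average the resulting weights. The slide does not say which variant is meant, so the sketch below is only an illustration under that assumption. It fits the one-parameter model y = w * x, simulates the workers sequentially, and assumes every shard is non-empty; sgd_shard and parallel_sgd are hypothetical names.

```python
import random

def sgd_shard(shard, lr=0.01, epochs=50, seed=0):
    """Run plain SGD on one shard for the model y = w * x (squared loss)."""
    rng = random.Random(seed)
    w = 0.0
    data = list(shard)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w

def parallel_sgd(data, n_workers=4, lr=0.01, epochs=50):
    """Split data into shards, fit each independently, average the weights."""
    shards = [data[i::n_workers] for i in range(n_workers)]
    # Workers are simulated one after another here; on a cluster each
    # sgd_shard call would run on its own node with no communication.
    weights = [sgd_shard(s, lr=lr, epochs=epochs, seed=i)
               for i, s in enumerate(shards)]
    return sum(weights) / len(weights)

if __name__ == "__main__":
    xs = [0.1 * k for k in range(1, 41)]
    data = [(x, 3.0 * x) for x in xs]  # noise-free data with true w = 3
    print(parallel_sgd(data))
```

Because the shards never communicate until the final average, this maps onto a single map step (fit a shard) and a single reduce step (average), which avoids the repeated-job overhead of iterative MapReduce.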

Page 16: Implementation challenges in Big Data - Dr. Nilesh Karnik

Page 17: Implementation challenges in Big Data - Dr. Nilesh Karnik

SINGAPORE

Aureus Analytics Pte. Ltd.

17, Phillip Street, #05-01, Grand Building

Singapore (048695)

INDIA

Aureus Analytics Pvt. Ltd.

706, Powai Plaza

Hiranandani Gardens, Powai

Mumbai

www.aureusanalytics.com

Thank You!