Big Data and Health Care

BIG DATA DIGITAL HEALTH REVOLUTION Alex A0135681 Henri A0135487 Zheng A0121892 Pham A0095804 Yin A0119974 Kavitha A0110143 For information on other technologies, see http://www.slideshare.net/Funk98/presentations

Transcript of Big Data and Health Care

Page 1: Big Data and Health Care

BIG DATA DIGITAL HEALTH REVOLUTION

Alex A0135681

Henri A0135487

Zheng A0121892

Pham A0095804

Yin A0119974

Kavitha A0110143

For information on other technologies, see http://www.slideshare.net/Funk98/presentations

Page 2: Big Data and Health Care

HAVE YOU EVER VISITED A DOCTOR?

Page 3: Big Data and Health Care

ONE SIZE FITS ALL

Page 5: Big Data and Health Care

FOOD FOR THOUGHT

Page 6: Big Data and Health Care

FOOD FOR THOUGHT

40,000+ PATIENTS DIE IN THE US EACH

Page 7: Big Data and Health Care

BIG DATA DIGITAL HEALTH REVOLUTION

Page 8: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS

● DATA PROCESSING: HARDWARE

● DATA ANALYZING: ALGORITHMS

● SUSTAINABLE HEALTHCARE SYSTEM

Page 9: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS (TODAY, IN FUTURE)

● DATA PROCESSING: HARDWARE

● DATA ANALYZING: ALGORITHMS

● SUSTAINABLE HEALTHCARE SYSTEM

Page 10: Big Data and Health Care

SENSORS TODAY

● iBGStar

● iHealth wireless pulse oximeter

● Jawbone

● Withings smart body analyser

Page 11: Big Data and Health Care

SENSORS TODAY

● iBGStar

● iHealth wireless pulse oximeter

● Jawbone

● Withings smart body analyser

CALORIES, EATING HABITS, SLEEP, BODY TEMPERATURE, HEART RATE, BLOOD SUGAR

Page 12: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS (TODAY, IN FUTURE)

● DATA PROCESSING: HARDWARE

● DATA ANALYZING: ALGORITHMS

● SUSTAINABLE HEALTHCARE SYSTEM

Page 13: Big Data and Health Care

SENSORS IN FUTURE

● DETECTION

● ANALYSIS

● DIAGNOSTICS

● CELL CULTURE

● DRUG DELIVERY

● THERAPEUTICS

Page 14: Big Data and Health Care

SENSORS IN FUTURE

● Continuous Glucose Monitoring

● MicroCHIPS

● Google lens

● MIT batteryless power source

● Parathyroid hormone microchip injection

● Sensor-Laden Transdermal patch

Page 15: Big Data and Health Care

SENSORS IN FUTURE - BioMEMS and Microsystems

Page 16: Big Data and Health Care

SIZE

POWER

COMMUNICATION

Page 17: Big Data and Health Care

SENSORS IN FUTURE - Micro supercapacitors

Laser-scribed graphene micro-supercapacitors

Page 18: Big Data and Health Care

SENSORS IN FUTURE - Reduction in MOSFET size

Page 19: Big Data and Health Care

SENSORS IN FUTURE - External communication

Page 20: Big Data and Health Care

SENSORS IN FUTURE - The trend in shrinking cells

Page 21: Big Data and Health Care

SENSORS IN FUTURE - BioMEMS and Microsystems

● Size decrease

● Better and smaller communication chips and algorithms

● Micro-supercapacitors

● These advances will facilitate the arrival of new implantable chips

● Allows for unobtrusive personal medicine

● Allows for more tailored medicine

● It will require more data analysis and more processing power
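
To give a feel for why more sensors mean more data to analyze, here is a rough back-of-envelope sketch. Every figure in it (sampling rate, bytes per sample, patient counts) is an illustrative assumption, not a measurement from the slides; only the count of six measured quantities comes from the "SENSORS TODAY" slide.

```python
# Back-of-envelope estimate of raw sensor data volume.
# All constants are illustrative assumptions, not measured values.
SAMPLES_PER_SECOND = 1   # assumed: one reading per second per sensor
BYTES_PER_SAMPLE = 16    # assumed: 8-byte timestamp + 8-byte value
SENSORS_PER_PATIENT = 6  # the six quantities listed under "SENSORS TODAY"
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(patients: int) -> int:
    """Raw bytes produced per day by `patients` continuously monitored people."""
    return (patients * SENSORS_PER_PATIENT * SAMPLES_PER_SECOND
            * BYTES_PER_SAMPLE * SECONDS_PER_DAY)


if __name__ == "__main__":
    for patients in (1, 1_000, 1_000_000):
        gigabytes = bytes_per_day(patients) / 1e9
        print(f"{patients:>9} patients: {gigabytes:,.3f} GB/day")
```

Under these assumptions the volume grows linearly with the number of patients and sensors, which is the reason the later slides turn to storage hardware and algorithms.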

Page 22: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS

● DATA PROCESSING: HARDWARE (Introduction, SSD vs HDD, Data Protection)

● DATA ANALYZING: ALGORITHMS

● SUSTAINABLE HEALTHCARE SYSTEM

Page 23: Big Data and Health Care

HARDWARE: Introduction

The storage medium used is now more of a focus than the quantity of storage used. It is no longer one-size-fits-all.

The “Data Deluge” is fundamentally changing the way that storage is approached.

Page 24: Big Data and Health Care

HARDWARE: What’s Key to Efficient Data Processing?

Key Characteristics of Big Data Infrastructure:

● Provide real-time or near real-time responses.

● Handle huge data volumes that are growing rapidly.

● High processing/IOPS performance.

● Very large capacity.

Page 25: Big Data and Health Care

HARDWARE: WHY DO WE NEED A DIFFERENT APPROACH?

KEY DIFFERENTIATOR

● Big Data is largely unstructured.

● Unstructured data is immutable.

● Traditional file systems have built-in functions to handle insert/update.

● These functions create a lot of overhead in terms of performance, the IOs required to access data, and the ability to scale.

FIG. GROWTH OF UNSTRUCTURED DATA ANNUALLY

Page 26: Big Data and Health Care

HARDWARE: OBJECT STORAGE – Choice of Storage

● Objects are kept in one large, scalable pool of storage

● Stores metadata (information about the object)

● An object ID is stored to locate the data

● Objects are immutable

● No file system hierarchy

Products:

● Scality’s RING architecture

● Dell DX

● EMC’s Atmos
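
The properties listed for object storage (one flat pool, metadata per object, lookup by object ID, immutability, no hierarchy) can be sketched in a few lines. This is a toy in-memory illustration, not how Scality, Dell DX, or EMC Atmos are implemented; the class names and the choice of a SHA-256 content hash as the object ID are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class StoredObject:
    """An immutable payload plus read-only metadata describing it."""
    data: bytes
    metadata: Mapping[str, str]


class ObjectStore:
    """Toy object store: one flat pool keyed by object ID, no directories."""

    def __init__(self) -> None:
        self._pool: dict[str, StoredObject] = {}

    def put(self, data: bytes, metadata: dict[str, str]) -> str:
        # Assumption: the object ID is the SHA-256 hash of the content.
        object_id = hashlib.sha256(data).hexdigest()
        # setdefault never overwrites: an existing object stays unchanged.
        self._pool.setdefault(
            object_id, StoredObject(data, MappingProxyType(dict(metadata)))
        )
        return object_id

    def get(self, object_id: str) -> StoredObject:
        return self._pool[object_id]


if __name__ == "__main__":
    store = ObjectStore()
    oid = store.put(b"scan-bytes", {"modality": "MRI"})
    print(oid[:12], store.get(oid).metadata["modality"])
```

There is no update operation: changing the data means storing a new object under a new ID, which avoids the insert/update bookkeeping of a traditional file system.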

Page 27: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS

● DATA PROCESSING: HARDWARE (Introduction, SSD vs HDD, Data Protection)

● DATA ANALYZING: ALGORITHMS

● SUSTAINABLE HEALTHCARE SYSTEM

Page 28: Big Data and Health Care

HARDWARE: Storage Medium – Solid-State Drive (SSD) or Hard Disk Drive (HDD)

● Access times: SSDs exhibit virtually no access time.

● Random I/O performance: an SSD delivers at least 6000 IO/s, 15 times faster than an HDD (400 IO/s).

● Reliability: SSDs are 4-10 times more reliable.

(Images: SSD, HDD)
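
The random I/O figures above can be checked with simple arithmetic. The sketch below uses the two IOPS numbers from the slide; the workload of one million random reads is an assumed value chosen only to make the difference tangible.

```python
SSD_IOPS = 6000  # from the slide: "at least 6000 IO/Sec"
HDD_IOPS = 400   # from the slide: "400 IO/S"


def seconds_for_reads(num_reads: int, iops: int) -> float:
    """Time to serve `num_reads` random reads at a sustained IOPS rate."""
    return num_reads / iops


speedup = SSD_IOPS / HDD_IOPS  # ratio of the two slide figures

if __name__ == "__main__":
    reads = 1_000_000  # assumed workload size, for illustration only
    print(f"speedup: {speedup:.0f}x")
    print(f"SSD: {seconds_for_reads(reads, SSD_IOPS):.1f} s")
    print(f"HDD: {seconds_for_reads(reads, HDD_IOPS):.1f} s")
```

The ratio confirms the "15 times faster" claim, and it ignores caching, queue depth, and sequential access, all of which change real-world results.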

Page 29: Big Data and Health Care

HARDWARE: COMPARISON OF BOOT TIMES USING SSD & HDD

REAL TIME APPLICATIONS OF SSD

● Read-intensive video-on-demand (VOD) and image-retrieval applications.

● Emerging applications (Big Data/Hadoop/Cloud)

Page 30: Big Data and Health Care

HARDWARE: Solid-State Drives (SSDs) & Moore’s Law

2011: throughput 250 MB/s, capacity 512 GB

2014: data transfer 1000 MB/s, capacity 4 TB

Standard 2.5 inch form factor

Further scale-down of flash lithography leads to continued performance gains and greater capacity points.

Fig 1. HDD areal density follows Moore’s Law

Fig 2. Avg. price comparison of SSD vs. HDD

Page 31: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS

● DATA PROCESSING: HARDWARE (Introduction, SSD vs HDD, Data Protection)

● DATA ANALYZING: ALGORITHMS

● SUSTAINABLE HEALTHCARE SYSTEM

Page 32: Big Data and Health Care

HARDWARE: DATA PROTECTION – WHY DIFFER FROM TRADITIONAL APPROACHES?

Page 33: Big Data and Health Care

HARDWARE: Limitations of Traditional Approaches

RAID (REDUNDANT ARRAY OF INDEPENDENT DISKS)

● Originally designed for small-capacity disks.

● Restoring a failed drive takes longer as capacity increases.

● To shorten rebuild cycles, RAID systems ship with faster processors, leading to high energy consumption.

REPLICATION

● Copies add costs: typically 133% or more additional storage is needed for each additional copy.

● The storage system gets more expensive as the amount of data increases.

Page 34: Big Data and Health Care

HARDWARE: Information Dispersal - Better Approach?

How Does it Work?

● Information Dispersal Algorithms (IDAs) separate data into unrecognizable slices of information.

● The slices are then dispersed to storage nodes in disparate storage locations.

● Dispersal can be implemented locally or distributed.

● Only a pre-defined subset of the slices from the dispersed storage nodes is needed to fully retrieve all of the data.
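
The key property above, that a subset of slices suffices to rebuild the data, can be shown with the simplest possible scheme: two data slices plus one XOR parity slice, where any two of the three recover the original. This is a deliberately simplified stand-in and not a real IDA: production systems use stronger erasure codes with configurable slice counts, and unlike a true IDA the two data slices here are plain halves of the input rather than unrecognizable. The function names are assumptions made for the example.

```python
def _xor(p: bytes, q: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(p, q))


def split_2_of_3(data: bytes) -> dict[int, bytes]:
    """Split data into two halves (slices 0, 1) plus an XOR parity slice (2)."""
    if len(data) % 2:
        data += b"\x00"  # pad so both halves have equal length
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return {0: first, 1: second, 2: _xor(first, second)}


def recover(slices: dict[int, bytes], original_length: int) -> bytes:
    """Rebuild the data from any two of the three slices."""
    if len(slices) < 2:
        raise ValueError("need at least two of the three slices")
    first = slices[0] if 0 in slices else _xor(slices[1], slices[2])
    second = slices[1] if 1 in slices else _xor(slices[0], slices[2])
    return (first + second)[:original_length]  # drop any padding


if __name__ == "__main__":
    record = b"patient-123: blood sugar 5.4 mmol/L"
    slices = split_2_of_3(record)
    for lost in (0, 1, 2):  # simulate losing one storage node at a time
        surviving = {i: s for i, s in slices.items() if i != lost}
        assert recover(surviving, len(record)) == record
    print("recovered after any single slice loss")
```

Each slice would live on a different storage node, so the loss of any one node leaves the data fully retrievable.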

Page 35: Big Data and Health Care

HARDWARE: Benefits of Information Dispersal

● It is resilient against natural disasters or technological failures, like drive failures, system crashes and network failures.

● Data can still be accessed in real time even if there are multiple simultaneous failures across a string of hosting devices, servers or networks.

● Five 9’s or more are guaranteed with overhead as low as 20%, as opposed to 3 copies requiring 200% overhead.
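
The overhead comparison can be reproduced with a short calculation. In a scheme where any k of n slices rebuild the data, the extra storage is n/k - 1; for full replication it is the number of copies minus one. The 10-of-12 configuration below is an assumed example chosen because it yields the 20% figure; the slides do not state which configuration they have in mind.

```python
def replication_overhead(copies: int) -> float:
    """Extra storage, as a fraction of the data size, for `copies` full copies."""
    return float(copies - 1)


def dispersal_overhead(k: int, n: int) -> float:
    """Extra storage when data is cut into n slices and any k rebuild it."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    return n / k - 1


def raw_petabytes(data_pb: float, overhead: float) -> float:
    """Raw capacity needed to hold `data_pb` petabytes at a given overhead."""
    return data_pb * (1 + overhead)


if __name__ == "__main__":
    rep = replication_overhead(3)     # three copies, as on the slide
    ida = dispersal_overhead(10, 12)  # assumed 10-of-12 configuration
    print(f"3 copies: {rep:.0%} overhead, {raw_petabytes(1, rep):.1f} PB raw per PB")
    print(f"10-of-12: {ida:.0%} overhead, {raw_petabytes(1, ida):.1f} PB raw per PB")
```

A 10-of-12 layout also tolerates the loss of any two slices, which is the same number of failures that three full copies survive.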

Page 36: Big Data and Health Care

HARDWARE: Cost Savings from IDA in Petabyte Storage over RAID and Replication

Page 37: Big Data and Health Care

HARDWARE: Cost Savings from IDA in Petabyte Storage over RAID and Replication

When looking at the number of years without data loss at a 99.99999% confidence level, Information Dispersal doesn’t even appear on the chart: even for a large storage amount like 524K terabytes, the expected time without data loss exceeds anyone’s lifetime (theoretically over 79 million years).

Page 39: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS

● DATA PROCESSING: HARDWARE

● DATA ANALYZING: ALGORITHMS (Deal with huge data, Machine learning)

● SUSTAINABLE HEALTHCARE SYSTEM

Page 40: Big Data and Health Care

ALGORITHMS: Deal with the huge data

How can the huge dataset be made to match ICD 10?

Page 41: Big Data and Health Care

ALGORITHMS: ICD 10 introduction

ICD 10 Clinical Modifications (ICD CM Dataset): 69823

• 3-7 characters

• Character 1 is alpha

• Character 2 is numeric

• Characters 3-7 can be alpha or numeric

ICD 10 Procedure Coding System (ICD 10 PCS Dataset): 76000

• 7 characters

• Each one can be alpha or numeric

• Numbers 0-9; letters A-H, J-N, P-Z
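
The format rules listed for the two code sets translate directly into regular expressions. The sketch below checks only the shape of a code, not whether the code actually exists in either dataset. Stripping the decimal point before matching is an assumption added here, because published ICD-10-CM codes are written with a point after the third character (for example E11.9) while the rules on the slide count characters only. The sample codes are for illustration.

```python
import re

# ICD-10-CM per the slide: 3-7 characters, 1st alpha, 2nd numeric,
# characters 3-7 alpha or numeric.
ICD10_CM = re.compile(r"[A-Z][0-9][A-Z0-9]{1,5}")

# ICD-10-PCS per the slide: exactly 7 characters, digits 0-9 and
# letters A-H, J-N, P-Z (the letters I and O are excluded).
ICD10_PCS = re.compile(r"[0-9A-HJ-NP-Z]{7}")


def is_icd10_cm(code: str) -> bool:
    """True if `code` has the ICD-10-CM shape (decimal point ignored)."""
    normalized = code.replace(".", "").upper()
    return ICD10_CM.fullmatch(normalized) is not None


def is_icd10_pcs(code: str) -> bool:
    """True if `code` has the ICD-10-PCS shape."""
    return ICD10_PCS.fullmatch(code.upper()) is not None


if __name__ == "__main__":
    for sample in ("E11.9", "S72001A", "E1", "1E19"):
        print(sample, is_icd10_cm(sample))
    for sample in ("0DTJ4ZZ", "0DTJ4ZO", "0DTJ4Z"):
        print(sample, is_icd10_pcs(sample))
```

A shape check like this is a cheap first filter when mapping a large, non-standard dataset onto ICD 10; matching against the full code tables would still be needed afterwards.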

Page 42: Big Data and Health Care

ALGORITHMS: Why we need big data

● Huge Nonstandard Data Source (4V): Volume, Velocity, Variety, Veracity

● Data Feature Selection

● Huge multiple characters mapping databases

● Data Analytics

● Analytics Algorithms

● Machine Learning

● Image Retrieval system

Page 43: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS

● DATA PROCESSING: HARDWARE

● DATA ANALYZING: ALGORITHMS (Deal with huge data, Machine learning)

● SUSTAINABLE HEALTHCARE SYSTEM

Page 44: Big Data and Health Care

ALGORITHMS: Machine Learning in medical diagnosis

Diagnosis is a relatively straightforward machine learning problem. Clinical decision making is highly suited for rule-based systems because of the nature of the data, such as ICD-10 codes, medications, etc.
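
To illustrate what a rule-based system over coded data might look like, here is a minimal sketch. The record fields, rule names, thresholds, and messages are all hypothetical and carry no clinical validity; the only external fact used is that ICD-10 codes beginning with E11 denote type 2 diabetes mellitus.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PatientRecord:
    """Hypothetical record holding only coded data."""
    icd10_codes: set[str] = field(default_factory=set)
    medications: set[str] = field(default_factory=set)


@dataclass
class Rule:
    name: str
    condition: Callable[[PatientRecord], bool]
    message: str


def has_code_prefix(record: PatientRecord, prefix: str) -> bool:
    """True if any diagnosis code on the record starts with `prefix`."""
    return any(code.startswith(prefix) for code in record.icd10_codes)


# Hypothetical rules for illustration only; not clinical guidance.
RULES = [
    Rule(
        name="diabetes-without-medication",
        condition=lambda r: has_code_prefix(r, "E11") and not r.medications,
        message="Type 2 diabetes code present but no medication listed: flag for review.",
    ),
    Rule(
        name="many-medications",
        condition=lambda r: len(r.medications) >= 5,  # assumed threshold
        message="Five or more medications: flag for interaction review.",
    ),
]


def evaluate(record: PatientRecord) -> list[str]:
    """Return the names of all rules whose condition holds for the record."""
    return [rule.name for rule in RULES if rule.condition(record)]


if __name__ == "__main__":
    patient = PatientRecord(icd10_codes={"E11.9"}, medications=set())
    print(evaluate(patient))
```

Because every rule is an explicit condition over codes and medication lists, each flag can be traced back to the rule that produced it, which is one reason coded data suits this approach.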

Page 45: Big Data and Health Care

ALGORITHMS: Popular Imaging Modalities in Healthcare Domain

Page 46: Big Data and Health Care

ALGORITHMS: Medical Image Retrieval System

Page 47: Big Data and Health Care

ALGORITHMS: Database of the ImageCLEF Medical Competition

ImageCLEF medical: competition on Medical Image Processing

Two main tasks:

● Image-based retrieval

● Case-based retrieval

Source: http://www.imageclef.org/

(Chart: # of images)

Page 48: Big Data and Health Care

ALGORITHMS: Image-based retrieval Algorithm

• This is the classic medical retrieval task.

• Similar to Query by Image Example.

• Given the query image, find the most similar images.

Performance = Difficulty * Accuracy

(Chart: performance, # of images and mean average precision)

Source: http://www.imageclef.org/
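
Query-by-example retrieval and its evaluation by mean average precision can be sketched as follows. The sketch assumes each image has already been reduced to a numeric feature vector and ranks the database by cosine similarity; both the feature representation and the similarity measure are assumptions made for illustration, not the method of any ImageCLEF participant. The image IDs and vectors are invented toy data.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query: list[float], database: dict[str, list[float]],
             top_k: int = 3) -> list[str]:
    """Return the IDs of the `top_k` database images most similar to the query."""
    ranked = sorted(database,
                    key=lambda image_id: cosine_similarity(query, database[image_id]),
                    reverse=True)
    return ranked[:top_k]


def average_precision(ranked_ids: list[str], relevant_ids: set[str]) -> float:
    """Average of precision@rank taken at each rank where a relevant image appears."""
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean_average_precision(runs: list[tuple[list[str], set[str]]]) -> float:
    """Mean of the average precision over several (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Toy feature vectors; real systems would extract these from the images.
    database = {"xray_1": [1.0, 0.0], "xray_2": [0.9, 0.1], "mri_1": [0.0, 1.0]}
    ranking = retrieve([1.0, 0.0], database)
    print(ranking, average_precision(ranking, {"xray_1", "mri_1"}))
```

Average precision rewards rankings that place relevant images early, so a system that finds the right images but buries them deep in the list scores poorly.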

Page 49: Big Data and Health Care

ALGORITHMS: Case-based retrieval Algorithm

• This is a more complex task; it is closer to the clinical workflow.

• A case description, with patient demographics, limited symptoms and test results including imaging studies, is provided (but not the final diagnosis).

• The goal is to retrieve cases including images that might best suit the provided case description.

Source: http://www.imageclef.org/

Page 50: Big Data and Health Care

ALGORITHMS: Manual Calculation vs. Software and Algorithm

                    Manual calculation    Software and algorithm

Speed               Slow                  Fast

Accuracy            Hard to keep          Precision

Level to study      Quite hard            Easy to learn

Solution level      Shallow               Deep

Machine Learning    NO                    YES

Result              Hard to explain       Perspective visualization

Page 51: Big Data and Health Care

CONTENT

● DATA COLLECTION: SENSORS

● DATA PROCESSING: HARDWARE

● DATA ANALYZING: ALGORITHMS

● SUSTAINABLE HEALTHCARE SYSTEM (Technological fusion)

Page 52: Big Data and Health Care

TECHNOLOGICAL FUSION

● BioMEMS

● Hardware: Object Storage, Information Dispersal

● Machine Learning

Page 53: Big Data and Health Care

TECHNOLOGICAL FUSION: CONCLUSION

● More data can be gathered to identify patterns and interactions

● Doctors will use it for diagnosis and decision-making

● Health care costs will decrease

● Individual patient care will improve

Page 54: Big Data and Health Care

THANK YOU

Q&A