Megatrends in · Data warehouse vs. data lake Characteristics Data Warehouse Data Lake Data...

Post on 29-Jun-2020

Megatrends in the digital era

DITTAYA WANVARIE

JUNE 24, 2020

Industrial revolution

STEAM-POWERED FACTORIES

MASS PRODUCTION AND MANUFACTURING

DIGITIZATION

This Photo by https://www.the-vital-edge.com/words-as-bridge/ is licensed under CC BY

Computer Processing Unit

Processing unit

◦ Mechanical systems: Gears

◦ Electro-mechanical systems: Relays

◦ Electronic systems: Vacuum tubes

◦ Electronic systems: Transistors

Quantum computers?

Computer Storage

Storage

◦ Punch cards

◦ Magnetic tape

◦ Magnetic disks

◦ Optical discs

◦ Solid-state disks

Moore's law

THE NUMBER OF TRANSISTORS IN A DENSE INTEGRATED CIRCUIT (IC) DOUBLES ABOUT EVERY TWO YEARS.
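As a rough worked example of the doubling rule, the sketch below projects a transistor count forward in time using N(t) = N0 * 2^(t / T). The function name, the starting count of 1,000, and the fixed two-year doubling period T are illustrative assumptions, not figures from the slides.

```python
def projected_transistors(initial_count, years_elapsed, doubling_period_years=2.0):
    """Project a transistor count assuming it doubles every `doubling_period_years`."""
    doublings = years_elapsed / doubling_period_years
    return initial_count * 2 ** doublings


# Hypothetical starting point: a chip with 1,000 transistors.
after_ten_years = projected_transistors(1_000, 10)   # 5 doublings
after_forty_years = projected_transistors(1_000, 40)  # 20 doublings
print(after_ten_years, after_forty_years)
```

Ten years means five doublings (a factor of 32), while forty years means twenty doublings (a factor of about one million), which is why the gap between ENIAC-era machines and a smartphone is so large.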

ENIAC vs a smartphone

This Photo by Unknown Author is licensed under CC BY-SA

This Photo by Unknown Author is licensed under CC BY

The 4th industrial revolution

Klaus Schwab, World Economic Forum 2016

How do we adapt our lives to digital technology?

Artificial intelligence

3D printing

Internet of things

MOOC

BIO-TECH DIGITAL-TECH NANO-TECH NEURO-TECH GREEN-TECH OTHERS

Crucial Emerging Technologies for the SDGs until 2030

UN Global Sustainable Development Report 2016

Keys in the digital era

PROCESSING POWER

STORAGE SIZE

CONNECTIVITY BETWEEN SYSTEMS

CLEAN DATA!

How does the digital age change our lives?

Digitization

Documents

E-mails

E-books, e-journals

Digital music

Workplace

VIRTUAL DESK: DESKTOP

VIRTUAL MEETING ROOM

SHARED WORKSPACE

If we have digitized data

We can automatically generate or summarize e-mails

Gmail: extract an event from an e-mail and add it to the calendar

Automatic transcription of meetings, even during a live event

Automation

ROBOTIC PROCESS AUTOMATION

EX: MICROSOFT POWER AUTOMATE

Predefined workflow

This Photo by Unknown Author is licensed under CC BY-SA

Artificial Intelligence

A model that tries to produce results similar to those of a human

A collection of records of human behavior

Example applications of AI

Diagnosis of cancer from X-ray film or MRI

Face recognition

Fraud detection

Stock price prediction

Personalized recommendation

Personal assistants: Siri, Google Assistant, Alexa

Self-driving car

Data is the new oil

Super app

WeChat

LINE

Kbank, SCB

Manufacturing industry

Machines

Control

Optimizing asset performance

Quality assurance

Automation

Logistics

3D Printing

CHEAP PROTOTYPE

CUSTOMIZATION

ON-DEMAND

Bone 3D printing

Lim et al. [2018] 3D-Printed Personalized Titanium Implant Design, Manufacturing and Verification for Bone Tumor Surgery of Forearm.

This Photo by Lim et al. is licensed under CC BY

Internet of things

SURVEILLANCE CAMERA

SMART FARMING

CONTACT TRACING/TRACKING

Amazon warehouse and a fleet of robots

This Photo by Unknown Author is licensed under CC BY-ND

Amazon fulfillment center

https://www.youtube.com/watch?v=YL9XjyXsKKk

MOOC

Coursera

edX

Chula MOOC

Thai MOOC

How should we follow the trends?

Data is the new oil

Systematically collect data

◦ Extract (from the source)

◦ Transform

◦ Load (to the target)

Write logic to make use of the data

Data warehouse

https://www.syncsort.com/en/glossary/etl
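The extract, transform, load steps above, followed by logic that makes use of the data, can be sketched with the Python standard library. This is a minimal illustration only: the CSV content, the `enrollment` table, and all function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (assumption for illustration).
RAW_CSV = """student_id,course,grade
 101 ,math,3.5
102,Physics,
103,MATH,4.0
"""


def extract(raw_text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: trim whitespace, normalize course names, drop rows without a grade."""
    cleaned = []
    for row in rows:
        grade = row["grade"].strip()
        if not grade:
            continue  # incomplete record, not loaded
        student_id = int(row["student_id"].strip())
        course = row["course"].strip().upper()
        cleaned.append((student_id, course, float(grade)))
    return cleaned


def load(rows, connection):
    """Load: write the cleaned rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS enrollment "
        "(student_id INTEGER, course TEXT, grade REAL)"
    )
    connection.executemany("INSERT INTO enrollment VALUES (?, ?, ?)", rows)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), connection)

# "Write logic to make use of the data": a simple aggregate query.
row_count = connection.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
average_math = connection.execute(
    "SELECT AVG(grade) FROM enrollment WHERE course = 'MATH'"
).fetchone()[0]
print(row_count, average_math)
```

Only the transform step knows about the quirks of the source (stray spaces, inconsistent capitalization, missing grades), so the query at the end can assume clean data.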

Data integration is not easy

An example in a university

Office of the registrar

Student profiles

Student performance

Course enrollment

Human resource management

Staff profile

Staff promotion and benefits

Resource management

List of facilities

Facility usages

Faculty admin

Assigning a course to a staff member

Data lake

https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/

Data warehouse vs. data lake

Characteristics | Data Warehouse | Data Lake

Data | Relational, from transactional systems, operational databases, and line-of-business applications | Non-relational and relational, from IoT devices, web sites, mobile apps, social media, and corporate applications

Schema | Designed prior to the DW implementation (schema-on-write) | Written at the time of analysis (schema-on-read)

Price/Performance | Fastest query results using higher-cost storage | Query results getting faster using low-cost storage

Data Quality | Highly curated data that serves as the central version of the truth | Any data that may or may not be curated (i.e., raw data)

Users | Business analysts | Data scientists, data developers, and business analysts (using curated data)

Analytics | Batch reporting, BI, and visualizations | Machine learning, predictive analytics, data discovery, and profiling

https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/
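The schema row of the comparison can be illustrated with a small sketch. It is not how any particular warehouse or lake product works: the sensor records, `WAREHOUSE_SCHEMA`, and the function names are hypothetical, and plain Python lists stand in for storage.

```python
import json

# Hypothetical schema, fixed before any data is stored (schema-on-write).
WAREHOUSE_SCHEMA = {"sensor_id": int, "temperature": float}


def write_to_warehouse(table, record):
    """Schema-on-write: validate against the schema before storing."""
    if set(record) != set(WAREHOUSE_SCHEMA):
        raise ValueError("record fields do not match the schema")
    for field, expected_type in WAREHOUSE_SCHEMA.items():
        if not isinstance(record[field], expected_type):
            raise ValueError(f"{field} must be {expected_type.__name__}")
    table.append(record)


def write_to_lake(lake, raw_text):
    """Data lake: store the raw text as-is, with no validation."""
    lake.append(raw_text)


def read_from_lake(lake, fields):
    """Schema-on-read: parse and select fields only at analysis time."""
    rows = []
    for raw_text in lake:
        try:
            record = json.loads(raw_text)
        except json.JSONDecodeError:
            continue  # raw item that this analysis cannot interpret
        if isinstance(record, dict) and all(field in record for field in fields):
            rows.append({field: record[field] for field in fields})
    return rows


warehouse = []
write_to_warehouse(warehouse, {"sensor_id": 1, "temperature": 21.5})
try:
    write_to_warehouse(warehouse, {"sensor_id": 2, "temperature": "hot"})
    rejected = False
except ValueError:
    rejected = True  # bad record never reaches the warehouse

lake = []
write_to_lake(lake, '{"sensor_id": 1, "temperature": 21.5, "note": "ok"}')
write_to_lake(lake, "not json at all")
write_to_lake(lake, '{"sensor_id": 2}')
lake_rows = read_from_lake(lake, ["sensor_id", "temperature"])
print(len(warehouse), rejected, len(lake), lake_rows)
```

The warehouse pays the validation cost once at write time and stays clean, while the lake accepts everything and leaves each analysis to decide which items fit its needs.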

Technology Concern

TECHNOLOGY LIMITATION

SECURITY OF THE INFRASTRUCTURE

LEGAL ASPECTS

Sample bias and wrong hypotheses

This Photo by Unknown Author is licensed under CC BY-SA

Ransomware attack

Provincial Electricity Authority (PEA)

Hospitals

Accountability

AI JUDGE

CREDIT ASSESSMENT

AUTOPILOT

Social Impact

Digital literacy

Digital divide

Jobs

Data is the new oil!

FINAL REMARKS