Hadoop Big Data Lakes Keynote
Mark van Rijmenam, Founder, Datafloq
Big Data Strategist, International Keynote Speaker
Author of Think Bigger
Data Lakes: Store Anything, Analyze Everything
Ingest and store multiple data streams
From a variety of data sources
Enabling real-time in-memory & in-database analytics
Within an open and secure environment
Democratize Access to Data
6 Important Characteristics of a Hadoop Data Lake
Big Data Lake
Data source & latency demands agnostic
Data is stored in native format
Built with the Hadoop framework
Role-based access to all data
Mix & match any data source
Complete flexibility and extremely scalable
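
Two of the characteristics above, storing data in its native format and mixing any data source, can be illustrated with a small sketch. This is not from the keynote and does not use Hadoop: it uses a local temporary folder as a stand-in for the lake, and the names `ingest`, `read_records`, and `demo`, the folder layout, and the sample data are all hypothetical. The point it tries to show is that files are written unchanged and a structure is applied only when they are read.

```python
import csv
import json
import tempfile
from pathlib import Path


def ingest(lake_root, source, name, payload):
    """Write the raw bytes unchanged under a per-source folder (hypothetical layout)."""
    target = Path(lake_root) / source / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target


def read_records(path):
    """Apply a structure only at read time, chosen by file extension."""
    text = Path(path).read_text(encoding="utf-8")
    if path.suffix == ".jsonl":
        # One JSON object per line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if path.suffix == ".csv":
        # The first line is treated as the header row.
        return [dict(row) for row in csv.DictReader(text.splitlines())]
    raise ValueError(f"no reader registered for {path.suffix!r}")


def demo():
    """Ingest two differently shaped sources, then read both back as records."""
    with tempfile.TemporaryDirectory() as tmp:
        sensors = ingest(
            tmp, "sensors", "readings.jsonl",
            b'{"device": "t1", "temp": 21.5}\n{"device": "t2", "temp": 19.0}\n',
        )
        crm = ingest(tmp, "crm", "customers.csv", b"id,name\n1,Ada\n")
        return read_records(sensors) + read_records(crm)


if __name__ == "__main__":
    for record in demo():
        print(record)
```

In a real deployment the local folder would be replaced by a distributed file system, and the readers would be whatever query engines sit on top of it; the sketch only shows the store-first, interpret-later ordering.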
1) The Rise of IoT Data Lakes
2) Spark-Enabled Data Lakes
3) The Appearance of Data-Lake-as-a-Service
The Challenges of the Internet of Things are the drivers behind the IoT Data Lake
Shared Standards and Infrastructure
Data Control and Access
Data Security
The IoT Data Lake
There are several important components of Apache Spark
General Purpose Engine
Large-scale data processing
In-memory distributed computing engine
Build quickly, iterate fast
Unified platform
Three important challenges solved with Data-Lake-as-a-Service Solutions
Governance: Data Preparation, Metadata Management
Security: Data Breaches, Role-based Access
Value: Reduce Complexity, Reduce Costs, Improve Analytics
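
The governance and security items above (metadata management and role-based access) can be pictured with a minimal sketch. It is an illustration only, not how any particular Data-Lake-as-a-Service product works: the `Dataset` and `Catalog` classes, the role names, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Metadata kept alongside the data: a name, an owner, and the roles allowed to read it."""
    name: str
    owner: str
    allowed_roles: set
    rows: list = field(default_factory=list)


class Catalog:
    """Hypothetical metadata catalog that checks a caller's role before returning data."""

    def __init__(self):
        self._datasets = {}

    def register(self, dataset):
        self._datasets[dataset.name] = dataset

    def describe(self, name):
        # Metadata is visible to everyone; only the rows are protected.
        dataset = self._datasets[name]
        return {"name": dataset.name, "owner": dataset.owner,
                "allowed_roles": sorted(dataset.allowed_roles)}

    def read(self, name, role):
        dataset = self._datasets[name]
        if role not in dataset.allowed_roles:
            raise PermissionError(f"role {role!r} may not read {name!r}")
        return dataset.rows


if __name__ == "__main__":
    catalog = Catalog()
    catalog.register(Dataset("payments", "finance", {"analyst", "auditor"},
                             [{"id": 1, "amount": 40}]))
    print(catalog.describe("payments"))
    print(catalog.read("payments", "analyst"))
```

Keeping the access rule in the catalog entry, next to the other metadata, means one lookup answers both "what is this dataset" and "who may read it".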
Thank you
@vanRijmenam
https://datafloq.com
Available on Amazon