10 things you need to know about Spark
Suited for real-time applications, such as the Internet of Things, where much or most of the data analysis will be performed on cached, live data rather than stored, historical data.
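To make "analysis on cached, live data" concrete, here is a minimal plain-Python sketch of a rolling average kept entirely in memory. It is not Spark code: the class name `SlidingWindowAverage`, the window size and the sensor readings are all hypothetical, chosen only to illustrate the idea of analyzing recent readings as they arrive instead of re-reading stored history.

```python
from collections import deque


class SlidingWindowAverage:
    """Keep the most recent readings in memory and report their mean."""

    def __init__(self, size):
        # A bounded deque drops the oldest reading automatically when full.
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, reading):
        # Subtract the reading that is about to be evicted, if any.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(reading)
        self.total += reading
        return self.total / len(self.window)


# Hypothetical temperature readings arriving one at a time.
sensor = SlidingWindowAverage(size=3)
averages = [sensor.add(r) for r in [20.0, 22.0, 24.0, 30.0]]
print(averages)
```

Each new reading updates the result in constant time, without touching disk. A streaming engine applies the same principle at cluster scale.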
Includes runtime engines that are optimized for in-memory processing, streaming analytics, graph analysis and machine learning.
Leverages existing programming languages such as Python, Scala or SQL and provides seamless access to enterprise data with familiar tools.
Boosts data scientist productivity through in-memory performance, easier APIs, support for multiple programming languages and a broader range of workflows.
Extends existing user investments in advanced analytics, machine learning platforms and big data platforms such as Hadoop.
Parallelizes big data analytics models across distributed in-memory clusters, combining SQL, streaming and graph analytics within the same application.
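The partition-then-combine pattern behind that kind of parallelism can be sketched in plain Python. This is an illustration under stated assumptions, not Spark's API: the function names `count_words` and `parallel_word_count` and the sample lines are hypothetical, and a local thread pool stands in for the worker nodes of a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Map step: count words within one partition of lines."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, num_partitions=2):
    # Split the in-memory data set into roughly equal partitions.
    partitions = [lines[i::num_partitions] for i in range(num_partitions)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        # Each partition is counted independently of the others.
        for partial in pool.map(count_words, partitions):
            # Reduce step: merge the per-partition results.
            total.update(partial)
    return total


lines = ["spark sql", "spark streaming", "graph analytics", "spark graph"]
totals = parallel_word_count(lines)
print(totals["spark"], totals["graph"])
```

Because each partition is processed without reference to the others, the same map and merge steps could run on separate machines, each holding its own partition in memory. Python threads do not speed up CPU-bound work, so the sketch shows only the structure of the computation, not the performance.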
Initially developed at the University of California, Berkeley's AMPLab starting in 2009 and advanced through the efforts of an expanding open-source community and industry.
Open-sourced in 2010, donated to the Apache Software Foundation in 2013 and promoted to top-level project status in 2014.
Continues to gain active members, with the Apache Spark community now boasting over 465 contributors.
Adopted by a growing range of organizations as the future of their big data analytics environment for new challenges requiring in-memory processing, machine learning, stream computing and graph analysis.
Hungry for more information on Spark?
Get started learning more about Spark today at
BigDataUniversity.com