Spanish Mario Nemirovsky
English Mario Nemirovsky
Silicon Valley version Mario Nemirovsky
In March 2012, the Obama administration announced the big data research and development initiative.
The leading IT companies, such as SAG, Oracle, IBM, Microsoft, SAP and HP, have spent more than $15 billion on buying data management and analytics software.
This industry on its own is worth more than $100 billion.
1. How big is big data?
2. Where does the data come from?
3. Where is it stored?
4. How is it analyzed?
5. How is it visualized? (later)
6. Who needs it?
Google was processing 20 PB a day in 2008
Wayback Machine had 3 PB +100 TB/month (3/2009)
Facebook has 2.5 PB of user data + 15 TB/day (4/2009)
eBay has 6.5 PB of user data + 50 TB/day (5/2009)
640K ought to be enough for anybody.
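The growth figures above can be put on a common footing with a little arithmetic. This is a minimal sketch, assuming decimal units (1 PB = 1000 TB), a 30-day month, and growth that stays constant at the quoted rate; none of these assumptions come from the slide, and the function name is illustrative.

```python
# Days until a data store doubles, assuming constant linear growth.
TB_PER_PB = 1000  # decimal units (assumption)


def days_to_double(stored_pb: float, growth_tb_per_day: float) -> float:
    """Return the days needed to add as much data as is already stored."""
    return stored_pb * TB_PER_PB / growth_tb_per_day


# Figures quoted on the slide (2009): (stored PB, growth in TB/day)
stores = {
    "Wayback Machine": (3.0, 100 / 30),  # 100 TB/month, 30-day month assumed
    "Facebook": (2.5, 15.0),
    "eBay": (6.5, 50.0),
}

for name, (stored_pb, rate) in stores.items():
    print(f"{name}: about {days_to_double(stored_pb, rate):.0f} days to double")
```

Under these assumptions the stores with the fastest inflow double in a matter of months, which is the point of the slide.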
Large Hadron Collider in 2012: 40,000,000,000,000 B/s (40 TB/s)
An Airbus A380 generates 640 TB per flight
Twitter generates 12 TB of data per day
The New York Stock Exchange generates 1 TB of data every day
Walmart alone had 30 billion RFID sensors in 2012
The Model of Generating/Consuming Data has Changed
Old Model: Few companies are generating data, all others are consuming data
New Model: all of us are generating data, and all of us are consuming data
Lots of data is being collected
◦ Web data, e-commerce
◦ Department/grocery stores
◦ Bank/credit card
◦ Social Network
◦ Health
◦ Genetics
Big Data Everywhere!
Source: Avanade Global Survey: The Business Impact of Big Data, November 2010
Science
◦ Databases from astronomy, genomics, environmental data, transportation data, …
Humanities and Social Sciences
◦ Scanned books, historical documents, social interaction data, new technology like GPS, …
Business & Commerce
◦ Corporate sales, stock market transactions, census, …
Entertainment
◦ Hollywood movies, MP3 files, …
Medicine
◦ MRI & CT scans, patient records, …
HP envisions 1 trillion sensors in use around the world
There are many types of sensors: temperature, pressure, level, humidity, speed, motion, distance, light, or presence/absence
IoT: “expansion of connectivity” using IP networking of “things” into public and private IP networks, linking computing and storage resources, and also people
The “Industrialization” of IP networks reaches domains previously characterized by application specific, often non-IP networks
“Smart Objects” include:
◦ Sensors: measure power quality/voltage/…, pressure, mechanical constraints, video, pollution, gas/water/… leaks, motion
◦ Actuators: act on devices (e.g. turn an engine or a light on/off, close a valve, or even trigger a complex set of actions)
◦ Smart tags (RFID)
Organized into: vehicles, intelligent traffic controls and lighting elements, industrial automation, healthcare, etc.
Rodolfo Milito
Today’s Dominant Endpoints: a person behind every device
Dominant Endpoints in 2025: devices clustered in systems
◦ Industrial Automation
◦ Healthcare
◦ Intelligent Buildings
◦ Precision Agriculture
◦ Transportation and Connected Vehicles
Rodolfo Milito
Non-trivial extension of Cloud Computing from the Core to the Edge that enables a whole new wave of services and applications
Virtualization, Multi-tenancy, & some distinctive features
fog = cloud close to the ground
Suites of Use Cases
- (Mobile) Content Delivery
  • Low-latency apps (gaming, streaming, augmented reality ...)
- Geo-distributed apps
  • Sensor/actuator networks, Smart Cities
- Large-scale distributed control systems
  • Connected Vehicle, Intelligent Transportation, Smart Grid
Fog is the platform where the Internet meets the physical world
Rodolfo Milito
Grid Data Latency Hierarchy (taken from Jeff Taft)
Multiple uses of same datum (latency requirements/destinations)
FOG / CLOUD interplay
Rodolfo Milito
Where is it stored?
What makes big data different? Why isn't saving/moving/copying big data as simple as using the tools we already have?
Big Data Store
• Difficult/slow transfers
• Expensive storage/backup
• Difficult to share and publish
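To make "difficult/slow transfers" concrete, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^15 bytes), a link that sustains its nominal rate, and no protocol overhead; the function and constant names are illustrative, not taken from the talk.

```python
SECONDS_PER_DAY = 86_400
PB = 10**15    # bytes, decimal petabyte (assumption)
GBPS = 10**9   # bits per second


def transfer_days(size_bytes: float, link_bits_per_s: float) -> float:
    """Days needed to move size_bytes over a link running at full nominal rate."""
    seconds = size_bytes * 8 / link_bits_per_s
    return seconds / SECONDS_PER_DAY


print(f"1 PB over 1 Gbit/s:  {transfer_days(PB, GBPS):.1f} days")
print(f"1 PB over 10 Gbit/s: {transfer_days(PB, 10 * GBPS):.1f} days")
```

Even under these optimistic assumptions, a single petabyte takes roughly three months on a 1 Gbit/s link, which is why copying big data with everyday tools is impractical.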
The process of examining large amounts of data of various types to uncover hidden patterns, unknown correlations, and other valuable information.
Predictive Power of Big Data Analytics in Healthcare
Analysis Of Farm Soil
Improving Oil and Gas Operations
Retailers are Using Big Data Analytics to Outperform Others
…..
Need for immediate response
– Can't afford the latency of sending data up the chain and back
Closed-loop control
– Controlling physical systems cannot depend on the speed and availability of resources back at the data center (e.g. a smart traffic light system)
Privacy, data-ownership considerations
– Regulatory and business concerns may not allow moving the data
Improved scale and aggregate throughput via parallelism
– Data sources are often naturally distributed
Avoid sending unnecessary data
– Offload centralized resources that would otherwise have to filter through volumes of uninteresting/useless data
Rodolfo Milito
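One simple way to "avoid sending unnecessary data" from the edge is a dead-band filter: a reading is forwarded only when it differs enough from the last value that was sent. The sketch below is an illustration under assumed names and an assumed threshold; it is not described in the slides.

```python
from typing import Iterable, Iterator, Optional


def deadband_filter(readings: Iterable[float], threshold: float) -> Iterator[float]:
    """Yield a reading only if it differs from the last forwarded one by more than threshold."""
    last_sent: Optional[float] = None
    for value in readings:
        if last_sent is None or abs(value - last_sent) > threshold:
            last_sent = value
            yield value


# Hypothetical temperature samples taken at an edge node
samples = [20.0, 20.1, 20.05, 21.5, 21.6, 19.0]
forwarded = list(deadband_filter(samples, threshold=0.5))
print(forwarded)  # [20.0, 21.5, 19.0]
```

Here half of the samples never leave the edge node, so the central resources see only the changes that matter.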
Emerging Footprint for Distributed Intelligence & Processing
Sensing, Analysis, Action
Mobility, NGN, Cloud, Security, Video
Tiers: Data Centers (Central or Distributed), Core, Multi-Service Edge, Edge (Embedded Systems and Sensors)
Data Centers (Central or Distributed)
• “Data at Rest” aggregated collection and storage
• “Data at Rest” ETL and Analytics for Structured & Unstructured Data
• “What if” Analytics
• Predictive Analytics
• Streaming/CEP Analytics
• Applications
• Visualization & Reporting
Multi-Service Edge
• Networked Data Collection
• Processing at the Edge
• Streaming ETL (e.g. Filtering, Transformation, Aggregation)
• Streaming/CEP Analytics
• Real-time Alerts and Actions
• Applications Execution
• Localized Visualization & Reporting
Edge (Embedded Systems and Sensors)
• Networked Data Collection
• Processing at the Edge
• Streaming ETL (e.g. Cleansing, Filtering, Transformation, Aggregation)
• “Skinny” Streaming/CEP Analytics, Alerts and Actions
Rodolfo Milito
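The "Streaming ETL" steps named for the edge tiers (cleansing, transformation, aggregation, then alerts) can be sketched as a chain of Python generators. The sample data, the Fahrenheit-to-Celsius step, the window size, and the alert threshold are all hypothetical choices made for illustration.

```python
from statistics import mean
from typing import Iterable, Iterator, List, Optional


def cleanse(raw: Iterable[Optional[float]]) -> Iterator[float]:
    """Cleansing: drop missing samples."""
    return (x for x in raw if x is not None)


def to_celsius(fahrenheit: Iterable[float]) -> Iterator[float]:
    """Transformation: convert Fahrenheit readings to Celsius."""
    return ((f - 32) * 5 / 9 for f in fahrenheit)


def window_means(values: Iterable[float], size: int) -> Iterator[float]:
    """Aggregation: mean of each full, non-overlapping window of `size` samples."""
    window: List[float] = []
    for v in values:
        window.append(v)
        if len(window) == size:
            yield mean(window)
            window = []


raw = [68.0, None, 70.0, 212.0, 214.0, None, 32.0]  # hypothetical sensor feed
summaries = list(window_means(to_celsius(cleanse(raw)), size=2))
alerts = [m for m in summaries if m > 50.0]  # hypothetical alert threshold
print(summaries, alerts)
```

Each stage consumes one sample at a time, so nothing has to be stored before it is processed; only the window summaries and alerts would be sent upstream.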
• It is not just lots of data (structured?)
• It is not just exponential growth of data
• It is new ways of making sense of data that require changes to existing architectures.
• Big Data, the term, in its current use, implies many other things, like:
• Apache Hadoop Framework
• Commodity hardware leveraging Moore’s law
• Infinite scalability
• No data temples
No single standard definition…
“Big Data” is unstructured data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…
Data Volume
Data volume is increasing exponentially
Various formats, types, and structures
Text, numerical, images, audio, video, sequences, time series, social media data, multi-dim arrays, etc…
Static data vs. streaming data
Data is being generated fast and needs to be processed fast
Online Data Analytics
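Online analytics means updating results as each value arrives instead of storing everything and recomputing. A classic instance is Welford's algorithm for a running mean and variance, sketched below; the class name and sample values are illustrative, not from the talk.

```python
class RunningStats:
    """Online mean and sample variance (Welford's algorithm): one pass, O(1) memory."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new observation into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two values have been seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
print(stats.mean, stats.variance)
```

The memory used does not grow with the length of the stream, which is what makes this style of computation suitable for fast-arriving data.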
How to capture the most important data as it happens and deliver it to the right people in real time
How to store the data
How to analyze and understand it given its size and our computational capacity
Other challenges, from privacy and security to access
Greater than the challenges are the opportunities
We can
◦ extract insight and knowledge
◦ identify trends
◦ use the data to improve productivity
◦ gain competitive advantage
◦ create substantial value for the world economy
Big data provides an opportunity to find insight in new and emerging types of data.
Argentina can take advantage of these opportunities
Discovery of useful, possibly unexpected, patterns in data
Non-trivial extraction of implicit, previously unknown and potentially useful information from data
Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns
Aggregation and Statistics
◦ Data warehouse and OLAP
Indexing, Searching, and Querying
◦ Keyword-based search
◦ Pattern matching (XML/RDF)
Knowledge discovery
◦ Data mining
◦ Statistical modeling
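Keyword-based search is usually built on an inverted index, which maps each word to the documents that contain it. The following is a minimal in-memory sketch with made-up documents; real systems add tokenization, ranking, and distribution across machines.

```python
from collections import defaultdict
from typing import Dict, Set


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercase word to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """AND-search: ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical document collection
docs = {1: "big data analytics", 2: "data warehouse and OLAP", 3: "big sensors"}
index = build_index(docs)
print(search(index, "big data"))  # {1}
```

Answering a query touches only the entries for the query words, not every document, which is what lets search scale to large collections.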
Security
Finance
Smarter Healthcare
Telecom
Manufacturing
Traffic Control
Trading Analytics
Fraud and Risk
Precision Agriculture
Search Quality
Retail: Churn, NBO
Multi-channel sales
• HealthCare
  - Deep Analytics (pattern recognition)
  - Assisted Living, Home Care, Athletics Apps
• Precision Agriculture
• Oil and Gas
• Transportation
• Smart Cities
  - Smart Traffic Lights
  - Pollution Monitoring
  - Infrastructure Health Monitoring
• Connected Vehicle & Rail
• Smart Grid
• Retail Industry
Rodolfo Milito
[Figure: precision agriculture network. Labels: Fixed Endpoints (Environmental Sensors: Water, Nitrogen), IPv6-enabled 802.15.4g/e RF Mesh; Mobile Endpoints (Tractors, Implements), IPv6 Stack, 802.11 Wi-Fi; Satellite Endpoint; 802.11 Wi-Fi or Ethernet LAN; Aggregation Point (e.g., Farm House); IP WAN Backhaul (Cellular, Broadband, Ethernet, Serial); Small Cell and Macro Cell cellular; Satellite; Internet / Cloud / VPN]
Rodolfo Milito
Opportunities in Precision Farming

| Category | Requirements | Pluses | Minuses | Comments |
|---|---|---|---|---|
| Intelligent Irrigation System | Sensor network and access; Edge and Core integration (sensor information + weather forecast) | Better yields; water savings; sensors can also measure soil conditions | Cost of deployment | Wi-Fi infrastructure helps |
| Produce Tracking | Tagging & tracking system | Provenance guarantees | | |

Rodolfo Milito
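The "sensor information + weather forecast" integration for intelligent irrigation can be pictured as a small decision rule that combines a local soil reading (edge) with a forecast (core). The sketch below is purely illustrative: the function name, the inputs, and both thresholds are assumptions, not values from the talk.

```python
def should_irrigate(
    soil_moisture_pct: float,
    rain_probability: float,
    moisture_threshold: float = 30.0,  # hypothetical: irrigate below 30% moisture
    rain_cutoff: float = 0.6,          # hypothetical: skip if rain is 60%+ likely
) -> bool:
    """Irrigate only when the soil is dry and rain is unlikely."""
    return soil_moisture_pct < moisture_threshold and rain_probability < rain_cutoff


print(should_irrigate(20.0, 0.1))  # dry soil, no rain expected
print(should_irrigate(20.0, 0.9))  # dry soil, but rain is likely
print(should_irrigate(45.0, 0.1))  # soil already moist
```

Skipping irrigation when rain is likely is where the water savings listed in the table would come from.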
[Figure: IoT for Smart & Connected Communities, with a Smart Traffic Light System as the example. Layers: Services, Operation, Infrastructure, End Points. Services shown: Smart Water, Structural Health, Intelligent Transportation, Environmental Monitoring, Safety & Security, Public Lighting. Other labels: Public Cloud (Subscription Based Services); Private Cloud (Security, ITS, Lighting, Water); NMS; S+CC Service Delivery Platform; Ethernet; WiFi, 802.11P, Wave2M, Low Power RF, PLC, 802.15.4, etc.]
Rodolfo Milito
[Figure: connected vehicle architecture. Layers: Services, Operation, Infrastructure, End Points. Labels: Public Cloud (Subscription-based Services); Private (OEM) Cloud (Data Center/Virtual Servers); Enterprise Cloud (Enterprise Video, Voice, Data); V2V Communication (802.11p); Electrical Charging Network (Charging Stations, Other Services; 802.11p ?); Mobile WiFi Offload (Wi-Fi Hotspots, 802.11u, 3G/4G); Consumer Network (Home/Dealership Wi-Fi Hotspots, Femtocells); Communications Service Providers, “Fog” (Mobile SP 1, VNO); Policy Enforcement, Flow-based Management, DPI Software; DSRC Roadside Infrastructure, 802.11p (V2I); Energy Service Providers (Smart Grid); V2I/Upstream Communication (Wi-Fi, 3G/4G, 802.11p, etc.)]
Rodolfo Milito
[Figure: Travelers, Centers, Vehicles, Field]
Rodolfo Milito
Roadside multi-purpose equipment based on convergence of routing, computing and wireless technologies
Distributed, multi-tenancy computing model
Supporting multiple wireless technologies
Located with other traffic control equipment
Purpose - Managed Service
◦ Regulate traffic (Traffic Router: cars, IP packets, same)
◦ Collect tolls/taxes (per-transaction fee collection)
◦ E-Commerce support
◦ Content delivery
◦ Traffic sensor management (e.g., Sensys)
Rodolfo Milito
Big Data Integration is Multidisciplinary
Less than 10% of the data world is genuinely relational
Meaningful data integration in the real, messy, schema-less and complex Big Data world of databases and the semantic web, using multidisciplinary and multi-technology methods
The Billion Triple Challenge
The web of data contains 31 billion RDF triples, of which 446 million are RDF links, 13 billion are government data, 6 billion geographic data, 4.6 billion publication and media data, and 3 billion life science data (BTC 2011, Sindice 2011)
Demonstrate the value of semantics: let data integration drive DBMS technology
Large volumes of heterogeneous data, like linked data and RDF
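An RDF triple is a (subject, predicate, object) statement, and querying a set of triples amounts to matching patterns in which some positions are left open. The sketch below uses plain Python tuples rather than an RDF library; the `ex:` names are invented for illustration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical data; "ex:" is a made-up namespace prefix
triples = [
    ("ex:BuenosAires", "ex:capitalOf", "ex:Argentina"),
    ("ex:Argentina", "rdf:type", "ex:Country"),
    ("ex:Chile", "rdf:type", "ex:Country"),
]
countries = [s for s, _, _ in match(triples, p="rdf:type", o="ex:Country")]
print(countries)  # ['ex:Argentina', 'ex:Chile']
```

Because every statement has the same three-part shape, data from different sources can be merged by concatenating triples, which is what makes the format attractive for schema-less integration.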
Jobs
- The U.S. could face a shortage by 2018 of 140,000 to 190,000 people with "deep analytical talent" and of 1.5 million people capable of analyzing data in ways that enable business decisions. (McKinsey & Co)
- The Big Data industry is worth more than $100 billion and is growing at almost 10% a year (roughly twice as fast as the software business)
In 2008, the paper "Big-Data Computing: Creating revolutionary breakthroughs in commerce, science, and society" stated:
◦ Just as search engines have transformed how we access information, other forms of big-data computing can and will transform the activities of companies, scientific researchers, medical practitioners, and our nation's defense and intelligence operations.
In 2012, the Obama administration announced the Big Data Research and Development Initiative
Let's catch the wave, Argentina! You can be a leader in Big Data
What must we do to get on board…
- Government
  - In 2012, the Obama administration announced the Big Data Research and Development Initiative
  - 84 different big data programs spread across six departments
- Private Sector
  - Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data
  - Facebook handles 40 billion photos from its user base
  - Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts worldwide
- Science
  - The Large Synoptic Survey Telescope will generate 140 terabytes of data every 5 days
  - Large Hadron Collider: 13 petabytes of data produced in 2010
  - Medical computation, like decoding the human genome
  - Social science revolution
  - New way of science (microscope example)
◦ Crawler: ingestion processes for the data
  Custom processing
  Highly specialized
◦ Users accessing and using data: transaction processing (storage processing, GFS), capture through interaction
  Spanner
◦ Processing and analysis
  MapReduce, Hadoop (batch mode)
  Machine learning
  Smart querying
Required many engineering teams to solve this …
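The MapReduce batch model mentioned above has three steps: map each input record to key/value pairs, group the pairs by key (the shuffle), and reduce each group to a result. The sketch below runs the classic word count in a single Python process to show the shape of the computation; it does not use the Hadoop API, and the function names are illustrative.

```python
from collections import defaultdict
from itertools import chain
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map: emit (word, 1) for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


lines = ["big data big ideas", "data everywhere"]  # hypothetical input
mapped = chain.from_iterable(map_phase(line) for line in lines)
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)  # {'big': 2, 'data': 2, 'ideas': 1, 'everywhere': 1}
```

In a real cluster the map and reduce calls run in parallel on many machines and the shuffle moves data over the network; the per-record logic stays this small.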
Data from everywhere
◦ You should not care where it comes from
Medical/health: genome, genetic mapping and tracking
Consumer-related: Kmart, Target, Walmart
Auto industry: car status
Internet plays a key role
Enterprise, health, retail, government, financial
New DB, new storage
What's new
◦ 3 Vs: volume, velocity, variety
◦ 4 Ss: source, size, speed, structure
◦ Typical data operations were Create, Read, Update, Delete (CRUD); now they are Create, Replicate, Append (no delete, just append), Process
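The shift from update/delete to append-only storage can be illustrated with a tiny log-structured store: records are only ever appended, and the current value of a key is simply its most recent record. The class and key names below are hypothetical, and replication is left out to keep the sketch short.

```python
from typing import Any, List, Optional, Tuple


class AppendOnlyStore:
    """Records are appended and never updated or deleted in place."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, Any]] = []

    def append(self, key: str, value: Any) -> None:
        """Create: add a new record at the end of the log."""
        self._log.append((key, value))

    def latest(self, key: str) -> Optional[Any]:
        """Read the current value: the most recently appended record for key."""
        for k, v in reversed(self._log):
            if k == key:
                return v
        return None

    def history(self, key: str) -> List[Any]:
        """Every value ever recorded for key, oldest first."""
        return [v for k, v in self._log if k == key]


store = AppendOnlyStore()
store.append("sensor-1", 20.5)
store.append("sensor-1", 21.0)  # a "change" is a new record, not an overwrite
print(store.latest("sensor-1"), store.history("sensor-1"))  # 21.0 [20.5, 21.0]
```

Keeping the full history costs storage, but it makes writes cheap and preserves the raw data for later processing.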
Retailing, financial, healthcare, data from video, IoT
Hadoop is the leader in 2 key elements
◦ Distributed file system
◦ MapReduce
Big Data on the cloud
Opportunities
◦ Farmers: weather, crop failures
◦ Pandemics
◦ Health care: 150B in savings
IoT: Cisco predicts 4.8 billion terabytes of traffic in 2015