© 2012 IBM Corporation
IBM Security Systems
© 2013 IBM Corporation
The Age of Big Data
Thanks to Kayvan Tirdad at York University and Kalapriya Kannan at IBM
Contents
- Introduction: Explosion in Quantity of Data
- Big Data Characteristics
- Usage Example in Big Data
- Importance of Big Data
- Cost Problem (example)
- Some Challenges in Big Data
- Other Aspects of Big Data
Data versus Information
What we have: data. Data are raw, unorganized facts that need to be processed. Data can be something simple and seemingly random and useless until it is organized.
What we want: information. Data becomes information when it is processed, organized, structured or presented in a given context so as to make it useful.
What Is Big Data?
Big data is defined as voluminous unstructured data from many different sources, such as:
– Social networks
– Banking and financial services
– E-commerce services
– Web-centric services
– Internet search indexes
– Scientific searches
– Document searches
– Medical records
– Weblogs
Big data (from Wikipedia, the free encyclopedia)
Big data[1][2] is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
As of 2012, limits on the size of data sets that are feasible to process in a reasonable amount of time were on the order of exabytes of data.[8] Scientists regularly encounter limitations due to large data sets in many areas, including meteorology, genomics,[9] connectomics, complex physics simulations,[10] and biological and environmental research.[11] The limitations also affect Internet search, finance and business informatics. Data sets grow in size in part because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification readers, and wireless sensor networks.[12][13][14] The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s;[15] as of 2012, every day 2.5 exabytes (2.5×10^18 bytes) of data were created.[16] The challenge for large enterprises is determining who should own big data initiatives that straddle the entire organization.[17]
Big data is difficult to work with using most relational database management systems and desktop statistics and visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers".[18] What is considered "big data" varies depending on the capabilities of the organization managing the set, and on the capabilities of the applications that are traditionally used to process and analyze the data set in its domain. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration." [19]
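The doubling figure quoted above (per-capita storage capacity doubling roughly every 40 months) can be restated as a yearly growth rate. The following is a minimal sketch of that conversion; the function names are illustrative and do not come from the slides.

```python
import math


def annual_growth_factor(doubling_months: float) -> float:
    """Yearly multiplier for a quantity that doubles every `doubling_months` months."""
    return 2 ** (12 / doubling_months)


def years_to_grow(factor: float, doubling_months: float) -> float:
    """Years needed to grow by `factor`, given the doubling period in months."""
    return math.log2(factor) * doubling_months / 12


if __name__ == "__main__":
    g = annual_growth_factor(40)
    print(f"Doubling every 40 months is about {(g - 1) * 100:.1f}% growth per year")
    print(f"Growing 1000-fold takes about {years_to_grow(1000, 40):.1f} years")
```

A 40-month doubling period corresponds to roughly 23% growth per year, which is easier to compare with other yearly rates.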
What is big data?
“Every day, we create 2.5 quintillion (10^18) bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few.
This data is “big data.”
Simple to start
What is the maximum file size you have dealt with so far?
– Movies/files/streaming video that you have used?
– What have you observed?
What is the maximum download speed you get?
Simple computation
– How much time does it take just to transfer the data?
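The "simple computation" asked for here is size divided by speed. A minimal sketch follows; the 100 Mbit/s link (12.5 MB/s) and the example sizes are assumptions chosen for illustration, not figures from the slides.

```python
MB = 10**6   # bytes, decimal units
GB = 10**9
TB = 10**12
PB = 10**15


def transfer_time_seconds(size_bytes: float, speed_bytes_per_s: float) -> float:
    """Time to move `size_bytes` over a link sustaining `speed_bytes_per_s`."""
    return size_bytes / speed_bytes_per_s


def seconds_to_days(seconds: float) -> float:
    """Convert a duration in seconds to days."""
    return seconds / 86_400


if __name__ == "__main__":
    link = 12.5 * MB  # assumed: 100 Mbit/s home connection
    for label, size in [("4 GB movie", 4 * GB), ("1 TB disk", TB), ("1 PB data set", PB)]:
        t = transfer_time_seconds(size, link)
        print(f"{label}: {t:,.0f} s ({seconds_to_days(t):,.2f} days)")
```

Under these assumptions a petabyte takes years to move over a single consumer link, before any processing begins.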
Data size
1 yottabyte
Introduction: Explosion in Quantity of Data
1946: ENIAC. 2012: LHC.
ENIAC × 6,000,000 = 1 LHC (40 TB/s)
Airbus A380: 1 billion lines of code; each engine generates 10 TB every 30 min; 640 TB per flight
Twitter generates approximately 12 TB of data per day
New York Stock Exchange: 1 TB of data every day
Storage capacity has doubled roughly every three years since the 1980s
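The Airbus figures on this slide can be checked against each other. The sketch below assumes four engines (the A380 is a four-engine aircraft; the slide does not state the count) and derives the flight duration implied by the 640 TB total. All names are illustrative.

```python
ENGINES = 4                        # assumption: not stated on the slide
TB_PER_ENGINE_PER_HALF_HOUR = 10   # from the slide
TB_PER_FLIGHT = 640                # from the slide


def flight_data_tb(hours: float, engines: int = ENGINES) -> float:
    """Total engine data for a flight of `hours` hours, in TB."""
    half_hours = hours * 2
    return engines * TB_PER_ENGINE_PER_HALF_HOUR * half_hours


def implied_flight_hours(total_tb: float = TB_PER_FLIGHT, engines: int = ENGINES) -> float:
    """Flight duration that would produce `total_tb` of engine data."""
    tb_per_hour = engines * TB_PER_ENGINE_PER_HALF_HOUR * 2
    return total_tb / tb_per_hour


if __name__ == "__main__":
    print(f"Implied flight length: {implied_flight_hours():.1f} hours")
```

With four engines, the 640 TB figure corresponds to an eight-hour flight, so the two numbers on the slide are consistent under that assumption.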
Introduction: Explosion in Quantity of Data
Our Data-driven World
Science
– Databases from astronomy, genomics, environmental data, transportation data, …
Humanities and Social Sciences
– Scanned books, historical documents, social interactions data, new technology like GPS, …
Business & Commerce
– Corporate sales, stock market transactions, census, airline traffic, …
Entertainment
– Internet images, Hollywood movies, MP3 files, …
Medicine
– MRI & CT scans, patient records, …
Spatio-Temporal Data
Average Monthly Temperature of land and ocean
Introduction: Explosion in Quantity of Data
Our Data-driven World
What do we do with this amount of data?
Ignore
- Fish and Oceans of Data
Big Data Characteristics
Big Data Vectors (3Vs)
- High volume: the amount of data
- High velocity: the speed of collecting, acquiring, generating or processing data
- High variety: different data types such as audio, video and image data (mostly unstructured data)
Big data spans three dimensions: Volume, Velocity and Variety
Volume: Enterprises are awash with ever-growing data of all types, easily amassing terabytes—even petabytes—of information.
– Turn 12 terabytes of Tweets created each day into improved product sentiment analysis
– Convert 350 billion annual meter readings to better predict power consumption
Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as catching fraud, big data must be used as it streams into your enterprise in order to maximize its value.
– Scrutinize 5 million trade events created each day to identify potential fraud
– Analyze 500 million daily call detail records in real time to predict customer churn faster
– The latest I have heard is that a delay of 10 nanoseconds is too much.
Variety: Big data is any type of data - structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more. New insights are found when analyzing these data types together.
– Monitor hundreds of live video feeds from surveillance cameras to target points of interest
– Exploit the 80% data growth in images, video and documents to improve customer satisfaction
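To get a feel for the velocity examples above, the daily and annual counts can be converted into per-second rates. This is a minimal sketch; the helper names are illustrative, and a 365-day year is assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: non-leap year


def per_second_from_daily(count_per_day: float) -> float:
    """Average events per second for a given daily count."""
    return count_per_day / SECONDS_PER_DAY


def per_second_from_annual(count_per_year: float) -> float:
    """Average events per second for a given annual count."""
    return count_per_year / (DAYS_PER_YEAR * SECONDS_PER_DAY)


if __name__ == "__main__":
    rates = {
        "trade events": per_second_from_daily(5_000_000),
        "call detail records": per_second_from_daily(500_000_000),
        "meter readings": per_second_from_annual(350e9),
    }
    for name, rate in rates.items():
        print(f"{name}: about {rate:,.0f} per second on average")
```

These are averages only; real streams are bursty, so peak rates would be higher than what this calculation shows.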
Big Data Characteristics
How big is Big Data?
- What is big today may not be big tomorrow
Big Data Vectors (3Vs)
- Any data that can challenge our current technology in some manner can be considered Big Data
- Volume
- Communication
- Speed of generating
- Meaningful analysis
"Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization" (Gartner, 2012)
Cost Problem (example)
Cost of processing 1 petabyte of data with 1000 nodes?
1 PB = 10^15 B = 1 million gigabytes = 1 thousand terabytes
- 9 hours for each node to process 500 GB at a rate of 15 MB/s
- 15 * 60 * 60 * 9 = 486,000 MB ≈ 500 GB
- 1000 * 9 * $0.34 = $3,060 for a single run
- 1 PB = 1,000,000 GB / 500 GB = 2000 runs; 2000 * 9 h = 18,000 h; 18,000 / 24 = 750 days
- The cost for 1000 cloud nodes, each processing 1 PB: 2000 * $3,060 = $6,120,000
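The arithmetic on this slide can be replayed step by step. The sketch below uses only the slide's own numbers (15 MB/s per node, 9-hour runs, 1000 nodes, $0.34 per node-hour, 500 GB per run); the function and key names are illustrative. The price is kept in cents so the integer arithmetic stays exact.

```python
NODE_RATE_MB_S = 15            # per-node processing rate, from the slide
HOURS_PER_RUN = 9              # hours for one node to process ~500 GB
NODES = 1000
PRICE_CENTS_PER_NODE_HOUR = 34 # $0.34, from the slide
GB_PER_RUN = 500
GB_PER_PB = 1_000_000


def slide_cost_estimate() -> dict:
    """Reproduce each intermediate figure of the slide's cost example."""
    mb_per_node_run = NODE_RATE_MB_S * 60 * 60 * HOURS_PER_RUN
    cost_per_run_cents = NODES * HOURS_PER_RUN * PRICE_CENTS_PER_NODE_HOUR
    runs_per_pb = GB_PER_PB // GB_PER_RUN
    hours_per_pb = runs_per_pb * HOURS_PER_RUN
    return {
        "mb_per_node_run": mb_per_node_run,            # about 500 GB
        "cost_per_run_usd": cost_per_run_cents / 100,
        "runs_per_pb": runs_per_pb,
        "days_per_pb": hours_per_pb / 24,
        "total_cost_usd": runs_per_pb * cost_per_run_cents / 100,
    }


if __name__ == "__main__":
    for key, value in slide_cost_estimate().items():
        print(f"{key}: {value:,}")
```

Changing any constant (for example the per-node rate) shows how sensitive the total is to a single assumption.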
Importance of Big Data
- Government
  - In 2012, the Obama administration announced the Big Data Research and Development Initiative
  - 84 different big data programs spread across six departments
- Private Sector
  - Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data
  - Facebook handles 40 billion photos from its user base
  - Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts worldwide
- Science
  - Large Synoptic Survey Telescope will generate 140 terabytes of data every 5 days
  - Large Hadron Collider: 13 petabytes of data produced in 2010
  - Medical computation, like decoding the human genome
  - Social science revolution
  - New way of science (microscope example)
Importance of Big Data
Job
- The U.S. could face a shortage by 2018 of 140,000 to 190,000 people with "deep analytical talent" and of 1.5 million people capable of analyzing data in ways that enable business decisions. (McKinsey & Co)
- The Big Data industry is worth more than $100 billion and is growing at almost 10% a year (roughly twice as fast as the software business)
Technology players in this field
- Oracle: Exadata
- Microsoft: HDInsight Server
- IBM: Netezza
Usage Example of Big Data
US 2012 Election
- data mining for individualized ad targeting
- Orca big-data app
- YouTube channel (23,700 subscribers and 26 million page views)
- Ace of Spades HQ
- predictive modeling
- mybarackobama.com
- drive traffic to other campaign sites: Facebook page (33 million "likes"), YouTube channel (240,000 subscribers and 246 million page views)
- a contest to dine with Sarah Jessica Parker
- Every single night, the team ran 66,000 computer simulations
- Reddit!
- Amazon Web Services
Usage Example in Big Data
Data analysis predictions for the US 2012 election
Drew Linzer, June 2012: 332 for Obama, 206 for Romney
Nate Silver's FiveThirtyEight blog: predicted Obama had an 86% chance of winning; predicted all 50 states correctly
Sam Wang, the Princeton Election Consortium: put the probability of Obama's re-election at more than 98%
Media continued reporting the race as very tight
Usage Example in Big Data
- Moneyball: The Art of Winning an Unfair Game, about the Oakland Athletics baseball team and its general manager Billy Beane
- The Oakland A's front office took advantage of more analytical gauges of player performance to field a team that could compete successfully against richer competitors in MLB
- Oakland had approximately $41 million in salary; the New York Yankees had $125 million in payroll that same season. Oakland was forced to find players undervalued by the market
- Moneyball had a huge impact on other teams in MLB
And there is a Moneyball movie!
Some Challenges in Big Data
Big Data Integration is Multidisciplinary
- Less than 10% of the Big Data world is genuinely relational
- Meaningful data integration in the real, messy, schema-less and complex Big Data world of databases and the semantic web, using multidisciplinary and multi-technology methods
The Billion Triple Challenge
- The web of data contains 31 billion RDF triples, 446 million of which are RDF links; 13 billion are government data, 6 billion geographic data, 4.6 billion publication and media data, 3 billion life science data (BTC 2011, Sindice 2011)
The Linked Open Data Ripper
- Mapping, Ranking, Visualization, Key Matching, Snappiness
Demonstrate the Value of Semantics: let data integration drive DBMS technology
- Large volumes of heterogeneous data, like linked data and RDF
Other Aspects of Big Data
Six Provocations for Big Data
1- Automating Research Changes the Definition of Knowledge
2- Claims to Objectivity and Accuracy are Misleading
3- Bigger Data are not always Better Data
4- Not all Data are Equivalent
5- Just because it is accessible doesn't make it ethical
6- Limited access to big data creates new digital divides
Other Aspects of Big Data
Five Big Questions about Big Data:
1- What happens in a world of radical transparency, with data widely available?
2- If you could test all your decisions, how would that change the way you compete?
3- How would your business change if you used big data for widespread, real-time customization?
4- How can big data augment or even replace management?
5- Could you create a new business model based on data?
The Four Phases of Data Conversion
[Figure: four numbered phases, 1 to 4]
Finally…
'Big data' is similar to 'small data', but bigger
… but having bigger data requires different approaches:
Techniques, tools, architecture, … with an aim to solve new problems
Or old problems in a better way
To whom does it matter?
Research community
Business community
- New tools, new capabilities, new infrastructure, new business models, etc.
On sectors
- Financial services, …
What do revenues look like?
The Social Layer in an Instrumented Interconnected World
- 2+ billion people on the Web by end of 2011
- 30 billion RFID tags today (1.3B in 2005)
- 4.6 billion camera phones worldwide
- 100s of millions of GPS-enabled devices sold annually
- 76 million smart meters in 2009… 200M by 2014
- 12+ TBs of tweet data every day
- 25+ TBs of log data every day
- ? TBs of data every day
What does Big Data trigger?
From “Big Data and the Web: Algorithms for Data Intensive Scalable Computing”, Ph.D Thesis, Gianmarco
BIG DATA is not just HADOOP
- Manage & store huge volume of any data: Hadoop File System, MapReduce
- Manage streaming data: Stream Computing
- Analyze unstructured data: Text Analytics Engine
- Structure and control data: Data Warehousing
- Integrate and govern all data sources: Integration, Data Quality, Security, Lifecycle Management, MDM
- Understand and navigate federated big data sources: Federated Discovery and Navigation
Types of tools typically used in Big Data scenarios
Where is the processing hosted?
– Distributed server/cloud
Where is the data stored?
– Distributed storage (e.g., Amazon S3)
What is the programming model?
– Distributed processing (MapReduce)
How is the data stored and indexed?
– High-performance schema-free database
What operations are performed on the data?
– Analytic/semantic processing (e.g., RDF/OWL)
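To make the MapReduce programming model mentioned above concrete, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to word counting. It only illustrates the shape of the model; it is not Hadoop's API, and all function names are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence on an in-memory collection."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data is big", "data about data"]))
```

In a distributed setting the map and reduce calls would run on different nodes and the shuffle would move data across the network; the per-key independence of `reduce_phase` is what makes that parallelism possible.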
When dealing with Big Data is hard
When the operations on data are complex:
– E.g., simple counting is not a complex problem.
– Modeling and reasoning with data of different kinds can get extremely complex.
Good news with big data:
– Often, because of the vast amount of data, modeling techniques can get simpler (e.g., smart counting can replace complex model-based analytics)…
– …as long as we deal with the scale.
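One way to picture "smart counting" is a next-word predictor built purely from bigram counts, with no explicit language model. The slide does not name a specific technique, so this is an illustration chosen for this note, with hypothetical function names and a toy corpus.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count, for every word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequently observed follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


if __name__ == "__main__":
    corpus = ["big data is big", "big data grows", "big deal"]
    model = train_bigram_counts(corpus)
    print(predict_next(model, "big"))
```

The quality of such a predictor depends almost entirely on how much text is counted, which is the slide's point: with enough data, counting can stand in for a more elaborate model, provided the counting itself can be done at scale.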
Time for thinking
What do you do with the data?
– Let's take an example:
• “From application developers to video streamers, organizations of all sizes face the challenge of capturing, searching, analyzing, and leveraging as much as terabytes of data per second—too much for the constraints of traditional system capabilities and database management tools.”
Why Big-Data?
Key enablers for the appearance and growth of ‘Big-Data’ are:
+ Increase in storage capabilities
+ Increase in processing power
+ Availability of data
THINK
References
1. B. Brown, M. Chui and J. Manyika, “Are you ready for the era of Big Data?” McKinsey Quarterly, Oct 2011, McKinsey Global Institute
2. C. Bizer, P. Boncz, M. L. Brodie and O. Erling, “The Meaningful Use of Big Data: Four Perspectives – Four Challenges” SIGMOD Vol. 40, No. 4, December 2011
3. D. Boyd and K. Crawford, “Six Provocations for Big Data” A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society, September 2011, Oxford Internet Institute
4. D. Agrawal, S. Das and A. E. Abbadi, “Big Data and Cloud Computing: Current State and Future Opportunities” EDBT 2011, Uppsala, Sweden
5. D. Agrawal, S. Das and A. E. Abbadi, “Big Data and Cloud Computing: New Wine or Just New Bottles?” VLDB 2010, Vol. 3, No. 2
6. F. J. Alexander, A. Hoisie and A. Szalay, “Big Data” IEEE Computing in Science and Engineering journal, 2011
7. O. Trelles, P. Prins, M. Snir and R. C. Jansen, “Big Data, but are we ready?” Nature Reviews, Feb 2011
8. K. Bakhshi, “Considerations for Big data: Architecture and approach” Aerospace Conference, 2012 IEEE
9. S. Lohr, “The Age of Big Data” The New York Times, February 2012
10. M. Nielsen, “A guide to the day of big data”, Nature, vol. 462, December 2009