BigData, Why Care?
Saturday 20 October 12
Datacrunchers Consultancy Services
Speaker
Daan Gerits
- BigData Architect
- DataCrunchers.eu
  - Semantic Analysis, Data Harvesting, ...
  - Hadoop, Azure, BigInsights, ...
  - Storm
BigData.be co-organizer
BigData
A lot of technical fuss
- Hadoop, Storm, Pig, ...
Seems to be only for the big players
- Google, Facebook, LinkedIn, Twitter, ...
So why should ‘we’ care?
- we = Startups, Small and Medium Enterprises (SSME)
What BigData Promises
Ability to store and process large amounts of data
- Scalable in hardware and software
- Scalable in budget
Which means your budget can grow with your data
- start small with a small cluster
- the more data you want to manage, the more systems you add
Lower cost systems
- Several low- to medium-end systems
- instead of one big expensive one
But what can you do with it?
Analyze your data with higher precision
Analyze historical facts
Prevent Data Loss
- Infrastructure failure
- Human errors
Eliminate data silos
High Precision Analysis
Traditional Technologies
- Problems:
  - Unable to store all data
- Solutions:
  - Sharding
  - Aggregate data
- Problems:
  - Sharding has a high maintenance cost
  - Sharding is complex for users and apps
  - Manual sharding adds a high risk
  - Data aggregation causes loss in data precision
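The last point, that aggregation costs precision, can be illustrated with a minimal sketch. The measurements, dates, and variable names below are made up for illustration and are not from the talk.

```python
from statistics import mean

# Hypothetical raw measurements: response times in ms for two days.
raw = {
    "2012-10-19": [100, 110, 105, 900],
    "2012-10-20": [120, 125, 130, 118],
}

# A setup that cannot keep everything might store only one aggregate
# per day (here: the mean) and drop the raw values.
daily_mean = {day: mean(values) for day, values in raw.items()}

# A question asked later: what was the slowest response on each day?
# With the raw data this is trivial to answer.
slowest_from_raw = {day: max(values) for day, values in raw.items()}

print(daily_mean)
print(slowest_from_raw)
```

From `daily_mean` alone the 900 ms outlier cannot be recovered: many different sets of raw values produce the same mean.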
High Precision Analysis
BigData allows us to
- Store and process large amounts of data
  - So no need to aggregate
- ‘Forget’ about sharding
  - BigData technologies do this for you
  - Makes it predictable
  - And transparent
But
- You have to configure it correctly
- You don’t have ad-hoc querying (yet)
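To give a rough idea of what "doing the sharding for you" can look like, here is a toy sketch of hash-based placement. It is not how any particular product works; the function `node_for`, the node names, and the keys are assumptions for illustration only.

```python
import hashlib

def node_for(key, nodes):
    """Pick a node for a key from a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

# Hypothetical three-node cluster.
nodes = ["node-1", "node-2", "node-3"]

for key in ["customer:17", "customer:18", "order:9001"]:
    # The application only supplies the key; placement is computed.
    print(key, "->", node_for(key, nodes))
```

The same key always maps to the same node, which is what makes placement predictable. A plain modulo scheme like this one relocates many keys when a node is added, so real systems use more elaborate placement schemes.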
Analyze Historical Facts
Data Warehouse
- Built on top of parameters
What if we forget to add a parameter?
- Add the parameter
- Start gathering information for that parameter
Problem:
- We will only have information from the moment we add the parameter!
Analyze Historical Facts
Let’s store everything
Determine the parameters later
- by humans
- by machine learning algorithms
Analysis will process all data
What if we forget to add a parameter?
- add the parameter
- regenerate your reports
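A minimal sketch of "add the parameter, regenerate your reports", assuming the raw events were kept in full. The event fields and the `report` helper are hypothetical and only serve as an illustration.

```python
from collections import Counter

# Hypothetical raw events, stored in full as they arrived.
events = [
    {"day": "2012-10-18", "country": "BE", "browser": "Firefox"},
    {"day": "2012-10-18", "country": "NL", "browser": "Chrome"},
    {"day": "2012-10-19", "country": "BE", "browser": "Chrome"},
    {"day": "2012-10-20", "country": "BE", "browser": "Chrome"},
]

def report(events, parameter):
    """Count events per value of the chosen parameter."""
    return Counter(event[parameter] for event in events)

# The report that was asked for from the start.
by_country = report(events, "country")

# A parameter nobody thought of up front. Because the raw events were
# kept, the report covers the full history, not only the days after
# the parameter was requested.
by_browser = report(events, "browser")

print(by_country)
print(by_browser)
```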
Analyze Historical Data
Conclusion
- Traditionally: ask first, store later
- BigData: store first, ask later
Prevent Data Loss
Traditional technologies
- Machine Failure
  - I hope you have a backup from yesterday?
- Human Error
  - Whoops, I deleted those records
  - I hope you have a backup from yesterday?
- So in the worst case, you lose one day of data
Prevent Data Loss
BigData allows us to
- Survive machine failure without data loss
- Survive human error without data loss
But
- You need a data model which supports this
  - Incremental model
- You need to restrict operations
  - Only append data, no updates or deletes
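The append-only restriction can be sketched with a toy in-memory store. The class `AppendOnlyStore`, its methods, and the sample keys are assumptions made for illustration; a real system would persist and replicate the log.

```python
class AppendOnlyStore:
    """Toy store: facts are only ever appended, never changed or removed."""

    def __init__(self):
        self._log = []  # (key, value) facts in arrival order

    def append(self, key, value):
        self._log.append((key, value))
        return len(self._log)  # version number after this write

    def state(self, version=None):
        """Rebuild the key/value view as of a version (default: latest)."""
        facts = self._log if version is None else self._log[:version]
        view = {}
        for key, value in facts:
            view[key] = value  # later facts win
        return view

store = AppendOnlyStore()
store.append("customer:17/email", "an@example.com")
good = store.append("customer:18/email", "bo@example.com")

# Human error: a wrong value is written. It becomes a new fact on top
# of the old one, so the earlier value is still in the log.
store.append("customer:17/email", "")

print(store.state())      # latest view, including the mistake
print(store.state(good))  # view as of the last known-good version
```

Because nothing is overwritten, recovering from the mistake means reading an earlier version, not restoring yesterday's backup.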
Prevent Data Loss
Conclusion
- Traditional technologies
  - require very advanced setups to handle machine failure
  - allow you to go back to yesterday’s state
- BigData
  - requires knowledge of how the failover algorithms work
  - expects failure most of the time
  - allows you to go back to the previous state
Eliminate Data Silos
Departments having their own data sources
- start to modify that data
- start to treat it as their master data
- not coupled to the master dataset
Causes a lot of overhead
- Silos miss master data updates
- Business decisions based on silo data, not the more accurate master data
No obvious way out
Eliminate Data Silos
Consolidate the silos
- Identify the silos
- Import the data from the silos into one store
- Reconstruct master data based on silo rules and priorities
[Diagram: MasterData, Sales, Marketing, Support]
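One possible reading of "reconstruct master data based on silo rules and priorities" is a per-field priority order across silos. The sketch below assumes that reading; the silo contents, the `priorities` table, and the `reconstruct` function are all hypothetical.

```python
# Hypothetical silo exports, keyed by customer id.
silos = {
    "sales":     {17: {"email": "an@old.example", "phone": "0470 00 00 01"}},
    "marketing": {17: {"email": "an@example.com"}},
    "support":   {17: {"phone": "0470 00 00 02", "language": "nl"}},
}

# Hypothetical rule: per field, which silo is trusted most (first wins).
priorities = {
    "email": ["marketing", "sales", "support"],
    "phone": ["support", "sales", "marketing"],
    "language": ["support", "marketing", "sales"],
}

def reconstruct(silos, priorities):
    """Build one master record per id from the most trusted silo per field."""
    master = {}
    ids = {cid for records in silos.values() for cid in records}
    for cid in ids:
        record = {}
        for field, order in priorities.items():
            for silo in order:
                value = silos[silo].get(cid, {}).get(field)
                if value is not None:
                    record[field] = value
                    break  # the highest-priority silo with a value wins
        master[cid] = record
    return master

master = reconstruct(silos, priorities)
print(master)
```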
Eliminate Data Silos
Generate read-only data models per application
Data changes are sent to the master data
- using a specific API
- using database triggers
[Diagram: MasterData; M1, M2, M3; DataWarehouse, Public API, ERP/CRM DB]
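The split between read-only models and a single write path can be sketched as follows. This is an in-memory toy under stated assumptions: the class `MasterData`, the method names `apply_change` and `view`, and the sample fields are invented for illustration, and `apply_change` merely stands in for the "specific API".

```python
from types import MappingProxyType

class MasterData:
    def __init__(self):
        self._records = {}

    def apply_change(self, record_id, field, value):
        """The single write path; all data changes go through here."""
        self._records.setdefault(record_id, {})[field] = value

    def view(self, fields):
        """Generate a read-only model exposing only the given fields."""
        return MappingProxyType({
            rid: MappingProxyType({f: rec[f] for f in fields if f in rec})
            for rid, rec in self._records.items()
        })

master = MasterData()
master.apply_change(17, "email", "an@example.com")
master.apply_change(17, "phone", "0470 00 00 02")

# Hypothetical model for a marketing application: email only.
marketing_view = master.view(["email"])
print(marketing_view[17]["email"])

try:
    marketing_view[17]["email"] = "changed@example.com"
except TypeError:
    print("views are read-only; send the change to the master data")
```

The generated model is a snapshot, so it has to be regenerated after the master data changes.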
Eliminate Data Silos
Conclusion
- You will have to consolidate
- But you need a structural solution
- Which can be provided by BigData
- In a flexible and future-proof way
Conclusion
There is a lot to think about
But BigData can do a lot of things
- A lot more than I explained today
For a reasonable price
And you are not alone
- bigdata.be
- datacrunchers.eu
Questions?