BIG DATA EUROPE PLATFORM REQUIREMENTS & DRAFT ARCHITECTURE: THE RESULTS OF THE ONLINE SURVEY
BIG DATA EUROPE WORKSHOP: THE CHALLENGES OF BIG DATA FOR SOCIETIES IN A CHANGING WORLD
MARTIN KALTENBÖCK (SEMANTIC WEB COMPANY), 18.11.2015
http://www.big-data-europe.eu/
Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges
Semantic Web Company (SWC)
SWC was founded in 2001 and is headquartered in Vienna
30 experts in linked data technologies & text mining
Product: PoolParty Semantic Suite (launched 2009)
Serving customers from all over the world
EU- & US-based consulting services
SWC: Customers & Partners
Some of our Customers ● Credit Suisse ● Boehringer Ingelheim ● Roche ● Wolters Kluwer ● BMJ Publishing Group ● Red Bull Media House ● Canadian Broadcasting Corporation (CBC) ● Pearson ● Council of the EU ● DG Environment, EC ● Healthdirect Australia ● Ministry of Finance (Austria) ● World Bank Group ● Inter-American Development Bank (IADB) ● International Atomic Energy Agency (IAEA) ● Buildings Performance Institute Europe (BPIE) ● Renewable Energy & Energy Efficiency Partnership (REEEP) ● Global Buildings Performance Network (GBPN) ● American Physical Society ● Education Services Australia (ESA) ● Norwegian Directorate of Immigration ● Australian National Data Service
Finance / Automotive / Publisher / Health Care / Public Administration / Energy / Education
Selected Partners ● EBCONT ● EPAM Systems ● iQuest ● PwC ● Tenforce ● OpenLink Software ● Ontotext ● MarkLogic ● Gravity Zero ● Altotech ● Wolters Kluwer ● Taxonomy Strategies ● Digirati ● Fraunhofer (IAIS) ● University of Leipzig (INFAI) ● The Open Data Institute (ODI)
We all have one goal in mind: Make machines smart enough so that they can help us to find those needles in the haystack, which are really relevant to us.
The Motivation – Big Data
Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few.
This data is big data. Source: IBM
Big Data Dimensions
Rationale
COORDINATION Stakeholder Engagement (Requirements Elicitation)
SUPPORT Design, Realise, Evaluate
Big Data Aggregator Platform
Create and Manage Societal Big Data Interest Groups
Cloud-deployment ready Big Data Aggregator Platform
CSA Measures
Results
BIG DATA EUROPE STAKEHOLDER ENGAGEMENT & REQUIREMENTS ENGINEERING APPROACH
BDE Stakeholder Engagement Approach & Activities
Work Packages & Implementation Phases
Community Building
M1-M12 M13-M24 M25-M36
Enabling Technologies
Component Integration
Uptake
Integrator Deployment
Community Assessment
WP3 – Big Data Generic Enabling Technologies & Architecture
WP5 – Big Data Integrator Instances
WP7 – Dissemination & Communication
WP2 – Community Building & Requirements
WP4 – Big Data Integrator Platform
WP6 – Real-life Deployment & User Evaluation
Orthogonal Dimensions of Big Data Ecosystems
Generic Big Data Enabling Technologies
Data Value Chain
Data Generation & Acquisition
Data Analysis & Processing
Data Storage & Curation
Data Visualization & Usage
Data-driven Services
Societal Challenges
Domain Specific Data Assets & Technology
Healthcare
Food Security
Energy
Intelligent Transport
Climate & Environment
Inclusive & Reflective Societies
Secure Societies
Methodology of Requirements Engineering
BDE Approach & Methodology
• BDE Core Question Matrix as a basic Tool
• Online Survey (20.5. – 26.6.2015, 394 Participants)
• 7 x 15 Face to Face Interviews (3 x 5 per SC)
• 7 Workshops in 2015 (7 in 2016, 7 in 2017)
• 7 BDE Pilot (Use Case) ideas / specifications
Requirements
Use case pilots
Online survey Interviews
BDE Core Question Matrix
Elements of the RE model (rows) and questions to people within the specific Societal Challenge, grouped by type of interviewee (columns: Business, Strategic, Technical, Domain Experts). Each cell of the matrix holds the questions for that element and interviewee type.
• Stories: stories that describe the current status and future development are collected
• Personas: typical personas that play a role are described
• Data: the data is described in terms of amount, quality, type, usage, etc.
• Technologies: the technical requirements for our specific solution are described
• Other
BDE Stakeholder Survey
Online surveys as an empirical method generally come with problems of representativeness. Samples generated through online surveys are regarded as biased, especially in terms of age, sex and education. In addition, lower response rates compared to other methods, self-selection, and the lack of verifiability of the demographic information provided by respondents mean that no conclusions can be drawn beyond the sample captured by the survey itself.
BDE Stakeholder Survey - Participants
Participants: sector and organisation size
Self-definition of role in the sector
Participation in EU funded projects
Years of IT Experience
BDE Stakeholder Survey - Results
Importance of Volume
Importance of Velocity
Importance of Variety
Efficiency of Data Infrastructures
BDE Stakeholder Survey - Results
Big Data
Volume
Velocity
Variety
Veracity
• Not an issue • Would be nice to have
• Very important • “mostly economic and Social Science data”
• Not so much data • “Increasingly important”
• Very important • “Data inconsistencies and ambiguities are solved before processing”
BDE Stakeholder Survey - Results
Investments in Big Data Technologies
Investments per Organisation Size
BDE Stakeholder Survey – Results: Growth of Data Volumes
BDE Stakeholder Survey – Results: Long Term Preservation
• Long-term preservation of data
  o SC6 has the infrastructure in place for long-term preservation of data
  o “Current practice is a core service where data is held in a central place within a national infrastructure, and secure remote access is provided to each social research team.”
• Data processing
  o “We use small samples or just the ‘main information’ of data needed.”
BDE Stakeholder Survey – Results
Need of Processing Large Volumes of Data per Organisation Size
BIG DATA EUROPE TECHNICAL REQUIREMENTS & ARCHITECTURE / COMPONENTS
Blueprint of the Data Aggregator Platform
Batch Layer
Speed Layer
Data Storage
Real-time data & Transactions …
Batch View
Real-time View
message passing
Applications & Showcases
Real-time dashboards
Domain-specific BDE apps
Big Data Analytics In-stream Mining
BDE Platform
& Intelligence
Input data Stream Spatial Social Statistical Temporal Transactional Imagery
+ Semantic Layer
Lambda Architecture
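The blueprint labels above (batch layer, speed layer, batch view, real-time view) can be illustrated with a minimal Python sketch. This is not BDE platform code: the function names, the event format and the topic-count example are assumptions chosen for illustration only. It shows the core Lambda Architecture idea that a query is answered by merging a periodically recomputed batch view with an incrementally updated real-time view.

```python
from collections import Counter


def build_batch_view(master_dataset):
    """Batch layer: recompute the view from the complete stored dataset."""
    return Counter(event["topic"] for event in master_dataset)


def update_realtime_view(realtime_view, event):
    """Speed layer: fold one newly arrived event into the real-time view."""
    realtime_view[event["topic"]] += 1
    return realtime_view


def query(batch_view, realtime_view, topic):
    """Serving step: merge batch view and real-time view to answer a query."""
    return batch_view[topic] + realtime_view[topic]


# Hypothetical events already held in data storage
master_dataset = [{"topic": "energy"}, {"topic": "health"}, {"topic": "energy"}]
batch_view = build_batch_view(master_dataset)

# Hypothetical events that arrived after the last batch run
realtime_view = Counter()
for event in [{"topic": "energy"}, {"topic": "transport"}]:
    update_realtime_view(realtime_view, event)

print(query(batch_view, realtime_view, "energy"))     # 3
print(query(batch_view, realtime_view, "transport"))  # 1
```

In a real deployment the batch view would be produced by a distributed batch job and the real-time view by a stream processor; the merge at query time is the part this sketch is meant to show.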
Work initiator / Work distributor & monitor / Work executors
Spark master
dispatcher
Spark worker
Spark worker
Target situation
Spark master dispatcher
Spark worker
Spark worker
Worker1
Worker2
Worker3
Deployed situation
Big Data solutions – technical challenges
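The three roles named on this slide (work initiator, work distributor & monitor, work executors) can be sketched with a small Python example. It deliberately uses a local thread pool from the standard library as a stand-in and does not use the Spark API; the function names and the chunk-sum task are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def execute(task):
    """Work executor: process one unit of work (here: sum a chunk of numbers)."""
    return sum(task)


def distribute(tasks, n_workers=3):
    """Work distributor & monitor: hand tasks to executors, collect results in order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(execute, tasks))


# Work initiator: submits a job that has been split into chunks
chunks = [[1, 2, 3], [4, 5], [6]]
partial_sums = distribute(chunks)
total = sum(partial_sums)

print(partial_sums, total)  # [6, 9, 6] 21
```

In the deployed situation on the slide, the executors would be Spark workers running on separate machines (Worker1 to Worker3) rather than threads in one process.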
BDE platform – generic robust resource management
Announcements….
• HangOut, 23.11.2015, 11:00-12:00 CET (SC2) INRA’s Big Data Perspectives and Implementation Challenges
• HangOut, 25.11.2015, 14:00-15:00 CET (SC1) Challenge of Health, Demographic Change and Wellbeing
• HangOut, 08.12.2015, 11:00-12:00 CET (SC3) Big Data in the energy domain
• Big Data Europe MeetUp Vienna, 15.12.2015, 16:00-19:30 CET, LINK
• 2016 Conference on Big Data from Space, March 15, 2016, LINK
• SEMANTiCS2016, early September 2016 in Leipzig, Germany, http://www.semantics.cc
• EDF2016, 29-30 June 2016, Eindhoven, Netherlands, http://2016.data-forum.eu
BDE Channels for Societal Challenge 6
• Overall Website: http://www.big-data-europe.eu
• SC 6 Website: http://www.big-data-europe.eu/social-sciences/
• W3C Community Group: https://www.w3.org/community/bde-societies/
• Subscribe BDE Newsletter: http://bit.ly/1PyhXRS
Contact the BDE Societal Challenge 6 network
Domain: Ivana Ilijasic Versic (CESSDA): [email protected]
Technical: Martin Kaltenböck (Semantic Web Company): [email protected]
Workshop 18.11. – Interactive Sessions
Session 1: Data in place in the Social Sciences and Humanities
• What are the most important data sources in the social sciences that are available or that you are using (open / closed)?
• What are the characteristics of such sources along the 4 Vs of Big Data (Volume - Variety - Velocity - Veracity)?
Session 2: Risks and Challenges of successful data management
• What are the most important challenges in data management in the social sciences?
• What are the most dangerous risks you can think of regarding data management in the social sciences?
• SWOT analysis
Session 3: Technological demands of data
• What technologies are in place in your organisations?
• What technologies are on your roadmap, or are you evaluating at the moment?
• What are the most critical technological issues?
Session 4: Legal and policy demands of data
• Open versus closed data in the social sciences?
• What are the most important legal issues in place?
• What needs to change regarding policies to enable more efficient data management in the social sciences?
Martin Kaltenböck, [email protected]
Semantic Web Company GmbH
Mariahilfer Strasse 70/8, A-1070 Vienna
+43-1-4021235
http://www.semantic-web.at
http://www.poolparty-software.com
http://slideshare.net/semwebcompany
http://youtube.com/semwebcompany
Your Questions please….
www.big-data-europe.eu 27-Nov-15
#BigDataEurope