: 688227 Start Date of Project: 2015/12/01 Duration: 36 months … · 2018. 10. 7. · Intermediate...

Collaborative Project

Holistic Benchmarking of Big Linked Data

Project Number: 688227

Start Date of Project: 2015/12/01

Duration: 36 months

Deliverable 1.1.2

Intermediate Community Member List, Use Cases, and Datasets

Dissemination Level Public

Due Date of Deliverable Month 18, 31/05/2017

Actual Submission Date Month 18, 31/05/2017

Work Package WP1 - Requirements Elicitation and Community Building

Task T1.1

Type Report

Approval Status Final

Version 1.0

Number of Pages 14

Abstract: This deliverable is an update of D1.1.1 and presents the updates carried out on the preliminary member list as well as on the use cases. The use cases are the results of discussions carried out during the last two Hobbit project meetings and are hence endorsed by the project consortium.

The information in this document reflects only the author's views and the European Commission is not liable for any use that may be made of the information contained therein. The information in this document is provided "as is" without guarantee or warranty of any kind, express or implied, including but not limited to the fitness of the information for a particular purpose. The user thereof uses the information at his/her sole risk and liability.

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 688227.

D1.1.2 - v. 1.0

History

Version | Date | Reason | Revised by
0.0 | 15/05/2017 | First draft created | Axel Ngonga (InfAI)
0.3 | 30/05/2017 | Peer reviewed | Nadine Jochimsen (InfAI)
1.0 | 31/05/2017 | Updates and corrections | Axel-Cyrille Ngonga Ngomo (InfAI)

Author List

Organization | Name | Contact Information
InfAI | Axel-Cyrille Ngonga Ngomo | [email protected]

Executive Summary

This document details the intermediary state of the Hobbit community and is essentially an update of D1.1.1. The focus of the project over the period between D1.1.1 and D1.1.2 was on realizing the technical backend necessary to run the first portion of the challenges. Hence the focus in WP1 was not on expansion but on consolidation. In particular, the partners focused on curating the contact list and updating it with new contacts. After the curation, removal of unreliable addresses and addition of novel contacts, the address list grew to 232 relevant contacts. The list of datasets has remained constant and will be extended with the upcoming participation in conferences. A summary of the datasets (including a short description of their content and purpose, size and expected growth) was given in D1.1.1 and is repeated herein for the sake of self-containment. The interaction with experts and other research projects has also led to the definition of use cases within which benchmarking as offered by Hobbit could be of central importance. These use cases and the relevant benchmarks and datasets were detailed in D1.1.1 and did not change over the last 4 months.

Contents

1 Introduction

2 Intermediary State of the Community

2.1 Community Building Channels

2.2 Current State of the Community

3 Datasets

4 Use Cases

List of Tables

1 Dissemination channels of Hobbit

2 Excerpt of dissemination and outreach events in which Hobbit participated. ESWC stands for Extended Semantic Web Conference. ISWC is the International Semantic Web Conference. ECAI is the European Conference on Artificial Intelligence.

3 Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.iminds.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.

4 Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.iminds.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.

5 Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.iminds.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.

List of Figures

1 Overview of Hobbit

2 Snapshot of Hobbit's Twitter account

3 Distribution of Hobbit contacts in the world (left) and in Europe (right)

4 Distribution of roles of Hobbit contacts

1 Introduction

In its first year, Hobbit has aimed to establish itself as the provider of a benchmarking platform for industry and academia with a focus on Big Linked Data technologies. One of the key steps towards achieving this goal was to build up a community of interested parties around the project. As shown in Figure 1, the idea behind this community is to

1. gather supplementary datasets relevant to the project,

2. gather KPIs for the evaluation of the frameworks,

3. gather solutions to benchmark and

4. collect potential members of the Hobbit association.

Figure 1: Overview of Hobbit (diagram labels: Data Collection, Industry data, Measure Collection, Benchmark Creation, Benchmarks 1 to n with KPIs and Tasks, HOBBIT Platform, Solutions 1 to k, Challenges, Reports, Participants/Community)

In months 13 to 18, the project continued to build up the infrastructure necessary to achieve the aforementioned goal. In particular, the first challenges were organized. As a result, the work on datasets and community focused on dissemination and consolidation. The community has hence grown only slightly, with the contact list now at 232 relevant contacts (92.8% of the goal for the end of the project, see Section 2). The possible use cases (see Section 4) have not been altered and still represent the current position of the consortium.

2 Intermediary State of the Community

2.1 Community Building Channels

As in the first 12 months of the project, we continued using the multi-channel strategy described in Table 1. The results achieved by using this strategy are monitored continuously by the Hobbit consortium (especially by the dissemination and outreach group). The outcomes of this monitoring are the subject of deliverable D1.4 of Hobbit [1].

Channel | Description
Mailing list | Subscriptions to the HOBBIT mailing list
Survey | Respondents to the survey sent out for requirements gathering
Flyers | Distribution of flyers at different events
Talks | Presentations of the HOBBIT project
Workshops | Organization of workshops at major conferences and events
Cooperations | Cooperation with relevant H2020 and national projects
Challenges | Organization of challenges at major conferences (ISWC, DEBS, ESWC)
Publications | Scientific publications about the core technologies of HOBBIT. Upcoming are publications which use the HOBBIT platform.

Table 1: Dissemination channels of Hobbit

2.2 Current State of the Community

Over the first half of the project, HOBBIT was disseminated in manifold ways with the aim of building up a community around the project. For example, the project was disseminated at more than 35 events (see Table 2 for an excerpt), within which we also aimed to get interested parties to join HOBBIT even at the lowest level of engagement possible. We also interacted through social media, for example by generating tweet content on a daily basis (see Figure 2). The parties we interacted with across our multi-channel outreach and dissemination strategy (see subsection 2.1) were asked to join the HOBBIT community or to provide us with contact data for further reference.

We gathered the following qualitative information on contacts:

• Email

• Full name, first name and last name

• Role and role type

• Company

• Country

• LinkedIn

• Comment

• Source

• Project Contact

[1] Available at https://project-hobbit.eu/about/deliverables/.

Figure 2: Snapshot of Hobbit's Twitter account

Figure 3: Distribution of Hobbit contacts in the world (left) and in Europe (right)

Figure 4: Distribution of roles of Hobbit contacts

So far, 232 contacts have been established and registered in the project contact database. We mainly focused on attracting the attention of companies to the project. In particular, 39.6% of the members of the contact list are CxOs (e.g., COO, CEO, CTO) or managers. 35.2% are academics (professors, researchers, etc.) while only 14.3% are company employees (see Figure 4). Of these 232 contacts, 135 were subject to further interactions, while approx. 100 are still to be contacted. We hence consider the 135 as being meaningful contacts in the sense of the description of work, meaning that we have already achieved 54% of the target of 250 meaningful contacts. While the contact data cannot be published in this deliverable for reasons of privacy, Figure 3 gives an overview of the geospatial distribution of the community so far. Most of our contacts are European, with 40 contacts from Germany and 33 from Belgium. We have however also aimed to reach out beyond Europe to get a glimpse of the current ideas, trends and use cases that could benefit the Hobbit association later. For example, our contact list extends to the Americas and China.
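The progress towards the community target can be recomputed from the counts stated in this section. The following is a minimal sketch; the constant and function names are illustrative and not part of the Hobbit platform, and the target of 250 is taken from the text above.

```python
# Counts and target as stated in this deliverable.
REGISTERED_CONTACTS = 232  # contacts registered in the project contact database
MEANINGFUL_CONTACTS = 135  # contacts that were subject to further interactions
TARGET_CONTACTS = 250      # target number of meaningful contacts


def share_of_target(count: int, target: int = TARGET_CONTACTS) -> float:
    """Return count as a percentage of target, rounded to one decimal place."""
    if target <= 0:
        raise ValueError("target must be positive")
    return round(100.0 * count / target, 1)


def still_to_contact(registered: int = REGISTERED_CONTACTS,
                     meaningful: int = MEANINGFUL_CONTACTS) -> int:
    """Return the number of registered contacts not yet followed up."""
    return registered - meaningful


if __name__ == "__main__":
    print(f"Registered contacts: {share_of_target(REGISTERED_CONTACTS)}% of target")
    print(f"Meaningful contacts: {share_of_target(MEANINGFUL_CONTACTS)}% of target")
    print(f"Still to be contacted: {still_to_contact()}")
```

Keeping the counts in one place makes it easy to rerun the calculation when the contact database is updated.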

Event name | Attendees/readers (estimates) | Dissemination and outreach action
ESWC 2017 | 300+ | Challenges
European Data Forum | 300+ | Presentation in collaboration with BigDataEurope [2]
BITKOM Big Data Summit | 600+ | Pitch of Hobbit
CEBIT | 300+ | Pitch of Hobbit
ESWC 2016 | 300+ | Hobbit event
ISWC 2016 | 300+ | BLINK workshop
ISWC 2016 | 300+ | Link Discovery Tutorial
ECAI 2016 | 300+ | Paper presentation
Diverse project meetings | 100+ | Hobbit pitch, liaison
Interviews | 10,000+ | Hobbit pitch

Table 2: Excerpt of dissemination and outreach events in which Hobbit participated. ESWC stands for Extended Semantic Web Conference. ISWC is the International Semantic Web Conference. ECAI is the European Conference on Artificial Intelligence.

3 Datasets

During the first 18 months, 19 datasets were gathered by the consortium and stored in a CKAN repository, accessible through the URL http://hobbit.iminds.be. These datasets are listed in the tables below.
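Since the datasets are kept in a CKAN repository, the list of datasets could in principle be retrieved programmatically. The following is a minimal sketch, assuming that the instance at http://hobbit.iminds.be exposes the standard CKAN Action API (version 3) and is still reachable; neither is confirmed by this deliverable, and the function names are illustrative.

```python
import json
from typing import List
from urllib.parse import urljoin
from urllib.request import urlopen

# Repository URL as given in this deliverable. Assumption: the instance
# serves the standard CKAN Action API under /api/3/action/.
CKAN_BASE_URL = "http://hobbit.iminds.be/"


def package_list_url(base_url: str = CKAN_BASE_URL) -> str:
    """Build the URL of CKAN's package_list action."""
    return urljoin(base_url, "api/3/action/package_list")


def parse_package_list(payload: str) -> List[str]:
    """Extract the sorted dataset identifiers from a package_list response."""
    response = json.loads(payload)
    if not response.get("success"):
        raise RuntimeError(f"CKAN request failed: {response.get('error')}")
    return sorted(response["result"])


def fetch_dataset_names(base_url: str = CKAN_BASE_URL,
                        timeout: float = 10.0) -> List[str]:
    """Query the repository over HTTP and return its dataset identifiers."""
    with urlopen(package_list_url(base_url), timeout=timeout) as reply:
        return parse_package_list(reply.read().decode("utf-8"))
```

Calling `fetch_dataset_names()` performs the network request; `parse_package_list` is kept separate so that the response handling can be checked without network access.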

Dataset | Description | Size (approximation) | Growth (expected)
Medical Subject Headings (MeSH) | Public RDF datasets of the Medical Subject Headings (MeSH) controlled vocabulary | 27,883 descriptors in 2016 MeSH; 87,000 entry terms; 232,000 Supplementary Concept Records (SCRs) | Approximately 2% per year
LinkedSpending | LinkedSpending contains government spending from all over the world as Linked Data. LinkedSpending uses the information collected by the OpenSpending project and makes it available as data cubes. | 2 million financial transactions | 7% per year
DBpedia | DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows answering complex questions using the W3C standard SPARQL. | 3 billion facts, 125 languages, 38.3 million entities | 10-20% per year
CER Smart Metering Project | The Smart Metering Electricity Customer Behaviour Trials (CBTs) took place during 2009 and 2010 with over 5,000 Irish homes and businesses participating. | 5,375 homes, 780 businesses | Static
Next Bike | Live information on the GPS position of around 20,000 bicycles in about 70 cities (http://www.nextbike.net/) | Live stream of 3,000 bike positions, 70 cities | Unclear

Table 3: Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.iminds.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.

Dataset | Description | Size (approximation) | Growth (expected)
BioASQ | Dataset underlying the question answering challenge of the same name. The challenge focuses on large-scale biomedical semantic indexing and question answering. | 800 questions | 20%
Energy Map Germany | CSV data on the development of solar energy within Germany with installation date, location, nominal capacity and GPS information | 1.5 million entries | 1-2%
LDBC | The LDBC-SNB Data Generator (DATAGEN) is responsible for providing the data sets used by all the LDBC benchmarks. | Generator | Generator
LinkedGeoData | LinkedGeoData is an effort to add a spatial dimension to the Web of Data / Semantic Web. | 30 billion facts | 5-10% per year
TLC Trip Record Data | This dataset includes trip records from all trips completed in yellow and green taxis in NYC in 2014 and selected months of 2015. | 1.1 billion taxi trips | 10-20% per year
GitHub Data | GitHub is how people build software and is home to the largest community of open source developers in the world, with over 12 million people contributing to 31 million projects. | 31 million projects, 12 million users | 5-10% per year
TWIG Ontology | The ontology for the synthetic version of Twitter based on the Twitter7 dataset. | Generator | Generator
QALD6 | Question Answering on Linked Data version 6. The dataset contains approximately 500 questions in natural language as well as the corresponding SPARQL queries and keyword queries to gather information from DBpedia, DBpedia abstracts and related datasets. | 500 questions | 10%

Table 4: Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.iminds.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.

Dataset | Description | Size (approximation) | Growth (expected)
BENGAL | This family of datasets for named entity recognition, entity disambiguation and relation extraction is generated automatically out of RDF data using natural language generation. | Generator | Generator
LIVED | The "Long Device Level Energy Data" (LIVED) dataset contains measurements collected from smart plug multi-sensors. | 2.5 billion measurements | Static
Linked Connections | Linked Connections is a method for publishing transit data using a low-cost API. It does this by exposing data in JSON(-LD). | Generator | Generator

Table 5: Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.iminds.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
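The expected growth rates listed in the dataset tables can be turned into rough size estimates when planning benchmark runs. The sketch below assumes that growth compounds once per year, which this deliverable does not state; the function name is illustrative.

```python
def projected_size(current_size: float, annual_growth: float, years: int) -> float:
    """Project a dataset size, assuming growth compounds once per year.

    annual_growth is a fraction, e.g. 0.07 for 7% per year.
    A static dataset has annual_growth == 0.0 and keeps its size.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return current_size * (1.0 + annual_growth) ** years
```

For instance, `projected_size(2_000_000, 0.07, 3)` estimates the number of LinkedSpending transactions after three years at the stated 7% per year, which comes to roughly 2.45 million under the compounding assumption. Generator-based datasets are excluded, since their size is chosen at generation time.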

4 Use Cases

The use cases of interest to the Hobbit community and contacts vary significantly and are collected continuously. So far, we were able to gather descriptions from

1. dissemination events,

2. interviews,

3. collaborations with other projects and

4. deliverables of other projects.

The use cases returned by this data collection process hint at applications in the following domains (note that the names and contacts from which the information was gathered are partly omitted on purpose for the sake of privacy):

• Industry 4.0: The use of semantics in Industry 4.0 is of central importance for the creation of machines that can justify their behavior and interact with their users. Amongst other activities, we gathered information from the experts in the SAKE [3] and STEP [4] projects, who expressed interest in benchmarking link discovery, storage, machine learning and visualisation. Datasets such as the CER Smart Metering, LIVED and Weidmüller are of interest.

• Geospatial data analysis: Geospatial datasets are among the largest and most used datasets on the planet. Contacts with experts from related projects (GeoKnow [5], GEISER [6], SmartRegio [7], STEP, SLIPO, SAGE) revealed that these experts are interested in Hobbit datasets related to geospatial entities and points of interest (LinkedGeoData, Energy Map Germany, Linked Connections, TLC Trip Record Data). The benchmarks of interest here are related to knowledge extraction from structured and unstructured data, storage, versioning, machine learning and visualisation.

• Smart Energy: Devising machinery that can use energy data to provide customers with intelligent energy services, ranging from the automatic selection of energy providers to the detection of unwanted states (machinery on during the weekend, open fridge doors, etc.), is regarded as an innovative goal worthy of pursuit. Benchmarking how well such systems perform demands benchmarks in data acquisition, storage and versioning. Relevant datasets include the LIVED, Weidmüller and CER Smart Metering datasets.

• Weather Data Analysis: The increasing amount of streaming data from weather sensors demands novel techniques for the semantic analysis of streaming data. The area of continuous queries was regarded as one of the key areas for which benchmarking methodologies and unified semantics are still needed. Here, smart metering data (LIVED, Weidmüller, CER) are regarded as being of significance, while storage and acquisition benchmarks are key.

• Human Resource Management: A rather surprising use case is the use of the HOBBIT datasets, generators and benchmarks for the sake of finding good candidates for job offers. Novel applications for this purpose demand efficient entity recognition, entity linking and relation extraction, which are the areas targeted by the knowledge extraction benchmark of HOBBIT. Relevant datasets here include the TWIG and the BENGAL datasets.

• Enterprise Search: Searching through streams of ever-changing data is of central importance for data-driven companies. The use cases here range from federated search across several datasets (see projects DIESEL [8] and WDAqua [9]) to search on mobile devices (e.g., project QAMEL [10]). The QALD6, DBpedia, BioASQ, MeSH and BENGAL datasets are the most relevant here, while the knowledge acquisition benchmarks are the most important.

• European societal challenges: Through our collaboration with BigDataEurope, we were able to gather use cases for HOBBIT for seven of the societal challenges formulated by the European Union (i.e., health, food and agriculture, energy, transport, climate, social sciences and security). Given the diversity of the challenges, virtually all datasets and benchmarks provided by HOBBIT are relevant for at least one of the challenges or for the technical solutions underlying these challenges. For example, the CER Smart Metering data and the data storage and knowledge benchmarks are of central importance for the energy domain, while Linked Connections and all other transport datasets are relevant for the transport societal challenge.

Minor use cases include work on lingua francas for storage, morphology analysis, as well as indexing for storage and question answering.

[3] http://sake-projekt.de

[4] https://www.projekt-step.de/

[5] http://geoknow.eu/

[6] http://www.projekt-geiser.de/

[7] http://www.smartregio.org/

[8] https://diesel-project.eu/

[9] http://wdaqua.eu/

[10] https://qamel.eu/
