Collaborative Project
Holistic Benchmarking of Big Linked Data
Project Number: 688227 Start Date of Project: 2015/12/01 Duration: 36 months
Deliverable 1.1.3
Final Community Member List, Use Cases, and Datasets
Dissemination Level Public
Due Date of Deliverable Month 28, 31/03/2018
Actual Submission Date Month 28, 30/03/2018
Work Package WP1 - Requirements Elicitation and Community Building
Task T1.1
Type Report
Approval Status Final
Version 2.0
Number of Pages 16
Abstract: This deliverable is an update of D1.1.2 and presents the updates carried out on the intermediate member list as well as on the use cases. The use cases are the results of discussions carried out during the Hobbit project meetings and are hence endorsed by the project consortium.
The information in this document reflects only the author's views and the European Commission is not liable for any use
that may be made of the information contained therein. The information in this document is provided "as is" without
guarantee or warranty of any kind, express or implied, including but not limited to the fitness of the information for a
particular purpose. The user thereof uses the information at his/her sole risk and liability.
This project has received funding from the European Union's Horizon 2020 research and innovation programme under
grant agreement No 688227.
D1.1.3 - v. 2.0
History
Version | Date | Reason | Revised by
0.0 | 15/05/2017 | First draft created | Axel Ngonga (InfAI)
0.3 | 30/05/2017 | Peer reviewed | Nadine Jochimsen (InfAI)
1.0 | 31/05/2017 | Updates and corrections | Axel-Cyrille Ngonga Ngomo (InfAI)
2.0 | 30/11/2017 | Updates and corrections | Gayane Sedrakyan (imec)
2.0 | 24/03/2018 | Updates and corrections | Gayane Sedrakyan (imec)
2.0 | 28/03/2018 | Peer reviewed | Pavel Smirnov (AGT)
2.0 | 29/03/2018 | Updates and corrections | Gayane Sedrakyan (imec)
Author List
Organization | Name | Contact Information
InfAI | Axel-Cyrille Ngonga Ngomo | [email protected]
imec | Gayane Sedrakyan | [email protected]
Executive Summary
This document details the final state of the Hobbit community and is an update of D1.1.2. During the second project year, the focus of WP1 was not only on expansion but also on consolidation and curation. In particular, the partners focused on curating the contact list and updating it with new contacts. After the curation, removal of unreliable addresses and addition of novel contacts, the address list grew to 300 relevant contacts. The list of datasets was extended to 23 entries. The interaction with experts and other research projects has also led to the definition of use cases within which benchmarking as offered by Hobbit could be of central importance. The updated list of use cases and the relevant benchmarks and datasets are detailed.
Contents
1 Introduction
2 Final State of the Community
2.1 Community Building Channels
2.2 Current State of the Community
3 Datasets
4 Use Cases
List of Tables
1 Dissemination channels of Hobbit
2 Excerpt of dissemination and outreach events in which Hobbit participated.
3 Excerpt of dissemination and outreach events in which Hobbit participated.
4 Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.ilabt.imec.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
5 Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.ilabt.imec.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
6 Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.ilabt.imec.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
7 Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.ilabt.imec.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
List of Figures
1 Overview of Hobbit
2 Snapshot of Hobbit's Twitter account
3 Distribution of Hobbit contacts in the world (left) and in Europe (right)
4 Distribution of roles of Hobbit contacts
1 Introduction
In its first year, Hobbit aimed to establish itself as the provider of a benchmarking platform for industry and academia with a focus on Big Linked Data technologies. One of the key steps towards achieving this goal was to build up a community of interested parties around the project. As shown in Figure 1, the idea behind this community is to
1. gather supplementary datasets relevant to the project,
2. gather KPIs for the evaluation of the frameworks,
3. gather solutions to benchmark and
4. collect potential members of the Hobbit association.
[Figure 1 diagram; labels: Data Collection, Industry data, Measure Collection, Benchmark Creation, Benchmark 1 to Benchmark n (each with KPIs and Tasks), HOBBIT Platform, Solution 1 to Solution k, Challenges, Reports, Participants/Community]
Figure 1: Overview of Hobbit
During the second project year, the project continued to build up the infrastructure necessary to achieve the aforementioned goal. In particular, new challenges were organized. As a result, the work on datasets and community focused on dissemination and consolidation. The community has hence grown, with the contact list now at 300 relevant contacts (120% of the goal for the end of the project, see Section 2). The possible use cases (see Section 4) have not been altered and still represent the current position of the consortium.
2 Final State of the Community
2.1 Community Building Channels
In the second year of the project, we continued using the multi-channel strategy described in Table 1. The results achieved by using this strategy are monitored continuously by the Hobbit consortium (especially by the dissemination and outreach group). The outcomes of this monitoring are the subject of deliverable D1.4 of Hobbit.¹
Channel | Description
Mailing list | Subscriptions to the HOBBIT mailing list
Survey | Respondents to the survey sent out for requirements gathering
Flyers | Distribution of flyers at different events
Talks | Presentations of the HOBBIT project
Workshops | Organization of workshops at major conferences and events
Cooperations | Cooperation with relevant H2020 and national projects
Challenges | Organization of challenges at major conferences (ISWC, DEBS, ESWC)
Publications | Scientific publications about the core technologies of HOBBIT. Upcoming are publications which use the HOBBIT platform.
Table 1: Dissemination channels of Hobbit
2.2 Current State of the Community
Over the two years of the project, HOBBIT was disseminated in manifold ways with the aim of building up a community around the project. For example, the project was presented at more than 55 events (see Table 2 and Table 3 for an excerpt), at which we also aimed to get interested parties to join HOBBIT even at the lowest possible level of engagement. We also interacted through social media, for example by generating tweet content on a daily basis (see Figure 2). The parties we interacted with across our multi-channel outreach and dissemination strategy (see subsection 2.1) were asked to join the HOBBIT community or to provide us with contact data for further reference. In addition, accommodating the HOBBIT Association in a subgroup of an existing task force of the BDVA association resulted in expanded networks and community contacts.
We gathered the following qualitative information on contacts:
• Full name, first name and last name
• Role and role type
• Company
• Country
• Comment
• Source
• Project Contact
¹ Available at https://project-hobbit.eu/about/deliverables/.
Figure 2: Snapshot of Hobbit's Twitter account
Figure 3: Distribution of Hobbit contacts in the world (left) and in Europe (right)
Figure 4: Distribution of roles of Hobbit contacts
So far, 300 contacts have been established and registered in the project contact database. We mainly focused on attracting the attention of companies to the project. In particular, 38.8% of the members of the contact list are CxOs (e.g., COO, CEO, CTO) or managers, 34% are academics (professors, researchers, etc.), while only 16.4% are company employees (see Figure 4). Of these 300 contacts, 255 were subject to further interactions (107 companies, 76 academics). We hence consider these 255 to be meaningful contacts in the sense of the description of work, meaning that we have already achieved 102% of the target of 250 meaningful contacts. While the contact data cannot be published in this deliverable for reasons of privacy, Figure 3 gives an overview of the geospatial distribution of the community so far. Most of our contacts are European, with 43 contacts from Germany and 32 from Belgium. However, we have also aimed to reach out beyond Europe to get a glimpse of the current ideas, trends and use cases that could benefit the Hobbit association later. For example, our contact list extends to the USA, Canada, Brazil, Australia and China.
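The attainment percentages quoted in this deliverable are plain ratios against the target of 250 meaningful contacts. A minimal sketch of that arithmetic, using only the counts stated in the text (the function name `attainment` is illustrative and not part of any Hobbit tooling):

```python
# Counts quoted in this deliverable.
TARGET_MEANINGFUL_CONTACTS = 250
REGISTERED_CONTACTS = 300
MEANINGFUL_CONTACTS = 255


def attainment(achieved: int, target: int) -> float:
    """Return achieved/target expressed as a percentage."""
    if target <= 0:
        raise ValueError("target must be positive")
    return 100.0 * achieved / target


if __name__ == "__main__":
    registered = attainment(REGISTERED_CONTACTS, TARGET_MEANINGFUL_CONTACTS)
    meaningful = attainment(MEANINGFUL_CONTACTS, TARGET_MEANINGFUL_CONTACTS)
    print(f"Registered contacts: {registered:.0f}% of target")
    print(f"Meaningful contacts: {meaningful:.0f}% of target")
```

With the counts above, the registered contacts come to 120% of the target and the meaningful contacts to 102%.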
During the second project year, the project contacted the BDVA Association with the aim of working out the creation of the HOBBIT association under its umbrella. After several rounds of negotiations, the integration of HOBBIT in a subgroup of an existing task force was achieved in the third year of the project. This subgroup will merge the efforts of several data benchmarking projects under one Big Data Benchmarking umbrella, within which HOBBIT will take the lead on linked data benchmarking. The total number of members subscribed to the task force is 176.
Event name | Attendees/readers (estimates)
Big Data Value Association (BDVA) workshops 2017 | ≈ 100
European Big Data Value Forum (EBDVF) 2017 | ≈ 1,200
International Semantic Web Conference (ISWC) 2017 | ≈ 800
European Semantic Web Conference (ESWC) 2017 | ≈ 400
Distributed Event-Based Systems Conference (DEBS) 2017 | ≈ 90
Web Intelligence (WI) 2017 | ≈ 300
Table 2: Excerpt of dissemination and outreach events in which Hobbit participated.
Event name | Attendees/readers (estimates)
Knowledge Capture (K-CAP) 2017 | ≈ 75
World Wide Web (WWW) 2017 | ≈ 800
Ontology Matching (OM) 2017 | ≈ 30
NLIWOD 2017 at ISWC 2017 | ≈ 50
BLINK 2017 at ISWC 2017 | ≈ 25
International Conference on Semantic Systems (SEMANTiCS) 2017 | ≈ 370
IEEE BigData 2017 | ≈ 600
AAAI Conference on Artificial Intelligence 2017 | ≈ 1,850
European Big Data Forum | ≈ 300
BITKOM Big Data Summit 2016 | ≈ 600
CEBIT 2016 | ≈ 300
International Semantic Web Conference (ISWC) 2016 - BLINK workshop | ≈ 300
International Semantic Web Conference (ISWC) 2016 - Link Discovery Tutorial | ≈ 300
European Semantic Web Conference (ESWC) 2016 | ≈ 300
European Conference on Artificial Intelligence (ECAI) 2016 | ≈ 300
Diverse project meetings | ≈ 100
Interviews | ≈ 10,000
Table 3: Excerpt of dissemination and outreach events in which Hobbit participated.
3 Datasets
Over 24 months, the consortium gathered 23 datasets and made them available in a CKAN repository accessible at hobbit.ilabt.imec.be. These datasets are listed in the tables below (see Table 4, Table 5, Table 6 and Table 7).
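Since the datasets are held in a CKAN repository, the list of dataset names could in principle be retrieved programmatically. The sketch below assumes that the repository exposes the standard CKAN Action API (version 3) with its `package_list` action at the usual path, and that the site is reachable; neither is confirmed by this deliverable. The helper names are illustrative only.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: the repository serves the standard CKAN Action API (v3).
BASE_URL = "http://hobbit.ilabt.imec.be/"


def package_list_url(base_url: str) -> str:
    """Build the URL of CKAN's `package_list` action for a given site."""
    return urljoin(base_url, "api/3/action/package_list")


def parse_package_list(payload: str) -> list:
    """Extract the dataset names from a CKAN Action API JSON response."""
    response = json.loads(payload)
    if not response.get("success"):
        raise RuntimeError("CKAN reported a failed request")
    return response["result"]


def fetch_dataset_names(base_url: str = BASE_URL, timeout: float = 10.0) -> list:
    """Download and parse the dataset list (requires network access)."""
    with urlopen(package_list_url(base_url), timeout=timeout) as reply:
        return parse_package_list(reply.read().decode("utf-8"))
```

Calling `fetch_dataset_names()` would return the dataset identifiers as a list of strings if the assumptions hold; the parsing step is kept separate so that it can be checked without network access.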
Dataset | Description | Size (approximation) | Growth (expected)
Medical Subject Headings (MeSH) | Public RDF datasets of the Medical Subject Headings (MeSH) controlled vocabulary | 27,883 descriptors in 2016 MeSH; 87,000 entry terms, 232,000 Supplementary Concept Records (SCRs) | Approximately 2% per year
LinkedSpending | LinkedSpending contains government spendings from all over the world as Linked Data. LinkedSpending uses the information collected by the OpenSpending project and makes it available as data cube | 2 million financial transactions | 7% per year
DBpedia | DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows answering complex questions using the W3C standard SPARQL. | 3 billion facts, 125 languages, 38.3 million entities | 10-20% per year
CER Smart Metering Project | The Smart Metering Electricity Customer Behaviour Trials (CBTs) took place during 2009 and 2010 with over 5,000 Irish homes and businesses participating. | 5,375 homes, 780 businesses | Static
Next Bike | Live information of GPS position of around 20,000 bicycles in about 70 cities (http://www.nextbike.net/) | Live stream of 3,000 bike positions, 70 cities | Unclear
Table 4: Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.ilabt.imec.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
Dataset | Description | Size (approximation) | Growth (expected)
BioASQ | Dataset underlying the question answering challenge of the same name. The challenge focuses on large-scale biomedical semantic indexing and question answering | 800 questions | 20%
Energy Map Germany | CSV data on the development of solar energy within Germany with installation date, location, nominal capacity, GPS information | 1.5 million entries | 1-2%
LDBC | The LDBC-SNB Data Generator (DATAGEN) is responsible for providing the data sets used by all the LDBC benchmarks. | Generator | Generator
LinkedGeoData | LinkedGeoData is an effort to add a spatial dimension to the Web of Data / Semantic Web. | 30 billion facts | 5-10% per year
TLC Trip Record Data | This dataset includes trip records from all trips completed in yellow and green taxis in NYC in 2014 and selected months of 2015. | 1.1 billion taxi trips | 10-20% per year
GitHub Data | GitHub is how people build software and is home to the largest community of open source developers in the world, with over 12 million people contributing to 31 million projects. | 31 million projects, 12 million users | 5-10% per year
TWIG Ontology | The ontology for the synthetic version of Twitter based on the Twitter7 dataset. | Generator | Generator
QALD6 | Question Answering on Linked Data version 6. The dataset contains questions in natural language as well as the corresponding SPARQL queries and keyword queries to gather information from DBpedia, DBpedia abstracts and related datasets. | 500 questions | 10%
Table 5: Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.ilabt.imec.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
Dataset | Description | Size (approximation) | Growth (expected)
BENGAL | This family of datasets for named entity recognition, entity disambiguation and relation extraction is generated automatically from RDF data using natural language generation. | Generator | Generator
LIVED | The "Long Device Level Energy Data" (LIVED) dataset contains measurements collected from smart plug multi-sensors. | 2.5 billion measurements | Static
Linked Connections | Linked Connections is a method for publishing transit data using a low-cost API. It does this by exposing data in JSON(-LD). | Generator | Generator
Weidmüller Energy and Injection Molding Data Set | This data set consists of simulated data based on real measurements. The sensor measurements in the data set are taken from manufacturing machines. It contains readings from energy meters as well as sensors that monitor the production process of injection molding machines. | Generator | Generator
Linked Software Dependencies | Performs queries over software modules. Experimented with access to 475,000+ npm JavaScript libraries as 150,000,000+ RDF triples using TPF, HDT or Turtle | Generator | Generator
VIAF | The VIAF (Virtual International Authority File) combines multiple name authority files into a single OCLC-hosted name authority service. The goal of the service is to lower the cost and increase the utility of library authority files by matching and linking widely-used authority files and making that information available on the Web. | Generator | Generator
Table 6: Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.ilabt.imec.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
Dataset | Description | Size (approximation) | Growth (expected)
GeoNames | GeoNames contains placenames and covers all countries. | GeoNames contains over 8 million placenames | static
LOV | LOV stands for Linked Open Vocabularies. LOV provides a choice of several hundreds of such vocabularies, based on quality requirements including URI stability and availability on the Web, use of standard formats and publication best practices, quality metadata and documentation, identifiable and trustable publication body, proper versioning policy. | Generator | Generator
RISIS | Research Infrastructure for research and innovation policy studies. All datasets can be accessed via the visit request option. | unclear | unclear
Synthetic Trace Generator | Generates car trips as sequences of GPS points. It takes into account typical origins and destinations for some area, as well as speeds for every road segment. | Generator | Generator
Table 7: Excerpt of the datasets available to the Hobbit project. A complete list can be found at http://hobbit.ilabt.imec.be. Generators can create datasets of any size. Hence size and expected growth cannot be stated.
4 Use Cases
The use cases of interest to the Hobbit community and contacts vary significantly and are collected continuously. So far, we were able to gather descriptions from
1. dissemination events,
2. interviews,
3. collaborations with other projects and
4. deliverables of other projects.
The use cases returned by this data collection process hint at applications in the following domains (note that the names and contacts from which the information was gathered are partly omitted on purpose for the sake of privacy):
• Industry 4.0: The use of semantics in Industry 4.0 is of central importance for the creation of machines that can justify their behavior and interact with their users. Amongst other activities, we gathered information from the experts in the SAKE² and STEP³ projects, who expressed interest in benchmarking link discovery, storage, machine learning and visualisation. Datasets such as the CER Smart Metering, LIVED and Weidmüller datasets are of interest.
• Geospatial data analysis: Geospatial datasets belong to the largest and most used datasets on the planet. Contacts with experts from related projects (GeoKnow,⁴ GEISER,⁵ SmartRegio,⁶ STEP, SLIPO, SAGE) revealed that these experts are interested in Hobbit datasets related to geospatial entities and points of interest (LinkedGeoData, Energy Map Germany, Linked Connections, TLC Trip Record Data). The benchmarks of interest here are related to knowledge extraction from structured and unstructured data, storage, versioning, machine learning and visualisation.
• Smart Energy: Devising machinery that can use energy data to provide customers with intelligent energy services, ranging from the automatic selection of energy providers to the detection of unwanted states (machinery on during the weekend, open fridge doors, etc.), is regarded as an innovative goal worthy of pursuit. Benchmarking how well such systems perform demands benchmarks in data acquisition, storage and versioning. Relevant datasets include the LIVED, Weidmüller and CER Smart Metering datasets.
• Weather Data Analysis: The increasing amount of streaming data from weather sensors demands novel techniques for the semantic analysis of streaming data. The area of continuous queries was regarded as one of the key areas for which benchmarking methodologies and unified semantics still need to be dealt with. Here, smart metering data (LIVED, Weidmüller, CER) are regarded as being of significance, while storage and acquisition benchmarks are key.
• Human Resource Management: A rather surprising use case for the HOBBIT datasets, generators and benchmarks is finding good candidates for job offers. Novel applications for this purpose demand efficient entity recognition, entity linking and relation extraction, which are the areas targeted by the knowledge extraction benchmark of HOBBIT. Relevant datasets here include the TWIG and the BENGAL datasets.
• Enterprise Search: Searching through streams of ever-changing data is of central importance for data-driven companies. The use cases here range from federated search across several datasets (see projects DIESEL⁷ and WDAqua⁸) to search on mobile devices (e.g., project QAMEL⁹). The QALD6, DBpedia, BioASQ, MeSH and BENGAL datasets are the most relevant here, while the knowledge acquisition benchmarks are the most important.
• European societal challenges: Through our collaboration with BigDataEurope, we were able to gather use cases for HOBBIT for seven of the societal challenges formulated by the European Union (i.e., health, food and agriculture, energy, transport, climate, social sciences and security). Given the diversity of the challenges, virtually all datasets and benchmarks provided by HOBBIT are relevant for at least one of the challenges or for the technical solutions underlying these challenges. For example, the CER Smart Metering data and the data storage and knowledge benchmarks are of central importance for the energy domain, while Linked Connections and all other transport datasets are relevant for the transport societal challenge.
Minor use cases include work on lingua francas for storage, morphology analysis as well as indexing for storage and question answering.
² http://sake-projekt.de
³ https://www.projekt-step.de/
⁴ http://geoknow.eu/
⁵ http://www.projekt-geiser.de/
⁶ http://www.smartregio.org/
⁷ https://diesel-project.eu/
⁸ http://wdaqua.eu/
⁹ https://qamel.eu/