www.comrades-project.eu
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 687847
D4.1 Enriched Semantic Models of Emergency Events
Project acronym: COMRADES
Project full title: Collective Platform for Community Resilience and Social Innovation during Crises
Grant agreement no.: 687847
Responsible: Grégoire Burel (OU)
Contributors:
Reviewer: Diana Maynard (USFD)
Document Reference: D4.1
Dissemination Level: <PU>
Version: 1.6
Date: 20/09/16
Disclaimer: This document reflects only the author's view and the Commission is not responsible for any use that may be made of the information it contains.
© Copyright 2016 Grégoire Burel
History
Version  Date        Modification reason             Modified by
0.1      20-09-2016  Initial draft                   Grégoire Burel
0.2      07-10-2016  Requirements and Specification  Grégoire Burel
0.3      20-12-2016  Model Implementation            Grégoire Burel
0.4      22-12-2016  Comments Merging                Grégoire Burel
0.5      26-12-2016  Evaluation and Full Draft       Grégoire Burel
Table of contents
History
Table of contents
List of tables
List of figures
Executive summary
1 Introduction
  1.1 Objectives and Modelling Principles
  1.2 Design Approach and Methodology
    1.2.1 The NeOn Modelling Approach
    1.2.2 The Qualitative and Structural Design Methodology
    1.2.3 Ontology Evaluation
2 Structure of this document
Part I: Requirements Analysis and Model Specifications
3 Introduction
4 Requirements Information Sources
  4.1 COMRADES General Requirements
  4.2 Stakeholder Interviews
  4.3 Ushahidi Data Structures
  4.4 Crisis Related Datasets
5 The Ontology Requirement Specification Document (ORSD)
  5.1 COMRADES Aims and Model Purpose
  5.2 Intended Use and Users
  5.3 Competency Questions
    5.3.1 Work Package Requirements
    5.3.2 Interviews and Qualitative Requirements
    5.3.3 Ushahidi Data Structures
    5.3.4 Crisis Related Datasets
6 Term Glossary
7 Summary
Part II: COMRADES Ontology Model
8 Introduction
9 Model Principles
10 Ontology Components
  10.1 Classes and Relations
    10.1.1 Information Sources, Reports and Situations
    10.1.2 Collections, Categories and Topics
    10.1.3 Actors, Organizations and Accounts
    10.1.4 Tasks, Roles and Permissions
  10.2 Properties
11 Integration with Existing Ontologies
  11.1 Crisis Related Ontologies
  11.2 Other Ontologies
12 Multilingual Support
13 Domain Knowledge
14 Summary
Part III: Model Evaluation
15 Introduction
16 Ontology Evaluation
  16.1 Competency Questions Mapping
  16.2 Results
17 Conclusions
Appendix
18 References
List of tables
Table 1 Ushahidi data structures
Table 2 Crisis related datasets
Table 3 Data Structure of Crisis related datasets
Table 4 Top Terms Extracted from the Competency Question, Crisis Related Dataset and the Ushahidi Data Structures.
Table 5 Properties of the COMRADES Ontology
Table 6 Competency questions ontology mappings and evaluation.
List of figures
Figure 1: COMRADES Communities
Figure 2: Relationships of resilience capabilities (Comes, Unpublished Manuscript)
Figure 3: The relationship among enactment, organizing and sensemaking (Jennings, Greenwood 2003)
Figure 4: Institutional Framework for Disaster Management in India (R.K Dave 2012)
Figure 5: Stills from the World Disaster Report 2013 promotional Video
Figure 6: The COBACORE Concept and challenges
Figure 7: emBRACE Framework
Figure 8: COMRADES evaluation and communities’ interaction
Executive summary
COMRADES (Collective Platform for Community Resilience and Social Innovation during Crises, www.comrades-project.eu) aims to empower communities with intelligent socio-technical solutions to help them reconnect, respond to, and recover from crisis situations. This deliverable analyses the COMRADES requirements from different project perspectives in order to design and implement a common semantic model that represents micro emergency events and related metadata. In particular we analyse: 1) the data structures used by the Ushahidi platform, since it is the underlying platform of the COMRADES system; 2) the requirements for the tools that need to be integrated into the COMRADES platform; 3) stakeholder interviews; and 4) the structure of crisis-related datasets. Based on the NeOn methodology [1] and a qualitative and structural design approach [2], we created an Ontology Requirement Specification Document (ORSD) [3] that highlights the needs and specifies the competency questions that the model must address in order to comply with the COMRADES model requirements. Following the development of the ORSD, we implement the COMRADES model as an ontology using RDF/OWL. To allow the ontology to be used in multilingual scenarios, we translate the class, property and relation names into different languages. Finally, to improve the interoperability of the model with existing ontological models, we align some parts of the COMRADES ontology with well-known ontologies such as SIOC and FOAF. Although we cannot completely evaluate the ontological model, since some data is not yet available (i.e. the COMRADES platform is not yet fully developed), we show that the model can successfully represent 102 different competency questions.
1 Introduction
The representation of crisis events and micro-events is a key aspect of the COMRADES European project, which aims to create an open-source community resilience platform to help manage emergency crises. The model needs to integrate easily with the Ushahidi¹ platform, which will be used as the backbone of the resilience platform. The focus of this deliverable is to provide a common semantic model that can be used across all aspects of the platform. In particular, the proposed model should allow the collection and organisation of user reports and related information. In the rest of this document, we refer to this model as the COMRADES model.
To achieve this aim, we develop an ontological model based on standard semantic web technologies (RDF/OWL). The model is designed by studying the functionalities needed to develop the COMRADES crisis platform, existing platforms (in particular the Ushahidi platform), existing datasets and stakeholder interviews. Our development approach follows a qualitative and structural design methodology [2] where requirements and modelling needs are extracted from stakeholder interviews, existing platform data structures (e.g. Ushahidi) and datasets (e.g. Twitter² and ACLED³). Using datasets as input while developing the COMRADES model is motivated by the need to represent a large variety of input sources in the model. This differs from other ontology development methods, where existing models are first reviewed by matching them to existing datasets and then extended. The COMRADES development approach first identifies requirements from the datasets and other sources, then creates an ontology, and finally aligns it where possible with existing ontologies. This approach integrates better with requirements that are not specified in existing data, such as requirements obtained from interviews, and yields a simpler ontology model.
In addition to this methodology, the COMRADES model development partially follows the NeOn ontology guidelines [1], which outline an approach for specifying and implementing ontological models.
1.1 Objectives and Modelling Principles
A difference between COMRADES and existing emergency platforms is the need to model and analyse small-scale events within larger crises and the importance of coordinating actions between parties. For instance, in emergency crises, an individual may need transportation while another party may be willing to provide such transportation. In this context, it is necessary to enable the coordination of the needed services with available resources. Similarly, it is also important to distinguish relevant and trustworthy information. As with action coordination, this remains mostly a manual task.
In order to develop the COMRADES platform, a model that allows for such analysis is required. Although a few models have been designed in the past, most of them focus on particular crisis aspects and tend to be overly complex. The aim of the proposed model is to provide a flexible and minimal model that addresses the representation of events and the associated evidence and resources, and provides a means for coordinating action automatically.
The COMRADES model is directly linked with other WP4 tasks as well as with the needs of the other work packages. In particular, it needs to allow event and micro-event modelling (T4.2) and action coordination (T4.3) as well as multilingual processing (T3.1), content informativeness representation (T3.2) and content validity assessment (T3.3).
Although there are different technologies for representing ontologies, we use RDF/OWL because the COMRADES project needs to deal with data obtained from social media and online data sources, and RDF/OWL is a semantic technology well suited to such a setting. Moreover, since the COMRADES platform will be web based, this choice eases the integration of the model, as web frameworks can manipulate RDF/OWL data easily.
Even though different methods exist for building ontologies and data models, we rely on two approaches for building the COMRADES model. First, we partially follow the NeOn methodology [1], a comprehensive approach for specifying, developing and evaluating ontologies. Second, we apply the qualitative and structural design approach [2] to include non-ontological resources and user studies during the specification phase of the COMRADES model.
¹ Ushahidi, https://www.ushahidi.com/.
² Twitter, https://twitter.com/.
³ ACLED, http://www.acleddata.com/.
Although the COMRADES model may be applied to crises and scenarios not covered by the COMRADES project, its goal is limited to fulfilling the project requirements, so that it remains a relatively simple model that can be easily reused within the COMRADES project. For this purpose, the COMRADES model follows the project requirements rather than attempting to provide a single model that suits all existing and future crisis platforms. Nevertheless, we aim to provide a model that can be extended easily, so that it may be integrated in additional scenarios in the future.
1.2 Design Approach and Methodology
The development of the COMRADES model needs to follow a methodology in order to make sure that the model properly captures the design requirements.
1.2.1 The NeOn Modelling Approach
The NeOn methodology [1] is an approach for developing ontologies that identifies 9 different scenarios (i.e. design steps) that may arise when creating a new ontological model (Figure 1). As part of the creation of a new ontology, it is necessary to identify which scenarios apply to a particular development, as well as to create an Ontology Requirement Specification Document (ORSD) [3]. Several other methods exist for designing ontological models, such as Methontology [4], On-To-Knowledge [5], and DILIGENT [6]. However, we decided to focus on the NeOn approach since it helps the integration of existing models and the reuse of non-ontological models.
Figure 1 Scenarios for Building Ontology Networks (Image source [7])
As displayed in Figure 1, the NeOn methodology is divided into the following 9 scenarios:
− Scenario 1: Specification to Implementation.
− Scenario 2: Reusing and re-engineering non-ontological resources (NORs).
− Scenario 3: Reusing ontological resources.
− Scenario 4: Reusing and re-engineering ontological resources.
− Scenario 5: Reusing and merging ontological resources.
− Scenario 6: Reusing, merging and re-engineering ontological resources.
− Scenario 7: Reusing ontology design patterns.
− Scenario 8: Restructuring ontological resources.
− Scenario 9: Localizing ontological resources.
For creating the COMRADES model, we follow Scenarios 1 and 9. To some extent we also try to reuse some commonly used ontologies, as outlined in other scenarios. However, we do not strictly follow the ontology reuse scenarios, as reuse is not the focus of the COMRADES data model.
The main scenario for developing the COMRADES model is Scenario 1, as the COMRADES ontological model needs to be developed from scratch. An important task of this scenario is the creation of an Ontology Requirement Specification Document (ORSD) [3] that describes the purpose, scope, implementation language, target group and intended uses of the specified model. In particular, this specification document needs to express the requirements as a set of Competency Questions (CQs). In order to create the ontology requirement specification, we need to collect knowledge from different sources and design competency questions.
The COMRADES model needs to integrate with the existing Ushahidi platform and to map existing non-ontological resources (i.e. existing datasets). Although Scenario 2 is in principle designed to help with this task by proposing a non-ontological resource reuse process, it focuses on glossaries, dictionaries, lexicons, classification schemes, taxonomies and thesauri. These resources are very different from the ones we integrate when designing the COMRADES model, as we focus on existing data structures from the Ushahidi platform and third-party datasets, which are more complex models than dictionaries and thesauri. As a result, Scenario 2 is not really suitable for our task.
Another important requirement is to enable the COMRADES model to be used in different languages. As a consequence, we use Scenario 9, which mostly consists of translating ontological labels and descriptions into multiple languages.
Although Scenarios 3, 4, 5, 6, 7 and 8 could also be applied to the design of the COMRADES model, the main focus is to provide modelling support for the different project tasks, integrated datasets and the Ushahidi platform⁴.
Existing crisis ontologies are not completely relevant for COMRADES since they either focus on very specific use cases or are not designed to integrate with a large variety of data sources. As a consequence, the COMRADES modelling task does not concentrate on these scenarios. Nevertheless, when possible, we try to map some of the key COMRADES concepts to existing ontologies that are not necessarily designed for representing crises (e.g. SIOC, FOAF, DC Terms).
1.2.2 The Qualitative and Structural Design Methodology
One of the main shortcomings of the NeOn methodology is that the approach does not consider existing data structures and datasets as part of the development process. Even though the second scenario proposes the integration of non-ontological resources, its focus is not on existing data structures but on non-practical knowledge (e.g. thesauri, dictionaries). Therefore, the NeOn approach is mostly suitable when: 1) the new ontology needs to represent or integrate completely new datasets; or 2) the new ontology needs to integrate with existing ontologies.
⁴ As discussed previously, the Ushahidi platform will be the backbone of the COMRADES resilience platform. As a result, the model needs to support all the data structures used in the Ushahidi software.
Although integrating existing ontologies can be seen as a type of structural analysis, since it examines how existing ontological models can be used for a particular modelling task, it does not cover data formats that are not already formally represented. In this context we propose to integrate elements from the qualitative and structural design methodology [2], where the design of a particular model is derived from qualitative studies (e.g. interviews, surveys, etc.) and from the structural analysis of datasets and of existing software platforms or processes (e.g. thread structure of the data, user social interactions). In order to use this methodology, we need to: 1) obtain requirements and perceived needs from stakeholders using interviews or surveys (qualitative phase); and 2) collect the data that needs to be represented (e.g. Ushahidi data format, Twitter posts) (structural phase). Following these phases, the interviews are used for obtaining functional requirements and identifying important features that are necessary for designing a new model. Similarly, data structures are analysed to create a common representation that feeds into the competency questions and the model implementation.
1.2.3 Ontology Evaluation
Although the NeOn methodology proposes an approach for developing a requirements document, it does not offer a clear process for evaluating whether the developed model fulfils the ontology requirements. As part of the specification of the COMRADES model, we create competency questions that identify what type of query should be satisfied by the proposed model. Even though a complete evaluation of the COMRADES model requires the actual deployment of the model as part of the COMRADES resilience platform and the integration of the inputs and outputs of the different work packages, we focus on a theoretical evaluation, as the project has not produced any data yet. We evaluate the COMRADES model by mapping competency questions to the classes, properties and relations of the developed ontological model and by verifying whether there is a possible query that connects the different ontological resources associated with a given competency question. The evaluation is therefore a three-step process: 1) we map the classes, relations and properties from the COMRADES model to competency questions; 2) we determine whether there is a path in the COMRADES ontology that connects the classes, relations and properties extracted from the competency questions; 3) if each competency question can be mapped and connected successfully to the COMRADES model, we conclude that the ontology successfully represents the competency questions.
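As an illustration only, the three-step evaluation described above can be sketched in Python. The class and relation names below are hypothetical and are not taken from the COMRADES ontology. The sketch also simplifies the mapping by treating classes as graph nodes and relations as undirected edges, which is an assumption rather than the procedure actually used in this deliverable.

```python
from collections import deque

# Hypothetical ontology fragment: each edge is (domain class, relation, range class).
# These names are illustrative only and are not taken from the COMRADES ontology.
EDGES = [
    ("Report", "hasSource", "InformationSource"),
    ("Report", "describes", "Situation"),
    ("Report", "inCategory", "Category"),
    ("Task", "assignedTo", "Actor"),
    ("Task", "concerns", "Report"),
]


def build_graph(edges):
    """Build an undirected adjacency map, assuming queries may traverse relations both ways."""
    graph = {}
    for domain, _relation, range_ in edges:
        graph.setdefault(domain, set()).add(range_)
        graph.setdefault(range_, set()).add(domain)
    return graph


def reachable(graph, start):
    """Return every class reachable from `start` using breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def evaluate_question(graph, terms):
    """Steps 1 and 2: every term must map to the model and all terms must be connected."""
    if not terms or any(term not in graph for term in terms):
        return False  # step 1 failed: a term has no counterpart in the model
    return set(terms) <= reachable(graph, terms[0])  # step 2: a connecting path exists


def evaluate_model(graph, questions):
    """Step 3: the model represents the questions only if each one passes."""
    return all(evaluate_question(graph, terms) for terms in questions.values())
```

For example, a competency question whose terms are mapped to the hypothetical classes `Actor` and `Situation` passes because the path Actor, Task, Report, Situation connects them, whereas a question that mentions a class missing from the model fails at the mapping step.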
2 Structure of this document
This document is divided into three parts:
Part I: Requirements Analysis and Model Specifications
In the first part of this document we define the requirements of the COMRADES model by specifying an Ontology Requirement Specification Document (ORSD) [3]. The approach follows the aforementioned NeOn methodology and Qualitative and Structural Design approach by analysing stakeholder interviews, the Ushahidi platform data structure, and the crisis-related datasets that are analysed by the project.
Part II: COMRADES Model
After creating the ORSD, we introduce the COMRADES model. The model is based on the modelling and requirement principles highlighted in the introduction and is implemented as an ontological model. Where possible, the model is aligned with existing ontologies so that it is interoperable with existing technologies. We also provide multilingual labels for the model classes, properties and relations, so that international communities can use the model more easily.
Part III: Model Evaluation
Although the model cannot be completely validated until later in the project, as it needs to be integrated with the COMRADES resilience platform and the inputs and outputs of the tools developed in the project, it can be evaluated based on the competency questions defined during the requirement analysis phase. In this part, we match competency questions to the model in order to confirm whether the model satisfies the existing requirements.
Part I: Requirements Analysis and Model Specifications
Platform Requirement Analysis and Specifications Document
3 Introduction
According to the first scenario of the NeOn methodology, the first step for creating a new ontological model from scratch is to create an Ontology Requirement Specification Document (ORSD) [1]. In order to do so, we need to collect information from several sources. As outlined in Figure 2, the ORSD is divided into 7 different parts. To fill in each of these parts, we use stakeholder interviews and the COMRADES project description (i.e. work package needs and project aims), and we analyse the structure of the Ushahidi platform and of different crisis-related datasets.
Figure 2 Template for creating an Ontology Requirement Specification Document (ORSD) (Image source [1])
We mostly follow the structure outlined by Figure 2. First, we identify the purpose of the model, its scope and level of formality. Then, we use both the COMRADES project description and interviews for framing the intended use and users of the model. Finally, we determine the requirements of the COMRADES model by creating competency questions using the qualitative and structural design methodology discussed in the introduction.
4 Requirements Information Sources
As part of the ORSD design, we analyse four different types of data: 1) the COMRADES project development requirements (i.e. tools that are being developed in the various work packages); 2) stakeholder interviews; 3) the Ushahidi platform data structure; and 4) the structure of the crisis datasets. In this section, we present the different data sources that are investigated for designing the COMRADES model, and identify the model requirements that can be derived from those sources.
4.1 COMRADES General Requirements
A few requirements for the COMRADES model are clearly outlined by the tasks of each work package (WP). In particular, the work on content informativeness and validity assessment (WP3) and emergency event detection, modelling and matchmaking (WP4) stipulates the following five tasks:
− Multilingual Content Processing (Task 3.1)
− Content Informativeness Classification (Task 3.2)
− Content and Source Validity Assessment (Task 3.3)
− Emergency Event Identification and Clustering (Task 4.2)
− Semantic Matchmaking of Emergency Events (Task 4.3)
Looking at each task, we can observe that for T3.1 the model needs to be able to represent different types of social media data and to attach different pieces of information to them, such as language, topics and named entities. For T3.2, we need to represent the informativeness of individual messages. For T3.3, the model must represent user profiles and the trustworthiness of particular pieces of information. For the WP4 tasks, documents need to be categorised and events need to be represented (T4.2). For T4.3, events need to be clustered in order to match related events.
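To make these task requirements more concrete, the following minimal Python sketch shows how the annotations needed by T3.1, T3.2, T3.3 and T4.2 could be attached to a single message as subject, predicate, object triples. The `ex:` prefix and every property name are hypothetical placeholders chosen for illustration; they are not the vocabulary of the COMRADES ontology.

```python
# Hypothetical prefix and property names, used only for illustration.
EX = "ex:"


def annotate(message_id, language, topics, entities,
             informativeness, trustworthiness, category, event_id):
    """Return (subject, predicate, object) triples for one social media message."""
    subject = EX + message_id
    triples = [
        (subject, EX + "language", language),                # T3.1
        (subject, EX + "informativeness", informativeness),  # T3.2
        (subject, EX + "trustworthiness", trustworthiness),  # T3.3
        (subject, EX + "category", category),                # T4.2
        (subject, EX + "reportsEvent", EX + event_id),       # T4.2
    ]
    triples += [(subject, EX + "topic", topic) for topic in topics]         # T3.1
    triples += [(subject, EX + "mentions", entity) for entity in entities]  # T3.1
    return triples


# Example usage with made-up values.
triples = annotate("msg1", "en", ["flood"], ["Dhaka"], 0.8, 0.6, "need", "event1")
```

Representing each annotation as a separate triple means that the output of one task can be added without changing what the other tasks have already produced, which is one reason a triple-based model fits this setting.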
4.2 Stakeholder Interviews
Many requirements come directly from analysing the needs of existing communities dealing with emergency situations, which were gathered by WP2 and will be delivered in D2.1 in March 2017. In this context, 8 interviews were conducted as part of the work on community requirements and evaluation of the resilience platform (WP2) and on the sociotechnical requirements of the COMRADES resilience platform. The interviews, which will be fully described in D2.2 due March 2017, involved a specialist in ICT for disaster management and 7 community leaders. Each interviewee was asked questions about how they currently use technologies when dealing with crises, and specifically “What sociotechnical requirements should be considered to design a social platform to boost communities’ resilience in a disaster situation?”
The interviewed ICT specialist was Dr Marc van den Homberg (MA), a senior disaster management expert at CORDAID⁵ (a development aid organisation in the Netherlands). He is currently working in Bangladesh with local communities to help them deal with floods. During the interview, he shared experiences from past and current projects.
The interviews with the community leaders focused on their perception of how they currently deal with disaster situations and how this is supported by technology. They shared insights concerning how a new technology could improve crisis management. The interviewed community leaders were the following:
- Adin (AD), Director at Hysteria⁶, a community laboratory focused on youth empowerment, art and city issues in Semarang, Indonesia. A current user of the Ushahidi platform.
- Milan Mukhia (MI) from CORDAID. Milan has worked on humanitarian services in disaster zones in different countries for 12 years. Coordinating collaborations among stakeholders is his main focus.
- Salina Shakya (SA) from CORDAID. Works for the project Parivartan⁷, helping society return to normal life after an earthquake disaster.
- Lumanti Joshi (LU), from Lumanti⁸, a support group for shelter in Nepal. An architect, he has worked for 13 years on organising communities to build reconstruction plans after disasters. His focus is bridging community and government by creating structured plans.
- Chuks (CH), Deputy Director at Reclaim Naija⁹ in Nigeria. The Reclaim Naija project monitors elections in real time. Citizens use Ushahidi to report incidents such as fraud or violence. The aim is to change the paradigm of elections in Nigeria by empowering grassroots communities towards civic participation.
- Elsa Marie D’Silva (EL) is the founder of the project Safecity¹⁰, which aims at making the problem of sexual harassment more evident to the whole society and policy makers. It promotes campaigns and workshops within communities on sharing stories, and maps cases of sexual harassment or abuse in public.
- Olodotum Fadeyiye (OL), programmes officer, and Babatunde Adegoke, designer. They work for Connected Development¹¹ in Nigeria on projects connecting communities and policy makers. Their projects relate to transparency, raising environmental awareness, mapping road conditions and traffic, human rights (monitoring abuse between police and citizens, and women's rights), monitoring elections and emergency response. Most of these projects use crowd-mapping tools.
⁵ CORDAID, https://www.cordaid.org/.
⁶ Hysteria, http://grobakhysteria.or.id.
⁷ Parivartan, http://parivartannepal.org.np.
⁸ Lumanti, http://lumanti.org.np.
⁹ Reclaim Naija, http://reclaimnaija.net.
¹⁰ Safecity, http://safecity.np.
From the different interviews we can observe that there is a strong need for a platform that allows anonymous reports, privacy management, the collection and visualisation of event locations and reports, as well as methods for searching particular events and the ability to assign tasks to reports. The interviews identify requirements for the COMRADES platform, and by extension the COMRADES model, in terms of functionalities, usability, data needs, performance and external data sources. Users need to create reports of incidents with geolocation, time and date, and the source of the information (e.g. data source, person reporting the incident), while ensuring methods that allow anonymous reports and feedback. The reliability of information needs to be available, and reports need to be approved. It should also be possible to assign actions to reports and check their status. Reports should be available in different languages if possible. Information should also be categorised (e.g. needs, resources…). In terms of data sources, the model should support means for adding multiple data sources such as social media (e.g. Flickr, Instagram, Twitter…), SMS and WhatsApp. Besides the need for representing reports of events and external data, the interviews showed a strong need for identifying the reliability of information, privacy management as well as assigning tasks for solving particular issues. Therefore, the COMRADES model needs to provide an easy representation for external data and a task representation model, as well as means for managing access to information and representing its trustworthiness.
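The requirements listed above can be summarised as a simple record structure. The following Python sketch is only an illustration of the fields the interviews call for; it is not the COMRADES model itself, and all class names, field names and default values are assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Task:
    """An action assigned to a report (hypothetical structure)."""
    description: str
    assignee: Optional[str] = None  # who is assigned to the task, if anyone
    status: str = "open"            # assumed status values: "open", "done"


@dataclass
class Report:
    """An incident report with the fields requested by the interviewees."""
    content: str
    occurred_at: datetime                 # time and date of the incident
    location: Tuple[float, float]         # (latitude, longitude)
    source_type: str                      # e.g. "twitter", "sms", "whatsapp"
    reporter: Optional[str] = None        # None means the report is anonymous
    language: str = "en"
    categories: List[str] = field(default_factory=list)  # e.g. "needs", "resources"
    reliability: Optional[float] = None   # unknown until the report is assessed
    approved: bool = False                # reports must be approved before publication
    tasks: List[Task] = field(default_factory=list)

    @property
    def is_anonymous(self) -> bool:
        return self.reporter is None


# An anonymous SMS report with one task attached.
report = Report(
    content="Bridge flooded, road closed",
    occurred_at=datetime(2016, 9, 20, 14, 30),
    location=(-6.97, 110.42),
    source_type="sms",
    categories=["needs"],
)
report.tasks.append(Task(description="Send assessment team"))
```

Keeping the reporter optional is one way to support anonymous reports without a separate report type; other designs are equally possible.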
4.3 Ushahidi Data Structures
As part of the development of the COMRADES platform, an audit of the different data structures used in the Ushahidi platform was performed (D5.1). The Ushahidi data structures cover a wide range of key COMRADES needs such as the representation of users, posts and categories. Since such data structures are all formatted in JSON, they need to be translated into an ontological model so they can be integrated into the COMRADES model.
11 Connected Development, http://connecteddevelopment.org.
By analysing the different APIs of the Ushahidi platform, we obtain the data structures listed in Table 1. The information associated with the data structures consists of either properties or relations. Properties are textual fields that are not shared across data structures, whereas relations are used for linking different data structures together.

| DATA STRUCTURE NAME | DESCRIPTION | PROPERTIES | RELATIONS |
|---|---|---|---|
| User | A user in the Ushahidi platform. | id, url, created, updated, email, real name, allowed_privileges | role |
| Contact | A contact represents a social media account, SMS number or email address that a message came from. | id, url, data_provider, type, contact, created, updated, allowed_privileges | user (creator) |
| Post (Survey) | A survey is the core unit of the Ushahidi platform. All social media data is transformed into a survey. | id, url, title, content, created, updated, source, location, type, allowed_privileges | parent (Post), form, user (creator), tags |
| Message | Messages store the raw data ingested from social media sources. Each Message is turned into a Post that can then be modified, but the original message is always retained. Message also stores outgoing messages sent in response to social media sources. | id, url, data_provider, data_provider_message_id, title, message, type, created, allowed_privileges | post, contact |
| Form | Forms define the data structure of surveys. Each Form consists of a number of Stages, and each Stage has a number of Attributes. | id, url, name, description, type, created, updated, allowed_privileges | parent (Form) |
| Form Stage | Form Stages describe groups of Form Attributes. | id, url, label, allowed_privileges | form |
| Form Attribute | Form Attributes define the data type and input method of an individual data point on a Post. | id, url, label, input, type, required, default, priority, cardinality, created, updated, allowed_privileges | form_stage |
| Media | Media represents file uploads, usually to be attached to a Post. | id, url, caption, created, updated, allowed_privileges | user (creator), collection |
| Tag | Tags (or categories) can be applied across all Posts, regardless of the Post’s Form. | id, url, tag, slug, type, description, created, color, icon, role, allowed_privileges | parent (Tag) |
| Collection | A Collection is a group of Posts. Posts are manually added to a Collection. | id, url, name, description, created, updated, allowed_privileges | user (creator), posts, visible_to |
| Role (User Groups) | User roles used for determining administration privileges. | id, url, name, display_name, description, permissions, allowed_privileges | |

Table 1 Ushahidi data structures
It is important to note that some of the features derived from the APIs would benefit from being modelled as relations rather than properties. For instance, the icon used for representing a Tag should be converted to a relation that links to a Media resource, so that any media can be used for representing a particular category. In general, it can be observed that different input sources (Message) need to be integrated into the COMRADES model, then converted into a standardised unit of information (Post). These posts are then categorised (Tag) or grouped (Collection). Users (User) need to be associated to documents as creators and input sources. Finally, users need roles (Role) that can be used for giving access permission to the platform data. An important aspect of the Ushahidi data model is the concept of forms. Forms are associated with particular posts and are used for representing arbitrary textual input using customisable fields. This is particularly challenging in an ontological context as it can add a lot of complexity to the model.
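To illustrate the translation from JSON data structures to an ontological representation, the sketch below converts a Post into subject-predicate-object triples, treating the properties of Table 1 as literal values and the relations as links to other resources. The namespace, the predicate names and the simplified JSON layout (relations given as plain identifiers) are assumptions made for this example and do not reflect the actual Ushahidi API output or the final COMRADES vocabulary.

```python
import json

BASE = "http://example.org/comrades/"  # hypothetical namespace

# A subset of the Post properties from Table 1 (literal-valued).
POST_PROPERTIES = ("title", "content", "created", "source")
# Post relations from Table 1: JSON key -> type of the linked resource.
POST_RELATIONS = {"user": "user", "form": "form", "parent": "post"}


def post_to_triples(post):
    """Translate a Post dictionary into (subject, predicate, object) triples."""
    subject = f"{BASE}post/{post['id']}"
    triples = [(subject, "rdf:type", f"{BASE}Post")]
    # Properties become triples whose object is a literal value.
    for key in POST_PROPERTIES:
        if post.get(key) is not None:
            triples.append((subject, BASE + key, post[key]))
    # Relations become triples whose object identifies another resource.
    for key, target in POST_RELATIONS.items():
        if post.get(key) is not None:
            triples.append((subject, BASE + key, f"{BASE}{target}/{post[key]}"))
    # A Post can carry several tags, so each tag gets its own triple.
    for tag_id in post.get("tags", []):
        triples.append((subject, BASE + "tag", f"{BASE}tag/{tag_id}"))
    return triples


raw = '{"id": 12, "title": "Flood", "content": "Water rising", "user": 3, "tags": [5, 8]}'
triples = post_to_triples(json.loads(raw))
for triple in triples:
    print(triple)
```

The same split between literal-valued properties and resource-valued relations is what makes it possible to turn a property such as the Tag icon into a link to a Media resource, as suggested above.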
4.4 Crisis Related Datasets
Many of the crisis-related datasets and data sources that can be used for data analysis purposes by the COMRADES project come from social media and particularly Twitter. Crisis-related datasets are generally divided into high-level data and low-level information. High-level datasets contain citizen reports or social media reports about discrete events that occur in large-scale crises, whereas low-level datasets focus on the general description of events. Compared to high-level datasets, low-level datasets have more information about the specifics of particular events and are typically created manually by experts or organizations, by verifying reports. Unfortunately, such data tends to be created after events occur, and contains aggregated information. Compared to such low-level datasets, the high-level datasets tend to be unfiltered and unverified reports of discrete events that lack clear context. In COMRADES, we are more interested in data that 1) tends to contain more real-time information than the low-level datasets, and 2) is much larger in size than its low-level counterpart. The following table (Table 2) lists the different datasets that have been investigated so far. The available data can be divided depending on the data that was used for building a particular dataset. We distinguish three types of data source: social media data (i.e. Twitter posts), user reports (e.g. Ushahidi, ACLED) and news agency data (e.g. news websites). Each data type has advantages and disadvantages. Social media data is widely available; however, its reliability is unclear and the format is highly unstructured, so it requires complex analysis in order to be converted into usable data. Citizen reports are scarcer but potentially more useful as they are formatted specifically for describing events. Finally, news data has the advantage of being more reliable and can contain information about disaster relief.
However, such data is more likely to be available after an event occurs and is low-level as it summarizes a situation.

| DATASET | DESCRIPTION | MEDIA TYPE | COVERAGE | DATASET SIZE |
|---|---|---|---|---|
| CrisisLex T26 | 26 crises partially annotated with informativeness, information type and source. | Twitter (Social Media) | 2012-2013 | ~250k Tweets |
| Incident Tweets | Data collected from multiple cities in the USA and UK. Partially annotated with event types. | Twitter (Social Media) | 2012-2014 | ~15M Tweets |
| CrisisLex T6 | 6 crises annotated by relatedness. | Twitter (Social Media) | 2012-2013 | ~60k Tweets |
| CrisisNLP | Multiple event datasets with some computed features. Multiple languages (English/French/Spanish). | Twitter (Social Media) | 2014-2016 | ~40M Tweets |
| Crisis Map (Ushahidi) | Many event reports from Ushahidi deployments. | Citizen Reports (Ushahidi) | 2011-2013 | 33 Events |
| Phoenix Data Project | Near real-time event dataset created by scraping 400 news sources. | Event Summaries (News Agency Data Source). Uses the CAMEO event taxonomy. | 2014-Now | Monthly datasets (expanding) |
| GDELT | Multiple databases created in near real-time from multiple data sources in different languages. | Event Summaries (News Agency Data Source). Uses the CAMEO event taxonomy. | 1979-Now / 2013-Now (with data source) | Collected every 15 minutes (expanding) |
| ACLED | Event summaries created weekly about events occurring in Africa and Asia. | Event Summaries (Created and verified manually). Uses the CAMEO event taxonomy. | 1997-Now (Africa) / 2010-Now (Asia) | Weekly datasets (expanding) |
| CrisisNet | Data on crises such as diseases, political conflicts, and health (e.g. Ebola), all freely accessible via REST API. | Reports automatically generated from different data sources. | 2014 | ~1.6M Items |
| ReliefWeb | Real-time API access to reports since 1996. Provides low-level information about global events. | Citizen Reports (Unformatted data). | 1996-Now | ~54K Reports |
| HDX | The Humanitarian Data Exchange is a dataset repository that contains multiple datasets about different crises and related resources in different formats. | Citizen Reports / Event Summaries / Social Media | 2014-Now | 4163 Datasets / 244 Locations / 804 Sources (expanding) |

Table 2 Crisis related datasets

In terms of data formats, existing social media datasets tend to be based on Twitter data; therefore, they directly follow the Twitter message format and contain short texts with user information and sometimes user GPS coordinates that can be used for identifying the location of particular events. Report data is platform-specific but generally contains a title, a date, a location, a description, and a type (e.g. fire, earthquake…). Sometimes there can be additional information depending on the type of report. For example, the Ushahidi instance created for monitoring the USA presidential elections of 2016¹² has custom fields about candidates in each report. Finally, many of the news agency based datasets such as the GDELT¹³, ACLED¹⁴ and Phoenix Data Project¹⁵ datasets follow the CAMEO [8] model, which provides a taxonomy to identify the type of event mentioned as well as the actors involved. Since there are many similarities between the different data models listed in Table 2, and since each dataset uses different terminology for describing similar types of data, we decided to translate the data structures found in each dataset into the same format (Table 3).
| Data Structure | Feature | Description | Dataset |
|---|---|---|---|
| Report/Event/Post | ID | Unique identifier. | ACLED, GDELT, Phoenix, Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Creator | Actor that created the document. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Creation Date | Date when a document was created. | Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Update Date | Update date of a document. | CrisisNet |
| | Title | Summary/Title of a document. | CrisisNet |
| | Content | Content of a document. | ACLED, Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Media | A media item associated with a document (e.g. image, video). | Twitter, CrisisNet |
| | Actor Source | The actor source. | ACLED, GDELT, Phoenix |
| | Number of sources | Number of information sources for the document. | GDELT |
| | Category | Tags or categories that classify a document. | Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Location Precision | Certainty of an event location. | ACLED |
| | Information type | Identify the type of information contained in a document (e.g. caution, advice, donation...). | ACLED, GDELT, Phoenix, Incident Tweets, CrisisLex T26, CrisisNLP |
| | Information sub-type | Identify the sub-type of the information contained in a document (e.g. donation-shelter). | GDELT, CrisisNLP |
| | Information sub-sub-type | Identify the sub-sub-type of the information contained in a document. | GDELT |
| | Language | Language of a document. | Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | English translation | The English translation of the content of a document. | CrisisNet |
| | Relevance | Identify if a document is about a crisis event. | CrisisLex T26 |
| | Informativeness | Identify if a document is informative (i.e. gives useful information). | CrisisLex T26, CrisisNLP |
| | Actor Relation | The relation between the actors involved in an event. | ACLED, Phoenix |
| | Target Actor | The actors targeted by an event (recipient actor). | ACLED, Phoenix |
| | Fatalities count | Number of fatalities. | ACLED |
| | Goldstein Code | Numeric score capturing the theoretical potential impact that the type of event will have on the stability of a country. | GDELT, Phoenix |
| | Average Tone | Scale defining the positiveness/negativeness of an event. | GDELT |
| | Reference | External information cited by a document. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Source Actor | The actor that initiated an event. | ACLED, Phoenix |
| | Favourite count | Number of times a document has been bookmarked. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Messaged Actor | The identifier of the actor that a document's content is targeted at. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Parent Document | The identifier of a parent document (e.g. reply to) or related event. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Shares count | Number of times a document has been shared. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Event Location | The location of the reported object. | ACLED, GDELT, Phoenix, Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Event Date | Date when the object referred to in the document or report was created (e.g. event, resource...). | ACLED, GDELT, Phoenix |
| | Event Lifespan | Whether the document describes something temporary (e.g. event) or permanent (e.g. resource). | CrisisNet |
| | Event Date precision | Certainty of an event date. | ACLED |
| Geolocation | Geolocation Description | Name of a geolocation. | ACLED, GDELT, Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Coordinates | GPS coordinates of a geolocation. | ACLED, GDELT, Phoenix, Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Country | Country geolocation. | ACLED, GDELT, CrisisNet |
| | Admin Region 1 | Largest administrative region. | ACLED, GDELT, Phoenix |
| | Admin Region 2 | Second largest administrative region. | ACLED, GDELT, Phoenix |
| | Admin Region 3 | Third largest administrative region. | ACLED, GDELT, Phoenix |
| User/Account | ID | Unique actor identifier. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Name | Name of the actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP, ACLED |
| | Agent Type | The type of an actor (e.g. media, government...). | CrisisLex T26, CrisisNLP, ACLED |
| | Creation date | Date when an actor account was created. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Description | Text description of an author. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Favourite count | Number of times an actor account has been bookmarked. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Subscribers | List of actors subscribing to that particular actor account. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Subscribers count | Number of actor accounts subscribing to that particular actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Subscribed count | Number of actor accounts subscribed to. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Language | Language of the actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Geolocation | Geolocation of the actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Documents | List of created documents. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Documents count | Number of documents created. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | URL | URL associated with the actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| | Country | Country of the actor. | GDELT, ACLED, Phoenix |
| | Organisation | The actor group (e.g. United Nations, al-Qaeda). | GDELT, ACLED, Phoenix |
| | Ethnic group | The ethnic group of an actor. | GDELT |
| | Religion | The actor's religion. | GDELT |
| Data source | Creation date | Date and time at which the source was created. | CrisisNet |
| | Description | Description of the source. | CrisisNet |
| | Start date | When the data is initially available. | CrisisNet |
| | End date | When the data is no longer accessible. | CrisisNet |
| | Frequency | How often the data gets updated. | CrisisNet |
| | License | The license of the data source. | CrisisNet |
| | Type | The resource type (e.g. social network, news...). | CrisisNet |

Table 3 Data Structure of Crisis related datasets

12 Ushahidi USA Elections, https://usaelectionmonitor.ushahidi.io.
13 GDELT, http://www.gdeltproject.org/data.html#rawdatafiles.
14 ACLED, http://www.acleddata.com/data/.
15 Phoenix Data Project, http://phoenixdata.org/data.
As with the Ushahidi data structures, there are some features that may not be useful for the COMRADES model. For instance, the CrisisNet dataset provides data source information that is not necessary for the COMRADES model, as it is not used by the different tools developed for the COMRADES platform and the Ushahidi platform. By analysing the different properties of each dataset, we distinguish four different data structures: 1) reports, events and posts; 2) geolocation information; 3) user and account information; and 4) data sources. The reports, events and posts hold the main documents of the datasets, whereas user and account information represents document creators or the organisations or information sources involved in events. Geolocation data structures hold location information related to events and users. Finally, the data source structure is only used by the CrisisNet dataset, and stores information about how data is accessed. In general, it appears that the crisis related datasets hold more information about events than the Ushahidi data structures, even though Ushahidi can use forms for modelling such data. In particular, the datasets that follow the CAMEO taxonomy support different types of events and actors, with many different properties such as the actors involved in particular events as well as the organisations they belong to. In summary, the analysis of the crisis related datasets shows that the Ushahidi data structures already support many of the requirements of the existing datasets, except for the representation of domain specific information (e.g. Twitter posts) and the rich user and event models that are mostly given by the CAMEO taxonomy and the Twitter data.
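The translation of dataset-specific terminology into a common format can be sketched as a pair of mapping functions that emit records with the feature names of Table 3. The tweet field names below follow the classic Twitter JSON layout (`id_str`, `text`, `created_at`, `lang`, `user.screen_name`, GeoJSON `coordinates`); the ACLED column names are illustrative assumptions, since the exact export headers depend on the dataset version. The common record keys are also chosen for this example only.

```python
def from_tweet(tweet):
    """Map a tweet (classic Twitter JSON layout) to the common features."""
    coords = tweet.get("coordinates")
    return {
        "id": tweet["id_str"],
        "creator": tweet["user"]["screen_name"],
        "creation_date": tweet["created_at"],
        "content": tweet["text"],
        "language": tweet.get("lang"),
        # GeoJSON stores [longitude, latitude]; the common format uses (lat, lon).
        "coordinates": tuple(reversed(coords["coordinates"])) if coords else None,
        "fatalities": None,  # not available in Twitter data
    }


def from_acled(row):
    """Map an ACLED-like row to the common features (column names assumed)."""
    return {
        "id": row["event_id"],
        "creator": None,  # ACLED events have no document creator in Table 3
        "event_date": row["event_date"],
        "content": row["notes"],
        "language": None,
        "coordinates": (float(row["latitude"]), float(row["longitude"])),
        "fatalities": int(row["fatalities"]),
    }


tweet = {
    "id_str": "1",
    "text": "Flooding near the river",
    "created_at": "Tue Sep 20 14:30:00 +0000 2016",
    "lang": "en",
    "user": {"screen_name": "example_user"},
    "coordinates": {"type": "Point", "coordinates": [-0.12, 51.5]},
}
acled_row = {
    "event_id": "A1",
    "event_date": "2016-09-20",
    "notes": "Clashes reported in the city centre",
    "latitude": "9.07",
    "longitude": "7.48",
    "fatalities": "2",
}

records = [from_tweet(tweet), from_acled(acled_row)]
```

Once every source is mapped to the same keys, features that a dataset does not provide simply remain empty, which mirrors the sparse Dataset column of Table 3.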
5 The Ontology Requirement Specification Document (ORSD)
By analysing the stakeholder interviews, the Ushahidi platform and the different datasets considered for integration into the COMRADES model, we can create the ORSD that can be used for specifying what the COMRADES model should look like.
5.1 COMRADES Aims and Model Purpose
As part of the first step for designing the ORSD, we need to define the purpose, scope and level of formality of the model. The aim of the COMRADES project is to create a community resilience platform that helps communities to reconnect, respond to, and recover from crisis situations. The model supports this aim by providing a representation that allows communities or individuals to reconnect, respond to, and recover from crisis situations. In other words, the model needs to enable the representation of individuals and groups of individuals, allow communication between individuals, enable communities to respond to crises by gathering critical information, and recover by allowing the organization of resources and aid. In order to do so, the COMRADES project aims to build on top of the Ushahidi platform by providing new intelligent algorithms aimed at helping communities, citizens, and humanitarian services with analysing, verifying, monitoring, and responding to emergency events. The model needs to be general enough to cover a wide variety of scenarios and therefore be flexible. In the case of ontological development, a flexible model needs to offer relatively loose semantics (i.e. avoid overspecialisation) so that new types of users or resources do not require significant ontological modifications. In terms of scope, the model aims to support the modelling of crises and their recovery through social media analysis and manual data input. In conclusion, the COMRADES model’s purpose can be defined as the following:
The COMRADES model is a flexible model designed to organize individual or community communication of events, allow the gathering of information and resources about crises, and organize them for recovery purpose.
5.2 Intended Use and Users
The COMRADES model needs to satisfy different user groups such as governmental organizations and non-governmental groups, as well as individuals. Such individuals may have many different aims and goals. The model also needs to support algorithmic needs by allowing software to assert new information itself (e.g. trustworthiness, extracted entities…). Following the interviews with stakeholders, we distinguish four different types of users for the COMRADES model:
(1) Platform stakeholders: The individuals or organisations that supply community platforms such as Ushahidi.
(2) Local community groups: Community members of local activist groups.
(3) Responders: Organisations and individuals that use information gathered by platforms in order to organise the response to and recovery from a particular crisis.
(4) Individuals and small citizen groups: Individuals or small communities that are affected by a particular crisis.
Each of these user groups has different needs that define how the COMRADES model will be used. For instance, platform stakeholders need to make sure that the platform can be deployed easily. Community groups need to be able to assess a given crisis situation. Responders require the ability to analyse a given situation and organise recovery. Finally, individuals and citizen groups need to understand the situation and to be able to ask for assistance by reporting crisis events. In summary, the COMRADES model needs to cater for the four different user types mentioned above, as well as to be useful in situations where users are looking for or are willing to provide information about crises and where responders are organising resources in order to resolve a particular situation.
5.3 Competency Questions
A key part of the ORSD is to define competency questions that specify what types of queries the model should be able to support. As previously discussed, we perform three types of analysis: 1) study stakeholder interviews to better understand their needs; 2) analyse the work package requirements of the COMRADES project; and 3) study the structure of the Ushahidi community. We define competency questions as task-oriented questions that need to be satisfied by the COMRADES model. These competency questions are used in the second part of this document for making sure that the model supports all the question-based requirements and, in the last part, for validating the model against the competency questions.
In the following sections we convert the knowledge sources discussed in the previous section into competency questions. If a particular question is already covered by another knowledge source, we do not add an additional question.
5.3.1 Work Package Requirements
As previously discussed, the COMRADES tasks specify directly what type of information is needed for the algorithms used by the COMRADES resilience platform. For T3.2, T3.2 and T3.3, we obtain the following competency questions:
CQ1: How many messages are submitted to the platform?
CQ3: What is the language of a message?
CQ5: What are the topics of a message?
CQ6: What are the named entities of a message?
CQ7: What are the properties of a message?
CQ8: How reliable is a message?
CQ9: How reliable is a user?
CQ10: How many users have submitted information to the platform?
For T4.2 and T4.3, we have the following competency questions:
CQ11: How many events are present in the platform?
CQ12: What are the types of events in the platform?
CQ13: How are two events related?
CQ14: How many resources are needed?
CQ15: How many resources are available?
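As an illustration of how such competency questions can be turned into checks against a model, the sketch below answers CQ1, CQ10 and CQ11 over a small set of subject-predicate-object triples. The triple store is a plain Python list, and the predicate and class names (`type`, `creator`, `Message`, `Event`) are assumptions made for this example rather than terms of the COMRADES model; a real validation would more likely rely on a query language such as SPARQL.

```python
# Toy data: three messages from two users, and one event.
TRIPLES = [
    ("msg1", "type", "Message"),
    ("msg1", "creator", "user1"),
    ("msg2", "type", "Message"),
    ("msg2", "creator", "user1"),
    ("msg3", "type", "Message"),
    ("msg3", "creator", "user2"),
    ("evt1", "type", "Event"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def cq1_message_count(triples):
    """CQ1: How many messages are submitted to the platform?"""
    return len(match(triples, p="type", o="Message"))


def cq10_submitting_user_count(triples):
    """CQ10: How many users have submitted information to the platform?"""
    messages = {s for s, _, _ in match(triples, p="type", o="Message")}
    creators = {o for s, _, o in match(triples, p="creator") if s in messages}
    return len(creators)


def cq11_event_count(triples):
    """CQ11: How many events are present in the platform?"""
    return len(match(triples, p="type", o="Event"))


print(cq1_message_count(TRIPLES))
print(cq10_submitting_user_count(TRIPLES))
print(cq11_event_count(TRIPLES))
```

A competency question is considered supported when the model exposes the classes and properties needed to write such a query; the question is unsupported if no pattern over the model can express it.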
5.3.2 Interviews and Qualitative Requirements
As previously discussed, the interviews conducted with the stakeholders showed the need for a strong reporting model with methods for managing the access to information, multiple data sources and task management. The competency questions derived from the interviews are listed below:
CQ16: What is the location of an event?
CQ17: When did an event occur?
CQ18: What is the information source of an event?
CQ19: What information is related to an event?
CQ20: Is the event information source anonymous?
CQ21: What are the pictures associated with an event?
CQ22: What is the language of an event?
CQ23: Who has access to the report?
CQ24: Is the report reliable?
CQ25: Is the report published (i.e. approved)?
CQ26: What are the tasks associated with the report?
CQ28: Who is assigned to a task?
CQ29: What are the types of information sources?
5.3.3 Ushahidi Data Structures
Although the Ushahidi platform supports 11 different data structures, it is possible to simplify the structure of the data model as long as the same information can be retrieved. The competency questions resulting from the analysis of the Ushahidi platform data format are:
CQ30: What are the types of posted documents (Forms) available in the platform?
CQ31: What are the types of media available?
CQ32: What are the categories of documents that are in the platform?
CQ33: What are the document collections in the platform?
CQ34: What are the different user roles?
The COMRADES model also needs to be able to be queried for retrieving different properties from each model class stored in the model:
CQ35: When was a user created?
CQ36: When was a user's information updated?
CQ37: When was a document created?
CQ38: When was a document updated?
CQ39: When was a message created?
CQ40: When was a message updated?
CQ41: When was a document type created?
CQ42: When was a document type updated?
CQ43: When was a media item created?
CQ44: When was a media item updated?
CQ45: When was a topic created?
CQ46: When was a document collection created?
CQ47: When was a document collection updated?
CQ48: What is the email of a user?
CQ49: What is the real name of a user?
CQ50: What is the role of a user?
CQ51: What are the privileges of a user?
CQ52: What is the title of a document?
CQ53: What is the content of a document?
CQ54: What is the source of a document?
CQ55: What is the location of a document?
CQ56: What is the type of a document?
CQ57: Who is the user that created a document?
CQ58: What is the category of a document?
CQ59: What is the caption of a media item?
CQ60: What are the collections associated with a media item?
CQ61: What is the type of a category?
CQ62: What is the description of a category?
CQ63: Who is the user that created a category?
CQ64: What is the parent category of a category?
CQ65: What is the name of a collection?
CQ66: What is the description of a collection?
CQ67: Who is the user that created a collection?
CQ68: What are the documents in a collection?
CQ69: What are the users that can access a collection?
CQ70: What is the name of a role?
CQ71: What is the description of a role?
CQ72: What are the permissions of a role?
5.3.4 Crisis Related Datasets
As previously highlighted, many sources of information found in the crisis related datasets can be modelled by the Ushahidi data structures, and are therefore already covered by the Ushahidi data structure competency questions presented in the previous section. The competency questions retained when analysing the crisis related datasets are listed below:
CQ73: What is the title of a document?
CQ74: What are the media associated with a document?
CQ75: What is the actor at the origin of the information referred to in a document?
CQ76: How many data sources are referenced by a document?
CQ77: How precise is an event or resource geolocation?
CQ78: What is the type of information reported in a document?
CQ79: What is the language of a document?
CQ80: Is there an English translation for a document?
CQ81: What are the documents related to crises?
CQ82: How informative is a document?
CQ83: What is the relation between the actors involved in an event?
CQ84: What are the actors involved in an event?
CQ85: How many people died during an event?
CQ86: What is the impact of an event on the stability of a country?
CQ87: What are the documents related to a document?
CQ88: How positive is an event?
CQ89: How many times was a document favorited?
CQ90: Who was targeted by a message?
CQ91: Is there a parent event to an event?
CQ92: How many times was a document shared?
CQ93: Where did an event take place?
CQ94: What is the lifespan of an event or resource?
CQ95: What is the type of an actor?
CQ96: How is an actor described?
CQ97: How many times has a user account been favorited?
CQ98: Who is following a user account?
CQ99: What are the users followed by a user account?
CQ100: What is the religion of an actor?
CQ101: What is the ethnic group of an actor?
CQ102: What is the organisation associated with an actor?
6 Term Glossary
Now that we have a set of competency questions, we can extract the terms that are most used in the questions in order to help the development of the model. The idea is that the most frequent terms are key aspects of the model and need to be modelled prominently (e.g. as classes), whereas infrequent terms may not need to be represented as prominently in the final model.
The typical NeOn approach requires competency questions that are linked to actual data in order to extract each of those terms. Our competency questions differ: they were conceptualised from data structures and general interviews, so they are conceptual rather than data specific.
The NeOn methodology [9] distinguishes three types of terms: 1) competency question terms; 2) competency question answer terms; and 3) object terms. Competency question terms are the top words that appear in the competency questions, whereas answer terms are those that appear in the answers to the competency questions. Object terms are the named entities extracted from the competency questions and answers.
Since we do not have instantiated competency questions that contain both data specific questions and answers, we generate the glossary terms as follows: 1) we extract the most frequent terms appearing in our competency questions; 2) we extract the most frequent terms appearing in the data structures that we used for creating our competency questions. The idea is that, besides the terms extracted from the competency questions, the property descriptions and names found in the different datasets and the Ushahidi platform can help identify the key concepts and attributes of the COMRADES model. The top terms extracted from the competency questions and the data structures are listed in Table 4.
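The frequency-based extraction described above can be sketched as a simple token count. This is only an illustration under stated assumptions: the stop-word list, the tokenisation rule and the name top_terms are hypothetical, since the report does not say how terms were filtered or normalised.

```python
import re
from collections import Counter

# Hypothetical stop-word list: the report does not state which words were ignored.
STOP_WORDS = {
    "what", "is", "are", "the", "of", "a", "an", "in", "to",
    "was", "when", "who", "that", "how", "many",
}

def top_terms(questions, n=5):
    """Return the n most frequent non-stop-word tokens (case-insensitive)."""
    counts = Counter()
    for question in questions:
        # Keep letters and underscores so names such as allowed_privileges survive.
        for token in re.findall(r"[a-z_]+", question.lower()):
            if token not in STOP_WORDS:
                counts[token] += 1
    return counts.most_common(n)

# Four of the competency questions listed earlier in this document.
questions = [
    "When was a document created?",
    "When was a document updated?",
    "What is the title of a document?",
    "Who is the user that created a document?",
]

print(top_terms(questions, n=2))
```

No stemming is applied in this sketch, so singular and plural forms are counted separately, which matches the way Table 4 lists both "Event" and "Events".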
TYPE TERM (FREQUENCY)
Competency Question
Document (27), Event (17), User (13), created (10), Type (9), Information (8), Message (8), Collection (7), Category (6), Platform (6), Actor (6), Media (6), updated (6), associated (5), related (4), Role (4), Report (4), name (3), description (3), Source (3), language (3), Account (3), Events (3), reliable (3).
Data Structures
created (13), allowed_privileges (11), id (10), URL (10), Form (9), data (9), User (8), Type (8), updated (7), Post (6), Message (6), Media (6), Multiple (5), Event (5), Creator (4), Posts (4), description (4), sources (4), Collection (4), Crises (4), social (4), contact (4), name (3), annotated (3).
Table 4 Top Terms Extracted from the Competency Questions, Crisis Related Datasets and the Ushahidi Data Structures.
7 Summary
In order to specify the COMRADES model, we analysed the COMRADES project requirements and the Ushahidi platform as well as related crisis datasets in order to extract the data requirements for the model. We also analysed user requirements from stakeholder interviews in order to derive the model requirements from the perspective of future users. This approach was based on the structural and qualitative design approach discussed in the introduction. The analysis helped us to better understand the aims of the model, its future usage and its users. We also produced a set of competency questions that form the basis of the model implementation. The next part of this document reuses those findings for fully specifying and implementing the COMRADES ontology. In particular, we reuse the competency questions, as well as the common data structures observed in the Ushahidi platform and crisis related datasets, for guiding the development of the ontology.
Part II: COMRADES Ontology Model
COMRADES Model Implementation
8 Introduction
In the previous sections, we created the Ontology Requirement Specification Document (ORSD) [1] for the COMRADES model based on multiple analyses and extracted a set of competency questions as well as a glossary of key terms. In the following sections we create the COMRADES ontology¹⁶ based on the ORSD by identifying the key components of the model and then the relations between components as well as the properties of the ontology. We also integrate the ontological model with existing ontologies to improve the interoperability and usability of the model. To simplify the usage of the model across communities, we also translate the ontology classes, properties and relations into different languages. Finally, we discuss how domain knowledge can be added to the COMRADES ontology.
¹⁶ COMRADES Ontology, http://socsem.open.ac.uk/ontologies/comrades.
9 Model Principles
Before introducing the COMRADES model, we discuss the main approach used for gathering and organising information and resources about crises. Many of the datasets and data structures analysed when creating the ORSD are centred on reports and the ingestion of external documents rather than on the direct modelling of events. We therefore centre the COMRADES model on reports rather than events: reports are clustered together to describe events that result in real world situations, and external documents (or other information sources such as other reports or an informant) are used for documenting what is discussed in a report. Reports can be used in different ways for documenting events, needs, resources and so on, and form the base of the COMRADES model. The advantage of a report centred approach is that it allows a more organic gathering of information related to events without needing rigid data structures. This is particularly suitable for resilience platforms that are deployed in a large variety of situations where the types of reports are context specific.
We use a situation model for documenting how events affect their environment. Typically, a situation involves different entities (e.g. local population, building, political situation…) and defines the state that was induced by the situation. For example, a building explosion (situation) would induce a particular building (entity) to be collapsed (state).
Another important component of the model is the representation of categories and collections. Collections are manually curated groups of documents and reports, whereas categories are hierarchical structures used for classifying reports and events. For representing users and the permissions associated with documents, reports and other model classes, we use the concepts of roles and accounts: users hold roles that are associated with permissions. User accounts hold platform specific user information such as the reliability of a user's contributions. Finally, we add a simple model for representing tasks that can be attached to reports and assigned to users.
10 Ontology Components
Based on the previous model principles, we discuss the classes, relations and properties of the COMRADES model. We refer to the COMRADES namespace as com in the following sections.
10.1 Classes and Relations
As previously discussed, the COMRADES model is divided into classes that separate crisis related data into four types of information (reports, documents, events and situations) and associate them with tasks as well as users. In the following sections, we discuss how the different classes of the model are named and linked. Figure 3 shows the different classes and relations of the COMRADES model.
Figure 3 The COMRADES Ontology classes and relations.
10.1.1 Information Sources, Reports and Situations
The competency questions show that many properties and relations are focused on different types of documents and that both the Ushahidi platform and the crisis related datasets model events indirectly, using user submitted reports or automatically generated documents. As a consequence, we centre the representation of crisis related information around the concepts of com:Report, com:Situation, com:Event, com:Document and com:Informant.
In order to connect these components, we associate a com:Report with the com:Document instances that represent the information sources used for creating the report, such as an external media (com:Media) or message (com:Message). These classes can be subclassed if a new com:Document representation is necessary. A com:Report can also be linked to a com:Informant, which can be a com:Agent or com:Organisation and represents the person or organisation that gave the information used in the com:Report. We define messages (com:Message) as an extension of documents (com:Document): messages occur in conversations (e.g. Twitter messages, forum posts) whereas documents are standalone pieces of information (e.g. news articles, blog posts).
Besides being associated with documents and informants, reports are also connected with the events (com:Event) and situations (com:Situation) that they describe or update. Events are things that happen or take place, whereas situations represent the states (com:State) of entities (com:Entity). The separation of com:Document and com:Informant from com:Event and com:Situation is designed to show how a piece of information obtained from an external source is integrated and processed through a report (com:Report) into a piece of usable knowledge in the form of a situation (com:Situation) or event (com:Event). This allows the model to be queried from different perspectives. For instance, com:Document can be used for understanding where a piece of information comes from, whereas com:Report can be used for understanding how a com:Document was brought into the COMRADES platform and, finally, com:Event and com:Situation can be used for analysing current emergency situations by observing the com:State of a com:Entity. Besides com:Document and com:Message, a com:Report can also be associated with different types of media (com:Media) such as pictures (com:Picture) and videos (com:Video).
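To make the chain from information source to usable knowledge more concrete, the following sketch stores a few statements as plain (subject, predicate, object) tuples and follows a chain of relations from a report to the state of an entity. It is an illustration only: the ex: resources are invented, the direction of com:describes (report to event) is an assumption, and follow is a hypothetical helper rather than part of the COMRADES ontology.

```python
# Invented example data: one report built from a tweet about a building explosion.
TRIPLES = [
    ("ex:report1", "rdf:type", "com:Report"),
    ("ex:report1", "com:source", "ex:tweet1"),
    ("ex:tweet1", "rdf:type", "com:Message"),
    ("ex:report1", "com:describes", "ex:explosion"),
    ("ex:explosion", "rdf:type", "com:Event"),
    ("ex:explosion", "com:results_in", "ex:situation1"),
    ("ex:situation1", "rdf:type", "com:Situation"),
    ("ex:situation1", "com:involves", "ex:building"),
    ("ex:building", "rdf:type", "com:Entity"),
    ("ex:building", "com:state", "ex:collapsed"),
    ("ex:collapsed", "rdf:type", "com:State"),
]

def objects(subject, predicate):
    """Return every object linked to subject through predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

def follow(start, *predicates):
    """Follow a chain of predicates from start and return the nodes reached."""
    frontier = [start]
    for predicate in predicates:
        frontier = [o for node in frontier for o in objects(node, predicate)]
    return frontier

# Where does the information come from?
print(follow("ex:report1", "com:source"))
# What state did the reported event induce?
print(follow("ex:report1", "com:describes", "com:results_in", "com:involves", "com:state"))
```

The same data answers both the provenance question (through com:source) and the situational question (through com:state), which is the sense in which the model can be queried from different perspectives.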
10.1.2 Collections, Categories and Topics
The different types of information collected by the COMRADES model can be grouped and categorised in different ways. For instance, com:Report and com:Document can be grouped into a com:Collection, whereas com:Report and com:Event can be grouped into a com:Category. Collections (com:Collection) are designed to emulate the Ushahidi document collections by allowing different types of information to be grouped together as a list according to arbitrary criteria, while categories (com:Category) are used as a public hierarchical classification model for information retrieval purposes.
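The difference between the two grouping mechanisms can be sketched as follows: a collection is a flat, arbitrary list, while a category sits in a hierarchy that can be walked upwards. The data and the helper category_ancestors are hypothetical; only the idea of a parent category relation comes from the model.

```python
# A collection is a flat list curated by a user (invented example data).
COLLECTIONS = {
    "ex:volunteer_picks": ["ex:report1", "ex:doc7"],
}

# Categories form a hierarchy: each category points to its parent category.
PARENT_CATEGORY = {
    "ex:Flood": "ex:NaturalDisaster",
    "ex:NaturalDisaster": "ex:Disaster",
}

def category_ancestors(category):
    """Return the chain of parent categories, nearest parent first."""
    chain = []
    seen = {category}
    while category in PARENT_CATEGORY:
        category = PARENT_CATEGORY[category]
        if category in seen:  # guard against accidental cycles in the data
            break
        seen.add(category)
        chain.append(category)
    return chain

print(category_ancestors("ex:Flood"))
```

A report classified as ex:Flood can therefore also be retrieved when searching for the broader ex:Disaster category, whereas membership in a collection carries no such implication.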
10.1.3 Actors, Organizations and Accounts
Another important part of the model is the representation of the actors, organisations and accounts that are used for representing the creator of a com:Document and the person that posted a com:Report, as well as the people and organisations that created a com:Situation or com:Event. We distinguish different types of users. In particular, we define com:Agent as a generic type of user and the com:Organisation class for the different types of organisations or groups a user can belong to. For instance, a user could belong to a particular NGO or religious group. The user class (com:Agent) is defined as a subclass of com:Informant, which can be used as the information source of a com:Report when no document source (com:Document) is available but the information comes from a known person or organisation. For contributions within the COMRADES model, the com:Account class is used for abstracting contributor specific information that only exists within the COMRADES model, such as the number of documents created by a com:Agent.
10.1.4 Tasks, Roles and Permissions
As highlighted by the ORSD, the COMRADES model needs to support access permissions to the different content represented by the model. In this context, we define the classes com:Role and com:Permission, which are used together for associating permissions with multiple model classes. The Ushahidi platform also supports the assignment of tasks to platform users. We support tasks by adding the com:Task class to the model and linking it to com:Account and com:Report so that reports can be used for assigning tasks.
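A minimal sketch of how roles and permissions could combine is given below. It assumes that a resource is scoped to a set of roles, that an account holds a set of roles, and that each role grants a set of permissions; the permission names, the ex: identifiers and the function permissions_for are all hypothetical and are not defined by the ontology.

```python
# Invented example data for accounts, roles and resource scopes.
ACCOUNT_ROLES = {
    "ex:alice": {"ex:editor"},
    "ex:bob": {"ex:viewer"},
}
ROLE_PERMISSIONS = {
    "ex:editor": {"read", "update"},
    "ex:viewer": {"read"},
}
RESOURCE_SCOPE = {
    "ex:report1": {"ex:editor"},  # only editors may access this report
}

def permissions_for(account, resource):
    """Return the permissions an account has on a resource via shared roles."""
    shared_roles = ACCOUNT_ROLES.get(account, set()) & RESOURCE_SCOPE.get(resource, set())
    granted = set()
    for role in shared_roles:
        granted |= ROLE_PERMISSIONS.get(role, set())
    return granted

print(permissions_for("ex:alice", "ex:report1"))
print(permissions_for("ex:bob", "ex:report1"))
```

Keeping permissions on roles rather than on accounts means a new kind of content only needs a scope, not a per-user access list.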
10.2 Properties
Contrary to relations, properties do not link to other classes of the COMRADES ontology. The different properties required for each class can be directly extracted from the competency questions as well as the previously analysed data structures. The properties of the classes displayed in Figure 3 are listed in the following table (Table 5):
Class Property Description
Document
updated Date when a class was updated.
created Date when a class was instantiated.
title The document title.
content The content of a document.
informativeness How informative is a document (i.e. useful for crisis analysis).
favourites How many times a document was bookmarked.
shares The number of times the document was shared.
language The language of the document.
polarity Indicate a document sentiment.
englishTranslation The English translation of the document content.
Report
updated Date when a class was updated.
created Date when a class was instantiated.
title The title of the report.
informativeness How informative is the report (i.e. useful for crisis analysis).
language The language of the report.
approvalStatus The report status (e.g. draft, published, deleted...)
polarity Indicate a report sentiment.
englishTranslation The English translation of the report.
Situation
created Date when a class was instantiated.
updated Date when a class was updated.
description The description of the situation.
title The title of the situation.
startTime When the situation started.
endTime When the situation stopped.
informativeness How informative is the situation (i.e. useful for crisis analysis).
polarity Indicate the sentiment of a situation.
Event
updated Date when a class was updated.
created Date when a class was instantiated.
title The title of the event.
startTime When the event started.
endTime When the event stopped.
informativeness How informative is the event (i.e. useful for crisis analysis).
polarity Indicate the sentiment of an event.
Entity
created Date when a class was instantiated.
updated Date when a class was updated.
description The description of the entity.
name The name of the entity.
lifespan The entity lifespan (e.g. permanent, temporary, consumable...).
Category
title The title of the category.
description The description of the category.
created Date when a class was instantiated.
updated Date when a class was updated.
Collection
title The title of the collection.
description The description of the collection.
created Date when a class was instantiated.
updated Date when a class was updated.
Agent
realName The real name or full name of the agent.
email The email associated with the agent.
description The description of the agent.
Account
favourite The number of times the account was bookmarked.
created Date when a class was instantiated.
updated Date when a class was updated.
Task
created Date when a class was instantiated.
updated Date when a class was updated.
title The task title.
description The task description.
status The status of the task (e.g. accepted, pending, assigned...).
Geolocation
precision The accuracy of the geolocation.
Table 5 Properties of the COMRADES Ontology
It is important to note that the reliability of the different elements of the ontology is not represented as a property. Instead, the reliability and trustworthiness of resources are represented using the Veracity ontology¹⁷ [10].
11 Integration with Existing Ontologies
The COMRADES model reuses multiple ontologies for modelling the different classes, properties and relations discussed in the previous section.
11.1 Crisis Related Ontologies
Although many ontologies have been designed for representing crises or related information, most of them do not focus on the concepts of report and document. Instead, existing models focus on the event representation of emergency crises and ignore the collection of evidence and user submitted reports as a means for representing event related information. Task representation is also generally absent from crisis related ontologies.
A few ontologies have been designed for modelling events in crisis situations, such as MOAC¹⁸ (Management of a Crisis) and HXL (Humanitarian eXchange Language) [11]. However, despite modelling resources, processes, damages and disasters (fire, people trapped, medical emergency), these models do not provide representations for documents and reports. The need for more complete models was highlighted by Liu et al. [12]. Moreover, existing semantic models were mostly designed for providing a static view of emergency situations, where elements are captured but not their temporal evolution.
In terms of document representation, the CURIO¹⁹ ontology (Collaborative User Resource Interaction Ontology) provides means for representing the collection of documents in an emergency context. However, it only provides a simple model of events, without the concept of event situations. The CURIO ontology nevertheless shares some similarities with the COMRADES model as it reuses many concepts from the SIOC²⁰ ontology [13].
¹⁷ Veracity Ontology, http://purl.org/net/veracity/ns.
¹⁸ MOAC, http://www.observedchange.com/moac/ns/.
11.2 Other Ontologies
Most of the ontologies reused in the COMRADES ontology are widely used. The main reason for reusing such ontologies is that it improves the reusability of the model by allowing it to be used similarly to existing ontologies. The COMRADES ontology reuses five different ontologies for modelling its components and properties. The main ontology reused for representing the different elements of the COMRADES model is the SIOC ontology [13], which provides constructs for representing online communities. We reuse the SIOC ontology for representing documents, reports, collections, permissions and roles as well as different properties and relations of the model. We also reuse the FOAF²¹ (Friend Of A Friend) ontology for representing users in the model, as it integrates well with the SIOC ontology and provides ways of representing agents and organisations. For modelling geolocation, we use the Geonames²² and WGS84²³ ontologies, as they provide basic representations of geolocation coordinates that can be used for identifying the location of events and other resources. The Dublin Core²⁴ model is also used, as it provides many properties, relations and classes specifically designed for modelling documents. Finally, for representing the trustworthiness of the different content of the platform we use the Veracity ontology [10], as it provides methods for asserting the reliability of different resources. The different mappings are described in Figure 3.
¹⁹ CURIO, http://purl.org/net/curio/.
²⁰ SIOC, http://rdfs.org/sioc/spec/.
²¹ FOAF, http://xmlns.com/foaf/spec/.
²² Geonames Ontology, http://www.geonames.org/ontology#.
²³ WGS84 Ontology, https://www.w3.org/2003/01/geo/#vocabulary.
²⁴ Dublin Core, http://dublincore.org/documents/dcmi-terms/.
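As a rough illustration of how the reused vocabularies sit alongside the com namespace, the sketch below expands prefixed names into full IRIs. The sioc, foaf, dcterms, geo and gn namespace IRIs are the ones published by those vocabularies. The com IRI is an assumption derived from the ontology address given in this document (the trailing # is a guess), and expand is a hypothetical helper.

```python
PREFIXES = {
    "sioc": "http://rdfs.org/sioc/ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "gn": "http://www.geonames.org/ontology#",
    # Assumption: the exact COMRADES namespace IRI is not stated in the text.
    "com": "http://socsem.open.ac.uk/ontologies/comrades#",
}

def expand(prefixed_name):
    """Expand a prefixed name such as 'sioc:topic' into a full IRI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local_name

print(expand("sioc:topic"))
print(expand("foaf:account"))
```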
12 Multilingual Support
One of the aims of the COMRADES model is to support multiple languages so that the model can be used by different communities around the world. In order to do so, we translate the names of the classes, properties and relations of the ontology into different languages using the language tagging features of RDF [14]. For example, we translate the com:Organisation class as follows:
com:Organisation rdfs:isDefinedBy com: ;
    a rdfs:Class, owl:Class ;
    rdfs:comment "An organisation"@en ;
    rdfs:label "Organisation"@en ;
    rdfs:comment "Une organisation"@fr ;
    rdfs:label "Organisation"@fr ;
    rdfs:comment "Una organización"@es ;
    rdfs:label "Organización"@es ;
    rdfs:subClassOf foaf:Group .
At the moment, we only translate labels into Spanish and French and do not translate the description of the ontology classes, properties and relations. Nevertheless, such translation can be added if necessary later on and it does not affect the usage of the COMRADES model as the ontological concepts are translation independent.
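On the consuming side, language tagged labels such as those of com:Organisation could be used as sketched below, with a fallback when a language is missing. The dictionary mirrors the labels of the example above; the function label and its fallback rule are hypothetical and are not prescribed by the ontology.

```python
# Labels of com:Organisation keyed by language tag, copied from the example above.
LABELS = {
    "com:Organisation": {
        "en": "Organisation",
        "fr": "Organisation",
        "es": "Organización",
    },
}

def label(term, lang, fallback="en"):
    """Return the label of term in lang, else in fallback, else the term itself."""
    by_language = LABELS.get(term, {})
    return by_language.get(lang) or by_language.get(fallback) or term

print(label("com:Organisation", "es"))
print(label("com:Organisation", "de"))  # no German label, falls back to English
```

Because the concept identifier stays the same in every language, data annotated by one community remains usable by another regardless of which labels are displayed.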
13 Domain Knowledge
The specification of domain knowledge in the COMRADES ontology is mostly centred on: 1) the definition of user organisations, religious groups and ethnic groups; 2) the specification of report types, document types and event types; and 3) the definition of categories, entity types and entity statuses. Although different methods can be used for creating such resources, such as creating domain specific gazetteers, we decided not to enforce any specific domain knowledge in order to simplify the integration of the COMRADES model into existing datasets and tools. Each tool and dataset can specify its own domain knowledge depending on the specifics of the model usage. If interoperability between different datasets or models is required, resources can be linked to external entity resources such as DBpedia²⁵ so that similar entities or resources can be identified more easily even if the COMRADES ontology is used in different contexts.
²⁵ DBpedia, http://dbpedia.org.
14 Summary
In the previous sections we introduced the COMRADES ontology based on the ORSD. First, we analysed the competency questions and ORSD glossary in order to create a high level version of the COMRADES model. Second, we implemented the ontology using RDF/OWL and aligned it with existing ontological models. We also translated ontological classes, properties and relations into different languages to simplify the usage of the ontology in different communities. During the model development we decided not to implement any specific domain knowledge, since enforcing default domain knowledge can complicate the integration of the model into existing tools. Instead, the COMRADES model provides classes that can be extended depending on the model usage or the integrated datasets. This allows for a more targeted usage of the model and a simpler integration of the model into existing applications or tools.
Part III: Model Evaluation
Theoretical Model Evaluation
15 Introduction
Although different methods can be used for evaluating ontologies, many rely on mapping existing data and then evaluating whether the competency questions can be verified on real data. Since we do not have datasets that cover all the parts of the COMRADES ontology, we perform a theoretical evaluation by checking whether the classes and properties of the COMRADES ontology can be mapped to the competency questions. In the following section, we discuss the evaluation approach and how competency questions are mapped to the properties, relations and classes of the COMRADES model. We also show how the current model represents the current competency questions.
16 Ontology Evaluation
In order to evaluate the COMRADES ontology, we first extract the key classes, properties and relations associated with each competency question. Then, we check whether a path exists between the extracted properties, relations and classes. Finally, we consider a competency question validated if such a path exists.
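The path check described above can be sketched as a breadth-first search over a small schema graph in which classes are nodes and relations are labelled edges. This is an illustration under stated assumptions rather than the procedure actually used for the evaluation: the schema fragment is reduced to a few relations named in this document, each relation is given a single target class, and find_path is a hypothetical helper.

```python
from collections import deque

# Reduced schema fragment: class -> {relation: target class}.
SCHEMA = {
    "com:Report": {
        "com:describes": "com:Event",
        "com:source": "com:Document",
        "com:informant": "com:Informant",
    },
    "com:Event": {"com:results_in": "com:Situation"},
    "com:Situation": {"com:involves": "com:Entity"},
    "com:Entity": {"com:state": "com:State"},
}

def find_path(start, goal):
    """Return a class/relation path from start to goal, or None if none exists."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for relation, target in SCHEMA.get(node, {}).items():
            if target not in visited:
                visited.add(target)
                queue.append(path + [relation, target])
    return None

# A competency question is treated as validated when a path exists.
print(find_path("com:Report", "com:State"))
print(find_path("com:State", "com:Report"))
```

In this sketch edges are followed in one direction only, so a question that needs an inverse relation would require that relation to be added to the schema explicitly.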
16.1 Competency Questions Mapping
For each competency question, we list the classes and relations that need to be connected and evaluate whether the competency question is validated (i.e. whether there is a path between the classes, relations and properties associated with the competency question). The mapping and result for each competency question are listed below (Table 6):
CQ PATH CQ VALID?
CQ1 com:Message Yes (COUNT)
CQ3 com:Message → com:language Yes
CQ5 com:Message → sioc:topic Yes (with SIOC)
CQ6 com:Message → com:DocumentEntity Yes (LIST)
CQ7 com:Message Yes (LIST)
CQ8 com:Message / vo:Proposition → vo:has_trustworthiness → vo:Trustworthiness → (vo:trusted; vo:is_asserting → vo:TrustworthinessAssertion → vo:confidence) Yes
CQ9 Similar to CQ8 Yes
CQ10 com:Agent → foaf:account → com:Account Yes (COUNT)
CQ11 com:Event Yes (COUNT)
CQ12 com:Event Yes (LIST)
CQ13 com:Event → com:describes → com:Report → com:describes → com:Report Yes (Report)
CQ14 com:Entity → com:state → com:State Yes (COUNT)
CQ15 com:Entity → com:state → com:State Yes (COUNT)
CQ16 com:Event → com:geolocation → com:Geolocation Yes
CQ17 com:Event → com:startTime Yes
CQ18 com:Event → com:describes → com:Report → com:informant → com:Informant Yes
CQ19 com:Event → com:describes → com:Report → com:source → com:Document Yes (LIST)
CQ20 com:Event → com:describes → com:Report → com:informant → com:Informant Yes (Agent properties)
CQ21 com:Picture → com:source → com:Report → com:describes → com:Event Yes (LIST)
CQ22 com:Report → ( com:language; com:describes → com:Event) Yes
CQ23 com:Report → com:scope → com:Role → com:role → com:Account → foaf:account → com:Agent Yes
CQ24 Similar to CQ8 Yes
CQ25 com:Report → com:approvalStatus Yes
CQ26 com:Report → com:task → com:Task Yes (LIST)
CQ28 com:Task → com:assigned_to → com:Account Yes
CQ29 com:Document; com:Informant Yes (LIST)
CQ30 com:Report Yes (LIST)
CQ31 com:Media Yes (LIST)
CQ32 com:Category Yes (LIST)
CQ33 com:Collection Yes (LIST)
CQ34 com:Role Yes (LIST)
CQ35 com:Agent → foaf:account → com:Account → com:created Yes
CQ36 com:Agent → foaf:account → com:Account → com:updated Yes
CQ37 com:Document → com:created Yes
CQ38 com:Document → com:updated Yes
CQ39 com:Message → com:created Yes
CQ40 com:Message → com:updated Yes
CQ41 com:Document → com:created Yes
CQ42 com:Document → com:updated Yes
CQ43 com:Media → com:created Yes
CQ44 com:Media → com:updated Yes
CQ45 com:Category → com:created Yes
CQ46 com:Collection → com:created Yes
CQ47 com:Collection → com:updated Yes
CQ48 com:Agent → com:email Yes
CQ49 com:Agent → com:realName Yes
CQ50 com:Agent → foaf:account → com:Account → com:role → com:Role Yes
CQ51 com:Agent → foaf:account → com:Account → com:role → com:Role → com:permission → com:Permission Yes
CQ52 com:Document → com:title Yes
CQ53 com:Document → com:content Yes
CQ54 com:Report → com:source → com:Document Yes (Report)
CQ55 com:Document → com:geolocation → com:Geolocation Yes
CQ56 com:Document Yes (LIST)
CQ57 com:Document → com:created → com:Account → foaf:account → com:Agent Yes
CQ58 com:Document → sioc:topic Yes (with SIOC)
CQ59 com:Media → com:description Yes
CQ60 com:Media → com:collection → com:Collection Yes (LIST)
CQ61 com:Category Yes (LIST)
CQ62 com:Category → com:description Yes
CQ63 com:Category → com:created → com:Account → foaf:account → com:Agent Yes
CQ64 com:Category → com:parent_category → com:Category Yes
CQ65 com:Collection → com:title Yes
CQ66 com:Collection → com:description Yes
CQ67 com:Collection → com:created → com:Account → foaf:account → com:Agent Yes
CQ68 com:Collection → com:collection → com:Document Yes (LIST)
CQ69 com:Collection → com:scope → com:Role → com:role → com:Account → foaf:account → com:Agent Yes (LIST)
CQ70 com:Role → dc:title Yes (with dcterms)
CQ71 com:Role → dc:description Yes (with dcterms)
CQ72 com:Role → com:permission → com:Permission Yes (LIST)
CQ73 com:Document → com:title Yes
CQ74 com:Report → com:source → com:Media Yes (Report)
CQ75 com:Report → com:informant → com:Informant Yes (Report)
CQ76 com:Report → com:source → com:Document Yes (Report/LIST)
CQ77 com:Event → com:geolocation → com:Geolocation → com:precision Yes
CQ78 com:Report Yes (Report)
CQ79 com:Document → com:language Yes
CQ80 com:Document → com:englishTranslation Yes
CQ81 com:Document → com:informativeness Yes
CQ82 com:Document → com:informativeness Yes
CQ83 com:Event → com:results_in → com:Situation → com:involves → com:Entity Yes (Entity)
CQ84 com:Event → com:results_in → com:Situation → com:involves → com:Entity Yes (Entity)
CQ85 com:Event → com:results_in → com:Situation → com:involves → com:Entity Yes (Entity)
CQ86 com:Event → com:impact Yes
CQ87 com:Report → com:describes → (com:Event; com:Situation) → com:describes → com:Report Yes (Report/LIST)
CQ88 com:Event → com:polarity Yes
CQ89 com:Document → com:favourites Yes
CQ90 com:Message → com:addressed_to → com:Account Yes
CQ91 com:Event → com:results_in → com:Event Yes (ASK)
CQ92 com:Document → com:shares Yes
CQ93 com:Event → com:startDate Yes
CQ94 com:Entity → com:lifespan Yes (Entity)
CQ95 com:Agent Yes (LIST)
CQ96 com:Agent → com:description Yes
CQ97 com:Account → com:favourites Yes
CQ98 com:Account → com:followed_by → com:Account Yes (LIST)
CQ99 com:Account → com:follows → com:Account Yes (LIST)
CQ100 com:Agent → foaf:member → com:Organisation Yes
CQ101 com:Agent → foaf:member → com:Organisation Yes
CQ102 com:Agent → foaf:member → com:Organisation Yes
Table 6 Competency questions ontology mappings and evaluation.
16.2 Results
As Table 6 shows, the model can represent all of the competency questions. However, some mappings are not provided by the COMRADES ontology itself but rely on the ontologies merged with it. For instance, the topic of a com:Message is not modelled directly by the COMRADES model but can be represented through the sioc:topic relation. Some competency questions are also ambiguous because they use the term “document” loosely: in the implementation of the COMRADES model, some of those “documents” are actually reports, and we corrected the corresponding mappings when validating the competency questions. Finally, some competency questions are not represented directly but can be represented by adding subclasses to the existing model. For instance, the com:Agent involved in a com:Event can be represented through a com:Situation and a new type of com:Entity.
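The path notation used in Table 6 can be read as a walk over RDF triples. The following minimal Python sketch illustrates that reading on a toy in-memory triple set; it is not the evaluation procedure used for this deliverable. The `ex:` instance identifiers, the `instances_of` and `walk` helpers, and the reading of the LIST and ASK annotations as set-valued and boolean results are assumptions made for illustration, and no subclass inference is performed.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Toy data. All ex:* identifiers and the literal "high" are hypothetical.
TRIPLES: Set[Triple] = {
    ("ex:quake", "rdf:type", "com:Event"),
    ("ex:tsunami", "rdf:type", "com:Event"),
    ("ex:quake", "com:results_in", "ex:tsunami"),
    ("ex:tsunami", "com:results_in", "ex:sit1"),
    ("ex:sit1", "rdf:type", "com:Situation"),
    ("ex:sit1", "com:involves", "ex:bridge"),
    ("ex:bridge", "rdf:type", "com:Entity"),
    ("ex:quake", "com:impact", "high"),
}


def instances_of(cls: str, triples: Set[Triple]) -> Set[str]:
    """Return every subject explicitly typed as `cls` (no inference)."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}


def walk(path: List[str], triples: Set[Triple]) -> Set[str]:
    """Follow a Table 6 style path: class, property, class, property, ...

    A path may end on a property (e.g. com:Event -> com:impact), in which
    case the reached values are returned without a class filter.
    """
    current = instances_of(path[0], triples)
    for i in range(1, len(path), 2):
        prop = path[i]
        reached = {o for s, p, o in triples if p == prop and s in current}
        if i + 1 < len(path):
            # Keep only nodes typed with the class named after the property.
            reached &= instances_of(path[i + 1], triples)
        current = reached
    return current


# CQ83-style mapping, read as a LIST: which entities are involved?
entities = walk(
    ["com:Event", "com:results_in", "com:Situation", "com:involves", "com:Entity"],
    TRIPLES,
)

# CQ91-style mapping, read as an ASK: does any event result in another event?
has_chained_event = bool(walk(["com:Event", "com:results_in", "com:Event"], TRIPLES))

# CQ86-style mapping ending on a property value.
impacts = walk(["com:Event", "com:impact"], TRIPLES)
```

Under this reading, a competency question counts as representable when a path of this shape can be written with the classes and properties of the model (or of the merged ontologies); whether it returns results depends on the data loaded.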
17 Conclusions
We introduced the COMRADES ontology, a model that supports the representation of events and related information during emergency crises. We based the development of the model on the NeOn methodology [1] and on a qualitative and structural design approach [2]. After creating an Ontology Requirement Specification Document (ORSD) [3], we implemented the model using semantic web technologies (RDF/OWL) and linked the newly developed data structures to existing ontologies such as FOAF and SIOC.
The model is not yet populated with the input and output data of the different components of the COMRADES platform, since they are still under development. We therefore provided a partial evaluation by mapping a list of competency questions to the properties, relations and classes of the COMRADES ontology. Competency questions are commonly used in ontology evaluation to test whether a model can answer all the required queries.
Since the needs and requirements of the COMRADES resilience platform are likely to evolve during the project, we designed the model to be easily extensible. For instance, additional types of data and reports can be added to the model, and new types of events or resources can be specified. Further evaluations will be performed on the model within the COMRADES platform when more data becomes available.
Appendix
18 References
[1] M.C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernández-López, The NeOn methodology for ontology engineering, in: Ontol. Eng. a Networked World, 2012: pp. 9–34. doi:10.1007/978-3-642-24794-1_2.
[2] G. Burel, Community and Thread Methods for Identifying Best Answers in Online Question Answering Communities, (2016). http://oro.open.ac.uk/46144/ (accessed November 30, 2016).
[3] M.C. Suárez-Figueroa, A. Gómez-Pérez, B. Villazón-Terrazas, How to write and use the ontology requirements specification document, in: Lect. Notes Comput. Sci. (Including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), 2009: pp. 966–982. doi:10.1007/978-3-642-05151-7_16.
[4] M.F. Lopez, A. Gomez-Perez, J.P. Sierra, A.P. Sierra, Building a chemical ontology using Methontology and the Ontology Design Environment, IEEE Intell. Syst. 14 (1999) 37–46. doi:10.1109/5254.747904.
[5] S. Staab, R. Studer, H.P. Schnurr, Y. Sure, Knowledge processes and ontologies, IEEE Intell. Syst. Their Appl. 16 (2001) 26–34. doi:10.1109/5254.912382.
[6] D. Vrandecic, S. Pinto, C. Tempich, Y. Sure, The DILIGENT knowledge processes, J. Knowl. Manag. 9 (2005) 85–96. doi:10.1108/13673270510622474.
[7] M.C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernández-López, The NeOn methodology framework: a scenario-based methodology for ontology development, Appl. Ontol. 10 (2015) 107–145. doi:10.1007/978-3-642-24794-1.
[8] P. Schrodt, Ö. Yilmaz, The CAMEO (conflict and mediation event observations) actor coding framework, Annu. Meet. …. (2008). http://eventdata.parusanalytics.com/papers.dir/APSA.2005.pdf (accessed December 13, 2016).
[9] A. Pérez, M.D.F. Baonza, B. Villazón, NeOn methodology for building ontology networks: Ontology specification, Methodology. (2008) 1–18. doi:10.1016/j.landurbplan.2011.04.007.
[10] G. Burel, A.E.C. Basave, M. Rowe, A. Sosa, Representing, proving and sharing trustworthiness of web resources using Veracity, Knowl. Eng. Manag. by Masses. (2010) 421–430. http://ekaw2010.inesc-id.pt/accepted_short_papers.html.
[11] C. Keßler, C. Hendrix, The Humanitarian eXchange Language: Coordinating disaster response with semantic web technologies, Semant. Web. 6 (2015) 5–21. doi:10.3233/SW-130130.
[12] S. Liu, D. Shaw, C. Brewster, Ontologies for crisis management: a review of state of the art in ontology design and usability, ISCRAM 2013 - 10th Int. Conf. Inf. Syst. Cris. Response Manag. (2013) 349–359. http://windermere.aston.ac.uk/~kiffer/papers/Liu_ISCRAM13.pdf.
[13] J.G. Breslin, S. Decker, SIOC: an approach to connect web-based communities, Int. J. Web Based Communities. 2 (2006) 133–142. doi:10.1504/IJWBC.2006.010305.
[14] E. Montiel-Ponsoda, D. Vila-Suero, B. Villazón-Terrazas, G. Dunsire, E.E. Rodríguez, A. Gómez-Pérez, Style guidelines for naming and labeling ontologies in the multilingual web, Int. Conf. Dublin Core Metadata Appl. (2011) 105–115. http://dcpapers.dublincore.org/pubs/article/view/3626, http://oa.upm.es/12469/.