COBWEB A quality assurance workflow authoring tool for citizen science and crowdsourced data

Post on 13-Jan-2017


A Quality Assurance workflow Authoring Tool for citizen science and crowd-sourced data.

Didier Leibovici, Julian Rosser, Mike Jackson and the COBWEB project

Nottingham Geospatial Institute, University of Nottingham, UK

• The aim is to bring together the precise, structured, top-down and formal standards-based institutional approach with the low-cost, relevant, rich and timely citizen-focused approach of the crowd, which nevertheless has shortcomings in completeness, precision and interoperability, and often minimal direction.

• Not straightforward: the two perspectives of what constitutes useful, QA’d, fit-for-use data are very different.

Research Objective - to integrate (with QA) authoritative and crowd-sourced data

Crowd Sourcing vs Authoritative Government Data

• ‘Non-systematic’, incomplete coverage vs ‘Systematic’ + comprehensive

• Near ‘real-time’ and ongoing data collection allowing trend analysis vs ‘Historic’ and ‘snap-shot’ map data

• Free ‘un-calibrated’ data but often at hi-res and up-to-the-minute vs Quality assured ‘expensive’ data

• ‘Unstructured’ and mass consumer driven metadata and mash-ups vs ‘Structured’ and defined metadata but often in rigid ontologies

• ‘Unconstrained’ capture + distribution from ‘ubiquitous’ mobile devices vs ‘Controlled’ licensing, access policies and digital rights

• ‘Simple’ consumer driven web services for data collection + processing vs ‘Complex’ institutional survey + GIS applications

A clash of paradigms and Market Dynamics:

Jackson, M. J., Rahemtulla, H. + Morley, J. (2010). “The Synergistic Use of Authenticated + Crowd-Sourced Data for Emergency Response”, Proc, 2nd Int Workshop on Validation of Geo-Information Products for Crisis Management (VALgEO), 11-13/10/10, Ispra, Italy, pp 91-99. http://globesec.jrc.ec.europa.eu/workshops/valgeo-2010/proceedings

mobile data capture & Quality Assurance / conflation

http://cobwebproject.eu

When considering the use of crowd-sourced GI data we need to quality assure it from:

1. A Spatial (geometric) perspective

2. A Thematic (domain attribution) perspective

3. A Temporal (time-related attribution) perspective

And in terms of data quality “Elements” we have to consider:

• Completeness – by area, by class

• Consistency – e.g. topological, semantic, temporal

• Accuracy – relative, absolute

• Usability – fitness for purpose for a particular application or requirement
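As a rough illustration of two of these elements, the sketch below computes completeness by class (omission and commission rates) and an absolute positional accuracy (RMSE) against reference data. The function names, the feature ids, the "balsam" class and the coordinates are all hypothetical, chosen only for the example; this is not the COBWEB implementation.

```python
import math


def completeness_by_class(observed, reference):
    """Return omission and commission rates per class.

    observed, reference: dicts mapping class name -> set of feature ids.
    """
    report = {}
    for cls in sorted(set(observed) | set(reference)):
        obs = observed.get(cls, set())
        ref = reference.get(cls, set())
        omitted = ref - obs      # in the reference but not captured
        committed = obs - ref    # captured but absent from the reference
        report[cls] = {
            "omission": len(omitted) / len(ref) if ref else 0.0,
            "commission": len(committed) / len(obs) if obs else 0.0,
        }
    return report


def positional_rmse(pairs):
    """Root-mean-square distance between observed and reference points."""
    squared = [(ox - rx) ** 2 + (oy - ry) ** 2 for (ox, oy), (rx, ry) in pairs]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical data: feature ids per class, and (observed, reference) points.
observed = {"knotweed": {1, 2, 5}, "balsam": {7}}
reference = {"knotweed": {1, 2, 3, 4}, "balsam": {7}}
report = completeness_by_class(observed, reference)
rmse = positional_rmse([((0.0, 0.0), (3.0, 4.0)), ((1.0, 1.0), (1.0, 1.0))])
```

Usability (fitness for purpose) is not computed here, since it depends on the application consuming the data rather than on the data alone.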

Aspects of Quality

Solution adopted (i)

• “Internal” quality metrics <Completeness, positional accuracy, consistency, etc.> defined by ISO 19157

• “External” consumer quality <fitness for purpose> metrics based on GeoViQua [www.geoviqua.org]

• Stakeholder model QA <data collector’s judgement, trust, reliability> [Meek et al 2014]

Metadata on Data Quality: three models

ISO 19157 (producer model), where DQ_Scope will be “feature”

• DQ_Usability

• DQ_Completeness: DQ_CompletenessCommission, DQ_CompletenessOmission

• DQ_ThematicAccuracy: DQ_ThematicClassificationCorrectness, DQ_NonQuantitativeAttributeAccuracy, DQ_QuantitativeAttributeAccuracy

• DQ_LogicalConsistency: DQ_ConceptualConsistency, DQ_DomainConsistency, DQ_FormatConsistency, DQ_TopologicalConsistency

• DQ_TemporalAccuracy: DQ_AccuracyOfATimeMeasurement, DQ_TemporalConsistency, DQ_TemporalValidity

• DQ_PositionalAccuracy: DQ_AbsoluteExternalPositionalAccuracy, DQ_GriddedDataPositionalAccuracy, DQ_RelativeInternalPositionalAccuracy

Simplified GeoViQua model (consumer model), where DQ_Scope will be “external data”

• GVQ_PositiveFeedback

• GVQ_NegativeFeedback

COBWEB Stakeholder Quality Model, where DQ_Scope will be “volunteer”

• CSQ_Vagueness

• CSQ_Ambiguity

• CSQ_Judgement

• CSQ_Reliability

• CSQ_Validity

• CSQ_Trust

• CSQ_NoContribution
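One way to picture the three models side by side is as a single lookup structure keyed by model, with the scope and element names taken from the lists above. The dictionary layout, the short model keys and the helper functions are assumptions made for this sketch, not the COBWEB metadata schema.

```python
# Element names and scopes come from the slides; the structure is illustrative.
QUALITY_MODELS = {
    "producer": {  # ISO 19157
        "scope": "feature",
        "elements": [
            "DQ_Usability",
            "DQ_CompletenessCommission", "DQ_CompletenessOmission",
            "DQ_ThematicClassificationCorrectness",
            "DQ_NonQuantitativeAttributeAccuracy",
            "DQ_QuantitativeAttributeAccuracy",
            "DQ_ConceptualConsistency", "DQ_DomainConsistency",
            "DQ_FormatConsistency", "DQ_TopologicalConsistency",
            "DQ_AccuracyOfATimeMeasurement", "DQ_TemporalConsistency",
            "DQ_TemporalValidity",
            "DQ_AbsoluteExternalPositionalAccuracy",
            "DQ_GriddedDataPositionalAccuracy",
            "DQ_RelativeInternalPositionalAccuracy",
        ],
    },
    "consumer": {  # simplified GeoViQua
        "scope": "external data",
        "elements": ["GVQ_PositiveFeedback", "GVQ_NegativeFeedback"],
    },
    "stakeholder": {  # COBWEB Stakeholder Quality Model
        "scope": "volunteer",
        "elements": [
            "CSQ_Vagueness", "CSQ_Ambiguity", "CSQ_Judgement",
            "CSQ_Reliability", "CSQ_Validity", "CSQ_Trust",
            "CSQ_NoContribution",
        ],
    },
}

PREFIX_TO_MODEL = {"DQ_": "producer", "GVQ_": "consumer", "CSQ_": "stakeholder"}


def model_for(element):
    """Return the model key an element belongs to, based on its prefix."""
    for prefix, model in PREFIX_TO_MODEL.items():
        if element.startswith(prefix):
            return model
    raise ValueError(f"unknown quality element: {element}")


def new_quality_record():
    """Empty metadata record with one slot per quality element."""
    return {
        element: None
        for model in QUALITY_MODELS.values()
        for element in model["elements"]
    }


record = new_quality_record()
```

A record of this kind could then be filled in progressively as each quality control runs, which fits the idea of quality elements being generated and evolving through the workflow.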

Solution adopted (ii)

• OGC WPS standard which allows access to a repository of processes and services from compliant clients

• A key aspect of the standard is the provision to chain disparate processes and services to form a reusable workflow

• Use of BPMN rather than BPEL for the workflow engine: BPMN excels at modelling processes visually, allowing non-domain experts to communicate and mutually understand their models.

• Configurable workflows - stakeholders able to design a solution to fit use case from a generic set of WPS processes
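The idea of configuring a workflow from a generic set of processes can be sketched with plain Python functions standing in for WPS processes. Everything here is an assumption made for illustration: the process factories, the thresholds, the observation fields and the `position_ok` flag are hypothetical, and a real deployment would invoke remote WPS processes through a workflow engine rather than local functions.

```python
def position_check(max_error_m):
    """Build a QC step that records and tests the reported GPS error."""
    def process(obs):
        quality = dict(obs.get("quality", {}))
        quality["DQ_AbsoluteExternalPositionalAccuracy"] = obs["gps_error_m"]
        quality["position_ok"] = obs["gps_error_m"] <= max_error_m
        return {**obs, "quality": quality}
    return process


def domain_check(field, allowed_values):
    """Build a QC step that tests an attribute against an allowed set."""
    def process(obs):
        quality = dict(obs.get("quality", {}))
        quality["DQ_DomainConsistency"] = obs[field] in allowed_values
        return {**obs, "quality": quality}
    return process


def run_workflow(steps, obs):
    """Chain the steps: each one receives the output of the previous one."""
    for step in steps:
        obs = step(obs)
    return obs


# A stakeholder picks and parameterises generic steps to fit the use case.
workflow = [
    position_check(max_error_m=20.0),
    domain_check("species", {"Japanese knotweed"}),
]
observation = {"species": "Japanese knotweed", "gps_error_m": 8.0}
result = run_workflow(workflow, observation)
```

Because every step has the same input and output shape, steps can be reordered, removed or reused in other workflows without changing their code, which is the property the chaining approach relies on.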

Solution adopted (iii)

• Github used for code repository and open source evolution of solution

• Built on open source implementations of WPS and client libraries (52 North); the BPMN implementation is jBPM, maintained by JBoss; WPS runs on Apache Tomcat and jBPM is deployed on JBoss WildFly

• Full details in “A BPMN solution for chaining OGC services to quality assure location-based crowd-sourced data”, Meek, Jackson, Leibovici (2015) submitted to: Computers and Geosciences

Mike Jackson, 4-5 Nov., 2015, China

The COBWEB QAQC: the 7+ pillars of Quality Controls (QC)

7 pillars of QC and the 7+ cross-pillar QC

QAQC: workflow of QC as WPS

• QAwAT: workflow authoring tool, BPMN encoding

• QAwOnt: composition support, SKOS encoding

• QAwWPS: repository of QCs as WPS
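To give a feel for what a BPMN encoding of a QC workflow looks like, the sketch below generates a skeleton BPMN 2.0 process in which each QC is a service task between a start and an end event. The process id, the target namespace and the QC names are hypothetical, and the output is only the structural skeleton: an engine such as jBPM would need additional attributes to bind each task to an actual WPS call.

```python
import xml.etree.ElementTree as ET

# Standard BPMN 2.0 model namespace.
BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
ET.register_namespace("bpmn", BPMN_NS)


def q(tag):
    """Qualify a tag name with the BPMN namespace."""
    return f"{{{BPMN_NS}}}{tag}"


def build_qc_workflow(process_id, qc_names):
    """Build a linear BPMN process: start -> one service task per QC -> end."""
    definitions = ET.Element(
        q("definitions"), {"targetNamespace": "http://example.org/qaqc"}
    )
    process = ET.SubElement(
        definitions, q("process"), {"id": process_id, "isExecutable": "true"}
    )
    node_ids = ["start"]
    ET.SubElement(process, q("startEvent"), {"id": "start"})
    for index, name in enumerate(qc_names, start=1):
        task_id = f"qc{index}"
        ET.SubElement(process, q("serviceTask"), {"id": task_id, "name": name})
        node_ids.append(task_id)
    ET.SubElement(process, q("endEvent"), {"id": "end"})
    node_ids.append("end")
    # Connect consecutive nodes with sequence flows.
    for index, (source, target) in enumerate(zip(node_ids, node_ids[1:]), start=1):
        ET.SubElement(
            process,
            q("sequenceFlow"),
            {"id": f"flow{index}", "sourceRef": source, "targetRef": target},
        )
    return definitions


# Hypothetical QC names, for illustration only.
xml_text = ET.tostring(
    build_qc_workflow("qaqc_demo", ["LocationQuality", "PhotoQuality"]),
    encoding="unicode",
)
```

In practice the same structure would be drawn in a graphical editor rather than generated by hand; the point of the sketch is that the diagram and the XML are two views of the same workflow.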

Example Workflow from EU COBWEB Project https://cobwebproject.eu

Qualifying the Observations, the Volunteers and the Authoritative data

Quality elements generated & evolving

QC examples: Example of a QA workflow

Design and composition using a graphical tool

QC examples: QAQC workflow Authoring Tool (QAwAT)

QAwAT

Design and composition in Eclipse

Design and composition in the jBPM web editor

Some results on the Japanese knotweed co-design

before QA

Some results (ground truth from photo)

after QA

Rosser J, Pourabdollah A, Brackin R, Jackson MJ, Leibovici DG (2016) Full Meta Objects for Flexible Geoprocessing Workflows: profiling WPS or BPMN? 19th AGILE Conference, 14-17 June 2016, Helsinki, Finland

Leibovici DG, Williams J, Rosser JF, Hodges C, Scott D, Chapman C, Higgins C and Jackson MJ (2016) The COBWEB Quality Assurance System in Practice: Example for an Invasive Species Study. ECSA conference, 19-21 May 2016, Berlin, Germany

Meek S, Jackson M, Leibovici DG (2016) A BPMN solution for chaining OGC services to quality assure location-based crowdsourced data. Computers & Geosciences, 87: 76–83

Leibovici DG, Meek S, Rosser J and Jackson MJ (2015) DQ in the citizen science project COBWEB: extending the standards. Data Quality DWG, OGC/TC Nottingham, September 2015, UK

Leibovici DG, Evans B, Hodges C, Wiemann S, Meek S, Rosser J and Jackson MJ (2015) On Data Quality Assurance and Conflation Entanglement in Crowdsourcing for Environmental Studies. ISSDQ 2015 - The 9th International Symposium on Spatial Data Quality, 29-30 September, La Grande Motte, France

Meek S, Jackson MJ, Leibovici DG (2014) A flexible framework for assessing the quality of crowdsourced data. AGILE conference, 3-6 June 2014, Castellon, Spain

Leibovici DG and Jackson MJ (2013) Copula metadata est. AGILE conference, 14-17 May 2013, Leuven, Belgium

Leibovici DG, Pourabdollah A and Jackson MJ (2013) Which Spatial Data Quality can be meta-propagated? Journal of Spatial Sciences, 58(1): 3-14

Leibovici DG, Pourabdollah A and Jackson M (2011) Meta-propagation of Uncertainties for Scientific Workflow Management in Interoperable Spatial Data Infrastructures. EGU 2011, European Geosciences Union, General Assembly, Vienna, Austria, April 2011

Pawlowicz S, Leibovici DG, Haines-Young R, Saull R and Jackson M (2011) Dynamical Surveying Adjustments for Crowd-sourced Data Observations. EnviroInfo 2011, Ispra, Italy

Leibovici DG and Pourabdollah A (2010) Workflow Uncertainty using a Metamodel Framework and Metadata for Data and Processes. OGC TC/PC Meetings, 20-24 September 2010, Toulouse, France

Jackson M, Rahemtulla H and Morley J (2010) The synergistic use of authenticated and crowd-sourced data for emergency response. International Workshop on Validation of Geo-Information Products for Crisis Management (VALgEO), Ispra, Italy, pp 91-99

Quality Assurance workflow Authoring Tool (QAwAT)

Didier G. Leibovici, Julian Rosser, Mike Jackson and the COBWEB project

Nottingham Geospatial Institute, University of Nottingham, UK

Email: firstname.secondname@nottingham.ac.uk

Thank you!