The Open Data Commons: a new vision for the future of Open Data


description

Authors: J. Marsh, F. Molinari, and R. Stocco. A collection of excerpts from reports and deliverables of the EU's CIP (ICT-PSP) Project "Citadel... on the Move" - www.citadelonthemove.eu - related to the concept of the Open Data Commons. Issues covered include: practical experimentation in four pilot cities, privacy issues, semantics, Open Data governance, and policy.

Transcript of The Open Data Commons: a new vision for the future of Open Data

Page 1: The Open Data Commons: a new vision for the future of Open Data

The Citadel

Open Data Commons

A collection of excerpts from reports and deliverables of

the CIP (ICT-PSP) Project ‘Citadel … On the Move’

Jesse Marsh, Francesco Molinari, and Ricardo Stocco

Alfamicro Lda, Cascais, PT

with contributions from the partners and participants in the Citadel project consortium

www.citadelonthemove.eu

Page 2: The Open Data Commons: a new vision for the future of Open Data

LICENSE

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0

Unported License. https://creativecommons.org/licenses/by-nc-sa/3.0/

ACKNOWLEDGEMENT

The material in this book was produced as a result of work carried out in the context of the

“Citadel… on the move” project, which was partially funded through the EU’s CIP ICT PSP

programme under Grant Agreement n. 297188.

Cover image: http://commons.wikimedia.org/wiki/File:Leipzig_1632_Theatrum_Europaeum.jpg

FURTHER INFORMATION

For the most recent updates on the project, its objectives and results, visit the Citadel website

at www.citadelonthemove.eu.

DISCLAIMER

This project has been made possible by EU co-financing under the CIP (ICT-PSP) programme

2012-2014. However, the opinions expressed here are solely those of the author(s) and do not

represent the official standpoint of any EU institution.

Page 3: The Open Data Commons: a new vision for the future of Open Data


FOREWORD

Citadel… on the Move was an EC funded project under the CIP (ICT-PSP) programme 2012-2014,

with a simple objective: to make Open Data an achievable reality for every city in Europe. By

working with the four pilot cities of Ghent (BE), Issy-les-Moulineaux (FR), Athens (EL), and

Manchester (UK), Citadel developed an easy to use platform that makes it possible for all

governments, especially the small ones that often get left behind, to open their data and unlock

smart city innovation. The tools and the datasets and apps generated (over 500 apps in the

project’s lifetime) are all part of a central concept – the Open Data Commons – that extends the

scope of Open Data from specific city portals to all actors in a Smart City, promoting the active

engagement of citizens and local businesses as well as city departments and agencies to

contribute their own data and build a common data resource for the whole city.

The Open Data Commons is a concept that was developed by Alfamicro, one of the key partners

of the Citadel consortium, and reported in a fragmented way across a range of project

documents. To make the Open Data Commons concept less obscure and popularize it to a

broader international audience, we have assembled in this book some of the key contents

developed over the course of the project. This book therefore briefly presents the Citadel project and the

stakeholder based approach applied to the development and governance of the Open Data

Commons in the four pilot cities. It then describes the Open Data Commons and how it was

implemented in the course of the Citadel project, with a special focus on the semantic

dimension and a specific chapter on the issue of privacy. Finally, the policy implications of the

Open Data Commons and the future of Open Data are briefly explored.

This work has of course been made possible by our interaction with all of the partners of the

Citadel consortium, under the leadership of Geert Mareels of CORVE (BE) supported by Julia

Glidden and the team at the 21c Consultancy of London. The pilot driven approach was

successfully carried out by the City of Ghent, Issy Media of Issy-les-Moulineaux, MDDC of

Manchester, and DAEM of Athens, and supported by the evaluation work of iMinds (BE). The

technical team included Intrasoft and ATC of Greece as well as Derby University (UK), ITEMS

(FR), and V-ICT-OR (BE). Project dissemination was entrusted to the Euractiv Foundation and

coordination to IS-practice, both in Brussels. Part of the work described here was also made

possible by additional funding from the FI-WARE project’s Lisbon Pilot, whose technical team

included FullIT, IPN, and DRI, all from Portugal.

Our thanks go out to all of the dedicated people at these organisations whom we have had

the pleasure of working with over the last few years, as well as the citizens and developers who have all

engaged with us in the design and development of the Citadel Platform. Finally, my thanks go to

the Alfamicro team who have written and compiled this book, as well as to Leonardo Alberto dal

Zovo, who played the essential role of developing the main tools of the Open Data Commons.

Álvaro Duarte de Oliveira, President

Alfamicro Lda

Page 4: The Open Data Commons: a new vision for the future of Open Data


Page 5: The Open Data Commons: a new vision for the future of Open Data


CONTENTS

Foreword .......................................................................................................................... 3

CONTENTS ............................................................................................................................ 5

Index of Figures ................................................................................................................. 8

Index of Tables .................................................................................................................. 9

Definitions used in this Book ............................................................................................ 11

The Citadel… On the Move project ...................................................................................... 15

The Citadel approach to Open Data .................................................................................. 15

Defining the Open Data Commons ...................................................................................... 21

Stakeholder dynamics for Open Data ............................................................................... 21

Experimentation in Pilot Cities ......................................................................................... 28

Roles in the ODGG ................................................................................................................. 29

Reaching ODGG Objectives .................................................................................................... 30

The Open Data Commons at Work ...................................................................................... 33

Operationalisation through Experimentation ................................................................... 33

First issue (2012) .................................................................................................................... 34

Second issue (2014) ............................................................................................................... 36

The Semantic Dimension of the ODC ................................................................................ 38

Standards issues in the first ODC concept ............................................................................. 38

The Emergence of the Converter-AGT model........................................................................ 39

The central issue of semantics ............................................................................................... 41

Semantic convergence in the pilot cities ............................................................................... 42

The ODC as a Semantic Framework ....................................................................................... 43

Privacy and the Open Data Commons .................................................................................. 47

General Framework ......................................................................................................... 47

The Privacy Impact Assessment Framework ......................................................................... 49

Privacy types .......................................................................................................................... 50

Privacy at the Community level ........................................................................................ 53

Periodic surveys of the Pilot Cities ......................................................................................... 53

Analysis of Outcomes ............................................................................................................. 56

Towards a Community PIA ..................................................................................................... 57

Privacy at the App Level................................................................................................... 57

Mapping of Citadel Apps ........................................................................................................ 57

Analysis of Implications ......................................................................................................... 59

Proposal for an App PIA Framework ...................................................................................... 60

Privacy at the Data level .................................................................................................. 63

Privacy in the Open Data Commons ...................................................................................... 63


Analysis of implications .......................................................................................................... 64

Proposal for a licensing mechanism ....................................................................................... 65

Maturity of Open Data Governance ..................................................................................... 71

Capability of Open Data ecosystems ................................................................................. 71

Governance roles ............................................................................................................. 76

Process ................................................................................................................................... 79

Guidelines for completion ...................................................................................................... 81

The Citadel Governance Toolkit ........................................................................................ 82

The ODC and the Future of Open Data ................................................................................. 85

Towards the Semantic Web .............................................................................................. 85

The Citadel Vision: “Territories of Data” ................................................................................ 85

Flattening datasets ................................................................................................................. 86

Semantic relationships in Citadel apps .................................................................................. 89

Back to LOD ............................................................................................................................ 91

The ODC as a Policy Concept ............................................................................................ 93

The ODC in practice ................................................................................................................ 93

Open Data as a public good ................................................................................................... 96

Policy implications .................................................................................................................. 98

Towards a Regional Cloud for a Territory of Data.................................................................. 99

References ........................................................................................................................ 103

Annex I: The ODC Toolkit ................................................................................................... 105

The Citadel Converter .................................................................................................... 105

The Library ........................................................................................................................... 105

The GUI Standalone ............................................................................................................. 106

Portlet for Liferay Portal ...................................................................................................... 107

The PHP Converter Library ............................................................................................. 108

The CitySDK-Citadel conversion script ......................................................................... 108

Annex II: Standards adopted in the Citadel Platform .......................................................... 111

File Formats Mapping .......................................................................................................... 111

File Formats Choices ............................................................................................................ 112

Data Model Choice ............................................................................................................... 112

Points of Interest (POI) Standards ........................................................................................ 113

Geospatial Standards ........................................................................................................... 114

Date and Time Data ............................................................................................................. 114

Sensor and IoT ...................................................................................................................... 114

Metadata .............................................................................................................................. 115

POI Dataset Categories ........................................................................................................ 115

Mobile Application Templates ............................................................................................. 116

Gaps in Existing Standards .............................................................................................. 116

Character Encoding Issues ................................................................................................... 116


POI Issues ............................................................................................................................. 116

Events Issues ........................................................................................................................ 117

Geospatial Issues ................................................................................................................. 117

Annex III: The Citadel Charter ............................................................................................. 119

Preamble ....................................................................................................................... 119

The Malmoe Declaration ..................................................................................................... 119

The Citadel Statement ......................................................................................................... 119

Learning from the Citadel Project ........................................................................................ 120

Towards the Malmoe Objectives ......................................................................................... 121

The Citadel Manifesto .................................................................................................... 122

Vision.................................................................................................................................... 122

Commitments and challenges ............................................................................................. 123

About the Authors ............................................................................................................. 125

Page 8: The Open Data Commons: a new vision for the future of Open Data


INDEX OF FIGURES

Figure 1. Open Data Value Chain .................................................................................................. 21

Figure 2. The Citadel Vision ........................................................................................................... 22

Figure 3. Typologies of innovation ................................................................................................ 23

Figure 4. Mapping of stakeholder domains .................................................................................. 23

Figure 5. Mapping of stakeholder transactions ............................................................................ 24

Figure 6. Citadel additional stakeholder transactions ................................................................... 24

Figure 7. Citadel integrated ecosystem ......................................................................................... 25

Figure 8. Key areas of stakeholder interaction ............................................................................. 26

Figure 9. Outcome of stakeholder interaction .............................................................................. 26

Figure 10. Open Data activity contribution to Citadel objectives ................................................. 27

Figure 11. Pilots’ contributions to Citadel objectives .................................................................... 28

Figure 12. Issues for specification of the Open Data Commons ................................................... 34

Figure 13. The range of functions in the ODC ............................................................................... 35

Figure 14. The vision for the Citadel ODC ..................................................................................... 36

Figure 15. The ODC as a semantic model ...................................................................................... 41

Figure 16. The Semantic Core of the ODC ..................................................................................... 44

Figure 17. The Open Semantic Ecosystem .................................................................................... 45

Figure 18. The ODC in the 'Real World' ......................................................................................... 46

Figure 19: Stepwise PIA process (source: [20]) ............................................................................. 50

Figure 20: Typologies of privacy in Citadel .................................................................................... 51

Figure 21: Stakeholders role in the definition of privacy policies (Citadel members) .................. 54

Figure 22: Stakeholders role in the definition of privacy policies (Citadel non-members) ........... 55

Figure 23: Conditions and purposes of MoU definition ................................................................ 79

Figure 24. A typical relational database structure ........................................................................ 87

Figure 25. A typical GTFS folder unzipped ..................................................................................... 88

Figure 26. A typical CKAN Data Store ............................................................................................ 89

Figure 27. AGT Apps by Month ..................................................................................................... 90

Figure 28. The basic RDF syntax .................................................................................................... 92

Figure 29. The LOD schema for the statue of Einstein .................................................................. 92

Figure 30. Citadel JSON Data Model ........................................................................................... 113

Page 9: The Open Data Commons: a new vision for the future of Open Data


INDEX OF TABLES

Table 1. Comparison of Pilot ODGGs ....................................................................................... 29

Table 2. Governance Roles in Pilot ODGGs .............................................................................. 29

Table 3. ODGG Objectives in Pilot Cities .................................................................................. 31

Table 4. Summary of the three levels of the Citadel PIA framework ....................................... 52

Table 5. Mapping of Citadel Application Templates ................................................................ 58

Table 6. Taxonomy of Data/Application Pairings in Citadel ..................................................... 61

Table 7. Proposed data privacy classification scheme ............................................................. 65

Table 8. Proposal for a Data Licensing Mechanism (based on the CC scheme) ....................... 67

Table 9. Proposal for a Data PIA Framework ........................................................................... 68

Table 10. Open Data Ecosystem capability matrix ................................................................... 73

Table 11. A CMM for LOD (example) ....................................................................................... 76

Table 12. Ecosystem role definition and potential MoU contribution .................................... 77

Table 13. Citadel Charter Approaches ..................................................................................... 83

Table 14. Citadel Common File Formats Mapping Grid ......................................................... 111

Page 10: The Open Data Commons: a new vision for the future of Open Data


Page 11: The Open Data Commons: a new vision for the future of Open Data


DEFINITIONS USED IN THIS BOOK

OPEN DATA

A statement of principles regarding the right of people to access, use and republish certain

datasets as they wish, without any restrictions from copyrights, patents or other IPR regimes. In

Citadel, we mostly consider Open Government Data, which may refer to citizens or enterprises,

based on the common assumption that its content is ‘owned’ by the general public and should

therefore be made freely accessible to everyone for the promotion of transparency and/or the

provision of incentives to economic and social initiatives. In this sense, one of the project goals

has been to make life easier for those public sector organisations holding raw datasets and

willing – but not knowing exactly how – to publish them electronically.

APPLICATION

Any piece of software purposefully developed to perform a certain task, particularly, but not

exclusively, for the benefit of mobile users. In Citadel, we focused on applications using Open

Government Data resources and one of the main project goals has been to provide an easy-to-

use platform that could dramatically simplify the creation of applications even by non-expert

users (the so-called ‘Citizen Developers’). A notable component of the Citadel vision has been

the requirement that application templates be unbundled from the datasets, with the app

accessing the information only when needed, directly through a remote server located

somewhere on the Internet.

API

API stands for Application Programming Interface, and is a software module that allows a

programmer designing an application direct access to another application. In the area of Open

Data, APIs are generally used for an app to fetch data from (or, if the programmer is allowed

access, to write information to) an on-line service such as transport or weather info or a city’s

information services.
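
As a purely illustrative sketch of the fetch pattern described above (it is not taken from the Citadel tools), the following Python fragment builds a query URL and parses a JSON reply. The endpoint https://data.example.org/api/poi, the `category` and `limit` parameters, and the `poi` and `title` field names are all assumptions made for this example; a real service defines its own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only for illustration; not a real Citadel service.
BASE_URL = "https://data.example.org/api/poi"


def build_query_url(base_url, category, limit=10):
    """Compose the request URL for a read-only query."""
    return f"{base_url}?{urlencode({'category': category, 'limit': limit})}"


def parse_poi_titles(payload):
    """Extract titles from a JSON payload shaped like {"poi": [{"title": ...}]}."""
    document = json.loads(payload)
    return [item["title"] for item in document.get("poi", [])]


def fetch_poi_titles(category):
    """Fetch and parse live data (needs network access and a real endpoint)."""
    with urlopen(build_query_url(BASE_URL, category)) as response:
        return parse_poi_titles(response.read().decode("utf-8"))


# Offline demonstration: a canned reply stands in for the network call.
sample = '{"poi": [{"title": "City Hall"}, {"title": "Central Library"}]}'
print(build_query_url(BASE_URL, "culture", limit=2))
print(parse_poi_titles(sample))
```

Keeping the URL construction and the parsing separate from the network call means the app logic can be checked without a live service, which is consistent with the idea of an app that only fetches data when needed.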

COMMUNITY

In Citadel, a stylized representation of the Open Data Community has been provided, composed

of the following actors: Policy Makers, being in charge of the high-level direction and regulation

of the process of opening up and cleansing government owned datasets; Data Providers,

responsible for the creation and management (setup, organization, structuring, cleansing) of

those datasets; Application Developers, usually ICT-savvy companies/individuals but also

‘Citizen Developers’, with the mission of transforming available datasets into ‘human readable’

forms – either products, or services, or both; and Business/Citizen Groups, including not-for-

profit entities and NGOs as well as City travellers and visitors, the ultimate beneficiaries of the

generation, transformation and utilization of datasets according to their respective purposes.

Page 12: The Open Data Commons: a new vision for the future of Open Data


OPEN DATA COMMONS

A repository concept specifically developed in Citadel, where it was designed as a mixed socio-

technical platform showing eco-systemic features (promoting the cooperation between Data

Providers and Application Developers) and acting as an open and public collection of tools and

services that users can navigate, acquire/adopt, and populate as they wish. Its particular name

is due to the simple idea that Open Data should be considered as a common good, in a public

sphere whose stewardship is to the benefit of both public and private stakeholders as well as

individual citizens.

CITIZEN DEVELOPER

In Citadel, the ‘Citizen Developer’ metaphor has been used to identify a particular category of

users of the Open Data Commons. Accordingly, we make a further distinction

between the ‘Citizen as Application Co-Developer’, who improves an existing app provided to

her/him as Open Source code, and the ‘Citizen as Garage Developer’, who uploads to the

project platform a new app built from scratch by him/her, possibly without immediate

business purposes (e.g. a free app).

PRIVACY

A human right enforced by almost all constitutions worldwide, which pertains to the protection

of the personal sphere against the abuses of the State or Law. According to EU Directive

95/46/EC, privacy concerns may emerge wherever personally identifiable information is

collected and stored – in digital form or with partly automated means – and only partial,

improper or non-existent disclosure risk prevention is guaranteed to the “data subject” by the

so-called “data controller” (or “processor”, if acting on data controller’s behalf). In Citadel, we

have made the distinction between three types of privacy: Community, Application and Data

level.

PRIVACY AS A SERVICE

The Privacy as a Service concept [10, 12] refers to a group of security protocols through which

the privacy and legal compliance of user data can be ensured in cloud service architectures. It

can be associated with the provision of feedback to the user on the current risk of personal

information exposure, depending on his/her privacy settings. Together with the Privacy By

Design operational principles, it has guided the first instantiation of Citadel’s Privacy Impact

Assessment Framework (PIAF).

PRIVACY BY DESIGN

Privacy by Design is a concept developed by the former Information and Privacy Commissioner

of Ontario, Dr. Ann Cavoukian, to mitigate the impact on personal data of ICT and large-scale

networked data systems. The objectives of Privacy by Design are, for people, to gain personal

control over their own information and, for organizations, to acquire a sustainable competitive

advantage by taking a positive-sum, not a zero-sum, approach to privacy protection (see footnote 1).

SEMANTICS

Semantics is technically the study of meaning, but in the area of data management it takes on a

more specific sense, referring to the data structures specifically designed to represent

information content. Semantics can thus refer, for instance, to the meaning of column headings

in an Excel table and whether to expect the same information under the headings “name” and

“title”.
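
The “name” versus “title” question can be made concrete with a small sketch. The Python fragment below is an illustration only and does not reproduce the Citadel Converter: the alias table, the target field names, and the two sample city files are assumptions invented for the example.

```python
import csv
import io

# Hypothetical alias table: source column headings mapped to one shared field name.
HEADING_ALIASES = {
    "name": "title",
    "title": "title",
    "lat": "latitude",
    "latitude": "latitude",
}


def harmonise_rows(csv_text):
    """Read CSV text and rename known headings to the shared field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # Unknown headings are kept unchanged, not silently dropped.
        rows.append(
            {HEADING_ALIASES.get(key.strip().lower(), key): value
             for key, value in row.items()}
        )
    return rows


# Two invented datasets that describe the same kind of information
# under different column headings.
city_a = "name,lat\nBelfry,51.0536\n"
city_b = "Title,Latitude\nTown Hall,48.8245\n"

print(harmonise_rows(city_a))
print(harmonise_rows(city_b))
```

After harmonisation both datasets expose the same field names, so a single app template could read either one; deciding whether “name” and “title” really carry the same meaning remains a human judgement that the alias table merely records.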

CITADEL HUB

Available online at http://www.citadelonthemove.eu/en-us/Thehub.aspx, it is a collection of

Open Data, mobile application templates, user extensions and discussions about these. It was

set up in the early stages of the Citadel project, and its contents were later migrated to GitHub

(https://github.com/citadel-eu).

OPEN DATA GOVERNANCE GROUPS

Established in the early project stages, in compliance with the Living Lab approach, they

were informal groups consisting of the key stakeholders from the Open Data Community in each

of the pilot settings.

CITADEL CHARTER

Better referred to as the Citadel Open Data Charter, it was originally conceived of as a formal

protocol (MoU) to be signed to prepare and accompany the governance of data opening

processes. What in fact emerged as a common need for the cities involved in Citadel was to

share a common vision and principles, so the final version of the Charter has taken the

form of a manifesto, in continuation of the Citadel Statement of 2010 on which the project was

originally based.

1 The 7 foundational principles of Privacy by Design are described at:

http://www.privacybydesign.ca/index.php/about-pbd/7-foundational-principles/

Page 14: The Open Data Commons: a new vision for the future of Open Data


Page 15: The Open Data Commons: a new vision for the future of Open Data


THE CITADEL… ON THE MOVE PROJECT

THE CITADEL APPROACH TO OPEN DATA

The Citadel... on the Move project builds on two key political statements of principle that

underpin much of the Smart City movement:

The Malmö Declaration, agreed on 18 November 2009 at the 5th Ministerial

eGovernment Conference in Malmö, Sweden, calling for a new generation of open,

flexible and personalised eGovernment services of administrations at local, regional,

national, and European level.

The Citadel Statement, signed in Ghent on 14 December 2010, aiming to operationalize

the Malmö Declaration with a local government Action Plan based on five key

principles: a common architecture, Open Data, citizen participation, privacy, and rural

inclusion.

The Citadel… on the Move project in many respects carries that process one step further by

implementing these principles in practice, building Open Data-based mobile application

templates for experimentation in four pilot cities across Europe (Ghent, Issy-les-Moulineaux,

Manchester, and Athens) and has since extended engagement to over 120 cities on 5

continents. The results of the project allow all cities, even the smaller villages, to offer public

services on the mobile phones of their citizens and visitors at a low cost. All they need to do is

to publish their data using the Citadel tools and formats and the mobile apps built according to

the same standards will be usable in their municipality.

Citadel thus makes it possible for all municipalities to offer:

• Data to be used by Mobile Apps developed by citizens or companies (local or from other

cities);

• Mobile Apps to their citizens and visitors.

The Mobile Apps are the most visible and concrete services based on open data to make life

easier for people. But of course the cities will have to do part of the work themselves. They have

to publish their data in such a way that it can be picked up by the Applications. This sounds easy, but they

will have to overcome the political, administrative and legal constraints that still slow down

the Open Data movement.

At the same time, policy makers can expect a growing demand from their citizens to

be able to use the same app in their city as they have experienced in a neighbouring city. Just as

the Internet slowly but surely found its way into the public sector twenty years ago,

Citadel supports a similar modernization of public services through the use of Open Data in Mobile

Applications.

In the three years of development and experimentation of the Citadel concept, three key

principles have emerged, which we can fix as strategic guidelines for Smart Cities:

• The new roles of citizen developer and traveller (visitor)

• Standardization processes rather than standards

• Open Data as a Commons.

CITIZEN DEVELOPERS AND TRAVELLERS

The emphasis in the Malmö Declaration and Citadel Statement on citizen empowerment and

participation is finding application in Citadel through concrete approaches that really place

citizens at the centre of the process.

The concept of Citizen Developer means that technology development can no longer be entirely

in the hands of technicians, but must always imagine the products of the technology industry as

“incomplete” artefacts whose construction as technological tools is only finalized by citizens.

This approach builds on a trend that has been in vogue for some time; it was as long ago as

2006 that Time magazine deemed the computer user to be Person of the Year, in recognition of

the increasing role of User Generated Content. Nonetheless, the divide between producer and

consumer has remained intact to date. Citadel’s Citizen Developer instead gains full status as

producer of services and applications.

The way Citadel implements this concept is through the idea of Application Templates. In

common office software, templates are ready-to-go models for standard uses such as a business

letter or a monthly budget, allowing non-experts to add the content and finishing touches.

Citadel provides Templates for Open Data, as modules that carry out the technical work of, say,

accessing a database of events or air quality data and visualizing the information contained

therein. Citizen Developers, or rather those with some familiarity with HTML5, can then piece

these modules together to build mobile applications for their own city, such as an app to advise

those with allergies as to whether or not to attend an open-air concert.
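
The idea of piecing modules together can be sketched in a few lines. What follows is only an illustration, written in Python for brevity rather than in the HTML5 of the Citadel templates themselves. The module names (events_module, air_quality_module), the record fields, the pollen scale and the threshold are all hypothetical, and the values are hard-coded sample data rather than data from any pilot city.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    title: str
    district: str
    outdoor: bool


def events_module() -> List[Event]:
    """Hypothetical template module: would normally read an open events dataset."""
    return [
        Event("Open-air concert", "Centre", outdoor=True),
        Event("Museum night", "Centre", outdoor=False),
    ]


def air_quality_module() -> Dict[str, int]:
    """Hypothetical template module: pollen index per district (made-up 0-10 scale)."""
    return {"Centre": 8, "Harbour": 2}


def allergy_advice(
    events: List[Event], pollen: Dict[str, int], threshold: int = 5
) -> Dict[str, str]:
    """Combine both modules: flag outdoor events in districts above the threshold."""
    advice: Dict[str, str] = {}
    for event in events:
        # Districts missing from the pollen data are treated as index 0 (an assumption).
        risky = event.outdoor and pollen.get(event.district, 0) > threshold
        advice[event.title] = "avoid" if risky else "ok"
    return advice


if __name__ == "__main__":
    print(allergy_advice(events_module(), air_quality_module()))
```

In this sketch the two modules do the technical work of supplying data, while the short combining function is the part a Citizen Developer would write to shape an app for their own city.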

The Citizen Developer is thus not just a technical concept, but a new form of empowerment and

democratization of Internet technologies. Mobile applications can now be designed by the same

people that will use them, rather than devised in far-away research laboratories, and they thus

“belong” to a city and its citizens in a new way. In this sense, Citadel is part of the growing

number of initiatives in the Human Smart City movement2, which envisages future urban

services driven by people more than by the underlying infrastructures, through processes able

to seamlessly blend technical with social innovation.

The second role concept to emerge from Citadel is that of the Traveller. When dealing hands-on

with a mobile application such as a city tour or a guide to local restaurants, the question

emerges as to whom this application is really for. From there, the idea of Traveller takes shape

as a cosmopolitan figure who is neither a total stranger nor a member of a generations-old local

family. In this perspective, we are perhaps all Travellers both in our own cities – whose various

quarters we may know in greater or lesser detail – and across Europe, as the Erasmus

generation builds trans-cultural fluency into an emerging notion of nomadic European

citizenship.

2 http://humansmartcities.eu/

If the author-reader dialogue in Citadel is between citizen developers and travellers, then the

idea of Smart City services can be framed in the concept of hospitality. Open Data becomes a

welcoming gesture on the part of a City Administration, whose role is no longer one of cold

efficiency but as the host who opens the house to familiar and unfamiliar faces alike. Public

sector information becomes a shared asset, the basis on which the dialogue within and between

cities can unfold, with travellers as messengers of other places and experiences offering new

understandings of local urban life that only an external eye can bring.

STANDARDS AND STANDARDISATION PROCESSES

Both the Malmö Declaration and the Citadel Statement call for the adoption of standard

technical formats to facilitate the uptake of Open Data based applications and their

interoperability across different cities and even Member States. Indeed, over the last decades

we have seen how agreement within an industry on a technical standard such as VHS or MP3

can be a determining factor in speeding up innovation and opening up markets. In a sort of

policy paradox however, most attempts to impose such standards “by decree” – such as the

attempts to introduce Digital Rights Management to protect copyrighted material – have failed,

overtaken by rapid technological advances that either bypass top-down constraints or shift the

foundations on which the proposed standards are intended to work, thus making the proposed

standards irrelevant. Nonetheless standards do emerge and they are important; it is just very

risky to try to pick a winner.

Citadel … on the Move addresses this conundrum by shifting the emphasis from standards to

standardization processes. If we look at the history of standards, especially the kind of data and

semantic standards most relevant to Citadel, we see that they tend to first appear in the midst

of a flurry of innovation focused in a specific area, with one or more of the key innovators at

that moment proposing to “clean things up”. The key innovators can be end users or technology

providers, each with a different self-interest in proposing a standard, but each combining their own

concrete needs with an eye to the benefits of scaling up. It is this combination

of healthy self-interest together with an awareness of positive network effects that often leads

to the emergence of a “good” standard.

As innovation literature from Thomas Kuhn onwards shows, the key then lies in part in the

intrinsic quality – even “elegance” – of the proposed standard but also in the credibility of the

body proposing it and the degree to which it is judged to be acting in the general interest and

willing to defend and maintain that standard over time. An important sign is thus the degree of

acceptance of a standards proposal, evident not only through adoption by “lead users” but

above all by the emergence of tools, translators etc. that can adapt non-conformant datasets to

that standard or, conversely, allow access to that standard by a tool originally designed for

another one. This ecosystem of tools often means that two or three standards can be adopted

concurrently with a high degree of interoperability.

Rather than proposing single standards for a given area, Citadel thus prefers to focus on building

“literacy” in standardization processes, for both today’s and tomorrow’s emergent needs.

Indeed, this means learning to clearly identify the area of standardization, search for on-going

activities and standards proposals, search for the richness of the toolkit ecosystem built around

the available options, and evaluate the best strategy in relation to the current landscape.

In short:

• Standards are relevant but cannot be defined top-down

• Standards result from social adoption and technology convergence processes

• Standards tend to define a path towards a coherent vision (i.e. the Web of Data)

• In each area such as file formats, data models, etc., usually a limited set of standards

emerges and prevails

• Tools exist to translate between standards and allow for on-the-fly interoperability

• The key strategy is thus to understand these processes and align standards to practice

• Citadel offers a solution for integrating up-to-date standards at each step.

OPEN DATA AS A COMMONS

Open Data is generally promoted, beyond its technical modularity within the “Web of Data”

vision, as a matter of principle: Open Data is a good thing in terms of a) transparent government

and b) collaboration between the public and private sectors for the creation of services in the

public interest. Open Datasets themselves are in the public domain, as Public Sector Information

(PSI), while it is private businesses who build the applications using PSI. As we will see below,

however, the situation is not as simple as it appears, as a mature Open Data ecosystem includes

a wealth of tools, interfaces and toolkits between the data and the applications, making it

increasingly difficult to draw the line between where the obligations of the public authority end

and the opportunities for individual enterprises begin.

Citadel … on the Move proposes an innovative approach in which the set of all available tools

and services that can be considered as non-specific to either a given dataset or a given

application, is considered as a Commons: a collection of re-useable items that “belong” to the

community. This Open Data Commons (ODC) approach aims to provide the greatest benefits to

data providers and application developers alike, lowering the risk of specific standards decisions

by the former and the investment required for a new application by the latter. In fact, the

required tools that allow the two to interact are likely to be already in the Commons, and if they

are not, the required re-useable part of the new application is developed and “donated” for use

by others in a win-win situation for the developers themselves.

Governance of the ODC is a collaboration between the city administration, citizens and

businesses, and application developers, but it is the city government that ensures that the

process is open and fair. All parties discuss Open Data strategies for the city, i.e. which datasets

to open, possible applications, standards, privacy and security, and so forth, since the ODC

highlights the public relevance of these issues. Indeed, with the Open Data Commons, the role

of the public sector is elevated from mere “data provider” to the stewardship of the collective

interest.

In short:

• Citadel has defined a common space in the public domain as key to the uptake of Open Data

• The Open Data Commons, as the on-going collection of shared tools and resources,

allows datasets to be published and accessed transparently

• It promotes the emergence of standards and the sharing of standards of practice

• It is based on a partnership of the data and development communities

• Governance principles are required to define the ODC structure and nature

• The City Government’s role is to guarantee openness and transparency of governance

DEFINING THE OPEN DATA COMMONS

STAKEHOLDER DYNAMICS FOR OPEN DATA

Over the last decade, several authoritative studies [6, 13, 14] have dealt with the definition of a

value chain for the commercial and non-commercial re-use of Public Sector Information,

including Open Data as a specific case thereof. These attempts were based on a number of

assumptions [17], namely that:

• Enabling technologies such as the Internet and open source software applications are

supporting and enhancing the main value-creating functions;

• Much of the currently expanding re-use activity only started once low-cost ICT applications

and networks became available;

• A positive economic value is actually created out of Open Data / Public Sector Information

reuse, according to a number of relevant business models [8];

• Recent trends in collaborative data and service production between governments and

citizens [15] do not add significant feedback loops to the workflow schematized in the

following Figure:

Figure 1. Open Data Value Chain

In the above representation, four main actors, or stakeholder categories, can be identified, each

closely associated with specific tasks:

• Policy Makers, being in charge of the high-level direction and regulation of the whole

process, with particular regard to Data Providers;

• Data Providers, usually, though not always, public bodies or agencies (such as public utility

companies, statistical offices, chambers of commerce etc.), being responsible for the

creation (setup, organization, structuring) of the open datasets, and sometimes also of

their adaptation and specialisation to the needs of the Application Developers;

• Application Developers, usually ICT companies, sometimes under the control of public

bodies, otherwise acting on the free market, with the mission of transforming the datasets

available into “human readable” forms – either products, or services, or both;

• Business/Citizen Communities, including not-for-profit entities and NGOs, who are

ultimately the beneficiaries of the transformation, generation and utilization of public datasets

according to their respective (business / non-business) purposes.

Activities beyond raw data creation, collection and aggregation that can be relevant to value

creation include, for instance, data processing, editing and packaging, marketing and delivery.

More recently, they have also come to comprise the development of APIs, mash-ups and other forms of

user friendly – if not user generated – content. However, as the following picture shows, the

essence of the Citadel vision is to complicate the previous representation of the value chain by

adding three forms of interaction between the four stakeholder categories introduced before:

a) Data co-production, deriving from the Business/Citizen Communities themselves, as

parallel and additional sources with respect to Data Providers;

b) Application co-design, again reflecting the spirit of freedom and initiative that

characterizes most end user communities;

c) And policy co-creation, as the joint result of the feedback sought by the “smarter”

Policy Makers and received from all of the remaining stakeholder categories, through

a complex process of Living Lab interaction that Citadel development activities aim

to achieve.

Figure 2. The Citadel Vision

As the final outcome of this set of feedback loops and interrelations, two main goals should

be achieved: intelligent policy learning, from the perspective of workflow directors and

regulators; and the creation of additional value from the disclosure of Open Data and the

re-use of Public Sector Information, beyond what could reasonably be guaranteed using the

conventional, one-way logic depicted in Figure 1 above.

The way this outcome becomes feasible can be described as follows. In Figure 3, we add

another relevant analytical dimension to our vision, namely the distinction between

technological and social (including also institutional) innovation. Among the many definitions of

the latter, we would like to adopt the following: innovative solutions and new forms of

organisation and interactions to tackle social issues.

Figure 3. Typologies of innovation

By the combination of the value chain tasks depicted in Figure 1 with the typologies of

innovation introduced above, we can easily locate the four stakeholder groups as per the

following diagram:

Figure 4. Mapping of stakeholder domains

Here, the corresponding value transactions - using the jargon popularized by the Value Network

Analysis paradigm [16] – can be depicted as in the following Figure:

Figure 5. Mapping of stakeholder transactions

One can notice the addition of the “Impact and Requirements” function from the

Business/Citizen Communities to the Policy Makers, so that the linear workflow

outlined in Figure 1 permanently acquires an iterative feature.

However, the contribution of the Citadel project to refining the above vision extends further

than what has been discussed so far. In particular, the operational objectives set out in our

work on Open Data de facto introduce a symmetrical iteration, going counterclockwise as

described by the following scheme:

Figure 6. Citadel additional stakeholder transactions

In this scenario, Policy Makers act as “prime movers” with respect to the Business/Citizen

Communities, in launching and promoting the constitution of Open Data Governance Groups (ODGGs) in the respective Cities

(for now, those that are formal partners of the Citadel consortium; in the future, those that will

adhere to the proposed scheme and play the role of supporting or affiliated partners). This

ensures the definition of the scope, limitations and conditions under which the whole

experiment takes place – including, last but not least, the privacy, confidentiality and

security aspects related to the procedures of Open Data disclosure and Public Sector

Information dissemination.

Within this overall framework, it is desired, and to some extent expected, that the local

Business/Citizen Communities, adequately stimulated and supported, may start defining their

range of expectations, desires, and purposes with respect to specific use cases of

the various applications developed, or to be designed and worked out with the integration of

the public datasets available or to be made available. This backward process, which also

includes the generation of their own datasets, whereby citizens and/or businesses themselves act as

complementary Data Sources with respect to the Public Sector, should positively influence the

strategic behaviour of the Application Developers, who could stay more focused on the

developments that hold the maximum level of utility, usability and social acceptance, instead of

wasting precious resources in a tedious and never-ending process of ex post validation for the

APIs or other ICT applications established meanwhile.

As a by-product of this virtuous interaction between prospective end users and solution

providers, a new range of access and acquisition protocols should also be foreseen, between

the Application Developers and the Public Sector Data Providers. The latter should make

reference to the Policy Makers again, for revised and revamped guidelines concerning pricing

and availability of datasets, in relation to the priorities expressed or “signalled” by the ultimate

beneficiaries.

Although the proposed representation may look oversimplified (as it does not include, for

instance, the cases of user generated or private sector owned datasets, nor does it consider

application developers as capable of achieving social innovation), most of its heuristic value is

given by the juxtaposition of Figure 5 to Figure 6 into a single, integrated ecosystem, as shown

in the picture below:

Figure 7. Citadel integrated ecosystem

This exercise is helpful, in that it identifies four main areas of interaction, with the

corresponding feedback loops:

Figure 8. Key areas of stakeholder interaction

As a result of those interactions, the goals of policy learning and value creation (as per Figure 2

above) should ultimately be achieved.

Figure 9. Outcome of stakeholder interaction

The overarching objective of Citadel is to grow and nurture such an ecosystem providing tools,

methodologies, cases and exploitation opportunities.

In the next two (and final) pictures, we identify the contribution of Citadel Open Data and

development activities, respectively, to the achievement of this objective, through a number of

instrumental and operational reports.

Figure 10. Open Data activity contribution to Citadel objectives

Besides the definition of management rules for the ODGGs, Citadel also dealt with the creation

of Open Data Charters concerning the use of public datasets. Later in the project, Privacy Impact

Assessments were defined to identify the risks for personal/sensitive data related to the

introduction of a culture of openness and transparency in Public Sector Information and Data

handling. In parallel to this effort, the semantic dimension of dataset production and usage was

explored, in order to define a common Semantic Framework. Finally, as a collective space on the

Citadel project website, an Open Data Commons Repository was conceived, in its first instance

as a collection of links to available datasets together with a variety of open source tools

providing for adaptation, refinement, and access to public datasets and application resources.

Most of the above achievements, including the technical developments related to each Citadel

pilot, were ensured by the joint contribution of technical partners and pilot cities. As far as the

latter are concerned, the following diagram summarizes their contribution to the Citadel

objectives, namely:

• Scenario Development, to create a shared understanding of “what makes a City smart”;

• User Requirements Gathering and Technical Development;

• Future Proofing with Geo-Based Technologies;

• Template Mobile Applications Creation, Testing and Review.

The intention is not to describe each of these steps in depth, but rather to highlight the close

interconnections between the Open Data Commons Repository and the Template Applications,

both lying at the crucial point of convergence between Business/Citizen Communities (as the

prime receptors of the commercial/non-commercial value created) and Application Developers

(including the project’s technical partners, as well as third party organizations, including citizens

and NGOs acting under the Web 2.0 / FLOSS logic on the ICT “market”).

Figure 11. Pilots’ contributions to Citadel objectives

EXPERIMENTATION IN PILOT CITIES

At the heart of the Citadel approach was the actual experimentation with its Open Data tools and

concepts in real settings, engaging city administrations with local citizens and developers in four

pilot cities: Ghent (BE), Issy-les-Moulineaux (FR), Manchester (UK), and Athens (EL). In order to

frame the stakeholder-based approach across the four cities, the common idea of an Open Data

Governance Group (ODGG) was put forth as a means of collectively managing Open Data as a

common good.

Pilot experimentation of the ODGG concept took a different path in each of the four cities,

depending on the degree of development of Open Data strategies and engagement with local

developers, as follows:

• For Ghent, Citadel coincided with the launch of a double strategy for Open Data and

constitution of the local Living Lab. Citadel thus helped frame and guide this process as

it rapidly grew; the ODGG constitutes something of a lead-user forum. This explains the

significant number of meetings held throughout the project.

• In Issy, Citadel was aligned with a new strategy definition process, so the ODGG

brought together a range of stakeholders, including both city government

officials and application developers, to define a common strategy together. This

explains the larger composition of the group and the slower process of data publishing.

• For Manchester, Citadel reinforced long-standing Smart City and Living Lab strategies.

There already existed a strong Open Data community in Manchester, so the ODGG

mainly aligns those activities with the work in Citadel.

• In Athens, Citadel helped to define a new Open Data strategy, which, due to the sheer

size and complexity of the city, needed to raise awareness among many institutional

departments as well as citizen stakeholder groups. The ODGG thus worked as the

strategic core group guiding this process, and tended to focus on the required actions

for pilot start-up to deliver concrete results.

The final reports from the pilot cities, following two years of experimentation with the Citadel

tools and platform, confirm the different approaches taken in terms of governance strategies,

while at the same time delivering successful outcomes across the board, as shown in the

following table:

Table 1. Comparison of Pilot ODGGs

City | ODGG members | Events | Total participants | Governance style

Ghent | 11 | 24 | >575 | Tightly technical ODGG with active engagement of developer and end-user communities in numerous events.

Issy | 34 | 7 | 195 | Broad representation in the ODGG, with more selective and structured events.

Manchester | 10 | 5 | >250 | Continuity with on-going Open Data strategy, increasing links with community.

Athens | 8 | 9 | 32 | ODGG composed of key political actors, coupled with direct engagement of active citizen groups.

ROLES IN THE ODGG

Mid-way through the pilot testing, a survey carried out mostly within the project and pilot

communities aimed to obtain a first mapping of what the roles for different actors in the

community should be. The findings for each role (i.e. the functions for which a leading role was

attributed) were as follows:

• Mayor, City Government: defining strategies, promotion, privacy, and evaluation

• City ICT Department: leading role for all activities, namely ODGG coordination

• Public/private data providers: leading role for all except app development and

promotion

• Software companies: leading role for refinement, app development, promotion, and

R&D

• Citizen developers: as above but with a leading role also for evaluation

• User communities: as above but not developing apps

• Citizens and visitors: evaluation

Against this background, we can note the roles actually played by the different actors in the

ODGGs of the pilot cities as the project evolved, as shown in the following table:

Table 2. Governance Roles in Pilot ODGGs

Role | Ghent | Issy | Manchester | Athens

Mayor, City Government | A clear Open Data strategy was already in place with political support. | The Open Data strategy was launched and defined in the course of Citadel. | A clear Open Data strategy was already in place with political support. | Attaining strong political support was a key objective for Citadel in Athens.

City ICT Department | The ICT department coordinated the pilot throughout. | Issy Media already had a strong mandate to promote innovation. | Manchester MDDA with a strong mandate, acting as pilot leader. | DAEM already has a strong mandate to manage innovation.

Public and Private Data providers | Key roles in the ODGGs, data provision became demand-driven over the course of the project. | Issy engaged many city offices (e.g. tourism) but also neighbouring municipalities and multi-level stakeholders (agglomerate and regional levels). | Dealing predominantly with municipal data holders. | DAEM carried out a broad policy of one-on-one meetings and discussions with data holders, based on a demand sparked by citizen engagement. This also included other government portals.

Software companies | Ghent tended to work with SMEs and citizen developers. | Issy has strong connections with the software industry (e.g. Microsoft). | Good representation of development community. | Industrial participation will be defined with the Athens Living Lab follow-up.

Citizen developers | Played an active role in Hackathons and app development. | Engaged through workshops and Issy Media events. | Multiple roles identified and engaged for citizen developers. | Athens in particular encouraged the role of citizen developers.

User communities | Strong role for Open Knowledge Foundation, also for the cultural sector. | Role for schools, urban communities, etc. | User communities mainly engaged through pilot activities. | Citizen community NGOs played a driving role in the demand-driven strategy.

Citizens and visitors | Engaged through Living Lab activities, especially in the open co-design events. | Engaged through testing, but also workshops and conferences. | User communities mainly engaged through pilot activities. | The tourism industry is a key concern for Athens.

REACHING ODGG OBJECTIVES

From the reports, it is also possible to identify the different ways in which the objectives for the

ODGG have been met. These can be synthesized as follows:

• Open up data: the first and foremost objective of the ODGGs was to spark off processes

for opening up datasets.

• Engage with the community: the second objective was to actively involve data owners,

the development community, and local citizens and businesses in Open Data.

• Define a strategy: the final objective for the ODGGs was to enable the community to

identify the best way forward to maximize the value of Open Data for their city.

Table 3. ODGG Objectives in Pilot Cities

City | Open up | Engage | Define a strategy

Ghent | In the context of an Open Data policy already in place, the Citadel ODGG helped reinforce the link between the city and the open development community. | Ghent’s engagement with the community was reinforced through the Citadel co-design events: Ghent pioneered the Apps4Dummies format. | Through the work of the ODGG, Ghent shifted from a data-push to a demand-pull strategy, particularly as regards the cultural sector.

Issy | Issy used the Citadel ODGG to launch its Open Data policy, going from nothing to a significant number of opened datasets. | Issy Media’s existing structures and activity frameworks provided the setting through which to engage citizens and local businesses. | The ODGG is helping Issy carry out an original multi-level strategy involving nearby municipalities, coordinating with national and regional portals.

Manchester | Citadel helped reinforce an already existing Open Data strategy and extend the user base. | The Citadel tools helped bring new actors with fewer technical skills into the picture. | The Manchester Open Data strategy is extended and reinforced by the availability of the tools.

Athens | In a situation of political hurdles and severe austerity, the Athens ODGG approach has been successful in gaining strong political support for Open Data. | The Athens ODGG engaged directly with key government data holders, while at the same time co-designing application scenarios with citizens and community groups to gain bottom-up consensus. | Athens now has a clear, Citadel-driven Open Data strategy that will be sustained by the Athens Living Lab currently being established.

THE OPEN DATA COMMONS AT WORK

OPERATIONALISATION THROUGH EXPERIMENTATION

The Citadel ODC was originally formulated in general terms, as a shared collection of items in

the public space – APIs, transformers, converters, etc. – situated between datasets on the one

hand and applications on the other. Any further definition of the ODC concept required a

definition of the two main interfaces between the ODC and applications on the one hand and

the ODC and datasets on the other. The “higher” the common ODC-App interface (closer to the

single applications), the more “work” the ODC would have to do to comply with the interface

standards but the greater the number of applications that could access the ODC. On the

contrary, the “lower” the interface, the less work for the ODC but the more for the individual Apps; the

same logic goes for the interface between the ODC and the datasets. Once the “level” of each of these

two interfaces was defined, it would then become relatively easy to develop the tools to

connect them.

Such a concept could only be formulated in the context of a Living Lab methodology, since only

through the engagement of real users and real cities in real settings could the most appropriate

“level” for these interfaces be established. In this sense the user groups in the pilot cities

effectively co-designed the ODC. The “upper” interface corresponded to the JSON data model

required by the Citadel templates in the experimentation with local developers. The data model

proved sufficiently flexible, especially since more than one template was available with slightly

different data models, but hardly any datasets were available to be used.

In fact, the large majority of datasets were only available as Excel or CSV files, at best properly

structured and refined. A particular case in point here was Issy-les-Moulineaux, which followed

the data models of the templates but built the datasets by manually entering information into

Excel spreadsheets for feasibility. Further inspiration came from a first prototype Converter built by the Ghent pilot

that transforms parking data into the JSON format compatible with the Parking template.

Building this input from the pilots into the ODC concept basically required two steps: first, the

apps using Citadel templates would need to “unbundle” their data by configuring the template

to read the JSON file from a separate external source. This approach works only if the JSON files

that are remotely accessed are exactly in the form that the Template expects to see. As a

consequence, the second step involved defining a standard tool to convert files from Excel or

CSV into this standard format: the Citadel Converter.

We thus have the two interfaces defined to fit the needs that emerged during the pilot

experiments: at the lower level (dataset-ODC), any well-structured Excel or CSV file; at the

upper level (ODC-application), a standard JSON schema. The ODC thus consists of the conversion

tool or tools (variations on the converter have already been carried out, e.g. to read from

geoJSON or CitySDK databases on the input side and write to alternative data models on the

output side), together with the listing of JSON files that are compatible with one or more

Templates.
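The two interfaces just described can be illustrated with a minimal sketch: a well-structured CSV file on the input side, and a JSON document with a fixed structure on the output side. The column names, the field names, and the top-level "items" key are assumptions made for illustration only; they do not reproduce the actual Citadel data model or the Converter's code.

```python
import csv
import io
import json

# Hypothetical mapping from source column headings to template field names.
# The real Citadel data model is not reproduced here.
COLUMN_TO_FIELD = {
    "Name": "title",
    "Address": "address",
    "Coordinates": "coordinates",
}


def convert_csv_to_json(csv_text, column_to_field=COLUMN_TO_FIELD):
    """Convert a well-structured CSV (one record per row) into a JSON string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        # Keep only the columns the template knows about, renamed to its fields.
        record = {
            field: row[column].strip()
            for column, field in column_to_field.items()
            if row.get(column) is not None
        }
        records.append(record)
    return json.dumps({"items": records}, ensure_ascii=False, indent=2)


sample = 'Name,Address,Coordinates\nTown Hall,1 Main Square,"51.0543, 3.7174"\n'
print(convert_csv_to_json(sample))
```

Any template that expects this structure can then read the output without knowing how the source spreadsheet was laid out, which is the point of fixing the two interface levels.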

FIRST ISSUE (2012)

The preliminary specification of the application templates and the datasets for the pilot cities

also involved some preliminary investigation of the ODC concept, since this constitutes the

interface between the templates and the datasets. Even in the case where a template can

directly access a dataset (a scenario that may work for a given pilot instance but with a very low

level of flexibility and transferability for other uses), the ODC needs to play the role of

interpreting the URI (Uniform Resource Identifier) for the datasets. In the vast majority of cases

however, the ODC will be called upon to carry out more sophisticated functions, such as

filtering, translation, and/or access interfacing.

During the requirements capture phase, the main issue for Citadel was to find the right balance

between dataset formats, the ODC, and the templates. If the ODC is required to do too much,

e.g., in order to have a broad variety of datasets and light templates, then the entire ecosystem

becomes too dependent on the ODC, “betraying” the principle of Open Data. If the ODC aims to

be “lighter” and more efficient, then the dilemma is between templates with a greater

complexity (in order to access datasets with different formats) and placing greater constraints

on datasets, thus increasing the cost of opening up data.

Figure 12. Issues for specification of the Open Data Commons

At this point it was useful to look more closely at the functions that could be included in the

ODC, mainly by looking at the different tools and approaches that currently exist for supplying

Open Data applications with an external dataset. The main approaches can involve:

Direct access, as mentioned above: this can occur in the case that an external dataset

offers data exactly as the application expects it to be formatted and structured;

Plugs or translators, that have the function of translating file formats or re-mapping

data structures to make data fit applications (we can consider some XML translators in

this category).

Data dumps, i.e. mirrored databases constructed for a variety of purposes, such as: a)

storing “translated” datasets as per the above; b) copying a dynamically updated

database (e.g. Meteo) at regular intervals so that external applications avoid

overloading the primary system with queries; or c) other reasons such as security.

One or more APIs can be developed either from the data side (as in the CitySDK project)

or from the application side (as with Pachube or Foursquare) providing interface

functionalities.

Various combinations of the above.

We considered all of these approaches as appropriate to the functional requirements of the

ODC as illustrated below:

Figure 13. The range of functions in the ODC

The interesting fact for Citadel is that the above tools and devices are part of an ever-evolving

ecosystem in which application developers, data owners, and third parties interact. Indeed, they

are indirectly the key drivers of standardisation processes in Open Data, since the emergence of

a standard such as GTFS (General Transit Feed Specification) is generally accompanied by the

production of APIs, translators etc. to help fit that standard, even in the presence of emergent

enhancements to that standard such as real-time-GTFS.

The second noteworthy fact is that most of these tools and devices are freely available in the

public domain if not Open Source. The only exceptions to this are APIs that are paid for as part of

the business model for applications such as Foursquare, but this does not mean that the ODC

cannot list the API and facilitate its adoption, leaving it up to the user to decide whether or not

to adopt a commercial (generally proprietary) format.

This opens an interesting scenario for the ODC within an Open Data Smart City strategy. The

ODC can become the space which manages an ever-evolving ecosystem that as a collection of

tools can be said to define the public space of a city’s information capital. On the one hand, it

makes it possible for the city’s data to be “seen” by a broad range of applications; on the other,

it allows applications easy access to the city’s information capital. Use of the ODC would occur

not by any constriction but by the convenience it offers to data holders and application

developers alike.

In order to maintain this role, ODC cities (through the Open Data Governance Group) would

negotiate with application developers to ensure that the components they develop that can be

said to be of public utility – data access tools that could be re-used by other applications – be

“donated” to the ODC and remain in the public domain, in exchange for access to the city’s data

through the ODC. The community of developers that participate in such an endeavour would

thus have an interest in collaborating to ensure that the different components developed work

together smoothly where appropriate as well as promoting adherence to emergent standards.

This vision for the ODC can be illustrated as follows:

Figure 14. The vision for the Citadel ODC

The above figure also illustrates an additional added value that can be provided by the ODC in

line with this scenario. Since the ODC will be managing the interfacing between application

queries and datasets, it can maintain a record of queries coming from the application side as well as records

of which applications access a given dataset and for what purposes. This allows us to imagine

the introduction of the concept of “bi-directional traceability” of Open Data, which opens up

interesting possibilities for the management of privacy and security issues. In addition, analysis

of the queries and transactions over a given period of time could allow for the identification of

semantic patterns, thus allowing for a bottom-up definition of emergent semantics that could

then be fed back into the definition of appropriate data structures.
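As a rough sketch of what “bi-directional traceability” could look like in practice, the following keeps a record of each access and allows lookups in both directions: from a dataset to the applications that read it, and from an application to the datasets it uses. The class, its method names, and the example URLs are hypothetical and are not part of any Citadel component.

```python
from collections import defaultdict
from datetime import datetime, timezone


class AccessLog:
    """Records which application read which dataset, and for what purpose."""

    def __init__(self):
        self._by_dataset = defaultdict(list)
        self._by_app = defaultdict(list)

    def record(self, app_id, dataset_url, purpose):
        entry = {
            "app": app_id,
            "dataset": dataset_url,
            "purpose": purpose,
            "time": datetime.now(timezone.utc).isoformat(),
        }
        # Store the same entry under both keys so either side can be queried.
        self._by_dataset[dataset_url].append(entry)
        self._by_app[app_id].append(entry)

    def apps_using(self, dataset_url):
        """Direction 1: from a dataset to the applications that read it."""
        return sorted({e["app"] for e in self._by_dataset.get(dataset_url, [])})

    def datasets_used_by(self, app_id):
        """Direction 2: from an application to the datasets it reads."""
        return sorted({e["dataset"] for e in self._by_app.get(app_id, [])})


log = AccessLog()
log.record("parking-app", "https://example.org/parking.json", "map display")
log.record("events-app", "https://example.org/parking.json", "nearby parking")
log.record("events-app", "https://example.org/events.json", "event list")
print(log.apps_using("https://example.org/parking.json"))
print(log.datasets_used_by("events-app"))
```

The stored "purpose" strings are the kind of material that an analysis of emergent semantic patterns, as suggested above, might start from.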

SECOND ISSUE (2014)

The concept for the Converter tool was born during the first phase of Closed User Group testing,

as on the one hand the pilot cities were opening their data mostly as Excel files, while on the

other the Citadel Templates required JSON files based on specific data models linked to each

template. Indeed, the City of Ghent first built a simple tool for this purpose, translating parking

information to the required JSON format. The request therefore arose for a general Converter

that could transform any CSV or Excel file into the required JSON format for use in Citadel.

This idea was taken up as the first step in actual realisation of the Open Data Commons concept

that had been defined in the first year of the project. Indeed, the Converter as described would

be the first of a series of tools bridging the gap between datasets (as they currently stand in 95%

of public administrations, ie. Excel files) and applications built using the Citadel Templates.

By the time the request was formalized and the first UML specifications of possible Converter

workflows had been defined, the pilot cities were waiting anxiously for the Converter in order to

finally begin building apps. An unusual development plan was thus defined for the Converter,

following three main stages in an open sequence that allowed the concept to be tested as early as

possible and then proceed on the basis of user feedback:

A first prototype was realised in less than a month, using php for a server-based

converter. This version only worked with CSV and mapped columns in the original

spreadsheet directly onto the data schema of the JSON format. This version was

released in December 2013 and was an immediate success with the pilot cities.

A second prototype was built using Java, as an off-line tool. This was meant to provide

more stable features and possibly be used for batch processing for files with the same

data structure and/or with constant updating. This version also separated a first phase

of semantic mapping (pairing source column headings with standard field names) from

that of the export schema (matching with the actual fields of the template data model,

and in addition adding necessary metadata such as language, licensing, etc.). This

version was less successful due in part to the large size of the file to download but

mainly because pilot cities preferred the simpler though less sophisticated

conversion of the php version. This version was released in mid-January 2014 but received

little feedback.

Shortly thereafter the final version was released as an on-line Java tool encapsulated in

Liferay. The basic functionalities were essentially the same, and it is this version that

has been gradually improved through interaction with pilot users in the Living Lab

settings. The first step was to add help texts along the way, as well as feedback on

possible errors in the mapping to the export schema. This version was released in time

for demonstration and testing at the Data Days conference in Ghent in February 2014.

Since then, further refinements of the Java code, together with significant upgrades of the

server features, have been carried out with the objective of improving performance, and in fact

the response times have been notably reduced (slow response was another of the reasons why the pilot cities

initially preferred the php version). Following initial user testing, the Converter tool was then

integrated into the Citadel platform. This involved stripping away the user registration of the

Liferay environment in order to allow a smooth passage from the Citadel Hub, within which the

Converter is inserted as a simple iframe. Other enhancements to the Converter have been

carried out in a dialogue with end-users, and are reported in the following section.

As a final note, it is interesting to see that one of the hypotheses of the development plan – ie.

that outside developers would prefer to work with the php version – has been validated by the

recent adaptation to geoJSON of the Converter and other activity on Github. Thus both the php

and the Java versions continue to co-exist, the first as a more open, technical, and experimental

version and the second as a more stable, user-friendly version useful for the front end of the

Citadel platform.

The Citadel Converter’s operational maintenance has mostly been a question of adapting to

continuous user requests for enhancement. The main problem is that the Converter significantly

raises expectations – promising to convert just about anything into an app – while in fact there

are many problems to address, mostly related to problems or inconsistencies with the original

dataset. Addressing these issues has involved a combination of: technical improvements,

accompanying information, and human support.

The technical improvements have mainly been carried out on two fronts: geographical

coordinates and dataset publishing. As for geographical coordinates, the Citadel JSON format

foresees latitude and longitude in one field separated by a comma and a space. Other common

formats (ie. with a comma only as in Google) returned an error message. Work was therefore

carried out to automatically recognize different formats and adjust them where necessary. The

second question is related to the desire to directly save the converted file and publish the

metadata on the Citadel Hub. This issue introduced the option of saving converted files to any

CKAN server, though it also required an API through which to write to the Citadel Index.
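The coordinate handling described above can be sketched as a small normalization step that accepts a few common ways of writing a latitude/longitude pair and rewrites them in the form with a comma and a space. Which input variants the actual Converter recognizes is not stated in the text, so the accepted separators here are assumptions, as is the function name.

```python
import re

# Accepts "51.05,3.72", "51.05, 3.72", "51.05 3.72" or "51.05;3.72".
# The set of accepted separators is an assumption made for this sketch.
_PAIR = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*[,; ]\s*(-?\d+(?:\.\d+)?)\s*$")


def normalize_coordinates(raw):
    """Return 'lat, lon' (comma plus one space) or raise ValueError."""
    match = _PAIR.match(raw)
    if match is None:
        raise ValueError(f"unrecognized coordinate format: {raw!r}")
    lat_text, lon_text = match.group(1), match.group(2)
    # Reject pairs that cannot be a latitude/longitude in decimal degrees.
    if not (-90 <= float(lat_text) <= 90 and -180 <= float(lon_text) <= 180):
        raise ValueError(f"coordinates out of range: {raw!r}")
    return f"{lat_text}, {lon_text}"


print(normalize_coordinates("51.0543,3.7174"))
```

A value that uses a decimal comma (for instance "51,0543") would be misread by this sketch as a pair, so a real tool would need to treat that case explicitly.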

Another aspect is related to providing user information. This has occurred through: presentation

of the Converter so as to lower expectations (raising awareness of the difficulties involved),

explanatory trouble-shooting pages on the Citadel website, improvements in the help texts

accompanying the different phases of conversion, and improvements in the error messages to

make them more understandable. A final aspect, human interaction, has led to a series of

actions that are not within the scope of this book, except perhaps for the preparation of a series

of help sheets and template Excel files used as support tools for the Apps4Dummies

workshops [3].

THE SEMANTIC DIMENSION OF THE ODC

STANDARDS ISSUES IN THE FIRST ODC CONCEPT

It is useful to remember how the actual implementation of the ODC developed as compared to

the original vision. The first idea for the ODC was in fact as an open collection of APIs,

converters, transformers and similar tools in the public domain, with the idea that they might be

somehow configured or chained so as to connect a given dataset with a given application

template.

While this description provided little guidance from an operational point of view, its innovation

consisted in the clear vision of some sort of autonomous space ‘between’ datasets and

applications, considered as a public good. By suggesting that transformation resources should be

pooled rather than selected, the ODC concept implied that it would not be necessary for the

Citadel project to choose a semantic standard. More precisely, it implied that if Citadel were to

define a data model for practical purposes, then it didn’t necessarily have to become a

‘standards proposal’ since other data models could perfectly well co-exist with it in the ODC

space, together with the transformation tools needed to convert towards them.

[3] These were awareness raising events organised by the Citadel consortium in several locations across Europe.

The ODC model was thus presented as a vision, without the specific intention of implementing it

in practice. The goal was rather to use the ODC as a framework capable of guiding the thinking

and actions of the pilot cities. The main objective at this early stage was to see whether in

practice such an autonomous space ‘between’ data and applications did exist and, if so, to

identify where the borders or interfaces above and below this space were and what defined

them.

What did emerge in the first cycle of pilot testing was the ‘emptiness’ of this common space.

Feedback from the pilots noted the significant gap between city datasets on the one hand and

the application templates on the other, which require data to be in a specific JSON format.

Normally, this gap would be bridged by a specific tool such as an API, but the whole idea of the

ODC is to introduce a different concept that, rather than bridge this gap, fills it with elements

that are open and re-useable [4].

In addition, APIs generally work only with on-line relational database services, while the great

majority of the datasets of the pilot cities consist of Excel files. The gap between data and

applications was thus a significant one, and cities found themselves with few applications with

which to use their data (giving them little motivation to open more data), while developers had

little data ready to use with the application templates (giving them little motivation to go

through the complicated process of installing the templates’ client-server configurations). To

begin to overcome this problem, the Ghent pilot devised a simple tool to convert data from one

of the city’s services for parking data to the JSON format required for the parking template. This

was heralded as the first instance of the ODC concept at work, with the spontaneous emergence

of a conversion tool to begin the process, but it actually laid the ground for the further

development of the ODC itself as a more complex system. In the process, what initially appeared

as a question of technical formats (the use of JSON), ultimately emerged as a question of

semantics.

THE EMERGENCE OF THE CONVERTER-AGT MODEL

Evaluation and reflection on this experience and that of the other pilot cities led to the proposal

for the introduction of the Converter-AGT toolkit to bridge the gap between datasets and

templates. The idea was to build on the example of the Ghent converter but in a more

structured way that would work for most of the datasets in all of the pilot cities. Indeed:

The Application Generator Tool (AGT) adopts a generic version of the Citadel data format

(mostly based on the POI template’s data model) to generate apps that read one or

more appropriately structured JSON files and then visualize the POIs in a list or map

visualization.

The Converter, which starts from a row-based Excel or CSV dataset as input (like the ones

most cities had produced, notably Issy-les-Moulineaux), then carries out a mapping of

the column headings to the generic Citadel data model of the AGT (e.g. mapping “Name”

to “Title”) and finally saves the output as a Citadel compliant JSON file.

[4] APIs in fact are in general not conceived as belonging in the common space of the ODC. They are either designed as

an accessory to an application, so that a data service needs to write specific code to feed data to it (for example, Google

Maps or Xively) or they are written for a specific data service, so the application has to have special code to be able to

use each data service’s API. The idea of the CitySDK project (http://www.citysdk.eu/) is to standardize the APIs

associated with common types of data services, but only works with data services and not, for example, with the

spreadsheet type static files that make up most of available public sector information.

This required an important step to be taken as regards the architecture of the Citadel templates.

In the first round of pilot testing, apps were created by selecting the appropriate template,

encapsulating the necessary datasets (after downloading the files), and installing the template

software (with the data inside it) on the server side with the visualization part on the client side.

This procedure, which is normal practice for mobile applications, essentially creates a ‘closed’

system, even though the data was originally downloaded from an ‘open’ portal and structured

according to a common data model.

In order for the AGT to be able to generate an app quickly, it was necessary to separate the data

from the application or application template that reads it, thus ‘externalizing’ the dataset. In this

way, the app resides entirely on the client side (i.e. in a smartphone), the data resides as an

autonomous file on the Internet (i.e. on the Citadel Hub), and the app reads the data in real time

when it needs it. This is more similar to the way an API reads data from an on-line web service,

essentially by connecting to the web service, asking the right questions, and knowing how to

expect the data to be returned and subsequently adapted to the needs of the application. The

difference, however, is that the Citadel JSON file is ‘live’ [5] but already in exactly the format the

application expects to see it: all the application needs is the URL to read the information directly

from the external server, with no further need for an API.
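A minimal sketch of this “externalized” arrangement, seen from the application side, is given below: the app is configured with nothing more than a URL, reads the JSON file when it needs it, and only checks that the file has the shape it expects. The top-level "items" key, the function names, and the example URL are assumptions for illustration; the fetch step is passed in as a parameter so that the sketch can be tried without a network connection.

```python
import json
from urllib.request import urlopen


def fetch_bytes(url):
    """Default fetcher: download the raw bytes behind a URL."""
    with urlopen(url) as response:
        return response.read()


def load_items(url, fetch=fetch_bytes):
    """Read the externalized dataset and return its list of items.

    Assumes a hypothetical top-level "items" list. The app performs no
    conversion; it only verifies the structure it was built to expect.
    """
    data = json.loads(fetch(url).decode("utf-8"))
    if not isinstance(data, dict) or not isinstance(data.get("items"), list):
        raise ValueError(f"{url} does not follow the expected structure")
    return data["items"]


# Offline demonstration with a stand-in fetcher instead of a real server.
def fake_fetch(url):
    return b'{"items": [{"title": "Town Hall"}]}'


print(load_items("https://example.org/poi.json", fetch=fake_fetch))
```

Pointing the same function at a different URL is all that is needed to reuse the app with another city's file, provided that file follows the same structure.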

From a functional standpoint, this means that for every original dataset published by a given

city, there is the need to also store a JSON version of the same file so that the AGT can read the

information from the Internet. This may appear to be an unnecessary proliferation of files, but

there are three important benefits to this new approach, all driven by different aspects of the

Citadel scenario:

A Citadel JSON file can be updated at any time – even live feeds can work – as long as

the URL returns the expected JSON schema following the expected semantic structure.

Any city or user can add a new file respecting the same semantic schema and the

application will be able to read it and use the data, as long as it knows the URL: the

same application can be reused with no changes.

Any application developer can access the available JSON files for any purpose, simply by

knowing in advance what semantic structure to expect: the same data can be reused

with no changes.

[5] By ‘live’ we mean that, rather than having to download a file and then read the information, the application can

directly access the information from the server hosting the JSON file.

THE CENTRAL ISSUE OF SEMANTICS

This new scheme met with an enthusiastic reception from the pilot cities, but from the

standpoint of the ODC a doubt remained: Was this only a way of defining the AGT schema as a

standard data model for Citadel, and thus defeating the principle of the ODC as an open space

without standards? Is it possible to extend the Converter-AGT model to allow for other data

models and other types of applications? The answer to this question lies in the original

template concept.

Each of the Citadel templates in fact is designed to visualize a specific kind of information

(events, parking, etc.) and thus each works with its own data model. We can thus imagine that

different templates or applications can be paired with their own set of JSON files with the data

structured the way they expect to see it, just as the AGT currently works with its own JSON files.

The standards ‘negotiation’ processes can still take place, based on the co-existence of different

data models in different JSON files feeding information to different templates and applications.

The ODC concept is therefore still alive, if we consider the first Converter-AGT toolkit to be just a

first instantiation of it. The interesting thing to note is that in this process the semantic

dimension has become the driving force of the ODC model.

Figure 15. The ODC as a semantic model [6]

The above diagram is in fact based on the above discussion, showing the ODC no longer as a set

of tools but rather as a set of JSON files (the green boxes) linked to different templates and

applications, all stored somewhere on the web and accessible through a URL. The tools are still

there (the dotted line below) but there is an important shift of focus: the common space is now

characterized by the way it accommodates datasets with different semantic models. Indeed, the

figure shows the different types of applications on the top – the generic template of the AGT,

the specific application templates developed in the first months of the project, third-party

templates, and even third-party applications and programs – each of which gets its data from

the datasets in the ODC that have been formatted in the way they expect to see it.

[6] As of October 2013. It should be remembered that this is a conceptual schema that is broader than what the pilot

cities actually tested, which instead had the Converter using only the generic data format required by the AGT.

From an operational standpoint, this approach requires an index function to keep track of which

datasets can be read by which applications, but once the pairing between a template or

application and one or more JSON files has been configured, the job is done [7]. The Citadel Index

in its current configuration only foresees pairings between apps generated by the AGT and

Citadel JSON datasets, so it does not yet reflect the open nature of the ODC model. How this

index function might evolve and scale up is one of the issues for future development of the ODC.
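One way to picture an index function that is open to several data models is sketched below: each dataset is registered with the name of the data model it follows, each application with the data models it can read, and a lookup returns the compatible dataset URLs, optionally restricted to one city. This is an illustrative assumption about how such an index might be organized, not a description of the Citadel Index; all class, method, and model names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    url: str
    city: str
    data_model: str  # name of the semantic structure the JSON file follows


class CommonsIndex:
    """Keeps track of which datasets can be read by which applications."""

    def __init__(self):
        self._datasets = []
        self._readers = {}  # application name -> set of data models it reads

    def register_dataset(self, url, city, data_model):
        self._datasets.append(DatasetEntry(url, city, data_model))

    def register_application(self, name, data_models):
        self._readers[name] = set(data_models)

    def datasets_for(self, application, city=None):
        """List dataset URLs the application can read, optionally for one city."""
        models = self._readers.get(application, set())
        return [
            d.url
            for d in self._datasets
            if d.data_model in models and (city is None or d.city == city)
        ]


index = CommonsIndex()
index.register_dataset("https://example.org/ghent/poi.json", "Ghent", "generic-poi")
index.register_dataset("https://example.org/issy/poi.json", "Issy", "generic-poi")
index.register_dataset("https://example.org/ghent/parking.json", "Ghent", "parking")
index.register_application("generated-app", ["generic-poi"])
index.register_application("parking-template", ["parking"])
print(index.datasets_for("generated-app", city="Issy"))
```

Filtering by city in the lookup is a simplified stand-in for the kind of dynamic pairing between an app and a dataset URL that a discovery function would perform.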

SEMANTIC CONVERGENCE IN THE PILOT CITIES

One of the key hypotheses of the ODC concept is that it provides an open framework within

which user-driven semantic convergence processes occur or, in the words of the project

workplan, “convergence of the use of terms”. Although the pilot cities only tested the direct

Converter-AGT data model in successive versions of the Converter (and not the multiple data

models of the broader schema in the preceding section), the usage of the operational version of

the Citadel Converter in the pilot cities already demonstrated some interesting dynamics.

The first php prototype of the Converter converts a source spreadsheet file directly to the export

schema as required by the AGT (the generic template data model). This achieved its main

objective of ‘bridging the gap’ between Excel spreadsheets and the applications requiring the

JSON, thus allowing pilot cities to get on with their activities of opening datasets and testing

applications. As stated elsewhere, however, the ease with which end users could convert

datasets and make an app with them also allowed them to experience the Open Data paradigm

end-to-end [8].

One of the most evident effects of this was that users could easily see that a poorly structured

spreadsheet with, say, different ways of abbreviating ‘street’ or a missing address, led to either a

failure of the conversion process or visible problems in the final result [9]. People in the pilot cities

– once it was clear to them that the cause of the problems they were experiencing was in the

datasets they themselves produced and not a bug in the Converter software – actually went

back and corrected or revised their datasets until the desired result was achieved.

This ‘educational’ aspect of the Citadel converter was further highlighted in the introduction of

the second version, which divided the conversion process into two steps: ‘semantic mapping’

and ‘export schema’. This new structure was mainly intended to support further developments,

but it also had the effect of raising awareness of the importance of semantics to end users. At

first some complained that it appeared to be a duplication of effort. Then the pilot participants

gradually saw the usefulness of an intermediate step in which the terms they used (‘name’ or

even ‘titre’ for ‘title’) are mapped onto a standardized vocabulary, and that the output format

required for the AGT was just one of many possible ways in which their data could be used, once

the semantic mapping to standardized terms had been carried out.

[7] The exception to this is Discovery, a Citadel function which allows the user to move from city to city and the

application to automatically detect the presence of a new dataset with information relevant to that city, since

Discovery is essentially a dynamic configuration of the App-URL link that takes place through the Index. For the

purposes of this discussion however, the semantics of the underlying data model remain the same.

[8] This may seem obvious but very few civil servants have had the gratifying experience of seeing a dataset they have

published actually used in a mobile application.

[9] ‘Cleaning up’ source data is one of the main costs of traditional Open Data initiatives. The original workplan for

Open Data activities envisaged the engagement of citizen groups in this lengthy process, but the dynamics described

here have proven far more effective.
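The two-step conversion discussed above (‘semantic mapping’ followed by ‘export schema’) can be sketched as two separate functions, so that one mapping can feed more than one output format. The vocabulary, the field names, and the two export formats are hypothetical and only illustrate the separation of the two steps; they are not taken from the Converter itself.

```python
# Step 1: semantic mapping. Local column headings are paired with standard terms.
# This small vocabulary is hypothetical.
SYNONYMS = {
    "name": "title",
    "titre": "title",
    "title": "title",
    "adresse": "address",
    "address": "address",
}


def semantic_mapping(row):
    """Rename the headings of one source row to standardized terms."""
    mapped = {}
    for heading, value in row.items():
        term = SYNONYMS.get(heading.strip().lower())
        if term is not None:  # headings outside the vocabulary are dropped
            mapped[term] = value
    return mapped


# Step 2: export schema. The standardized record is written in the shape a
# given template or application expects; two illustrative shapes are shown.
def export_generic(record, language="en", license_name="CC-BY"):
    return {
        "title": record.get("title", ""),
        "address": record.get("address", ""),
        "metadata": {"language": language, "license": license_name},
    }


def export_label(record):
    return f"{record.get('title', '')} ({record.get('address', '')})"


row = {"Titre": "Mairie", "Adresse": "1 place de la Mairie", "Notes": "n/a"}
record = semantic_mapping(row)
print(export_generic(record, language="fr"))
print(export_label(record))
```

Because the export step only sees standardized terms, adding a further output format does not require touching the mapping of the source headings.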

It is hard to overestimate the impact of this engagement of civil servants, and how building a

sense of ‘ownership’ of a dataset in the person who generates it – generally seen as a source of

trouble and not a resource – can be the best way to ensure quality. The uniqueness of the

Citadel approach is to actively empower the people who create datasets in the first place,

influencing their behaviour by showing the consequences of sloppy data, directly rewarding

good data with a working app, and thus promoting the convergence of behaviour patterns

towards common standards of practice.

Once the Converter had reached a point of stability and was in active use by the pilot cities, it

was useful to explore how it could be adapted to different data models and different applications,

in order to steer the process from the pragmatic Converter-AGT toolkit prepared for the pilots

towards the multiple-standard approach of the ODC model as originally conceived.

These enhancements effectively open up the conversion process to other options such as Open

Street Map, geoJSON, the CitySDK APIs, etc. Since they were developed in the final stages of the

project, they were not fully tested in the pilot cities, though they nonetheless demonstrate the

flexibility of the Citadel approach and the possibility of migrating from the original Converter-

AGT toolkit towards the multi-standard ODC concept that has inspired the developments of the

Open Data Commons throughout the project.

THE ODC AS A SEMANTIC FRAMEWORK

On the basis of these experiences, we can say that through the Citadel piloting and development

processes, the ODC has evolved according to a path that alternated conceptual modelling with

concrete solutions:

First, the ODC was a conceptual model of an open, multi-standard ecosystem.

Next, the ODC became the Converter-AGT toolkit, designed to overcome a specific

problem identified by the pilots.

Finally, a series of alternative data models and conversion scenarios were explored,

framed by the original ODC framework, extending the Converter-AGT toolkit to re-gain

the goal of an open system.

In this process, ODC development was shaped by the semantic issues that eventually defined a

core, ‘a-standard’ (in the sense of not requiring standards) semantic framework that is at the

heart of the model.


Figure 16. The Semantic Core of the ODC

This schema considers that the common space is driven by the presence of three elements:

JSON files with pre-defined semantic structures (provided by one or more applications and registered in the Index);

Converter tools that carry out the necessary semantic mapping from the unstructured CSV files towards the structured JSON files;

CSV files with any semantic structure (namely with whatever choice and sequence of column headings the original user defines);

plus the Index, which keeps track of a) where the CSV files are and where they come from; b) the data models used by applications and the JSON files that conform to them; and c) which converters can be used to produce which data models.
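The interplay between these elements can be illustrated with a minimal Python sketch. It is not the Citadel Converter or Index themselves: the heading map, the "poi" wrapper key, the "poi-v1" model name and the example URL are all hypothetical, chosen only to show how arbitrary CSV column headings might be mapped onto standardized terms and how an index might record sources, data models and converters.

```python
import csv
import io
import json

# Hypothetical mapping from local column headings to standardized terms.
HEADING_MAP = {
    "name": "title",
    "titre": "title",
    "lat": "latitude",
    "lon": "longitude",
}


def convert_csv_to_json(csv_text, heading_map):
    """Map arbitrary CSV column headings onto standardized JSON keys."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        record = {}
        for heading, value in row.items():
            if heading is None:  # surplus cells with no column heading
                continue
            key = heading_map.get(heading.strip().lower())
            if key is not None:  # headings without a mapping are dropped
                record[key] = value
        records.append(record)
    # "poi" is an assumed wrapper key, not the actual Citadel format.
    return json.dumps({"poi": records}, ensure_ascii=False)


# Toy index recording (a) CSV sources, (b) data models with their JSON
# files, and (c) which converter produces which data model.
index = {
    "csv_sources": [{"url": "https://example.org/pois.csv", "publisher": "Example City"}],
    "data_models": {"poi-v1": {"json_files": []}},
    "converters": {"poi-v1": convert_csv_to_json},
}

sample_csv = "Titre,Lat,Lon\nHôtel de Ville,48.82,2.27\n"
json_text = index["converters"]["poi-v1"](sample_csv, HEADING_MAP)
index["data_models"]["poi-v1"]["json_files"].append(json_text)
print(json_text)
```

In this sketch a second data model would simply be another entry under "data_models" with its own converter, which is one way to picture different standards co-existing around the same core.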

The various tools developed above and below this core interact through it, allowing for different

standards to co-exist by providing the semantic framework that matches data models to the

applications that can use them, forming an open semantic ecosystem.

This open ecosystem, upon closer inspection, consists of the very tools that were originally

conceived of as populating the Open Data Commons, considered as the shared space in the

public domain. Indeed, the services and tools shown are all reusable, generic components,

whose semantic interoperability is guaranteed by the fact that they can speak to and get

information from the semantic core.

In addition, this schema fulfils the original idea of the ODC as a ‘negotiation space’ for user-

driven convergence towards standard semantic structures. As stated above, any data model can

be registered in the index together with the conversion tools to create the necessary files. At

the same time, however, users are likely to converge on the data models with a greater number

of JSON files available, so long as they meet their needs. This encourages the development of

user-driven standards, both specific data models for precise requirements (e.g. restaurant

menus) and data models of general relevance (e.g. POIs), with the balance being

gradually defined within this operational semantic framework.


Figure 17. The Open Semantic Ecosystem

The model of the open data ecosystem above contains nearly everything developed to date in

Citadel: what, then, is outside of the Open Data Commons? The fact is, Open Data systems in

the real world are ultimately fed by real (in the sense of not necessarily Open Data) office

systems, files, and services on the one hand, and are used to contribute to the development of

real (in the sense of normal use, not only finding a parking place) applications on the other. This

is easy to forget, since most Open Data discourse to date seems to take place in a separate

world from our daily life of interacting with ICT systems. The end objective of Open Data, at

least in the Citadel perspective, is to become part of the ‘real world’, simply as an efficient way

of addressing interoperability issues when linking different data sources to applications, as

illustrated in the figure on the following page.

This scenario is conceivable only with a massive uptake of the Open Data paradigm, which

Citadel considers to be a possibility enabled by the ODC with its semantic core. At least in the

context of this book, we can say that the definition of the semantic core has been the key

enabling mechanism for unlocking the Open Data Commons, and remains the driving force of

the concept.


Figure 18. The ODC in the 'Real World'


PRIVACY AND THE OPEN DATA COMMONS

GENERAL FRAMEWORK

Citadel reflections on privacy issues commenced in January 2013. At that time, the EC proposal

for a general reform of data protection legislation in Europe was already available¹⁰. This

included a Communication, stating the EC policy objectives, and two legislative drafts: a

Regulation setting out a general EU framework for data protection, and a Directive on

protecting the personal data processed for the purposes of prevention, detection, investigation

or prosecution of criminal offences and related judicial activities. Therefore, our analysis started

by outlining the current legislative setup in the EU as far as privacy protection was concerned,

and followed on by analysing the impact of the upcoming legal provisions on the framework

that informed both the Open Data and the AGT development activities in Citadel. Incidentally,

that analysis is still up to date, as neither the Regulation nor the Directive has entered into force

as yet. The former in particular, in order to become law, has to be adopted by the EU Council of

Ministers using the ordinary legislative procedure (i.e. co-decision mechanism). On this we note

that the Conclusions of the 24-25 October 2013 summit of EU heads of State and

Government committed to a "timely" adoption of the new Regulation, which was supported by

the European Parliament in its plenary assembly of 12 March 2014, with 621 votes in favour, 10

against and 22 abstentions. On the same day, the EP also approved the

Directive with 371 votes in favour, 276 against and 30 abstentions.

Among the key findings of Citadel’s early analysis, two were particularly notable, as they appear

to go in the same direction as the newly proposed legislation:

The existence of compelling socio-economic reasons to speed up the process of opening up

and cleansing government owned datasets, particularly at the local level. Such reasons refer

to the definition of a sustainable value chain for the (commercial and non-commercial)

reuse of Public Sector Information, including Open Data as a specification thereof. In fact,

one of the key aims of the upcoming reform is (among other goals) the provision of a single

legislative framework for businesses and citizens, in order to promote the digital market and

ultimately economic growth in Europe;

The need to tackle privacy protection issues upfront, not as an afterthought: in

particular, the ‘Privacy by Design’ concept means that data protection safeguards should be

built into new open data based applications and services from the earliest stage of

development. This concept is also invoked by the principles of the new legislation. In Citadel, it has

led to the definition of a number of ‘Privacy as a Service’ scenarios for the Open Data

Commons, including the possibility of logging Open Data related transactions as well as the

appropriate management of access rights as a function of specifically designed metadata

structures.
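As a rough illustration of what such a ‘Privacy as a Service’ scenario might involve, the following Python sketch logs every dataset request and derives the access decision from the dataset's metadata alone. The metadata fields ("allowed_groups", "commercial_use"), the function names and the log format are assumptions made for this example and are not taken from the Citadel tools.

```python
from datetime import datetime, timezone

# In-memory transaction log; a real service would persist this.
access_log = []


def can_access(metadata, user_groups, purpose):
    """Decide access from dataset metadata alone (hypothetical fields)."""
    if purpose == "commercial" and not metadata.get("commercial_use", False):
        return False
    allowed = metadata.get("allowed_groups")
    if allowed is None:  # no group restriction recorded
        return True
    return bool(set(allowed) & set(user_groups))


def request_dataset(dataset_id, metadata, user, user_groups, purpose):
    """Log the transaction, whether or not access is granted."""
    granted = can_access(metadata, user_groups, purpose)
    access_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "dataset": dataset_id,
        "user": user,
        "purpose": purpose,
        "granted": granted,
    })
    return granted


club_list = {"allowed_groups": ["bridge-club"], "commercial_use": False}
print(request_dataset("club-members", club_list, "anna", ["bridge-club"], "personal"))
print(request_dataset("club-members", club_list, "acme", ["vendors"], "commercial"))
print(len(access_log))
```

Refused requests are logged as well as granted ones, so that the log can show who attempted to use a dataset and for what declared purpose.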

10 http://ec.europa.eu/justice/newsroom/data-protection/news/120125_en.htm

However, the analysis carried out at the time also showed the existence of a latent tension

between the promotion of the commercial (and/or non-commercial) use of Public Sector

Information and an excessive protection against the risk of personal data disclosure that didn’t

take into account the blurring of traditional distinctions between data holders and collectors,

application providers and final users. Such tension has been epitomized by the so-called ‘Citizen

Developer’ profile, a natural person engaged in the processing of open data with the support of

Citadel application templates. This person is empowered to create, own and control his or her

own datasets and applications, and share them with other participants in the Open Data

Commons on terms that are set and negotiated, as need be. For instance, a local bridge club

member can be incentivised to publish online the list of fellow members, together with their

home addresses, to make it easier to find a location for the next game, through a newly

developed app that mashes up the bridge club list with the official dataset of city parking

facilities. Unfortunately, the same list once made public could be of interest for some IT-savvy

burglar, noting which homes are left unattended on the occasion of the next club meeting.

While possibly trivial as an example, it shows how significant and irreversible the

unwanted consequences of privacy carelessness can be, even in the case of prior approval by personal

data owners, and despite the fact that nobody acted with profit making purposes in this

scenario (except perhaps the burglar). This is, however, anything but an occasional risk in the ‘Citadel

world’: in fact, it was a precise mission of our project to facilitate the realisation of open data

driven applications by non-expert users through the Open Data Commons resources. And the

objection that publishing someone’s home address goes beyond open data is weak, for at least

two reasons: a) that many other public sources may already have disclosed the association

between a certain name and a specific address, and b) that there could be a shared interest in

running the privacy risks of this disclosure, for instance to make known to other potential

members the existence of an active bridge club in the city – with practical venues for playing in

the same neighbourhood – or to invite other IT-savvy people to co-create a mash up service

showing all the ‘right’ PoI’s (Points of Interest) and ‘useful’ connections at hand.

As a matter of fact, due to its seeming irrelevance for digital business, this case is not

considered by the upcoming legislative reform. According to Art. 3 of Directive 95/46/EC, its

provisions do not apply to the processing of personal data “by a natural person in the course of

a purely personal or household activity”. The new Regulation’s Art. 2 repeats the same concept.

For sure, the ‘Citizen Developer’ figure complies with the first part of the definition – being a

natural person – but not necessarily with the second qualification, given the possibility offered

to him/her by the Open Data Commons, either of adding data to an initial City PoI collection, or

of making improvements to an existing application provided by someone else. Paradoxically, if

someone later started to charge (probably little) money for the mash-up service, all

the legal consequences of past privacy carelessness would fall on the last link of the

value chain, although this could also be taken as a (clumsy, yet innovative) example of open

data exploitation for business purposes.

With this case description, we do not intend to imply that Community Law should consider

adding to its scope the activities of ‘Citizen Developers’, but only reinforce the importance of

preventive privacy assessments, rather than corrective or punitive actions. A relatively short-

term scenario, also driven by the likely popularisation of the Open Data Commons, will see a

growing number of applications heavily relying on citizens’ own datasets – if not also on user

generated improvements, according to the Open Source or Living Lab logic. There, the risk of


involuntarily merging relevant (to the new service targeted) with irrelevant personal data is high

and must be considered upfront.

THE PRIVACY IMPACT ASSESSMENT FRAMEWORK

As a first level of response towards fulfilling the need for preventive actions, a PIAF – Privacy

Impact Assessment Framework – has been developed in Citadel. Generally speaking, a PIA is a

managerial process that helps an organisation identify and remove or reduce the privacy risks of

a certain project, initiative or IT system. To this end, the PIA analyzes the way(s) personal data

and information is collected, stored, protected, shared and managed by and within the

organisation. For instance, Art. 33 of the proposed EU Regulation obliges organisations to

conduct a “data protection impact assessment” where processing operations present specific

risks to the rights and freedoms of data subjects. Despite the change in terminology, the

requirement is unequivocal, also in light of the previous Recommendation of May 2009 on

privacy in RFID applications [7] where the EC asked Member States to make sure that “industry,

in collaboration with relevant civil society stakeholders, develops a framework for privacy and

data protection impact assessments. This framework should be submitted for endorsement to

the Art. 29 Data Protection Working Party”. As far as RFID based innovations are concerned, the

Working Party formulated an Opinion on the framework in February 2011, welcoming “the

explicit inclusion of a stakeholder consultation process as part of the internal procedures

needed to support the execution of a PIA”. It also observed that a PIAF should aim to

promote ‘Privacy by Design’, better information provision to individuals as well as transparency

and dialogue with competent authorities [3].

In order to be effective in fulfilling those requirements, a PIAF should allow systematic detection

and monitoring of how the privacy of involved people is affected by the proposed project,

initiative or IT system. Historically, several distinct approaches have been developed and carried

forward in a number of countries, including Australia, Canada, Ireland, New Zealand, UK and US.

In 2002, the Canadian government became the first jurisdiction to make PIA mandatory for

government bodies [19]. The EC-funded project PIAF (A Privacy Impact Assessment Framework

for data protection and privacy rights) reviewed existing PIA methodologies in the above

mentioned countries with the most experience in PIA, identifying the principal similarities and

differences between the different PIA guidance documents and the best practice elements that

a successful PIAF should include [20]. Most of these elements are mentioned in Art. 33 of the EC

Regulation, including the requirement for data controllers to “seek the views of data subjects or

their representatives on the intended processing, without prejudice to the protection of

commercial or public interests or the security of the processing operations”. Such consultation

allows gathering inputs on stakeholder perceptions of the severity of each privacy risk and the

possible measures to mitigate it. This inclusive approach implies that – despite the process level

commonalities, with most PIA handbooks complying with the workflow described in Figure 19

– the outputs and outcomes of individual instantiations of the process may differ

considerably from one another.


Figure 19: Stepwise PIA process (source: [20])

Currently, a PIA standardisation effort is ongoing at the ISO/IEC Joint Technical Committee No.

1/SC 27 ‘IT Security techniques’ [9], but its results will only be made known in 2016 or later. The

ISO/IEC 27002 standard for IT systems security already includes privacy protection. Yet, despite

this claim, it leaves privacy policies and measures unspecified. Therefore, no single agreed PIA

procedure or guideline exists at the moment and the Citadel PIAF only adds to a number of

concurrent methodologies and approaches. In particular, current PIA schemes follow a ‘risk

assessment’ approach, aiming to minimize the risk of privacy breaches and the consequences of

that on a particular organisation. In Citadel, this is complemented by an ‘empowerment’

approach, whereby communities and citizens can have greater control over their own

information, thus also helping to lower the risks for whoever manages it.

PRIVACY TYPES

The original contribution of Citadel to the theoretical and practical debate on PIAs comes from

the distinction between three types of privacy: Community, Application and Data level. These

have gradually emerged from the iterative activities done across the project tasks during the

past couple of years. This distinction becomes relevant in three respects: first, because it is

commonly agreed that a fully functioning PIA should deal with all types of privacy within their

respective scope; second, because with the introduction of an Open Data Commons it is no

longer clear who the data controllers are, to whom the liability for privacy protection should be

attributed by law; third, because the practical measures to embed privacy concerns into the

design vary considerably with the specific nature of the privacy issues tackled.

These three typologies of privacy overlap only in part, as Figure 20 shows:


Figure 20: Typologies of privacy in Citadel

Community level privacy can be defined as the way this concern is perceived and assessed by

the stakeholders potentially affected by it. As the PIA process outlined in Figure 19 documents,

it is essential for data controllers to make sure they understand the distinct interests and

arguments of the people and organisations involved in their community of reference, as far as

the management and the potential risk of disclosure of personal data and information is

concerned. As a matter of fact, the Citadel project since its early stages has promoted the

constitution, in each of the pilot Cities, of the so-called Open Data Governance Groups (ODGGs),

consisting of the key stakeholders interested in the opening up and cleansing of public (normally

local government owned) datasets. One of the key topics of discussion internally to the ODGGs

has inevitably been how to deal with the privacy implications of the utilization of open datasets,

particularly in association with non-profit activities. The resulting Open Data Charter template

is meant to include a specific section on the terms and conditions of privacy protection, with its

contents reflecting the specific outcomes of the thematic discussion in the Cities.

In turn, Application level privacy can be defined as the extent to which applications such as

those experimented in Citadel deal with user information – by either disclosing or protecting it.

Apps can collect significant information about users and their devices, often without their

knowledge or permission. It is quite rare that comprehensive information in clear and plain

language is provided to new users about the features of a given app, what information will be

accessed by whom and how it will be used or to whom it will be disclosed. Merely offering a

single 'Accept' or 'Install' button is unlikely to support valid user consent. In February 2013, the

‘Art. 29 Data Protection Working Party’ formulated an opinion on the security and privacy risks

associated with the use of applications and proposed a set of recommendations to each of the

different players in the marketplace [2]. During the Citadel project, a number of application

templates as well as a generic resource called Application Generation Tool (AGT) were

developed and published on GitHub and the Citadel Hub. Ideally, each of these tools can help

generate innumerable apps (as they already have) with only a few differences across the various


possible datasets, locations and utilisations. Therefore, it makes sense to define the privacy

policy of each application template as well as the AGT.

Finally, Data level privacy is a concept that has been developed during the project as a result of

the reflections and experimentations done as mentioned above. It can be defined as the

qualification of a single data item (or row in a dataset) in terms of its possibility to be safely

disclosed, without generating any harm for the original data owner. Of course, being an

attribute of the single data entry, it cannot be assigned by any other subject than the data

owner, nor can it be changed from ‘public’ back to ‘private’ any more¹¹. Obviously, the specific

attribute of a data item affects the quality of the dataset it belongs to and each transformation

thereof. For example, a JSON file created out of existing CSV or similar with the Citadel

Converter (which is another free tool of the project) should in theory preserve the same data

level privacy attribution as the source of this transformation.
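The rules just described (only the data owner assigns the attribute, a ‘public’ item cannot be made ‘private’ again, and a conversion carries the attribute over) can be sketched in Python as follows. This is an illustration under stated assumptions rather than a description of the Citadel Converter: the DataItem class, its field names and the two privacy labels are hypothetical.

```python
from dataclasses import dataclass

LEVELS = ("private", "public")  # assumed labels for the attribute


@dataclass
class DataItem:
    """One row of a dataset plus its owner-assigned privacy attribute."""
    owner: str
    data: dict
    privacy: str = "private"

    def set_privacy(self, requested_by, new_level):
        if new_level not in LEVELS:
            raise ValueError("unknown privacy level")
        # Only the data owner may assign the attribute.
        if requested_by != self.owner:
            raise PermissionError("only the data owner can set the privacy level")
        # Once disclosed, an item cannot be made private again.
        if self.privacy == "public" and new_level == "private":
            raise ValueError("a public item cannot be made private again")
        self.privacy = new_level


def convert(item, heading_map):
    """Rename fields but carry the privacy attribute over unchanged."""
    mapped = {heading_map.get(key, key): value for key, value in item.data.items()}
    return DataItem(owner=item.owner, data=mapped, privacy=item.privacy)


def publishable(items):
    """Return only the rows whose owners have marked them as public."""
    return [item.data for item in items if item.privacy == "public"]


row = DataItem(owner="alice", data={"titre": "Bridge club"})
row.set_privacy("alice", "public")
converted = convert(row, {"titre": "title"})
print(converted.privacy, publishable([converted]))
```

Keeping the attribute on each row, rather than on the file as a whole, means that a converted dataset can be filtered row by row before publication.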

The following three sections deal separately with these privacy typologies – also in relation to

the prospective impact of the forthcoming legislative reform described earlier in this book.

Taken together, they form the three distinct conceptual elements of the Citadel PIAF, as shown

by the table below. Each of these aspects implies specific governance issues that will be

discussed in the perspective of their being instantiated in the agreements supporting Citadel’s

Open Data Governance Groups. In this way, the PIAF contributes to ongoing PIA standardization

efforts, by further highlighting the communitarian – not only organizational – dimension of

privacy management even in a context like the one of Open Data, that at first sight poses little

(if any) challenges to the protection of personal information.

Table 4. Summary of the three levels of the Citadel PIA framework

Level | Focus | Goal | Issues | Proposal

Community | City leading Open Data Governance Group(s) | City level PIA | Beyond risk assessment in organisations to community stewardship of open data policies | PIA embedded in Open Data Governance Charter as a multi-faceted framework to highlight emergent privacy issues

Application | Professional Developer (or Citizen Developer) | App level PIA | Citadel scenarios leading to multiple authorship and personal data mash-ups with potentially unforeseen outcomes | Open Data Commons, AGT and templates with privacy policy embedded by design (scope for more privacy as a service features)

Data | Individuals generating data items | Data level PIA | Adequate information on who is using personal data and guarantees that individual privacy requirements will be respected | ODC Index based licensing system (explicitly dealing with converted datasets) based on the Creative Commons analogy for supporting “Privacy lifestyle” decisions

11 This statement could soon be reversed, according to the results of the debate on the “right to be forgotten”.


PRIVACY AT THE COMMUNITY LEVEL

PERIODIC SURVEYS OF THE PILOT CITIES

During the project’s lifetime, the key stakeholders of the four pilot cities of Athens, Ghent, Issy-

les-Moulineaux and Manchester have periodically been surveyed by the Citadel consortium, to

gather - in a cost efficient and effective manner - inputs and experiences on open data

publication and exploitation as well as related issues and concerns.

The first survey was launched in April-May 2012 on a platform managed by CORVE and LOLA. It

was based on the principles of the Citadel Statement (http://bit.ly/citadel-statement), which

was presented at the 'Flemish Conference on local eGovernment' on December 14th, 2010 in

Ghent. The Citadel Statement was a call for a European policy supporting cities and

municipalities in their implementation of the eGovernment action plan, which received the

support of numerous organisations across Europe.

Following the analysis of survey results, a number of key performance indicators were

proposed, in order to provide a more precise definition of a viable benchmark for use in the

remainder of the Citadel project. The concerns about Privacy have been addressed in KPI-1

‘Legal and Policy Frameworks’, as the latter can be a catalyst for open data publication and

exploitation against the constraints of privacy and security, and in KPI-4 ‘Scope of Guidance’ for

local policies, which was requested to include: principles, standards, privacy, security, filtering,

data formats, data definitions, repositories, and channels of open data publication.

In January 2014, an online survey was administered to member and non-member organizations

(as well as individual persons) to assess the relevance and progress of governance related issues

in the emergent Open Data scenario. The survey was anonymous. The questionnaire started by

introducing a list of possible actors of the city open data governance system, namely the

following:

Mayor, City government

City ICT Department

Public Data providers

Private Data providers

Software companies

Citizen developers

User communities

Citizens and visitors

and the interviewed panel was asked to select which role was (or should be) most appropriate for

each category, choosing from the following options:

a) Being informed and consulted - the actors are kept up to date on developments and

consulted when general strategies or plans need feedback;

b) Participating in decisions - actors contribute to specific decisions on how to implement

an Open Data strategy;


c) Active, leading role - they are directly involved in the concrete implementation of the

Open Data strategy;

d) Not relevant – none of the above options is applicable to the actor group at hand.

As far as the definition and enforcement of privacy related policies are concerned, the

respondents assigned a crucial role of leadership to the City government and ICT department,

and to a lesser extent to the public and private data providers. As far as the remaining

stakeholders are concerned, while information and consultation is always welcome during the

process, a slightly more participative role was invoked for the ‘Citizen Developers’ only, as the

following diagrams show:

Figure 21: Stakeholders role in the definition of privacy policies (Citadel members)


Figure 22: Stakeholders role in the definition of privacy policies (Citadel non-members)

A third survey was run in conjunction with the Evaluation activity and it was specifically

designed to capture detailed feedback about the Citadel tools and their usage. At that time, the

Citadel tools (Converter plus AGT) were already available and thus the prospects for a much

more open approach to Open Data were also evident. Over 100 people participated in the

experiment, thus providing an adequate statistical base for extrapolating results from the

answers received to the online questionnaire. It is relevant to note here that the opinions

expressed on privacy related issues seemed to differ very much across the interviewed panel.

On the one side, some respondents rightly affirmed that privacy regulations in force have little

or no impact on the process of opening up the public datasets belonging to the pilot cities.

While this may well be true within the Citadel partnership, in other cases

there is evidence of privacy issues and concerns building a sort of psychological barrier against a

faster and more widespread implementation of open government and data principles. The Open Data

Charter has been named as a viable solution to make people aware of possible breaches or

consequences of opening up data.

On the other side, it should be noted that the upcoming reform of EU data protection legislation

reinforces the responsibilities of data and service providers, placing a heavy burden of overhead

for compliance. As was argued in the previous section, this reform seems to endanger the status

of the ‘Citizen developer’ lying at the heart of the Citadel vision, even though natural persons as

such do remain out of the scope of the privacy legislation. Again, the Open Data Charter is seen

as the right place for some ad hoc provisions on privacy management.

More generally, the judgement concerning the current and prospective framework and

guidelines on data protection was mixed. While most respondents were aligned with the

principles and directions of the EU initiative, others thought that the real issues of concern were

not properly addressed, as e.g. the social norms about privacy are shifting over time, and the


exploitation context proposed by the Citadel project, in which published datasets are used by

citizens for non-profit purposes, is worthier of trust than mere commercial exploitation.

A closer look at the findings from the questions in this survey specifically related to privacy, in

the light of the three-level Citadel PIAF, provides an explanation of these apparent

contradictions:

The highest risk people see of not opening data is “missed opportunities for useful

applications and services”. Therefore, people see the value of open data and are expecting

someone to solve privacy issues upfront, as we suggest in the Citadel PIAF.

People want to be able to manage their own data in more detailed ways than foreseen by

EU legislation, for instance allowing access to specific groups (44%), prohibiting commercial

use (42%) and so forth.

People are very concerned about what happens to their private information; fears of

inappropriate disclosure, unacceptable use, and insecure storage are the first three

concerns, all above 60%.

Surprisingly, over 70% of respondents (many of them within city administrations) don’t

know whether or not their City has even published a PIA. Since people are indeed

concerned about privacy, the PIA appears to be considered more as a question of

compliance rather than a process that guarantees what they’re looking for.

A specific question confirms the result of the previous survey, namely that people trust

municipal authorities, and in particular municipal IT departments, more than anyone else, and

do not trust “suppliers to the government” or the “Communication and PR office”.

While City governments may be trusted to manage privacy issues, they should not act on their

own but rather follow guidelines and policies decided upon through public consultation.

ANALYSIS OF OUTCOMES

Alongside the periodic surveys, some discussions on privacy issues can be reported within the

Open Data Governance Groups in the four pilot Cities. Overall, the results in the pilot cities echo

the last external survey: people are concerned about privacy but the PIA looks insufficient to

meet these concerns. However, during the project the contribution of the ODGGs has stayed

well below initial expectations. While everyone agrees that the ODGGs can be defined as a sort

of ‘knowledge broker’ on the challenges and barriers of opening and using government data,

the input received on privacy and data protection issues has been limited. In particular:

1) An Early Stage Approach has been invoked, to integrate best practice around privacy and

data protection right at the start of any City release of Open Data;

2) Trust and Confidence building are deemed essential, although the question remains of how

to create and maintain them in local citizens and businesses;

3) External (Third Party) Control was also requested, for instance by appointing an official

Ethical Advisor to help the local authority oversee privacy and data protection matters.

In addition, as noted above, the initial concept of an Open Data Charter proposed in

the project gradually evolved from a ‘standard protocol’ to be adopted and formally signed by

the interested City stakeholders to an alternative, more flexible structure (akin to a set of

guidelines) that focuses more on the respective roles and contributions of governance group

actors, together with specific indications on how privacy is to be managed.

TOWARDS A COMMUNITY PIA

This evidence – together with the recommendations for an inclusive and consultative

preparation process – leads us towards integrating an on-going Privacy Impact Assessment

exercise into the mandate of Open Data Governance Groups, accompanied by a broader scope

and tools to manage privacy in a more objective and transparent manner (as we shall see in the

following sections). Advantages of so doing include:

The opportunity to ignite a specific discussion on privacy among the ODGG actors to explore

the consequences of data opening and utilisation, and especially to design scenarios where

the Open Data Commons acquires value-adding features such as Privacy by Design or

Privacy as a Service at the community level;

The possibility to establish a more favourable regime in the City, as far as data protection is

concerned, for those datasets and applications which do not fall – or not easily so – within

the provision of extant and forthcoming legislation;

The chance of bringing these aspects to the attention of public decision makers, with even

greater relevance and urgency given the prospective diffusion of Open Data Commons

facilities, such as the Converter facilitating the publication of “own data” by individual

citizens and the App Generator facilitating the “own app” creation scenario for non IT savvy

or expert users.

PRIVACY AT THE APP LEVEL

MAPPING OF CITADEL APPS

As reported above, the project has developed five application templates across four consecutive

iterations. They are available at http://demos.citadelonthemove.eu/ and summarized

in the following table. All the templates are open source and released

under the New BSD (BSD 3-Clause) license [12].

[12] http://opensource.org/licenses/BSD-3-Clause

Table 5. Mapping of Citadel Application Templates

| Template | Athens | Ghent | Issy-Les-Moulineaux | Manchester |
|---|---|---|---|---|
| Find a Parking Lot | http://demos.citadelonthemove.eu/parking-athens/ | http://demos.citadelonthemove.eu/parking-gent/ | | http://demos.citadelonthemove.eu/parking-manchester/ |
| Events in the City | http://demos.citadelonthemove.eu/events-athens/ | http://demos.citadelonthemove.eu/events-gent/ | http://demos.citadelonthemove.eu/events-issy/ | |
| Points of Interest in the City | http://demos.citadelonthemove.eu/pois-athens/ | http://demos.citadelonthemove.eu/pois-gent/ | http://demos.citadelonthemove.eu/pois-issy/ | |
| User Generated Points of Interest | http://demos.citadelonthemove.eu/crowd-sourcing-athens/ | http://demos.citadelonthemove.eu/crowd-sourcing-gent/ | http://demos.citadelonthemove.eu/crowd-sourcing-issy/ | |
| Environmental Data | http://demos.citadelonthemove.eu/environment-athens/ | | | http://demos.citadelonthemove.eu/environment-manchester/ |

In short, ‘Find a Parking Lot’ provides information about the parking lots of a given city. The city

is configured in the back end of the application. The first page of the application presents all the

available parking lots on a map centred on the selected city. ‘Events in the

City’ displays the events of a city in a map or list view and helps users find information on the

types of events they are interested in. The first page of the application presents a map of the city

centre, just as in the parking template. The events are always geolocated, so

those taking place near the city centre are displayed first. ‘Points of

Interest in the City’ is a general application to display any kind of PoI on the map of a chosen

city. Every PoI is assigned to one or more categories, e.g. Museums, Transportation, etc.

The template offers a filtering functionality that uses all the categories of PoIs found in the

given dataset and provides users with a list of checkboxes corresponding to those categories.

‘User Generated Points of Interest’ is a template that provides user-generated PoIs of a given

city and other crowd-sourced information. The city is configured in the back end of the

application. The first page of the application presents a map centred on the

selected city. Users can select different categories of user-generated PoIs to be shown.

Different pin colours represent different categories of PoIs. Finally, ‘Environmental Data’

provides information about the environmental data (and, in the future, traffic and transportation data)

of a given location in the city.
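
The category filter described for ‘Points of Interest in the City’ can be illustrated with a minimal Python sketch. The record layout (`name`, `categories`), the sample PoIs and the function names are assumptions made for illustration only; they are not taken from the Citadel templates, which may structure PoI data differently.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical PoI records: field names and values are illustrative only.
POIS: List[Dict] = [
    {"name": "City Museum", "categories": ["Museums"]},
    {"name": "Central Station", "categories": ["Transportation"]},
    {"name": "Tram Museum", "categories": ["Museums", "Transportation"]},
]


def all_categories(pois: Iterable[Dict]) -> List[str]:
    """Collect every category found in the dataset (one checkbox each)."""
    found: Set[str] = set()
    for poi in pois:
        found.update(poi["categories"])
    return sorted(found)


def filter_pois(pois: Iterable[Dict], ticked: Set[str]) -> List[Dict]:
    """Keep the PoIs that belong to at least one ticked category."""
    return [poi for poi in pois if ticked & set(poi["categories"])]


checkboxes = all_categories(POIS)
museums_only = filter_pois(POIS, {"Museums"})
```

Because the checkbox list is derived from the dataset itself, a dataset with new categories would need no change to the filtering logic in this sketch.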

With these templates at hand, also made available on Github (https://github.com/citadel-eu),

every ‘Citizen Developer’ can potentially create their own mobile applications, linking to the

open public datasets made available by the respective local governments (and communities, as

is the case for user-generated points of interest). As Table 5 shows, at least two pilot

cities have fed each template with (at least simulated)

data.

In addition, to make life easier for the less IT-savvy ‘Citizen Developers’,

the project has delivered an App Generation Tool (AGT), available online at the following URL:

http://www.citadelonthemove.eu/en-us/createanapp/applicationgenerationtool.aspx

At present, over 100 European Cities are subscribed as data publishers in the AGT and over 500

mobile apps have been created in about one year. About one tenth of all apps created are

multi-city, i.e. they concretely demonstrate potential for reuse, and about one in four are

multi-data, i.e. they provide valuable information to their users through data mash-ups.

ANALYSIS OF IMPLICATIONS

The argument that EU legislation on personal data protection is not

applicable to Citadel application developments rests on the fact that, with only one partial

exception (i.e. User Generated PoIs), all the other templates in Table 5 are totally dependent on

government data from different sources. Therefore, the project has been successful in

promoting new and original, if not innovative, ways to exploit the publication of open public

datasets for basically non-commercial purposes by the private sector and particularly the

‘Citizen Developers’.

However, the Citadel vision itself, configuring an active role for people in the development of

applications, as well as in the generation of data (like PoI’s and other crowdsourced

information), introduces scenarios for privacy management that are only partially foreseen by

current and pending legislation.

Based on the periodic reports from the ODGG’s in the Citadel pilot cities, the main privacy risks

to be considered at application level can be listed as follows:

a) Geolocalisation. The User Generated Points of Interest are inevitably georeferenced on

the City map, which can create a feedback loop with personal information.

Additionally, to ensure better performance, each Citadel app template may require the

communication of the exact location of the user, which can considerably facilitate

his or her identification by third parties [13].

b) Shared Access. It is unclear whether the apps developed with the support of the Citadel

AGT should be considered for private use only. The notion of private use might include

the creation of small closed groups (including e.g. friends or relatives), though it is

unlikely that a group registration system would be added in that case. As a matter of

fact, when shared with other users or linked on Web 2.0 communities, the app can

disclose and publicize personal information (on user localization, behaviour and related

aspects) to a broader audience than originally planned or desired [14].

c) Extension in Scope. In principle, the app templates provided for a baseline “100% open

and public datasets” scenario could be profitably reused for other purposes, where

underlying datasets, still owned by third parties, may be private or confidential or just

not authorised for such particular use. A variant of this extension in scope can occur if

and when the user decides to mash up the open and public datasets with his/her own generated

datasets (as in the example of the bridge club above).

[13] Actually, the apps use the geolocation feature of HTML5, which has to do with the browser more than the

app. But the app still has to be authorized to work with it.

[14] Not surprisingly, some Associate cities (e.g. Amsterdam) have asked for "private" ODC spaces.

d) Crowdsourced Data Inputs. Depending on the way an (extended) app is configured,

users can be invited (or tempted) to add their own datasets or to complete/integrate the

data items in the existing ones. This may facilitate personal identification or change

the level of privacy protection of the datasets used.

e) Open Source Software Improvements. The Citadel app templates are already available

on Github. According to the OSS logic, developers can bring improvements to the

existing release(s) under the condition of free redistribution. However, depending on

the nature of the app or template, this can reinforce or reduce the level of privacy

guarantee.

f) Recurrent need for user consent. For the reasons expressed above, it might be

necessary to repeat the request for user consent even after the specific application has

been installed, for instance whenever a new user enters the community or performs specific

actions with and on the app.
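
The recurrent consent idea in point f) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ConsentRegistry` class, the user names and the action labels are hypothetical and do not describe any existing Citadel component.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class ConsentRegistry:
    """Remembers which (user, action) pairs have already been consented to."""
    granted: Set[Tuple[str, str]] = field(default_factory=set)

    def needs_consent(self, user: str, action: str) -> bool:
        # Consent is requested again for any pair not seen before.
        return (user, action) not in self.granted

    def record_consent(self, user: str, action: str) -> None:
        self.granted.add((user, action))


registry = ConsentRegistry()
registry.record_consent("alice", "share_location")

# A user who joins later, or a new kind of action, triggers a fresh request.
ask_bob = registry.needs_consent("bob", "share_location")
ask_alice_again = registry.needs_consent("alice", "share_location")
ask_alice_new_action = registry.needs_consent("alice", "add_poi")
```

Keying consent on the pair (user, action), rather than on installation alone, is one simple way to make the request recur when the community or the usage changes.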

PROPOSAL FOR AN APP PIA FRAMEWORK

Most of the previous problems emerge because of an improper management of the

implications of data and application ownership. The ‘conventional’ approach of data protection

regulators rightly focuses on the clarification of privacy policy from the perspective of mobile

app developers in order to strengthen the information basis in support of user consent.

However, this approach takes the following principles as its starting point:

That app developers are juridical persons, usually profit-making organisations;

That the personal data under scrutiny only belong to the individual user;

That the app features are consolidated and do not change much over time.

As the examples provided above demonstrate, these principles are no longer valid in the Citadel

world, where

The app developers can be natural persons (citizens) or not-for-profit entities such as social

networks or associations;

The datasets used may belong to several owners and be subject to different policies (of

openness and permission to reuse), including no policy at all;

The app features are subject to continuous revision by multiple parties at the same

time, according to the OSS logic.

In the early stages of the Citadel project, a taxonomy of data/application pairings was proposed

in order to make room for such variations on the theme, from the perspective of existing and

upcoming EU legislation on data protection. At that time, we identified nine possible instantiations (use

cases) worth considering in terms of liability for data providers and app developers. That

taxonomy is proposed again here, with modifications, in the following table, where the

use cases of interest have now become eight.

Table 6. Taxonomy of Data/Application Pairings in Citadel

| | Data is stored locally on the app (be it one’s own or a 3rd party’s) | Data is shared with peers in a closed group through a 3rd party’s app | Data is peer shared or made public through one’s own app | Data is made public through a 3rd party app |
|---|---|---|---|---|
| Person owns/produces data | 1. Out of the scope of current and future data protection legislation, as no disclosure occurs | 2. In the scope for the app provider, provided it acts professionally, ‘grey zone’ otherwise | 3. Out of scope, but awareness of ‘citizen developer’ should be promoted somehow | 4. In the scope, if app is professionally run, ‘grey zone’ otherwise |
| Person uses 3rd party data | 5. Out of scope, but common sense should be used (e.g. diligence in custody) | 6. Out of scope for the individual, provided data is for personal use or he/she acts within the group | 7. Out of scope for the individual, within the limitations posed by data owner | 8. Out of scope for the individual, as data captured was already public |

1. Person owns/produces data, which is stored locally on the application. Example: annotating

events in the agenda of a cellular phone or tablet PC. This case (highlighted in green in the original table)

should remain out of the scope of the current and future legislation on data protection, since it

concerns personal use only. It should make no difference whether the app

used was bought from a third party or directly produced by a citizen developer for his/her own

purposes.

2. Person owns/produces data and shares it with peers in a closed group using a 3rd party app.

Example: the person sends an email to a friend or chats on Facebook or another social network

about a certain topic. The contents of this communication are private, but what if the receiver

discloses them without prior consent from the sender? The infringement of privacy would be

certain, but its legal relevance would not (unless a crime is committed due to this disclosure). An early

warning (or a periodic reminder) from the system might be desirable, however, in order to

minimize this risk. From the perspective of the developer of that app (normally a 3rd party, but

there could be some exceptions), since the group is closed, liability for any privacy infringement is

limited if a registration procedure exists that collects user consent before

entry into the system. This is the ‘prior consent’ required by EU Directive 95/46/EC to be formally

and explicitly given by the owner (or ‘subject’) before any data processing.

3. Person owns/produces data and peer shares it or makes it public through his/her own app.

Example: a self-developed application that spreads information about energy consumption of

home appliances for monitoring and benchmarking purposes. Given the full identity between

data owner and application developer, pre-emptive consent to data sharing should be

considered as embedded in the system. However, taking into account the risk that data

publication may lead to third party appropriation for other uses (e.g. commercial) not expressly

authorised by the owner, if not for the commission of crimes, it would be advisable to promote

better awareness of these risks by the user in some way. Here the provisions of current and

future data protection legislation do not apply, but a role could be identified for a specific

service in the Open Data Commons, for instance.

4. Person owns/produces data and makes it public using a 3rd party app. This may be the case

of a service residing in the cloud – for instance, to the benefit of car drivers in a City centre –

that shows the respective locations in order to promote e.g. the exchange of parking spaces by

those going in and out of the town. Disclosing a seemingly trivial piece of information like one’s own car

plate number can have unwanted consequences, which should be the object of prior caveats

and informed consent. However, an alternative action would be to ‘embed’ Privacy as a Service

within the application, so that each user can select the acceptable level of personal data

protection in relation to the scope and purposes of the service required.

5. Person uses 3rd party data, which is stored locally on the application. If private information

belonging to any third party is stored on local devices only, then we might presume it is only for

the individual use of that person. Example: a standard phone directory or address book in

someone’s cellular phone or tablet PC. While the case is not relevant for the current and future

legislation on data protection, it may become of interest for criminal law if a fraudulent use

materializes. Without going that far, common sense recommendations like adding a password

and regularly changing it can be appropriate, to minimize the risk of involuntary disclosure.

6. Person uses 3rd party data, which is peer shared in a closed group. Example: information

that is acquired on Facebook. This can be reused without limits in the same context (eg.

Facebook) but some behavioural rules (Netiquette?) should possibly be adopted. Use of this

information only for personal purposes (provided they are legal) is also allowed. Another

question might be whether the disclosure of such information outside the borders of the group

would be allowed. Here the answer would certainly be negative.

7. Person uses 3rd party data, which is peer shared or made public by the data owner through

an application developed by him/her. Example: a searchable repository of knowledge provided

to registered users (peer shared) or to the general public by the owner of that knowledge.

Depending on the limitations posed by the data owner, reuse may be free or subject to

conditions. However, the provisions of data protection legislation may not apply.

8. Person uses 3rd party data, which has been made public on 3rd party applications. Example:

downloading information from a repository of open datasets. Here the public nature of

information used is already clear from the start. Therefore, no infringement of privacy law could

be foreseen.

The two ‘grey zones’ identified in points 2 and 4 above refer to the case of the citizen developer,

who most probably lacks the status of a juridical person and is thus difficult to bring with

certainty within the scope of application of privacy legislation. Here two contrasting interests

are also visibly at work: one is to promote the transformation of this spontaneous initiative into a real

business with potentially huge returns; the other is to avoid that, by the time this happens, it is

too late to protect individual users against a voluntary or involuntary disclosure of the embedded

personal information.
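
The eight use cases of Table 6 can also be read as a simple lookup structure. The following Python sketch encodes them under stated assumptions: the short keys (`own_data`, `owner_app`, and so on) and the two-value `Scope` enumeration are hypothetical simplifications introduced here, and the nuances attached to each case in the text (e.g. "common sense should be used") are deliberately left out.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Scope(Enum):
    OUT_OF_SCOPE = "out of scope"
    GREY_ZONE_POSSIBLE = "in scope if professionally run, grey zone otherwise"


# Keys are hypothetical short labels for the rows and columns of Table 6.
TAXONOMY: Dict[Tuple[str, str], Tuple[int, Scope]] = {
    ("own_data", "stored_locally"): (1, Scope.OUT_OF_SCOPE),
    ("own_data", "closed_group_third_party_app"): (2, Scope.GREY_ZONE_POSSIBLE),
    ("own_data", "owner_app"): (3, Scope.OUT_OF_SCOPE),
    ("own_data", "public_third_party_app"): (4, Scope.GREY_ZONE_POSSIBLE),
    ("third_party_data", "stored_locally"): (5, Scope.OUT_OF_SCOPE),
    ("third_party_data", "closed_group_third_party_app"): (6, Scope.OUT_OF_SCOPE),
    ("third_party_data", "owner_app"): (7, Scope.OUT_OF_SCOPE),
    ("third_party_data", "public_third_party_app"): (8, Scope.OUT_OF_SCOPE),
}


def classify(data: str, channel: str) -> Tuple[int, Scope]:
    """Return the use-case number and scope for a data/application pairing."""
    return TAXONOMY[(data, channel)]


def grey_zone_cases() -> List[int]:
    """List the use cases where a citizen developer may fall in a grey zone."""
    return sorted(
        number
        for number, scope in TAXONOMY.values()
        if scope is Scope.GREY_ZONE_POSSIBLE
    )
```

For example, `classify("own_data", "closed_group_third_party_app")` returns use case 2, and `grey_zone_cases()` returns the two cases discussed in the paragraph above.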

By identifying several cases as ‘out of scope’ of the proposed EU legislation, we do not intend to

imply any shortcoming of the proposed reforms nor propose any extension of the normative

framework. On the contrary, Citadel holds that the above dilemma requires a ‘soft’ regulation to

be designed in the context of the principles of the Open Data Commons and the multi-level

PIAF, namely by linking the ‘out of scope’ issues at the application level to the other two

levels identified:

At the Community level, by collectively reviewing app developers’ privacy policies and how

they are implemented on a regular basis, with more frequent and clear reminders of privacy

risks to app users, in the context of local ODGG’s PIA exercises.

At the Data level, by allowing application developers to have greater certainty as regards the

privacy implications of datasets they are accessing. This is the subject of the following

section.

PRIVACY AT THE DATA LEVEL

PRIVACY IN THE OPEN DATA COMMONS

As argued in the previous section, it seems quite unlikely that the ‘data

protection vs. valorisation’ dilemma can be fully solved through more accurate handling of application-level

privacy within a Community PIA exercise. While valuable, this approach needs to be

complemented by a mechanism through which to embed privacy concerns in the Open Data

Commons by design.

As it has been conceived and offered to European stakeholders’ attention, the Open Data

Commons is not meant to be (or become) a single, giant information silo where someone,

typically a ‘Leviathan’ acting as cloud service provider, collects and stores resources with a

view to future utilization by registered or even unregistered ‘customers’. It is rather a

new kind of network or a data and application ecosystem, in which all actors are ‘peers’,

interacting on a level playing field and where every connection – be it person-to-person, person-

to-business, or business-to-business – is peer-to-peer, i.e. with no ‘middleman’ in between.

In this ecosystem, no single entity has full access to or stores all the data, which is only linked to

and from the location of its original source(s). Additionally, individuals are empowered to

create, own and control their own datasets and applications, and to share them with other

participants in the Open Data Commons on terms that they set and negotiate, as need be.

Embedding privacy by design in the Open Data Commons (ODC) implies making sure that a list

of ‘Privacy as a Service’ features are baked in from conception, possibly including the following:

1. Data Identification: Every single data item (and as a consequence, any dataset containing it)

should always be associated with its source and ownership [15]. Every dataset transformation (e.g.

converting, merging, purging, cleansing, etc.) should be able to preserve this attribution and

communicate it transparently to ODC actors.

[15] Two fields of metadata that are already mandatory in the Citadel Index.

2. Real time Updating: Every time the source system adds, deletes, or changes a record in a

dataset, this should in theory be immediately reflected in the availability of the new,

transformed dataset on the ODC. How this translates into practice will need to be seen in future

developments, but the issue assumes a specific urgency in the light of privacy considerations.

3. Data Classification: The owner of each dataset (data item) should always be able to classify it

as ‘open and public’ or ‘private and restricted in use’ according to a certain licensing

mechanism. A facility on the ODC should help users associate the most recently updated

license with each dataset (data item) and view it before usage [16].

4. Data Anonymization: Another facility on the ODC should enable data owners to anonymize a

dataset, cleansing it of any reference to specific individuals or organisations during a conversion

process. This can be implemented in future versions of the Citadel Converter.

5. Transparency and Control: Every ODC actor should be entitled to make real time searches on

the log files, discovering who has inspected, appropriated or transformed any datasets

available. In the longer term, this should make it possible to assess whether any ownership

rights have been violated illegally or without justification.
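
Two of the features listed above, Data Identification and Data Anonymization, can be illustrated with a minimal Python sketch. It rests on assumptions: the `DataItem` class, its field names and the sample records are hypothetical, and removing named fields is only a naive stand-in for anonymization, which in practice requires considerably more care.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class DataItem:
    values: Dict[str, str]
    source: str  # where the item comes from
    owner: str   # who owns it


def merge(*datasets: List[DataItem]) -> List[DataItem]:
    """Merge datasets; every item keeps its own source and owner."""
    merged: List[DataItem] = []
    for dataset in datasets:
        merged.extend(dataset)
    return merged


def anonymize(dataset: List[DataItem], personal_fields: Set[str]) -> List[DataItem]:
    """Drop fields that refer to individuals; attribution metadata is kept."""
    return [
        DataItem(
            values={k: v for k, v in item.values.items() if k not in personal_fields},
            source=item.source,
            owner=item.owner,
        )
        for item in dataset
    ]


# Illustrative records only.
city_pois = [DataItem({"name": "Town Hall"}, source="city portal", owner="City")]
crowd_pois = [
    DataItem({"name": "Street art", "reported_by": "alice"},
             source="crowdsourcing app", owner="alice")
]

combined = merge(city_pois, crowd_pois)
cleaned = anonymize(combined, personal_fields={"reported_by"})
```

In this sketch the attribution travels with each item rather than with the dataset, so merging or cleansing cannot silently lose it.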

Given the extremely varying nature of the linked datasets and the early maturity stage of the

ODC implementation, it can be more productive to stay focused on process level innovations

such as those above [17] as a contribution to the roadmap for future development.

ANALYSIS OF IMPLICATIONS

At present, the ODC ‘prototype’ environment as developed within the Citadel project is made

up of two main components:

A collection of online datasets – i.e. accessible via their URL – that contain data in a format –

i.e. platform and data structure – that can be accessed remotely by at least one template or

third party application without any further conversion;

A unique Index that includes a listing of a) online datasets converted as described, including

where and how they can be accessed, and b) the relevant template and application data

formats that are compatible with the files.
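
The two components just described, remotely accessible datasets and an Index listing them together with compatible formats, can be sketched as a small data structure. The Python below is an illustration under stated assumptions: the class and field names, the format labels and the example URLs are hypothetical and do not reproduce the actual Citadel Index schema. The only detail taken from the text is that source, ownership and a license are mandatory metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class IndexEntry:
    url: str          # where the dataset can be accessed
    data_format: str  # hypothetical format label, e.g. "poi-json"
    source: str
    owner: str
    license: str

    def __post_init__(self) -> None:
        # Reject entries that lack any of the mandatory metadata fields.
        for name in ("url", "source", "owner", "license"):
            if not getattr(self, name):
                raise ValueError(f"missing mandatory field: {name}")


@dataclass
class Index:
    entries: List[IndexEntry] = field(default_factory=list)
    # Template name -> data formats it can read without further conversion.
    template_formats: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, entry: IndexEntry) -> None:
        self.entries.append(entry)

    def datasets_for(self, template: str) -> List[IndexEntry]:
        formats = self.template_formats.get(template, [])
        return [e for e in self.entries if e.data_format in formats]


index = Index(template_formats={"pois": ["poi-json"], "parking": ["parking-json"]})
index.register(IndexEntry("https://example.org/pois.json", "poi-json",
                          source="city portal", owner="City", license="CC BY"))
index.register(IndexEntry("https://example.org/parking.json", "parking-json",
                          source="city portal", owner="City", license="CC BY"))
poi_datasets = index.datasets_for("pois")
```

The Index stores only links and metadata, which matches the idea that data stays at the location of its original source.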

In the future roadmap outlined in Citadel, the ODC (together with additional enhancements to

the Index) offers a distinctive opportunity to fulfil the requirements of embedding ‘Privacy as a

Service’ into an Open Data ecosystem.

First and foremost, basic logging has been implemented for the Converter and the Index, so

that key events are tracked such as the registration of new datasets (by whom and how),

accesses by templates, configurations used, etc. Further enhancements of these features

can considerably promote the transparency and control requirement, provided that all

actors can effectively access the log information contained therein.

Secondly, recent work with the Citadel Converter has explored features that contribute to

the requirements listed above: besides allowing the ‘ad hoc’ transformation of multiple file

formats into JSON files, compatible with one or more data models, on-the-fly conversion

scripts have also been tested. This can lead to enhancements that will allow the Converter

to directly access an original data source (or a regularly copied data dump) and only save

the configuration info to the ODC, thus better preserving the integrity of attribution.

[16] A CC license field is also mandatory in the Citadel Index, at the dataset level.

[17] As compared to exploring privacy implications of different kinds of information, i.e. financial, health, etc.

Thirdly, the initial project policy based on developing application templates has been

considerably altered by the extremely good performance of the Application Generation

Tool, which has allowed the generation of dozens of original applications but is also

limited (in its current version) to a single data model that includes all fields from the

PoI, parking, and event templates. An alternative roadmap would be to develop and

store new application templates, able to visualize types of datasets that go beyond the

Citadel application scenarios. This can significantly contribute to realizing the open and

interoperable vision of the ODC in an incremental fashion.

Finally, the Citadel platform – the first common project space where both datasets and

application templates have been made publicly available – will continue being open and

accessible after the end of funded activities. This allows anyone to publish

georeferenced datasets and immediately visualise them on a map on their mobile phone.

As already demonstrated by the Associate city outreach activity, this has great potential for

motivating the publication of datasets beyond those of the project partners. However, the

functionalities of data classification and (if required) anonymization need to be enhanced.
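
The first two points in the list above, event logging and format conversion that preserves attribution, can be illustrated together. The Python sketch below is not the Citadel Converter: the function names, the in-memory log, the metadata keys and the sample CSV are all assumptions made for illustration.

```python
import csv
import io
import json
from datetime import datetime, timezone
from typing import Dict, List

# Hypothetical in-memory event log; a real service would persist it.
EVENT_LOG: List[Dict[str, str]] = []


def log_event(actor: str, action: str, dataset: str) -> None:
    """Track who did what to which dataset, and when."""
    EVENT_LOG.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "dataset": dataset,
    })


def convert_csv_to_json(csv_text: str, metadata: Dict[str, str], actor: str) -> str:
    """Convert CSV rows to JSON, carrying the metadata over unchanged."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    log_event(actor, "convert", metadata["source"])
    return json.dumps({"metadata": metadata, "items": rows})


def search_log(**criteria: str) -> List[Dict[str, str]]:
    """Return the log entries matching all the given field values."""
    return [e for e in EVENT_LOG if all(e.get(k) == v for k, v in criteria.items())]


# Illustrative input only.
CSV_TEXT = "name,lat,lon\nTown Hall,51.05,3.72\n"
METADATA = {"source": "city portal", "owner": "City", "license": "CC BY"}

converted = convert_csv_to_json(CSV_TEXT, METADATA, actor="alice")
```

Since the metadata is copied verbatim into the output, the converted file carries the same source, owner and license as the original, and `search_log(actor="alice")` lets any actor see that the conversion took place.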

PROPOSAL FOR A LICENSING MECHANISM

We conclude this section by formulating a technical proposal for a data classification scheme

and a licensing mechanism. Its purpose is to provide a framework for qualifying data sensitivity

upfront, i.e. right at the moment a data item is created by the respective owner. This proposal

provides the foundation for establishing protection profile requirements and use allowances for

each class of data, irrespective of the application that handles or transforms it.

Table 7. Proposed data privacy classification scheme

| Protection level | Consequences of disclosure | Sample data (non-exhaustive list) | Required action(s) |
|---|---|---|---|
| 0 (Limited or none) | Information intended for public access | City PoIs; Official statistics; Environmental data | Acknowledge source |
| 1 (Moderate) | Association with personal identity of data owner(s) | Geolocalisation; Personal contacts; Unique device identifiers | Ask permission |
| 2 (Contractual) | Non-compliance with terms and agreements set out by data owner(s) | License keys; Electronic subscriptions; Access logs to services | Pay against usage |
| 3 (Sensitive) | Disclosure of sensitive personal information about data owner(s) | Health records; Political opinions; Sexual orientations | Anonymize |
| 4 (Confidential) | Privacy breach, likely criminal offense | Legal files; Credit card data; Commercial records | Destroy |
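
Table 7, combined with the rule proposed in this section that a dataset or mash-up inherits the highest protection level of its parts, lends itself to a short Python sketch. The names `ProtectionLevel`, `dataset_level` and `mashup_level` are hypothetical, and treating an empty dataset as level 0 is an assumption of this sketch rather than part of the proposal.

```python
from enum import IntEnum
from typing import Dict, Iterable


class ProtectionLevel(IntEnum):
    LIMITED = 0
    MODERATE = 1
    CONTRACTUAL = 2
    SENSITIVE = 3
    CONFIDENTIAL = 4


# Required action for each level, as listed in Table 7.
REQUIRED_ACTION: Dict[ProtectionLevel, str] = {
    ProtectionLevel.LIMITED: "Acknowledge source",
    ProtectionLevel.MODERATE: "Ask permission",
    ProtectionLevel.CONTRACTUAL: "Pay against usage",
    ProtectionLevel.SENSITIVE: "Anonymize",
    ProtectionLevel.CONFIDENTIAL: "Destroy",
}


def dataset_level(item_levels: Iterable[ProtectionLevel]) -> ProtectionLevel:
    """A dataset takes the highest protection level among its data items."""
    # Assumption: an empty dataset is treated as level 0.
    return max(item_levels, default=ProtectionLevel.LIMITED)


def mashup_level(dataset_levels: Iterable[ProtectionLevel]) -> ProtectionLevel:
    """A mash-up takes the highest level among the datasets it uses."""
    return max(dataset_levels, default=ProtectionLevel.LIMITED)


items = [ProtectionLevel.LIMITED, ProtectionLevel.SENSITIVE, ProtectionLevel.MODERATE]
full = dataset_level(items)

# Deleting the sensitive item yields a new dataset with a lower level.
reduced = dataset_level(lvl for lvl in items if lvl is not ProtectionLevel.SENSITIVE)
action_for_full = REQUIRED_ACTION[full]
```

Using an ordered enumeration makes "highest level wins" a one-line comparison, and the same function applies whether items are added by a person or by an application.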

The way this scheme should work is the following:

At data item level, the system (most likely a ‘Privacy as a Service’ resource in the ODC, like

the aforementioned Index) should enable a user to add a metadata field, akin to a Creative

Commons license [18], that clarifies the extent to which it will be possible to copy, distribute,

and make some use of that data item, either non-commercially or for business-related

purposes;

At dataset level, the data item with the highest protection level should determine the whole

dataset’s qualification and classification. However, it should become possible to delete

some data items in order to create a new dataset with a lower protection level. In other

words, the classification is confined to the data; it does not extend its scope to any dataset

created with this data, provided the data itself is not manipulated;

At application level, the risk of a privacy breach is reduced to zero if specific logic is

added to the system which assigns to e.g. data mash-ups the highest protection level among

all the datasets used for that specific purpose, asking e.g. for the permission of the user only if and

when necessary (of course, all the caveats and provisions of the data protection legislation

should remain valid);

At community level, five events would be particularly relevant and interesting to explore:

1. What happens after the data owner has e.g. given permission to use that data in a

certain context. For sure, the licensing metadata should trace this circumstance. But is

this enough to authorize future reuse, in other contexts than the former? Here the

answer should be “probably not”. However, it would be hard to prevent unauthorised

reuse once e.g. someone’s phone number has been published on the web for the first

time. Therefore we might think of a lighter approach, which allows free reuse of a data

item (for instance, by lowering protection level from “1” to “0” permanently) under the

only condition that the owner should be informed whenever that data is used again.

2. What happens if a third party (like another user) manipulates a dataset by adding new records that are coherent with it? In this case, each additional data item should bear its own metadata, and again the protection level of the dataset would match the highest protection level of any single data item it contains. The same applies if the addition is made by an application rather than a human being. At least in principle, an algebra of data transformations could be devised for that case: e.g., if a new data item is the sum of two others, the result should automatically bear the protection level of the more protected addend.

3. Of course, converting a dataset into a different format should not change its predefined privacy attribution. For example, a JSON file created from an existing CSV (or similar) file with the Citadel Converter should keep the same data protection level as the original source.

4. What if there is a mistake in a dataset proposed by someone, which is then corrected by someone else? The case is not that relevant per se, because it should be treated like the previous cases in this list (with the new data item bearing the licensing metadata attributed by its owner), but rather as an example of the conflict between false positives and false negatives, which severely affects the world of data19. Probably the best way to solve a potential impasse is to keep track of all versions of a given dataset, in order to allow recovery from “mistakes in correcting mistakes”.

18 For more information see https://creativecommons.org/licenses/?lang=en

19 By false negative we mean a data item discarded because it seems wrong while it is true, and by false positive a data item considered true while it is wrong. For a discussion of related issues in the world of big

5. Could a user change his/her mind and modify the attribution of a certain data item from

a previous, lower level to a higher level of protection? While the answer is certainly

“yes” in principle, this would prove impossible to do in practice after the protection has

been set at “0” for the first time. Following this train of logic, we might infer that once

the user has received sufficient advice for informed consent to data publication, this

decision should be presented as irreversible before collecting his/her approval on it. The

lesson learnt from this case is that EU privacy regulators should reinforce the procedure

of information delivery prior to user consent collection.
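The propagation rules discussed in the cases above (highest level wins, deletion lowers a dataset's level, a sum bears the level of its most protected addend) can be illustrated with a minimal sketch. It is not part of the Citadel tooling: the `DataItem` class, the function names and the integer scale from 0 (freely reusable) to 4 (confidential) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DataItem:
    value: float
    level: int  # assumed scale: 0 = freely reusable, 4 = confidential


def dataset_level(items: Iterable[DataItem]) -> int:
    """The most protected item sets the level of the whole dataset."""
    return max((item.level for item in items), default=0)


def drop_above(items: Iterable[DataItem], ceiling: int) -> List[DataItem]:
    """Create a lower-protection dataset by deleting items, never by editing them."""
    return [item for item in items if item.level <= ceiling]


def add(a: DataItem, b: DataItem) -> DataItem:
    """One rule of the 'algebra': a sum bears the level of its more protected addend."""
    return DataItem(a.value + b.value, max(a.level, b.level))


def mashup_level(datasets: Iterable[Iterable[DataItem]]) -> int:
    """A mash-up takes the highest level among all the datasets it uses."""
    return max((dataset_level(d) for d in datasets), default=0)


readings = [DataItem(10.0, 0), DataItem(5.0, 3), DataItem(2.5, 1)]
print(dataset_level(readings))
print(dataset_level(drop_above(readings, 1)))
print(add(readings[0], readings[1]).level)
```

Because every rule reduces to taking a maximum, the order in which items or datasets are combined does not matter, which is what would make such an "algebra" easy to apply automatically.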

The proposed framework, however, poses important challenges to the Open Data Commons and to all similar ecosystems where significant collections of data and applications are made available. In particular, the ODC should both provide the framework for the collective definition of privacy guidelines covering applications and datasets alike, and be an integral part of the governance of open data as a public good.

The above cases, which derive from the way applications use data, all have to do with the dynamics of applications during their use. Here we should remember that high data protection levels (if unjustified, of course, in relation to the envisaged uses) hinder the development of the digital market and ultimately economic growth in Europe, two major goals of current and upcoming EU legislation. An adequate privacy scheme should therefore be closer to a licensing scheme, in the sense that it is not just a question of ‘see vs. don’t see’ but a more articulated issue of ‘what happens to my data’. In other words, as with the classification scheme above, the license mechanism should be applied at the level of the individual data item, as the following table shows:

Table 8. Proposal for a Data Licensing Mechanism (based on the CC scheme)

| Abbr. | Meaning | Status | Use for Open Data |
|---|---|---|---|
| PR | Privacy Restriction | Proposed as new umbrella | Framework for derogation to data protection level 1 (see Table 7) |
| PR-BY | Re-use with attribution (by) | Existing in CC | Data item can be re-used but link to source (of data item, either directly or indirectly) must be maintained |
| PR-SA | Share Alike | Existing in CC | Data item can be re-used but the same license must be applied to derivatives (e.g. when aggregating data or data mining) |
| PR-ND | Non-derivative | Existing in CC | Data item can be re-used but cannot be modified (i.e. no aggregation or data mining) |
| PR-NC | Non-commercial | Existing in CC | Data item can be re-used but not for any commercial purposes |
| PR-NI | Non-identifiable | Proposed | Data item can be used but without identification of data generator/owner |
| PR-NP | Non-position | Proposed | Data item can be used but not in association with the location |

data see: http://jeffjonas.typepad.com/jeff_jonas/2011/02/sensemaking-on-streams-my-g2-skunk-works-project-privacy-by-design-pbd.html

A given individual, either at the moment of providing data or when enabling a device to generate private data, could thus assign a PR license to each data item (i.e. a positioning reading), for example PR SA ND NC NI. The way such a license should be applied varies according to the way data is captured20:

‘Volunteered’ data by people who explicitly share information about themselves through

electronic media - for example, when someone creates a social network profile or enters

credit card information for online purchases;

‘Observed’ data captured by third parties while recording activities of users (in contrast to

data they volunteer) - examples include Internet browsing preferences, location data when

using cell phones or telephone usage behaviour;

‘Inferred’ data from the analysis of personal data belonging to the previous categories. For instance, credit scores are calculated based on a number of factors relevant to an individual’s financial history. (This is a derivative form of data capture, allowed only if the ND clause is not present; if the SA clause is present, the aggregated data item should keep the same license.)
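As an illustration of how a PR license string such as the one in the example above might be handled in software, the following sketch encodes a license as a set of clauses and applies the two rules just stated for inferred data. The set-based encoding, the function names and the choice to return `None` when the source license imposes nothing on derivatives are all assumptions of this sketch, not a specification from the Citadel project.

```python
from typing import FrozenSet, Optional

# Clauses from Table 8; the set-based encoding is an assumption of this sketch.
KNOWN_CLAUSES = frozenset({"BY", "SA", "ND", "NC", "NI", "NP"})


def parse_license(text: str) -> FrozenSet[str]:
    """Parse strings such as 'PR SA ND NC NI' or 'PR-BY' into a set of clauses."""
    parts = text.replace("-", " ").upper().split()
    if not parts or parts[0] != "PR":
        raise ValueError("a PR license must start with 'PR'")
    clauses = frozenset(parts[1:])
    unknown = clauses - KNOWN_CLAUSES
    if unknown:
        raise ValueError(f"unknown clauses: {sorted(unknown)}")
    return clauses


def may_infer_from(clauses: FrozenSet[str]) -> bool:
    """Inferred (derivative) data is allowed only when ND is absent."""
    return "ND" not in clauses


def license_for_inferred(clauses: FrozenSet[str]) -> Optional[FrozenSet[str]]:
    """Return the license an inferred item must carry, or None if none is imposed."""
    if not may_infer_from(clauses):
        raise PermissionError("the ND clause forbids derivative data")
    # With SA, the aggregated item keeps the same license as its source.
    return clauses if "SA" in clauses else None


def may_use(clauses: FrozenSet[str], commercial: bool = False) -> bool:
    """Commercial use is excluded by the NC clause."""
    return not (commercial and "NC" in clauses)
```

Note that the example license in the text, PR SA ND NC NI, contains ND, so under these rules no credit-score-like inference could be drawn from a data item carrying it.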

The following table summarizes the required characteristics of ODC or similar environments to

align with the logic and implications of the proposed data licensing scheme.

Table 9. Proposal for a Data PIA Framework

| Data level | License handling | Compliant system | Agreements | Validation |
|---|---|---|---|---|
| Data item | Defines level of data protection through PR license | Only accepts data items with license | Users can configure devices to generate data with selected licenses | User can always inspect data item in CSV (through the ODC Index) |
| Original Dataset (CSV) | Contains mix of data protection levels and a PR license as common denominator | Contains all data with all licenses, held by trusted administrator | Community (ODGG) can agree to assign basic licenses by default to non-licensed data items | Community (ODGG) defines terms for control of administrator activity while Index can handle the license aggregation |
| Dataset (C-JSON) | Is compliant with specific level of data protection by the corresponding PR license | Contains only data items coherent with dataset license | Administrator guarantees coherence of Dataset license with data items | Users can trace datasets using data items through the Index logs |
| Application | Guarantees usage coherent with data protection level as stated by PR license | Only reads/accepts compliant datasets | Application developers agree to abide by licenses | Community (ODGG) can trace use made by applications via the ODC Index |
| Third parties (agency / company) | Guarantees usage coherent with data protection level and the PR license | Reports usage of datasets | Community (ODGG) can make specific agreements with third parties | Community (ODGG) defines terms for control of third party activity & configures Index accordingly |

We could consider this as an innovative Data PIA Framework, to be “embedded by design” in

the future structure and functioning of the ODC itself.
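To make the “Compliant system” column of the framework more concrete, the sketch below shows one possible reading of it: data items are accepted only with a license (or with a community default), a published dataset keeps only items within its declared protection level, and an application refuses datasets that fail this check. The dictionary layout, the `DEFAULT_LICENSE` value and the interpretation of “coherent” as “item level not above dataset level” are assumptions made for illustration; the source does not define them.

```python
import json

DEFAULT_LICENSE = "PR-BY"  # hypothetical community default, not set by the source


def accept_item(item, default_license=None):
    """Accept only licensed items; optionally apply a community default license."""
    if item.get("license"):
        return item
    if default_license is None:
        raise ValueError("data item has no PR license")
    return {**item, "license": default_license}


def publish_dataset(items, max_level):
    """Build a dataset that keeps only the items within the declared level."""
    kept = [item for item in items if item["level"] <= max_level]
    return {"level": max_level, "items": kept}


def is_compliant(dataset):
    """Assumed meaning of 'coherent': every item is licensed and within the level."""
    return all(
        item.get("license") and item["level"] <= dataset["level"]
        for item in dataset["items"]
    )


def app_read(dataset_json):
    """An application only reads datasets that pass the compliance check."""
    dataset = json.loads(dataset_json)
    if not is_compliant(dataset):
        raise PermissionError("dataset is not compliant with its declared level")
    return dataset["items"]


raw = [
    {"value": "bus stop 12", "level": 0, "license": "PR-BY"},
    {"value": "home address", "level": 2},  # unlicensed: gets the default
]
items = [accept_item(item, DEFAULT_LICENSE) for item in raw]
public = publish_dataset(items, max_level=1)
print(app_read(json.dumps(public)))
```

In this reading the check is repeated at the application boundary rather than trusted from the publisher, which mirrors the table's separation between the administrator's guarantee and the application's own obligation.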

20 We owe this distinction to [18].


Overall, the diffusion of this licensing mechanism may have important consequences, both for personal rights and for the commercial exploitation of open data. On the one hand, the license helps data owners keep track of who uses their data and when (retaining copyright and credit where applicable) without preventing others from appropriating and manipulating it. Unlike in the Creative Commons license, “ShareAlike” would possibly be allowed on condition that owners receive feedback about the uses of their data, at least for commercial purposes. On the other hand, particularly after a dataset has reached protection level “0” or has been licensed as PR-BY, app developers and other digital businesses would find their activities easier, being able to demonstrate that the data legitimately belongs to the public domain. In this way, the unwanted legal consequences of past privacy carelessness would no longer fall on the last link of the value chain.


MATURITY OF OPEN DATA GOVERNANCE

In the Citadel project “Common Data Charters” (the plural referring to the intention to have one

per pilot city, then a single and final edition for the whole consortium) were defined as an

“operational set of principles that will specify the common goals and principles for common

Data, as well as the rights and responsibilities for both data ‘providers’ and data ‘users’. On the

side of data providers, this includes the principles of common data formats in the public

interest, as well as duties and obligations regarding privacy and identity management. For the

data users, this includes conditions for a) contributing to the collective assets of common SDK

and API components and b) rights to exploitation of data by compliant applications”. Therefore,

the contents and prescriptions of Common Data Charters are inherently procedural.

This does not imply, however, that they also have an intrinsic contractual nature, namely to

engage the key stakeholders from each pilot city into reciprocal commitments related to

“provision” and “usage” of data and information. At least, this is how the four Citadel pilot partners first tried, during the project, to gain the commitment of the most relevant socio-economic actors: by proposing drafts of Open Data Agreements (as Memoranda of Understanding) and then collecting as many signatures on them as possible.

Unfortunately, early feedback from the respective local communities was that the Citadel project was at too early a stage to propose a written agreement able to attract a sufficient number of qualified stakeholders. This is also why we soon decided to abandon the idea of a City-specific Memorandum of Understanding requiring formal signatures from the participants in open data groups, opting for a looser, non-binding registration form, provided as a Google Document, which is still available at https://docs.google.com/spreadsheet/viewform?formkey=dDVtekh5YzBGVkxtcXpoYnd4eWNNT2c6MQ#gid=0

In so doing, additional degrees of freedom were allowed to prospective participants in the

Citadel project pilots, without losing track of the respective requirements in terms of data,

information and technological development.

CAPABILITY OF OPEN DATA ECOSYSTEMS

Now that the final objective is to issue a common Open Data Charter for the entire Citadel

community, the following considerations can be made:

First and foremost, flexibility is a value. While discussions within the Citadel consortium have led us to rethink the priority of signing a local Data Charter agreement, this has not reduced the intensity of the pilot partners’ efforts, as documented in the previous section, to deploy open datasets on their respective infrastructures. In the meantime, very few Cities worldwide have shown appreciation for the value of such an approach to kick-start open data generation processes in their respective areas of competence or interest. A relevant exception is the City of Edmonton, Canada (see their Memorandum of Understanding, or MoU, at https://docs.google.com/a/edmonton.ca/viewer?a=v&pid=sites&srcid=ZWRtb250b24uY2F8b3Blbi1kYXRhLWNhdGFsb2d1ZS0yLTB8Z3g6MmUwMTMyNjhiNTBkZDNiOQ). Other Cities or public institutions, particularly in Europe, have used the MoU instrument to establish strategic relationships with different levels of government (as in the Vienna Smart City agreement between the City Mayor and the Austrian Ministry of Infrastructure; see the news about it at https://smartcity.wien.at/site/die-initiative/strategie/smart-city-wien-neue-initiative-bundelt-krafte/), with same-level organizations and institutions (compare the Mid-America Regional Council, which gathers 9 member counties from Kansas City, MO; see http://cfakc.tumblr.com/post/60775247513/digital-innovation-in-government-resources), or with leading think tanks and experts in the domain (for example, the four MoU’s signed by the BBC in 2013; see the news at http://www.techweekeurope.co.uk/news/bbc-agrees-open-data-132653). Other Cities have preferred a more formal establishment of rules concerning Open Data governance, such as ad hoc legislation (example: New York City; see http://www.nyc.gov/html/doitt/html/open/local_law_11_2012.shtml). Still others have issued ad hoc licensing agreements (such as Goteborg, see http://gbgdata.files.wordpress.com/2012/02/avtal-goopen-1-3-0-copy-eng.pdf, or Nantes, see http://data.nantes.fr/licence/).

Second, the Open Data governance system outlined by the Citadel vision has the merit of integrating all the local stakeholders belonging to the public sector’s data and information reuse value chain, which we first outlined at the beginning of this book and reproduced in Figure 7 above. While the eight stakeholder typologies presented to the Citadel survey respondents fully map (with more internal specifications) onto the four communities originally displayed in that picture, there is an obvious need to clarify their respective roles and contributions to the achievement of a common vision, and the reciprocal gains and benefits that can derive from it. This need will be partly fulfilled in the remainder of this section.

The vision behind this representation of the ecosystem is that of a socio-technical environment, made up of people, networks, institutions and technology artefacts, which co-determine the direction and progress of (in this case) open data publication and use policies. In addition to communication and collaboration activities among the four groups of stakeholders that make up this ecosystem, a set of behavioural rules, resources and practices contributes to shaping the main function that this environment has to deliver in order to survive: innovation. According to [11], which draws on an extensive review of literature from various fields (economics of innovation, entrepreneurship, sociology of technology and political science), there are seven key capabilities to be enhanced for such systems to evolve and perform well in terms of innovation:

1. Knowledge base creation, development and diffusion

2. Influence on the direction of search and investment processes

3. Entrepreneurial discovery and experimentation

4. Formation and support of new markets for innovation

5. Visioning and legitimization of a common future

6. Mobilization of resources (human, financial, etc.)

7. Development of positive externalities


In the following table, we provide a few examples of how different governance activities contribute to improving the above capabilities. Some of these can well be enhanced by a MoU-style agreement; others cannot, depending on historical and cultural circumstances.

Table 10. Open Data Ecosystem capability matrix

| Ecosystem capability | Example from Citadel pilots | Governance system contributions to capability enhancement | Supported in Citadel vision? If so, how |
|---|---|---|---|
| 1. Knowledge base creation, development and diffusion | http://data.gent.be/datasets | Empowerment of citizens and mobile users to create own datasets. Publication of private, non-government data according to an open access approach that will enable its subsequent re-use by the public. | Citadel data converter |
| 2. Influence on the direction of search and investment processes | Local debates on published and to-be-published datasets to figure out new applications | Formulation of shared requirements for providing open data, including standards, formats, licensing approach etc. Empowerment of interested stakeholders from the ‘bottom’ of the value chain to propose changes in open data policies and plans if reasonable. | Establishment of open data governance groups in the four pilot cities |
| 3. Entrepreneurial discovery and experimentation | http://data.gent.be/apps | Empowerment of citizens and mobile users to develop own applications using open data. Fast prototyping, innovation and uptake of new templates for smart city services. | Citadel app templates |
| 4. Formation and support of new markets for innovation | Hackathons, Open Data Days, etc. | Creation of new mobile apps relying on current and to-be-published open datasets. Promotion of application interoperability, rather than mere data convergence and/or integration. Privacy as an embedded service. | The ODC as a virtual brokering system that brings the supply of open data and applications close to demand |
| 5. Visioning and legitimization of a common future | http://fr.amiando.com/Citadel_EN.html | Generation of political backing to the provision of open access to data. Understanding citizen needs in terms of innovation in public services. Creation of new partnership models of working and co-creating services between government and citizens. Monitoring of the impact and outcomes of making data and applications open and available. | MoU’s, Open Data Charter |
| 6. Mobilization of resources (human, financial, etc.) | http://opendatamanchester.org.uk/ | Ability to bring together all relevant city stakeholders. Inclusion of domain experts and technical advisors to support specific parts of the process. Incubation and financial support of most relevant entrepreneurial initiatives. | MoU’s, Open Data Charter |
| 7. Development of positive externalities | N/A | Provision to all interested stakeholders of free access to existing knowledge base(s), against the only commitment to share and augment knowledge at equal conditions (reduction of uncertainty and costs of information acquisition). Migration of new apps from one city to another, following the common needs of the end users and the similarities of city datasets (induction of ‘spillover’ effects, increased efficiency due to shared and incremental development). Scaling up/out of existing governance systems through targeted communication campaigns (and possibly the signature of MoU’s) to extend the Citadel network to additional Cities and stakeholders. | The ODC as a holistic concept that takes further momentum and gains credibility across time and cities. |

The table should be read as follows: in the first column, we identify a number of capabilities

that a generic ecosystem should be able to demonstrate. Where relevant, we provide in the

second column some examples of these capabilities taken from the Citadel pilots. In the third

column, we list the enhancements that a well-established open data governance system should

offer to existing capabilities. Finally, the last column shows examples of these enhancements as

emerging from within the Citadel project and partnership.

What can be gathered from the table is a latent conflict between two opposed visions of how

the process of opening up data and promoting their utilization can be finalized and made more

effective through formal agreements: on the one side, there are some functions (like 1 through

4) that do not necessarily require formalization through city level MoU’s – unless there is a need

to attract and include in the process all of the key stakeholders belonging to the ecosystem at

hand. In fact, this is the experience that emerged strongly from the Citadel pilots. On

the other side, the table lists a few additional functions (namely 5 through 7) where the utility of

a signed MoU can be validly argued.

We hereby propose to solve the conflict in terms of a “Maturity Model” for the Cities that are involved in this process. In the literature, several such models exist that serve heterogeneous purposes, from merely descriptive to evaluative to normative goals. In Citadel, we decided to focus on the CMM (now a registered service mark of Carnegie Mellon University in the US). CMM, or Capability Maturity Model, is a five-level qualitative model assessing the maturity of an organization with respect to software development processes [5]. Historically, the first CMM was developed between 1987 and 1997 for the US Air Force. Prior to the introduction of the CMM, organizations tended to emphasize the results of development rather than how to improve the process. In principle, the five-level structure of CMM and its underlying logic can be replicated and applied to any other process, including the gradual establishment of an Open Data Ecosystem like the one described in this book.

Instantiated to the Citadel socio-technical environment, the five CMM levels of a City could be

redefined as follows:

I. Accessible (e.g. when large sets of public and private data are provided free of charge

to consumers of content and developers of knowledge services in the city);

II. Inclusive (e.g. when all the major value chain stakeholders, including citizens as both

developers and users, are integrated in periodic consultations to express their individual

judgment and evaluation about the opening of data process);

III. Participatory (e.g. when a joint system of decision-making is permanently set up and

used to integrate local communities of data holders, service providers and users in

collective decisions regarding the design, implementation and evaluation of new

services and apps);

IV. Co-creating (e.g. when resources are in place that enable individual persons as well

as local entrepreneurs and larger companies to create new services by the mash-up and

orchestration of existing resources, application templates, or chunks of data);

V. Leader (e.g. when the city government and/or community become attractive leaders,

creed ambassadors, authoritative gurus and opinion catalysers for sustainable

innovation in public services through open data).

Unlike in other maturity models, we do not necessarily see these five stages as steps of a ramping-up pathway. In other words, our vision is not to promote the once-and-for-all jump of a city from level I to (say) IV through the introduction of an ODC ‘instantiation’, nor to assume that a city needs to land at level IV before taking off towards level V. Our vision is closer to a spiral model, where progress can be incremental over time in all five maturity stages, and a city or community may well experience several recurring cycles that go from I to V. As an additional clarification, we can use the familiar 5-star deployment scheme for linked open data introduced by Sir Tim Berners-Lee as early as 2006 (see http://www.w3.org/DesignIssues/LinkedData.html). The following matrix simply maps the proposed CMM against Berners-Lee’s 5-star scheme, to demonstrate that (depending also on the starting point) a city may well be a “leader” in open licensing of public data on the web and still lag behind in other, more advanced kinds of deployment. Presumably (although this would require empirical demonstration) the process evolves gradually over time, but it may also be subject to quantum leaps or radical innovation experiments, shown here as a zig-zag pattern.


Table 11. A CMM for LOD (example)

Column headings:

1-Star: make your data available on the Web in whatever format under an open license

2-Stars: make your data available on the Web in structured form - e.g., Excel instead of image scan of a table

3-Stars: use non-proprietary formats for data publication in machine readable form - e.g., CSV instead of Excel

4-Stars: use URIs to denote things, so that people can point at your data more easily and quickly

5-Stars: link your data to other data to provide context related information

| Possible patterns | 1-Star | 2-Stars | 3-Stars | 4-Stars | 5-Stars |
|---|---|---|---|---|---|
| I. Accessible City (e.g. when large sets of public and private data are provided free of charge to consumers of content and developers of knowledge services in the city) | * | ** | *** | **** | ***** |
| II. Inclusive City (e.g. when all the major value chain stakeholders, including citizens as both developers and users, are integrated in periodic consultations to express their individual judgment and evaluation about the opening of data process) | * | ** | *** | | |
| III. Participatory City (e.g. when a joint system of decision-making is permanently set up and used to integrate local communities of data holders, service providers and users in collective decisions regarding the design, implementation and evaluation of new services and apps) | * | ** | | | |
| IV. Co-creating City (e.g. when resources are in place that enable individual persons as well as local entrepreneurs and larger companies to create new services by the mash-up and orchestration of existing resources, application templates, or chunks of data) | * | | | | |
| V. Leader (e.g. when the city government and/or community become attractive leaders, creed ambassadors, authoritative gurus and opinion catalysers for sustainable innovation in public services through open data) | * | | | | |

GOVERNANCE ROLES

Another important outcome of the pilot experiences has been a clarification of the distinct roles

played by the various stakeholders in an open data governance system. As we have tried to

demonstrate with the previous discussion, there can be different levels of maturity in this

system, which correspond to different intensities of engagement for those stakeholders.

However, in a mature community, all of them must be represented and actively engaged.

According to the survey results, there is considerable awareness of the need for stakeholder representation among both the Citadel members and the non-members who responded to the survey. However, the underlying (common) vision is still too centred on the City’s ICT Department (as a proxy for all the technical and domain experts who are certainly required to ignite and support the process from within the local government), while the contribution of other stakeholders sitting at later stages of the value chain is certainly appreciated, but probably with a certain amount of lip service. The reason for this might be that clear rules and procedures are lacking for defining the perimeter and scope of each stakeholder typology’s involvement, and signing a MoU might be a good way out of this impasse.

This aspect is also worth mentioning with respect to the contribution of the Mayor (and other policy makers) to the process. In fact, particularly if and when no formal MoU has been signed to regulate open data governance groups, political coordination becomes essential in order to deliver legitimization and ensure the proactive and committed behaviour of all key participants.

The following table borrows from the questionnaire results in highlighting the potential

contribution of a MoU with respect to the enhancement of participation and engagement of

stakeholders in the open data governance system.

Table 12. Ecosystem role definition and potential MoU contribution

| Role | AS-IS (from the questionnaire responses) | TO-BE (possibly through formal MoU’s) |
|---|---|---|
| Defining Open Data strategies (what data to publish, ownership and property rights, pricing, etc.) | ACTIVE LEADERS: Mayor/City Government; City/ICT Department; Public Data providers. PARTICIPATING IN DECISIONS: Private Data providers. INFORMED AND/OR CONSULTED: Software companies; Citizen Developers; User communities; Citizens and visitors. | Participation of citizens as developers and users is essential for the full realization of the vision. Information and consultation are not enough to create awareness and build consensus and engagement. A holistic strategy is more efficiently drawn up and tested with the support of the whole constituency. This can also contribute to the full adherence to the ODC concept – with all its embedded functions and services. |
| Defining technical and quality standards for Open Data (platforms, security, data and semantic standards, data quality, etc.) | ACTIVE LEADERS: City/ICT Department; Public Data providers; Private Data providers. PARTICIPATING IN DECISIONS: Mayor/City Government; User communities. INFORMED AND/OR CONSULTED: Software companies; Citizen Developers; Citizens and visitors. | A more inclusive decision-making process in this domain might certainly be beneficial to the progress of activities and the achievement of results. In our vision, technical and quality standards for open data are strongly dependent on what happens at later stages of the public sector information use and reuse value chain. A critical role should be played here by the Mayor (or other policy makers) to create the conditions for a full and stable collaboration among all stakeholders. |
| Dataset refinement and validation of the quality of datasets | ACTIVE LEADERS: City/ICT Department; Public Data providers; Private Data providers; Citizen Developers; User communities. PARTICIPATING IN DECISIONS: Mayor/City Government. INFORMED AND/OR CONSULTED: Software companies; Citizens and visitors. | No relevant change should be foreseen in this structure of responsibilities. |
| Publishing and updating open datasets | ACTIVE LEADERS: City/ICT Department; Public Data providers; Private Data providers. PARTICIPATING IN DECISIONS: Citizen Developers. INFORMED AND/OR CONSULTED: Mayor/City Government; Software companies; User communities; Citizens and visitors. | No relevant change should be foreseen in this structure of responsibilities (provided there is a fully developed, ODC-like framework already in place to enable those operations). |
| Design, development, and configuration of mobile applications that use Open Data | ACTIVE LEADERS: City/ICT Department; Software companies; Citizen Developers. PARTICIPATING IN DECISIONS: User communities. INFORMED AND/OR CONSULTED: Mayor/City Government; Public Data providers; Private Data providers; Citizens and visitors. | Here a more prominent role of the City Government – in its strategic policy planning function regarding service delivery and innovation – would probably be required, unless it is fully delegated to the ICT Department (see the responses given to the question about dataset refinement and validation of quality of datasets – perhaps there should be coherence between the two). |
| Promotion and/or selection of apps that use a city's Open Data (for example, organizing Hackathons, selecting best picks, etc.) | ACTIVE LEADERS: Mayor/City Government; City/ICT Department; Citizen Developers; User communities. PARTICIPATING IN DECISIONS: Public Data providers; Private Data providers. INFORMED AND/OR CONSULTED: Software companies; Citizens and visitors. | No relevant change should be foreseen in this structure of responsibilities. |
| Defining and enforcing policies related to privacy | ACTIVE LEADERS: Mayor/City Government; City/ICT Department; Public Data providers; Private Data providers. INFORMED AND/OR CONSULTED: Software companies; Citizen Developers; User communities; Citizens and visitors. | Participation of all stakeholders ensures a more active and committed involvement in tackling this issue, which may imply the definition of “local solutions” taking extant/contingent conditions (as well as general regulations) into account, particularly in the sharing of applications that use personal data by the users/beneficiaries themselves. |

Evaluation and impact assessment of a city's Open Data policy

ACTIVE LEADERS: All stakeholders, except Software companies INFORMED AND/OR CONSULTED: Software companies

Inclusion of Software companies in the process of evaluation and impact assessment. Establishment of a shared methodology, with public evidence of interim implementation results. Formation of permanent / ad hoc committees to enable this function as an embedded (in the ODC?) service to the entire community.

Research and innovation activities to explore new uses and applications of Open Data.

ACTIVE LEADERS: All stakeholders

Initial identification of R&D priorities and periodic evaluation and monitoring of activities – aimed at revision and enhancement of objectives according to the results available. Formation of permanent / ad hoc committees to ensure this target is reached.

PROCESS

Historically in the four Citadel pilots, documented progress towards the definition and

clarification of the above roles and tasks has not depended on formal agreements, but rather

on the growing “maturity” level of the underlying Open Data Governance Groups. We can therefore

hypothesize four alternative configurations of a city/community MoU (or Open Data Charter),

depending on two main conditions:

- The initial level of “maturity” of the underlying Open Data Strategy;

- Its current socio-economic impact, in compliance with the Citadel vision (data and

information that are not published per se, but in relation to precise purposes and

exploitation opportunities).

The resulting options can be depicted as follows:

Figure 23: Conditions and purposes of MoU definition

Apart from the top right quadrant where the formalization of a MoU (or an Open Data Charter)

is not required, in the remaining three areas it is left to the decision of the local policy makers

whether this would be required or not. In some cases, particularly when both the maturity and

impact are low, a MoU can be recommended to activate (or rather accelerate) the take-up of

open data policy: this is the case of the bottom left quadrant in the picture. In other situations,

it can well happen that despite the good level of maturity in current open data policy, its socio-

economic impact remains negligible, presumably due to lack of involvement and commitment

of local stakeholders. Therefore, a single MoU or a set of MoUs can be designed and implemented

by a city government to attract and consolidate the participation of the market in current and

prospective open data policies: this is what we call finalization in the above scheme. Finally, the

bottom right quadrant represents the (possibly extreme, but not unlikely) situation where

the city has received clear signals from the market in terms of early impact of open data

initiatives, which now require the integration of expert and specialist knowledge to gain

momentum and become more widespread and inclusive. Again, a set of MoUs (like

those signed by the BBC as mentioned at the beginning of this section) may be recommended

here.

For the process of MoU development and negotiation, a set of guidelines can be outlined, based

on the Citadel experience. We split these guidelines into five groups: A) Guidelines for

preparation, B) Guidelines for drafting, C) Guidelines for negotiation, D) Guidelines for

completion, and E) General purpose guidelines.

GUIDELINES FOR MOU PREPARATION

The preparatory stage begins with the realization of the need for a MoU. It follows the assessment of the conditions shown in Figure 23, which is a preliminary and essential step in deciding whether to have one in place. Once this assessment has been done, the purposes of the MoU can be clarified, as well as its scope and impact. Based on this understanding, a first draft of the MoU provisions can be prepared.

Proposed steps:

Internal discussion within the city administration, possibly by a dedicated team, to identify:

AS-IS situation regarding open data publication;

Needs and requirements that motivate the decision to propose a MoU;

Targeted stakeholders in the community and ways to approach them;

Minimum / maximum achievements expected / to be reached;

Value / services to be negotiated in exchange;

Financial resources available (if any);

Strategy to be put in place for stakeholder involvement;

Internal staff to be involved in the drafting and negotiation phases.

GUIDELINES FOR DRAFTING

A first draft of the MoU is often needed in order to approach the potential signatories from the local community. Examples abound, of various kinds (including some that have been mentioned at the beginning of this section). In this process, it is essential to avoid the risk of arriving with a pre-defined text, which would undermine the credibility of a city wanting to become more inclusive with its stakeholders.

Proposed steps:

Nominate the signatories to be invited;

Make sure that a representation of all key stakeholders is guaranteed;

Keep the agreement open for additional parties in the future;

Be realistic and transparent with the MoU goals and objectives;

Assign clear roles and functions to all parties;

Make sure that all prospective signatories have reason to enter the agreement;

Keep the language short and simple;

Do not make the MoU more complicated than necessary;

Set periodic review dates;

Specify procedures for amending the MoU.

GUIDELINES FOR NEGOTIATION

Ideally, any MoU should come up as the result of a preliminary discussion with the local

stakeholders involved. This would help signatories “buy” the purposes, roles and actions

foreseen in the agreement, as well as make important qualifications and additions to the text,

which may not have been figured out before. Therefore, it would be better to run this step in

parallel with the previous one, making a plan for stakeholder meetings and other forms of

encounter (including public events for the visitors or population as a whole).

In this stage, it is important to start by identifying those individuals (from both the city

administration and the external actors) who are more knowledgeable and influential, to enlist

their early engagement and assistance during the whole process.

Proposed steps:

Nominate the staff to involve in the process as early as possible;

Make a plan of meetings and other events with targeted stakeholders;

Base your discussions on the general principles and objectives of the MoU;

Avoid starting from a predefined text, unless there is value in sharing it;

Negotiate the contribution of each stakeholder in the form of concrete tasks;

Define the minimally acceptable standards of performance for each task;

Conclude the meetings by assuming the responsibility for preparing the minutes;

Publish the minutes in accessible places, where they can be read and reused.

GUIDELINES FOR COMPLETION

Ideally, during the negotiation with local stakeholders, a number of open issues should be

clarified. First of all, the very nature of the MoU in relation to its purposes – for instance, to

increase socio-economic impact of open data policy, integrate the missing stakeholders into a

shared development vision, or simply solve specific issues (such as licensing or privacy

protection) with the contribution of domain experts. Second, the MoU objectives should be

translated into measurable outputs and outcomes, which would facilitate monitoring,

evaluation, and renegotiation later on. Third, having framed negotiation with local actors in the

scope of the MoU purposes and with an open and inclusive approach, it is likely that the

discussions will focus on controversial or grey areas and lead to revision of existing drafts (if

any).

It is right at this stage that the advantages and disadvantages of developing a MoU should be

weighed against its objectives and the reasonably expected results. In some cases, the signature

of a formal agreement may create unnecessary bureaucracy or rigidities in the way things are

done. It can also be misunderstood and disrupt a good relationship with some stakeholders,

giving the false impression of building unwanted differences, making preferences where they

didn’t exist before, etc.

In case the formation of an agreed text becomes possible after the negotiation phase:

Circulate the draft to all the other parties at the same time;

Involve the persons with the authority to negotiate for their organization;

Identify the immutable points and be open to changing the remaining ones;

Try to finalize the revisions by phone or in person;

Keep everyone informed of the latest changes;

End up with a public event for approving and signing the MoU.

GENERAL PURPOSE GUIDELINES

Normally, a MoU is not legally binding – resembling more an exchange of letters of intent

among its signatories. However, some of its implications may create relationships (by law or de

facto) or modify the distribution of rights and obligations in a permanent manner. Therefore, it

has to be handled with care and typically structured in such a way that it leaves little room for

interpretation.

Additional caveats include:

Simplicity. Most problems can be eliminated by avoiding legal jargon;

Duration. A limited time for the MoU provisions to apply helps minimize errors;

Clarity. A clear definition of the roles and tasks of all actors is recommended;

Openness. Clear procedures for new entrants and exits are also recommended;

Flexibility. There should be procedures for monitoring, revision and adaptation.

THE CITADEL GOVERNANCE TOOLKIT

The final objective of pilot management in Citadel is to define a general charter that codifies the

role and structure of the ODGGs in the pilot cities and lays the basis for cooperation in the

future. Work during the project explored different approaches to doing so, leading to the

general guidelines for developing a MoU as set forth above.

This exploratory exercise, carried out in the course of the last two and a half years, has in fact

been a cumulative effort, in the sense that each iteration has contributed to provide new

approaches which taken together can be seen as a suite of tools that can be adapted to the

governance of Open Data processes in any city.

Table 13. Citadel Charter Approaches

Year | Approach | Description | Value
1 | Open Data Ecosystem model | Mapping roles and interactions in Open Data ecosystems | Provided the basic framework for the Open Data Governance Groups
1 | Memorandum of Understanding | Formal signed document declaring common commitment to Open Data | Not utilized as local MoU; provided basis for Palermo Guidelines
2 | On-line registration to ODGG | On-line form expressing interests and competencies for participating in ODGG | Adopted to avoid having a signed MoU; helped define roles in ODGGs; not used as such
2 | Associate Partner | Letter of expression of interest and procedure for inclusion of Associate Partners | Associate Partner campaign accelerated with platform in place and specific outreach programme
3 | Survey of roles | A survey on roles of stakeholders in OD governance | Contributed to governance model; currently in distribution with Associate Cities for validation
3 | Maturity model | A framework for defining Open Data objectives and strategies | Used in outreach programme to guide engagement strategies
3 | MoU framework | A modular table of contents (based on the original MoU and Palermo Guidelines) for a local MoU | Can be used for the drafting of local procedures and guidelines

This ‘Charter toolkit’ guided the pilot cities in the gradual structuring and opening up of their

local Open Data Governance Groups, and in addition provided a supporting framework for the

Outreach programme. The experience gained in the project has shown that the tools of such an open and

flexible approach can serve as modular elements with which any city can build and

consolidate its own Open Data governance model.

The final version of the Citadel Charter therefore needed to be some sort of statement that

pulls these elements together, drawing in addition on the extent and success of the outreach

activities and the awareness of the innovative potential of the Citadel vision. Rather than a

formal document of adherence to a network of Citadel-compliant cities (which would also risk

duplicating the efforts of EuroCities, the Connected Smart Cities Network, and others), what

appeared to be most useful and needed was a declaration of common principles that can:

Clarify the Citadel vision for Open Data

Call on different actors to play their part in reaching shared objectives

Above all, recall and consolidate the Citadel Statement on which the project is originally

based, while demonstrating the contribution to achieving the objectives of the Malmoe

Declaration of 2009.

The text of the Citadel Charter, which appears as Annex III to this book, is therefore intended as

an open document whose primary aim is to promote the Citadel vision more than the specific

tools developed within the project, as a forward-looking strategic protocol that can gain the

adherence of cities around the world.

THE ODC AND THE FUTURE OF OPEN DATA

TOWARDS THE SEMANTIC WEB

In this chapter, we look towards the possible evolution of the Citadel semantic framework

(primarily as embodied in the ODC) and how it might contribute towards the emergence of the

Semantic Web vision as famously described by Tim Berners-Lee. This might seem like an odd

proposition, since we instinctively think of the five star model as requiring sophisticated web

services to run on (as most Linked Open Data services do), together with the general impression

that Citadel deals ‘less seriously’ with Open Data by paying more attention to Excels than APIs.

THE CITADEL VISION: “TERRITORIES OF DATA”

Yet one of the hypotheses of Citadel is that pre-conceptions like this are actually hindering the

development of Open Data, by remaining fixed in technological daydreams rather than learning

from real people in real settings. As an example, take the evolution of the Internet. In 1993, Ed

Krol wrote the 543-page The Whole Internet: User’s Guide and Catalogue, extolled by Kevin

Kelly of Wired as “an encyclopaedic compendium of all the places to explore, the short-cuts to

get there, the reasons to linger, the treasure you might find, and the tools to make this free

world-wide service worthwhile.” Today, even thinking of an Internet ‘compendium’ is

impossible, and skimming through this book it is evident how only twenty years ago the internet

was still the domain of a small group of technical enthusiasts, well-versed in UNIX and Gopher

and extolling it as a “valuable electronic tool” for the free world.

Open Data today appears to be in a similar situation: there are several portals out there whose

job it is to provide a “compendium of all the places to explore”, since their number is still in the

range of the countable. Public debates on Open Data21 still ask whether “this free world-wide

service [is] worthwhile” rather than attempting to understand where it is going and what its

broader impacts might be. Yet at the same time, Krol’s book appeared just before the internet

began to explode into a totally different phenomenon affecting every aspect of the way we live,

work and play. The second edition, in 1995, includes a new chapter on the World Wide Web

(suggesting it might be an interesting alternative to Gopher), and only a year later Google was

launched, based on an algorithm that turned the tree-like catalogues of Yahoo and Altavista on

their heads.

If we are on the eve of a similar destiny for Open Data, then it can be a useful exercise to

imagine what an Open Data scenario might look like in 20 years (probably much less). If Open

Data is to explode like the Internet, then it is likely that Berners-Lee’s scenario of a ‘Web of

data’ may even appear limited, since it is likely that the web may give way to simpler (un-

noticeable?) front-ends, with the actual workings driven far more by automatic, IoT-type or

agent driven transactions than by human intervention. One thing is for sure: the data in

question will not be limited to datasets published by public administrations, but data generated

21

A good example was the Q&A at the session “Cohesion Policy and Open Data: boosting transparency, performance

and engagement” at Open Days 2014.

by all types of human activity as well as natural and machine events22. This fine-grained web of

data is likely to reveal new relationships between data and the specific place where it is, as

geographical, physical, and cultural elements of context become intertwined with ICT services23.

The Citadel project refers to this vision (described as its key value proposition) as a ‘Territory of

Data’. This concept implies that the density of information about a given territory leads to a

diffused awareness of all the features, activities, and dynamics happening there. This allows not

only governments to manage public services, but also businesses to understand market

dynamics, citizens to identify life opportunities, and so on.

Indeed, the Citadel project has been working to shift the Open Data paradigm from a finite set

of public administration portals towards a more territorially diffused data environment in three

main ways:

Citadel has done everything possible to break out of the technological temples of Open

Data, putting its tools in the hands of citizens.

Citadel works with cities not just as points on the map but as places where people come

together to give meaning to a place: witness the emergent role of the ‘visitor’ as

explored in Citadel.

Citadel is based on an open Semantic Framework as a dynamic social construction more

than a technical data model.

In the following, we take a look at how Citadel also aligns with some emergent trends that may

develop to unleash or at least accelerate the transition towards a Territory of Data.

FLATTENING DATASETS

A first signal we see as evidence of this transformation is what might be called the ‘flattening’ of

data structures. Since the early days of information technology, data has been organized into

increasingly complex structures of inter-relationships in an attempt to more closely represent

the way data is used in a particular domain. This occurred first in nested or hierarchical

relationships, and since the 1970s in relational structures, that instead emphasize links between

simple tables, such as a listing of companies on the one hand and the addresses for each on the

other. Relational databases have since become the norm – used in programs ranging from

Microsoft Access to MySQL – and in fact are behind most of the web services driving many of

the open data applications we see today.

In the following diagram of a typical relational database structure, the different tables are shown

divided by logical or functional areas, with the links between specific elements in each table also

shown. These links are then used to query the database, according to different ‘views’ onto the

information, for example one view to show a listing of all of a company’s suppliers (with addresses), and

another view to show a listing of all outstanding invoices (with company name). This structure of

22

For a preview, see www.thingful.net and http://www.slideshare.net/JustinHayward1/sss14haque

23

See the Periphèria project’s city ‘arenas’ with ‘people in places’, validated by the emerging trend of ‘Internet of Places’ (more info at: www.peripheria.eu).

tables, relations and views onto the information is studied in depth with the client organization

in order to best represent their needs and operations.

Figure 24. A typical relational database structure
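To make the idea of tables, links and ‘views’ concrete, the following is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are our own assumptions for illustration; they are not taken from any Citadel system.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (company_id INTEGER REFERENCES companies(id), city TEXT);
    CREATE TABLE invoices  (id INTEGER PRIMARY KEY,
                            company_id INTEGER REFERENCES companies(id),
                            amount REAL, paid INTEGER);

    INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO addresses VALUES (1, 'Palermo'), (2, 'Ghent');
    INSERT INTO invoices  VALUES (10, 1, 250.0, 0), (11, 2, 90.0, 1);

    -- View 1: suppliers with their addresses
    CREATE VIEW suppliers_with_addresses AS
        SELECT c.name, a.city
        FROM companies c JOIN addresses a ON a.company_id = c.id;

    -- View 2: outstanding invoices with the company name
    CREATE VIEW outstanding_invoices AS
        SELECT i.id AS invoice_id, c.name, i.amount
        FROM invoices i JOIN companies c ON c.id = i.company_id
        WHERE i.paid = 0;
""")

suppliers = conn.execute(
    "SELECT name, city FROM suppliers_with_addresses ORDER BY name").fetchall()
outstanding = conn.execute(
    "SELECT invoice_id, name, amount FROM outstanding_invoices").fetchall()
```

Note that the links between tables (the company_id columns) live inside the database definition: someone who only sees the result of a view has no way of knowing how the tables are related.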

If we want to publish information in such a relational database as Open Data, there are basically

two choices:

An API can be provided that essentially queries the database from the outside, with the

result being provided to the external application in the desired format. While some

systems publish to the web information about how to query them, use of an API

generally requires a knowledge of the database’s structure in order to extract

information from it. In particular, it is necessary to know in advance the exact names of

fields, or in other words the semantic structure.

The owner of the database can make a query for some subset of the information

contained in the database (e.g. company names and addresses but not invoices), and

write that to a file that is then published as Open Data. This can be done either

‘manually’, producing an Excel or CSV file, or ‘automatically’, e.g. through a format such

as XML.
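The second choice can be sketched in a few lines of Python. The example below assumes a hypothetical database with companies, addresses and invoices (the names and sample rows are ours) and writes only the subset chosen for publication to CSV text.

```python
import csv
import io
import sqlite3

# Hypothetical source database; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (company_id INTEGER, street TEXT, city TEXT);
    CREATE TABLE invoices  (id INTEGER PRIMARY KEY, company_id INTEGER, amount REAL);
    INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO addresses VALUES (1, 'Via Roma 1', 'Palermo'), (2, 'Korenmarkt 2', 'Ghent');
    INSERT INTO invoices  VALUES (10, 1, 250.0);
""")

def export_flat_csv(connection):
    """Return the published subset (names and addresses, no invoices) as CSV text."""
    cursor = connection.execute("""
        SELECT c.name AS company_name, a.street, a.city
        FROM companies c JOIN addresses a ON a.company_id = c.id
        ORDER BY c.name
    """)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor)                                  # one line per result row
    return buffer.getvalue()

flat_csv = export_flat_csv(conn)
```

The resulting file is flat: the join has already been carried out, and the invoice table, along with the way the tables were linked, is not visible to whoever downloads it.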

Neither of these choices, however, fully ‘opens’ the database since much of it, especially its

semantic structure, remains hidden. Since the database has been structured to be a mirror of

the specific organizational context it serves – a company, a public administration, etc. – it can

never be fully adapted to a broader context nor can its data be seamlessly integrated into a

territorially defined ‘web of data’.

In many respects, the LOD paradigm externalizes the relationships of such structures by re-creating

links between data structures as external and publicly viewable RDF triples, as Berners-Lee’s

vision in fact tends towards an evolution of the web as a database for the whole world.

This transfer of the structure of semantic relationships from ‘inside’ a relational database to

‘outside’ is driving the trend towards the ‘flattening’ of datasets, or in other words a preference

for working with two-dimensional tabular files. Consisting of spreadsheet-like layouts with a

series of rows of information using the same column headings, tabular datasets are far

easier to read externally, especially if they are presented in an open format such as CSV. Indeed,

many on-going efforts to transform existing data structures (notably INSPIRE-based geographical

information systems) into LOD pass through the stage of first generating one or more output

datasets in tabular CSV format.
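As a rough illustration of what ‘externalizing’ a flat table might look like, the sketch below turns each cell of a small CSV into a subject-predicate-object statement. It uses plain Python tuples rather than a real RDF library, and the base URI, column names and sample rows are assumptions made for the example.

```python
import csv
import io

# Illustrative flat dataset; the column names and base URI are assumptions.
COMPANIES_CSV = """company_id,name,city
1,Acme,Palermo
2,Globex,Ghent
"""

BASE = "http://example.org/"

def rows_to_triples(csv_text, id_column, entity):
    """Turn each non-identifier cell of a flat table into a (subject, predicate, object) tuple."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"{BASE}{entity}/{row[id_column]}"
        for column, value in row.items():
            if column != id_column:
                triples.append((subject, f"{BASE}schema/{column}", value))
    return triples

triples = rows_to_triples(COMPANIES_CSV, "company_id", "company")
```

Here the column headings become the predicates, which is one reason why the choice of headings in a tabular dataset matters for any later linking.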

Evidence of this trend is for example the emergent standard for transport data, GTFS (General

Transit Feed Specification)24. GTFS is not actually the way data is held in transport information

systems, but rather a common interchange format, useful for telling an external service such as

Google Maps how a given city’s transportation service is organized, independently of the system

used. Nonetheless, it contains all the information necessary, even though not in a format that is

immediately operational. As shown in the diagram below, a GTFS file for a given city consists of a

zipped collection of seven text files (actually structured as CSV), each of which contains a certain

part of the information that needs to be linked afterwards: for instance one contains information

on stops, another on transit lines, and so forth.

Figure 25. A typical GTFS folder unzipped

The interesting thing about GTFS is that while each of these seven files follows a precise

structure, that structure and the relationships between the data in each of the files is not

contained in any GTFS file instance, but rather in the description of the standard. In fact, the

links between each dataset are external, and they are not explicitly stated, mainly because they

are obvious: it is clear that the buses of a given line stop at bus stops. In sum, the GTFS standard

consists of seven tabular datasets, linked by externally (or ‘socially’) known relationships

between the datasets.
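The way these externally known relationships are used in practice can be sketched as follows. The field names (stop_id, route_id, trip_id, stop_sequence and so on) follow the published GTFS standard, while the sample rows and the helper functions are invented for illustration and only a subset of columns is shown.

```python
import csv
import io

# Tiny illustrative extracts; real feeds carry more columns and many more rows.
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Central Station,51.0357,3.7108
S2,Market Square,51.0546,3.7220
"""
ROUTES_TXT = """route_id,route_short_name,route_type
R1,1,3
"""
TRIPS_TXT = """route_id,service_id,trip_id
R1,WEEKDAY,T1
"""
STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:00:00,08:00:00,S1,1
T1,08:10:00,08:10:00,S2,2
"""

def read(text):
    """Parse CSV text into a list of dictionaries keyed by column heading."""
    return list(csv.DictReader(io.StringIO(text)))

def stops_for_line(short_name):
    """Follow routes -> trips -> stop_times -> stops for one transit line."""
    route_ids = {r["route_id"] for r in read(ROUTES_TXT)
                 if r["route_short_name"] == short_name}
    trip_ids = {t["trip_id"] for t in read(TRIPS_TXT) if t["route_id"] in route_ids}
    calls = [st for st in read(STOP_TIMES_TXT) if st["trip_id"] in trip_ids]
    calls.sort(key=lambda st: int(st["stop_sequence"]))
    stop_names = {s["stop_id"]: s["stop_name"] for s in read(STOPS_TXT)}
    return [stop_names[st["stop_id"]] for st in calls]

line_stops = stops_for_line("1")
```

Nothing in the four files says that trip_id in one file refers to trip_id in another; the code knows this only because its author has read the description of the standard.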

Another example is in the CKAN data portal software25, an open source platform that is rapidly

becoming the standard for Open Data services. CKAN primarily hosts datasets or links to data

services using a typical data portal structure (similar to the original Citadel Hub); here no

choices are made about semantic structures, only a complete listing of files based on the

relevant metadata. In addition to this File Store service, however, CKAN introduces a new

feature called the Data Store, which is very relevant to this discussion.

24

https://developers.google.com/transit/gtfs/

25

http://ckan.org/

The Data Store ‘exposes’ any tabular dataset hosted in the File Store (or can also be set up in its

own right), in such a way that it is possible to query the data ‘inside’, say, an Excel file without having

to download it, using a simple external API. As the CKAN Data Store is used ever more widely,

this tabular data format is generally gaining greater interest.

Figure 26. A typical CKAN Data Store
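A query against the Data Store might look like the sketch below, which builds a request for CKAN's datastore_search action and reads the rows out of the JSON reply. The portal address and resource identifier are hypothetical, and the response is a hand-written sample in the general shape CKAN returns, so that the example runs without network access.

```python
import json
from urllib.parse import urlencode

def datastore_search_url(portal, resource_id, limit=5, q=None):
    """Build a URL for CKAN's datastore_search action."""
    params = {"resource_id": resource_id, "limit": limit}
    if q is not None:
        params["q"] = q  # optional full-text query
    return f"{portal.rstrip('/')}/api/3/action/datastore_search?{urlencode(params)}"

def extract_records(response_text):
    """Return the rows from a datastore_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return payload["result"]["records"]

# Hypothetical portal and resource id, plus a hand-written sample response.
url = datastore_search_url("https://data.example.org", "abc-123", limit=2, q="parking")
sample_response = json.dumps({
    "success": True,
    "result": {"records": [{"_id": 1, "name": "Parking P1"},
                           {"_id": 2, "name": "Parking P2"}]},
})
records = extract_records(sample_response)
```

With a real portal, the text passed to extract_records would come from fetching the URL, for instance with urllib.request.urlopen. The caller still needs to know the column headings of the resource in order to make sense of the records.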

The problem of course with a tabular dataset is that CSV doesn’t provide for any facility to store

metadata information about the dataset. It is a simple task (far simpler than with a

relational database) to read the first row – column headings – and capture the semantic

structure of the dataset, but a system for storing the links between different tabular datasets,

using an RDF file or other system, has yet to be devised. To facilitate this process some have

suggested using standard column headings (more or less what Citadel is doing, as discussed in

the previous chapter), while others are highlighting the importance of identifiers as the anchors

for linking open datasets26.
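Reading the first row of a tabular file, and checking which headings two datasets have in common, can be sketched as follows. The two sample datasets and their headings are our own assumptions and do not reproduce the actual Citadel vocabulary.

```python
import csv
import io

# Two illustrative flat datasets; the headings are assumptions for this example.
PARKING_CSV = """id,title,latitude,longitude,capacity
p1,Parking Centrum,51.05,3.72,200
"""
CINEMAS_CSV = """id,title,latitude,longitude,screens
c1,Cinema Rex,51.06,3.73,4
"""

def headings(csv_text):
    """Read only the first row of a CSV, i.e. its column headings."""
    return next(csv.reader(io.StringIO(csv_text)))

def shared_headings(first, second):
    """Headings present in both datasets, in the order of the first one."""
    other = set(headings(second))
    return [h for h in headings(first) if h in other]

common = shared_headings(PARKING_CSV, CINEMAS_CSV)
```

Shared headings such as latitude and longitude are what allow two unrelated datasets to be placed on the same map. Where the link is recorded, once it has been found, is the open question discussed in the text.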

How and where to define and store information that interconnects flattened datasets is in fact a

key challenge for future research. The important point in the context of this book is that, in the

journey towards the Citadel vision of ‘Territories of Data’, the trend is to imagine a massive

number of flat, tabular datasets as the foundation.

SEMANTIC RELATIONSHIPS IN CITADEL APPS

The Citadel project finds itself part of the trend of flattening datasets not so much by design but

rather by the simple fact that the majority of the datasets provided by the pilot cities were

originally in the form of Excel spreadsheets. On the other hand, Citadel is aligned with this

26

“Creating Value with Identifiers in an Open Data World”, Open Data Institute and Thomson Reuters, available at

http://thomsonreuters.com/corporate/pdf/creating-value-with-identifiers-in-an-open-data-world.pdf

trend, and the experimentation throughout the course of the last year can give some valuable

insights as to the paths for future developments.

The Citadel AGT in fact offers the possibility for any user, and in particular non-expert users, to

generate an application using more than one dataset. In the following paragraphs, we will

explore those instances where more than one dataset is used to build an application. By

examining and classifying the associations between datasets that emerge, we can gain insights

on how a bottom-up identification of RDF triples, or any other expression that captures the

relationships between two datasets in a useful way, could occur.

Figure 27. AGT Apps by Month

In the first year of operation of the Citadel toolkit (fully operational only starting in January

2014), a sufficient number of dataset couplings has occurred to be worth investigating. As the

above diagram shows, of the 567 apps generated up to the end of October 2014, 138 or approximately

25% include more than one dataset. The generation of multi-data apps more or less parallels

that of single-dataset apps, except in the final months27.

If we eliminate the 60 apps that use multiple datasets simply because information is coming

from different cities (either the same information in two cities or apps created only to

demonstrate the tools), we still have 78 apps to examine, or about 14% of the total of apps

generated. Further eliminating apps that are clearly demonstrations (mashing up five or six

unrelated datasets from the same city in an app with a name such as “test”) or that repeat the

same combination of datasets (multiple trials), we are left with 38 apps combining datasets in

an original and meaningful way.
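The successive filtering described above can be checked with a few lines of arithmetic. The figures are those quoted in the text; the variable names are ours.

```python
# Figures quoted in the text for apps generated up to the end of October 2014.
total_apps = 567
multi_dataset = 138          # apps using more than one dataset
cross_city_or_demo = 60      # multi-dataset apps spanning cities or demonstrating the tools
original = 38                # apps left after removing tests and repeated combinations

same_city = multi_dataset - cross_city_or_demo  # apps left after the first filter

def share(part, whole=total_apps):
    """Percentage of all generated apps, rounded to one decimal place."""
    return round(100 * part / whole, 1)

shares = {
    "multi": share(multi_dataset),
    "same_city": share(same_city),
    "original": share(original),
}
```

The 38 apps retained for analysis thus amount to a little under 7% of all apps generated in the period.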

Given the nature of the AGT as a map visualisation tool, we thus have 38 instances where users

have spontaneously created an association between two datasets as a function of their spatial

relationship. In other words, by combining multiple datasets the user is exploring some sort of

logic – of the sort that the LOD model tries to express – that is expected to emerge when shown

27

This can be attributed to the extensive outreach activity in fall 2014, where single-data apps have been generated

to illustrate the potential of Citadel to a new city.

on a map. A closer look at this sample reveals four meaningful classes of relationship – each

accounting for about a quarter of the total sample – that appear to motivate the couplings:

• Associations of datasets: different sources of much the same information are

combined to generate a more complete representation. In these cases, we can imagine

the user wishing to combine datasets as generated by different authorities into a more

complete picture that makes sense in practical terms.

o Examples: Parking lots + on-street parking; Bus stops + bike stations; Childcare

centres + schools; UNESCO Heritage sites + Tourism POIs; Historic sites +

Abandoned villages.

• Functional relationships: two types of information are connected by their usefulness

(often transport together with destinations). In these cases, we can imagine a user

associating two datasets according to their purpose. Note that these relationships are

not necessarily permanent: the fact that a parking place is near a cinema is irrelevant if

the cinema is closed.

o Examples: Parking + Cinema; Parking + Bars + Events + Diesel prices; Doctors +

Health insurance; Defibrillators + Pharmacies; Hotels + Tourism POIs; Pastry

shops + Transport stops; Markets + Parking

• Temporal relationships: this class is similar to the previous one but with a clearer

sequence in time. Here we imagine the user thinking “after you do this you might want

to do that”. These are also relationships that are not necessarily permanent.

o Examples: Cinemas + Bars; Schools + Markets; Voting seats + Tourism POIs;

Meeting places + Planned visit; Tourism POIs + Restaurants; Museums + Bars

• Urban settings: these are relationships between datasets that associate related public

or civic facilities without a specific logical, functional, or temporal relationship. Here

we imagine the user attempting to highlight the features of a neighbourhood in a city,

representing quality of life in spatial terms.

o Examples: Hotels + Cinemas; Parks + Community Centres; POIs + Trees; Coffee

shops + Parks allowing dogs; Sports facilities + POIs + Galleries + Parks

BACK TO LOD

As stated previously, one idea concerning the Citadel Open Data Commons is that it can provide

a way of constructing semantic LOD relationships in a bottom-up rather than top-down fashion.

Already in the early project stages, it emerged that this indeed could be the trend, although the

approaches mentioned there could be considered more as crowd-sourced labour than fully

bottom-up methods. Instead, the ODC concept already suggests a different possibility when, in

Figure 14 above, it is suggested that “Semantic patterns” could be identified in the “Query

recordings”, considered at that stage of development as a log file of activity within the ODC

containing information about the conditions in which datasets were accessed.

What ultimately appears to be the most promising approach is in fact to infer relationships from

the combinations of datasets as discussed above. Were the kind of activity witnessed in the first

year of use of the Citadel Toolkit to reach a massive scale of the kind justifying a big data

approach, the types of relationships tentatively suggested in the previous section could be

identified with greater certainty. At that point, however, we need to ask: is the RDF framework

appropriate and sufficient to express these relationships?
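
As a rough illustration of what inferring relationships from dataset combinations could look like, the following sketch counts how often pairs of datasets appear together in the same app. The app list is invented for the example (it reuses a few of the pairings listed in the previous section) and the function name count_pairs is hypothetical; it does not describe how the Citadel Index actually records usage.

```python
from collections import Counter
from itertools import combinations

# Hypothetical usage log: each entry is the set of datasets combined in one app.
apps = [
    {"parking", "cinemas"},
    {"parking", "cinemas", "bars"},
    {"hotels", "tourism_pois"},
    {"parking", "markets"},
    {"museums"},  # single-dataset app: contributes no pairs
]


def count_pairs(app_datasets):
    """Count how many apps use each unordered pair of datasets together."""
    pair_counts = Counter()
    for datasets in app_datasets:
        # Sorting makes ("cinemas", "parking") and ("parking", "cinemas") one key.
        for pair in combinations(sorted(datasets), 2):
            pair_counts[pair] += 1
    return pair_counts


pair_counts = count_pairs(apps)
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

At a much larger scale, the most frequent pairs would be the candidates for closer inspection; the counting alone says nothing about which class of relationship a pair belongs to.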

RDF, in its essence, expresses relationships in a subject > verb > object syntax, meaning that the

relationships are not just neutral associations, but they can have a meaning and a direction

associated with them.

Figure 28. The basic RDF syntax

In this logic, we can imagine that, in the “Associations of datasets” category above, the

combination of “Parking lots + on-street parking” (the first example from the list in the previous

section) can be modelled as two datasets that can be the ‘subjects’ with a verb ‘provide’ and

object ‘parking spaces’. Indeed, this sort of descriptive relationship fits well with the LOD

scheme; the well-known example shown below in fact uses the verbs “is a”, “is located at”, “is

on the topic of”, “depicts”, “is famous for”, and “discovered”.

Figure 29. The LOD schema for the statue of Einstein
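
To make the subject > verb > object structure concrete, the parking example can be written as plain tuples. This is a minimal sketch rather than real RDF: the identifiers (dataset:parking_lots, concept:parking_spaces) and the Triple type are assumptions introduced for illustration only.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject > verb > object statement, as described in the text."""
    subject: str
    predicate: str  # the "verb"
    obj: str


# Hypothetical identifiers for the two parking datasets and the shared concept.
triples = [
    Triple("dataset:parking_lots", "provides", "concept:parking_spaces"),
    Triple("dataset:on_street_parking", "provides", "concept:parking_spaces"),
]


def subjects_sharing_object(statements, predicate, obj):
    """Return the subjects linked to the same object by the same verb."""
    return sorted(
        t.subject for t in statements if t.predicate == predicate and t.obj == obj
    )


related = subjects_sharing_object(triples, "provides", "concept:parking_spaces")
print(related)
```

The two datasets become associated only indirectly, through the object they share, which is how a descriptive relationship of this kind fits the triple form.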

The other three categories, on the other hand, introduce new elements that the RDF syntax may

not be able to capture fully.

• Functional relationships: as shown in the previous section, these relationships are often

contingent on certain aspects of the context of time, place, and role of the user. RDF, on

the other hand, only expresses permanent relationships. How, then, can they be situated in

the context for which they hold true? Where, using RDF, can you express “as long as the

cinema is open”?

• Temporal relationships: the contingency here is even more complex, since it depends

on a sequence of events; might we then imagine algorithms that generate RDF triples as

temporal sequences depending on what happens when? [28]

• Urban settings: here the combination of datasets seems to express spatial qualities that

are very related to the map but not directly related to the individual datasets taken

singly; can we imagine some new vocabulary of spatial qualities (not necessarily limited

to urban environments), for instance capable of describing a ‘nice’ neighbourhood, a

‘city centre’, or even landscapes?
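
The difficulty raised under functional relationships can be illustrated with a small sketch in which an association carries a condition describing when it holds. The ConditionalTriple type, the opening hours, and the dataset names are all hypothetical, and the condition is ordinary Python code attached alongside the triple, not a feature of RDF.

```python
from dataclasses import dataclass
from datetime import time
from typing import Callable


@dataclass(frozen=True)
class ConditionalTriple:
    """A triple plus a context test (an assumption for illustration)."""
    subject: str
    predicate: str
    obj: str
    holds: Callable[[time], bool]


def cinema_is_open(now: time) -> bool:
    # Hypothetical opening hours: 15:00 to 23:00.
    return time(15, 0) <= now <= time(23, 0)


parking_serves_cinema = ConditionalTriple(
    "dataset:parking", "is useful for", "dataset:cinemas", cinema_is_open
)


def active(statements, now):
    """Return only the associations whose condition holds at the given time."""
    return [(t.subject, t.predicate, t.obj) for t in statements if t.holds(now)]


print(active([parking_serves_cinema], time(20, 30)))  # evening: association holds
print(active([parking_serves_cinema], time(9, 0)))    # morning: cinema closed
```

The sketch makes the open question visible: the condition lives outside the subject > verb > object statement, so something beyond the bare triple is needed to carry it.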

These questions are by no means trivial, since they touch on the very usefulness of LOD

relationships, which, apart from some rather elementary applications, have not been tested to

date on a wide scale with citizens and businesses in city settings. From the evidence that

emerges from the Citadel experimentation, there are significant research tasks ahead in better

exploring the semantics of place, time, and space.

THE ODC AS A POLICY CONCEPT

THE ODC IN PRACTICE

Towards the middle of the Citadel project, feedback from the pilot cities indicated that the

Open Data Commons (ODC), which had initially been set forth as a guiding framework concept,

needed to be implemented in practice at least in terms of some first tools for the pilot

communities. This gave rise to the development of the Citadel tools – the Converter together

with the Application Generator Tool (AGT) – but it also changed the perspective from which the

ODC concept evolved from then on. The tools under development needed to be constantly

mapped onto the original ODC principles in order to a) see whether the toolkit was actually

working in the directions proposed by the ODC concept and b) if so, identify what future scenarios and

roadmaps were needed to reach the mature vision starting from the first tools.

What both the ODC concept and the toolkit shared since the outset was the basic Citadel

objective of promoting the uptake of Open Data by making life easier for two groups:

• Make it easier for those holding data to publish it electronically and thus make it

available for access by third-party applications

• Make it easier for developers to design applications that can move smoothly from city

to city, allowing citizens to access and visualize datasets independently of the format

and standards by which they were originally published.

In its starting configuration, previous to the introduction of the toolkit, the ODC was simply a

collection of static datasets published on the Citadel Platform, the first common space. In the

initial cycle of pilot testing, these datasets were incorporated into the Citadel templates for

each city, as JSON files in a relatively closed client-server framework. The open and flexible ODC

scenario was thus discussed as a possible vision or way ahead for Open Data but not

implemented in practice.

[28] To some degree, one could argue that the Google Now service attempts to do this.

The first implementation of the Converter and AGT initially seemed to defeat the main principle

of the ODC, which was intended to remain as open as possible to different standards, data

models, etc. through a public collection of tools, not just one toolkit. On the other hand, this

solution could also be framed within the ODC concept as just a first instance of a more open

framework based on the same approach, recovering the original idea of having several app

templates each with its own data model in addition to the AGT.

In October 2013, in parallel with the specification of the tools, we thus considered the following

scenario as an extension of what was being developed. In this context, the ODC could be seen to

be made up of two main elements:

1. One or more servers containing “live” files – accessible via their URL – that contain data

in a format (platform and data structure) that at least one template (Citadel or third

party) or application (Citadel or third party) can access remotely without any further

conversions. (The first implementation of the toolkit being a first server with a first data

model for a first application, the AGT, but with nothing prohibiting further

development.)

2. A unique Index that includes a listing of a) converted “live” files as described in point 1,

including where and how they can be accessed, and b) the relevant template and

application data formats that are compatible with the files. (The first implementation on

the Citadel Platform consisting of an Index that lists all files that have been validated as

compatible with the AGT, the first of possibly many data formats.)
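
The two elements above can be sketched as a small data structure: Index entries that point to live files and record the data model each file is compatible with. All names, URLs, and data model labels below (IndexEntry, agt-v1, restaurant-menu, example.org) are hypothetical placeholders, not the actual schema of the Citadel Index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    city: str
    theme: str
    url: str         # where the "live" file can be accessed
    data_model: str  # format the file has been validated against


# Hypothetical Index contents; example.org URLs are placeholders.
index = [
    IndexEntry("Ghent", "parking", "https://example.org/ghent/parking.json", "agt-v1"),
    IndexEntry("Lisbon", "parking", "https://example.org/lisbon/parking.json", "agt-v1"),
    IndexEntry("Lisbon", "menus", "https://example.org/lisbon/menus.json", "restaurant-menu"),
]


def find_compatible(entries, data_model, theme):
    """Map each city to the URL of a file a given template can read as is."""
    return {
        e.city: e.url
        for e in entries
        if e.data_model == data_model and e.theme == theme
    }


parking_sources = find_compatible(index, "agt-v1", "parking")
print(parking_sources)
```

A template or application would then only need to ask the Index which cities offer files in its own data model, without knowing how each city originally published the data.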

This broader scenario allowed us to imagine some possible use cases that have in fact arisen

throughout the pilot testing and outreach activities as concrete situations. They also illustrate

the broader potential of the ODC, taking Citadel far beyond the paradigm of cities publishing the

typical datasets into a city portal:

The local bridge club

Any local club such as a bridge club can regularly publish the list of members, together

with their addresses, onto the public ODC (of course with member approval). Besides

giving publicity to the bridge club, the AGT can easily show the location of members and

make it easier to find the address for the next game (possibly with an app that mashes

up the bridge club list with the official city dataset of parking facilities). It is likely that

the members will also be stimulated to think about other datasets they hold that could

contribute to the city’s ODC.

Local City government ODCs

The publishing of Open Data with the Converter is so easy that a city can, in addition to

publishing publicly on the Citadel Platform, ask individual city departments to publish all

city datasets on an internal server that mimics the ODC’s functionalities, with a closed

AGT that can be used for training and awareness raising among civil servants. In

addition, such a “private” Citadel platform can be used to develop internal services as

well as to validate the coherence and quality of the data being published. The city

government’s data managers can then simply re-publish the appropriate validated

datasets onto a public Index when needed.

Restaurant menus

A city’s local restaurant association might find it useful to be able to publish the menus

and special offers of member organizations on a daily or weekly basis. They could

therefore commission a special template that can display restaurant menus on the city

map, together with a special version of the converter that converts to the new data

model. Individual restaurants and pubs that publish their information according to the

agreed format will then be visible through the AGT version that incorporates the new

template.

The broader ODC scenario also allowed us to define a roadmap for development of the

Converter + AGT tools in the direction of realizing the open and interoperable vision of the ODC

in an incremental fashion. In this mature scenario, all the datasets in the ODC are published as

JSON files compatible with one or more data models, so in theory a template or application (or

enhanced future version of the AGT) only needs to know which cities have published data in the

expected format and what URLs should be used. This information is stored in the Citadel Index,

so that developers can easily configure the templates they incorporate into a given application

and be sure that it will work in the different cities [29].

In this context, the following were identified as possible areas for development in late 2013.

(Each is followed by a note on what actually happened.)

Index logging

This consists in logging events that happen through the Index, namely new datasets registered

(by whom and how), accesses by templates, configurations used, etc. This was thought of as

likely to yield very useful information for both the template and application developers as well

as for the data providers. In addition, further services (e.g. privacy management or semantic

tracking) could also be built into the system that manages the Citadel Index. (Basic logging has

been implemented with the Citadel Index, and the potential for both privacy as a service and

semantic analytics has been identified in other ODC reports. Further developments are a good

topic for future work.)

Converter enhancements

An important enhancement to the Converter would be to enable on-the-fly conversions. In this

case, rather than generate and save a new file to the ODC, the Converter would save the

configuration info only, directly accessing the original dataset (or a regularly copied data dump)

on the fly. (On-the-fly conversion has in fact been implemented as a proof-of-concept script with

the PHP Converter Library. A future enhancement of the Converter could include saving semantic

mappings for batch processing, although actual effectiveness in practice would need to be

tested.)
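
The idea of on-the-fly conversion can be sketched as follows: only the field mapping is stored, and the conversion runs each time the data is requested. This is an illustrative Python sketch, not the PHP Converter Library mentioned above; the column names, the mapping, and the function convert_on_the_fly are assumptions, and an in-memory string stands in for the original dataset that would normally be read from its source.

```python
import csv
import io
import json

# Saved configuration only: target field -> column name in the original dataset.
# Both sides of this mapping are hypothetical.
mapping = {"title": "Name", "address": "Street", "lat": "Latitude", "lon": "Longitude"}


def convert_on_the_fly(source_text: str, field_mapping: dict) -> str:
    """Convert CSV text to a JSON list using the stored mapping, saving no file."""
    rows = csv.DictReader(io.StringIO(source_text))
    records = [
        {target: row[source] for target, source in field_mapping.items()}
        for row in rows
    ]
    return json.dumps(records)


# Stand-in for the original dataset (or a regularly copied data dump).
original_csv = "Name,Street,Latitude,Longitude\nCity Museum,Main Street 1,51.05,3.72\n"

converted = json.loads(convert_on_the_fly(original_csv, mapping))
print(converted)
```

Because nothing is written back, any update to the original dataset shows up at the next request, at the cost of repeating the conversion work each time.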

Template developments / enhancements

As shown in the restaurant scenario above, the development of new templates was considered

an important space for the future, in order to visualize types of datasets that go beyond the

Citadel application scenarios, i.e. socio-economic data. A new template would simply need to be

registered in the (future version of the) Citadel Index, with information on where it resides and

the platform and data format it uses. (Currently, the AGT uses only one data model that includes

all fields from the POI, parking, and event templates. A more modular design for the AGT,

capable of incorporating different templates according to the selected datasets, could be a

future objective. As it stands, the additional data models that have been implemented for the

Converter are related to specific applications outside the AGT, as with the MyNeighbourhood

data model [30] in the Lisbon Pilot; see below.)

[29] This feature of “discovery” of thematically compatible datasets in different cities (though using the same data model) has already been implemented for the AGT.

Dataset enhancements

With the ease of access to datasets through the Converter and AGT, it is possible for anyone to

publish open data to the Citadel Platform, e.g. a listing of bridge club meetings, and immediately

see them on a map on their mobile phone. This has great potential for motivating the

publication of datasets beyond those of the municipality. In parallel, by shifting the emphasis

from applications to data visualization, it can make sense for a city to enhance existing datasets

rather than developing specific apps. For instance, to let users reserve a place at a concert,

a link to an external reservation service (which would then be visible via the App Generator)

can be embedded in the description text of the concert, rather than building

reservation functionalities into the app. (Most of the datasets to date in Citadel contain city

information, with citizen datasets, e.g. Lisbon pastry stores, appearing only recently. In some

instances, however, enhanced datasets have been tried out, as with the Museum tour app

in Ghent.)

Dataset refinement

As the Converter can access any dataset on the condition that it is “refined”, a broad,

“de-professionalised” uptake of open data as suggested above was expected to bring this topic to

the fore very quickly. Although there exists a broad range of tools and toolkits to help refine

large datasets, there is little awareness of them and little widespread expertise in using them. Pilot

leaders were advised to bring developers together with data owners in order to teach them

how to use these tools. The best strategy, however, is to build awareness from the start,

something which can be achieved by, e.g., helping people to publish bridge club meetings and

then seeing what happens when the address format is not consistent. (Dataset refinement has

played a lesser role than expected, in part because many of the Citadel datasets were made

from scratch. Nonetheless, the toolkit has had a powerful impact in raising awareness of data

quality, and the Apps4Dummies workshops effectively put into practice the recommendation of

mixing developers with data owners.)

OPEN DATA AS A PUBLIC GOOD

As the toolkit was implemented – validating rather than betraying the ODC concept in concrete

terms – and the uptake on the part of pilot cities began to accelerate, feedback from evaluation

activities helped to clarify what the potential impact of the ODC concept could be on the Open

Data paradigm overall. The following paragraphs result from reflections on the potential policy

implications of the ODC, as Citadel began to realize that a potentially new paradigm was

emerging. This process started in early April 2014 and continues to date, feeding as well into the

public debates on Open Data mentioned above.

[30] http://my-neighbourhood.eu/

Indeed, one of the central and most transformative tenets of the Open Data Commons concept

is the simple idea that Open Data be considered as a common good, in a public sphere whose

stewardship is to the benefit of both public and private stakeholders as well as citizens. The

mainstream paradigm for Open Data, especially as promoted by technology providers,

essentially ignores this common space, instead identifying a two-step process:

• Governments at all levels publish datasets generally formed by internal administrative

processes in machine-readable form. According to the technology providers this should

occur using increasingly sophisticated (and often unaffordable) web services, but in this

Age of Austerity most agencies are in fact publishing raw data in the format they have

and to their own semantic and syntactic standards.

• Software developers build applications using the datasets published by governments.

Due to the above, this often requires that developers convert the data from its raw

form into something more usable by each specific application, thus posing a barrier of

time, cost and general efficiency.

Although the relatively cautious uptake of Open Data across European cities is often attributed

to a lack of a culture of transparency or concerns about privacy and “sensitive” information, this

is not sufficient to explain the lack of information regarding topics such as the location of

galleries, museums, and public toilets. We suggest that three other factors inherent in the

current paradigm might also be identified as creating barriers:

• The two-step process – governments open data and then sit and wait for developers to

come along and use it – creates a discontinuity between supply and demand and a

specialisation of roles that inevitably makes it difficult to engage different points of view

in defining comprehensive Open Data strategies.

• This separation of roles also affects the propensity of the actors involved to engage with

similar initiatives under way in related fields, such as the EU’s standardisation efforts in

Spatial Data Infrastructures (INSPIRE), Sustainable Energy Technologies (SETIS), etc. This

inevitably leads to interoperability issues both along the data value chain and across

thematic sectors (a key issue to attain the Linked Open Data vision).

• These factors together greatly limit the applicability of the Open Data paradigm as it is

today to the few cities who meet the profile of having a strong political will, a culture of

transparency, IT staff capable of managing data publishing, and ideally an active

developers’ community willing to develop apps.

The Open Data Commons concept directly addresses these issues, by politically identifying the

“space” between datasets of whatever form and applications of whatever type and declaring

that space to be within the public domain and in the public interest, following the paradigm of

the public Commons. This space belongs neither to data providers nor to data users, but is a

neutral domain containing the tools and knowledge allowing providers and users to connect in a

more nimble, efficient, and innovative way than either could achieve by themselves. The Open

Data Commons can thus be said to include any software element that is generic enough to have

relevance for more than one dataset on the one hand, while independent of the market

exploitation potential of a given application or set of applications on the other. Such elements

(the Citadel Converter is central to defining this space, but it can also include generic APIs,

converters, transformers, and tools of various kinds) collectively bridge the gap between

datasets and applications in the most dynamic and flexible way possible.

POLICY IMPLICATIONS

This has a direct impact on the three barriers identified above as follows:

• By unlocking the technical paradigm, the ODC concept allows for data to be exploited

before standards have been fully defined, thus promoting demand-driven processes of

standards convergence and adoption. This not only brings forward the benefits of Open

Data, but it also opens up to greater interoperability flows with other standards

formation processes. This is particularly important in areas where standards adoption is

relatively immature, such as spatial data infrastructures, IoT sensor network

architectures, big data analytics, system dynamics modelling, etc.

• By “filling the gap” between data supply and demand and creating a concretely testable

end-to-end process, data owners can see the purpose of publishing data and application

developers can see the need for new datasets with greater clarity. A common space for

on-going dialogue and interaction between governments and developers is created,

allowing for new dynamics to emerge such as “application driven” data strategies.

• By allowing the introduction of simple tools such as the Citadel Converter and AGT, the

ODC substantially de-professionalises the practice of Open Data, opening up to a full

and active participation of citizens and local businesses in both data supply and demand

and a potentially massive uptake of data-based activities. This also allows for a more

diffused territorial impact of Open Data, no longer confined to large, well-to-do, and

“innovative” cities but opening up to wide-scale engagement and collective creativity.

At the broader policy level, the ODC concept transforms data into a key element of territorial

capital and its stewardship an essential activity in the public sphere, in an emerging policy

landscape in which the public sector is re-defining its role in a transformation from command

and control to the orchestration of collective and collaborative innovation processes. These

broader policy implications can be mapped onto the pillars of the EU 2020 strategy as follows:

• Smartness: Data gains a new status as a driver of economic development, with value

streams emerging from the production, analysis, and coupling of diverse and diffused

datasets as produced by social and economic activities themselves. A data-driven Smart

City concept can lead to data-driven local and regional development strategies that

broaden the scope of Open Data to include public, private, citizens, and businesses, as

well as nature and machines, as data producers, owners, and users.

• Sustainability: Ecosystem-based management concepts for sustainable development

depend on knowledge and awareness of the current and potential dynamics embedded

in a territory and its natural and human capital. The “Open Data as a common good”

approach can interlink with paradigms such as the IoT-based “wisdom of the earth”

concept to underpin an integrated and dynamic vision of sustainability.

• Inclusiveness: The ODC concept proposes data as a basis for an emerging model of

citizenship, in which data is a right and the stewardship of personal and collective data

is an activity in the public sphere. By democratising access to the operational workings

of Open Data processes, the ODC unleashes the creative potential of all parts of society

on an equal footing of opportunities.

TOWARDS A REGIONAL CLOUD FOR A TERRITORY OF DATA

This policy concept for the ODC took shape in parallel with the pilot experimentation and the

launch of the Associate outreach programme, which in turn had the effect of producing an even

more ambitious policy vision. In July 2014, the project launched the format of the

“Apps4Dummies” workshops, which targeted civil servants in several small and medium

municipalities, generally belonging to metropolitan areas. As the ODC concept thus gained

relevance for networks of neighbouring cities and towns, it became clear that the typical Open

Data paradigm (one administration = one portal) would be inadequate to support the concrete

needs of these more complex configurations of administrative competence.

An important gap appears in fact at the regional or sub-regional scale, for say a metropolitan

area that includes many municipalities ranging from a few thousand to hundreds of thousands

of inhabitants (with ICT budgets in a similar range) [31]. Common problems and issues, from

environmental monitoring to transport planning to citizens’ favourite panoramic views, lead to

common data models and possible apps and services; but who is supposed to offer the platform

for this varied set of local authorities, open also to the contribution of datasets from citizens,

business associations, NGOs, etc.?

The experience of the Lisbon Pilot, with main activities in August-September 2014 [32], offered a

possible answer to this question. The integration of the Citadel Converter involved the use of

the FI-WARE cloud [33], together with one of the key “Generic Enablers” of that platform: CKAN,

an open source Open Data portal system developed and managed by the Open Knowledge

Foundation. The integration of CKAN into FI-WARE would mean that for any territory

or cluster of territories it is in principle possible to open a “FI-WARE instance” and run an Open

Data service, including the Citadel toolkit, on it. Indeed, the Citadel project has been invited to

formally propose the Converter as a FI-WARE Generic Enabler itself.

Given the interoperability between the Converter, the AGT, and CKAN, this could therefore be a

possible platform supporting the kind of “Territory of Data” vision emerging in Citadel, providing

the flexibility of management that would allow it to be configured in the most appropriate way

for a given region, metropolitan area, or other territorial cluster of administrations. In addition,

using the same basic platform functionalities across Europe would mean ensuring an even wider

portability of applications and tools built on the Citadel platform and thus a richer business

ecosystem for the Citadel development community.

[31] A good example of this is the Apps4Dummies workshop held in Palermo in July 2014, in the context of the signing of the “Ventimiglia Pact”, a joint strategic agreement among 52 city governments in the area. Open Data and smart city services are listed among the objectives of common interest, but the question immediately arose as to who should manage the platform.

[32] As stated previously, the enhancements to the Converter in the Lisbon Pilot were funded by the FI-WARE programme. The policy reflections related to FI-WARE and the ODC are instead in the sphere of the Citadel Project.

[33] http://www.fi-ware.org/

In this way, the ODC vision of open data as a common good extends beyond the specific

platform architecture of the Citadel toolkit to include a platform vision with far greater scope.

The issue that then remains is the implementation of FI-WARE as an open public facility in a

given territory, rather than as the highly access-controlled cloud platform supporting the

European ICT industry’s applications and services as it is currently conceived. This brings us to

the relevance of the Digital Agenda, the policy initiative supporting the development of

connectivity and service infrastructures across Europe, and the way it is implemented through

regional ERDF policies, where most of the funding is to be found.

In fact, in the 2007-2013 programming period, R&D&I represented some 26% of the planned

expenditures for Structural Funds, for a total of over 86 Billion Euro (more than the FP7 and CIP

programs together). With several key EU 2020 Flagship initiatives (e.g. Innovation Union, the

Digital Agenda) pinpointing regional policy as the main instrument for implementation, this

figure is likely to rise even further. The set of 271 Regional Operational Programs currently being

drafted therefore represents an important opportunity for the Citadel ODC vision.

The new conditionalities for this cycle of Regional programming – in particular the so-called Smart

Specialisation model for innovation strategies – impose certain principles and processes for

each region, such as stakeholder engagement, “entrepreneurial discovery”, and the integration

of social innovation. These are in turn leading to significantly new policy approaches for many

Regions, particularly in Southern Europe and the New Member States, very much in line with

the “human” approach to technology innovation also shared by the Citadel project and

supported by many initiatives in DG Connect. To support the new policy process, DG Regio has

engaged the Commission’s IPTS in Seville to advise and coordinate individual Regions in

complying with the requirements for Smart Specialisation, but despite their efforts the Digital

Agenda has yet to appear on the top of the policy agenda with any degree of sophistication.

The “Digital Agenda Toolbox”, one of many instruments designed for this purpose, provides

guidelines as regards regional ICT infrastructures, services and applications, and methods for

take-up and digital literacy, and even includes a section on Living Labs, the methodology

adopted in Citadel. Cloud platforms and Open Data are also mentioned, but in relatively

traditional terms compared to the Citadel ODC concepts mentioned above. In addition, no

mention is made of either FI-WARE or the FI-PPP, despite the fact that the Commission has

already funded FI-WARE with over 800 Mln. Euro. This is evidence of the low level of awareness

among Regional policy makers responsible for Smart Specialisation of the possibilities that could

be offered by a wide-scale uptake of the FI-WARE cloud together with the CKAN Open Data

service, the underlying infrastructure onto which the Citadel toolkit would ideally be integrated.

This situation can be attributed to barriers on both sides of the equation. On the one hand, FI-

WARE and many FI-PPP services are still in an experimental stage and not ready for commercial

launch. On the other, ERDF regulations make programming and assignment of funds a long and

cumbersome process that often misses windows of opportunity in a fast-changing sector. Yet

these difficulties are perhaps hiding the real potential and benefits to be gained by

implementing the FI-PPP at the regional scale, especially as framed by the Citadel ODC vision.


While the rollout and provision of broadband can proceed within the context of existing

regional innovation strategies and traditional tenders, cloud services and the ODC concept

overall instead raise a whole series of new questions and opportunities.

With the funding available for the Digital Agenda, this is a potentially very important part of the

business opportunity for Citadel as well as FI-WARE’s cloud services. Yet implementation at the

territorial scale is not only a technical issue: who should ensure data policies across

administrations, guaranteeing openness and citizen engagement through governance in the

public interest, and who should ensure quality of service, privacy and security, and

interoperability among platforms and systems? What are the potential benefits of a pan-

European approach in terms of the business opportunities for local ICT SMEs working with a

common information infrastructure across and between their regions (or, why not, at macro-

regional space level)? And as a consequence, what is the right approach for procurement on the

public side³⁴?

These issues cannot be solved in the abstract but should most effectively be addressed through

a vast and diffused co-design process that involves regional actors and authorities engaged with

the Citadel toolkit together with the FI-PPP as a whole. Since this is essentially an exploratory

process involving large scale (though not necessarily “heavy”) pilot experimentation, it could be

framed in the context of Pre-Commercial Procurement or the EU Public Private Partnership, i.e.

as a shared-cost experimentation whose principal actor can be the Commission itself (in terms

of defining the framework guidelines linking H2020 to the ERDF). If successfully tested, such an

infrastructure could then be implemented in practice as part of by-then ongoing ERDF-funded

activities.

At a more manageable level, the FI-WARE Accelerator programme provides the opportunity to

further develop the Citadel toolkit and enhance its integration with CKAN and other relevant FI-

WARE Generic Enablers. A series of sixteen projects are launching calls for SMEs to propose ICT

services to be developed using the FI-WARE platform, and the pilot testing in these initiatives

can include scenarios of use with neighbouring municipalities, as a proof of concept of some of

the more functional aspects of the ODC vision.

In parallel, however, the bottom-up exploration of Citadel as an innovation support

infrastructure can be equally carried out from a regional policy perspective, for instance through

discussions with the IPTS and interested Regions or collaboration in the framework of on-going

European Territorial Cooperation projects. The feasibility of this is illustrated by the diffused

response to the Citadel Associate outreach program, providing a bottom-up platform of

interested cities. In addition, the CreativeMED project³⁵ has been exploring possible areas of

concrete exchange of thematic and operational knowledge among 12 Mediterranean Regions

for the implementation and monitoring of their 2014-2020 Smart Specialisation strategies. Here,

the hypothesis of concretely experimenting with the ODC Territory of Data vision using the Citadel

toolkit on top of the FI-WARE cloud (or at least using CKAN) is being explored by regional

programming officials in Portugal, Italy, Slovenia, Greece and Cyprus. Similar concepts are

also the subject of specific discussions with CORVE, the Citadel Lead Partner, as concerns the

Flanders Region.

34 In fact the new framework for EU public procurement creates more room for informal negotiations with prospective awardees.
35 http://www.creativemed.eu/




ANNEX I: THE ODC TOOLKIT

Various aspects and elements of the ODC have been realised as software in project years 2 and

3. To the extent that the ODC is a common resource space with a range of tools, this can be

equated with the Citadel space on Github, available at https://github.com/CitadelOnTheMove,

which is used by developers in the Citadel community for both on-going and stable projects.

Among the resources there is the “landing page” http://citadelonthemove.github.io/ which

aims to provide an entry point for new developers with an easy overview of the Citadel toolkit

overall, including examples, tutorials etc.

In these spaces, the resources that can be said most properly to belong to the ODC (apart from

the AGT, which is discussed separately in specific reports) include:

• The Citadel Converter (actually three projects)
• The PHP Converter Library with geoJSON conversion
• The CitySDK-Citadel script

The specific code for each of these resources is available through Github, so in the following we

simply provide an overview of how each element is structured and how it fits into the ODC

concept.

THE CITADEL CONVERTER

The Citadel Converter is the main tool to have been developed as an implementation of the

ODC concept, in that it converts from the most widely found data formats (tabular datasets in

CSV or XLS/X format) into the C-JSON format used by the Citadel templates and the AGT.

Elements of the Converter that have been developed as part of the Lisbon Pilot (namely with

external resources) and then re-integrated into the Converter available on the Citadel Platform

are marked with a double asterisk**.

The Java Converter is made up of three software components:

• The Library: this is the heart of the converter, and carries out the actual conversion function and makes it available to external parties thanks to its APIs
• The GUI standalone: this is the graphical interface for those wishing to use the Library off-line.
• The Portlet: this is the portlet installed in a Liferay Portal to use the Library via web (and integrated into the Citadel platform).

All three modules are written in Java 1.7 using Maven.

THE LIBRARY

Github: https://github.com/CitadelOnTheMove/converter-lib/

Wiki: https://github.com/CitadelOnTheMove/converter-lib/wiki/


Full API documentation: https://github.com/CitadelOnTheMove/converter-lib/wiki/API-Documentation

The main features of the Converter library are to:

• Open virtually any kind of source dataset, at the moment:
   o Comma-Separated Values (.csv): stores tabular data (numbers and text) in plain-text form following RFC 4180.
   o Microsoft Excel files (.xls and .xlsx): Microsoft Excel produces documents in the generic OLE 2 Compound Document and Office Open XML (OOXML) formats. The older OLE 2 format was introduced in Microsoft Office version 97 and was the default format until Office version 2007 and the new XML-based OOXML format.
• Enter metadata for source datasets
• Support semantic match using categories and contexts
• Support column mapping and string operations on columns and custom text to create target datasets with different structures and fields
• Support virtually any target format of the converted dataset, at the moment the Citadel common POI format using JSON and MyNeighbourhood** format using CSV
• Validate generated target datasets, at the moment only according to the Citadel common POI schema (C-JSON)
• Support uploading the generated dataset in CKAN**
• Customizable configuration to tailor CSV parsing options, contexts and categories
• Multilingual, at the moment English, French and Italian are supported
• Easy to plug your favourite logging framework
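The column mapping and string operations listed above can be illustrated with a minimal Python sketch. This is not the Converter library's API (the library itself is written in Java): the function names `convert_rows` and `validate`, the mapping structure, the sample row and the target field names (`title`, `description`, `location`) are all assumptions made for illustration, and do not reproduce the Citadel common POI schema.

```python
import csv
import io

# Hypothetical mapping: each target field is built from a list of parts.
# A part is either ("col", <source column name>) or ("text", <custom text>).
MAPPING = {
    "title": [("col", "name")],
    "description": [("text", "Address: "), ("col", "street"),
                    ("text", ", "), ("col", "city")],
    "location": [("col", "lat"), ("text", " "), ("col", "lon")],
}


def convert_rows(csv_text, mapping):
    """Build one target record per CSV row by concatenating columns and custom text."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {}
        for target, parts in mapping.items():
            pieces = [row[value] if kind == "col" else value for kind, value in parts]
            record[target] = "".join(pieces).strip()
        records.append(record)
    return records


def validate(records, required=("title", "location")):
    """Return the indexes of records in which a required field is empty."""
    return [i for i, r in enumerate(records) if any(not r.get(f) for f in required)]


# Invented sample data, for illustration only.
SAMPLE = "name,street,city,lat,lon\nTown Hall,Main Square 1,Ghent,51.0545,3.7250\n"
pois = convert_rows(SAMPLE, MAPPING)
print(pois)
print(validate(pois))
```

Keeping the mapping as data rather than code mirrors the idea of a reusable configuration: the same function can produce target datasets with different structures simply by swapping the mapping.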

THE GUI STANDALONE

Github: https://github.com/CitadelOnTheMove/converter-gui/

Wiki: https://github.com/CitadelOnTheMove/converter-gui/wiki/

Step by step User Guide with pictures:

https://github.com/CitadelOnTheMove/converter-gui/wiki/User-Guide

Video Guide: https://github.com/CitadelOnTheMove/converter-gui/wiki/Video-guide

The main features of the Converter GUI include the features of the Converter library and make

it easy to:

• Change the Language
• Start a new conversion or cancel it through a navigable wizard
• Load one or more source datasets
• Specify the settings to use to load the CSV file or the Microsoft Excel spreadsheet with automatic preview
• Drag and drop to carry out the semantic match with categories and contexts
• Drag and drop to map columns, add custom text and concatenate both custom text and columns
• See a visual indication of different kinds of messages (notice, warning and error) on data mapping
• Display error message boxes in the conversion process or on validation
• Preview the generated target dataset
• Save the generated dataset locally

PORTLET FOR LIFERAY PORTAL

Github: https://github.com/CitadelOnTheMove/converter-portlet/

Wiki: https://github.com/CitadelOnTheMove/converter-portlet/wiki/

Step by step User Guide with pictures:

https://github.com/CitadelOnTheMove/converter-portlet/wiki/User-Guide

Video walk-through: http://youtu.be/oTn76MqzuG4

The main features of the Converter Portlet include the features of the Converter library and

make it easy to:

• Change the Language
• Start a new conversion or cancel it through a navigable wizard
• Load one or more source datasets
• Specify the settings to use to load the CSV file or the Microsoft Excel spreadsheet with automatic preview
• Drag and drop to carry out the semantic match with categories and contexts
• Drag and drop to map columns, add custom text and concatenate both custom text and columns
• See a visual indication of different kinds of messages (notice, warning and error) on data mapping
• Display error message boxes in the conversion process or on validation
• Preview the generated target dataset
• Save/publish the generated dataset with three options:
   o Download the file locally
   o Save the file to a CKAN server and publish the dataset information to the Citadel Index**
   o Save the file to the Citadel Platform and publish the dataset information to the Citadel Index**

USAGE STATISTICS

The following are some statistics on access to and use of the Converter from April 24,

2014 (installation of a significantly revised version following user feedback) to

December 12, 2014:

• Total of 603 user sessions (persons initiating a conversion process for at least one dataset), an average of 2.6 sessions per day
• Datasets loaded: 814 (386 CSV, 375 XLSX, and 53 XLS)
• Datasets successfully converted: 627 (606 Citadel JSON and 21 MyNeighbourhood CSV)
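The figures above can be cross-checked with a few lines of arithmetic. The short Python sketch below only uses the numbers quoted in the list; counting both the first and the last day of the period is an assumption, as the text does not say how the daily average was computed.

```python
from datetime import date

start, end = date(2014, 4, 24), date(2014, 12, 12)
days = (end - start).days + 1  # assumption: first and last day both counted

sessions = 603
loaded = {"CSV": 386, "XLSX": 375, "XLS": 53}
converted = {"Citadel JSON": 606, "MyNeighbourhood CSV": 21}

print(days)                           # length of the observation period in days
print(round(sessions / days, 1))      # average number of sessions per day
print(sum(loaded.values()))           # total number of datasets loaded
print(sum(converted.values()))        # total number of datasets converted
print(round(sum(converted.values()) / sum(loaded.values()), 2))  # converted / loaded
```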

THE PHP CONVERTER LIBRARY

The PHP Converter library was developed as a modification of the original PHP prototype

version of the Converter software (December 2013), as a proof of concept (tech-friendly more

than user-friendly) for the following features:

• It can carry out on-the-fly CSV, geoJSON and osmJSON to Citadel JSON conversions for mobile applications
• Conversion from the osmJSON format makes it possible to get live data from OpenStreetMap
• The GeoJSON export format makes it possible to use the converted data in other geoJSON-compatible applications (including web mapping services)
• It includes a mapping template editor, for easy generation of config files, which enable live encoding from various datasets
• It provides some caching of converted data (the file is updated only when requested, or depending on some specific criteria, so that converted files can be served faster while still being updated on a regular basis)
• It is designed to be embedded into other Open Source products, such as CMS or data stores, to allow them to natively provide Citadel JSON output.
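The caching behaviour described in the list above (serve a stored conversion, refresh it on request or when a criterion is met) can be sketched as follows. This is a Python illustration of the general technique, not the PHP library's code: the class `ConversionCache`, its parameters and the age-based refresh criterion are assumptions.

```python
import time


class ConversionCache:
    """Serve a cached conversion until it is older than max_age seconds."""

    def __init__(self, convert, max_age=3600, clock=time.time):
        self._convert = convert      # function: source id -> converted data
        self._max_age = max_age      # refresh criterion (assumed: age in seconds)
        self._clock = clock          # injectable clock, useful for testing
        self._entries = {}           # source id -> (timestamp, converted data)

    def get(self, source, force=False):
        entry = self._entries.get(source)
        now = self._clock()
        if force or entry is None or now - entry[0] > self._max_age:
            entry = (now, self._convert(source))   # run the real conversion again
            self._entries[source] = entry
        return entry[1]


calls = []


def fake_convert(source):
    """Stand-in for a real conversion; records how often it is invoked."""
    calls.append(source)
    return {"source": source, "pois": []}


cache = ConversionCache(fake_convert, max_age=60)
cache.get("parks.csv")               # first request: conversion runs
cache.get("parks.csv")               # served from the cache
cache.get("parks.csv", force=True)   # explicit refresh request
print(len(calls))
```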

Github: https://github.com/CitadelOnTheMove/converter-php-lib

The code is mainly intended as a basis for more advanced projects, with the following overall

roadmap:

• Implement the complete set of data fields from the Citadel-JSON format
• Add an editor feature to allow various fields to be used in the output description field for POI
• Plug the library into data sources other than CSV files, particularly database backends from existing CMS.

THE CITYSDK-CITADEL CONVERSION SCRIPT**

The CitySDK-Citadel conversion script makes it possible to query a CitySDK tourism POI database using a

variation of the CitySDK API³⁶, with the results being fed directly into the Citadel Converter for

conversion. The original script was developed³⁷ on the fly during the CitySDK-Citadel workshop

held in the context of the Open Data Days in Ghent (February 2014).

Github: https://github.com/CitadelOnTheMove/CitySDK-Citadel-Script

36 CitySDK is a sister project to Citadel. It developed standard APIs for Open Data webservices to allow portability of apps across Europe.
37 Development of the CitySDK Citadel conversion script was carried out without resources from Citadel. The first script was developed with resources from the CitySDK project, while its refinement was carried out as part of the Lisbon FI-WARE Pilot.

During the Lisbon pilot, the script was further refined to allow accessing the database (with

parameters, filters, etc.) through a URL query such as

http://citysdk.ist.utl.pt:8000/?city=amsterdam&format=csv&limit=5.

Github: https://github.com/rsbarata/CitySDK-Citadel-Script

In parallel, an additional feature was added to the Citadel converter** that makes it possible to upload a

file through a URL field rather than by browsing and selecting. This has made it possible to add

datasets with tourism POIs from Lisbon, Amsterdam, Helsinki, and Rome to the Citadel Platform

through direct queries to the CitySDK API.
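A URL query of the kind quoted above can be assembled programmatically. The Python sketch below only builds the URL and does not contact the service, which may no longer be reachable. The helper name `build_query` is an assumption, and only the three parameters that appear in the example URL (`city`, `format`, `limit`) are used; any other parameters or filters the script supports are not documented here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://citysdk.ist.utl.pt:8000/"   # endpoint quoted in the text


def build_query(city, fmt="csv", limit=5):
    """Build a query URL from the three parameters shown in the example."""
    return BASE_URL + "?" + urlencode({"city": city, "format": fmt, "limit": limit})


url = build_query("amsterdam")
print(url)
print(parse_qs(urlsplit(url).query))   # the parameters decoded back from the URL
```

A URL built this way could then be pasted into the Converter's URL field in place of an uploaded file.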


ANNEX II: STANDARDS ADOPTED IN THE CITADEL PLATFORM

This Annex is based on material from Citadel Deliverable 2.3.3, “New Standards and

Recommendations”, authored by Florian Daniel, Julia Glidden, and Geert Mareels.

Although the Open Data Commons aims to act as an open framework, in practice it is

nonetheless necessary to identify one or more standards to adopt for use. The following

therefore details the main choices made for the Citadel Platform as currently available at

http://www.citadelonthemove.eu/.

FILE FORMATS MAPPING

From our interactions with Pilot and Associate cities, Citadel quickly understood that most data

owners in cities do not have a strong grasp of the relative benefits of different data models and

file formats. Therefore, all but the most advanced cities take the path of least resistance,

publishing data in whatever data structure they already hold using spreadsheet-friendly file

formats such as .CSV and .XLS. In only a few cases (10% of Associates) did Associate cities have any

data published in an RDF-model format.

In light of this on-the-ground reality, the Citadel team realized that it needed to create tools and

associated recommendations that balanced the simple compliance steps required to get cities

on board on the one hand, with maximizing data reuse on the other. Toward this end, the team

conducted a mapping of the most appropriate file formats available:

Table 14. Citadel Common File Formats Mapping Grid

| Format | Description | Pros | Cons |
|---|---|---|---|
| .XLS/.XLSX (Excel) | Represents data as tables accepted by all spreadsheet programmes | Very accessible to people and widely used | Proprietary format. Does not use Unicode. Too simple to allow for most programmes to make use of this data directly. Does not express relations between data |
| .CSV | Represents data in ‘flat’ tables easily read by people or machines | Easily understood or ‘parsed’ by most programmes. Easily read by humans. Application-neutral. | Tabular format does not express relationships between data making it less applicable for complex applications |
| .XML | Represents data as a structured tree schema that expresses relations between data | Strong schema makes it possible to attach rich information data | Schemas are complex making it difficult to write programmes |
| .JSON | Represents data in a simple tree structure making it programmer-friendly | Structure and links make it very easy to build services using data | Lack of schema means it only supports simpler types of data |
| RDF Formats (e.g. Turtle, N-Triples and JSON-LD) | Represents data as a network of linked points that make it easy to understand complex patterns using computer programmes | Highly-structured information makes it easy to search and retrieve information in services. Easy to visualise data relationships | Complex structure makes it difficult to ‘parse’ and therefore costly to work with. Relational structure makes human reading challenging. Complex to work with in development. |

Based upon the above mapping and in alignment with the trends outlined above, Citadel

ultimately chose to use two types of file format in its own work:

• CSV – Recommended for publishing Open Data for the public³⁸
• JSON – Recommended for publishing Open Data to use in mobile applications

FILE FORMATS CHOICES

Citadel’s choice to use CSV as the main input data format was first and foremost led by the

project’s ambition to foster easy access to open government data. As discussed, a majority of

cities around the world use spreadsheet programmes such as Excel for managing their data.

Thus, rather than fight this trend and try to impose a more technically advanced standard from

above, Citadel elected to work with the practice via a format like CSV that is easy to edit and

export³⁹ and noted for its simplicity and accessibility⁴⁰, compactness, ease and speed of

processing and scalability⁴¹.

The choice of JSON for mobile development was motivated primarily by a desire to make our

Open Datasets attractive to App developers. JSON presents data in a simple ‘tree’ structure

without the need for a formal ‘schema’. JSON is consequently well-suited to building capable

apps and popular with the developer community⁴² because, unlike ‘schema’ formats like XML,

the dataset contains all the information the app needs to work well. At the same time, JSON’s

lack of schema likewise makes it relatively easy to convert standard CSV files into this

developer-friendly format.
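Because no formal schema is required, turning a flat CSV file into JSON can take only a few lines. The following Python sketch illustrates the idea with invented sample data; it produces generic JSON objects keyed by the column headers, not the Citadel JSON format itself.

```python
import csv
import io
import json

# Invented sample data, for illustration only.
csv_text = "name,category\nCity Museum,museum\nCentral Park,park\n"

# Each CSV row becomes one JSON object whose keys are the column headers.
rows = list(csv.DictReader(io.StringIO(csv_text)))
json_text = json.dumps(rows, indent=2)
print(json_text)
```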

DATA MODEL CHOICE

To support the release of developer-friendly JSON, Citadel created a common data model for

each dataset – which we call ‘Citadel JSON’. Citadel JSON’s Data Model is based on an extension

of the W3C PoI Core Data Model⁴³. Citadel JSON’s data model has two advantages over other

commonly-used JSON data models:

1. Semantic Annotation – A Citadel JSON file tags every piece of information with a machine-readable category. These tags are contained within the dataset, meaning any app can easily read and make sense of the data without additional programming.

2. Cross-Border Use – A Citadel JSON file will work with any other application designed using Citadel JSON. This means that an app developed to find art galleries in Helsinki can also find galleries in Palermo with no need to develop and download a new service.

38 Some of the Citadel converting tools also accept Excel, OSM JSON or geoJSON files
39 http://www.opendataimpacts.net/2014/10/data-standards-and-inclusion-in-the-network-society/
40 CSV files can be read and edited by humans, using a simple text editor
41 Huge CSV datasets can be easily handled, as the “one line per entry” structure enables sequential processing
42 http://blog.mongolab.com/2011/03/why-is-json-so-popular-developers-want-out-of-the-syntax-business/
43 http://www.w3.org/TR/poi-core/

The two features above make Citadel JSON a significant improvement, from the perspective of

developers, over existing Open Data models. The following visual provides an overview of the

Citadel JSON data model:

Figure 30. Citadel JSON Data Model

The Citadel JSON data model was a significant extension of the W3C PoI Core Draft, currently

the global guideline for the production of PoI Data Models.

POINTS OF INTEREST (POI) STANDARDS

As noted above, Citadel’s Data Model is based on the W3C POI Core draft⁴⁴, which defines a data

model for a ‘location about which information is available.’ As the model was designed to be

used in mobile web applications, it was implemented in JSON, which, as discussed above, is the

most used and suitable data format⁴⁵. The resulting format, called “Citadel JSON”, includes

data from the original dataset, information about the dataset itself (known as “metadata”:

“data about data” – see below), and specific fields that describe how the data should be used

in mobile applications.

Citadel JSON can be compared with the related format GeoJSON, which is also used to describe

POI, but does not include information about the dataset, and is not specifically designed to

provide all required data for mobile applications. Both file formats can be merged, or easily

converted.

44 http://www.w3.org/2010/POI/documents/Core/core-20111216.html
45 JSON is the native data format used by JavaScript, which is responsible for the dynamic part of the applications

GEOSPATIAL STANDARDS

Citadel JSON uses the WGS84 coordinate reference system to represent the location of points of

interest accurately. WGS84 is used by GPS providers and most well-known mapping systems

and can be considered the world standard. The coordinates of a given point on Earth are

expressed in decimal format, using axis order latitude, then longitude, and separated with a

space, e.g. “50.838908 4.373942” for the European Parliament building in Brussels. The Citadel

JSON conversion process also uses latitude and longitude fields to produce the Citadel JSON file.

The resulting format in Citadel JSON combines these two fields with a separating space,

resulting in a single value with latitude first, then longitude.

The Citadel format also allows other standards to be declared and used, though this is not

recommended and not handled by the mobile templates⁴⁶.
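The “latitude longitude” string described above is straightforward to parse, but the axis order deserves attention when exchanging data with GeoJSON, where positions are written longitude first. The Python sketch below is an illustration only; the function names are assumptions and are not part of the Citadel toolkit.

```python
def parse_citadel_position(value):
    """Split a 'latitude longitude' string into two floats and range-check them."""
    lat_text, lon_text = value.split()
    lat, lon = float(lat_text), float(lon_text)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {value!r}")
    return lat, lon


def to_geojson_point(value):
    """Build a GeoJSON Point; GeoJSON positions are longitude first, then latitude."""
    lat, lon = parse_citadel_position(value)
    return {"type": "Point", "coordinates": [lon, lat]}


# The example from the text: the European Parliament building in Brussels.
print(to_geojson_point("50.838908 4.373942"))
```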

DATE AND TIME DATA

Citadel uses date and time information both to identify the dataset itself and as part of datasets

related to events. The Events Template created by Citadel uses this information to filter the POI

based on selected dates. The dataset time metadata is formatted using ISO 8601⁴⁷,

preferably with time zone shift information (e.g. “2014-10-02T15:13:19+00:00”), as the

applications may be used in various time zones.

Due to the variety of input data formats in source datasets, the date format used for Events POI

is entered free-form into Citadel JSON. However, precautions should be taken to ensure that

the format used can be parsed by the Events Template.

In addition to ISO 8601, the Citadel Events Template notably handles the RFC 2822 Date and Time

Specification⁴⁸ (e.g. “Mon, 25 Dec 1995 13:30:00 GMT”), as well as dates using the format

“DD/MM/YYYY”.
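A data publisher who wants to check that event dates fall into one of the three formats named above could use a small helper such as the following Python sketch. It is not the Events Template's own parser (the templates run in the browser); the function name `parse_event_date` and the order in which the formats are tried are assumptions.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_event_date(text):
    """Try ISO 8601, then RFC 2822, then DD/MM/YYYY; raise ValueError otherwise."""
    text = text.strip()
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        pass
    try:
        return parsedate_to_datetime(text)
    except (TypeError, ValueError):   # the exception type depends on the Python version
        pass
    return datetime.strptime(text, "%d/%m/%Y")


for sample in ("2014-10-02T15:13:19+00:00",
               "Mon, 25 Dec 1995 13:30:00 GMT",
               "25/12/1995"):
    print(sample, "->", parse_event_date(sample).isoformat())
```

Note that the first two formats carry time zone information while “DD/MM/YYYY” does not, so the parsed values should not be compared with each other without first deciding how to treat dates that lack a time zone.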

SENSOR AND IOT

As part of Citadel’s Pilot Activities with the City of Manchester, we integrated some real-time

sensor data into our App Generator. Citadel’s work on sensor data needed to find a relevant

standard and ultimately led the project to the work of the Open Geospatial Consortium (OGC)⁴⁹

on Sensor Web Enablement. OGC has developed a series of industry-leading data model

standards⁵⁰ that are designed to describe, query and exchange sensor information. Of these

model standards, two proved useful for the Citadel sensor data:

1. “Sensor Observation Service” (SOS)⁵¹ - describes a web service to query sensor data,

2. ISO 19156:2011 “Observations and measurements”⁵² - describes the sensor data.

46 Because the underlying map uses WGS84 too
47 http://www.iso.org/iso/iso8601
48 https://www.ietf.org/rfc/rfc2822.txt
49 http://www.opengeospatial.org/
50 These standards are: SensorML (sensor model), O&M (observation data), SOS (observation service), SPS (planning service), and SAS (alert service)
51 http://www.opengeospatial.org/standards/sos

Data produced by sensors and by Internet of Things (“IoT”) devices are often only available in

proprietary formats, or vendor-specific data models. Such data feeds are generally accessed

through a webservice (an automatic dataset that supplies the latest information when

requested by an app or service) which most commonly exports data in JSON, XML or CSV

format. Citadel chose to use the proprietary sensor platforms already installed in Manchester to

showcase live sensor data in action⁵³. For others exploring the use of sensor data as Open Data,

we recommend the use of a webservice with either JSON or CSV format.

METADATA

Metadata, or ‘data about data,’ is ‘structured information that describes, explains, locates, or

otherwise makes it easier to retrieve, use or manage an information resource.’⁵⁴ The reference

metadata standard to describe online resources is Dublin Core Metadata,⁵⁵ from which 15 core

terms⁵⁶ have been normalised in ISO 15836:2009.⁵⁷ Citadel chose to conform to this

widely-recognised international standard - all Citadel data therefore uses Dublin-Core Metadata.
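As an illustration of how the 15 Dublin Core terms can serve as a checklist, the Python sketch below compares the keys of a metadata record against them. The helper `check_metadata` and the sample record are assumptions; how Citadel JSON actually names and stores these terms is not specified here.

```python
# The 15 core terms of the Dublin Core Metadata Element Set.
DUBLIN_CORE_TERMS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def check_metadata(metadata):
    """Return (keys that are not core terms, core terms that are not filled in)."""
    keys = set(metadata)
    return sorted(keys - DUBLIN_CORE_TERMS), sorted(DUBLIN_CORE_TERMS - keys)


# Invented sample record, for illustration only.
example = {"title": "Parks", "language": "en", "licence": "CC-BY"}
unknown, missing = check_metadata(example)
print(unknown)        # keys outside the 15 core terms
print(len(missing))   # number of core terms not provided
```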

POI DATASET CATEGORIES

ISO 19115 [58] describes the main themes for geographic dataset categorization. These top-level themes have been extended by the INSPIRE directive, which recommends for each dataset:

- A unique INSPIRE theme [59]
- Additional keywords from the GEMET-Concepts, [60] a professional thesaurus, or free keywords

The Citadel Data Index [61] uses a different, narrower list of categories, as it is more convenient for general public POI categorization. This classification is available using a JSON implementation of the RDF Data Catalogue standard (DCAT) [62] through the dataset web service of the Open Data Index.

The “categories” used inside the datasets themselves are free-form because they reflect the ones used in the original data file, which may or may not be structured. While Citadel does not enforce it, the use of existing categorization vocabularies would be a step towards better interoperability and would allow better dataset auto-discovery in the future; it is therefore advised.

[52] http://www.iso.org/iso/catalogue_detail.htm?csnumber=32574
[53] Which uses Xively API, which was used by 2 pilot cities, fetching data as JSON: https://xively.com/dev/docs/api/
[54] http://en.wikipedia.org/wiki/Metadata_standards
[55] http://dublincore.org/
[56] The 15 core terms are: title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, rights
[57] http://www.iso.org/iso/fr/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=52142
[58] http://www.iso.org/iso/fr/home/store/catalogue_tc/catalogue_detail.htm?csnumber=53798
[59] http://inspire.ec.europa.eu/theme/
[60] http://www.eionet.europa.eu/gemet/en/themes/
[61] http://www.citadelonthemove.eu/en-us/opendata/opendataindex.aspx
[62] http://www.w3.org/TR/vocab-dcat/


MOBILE APPLICATION TEMPLATES

Citadel’s mobile application templates use HTML5 [63] – which is mobile-platform agnostic – rather than vendor-specific mobile platform technologies such as Google's Android or Apple's iOS [64]. While HTML5 itself doesn't allow developers to build native apps (applications developed for use on a particular platform or device) at this point in time, [65] it can be embedded into native applications for iOS and Android to offer native features to mobile users. Citadel’s HTML-based templates can be used even on the most basic web servers, [66] and adapted without a heavy development environment, using only text editors.

GAPS IN EXISTING STANDARDS

CHARACTER ENCODING ISSUES

Citadel supports the UTF-8 character encoding standard – which has existed since 1996, has been a world standard since 2003 and supports all known alphabets on earth. Despite the benefits of UTF-8, use of Citadel tools has identified a number of outstanding issues with files whose characters are encoded in a different format:

- Some popular spreadsheet editing software, including Microsoft Excel, still uses regional character encodings by default
- This choice leads to less-interoperable files, with accents and special characters being misinterpreted
- Such regional encoding can cause challenges for the conversion process, including unreadable accented characters in files

Citadel Recommendation:

- Adoption of the UTF-8 standard should be ensured so that text data and information can be exchanged in an interoperable manner throughout Europe and across the world.
- Access to state-of-the-art standards and their implementation into varying languages should be free wherever possible. [67]
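The misinterpretation of accented characters described above can be reproduced in a few lines of Python. This is a minimal sketch: it assumes the file was saved in Windows-1252 (`cp1252`), which is only one of many regional encodings, so the real source encoding has to be known or detected before converting.

```python
def to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from a regional encoding and re-encode them as UTF-8.

    The default source encoding is an assumption for this example.
    """
    return raw.decode(source_encoding).encode("utf-8")


# "Café" (written with an escape for the accented e) saved as Windows-1252.
regional_bytes = "Caf\u00e9".encode("cp1252")

# Reading those bytes as if they were UTF-8 loses the accented character:
# the invalid byte is replaced by the Unicode replacement character.
misread = regional_bytes.decode("utf-8", errors="replace")

# Converting with the correct source encoding preserves the text.
converted = to_utf8(regional_bytes)
```

The same single byte that represents the accented letter in the regional encoding becomes a two-byte sequence in UTF-8, which is why a file cannot simply be relabelled and must actually be converted.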

POI ISSUES

As W3C did not finalise a standard for POI, Citadel had to use the unfinished draft. Use of this

draft uncovered a number of issues:

- Some fields did not suit “flat files,” [68] and were simplified to become more user-friendly,

[63] HTML5 standards also rely on CSS3 and JavaScript languages
[64] Despite being Open Source, Android remains complex to use for non-professional developers, and iOS is proprietary
[65] A big issue with HTML5
[66] The templates also use PHP, and MySQL for database-powered apps, which both power most of the world's websites
[67] E.g. ISO standards are still paid-access.


- Citadel needed to describe the dataset itself, and extend POI drafts to add fields that describe the dataset – these fields basically wrap the updated POI data,
- Citadel needed to add fields specifically designed to be used with mobile applications, which were implemented through an additional extensible data model (the “tpl” identifiers),
- The draft W3C standard did not go far enough - real-world implementation revealed a need to build usable tools without using a full linked data infrastructure.

Citadel Recommendation:

- Foster the development of proven standards that fit developer needs.

EVENTS ISSUES

The W3C POI data standard did not include calendar information by default, which made it impossible to properly display events on the map.

Citadel Recommendation:

- Citadel added calendar information to POI data using the extensible attributes defined in the Citadel JSON format.

GEOSPATIAL ISSUES

Geospatial standards remain a fuzzy area. Many reference systems exist, with no clear guidance on which ones are best to use and why.

In our work, Citadel found that geographical coordinates from different countries varied in both axis order (whether latitude or longitude is written first) and the form in which latitude and longitude were written (one cell or two separate cells). Citadel also found that geographic coordinates are based on various geospatial reference systems, which often leads to inconsistency between datasets because the reference system is not always stated explicitly (especially once the data is stored in tabular files). As an example, Barcelona – which has published many rich datasets – offers a bus stop dataset which, even after proper conversion to the global standard for latitude and longitude (WGS84), shows a small shift (about 2 blocks) for all POIs, making it practically unusable for Citadel apps.
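The variations in axis order and cell layout described above can be handled with a small normalisation step before data is published. The Python sketch below is illustrative only: the function name and the `axis_order` labels are hypothetical, it does not handle decimal commas, and it performs no conversion between reference systems, which requires a dedicated GIS library.

```python
def parse_coordinates(cells, axis_order="lat_lon"):
    """Return (latitude, longitude) as floats.

    `cells` holds either one string containing both values ("41.38, 2.17")
    or two separate values. `axis_order` must be stated by the caller,
    because it cannot be reliably guessed from the numbers alone.
    """
    if len(cells) == 1:
        # One cell: split on a comma or semicolon separator.
        parts = str(cells[0]).replace(";", ",").split(",")
    else:
        parts = list(cells)
    if len(parts) != 2:
        raise ValueError("expected exactly two coordinate values")
    first, second = (float(str(part).strip()) for part in parts)
    if axis_order == "lat_lon":
        lat, lon = first, second
    elif axis_order == "lon_lat":
        lat, lon = second, first
    else:
        raise ValueError("axis_order must be 'lat_lon' or 'lon_lat'")
    # A basic range check catches many swapped-axis mistakes.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates outside the valid latitude/longitude range")
    return lat, lon
```

For example, `parse_coordinates(["41.3851, 2.1734"])` and `parse_coordinates(["2.1734", "41.3851"], axis_order="lon_lat")` both yield the same latitude and longitude pair. The range check cannot detect every swap, since many points are valid in either order, which is why an explicit, documented axis order matters.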

Conversion between geographic coordinate systems remains a complex issue for non-GIS specialists. We believe this complexity has contributed to a lack of easy-to-use tools and of best practice on which geospatial reference systems to use. Finally, the auto-discovery feature of the Citadel AGT, which allows any app to automatically detect data corresponding to one’s current location and load it into the app, shows that there is a key need to be able to describe the area covered by a given dataset (instead of attaching it to a central point), in order to enable applications to get the most accurate data at different scales depending on their completeness.

[68] They are rather designed for Linked Data, using extensive URI and namespaces instead of clear text and URL, which are more user-friendly for developers and data editors that lack the surrounding infrastructure to easily produce these structured and linked data files.


A possible pre-requisite for this functionality would be a common thesaurus of administrative

boundaries and their evolution through time.
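One simple way to describe the area covered by a dataset, rather than a single central point, is a bounding box. The Python sketch below shows how an app might select the datasets that cover the user's location, preferring the most local one. The `BoundingBox` structure, the dataset names and the coordinates are all hypothetical, and boxes that cross the 180° meridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Rectangular coverage area in degrees (hypothetical structure)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        return self.south <= lat <= self.north and self.west <= lon <= self.east

    def area(self) -> float:
        # Area in square degrees; only used to rank boxes by size.
        return (self.north - self.south) * (self.east - self.west)


def datasets_covering(datasets: dict, lat: float, lon: float) -> list:
    """Return names of datasets whose box contains the point, smallest box first."""
    hits = [(box.area(), name) for name, box in datasets.items()
            if box.contains(lat, lon)]
    return [name for _, name in sorted(hits)]


# Invented example boxes; the figures are rough and for illustration only.
example_datasets = {
    "city_bus_stops": BoundingBox(41.30, 2.05, 41.47, 2.23),
    "regional_museums": BoundingBox(40.50, 0.15, 42.90, 3.35),
    "other_city_parks": BoundingBox(53.40, -2.35, 53.55, -2.15),
}
```

A bounding box is only an approximation of an administrative area, so a shared reference of boundaries would still be needed to name and match areas precisely.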

Citadel Recommendation:

- Set a central, open, interoperable standard for geographic coordinates based on WGS84, with a single defined axis order and coordinate format
- Provide adequate and open conversion tools to enable data publishers to publish their data using a shared, unique coordinate system, so that information can be exchanged outside the GIS community
- Establish an administrative ontology of European boundaries, at multiple scales and with a historical perspective, in order to enable local data naming and discovery using both geographical coverage and administrative entities
- Where possible, cities should geocode their POI in latitude and longitude fields.


ANNEX III: THE CITADEL CHARTER

PREAMBLE

THE MALMOE DECLARATION

The Malmoe Ministerial Declaration on eGovernment, approved on 18th November 2009 [69], sets

forth the following joint vision for 2015:

“European governments are recognised for being open, flexible and collaborative in

their relations with citizens and businesses. They use eGovernment to increase

their efficiency and effectiveness and to constantly improve public services in a way

that caters for users’ different needs and maximizes public value, thus supporting

the transition of Europe to a leading knowledge-based economy.”

At the time, the Malmoe Declaration seemed to be an important step forward, yet as we approach the year 2015, Europe could not be further from that vision. Since 2009, trust in the European Union has fallen to new lows, from 48% to 31%, and trust in national governments from 29% to 27% [70]. Whatever progress has been made in eGovernment services and Open Data strategies has not been enough to have a significant impact. Engagement, representation, and the provision of services will have to change, and change quickly, in order to close the gap between European governments and the citizens and businesses they serve.

THE CITADEL STATEMENT

The Citadel Statement, signed by a group of local authorities a year after Malmoe, on December

10th, 2010 [71], aimed to ‘make Malmoe real’ by identifying the key role local governments should

play in this process. While the Statement’s recommendations have not been adopted at the EU

Ministerial level, they did lay the ground for the ‘Citadel… on the Move’ project, which in fact

has been working since 2012 with pilot cities in four EU Member States to explore new

scenarios with local citizens and businesses. Two years of co-design and experimentation with

communities of users in over 100 towns and cities are paving the way for a massive uptake of

Open Data as the foundation of new business ecosystems, urban lifestyles, and government

services.

The focus of development in Citadel has been an integrated platform designed to engage non-

technical citizens, businesses, and civil servants in Open Data, providing simple tools with which

to convert and publish information and generate an app in only a few minutes. As part of this

effort, Citadel also explored related issues such as semantics and standards, governance and

privacy, and Living Lab engagement and evaluation methodologies, as enabling mechanisms for

local authorities to spark off diffused processes of transformative innovation. To build momentum, a specific outreach initiative has enlisted over 100 local authorities in over 60 countries worldwide sharing the Citadel vision.

[69] http://ec.europa.eu/digital-agenda/sites/digital-agenda/files/ministerial-declaration-on-egovernment-malmo.pdf
[70] Standard Eurobarometer 81, http://ec.europa.eu/public_opinion/index_en.htm.
[71] http://www.corve.be/docs/english/Citadel%20Statement.pdf

LEARNING FROM THE CITADEL PROJECT

In this process, specific new insights have been gained on the core issues identified in the 2010

Citadel Statement, as follows:

1. Common Architecture, Shared Services and Standards

- The Citadel Statement called for a “common service delivery architecture.”
- The Citadel project defined the similar but more open and agile concept of the Open Data Commons, a shared space in the public domain that, rather than imposing standards, provides an open, multi-standard framework that promotes the convergence of practice.

2. Open Data, Transparency and Personal Rights

- The Citadel Statement called for common data models to make data consistent across Europe, respecting personal information.
- The Citadel project defined these within an open semantic framework that allows for the integration of new data models. This allows for the emergence of privacy-as-a-service concepts on the one hand, and for mobile applications to ‘discover’ similar datasets in new localities on the other.

3. Citizen Participation and Involvement

- The Citadel Statement called for citizen participation in decision-making and service design.
- The Citadel project adopted the Living Lab method, engaged citizen developers in co-designing the tools and services, and developed a platform that opens up Open Data itself to non-technical citizens and businesses, broadening the scope from Public Sector Information to include data-holders throughout the community.

4. Privacy and Identification of Individuals

- The Citadel Statement called for a European framework to address privacy issues.
- The Citadel project explored the implications of privacy in citizen-driven Open Data scenarios, identified limitations in the current policy debate, and defined procedures for individual data licensing for privacy-as-a-service in the Open Data Commons.

5. Rural Inclusion

- The Citadel Statement called for equality of broadband access.
- The Citadel project focused on Smart City services, extending the scope beyond large and well-funded cities to emphasize the innovation potential of small and medium-sized towns and proposing broadband cloud platforms for Open Data as a regional public service.


TOWARDS THE MALMOE OBJECTIVES

These elements, in line with the logic of the Citadel Statement, come together to provide a

more effective and convincing path to reaching the main objectives of the Malmoe Declaration.

“Citizens and businesses are empowered by eGovernment services designed

around users needs and developed in collaboration with third parties, as well

as by increased access to public information, strengthened transparency and

effective means for involvement of stakeholders in the policy process”.

This objective speaks about Open Data without mentioning it. Citadel, together with other

projects and general trends over the last five years, demonstrates that Open Data is the

foundation for any effective eGovernment strategy based on transparency and open access to

public information. Furthermore, Citadel has demonstrated that local authorities need to go

beyond mere “involvement” and adopt strategies for deep engagement of citizens and

businesses in a logic of service co-design and co-production. Public sector information needs to

be considered not the end goal of an Open Data strategy but only the beginning of a path which

will need to integrate data openly provided by citizens, businesses, and any other activity or

entity in the territory.

“Mobility in the Single Market is reinforced by seamless eGovernment services in

the setting up and running of a business and for studying, working, residing and

retiring anywhere in the European Union”.

Citadel has explored services able to cross any administrative border, including those between

municipalities and regions in the same country. The project identified the semantic

interoperability of underlying data structures as key to the seamless, trans-European fluidity of

the services upon which they are based. This issue is addressed not by attempts to agree upon

unique standards, but rather by the definition of an open semantic framework that allows for

standards to interoperate and evolve over time, constantly interacting with end users through

the convergence of practice. Finally, mobility is an issue that needs to be also viewed in cultural

and social terms, as it is a defining feature of European citizenship. Citadel has explored these

issues by considering the status of ‘visitor’ and ‘host’ in the definition of Open Data governance

strategies.

“Efficiency and effectiveness is enabled by a constant effort to use eGovernment to

reduce the administrative burden, improve organisational processes and promote a

sustainable low-carbon economy”.

Citadel pilot cities have demonstrated that the best way to reduce administrative burdens is to

open up to citizen engagement for the co-production of ICT-based services. This leads to

institutional innovation processes that in turn demonstrate positive sustainability effects

through improved awareness of the environmental dynamics of urban systems and the impacts

of service design.

“The implementation of the policy priorities is made possible by appropriate key

enablers and legal and technical preconditions”.


Citadel has focused on inclusive local governance of Open Data strategies as the means of identifying the key legal and procedural enablers that city administrations are able to set in place for diffused uptake in their territories, including awareness building, public events and hackathons. Broader engagement of citizens and businesses in debates on issues such as privacy and security is fundamental in order to provide effective bottom-up input to national and EU legal frameworks. Finally, the need has emerged for the provision of common Open Data platforms and tools as an enabling public service open to all, governed by the principles of an Open Data Commons.

Thanks to the experience gained and the lessons learned in the Citadel project, the key

elements for achieving the Malmoe Declaration’s important objectives are in place and the way

forward is clear, on the eve of the target date of 2015. Now is the time for local authorities to

join forces by declaring these common principles, committing to common action, and

challenging others to play their part.

THE CITADEL MANIFESTO

VISION

Europe and the world are facing severe challenges that place an increasing burden on

governments at every level. Municipal governments, however, are those best placed to address

many of these challenges, as they are closest to the citizens and businesses who will need to be

engaged in order to find the right solutions. While local communities harbour the innovation

potential we need, the greatest barrier to engagement is the increasing lack of trust in

government experienced in recent years.

One of the key levers for building a new relationship between local governments and citizens

and businesses is the opening up of public sector information. Open Data is a concrete step

towards transparency and engagement, while it also highlights the value of the work of civil

servants, the innovation potential of local development communities, and the benefits that can

be attained when the two collaborate towards common goals. Open Data processes also need

to deeply engage local citizens and businesses, as the first step in going beyond the public sector

to involve the whole local community in publishing and using data.

The objective is to achieve the aims of the Malmoe Declaration – engagement of citizens and

businesses, mobility of people and services, and efficient and sustainable public services –

through promotion of a massive uptake of Open Data. The vision is that of a ‘Territory of Data’,

composed of networks of towns and cities with data-driven strategies that enable local

communities to co-design sustainable services based on an improved awareness of local

phenomena and activities, socio-economic and environmental dynamics, and market and

business opportunities.


COMMITMENTS AND CHALLENGES

To that end, we, the signatories of this Citadel Charter, commit our local governments to work towards:

- The massive opening up of all data we and our constituencies hold, with due respect for individual and collective rights and using open platforms and standards frameworks.
- The deep engagement of local development communities together with citizens and businesses in data-driven societal innovation processes, including social and institutional innovation as well as the development of innovative products and services.
- The establishment of local Open Data governance groups that allow our administrations to play a role of process orchestration, in order to most effectively define common policies for privacy, security and related issues as well as business opportunities in civic innovation environments and to ensure that access to open data is continuous and consistent over time.

We challenge the European Commission and national and regional governments to:

- Engage with our local governments as key actors for capturing bottom-up energies as well as implementing Europe 2020 strategic objectives top down, including the Digital Agenda for Europe and Regional Smart Specialisation Strategies.
- Provide a cooperation framework that allows us to effectively work together with local authorities across Europe to promote the institutional innovations that are key to capturing the potential of Open Data for our territories.
- Support the provision of common and open platform infrastructures and services, including those which have already received European funding such as those developed in the CIP Smart Cities initiative and in the Future Internet Public Private Partnership (FI-PPP), and play their part in using these platforms to open the data they hold.

We challenge technology developers, from local tech communities to multi-national corporations, to:

- Engage directly with public authorities and citizens to discover innovation potentials and needs, using approaches such as Living Labs to co-design more effective products and services.
- Work to jointly explore the business benefits and market potentials of technology innovation having the public interest as the primary goal, with a particular focus on cultural expression, public and civic participation, services for the needy, and environmental sustainability.
- Adopt open platforms, standards, and frameworks that support interoperability while promoting openness, participation and engagement and full respect of individual and collective rights in the conception and design of services.

We challenge citizens and businesses both in our own local communities and throughout

Europe to:

- Engage with public innovation communities, open up to innovation, and participate in the co-design of new public services and spaces.
- Reflect, both individually and collectively, on emergent issues of personal privacy and identity, recognizing the key role for citizen engagement in designing new societal frameworks of entitlement and citizenship.
- Demand openness and transparency from governments and businesses at all levels, as the prerequisite for gaining the trust required to work together in addressing the key problems society faces today.


ABOUT THE AUTHORS

JESSE MARSH

Jesse Marsh has been exploring innovation since the late 1980s, when his professional interest

shifted from industrial design to information and communication technologies and local

development. He has participated in over 35 EU projects dealing with a range of issues from

cultural identity to smart cities, and has worked as a consultant to the FAO, the European

Parliament, and the World Bank. He has been an active member of the Living Lab movement

since 2007 and is currently Special Advisor to the President of ENoLL, consultant to the City of

Palermo for its Open Data and Smart City strategies, and is advising the Sicilian and Calabria

Regions on the role of Social Innovation in regional Smart Specialisation Strategies 2014-2020.

FRANCESCO MOLINARI

Francesco Molinari is currently research associate at Politecnico di Milano and visiting professor

at the Ulster Business School of the University of Belfast. As research and project manager he

has worked for several public and private organizations in Europe, including clients from

Belgium, Cyprus, Greece, Israel, Italy, Portugal, Slovenia and the UK. In 2008 he wrote a study for the European Commission assessing the Living Labs approach in the EU innovation and Future Internet scenario. He has advised several Italian Regions and central government bodies on topics related to innovation policy, smart specialization and pre-commercial procurement.

RICARDO STOCCO

With a degree in Archaeology from the University of Padua, Ricardo has coordinated and

directed several archaeological excavations and surveys in Italy and abroad, with the related

activities of digital documentation. He has also designed and developed ICT solutions and

services for public dissemination and knowledge exchange for a series of archaeological

expeditions. As part of the management of the archaeological site of the Imperial Fora (1999-

2007) in Rome, he began to address issues related to the collection, management and use of

"public" data related to Cultural Heritage. This activity led him to deal specifically with the field

of Open Data, which he has continued to explore in activities related to the theme of cultural "nomadic" tourism. Since 2011, he has coordinated the research activities of the Territorial Living Lab Prealpe (ENoLL Member), focusing on research and development related to Open Data in Local Governments.
