Privacy for the Personal Data Vault
Tamás Balogh
Thesis to obtain the Master of Science Degree in
Information Systems and Computer Engineering
Supervisors: Prof. Ricardo Jorge Fernandes Chaves
Master Researcher Christian Schaefer
Examination Committee
Chairperson: Prof. Luís Eduardo Teixeira Rodrigues
Supervisor: Prof. Ricardo Jorge Fernandes Chaves
Member of the Committee: Prof. Nuno Miguel Carvalho dos Santos
July 2014
Acknowledgments
First of all, I would like to thank Ericsson for providing me with the opportunity to work on this
interesting research project. Special thanks go to Christian Schaefer for his great support
during the thesis work.
I would like to thank my thesis supervisor, Prof. Ricardo Chaves, for his help and valuable
feedback during the course of this work.
My gratitude also goes to the European Master in Distributed Computing program coordinators,
Prof. Johan Montelius, Prof. Luís Rodrigues and Prof. Luís Veiga, who guided me
throughout my master's program.
Last but not least, I would like to thank my family and friends for supporting me all along.
Abstract
Privacy is an important consideration in how online businesses are conducted today. Personal
user data is becoming a valuable resource that service providers collect and process voraciously.
The user-centric design that forms the basis of the Personal Data Vault (PDV) concept
tries to mitigate this problem by hosting data under strict user supervision. Once the user's data
leaves this supervision, however, the current privacy models offered for the PDV are no longer
sufficient. The goal of this thesis is to investigate different privacy enhancing techniques that can
be employed in scenarios where PDVs are used. We propose three different privacy enhancing
models, all based on the Sticky Policy paradigm (a policy attached to data, describing usage
restrictions). Two of these models are inspired by previous research, while the third one
is our novel approach that turns a simple Distributed Hash Table (DHT) into a privacy enforcing
platform. We perform several evaluations of the proposed models, with different aspects in
mind, such as: feasibility, trust model, and weaknesses.
Keywords
Personal Data Vault, privacy, Sticky Policy, trust, assurance
Resumo
Privacy is an important aspect to consider in the way commercial exchanges are carried out
today. Personal data is becoming a valuable resource that service providers collect and process
copiously. A user-centred design, the basis of the "Personal Data Vault (PDV)" concept,
tries to mitigate this problem by hosting this personal data under the strict supervision of the user.
However, as soon as the user ceases to exercise this supervision, the privacy model currently
offered by the PDV is no longer sufficient. The objective of this dissertation is to investigate
different techniques for reinforcing this privacy that can be applied in situations where PDVs are
used. Three privacy-enhancing models are then proposed, all based on the "Sticky Policy"
paradigm (policies attached to data, describing the restrictions on their use). While two of these
models are inspired by the existing state of the art, the third constitutes a new approach that
transforms a simple Distributed Hash Table (DHT) into a privacy-enhancing platform. Several
evaluations of the proposed models were carried out, with different aspects in mind, such as:
feasibility, trust, and weaknesses.
Palavras Chave
Personal Data Vault, privacy, Sticky Policy, trust, assurance
Contents
1 Introduction
1.1 Motivation
1.2 Problem Statement
1.3 System requirements
1.4 Contributions
1.5 Thesis Scope
1.6 Dissertation Outline
2 Background
2.1 The Personal Data Vault
2.1.1 PDV as an Abstraction
2.1.2 PDVs in the Healthcare System
2.2 Personal privacy concerns
2.3 Summary
3 Related Work
3.1 XACML
3.2 Usage Control
3.2.1 UCON in practice
3.3 TAS3
3.4 PrimeLife
3.5 Other Privacy Enforcement Techniques
3.5.1 DRM approach
3.5.2 Trusted platform
3.5.3 Cryptographic techniques
3.6 Summary
4 System Design
4.1 PrimeLife Policy Language (PPL) Integration
4.2 Verifiable Privacy
4.2.1 Description
4.2.2 Prerequisites
4.2.3 Architecture
4.2.4 Privacy Manager Architecture
4.2.4.A Verifier
4.2.4.B Monitor
4.2.5 Interaction Models
4.2.5.A Data Flow
4.2.5.B Forwarding Chain
4.3 Trusted Privacy
4.3.1 Description
4.3.2 Prerequisites
4.3.3 Architecture
4.3.4 Privacy Manager Architecture
4.3.4.A Trust Negotiator
4.3.4.B Monitor
4.3.5 Interaction Models
4.3.5.A Data Flow
4.3.5.B Forwarding Chain
4.4 Mediated Privacy
4.4.1 Description
4.4.2 Prerequisites
4.4.3 Architecture
4.4.4 DHT Peer Layer
4.4.4.A The Remote Retrieval Operation
4.4.4.B Membership
4.4.4.C Keyspace Assignment
4.4.4.D Business Ring Size
4.4.4.E Business Ring Description
4.4.5 Privacy Manager Layer
4.4.5.A Sticky Policy Enforcement
4.4.5.B Trust Management
4.4.6 Logging Layer
4.4.7 Interaction Models
4.4.7.A Data Flow
4.4.7.B Multiple Data Subject (DS) Interaction Model
4.4.7.C Multiple Data Controller (DC) Interaction Model
4.4.7.D Log Flow
4.4.7.E Indirect data
4.4.8 Prototype Implementation Details
4.5 Summary
5 Evaluation and Discussion
5.1 Comparison on Requirements
5.1.1 Establishing Trust
5.1.2 Transparent User Data Handling
5.1.3 Data Across Multiple Control Domains
5.1.4 Maintaining Control
5.1.4.A Direct Data
5.1.4.B Indirect Data
5.1.4.C Sticky Policy
5.2 Comparison on Feasibility
5.3 Comparison on Trust Models
5.4 Comparison on Vulnerabilities and Weaknesses
5.4.1 Weaknesses of the Sticky Policy
5.4.2 Malicious Data Controller (DC)
5.4.3 Platform Vulnerabilities
5.5 Discussion
5.6 Summary
6 Conclusion
6.1 Summary
6.2 Future work
5 Evaluation and Discussion 61
5.1 Comparison on Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.1.1 Establishing Trust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.1.2 Transparent User Data Handling . . . . . . . . . . . . . . . . . . . . . . . . 63
5.1.3 Data Across Multiple Control Domains . . . . . . . . . . . . . . . . . . . . . 65
5.1.4 Maintaining Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.1.4.A Direct Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.1.4.B Indirect Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.1.4.C Sticky Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.2 Comparison on Feasibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.3 Comparison on Trust Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.4 Comparison on Vulnerabilities and Weaknesses . . . . . . . . . . . . . . . . . . . 72
5.4.1 Weaknesses of the Sticky Policy . . . . . . . . . . . . . . . . . . . . . . . . 72
5.4.2 Malicious Data Controller (DC) . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.4.3 Platform Vulnerabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6 Conclusion 77
6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
ix
Contents
x
List of Figures
2.1 Personal Data Vault Abstraction
2.2 Personal Data Vault in the Healthcare System
3.1 Overview of XACML Dataflow
3.2 Collaboration Scenario
4.1 Verifiable Privacy: Abstract Architecture of a single Policy Enforcement Point (PEP) node
4.2 Verifiable Privacy: Interaction diagram between a PDV and a Service Provider (SP)
4.3 Verifiable Privacy: Example of Forwarding Chain on Personal Health Record
4.4 Trusted Privacy: Abstract Architecture of a single PEP node
4.5 Trusted Privacy: Interaction Model of the Data Flow
4.6 Mediated Privacy: Architecture of a DHT node
4.7 Mediated Privacy: Business Ring formed around a healthcare scenario
4.8 Mediated Privacy: Privacy as a Service (PaaS) design for the Hospital Service Business Ring node
4.9 Mediated Privacy: DC - DS interaction model
4.10 Mediated Privacy: Key Dissemination
4.11 Mediated Privacy: Logging
4.12 Mediated Privacy: Indirect Data
List of Tables
5.1 Requirements Comparison Table
5.2 Detailed Comparison on Maintaining Control
List of Acronyms
BFS Breadth First Search
DC Data Controller
DFS Depth First Search
DHPol Data Handling Policy
DHPref Data Handling Preference
DHT Distributed Hash Table
DRM Digital Rights Management
DS Data Subject
noSQL Not Only SQL
MP Mediated Privacy
PaaS Privacy as a Service
PD Protected Data
PDP Policy Decision Point
PDV Personal Data Vault
PEP Policy Enforcement Point
PHR Personal Health Record
PM Privacy Manager
PPL PrimeLife Policy Language
RDBS Relational Database System
RDF Resource Description Framework
SQL Structured Query Language
TCG Trusted Computing Group
TP Trusted Privacy
TPM Trusted Platform Module
TTP Trusted Third Party
UCON Usage Control
UI User Interface
VP Verifiable Privacy
XACML eXtensible Access Control Markup Language
1 Introduction
Contents
1.1 Motivation
1.2 Problem Statement
1.3 System requirements
1.4 Contributions
1.5 Thesis Scope
1.6 Dissertation Outline
The majority of interactions on today's internet are driven by personal user data. These pieces
of information come in different shapes and forms, some being more valuable than others. For
example, banking details might be considered more valuable than a person's favourite playlist.
What all of these data pieces have in common is that they all belong to some specific user. This
property, however, is not reflected in how data is hosted and organized over the web, since the
hosting entities of personal user data consist of multiple service providers. Data belonging to a
single user is fragmented and kept independently under different control domains based on the
context. For example, data related to somebody's social life might be stored with a social network
provider, while the same person's favourite playlist might be hosted by his music service
provider. Different initiatives exist to unify this scattered data. The Personal Data Vault (PDV)
can be considered one such proposed solution.
The PDV is a user-centric vision of how personal digital data should be hosted. Rather than
having bits of information scattered around multiple sites, the PDV tries to capture these under
a single control domain. Every user is associated with his own PDV where he hosts his personal
data. PDVs are not only secure storage systems, but also offer ways to make access control
decisions on hosted data. External entities, such as different service providers, can request user
data at the user’s PDV, in order to provide some functionality beneficial for the owner of the PDV.
By unifying the source of personal user data, we expect to achieve more flexibility and
better control over how data is being disclosed. By employing an access control solution, users
can have assurance that only authorized entities are going to get access to their data. It does
not, however, provide any privacy guarantees with regard to how personal data is being protected
after it leaves the control domain of the PDV.
PrimeLife [5] was a European project that researched technical solutions for privacy guarantees.
Its privacy enhancing model introduces a novel privacy policy language, which empowers
both users and service providers to specify their intentions with regard to data handling.
The privacy policy language, however, lacks the technical enforcement model needed to support
its correct functioning. This enforcement model is required to provide trust and assurance to end
users. A trust relationship needs to be established between remote entities prior to personal
data exchange, while assurance needs to be provided as proof that user intentions have been
respected.
We propose a novel privacy policy enforcement model with an integrated trust and assurance
framework. Our solution utilizes the completely decentralized construct of a Distributed Hash
Table (DHT) to sustain a mediated space between PDVs and service providers. This mediated
space serves as a platform for privacy enhanced data sharing. Pointers to the shared data objects,
which live in the mediated space, are kept by both the owner and the requester. This way data
owners can stay in control over their shared data. A distributed logging mechanism supports our
enforcement model in delivering first hand assurance to end users.
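The mediated-space idea above can be pictured with a toy sketch. The class and keying scheme below are purely illustrative (a real DHT is distributed across many nodes and would combine this with encryption and sticky policies); they only show how both sides keeping a pointer lets the owner revoke a shared object later:

```python
import hashlib

class MediatedSpace:
    """Toy, single-process stand-in for the DHT hosting shared data objects."""
    def __init__(self):
        self._store = {}  # key -> data object

    def put(self, data: bytes) -> str:
        # Key the object by its content hash, as a simple DHT might.
        key = hashlib.sha256(data).hexdigest()
        self._store[key] = data
        return key

    def get(self, key: str):
        return self._store.get(key)

    def delete(self, key: str) -> None:
        # The owner keeps the pointer, so the object can be revoked later.
        self._store.pop(key, None)

# The owner shares a record: both sides hold only the pointer (key),
# while the object itself lives in the mediated space.
space = MediatedSpace()
owner_ptr = space.put(b"blood pressure: 120/80")
requester_ptr = owner_ptr            # handed to the service provider
assert space.get(requester_ptr) == b"blood pressure: 120/80"

# Revocation: the owner deletes the object; the requester's pointer dangles.
space.delete(owner_ptr)
assert space.get(requester_ptr) is None
```

The design choice this illustrates is that the requester never receives a private copy, only a reference into the mediated space, which is what keeps the data owner in control.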
1.1 Motivation
Personal user data is becoming a highly demanded and valuable resource, not just for the
users themselves, but also the service providers. Data analytics are carried out at different sites
in order to bring businesses forward. Sometimes these operations on personal user data are even
carried out without the awareness of the user.
Users are mostly unaware of how the explicit data that they provide, like name, address, phone
number, etc. is handled by service providers, such as social networks or e-commerce systems.
Moreover, users also lack control over the information that they are willing to share. The lack of
control manifests in two ways: users are unable to specify the scope in which their data shall be
used, and sometimes they are also unable to retrieve and remove personal information hosted
on a service providers network. The lack of awareness and control leaves the user defenceless
against privacy violations.
The system in place today, used to avoid the privacy violations described above, is built around
a trust framework. The Privacy Policies offered by service providers are considered to be the
pillars of this trust framework. These Privacy Policies are often presented to the end user in the
form of static texts, describing how personal user information is going to be treated by the data
collector. Nowadays we are used to seeing more diverse privacy options that can be set by the
end user, like the sharing settings for a post on a social networking website.
The main problem we face when looking at this approach to providing data privacy is
that it is highly unbalanced. It amounts to a one-sided privacy system, since the data
collector is the sole entity that decides how personal data is handled, without the involvement of
the user. This leaves clients with a "take it or leave it" offer, which clients are often willing
to take. The result of this compromise is that user data ends up under the full control of the
data collector. Another problem with these Privacy Policies is that they are often lengthy and
ambiguously stated, such that they become hard to decipher for the average user. Moreover, they
offer only static policy settings that might not fit every user's requirements. Their more dynamic
counterpart, the user-settable privacy options, offer a bit more flexibility, but the implementation
of these settings is again fully up to the data collector himself. This in turn means that he can
revoke or modify these privacy options without the consent of his users.
The lack of a system that promotes the user-centric vision with regard to privacy concerns
motivates us to look for possible alternatives to improve how we handle personal data privacy
today.
1.2 Problem Statement
The problem that this thesis focuses on is the one of providing privacy guarantees for a system
where PDVs are widely used. Although the PDV concept allows fine-grained access
control over the user's personal data, it still fails to address the issue of how remotely stored
data should be protected. It is important to note that once the user chooses to disclose some
personal data he is left vulnerable to privacy violations. User privacy can become compromised
through unawareness and lack of control.
1.3 System requirements
In order to provide a higher degree of awareness and control to the end user, the underlying
technology needs to provide a higher level of trust and assurance. The user-centric design of the
PDV system, although it offers a comprehensive picture of how data should be organized, leaves
many specifications open regarding the privacy requirements. The following list details the major
requirements set by this thesis; it forms the foundation of a trust framework that in turn
focuses on achieving a user-centric model:
1. Establishing trust between actors, like service providers and data owners. Trustworthiness
refers to the degree to which an actor can be trusted to carry out the actions that
he is entrusted with. The user needs some mechanism to determine whether
a service provider is going to treat his data according to a pre-agreed set of rules. Pre-agreed
rules, or data handling rules, should be agreed upon by both parties, and they
should adhere to the correct handling of personal data.
2. Transparent user data handling should be a priority for every service provider. Users need
to get assurance that their preferences on how to handle their data are carried out by the
actors. Assurances are a form of trustworthy reports that describe the business process
that has been carried out over the user’s data. Continuous assurance will turn into a higher
degree of trust that users can develop over time.
3. Data protection across multiple control domains is needed in order to facilitate the safe
interoperability of multiple service providers. Delegation of rights to forward user data is a
common use case, therefore there should be a clear model that describes how delegations
take place, and how the data protection rules apply to the third party who receives the data.
4. Maintaining control over distributed data promotes user centrality. In the user-centric
model the owner of the personal data is considered to be the user, even when he
chooses to share it with other parties. He must retain the right to exercise
operations on his personal data, such as: modifications, revocation of rights, deletion, etc.
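The data handling rules named in requirements 1 and 3 can be pictured as a machine-readable policy that travels with the data, the Sticky Policy paradigm the later chapters build on. The sketch below is illustrative only; the class names and policy fields are assumptions for this example, not part of any proposed model:

```python
from dataclasses import dataclass, field

@dataclass
class StickyPolicy:
    """Usage restrictions that travel with the data (illustrative fields)."""
    allowed_purposes: set = field(default_factory=set)
    may_forward: bool = False
    retention_days: int = 30

@dataclass
class ProtectedData:
    payload: str
    policy: StickyPolicy

def can_use(item: ProtectedData, purpose: str) -> bool:
    # A compliant Data Controller consults the attached policy
    # before every operation on the data.
    return purpose in item.policy.allowed_purposes

record = ProtectedData(
    payload="name=Alice; address=Main St. 1",
    policy=StickyPolicy(allowed_purposes={"billing"}, may_forward=False),
)
assert can_use(record, "billing")
assert not can_use(record, "marketing")
```

Because the policy is attached to the data object itself, the same restrictions can in principle follow the data across control domains when it is forwarded to a third party.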
1.4 Contributions
The goal of this thesis work is to research the existing privacy enhancing techniques that could
be employed in a PDV oriented system. The first contribution of this work is to investigate whether
the privacy policy language proposed by the PrimeLife [5] project fits the highly distributed PDV
system.
The second contribution is to categorize several different privacy enforcing models for the
considered problem. These models are used to guarantee the correct functioning of privacy
policies established in the first contribution, by covering some of the existing privacy enforcing
techniques proposed by related research. While formulating these alternatives, we propose a
novel privacy enforcement model, which relies on the concept of a mediated space where shared
objects live.
The third contribution is an evaluation of the privacy enhancing models
herein formulated. This evaluation takes into account different tradeoff criteria, namely the initially
proposed requirements, feasibility, trust source, and vulnerabilities. By doing this, we evaluate the
strengths and weaknesses of our proposed models.
The final contribution is the development of a prototype implementation based on our novel
enforcement model, to show that the proposed concept can be carried out within the scope of
currently existing technology.
1.5 Thesis Scope
The design and evaluation of different privacy enforcement models used together with PDVs
bears complexities beyond the scope of this thesis project. First of all, we refrain from talking
about the detailed design and architecture of a PDV. Furthermore, we also do not consider every
security aspect related to the PDV concept. Instead, we use PDVs as abstract building blocks
clearly defined in Section 2.1.
The definition and design of the proposed privacy enforcing models in this thesis are also not
subject to a complete security evaluation, as we are more concerned with the privacy aspects.
Assumptions on the existence of secure channels and storage systems are made throughout this
thesis. Moreover, we also assume a well defined identity framework which guarantees the identity
provisioning and verification of every actor in the system.
Providing privacy guarantees is also a vast research field on its own. This thesis is focused on
enforcement techniques for privacy policy languages, such as the one outlined in the PrimeLife
project. In order to define a clear goal for the thesis, the scope of the work regarding the design
of the enforcement models is narrowed down to a set of requirements outlined in Section 1.3.
Requirements are targeting aspects, such as: trust establishment, data handling transparency,
data across multiple control domains, and maintaining control. These requirements also serve as
a basis for evaluation. We refrain from talking about any quantitative performance measurements
in our evaluations, since the thesis is carried out on a conceptual level.
1.6 Dissertation Outline
The upcoming chapters are organized as follows. Chapter 2 focuses on the description of
the background concepts used in this thesis, containing the research involving PDVs and a short
study on privacy concerns. Chapter 3 presents relevant projects involving research in privacy
enforcement techniques. Chapter 4 presents the three privacy enforcement models herein pro-
posed, highlighting the novelty of the proposed solution, called Mediated Privacy (MP). Chapter 5
contains the evaluation of the models proposed in Chapter 4, based on our requirement set, and
other metrics, such as feasibility and trust source. Chapter 6 concludes the thesis with a summary
of the conducted work and suggestions with regard to future work.
2 Background
Contents
2.1 The Personal Data Vault
2.2 Personal privacy concerns
2.3 Summary
In Chapter 2 the relevant background material used to carry out this thesis is presented. The
first section is focused on detailing the concept of a Personal Data Vault (PDV). The second
section focuses on the description of privacy concerns, namely: awareness, control and trustwor-
thiness.
2.1 The Personal Data Vault
The interactions that people are having over the Internet contain a significant percentage of
personal user data. Users are asked to provide personal information in exchange for access to
some advertised online service. For example, a person might use a social media site to stay
connected with friends and share information about himself, such as name, address, likes, and
dislikes. This person might also be part of other social community sites, such as a virtual bookclub
or a career portal, where he has to share similar personal information again. Following this
model, the data that belongs to a single user will end up at multiple hosting sites.
This model, although it suits the needs and desires of the service providers, leaves the users in a
difficult position when they want to interact with remotely hosted data. It is becoming increasingly
difficult for users to collect their data from multiple control sites in order to provide interoperability.
One downside of this is the phenomenon called lock-in: it is hard for users to
migrate between the services that they use, because the data that they previously shared with
a service provider is locked in under its control domain. Another concern is data fragmentation,
which lets data exist in inconsistent states. A user can have his address hosted by different
services, but under different formatting, which in turn may lead to confusion when interoperability
needs to be provided. The root of all of these concerns is that the user lacks an appropriate
fine-grained control mechanism over his own data.
In order to provide a solution for easy interoperability and fine-grained control, the Personal
Data Vault proposes a user-centric design that tries to unify personal data under a single control
domain.
“Built with security and legal protections that are better than most banks, your vault
lets you store and organize all the information that powers your life. Whether using a
computer, tablet or smartphone, your data is always with you.” [19]
The Personal Data Vault also appears under various other terminologies, like “Personal Data
Store” or “Personal Data Locker”. The attempts to formalize the concept of a PDV are comple-
mentary in the sense that they all try to focus on providing a better control over personal data for
the end user. However, a clear formalization of the term is still missing, since projects are built
with different aims in mind. Some of them conceptualize a raw storage service with the only pur-
pose to host data securely, while others focus on providing software solutions to manage already
existing storage spaces or even link different user accounts.
There have also been efforts to categorize different approaches that research projects take in
order to formalize what a PDV is actually like [29]. These fall into three main categories:
1. Deployment of these unified user data stores can be facilitated by a centralized cloud-
based service, which in turn grants the user full control over the hosted data. On the other
hand, this requires a high level of trust in the hosting entity. Alternatively, deployment can
also be split between multiple trusted hosting providers, or even kept under end user’s local
machines.
2. Federation is also an important consideration that focuses on interoperability between mul-
tiple different storage providers and individuals. It tries to outline different interaction models
that facilitate the collaboration between different deployments.
3. Client-Side solutions target individuals who use their own devices as data hosts, together
with a social peer-to-peer network. Without the need for a centralized entity to govern
data movement, this approach focuses on more ad-hoc solutions.
There is also a substantial difference in how these projects envision the data model and internal
storage system used for hosting personal user data. While some are leaning towards
using Relational Database Systems (RDBS), others are looking into solutions such as Not Only
SQL (NoSQL) databases and semantic Resource Description Framework (RDF) stores.
Since security is a central concern of all of these solutions, they mostly come with an additional
data access layer on top of the storage system. This access layer facilitates the interoperability
between different entities in a secure manner. Fine-grained control can be achieved through
access control mechanisms that rely on predefined policies. These policies can either
be confirmed by the end user or constructed on the fly.
Another key aspect of these projects is the interoperability of different entities [11]. PDVs
should integrate seamlessly with other entities and facilitate the secure sharing of data across
different control domains. The security of these operations can be guaranteed by providing en-
crypted channels between entities. These interactions can be of multiple types depending on the
acting sides. Person-to-person connections are trying to connect individuals: independent entities
that serve as representative hosts for a person. Person-to-community solutions try to formulate
groups of persons depending on some social context. Person-to-business connections describe
how individuals interact with different service providers. In order to achieve these features,
interoperability needs to be provided that overcomes the differences in the underlying
data models with the aid of standardized APIs and protocols.
2.1.1 PDV as an Abstraction
For the purpose of this thesis work, the PDV is treated as an abstraction of a data layer
together with a manager layer. We consider these to be entities made out of a single or multiple
machines with high availability. Moreover, we consider them resilient in the face of failures and
secure in the face of vulnerabilities and exploits that may be used directly by a potential attacker.
Herein, we disregard these security aspects, and focus on the privacy concerns that appear in the
interoperability scenarios.
Figure 2.1: Personal Data Vault Abstraction
Figure 2.1 depicts the high-level abstraction of a single PDV entity. The data layer at the bottom
of the abstraction represents the collection of hosting machines that facilitate secure storage of
personal information. These machines can either be under the direct control of the data owner, or
they can be multiple interconnected machines residing on fully trusted external entities. Again, the
purpose of this project is not to investigate safe data storage for the PDV, but rather to focus on
what happens to data once it leaves the PDV.
The manager layer above the data layer acts as a guard for the personal data. It guarantees
that only authenticated and authorized requesters are able to get access to data. The rules
describing the access control policies in place are under the full control of the PDV owner. In
addition, it offers an external interface that facilitates interoperability with other PDVs and with
external service provider entities.
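The manager layer described above can be sketched as a thin guard in front of the data layer. The following is a minimal, hypothetical Python model (all class and method names, as well as the credential check, are our own illustrations, not part of any existing PDV implementation): it authenticates a requester, consults owner-defined access rules, and only then serves data.

```python
# Minimal sketch of the PDV abstraction: a manager layer guarding a data layer.
# All names are illustrative; a real PDV would use certificates or tokens
# instead of the placeholder credential check shown here.

class DataLayer:
    """Stands in for the (trusted) storage machines of the PDV."""
    def __init__(self):
        self._records = {}

    def put(self, key, value):
        self._records[key] = value

    def get(self, key):
        return self._records[key]

class ManagerLayer:
    """Guards the data layer: only authenticated, authorized requesters pass."""
    def __init__(self, data_layer, owner_rules):
        self._data = data_layer
        # owner_rules: {(requester, key): allowed?} -- fully controlled by the owner
        self._rules = owner_rules
        self._sessions = set()

    def authenticate(self, requester, credential):
        # Placeholder check, for illustration only.
        if credential == f"secret-{requester}":
            self._sessions.add(requester)

    def request(self, requester, key):
        if requester not in self._sessions:
            raise PermissionError("not authenticated")
        if not self._rules.get((requester, key), False):
            raise PermissionError("not authorized by owner policy")
        return self._data.get(key)

# Usage: Bob's PDV hosts his PHR; only the authorized hospital may read it.
data = DataLayer()
data.put("phr", "allergy: antibiotics")
manager = ManagerLayer(data, {("home-hospital", "phr"): True})
manager.authenticate("home-hospital", "secret-home-hospital")
print(manager.request("home-hospital", "phr"))  # -> allergy: antibiotics
```

Note that the owner's rules live entirely inside the manager layer, mirroring the requirement that access control policies stay under the PDV owner's full control.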
2.1.2 PDVs in the Healthcare System
Several research projects involving privacy enhancement [14][16][18] focus on the
healthcare system as their main use case. The benefits of a safe and reliable information system
interconnecting healthcare centers clearly outweigh those in other domains, because of its
potential to save human lives.
The information systems of healthcare centers operate on Personal Health Records. A Per-
sonal Health Record (PHR) is a collection of relevant medical records belonging to a single patient,
containing information such as chronic diseases, check-ups, allergies, etc. PHRs are usually
hosted by the healthcare center in which a patient was examined. This design requires PHRs to
be shared among different health centers in cases of patient migration or emergency situations.
This can become cumbersome, since it requires interoperability of multiple independent services.
Figure 2.2: Personal Data Vault in the Healthcare System
The user-centric design focusing on data unification fits the presented healthcare scenario
operating on PHRs. Instead of healthcare centers hosting PHRs, they could be kept directly in
a PDV, under the direct control of the owner of the PHR. Figure 2.2 illustrates how a PDV can
become beneficial in an emergency scenario. Imagine Bob, owner of PDV-Bob, is using the Home
Hospital Service for his regular check-ups and treatments. During check-ups and treatments
the Home Hospital Service extends Bob's PHR with relevant information, such as his allergy
to antibiotics. His PHR is regularly updated in his PDV. Imagine Bob going on vacation in a
foreign country, and suffering an accident where he loses consciousness. As Bob is taken into
the foreign hospital, the doctors determine that he needs antibiotics in order to prevent infections.
Instead of a rushed procedure, the doctor could first discover the patient’s identity from his ID
card, then consult his PHR, from PDV-Bob, through the Foreign Hospital Service. Assuming that
the hospital's staff are authorized to access Bob's PHR, the foreign doctor can discover his allergy
and administer an alternative solution, potentially saving Bob’s life. His treatment in the Foreign
Hospital can be appended to his PHR and followed by the Home Hospital, once Bob returns from
his vacation.
2.2 Personal privacy concerns
The maintenance of personal privacy is becoming an increasingly important concern in how
businesses are conducted over the internet today. The safeguarding of personal privacy rights
relies on a tangled framework that incorporates legal regulations and business policies.
Business policies are required to be built on top of existing regulations that are in place at the
location where the said business is conducted.
For example, the Data Protection Directive formulated by the European Union [12] is one such
legal regulation that provides a set of guidelines on how personal user privacy has to be protected
in the virtual space. In the literature [12][26][5] we can highlight two important terms in use:
the Data Subject (DS) and the Data Controller (DC). The Data Subject is an individual
who is the subject of personal data. This may commonly be associated with the average user or
client that is sharing some personal data. The Personal Data Vault (PDV) being an entity under
the control of its owner can also be considered a DS. The Data Controller is an entity, or a
collection of entities, who is in charge of deciding how collected personal data from the DS is
used and processed. Most of these regulations are targeting the interaction between the DS and
DC to assure that personal data is only collected and processed with the consent of the DS.
The Data Protection Directive has been around since 1995; however, due to the changes in
IT technology and best practices since then, the directive is becoming obsolete. It fails to take
into account concerns surrounding technologies such as cloud-based services or social networks.
A new directive has been proposed [13] in order to face these challenges, since business policies
are starting to become increasingly divergent from initially established regulations.
This new regulation tries to clarify and improve privacy regulations. However, the implementation
of new reforms is always time-consuming, and with quickly changing technology there
is no guarantee that these new regulations will not become obsolete once again. There is also a
great difficulty in formalizing how these regulations protect personal data across different political
zones where other regulations are in place. Business policies associated with service providers
are, in most cases, global, since their services are available regardless of physical location.
Privacy regulations, on the other hand, are locally applicable laws that change across borders.
The difficulty lies in integrating different local regulations, since they are sometimes incompatible.
The privacy concerns formulated by this and other data protection directives can be catego-
rized under three important aspects [17], namely: awareness, control, and trustworthiness.
Awareness:
The first concern related to privacy is awareness. DSs have to be aware of how the data that
they share is going to be handled by the DC. Handling of data should be in accordance with the
purpose of usage and policies agreed upon by the DS. Policies describing user data handling are
usually provided by DCs and they include information like: processing policies, modification and
forwarding of personal data.
These policies alone, however, only offer DSs a limited amount of awareness of how their
explicitly shared data is processed. More alarmingly, implicit data about user behaviour on the
internet, such as search keywords, visited pages and clickstreams, is also collected and
processed without the user's consent. Service providers, such as social networking websites and
e-commerce systems, are notorious for collecting their users' personal information and, through
different analytical and profiling techniques, using it for purposes such as targeted advertising.
Moreover, personal records may also be disclosed to
third parties, such as governments, without the user being aware.
Unawareness of how these pieces of personal information are used surrounds many interactions
over the web. In some cases, users can end up unknowingly giving consent to information sharing
because of deceitful user interfaces, or simple carelessness. When seeking comfort in the per-
sonal privacy policies provided by DCs, users can also be left confused because of the complexity
and the abstractness of these statements. Misuse of personal data can lead to problems such
as decontextualization: explicitly shared personal information can get processed and reposted
under a context or purpose different from that for which it was initially shared. This may lead to
confusion and loss of personal privacy.
Control:
Control is the second aspect that surrounds privacy concerns. The policies governing personal
data handling should be created in accordance with the user's preferences. Many service providers
offer a set of privacy options which can give liberty to the user to formulate different privacy
profiles. These options, however, lack the fine-grained control which the users need to have over
their shared data. Policies should be flexible enough to let the users formulate how their data can
be processed or even disclosed to third parties. There is also a need for being able to modify or
even revoke previously given consents. Users should be able to retrieve their personal data at
will.
There is also another category of personal data, called indirect data, which completely lacks
means of control by the DS. Indirect data is data that is not explicitly shared by a DS, but is
still connected to his identity. For example, pictures that other people share of you
over social networking sites can be considered as indirect data. Frequently, systems offer little
to no control over data objects which are not shared explicitly by a user, but are still tied to his
identity. This in turn can lead to disclosure of personal data without the consent of the original
data owner.
Another concern surrounds the way in which service providers physically host personal data.
In order to offer features such as high availability and fault tolerance, systems often keep replicas
and backup copies of data objects, sometimes across different control domains. This leads to
difficulties when a user decides to discontinue the use of a service, and requests the service
provider to delete all previously shared data. In many cases, service providers retain
backup copies for an indefinite amount of time, even after the deletion request has supposedly
been fulfilled.
Trustworthiness:
The mechanisms that provide awareness and control are complemented by trust. Trust is given
to DCs by DSs if they follow regulations and respect privacy policies. The existing privacy
regulations should serve as the baseline of trust. However, as shown before, these can often lead to
confusion whenever contradictory regulations are encountered.
Data Controllers are also trusted to have a secure system, resilient to vulnerabilities and outside
attackers, such that personal data cannot be directly stolen. Failure to implement secure software
solutions may lead to disastrous personal privacy violations in the face of data theft. Unfortunately,
the technical means currently in use provide little to no assurance of how privacy compliant these
systems are. Providing a highly trusted service should be a priority of every service
provider, since the lack of trust discourages new clients from using the advertised services, which
in turn is bad for business.
Trust also applies to all entities that get access to users' personal data. For example, in the case
of a social networking site, the service provider is trusted to offer a secure and privacy-respecting
service, but friends who have direct access to a person's information are also trusted not to use it
without that person's consent.
2.3 Summary
The background of this thesis work involves privacy concerns in Personal Data Vaults (PDV). A
PDV is an entity associated with a person or a business, providing safe storage and secure access
to personal data. For the purpose of this thesis, we are using PDVs as abstract building blocks
which serve as the main sources of personal user data. The applicability of PDVs is demonstrated
through the example presented in the domain of the healthcare system, using Personal Health
Records.
Privacy concerns, such as awareness, control and trustworthiness, surround online
interactions. The Data Subject (DS) and Data Controller (DC) are two terms commonly used
to denote the user, whose data is being collected, and the service provider, who collects the
data. PDVs are generally seen as DSs, while the DC role is mostly assumed by external service
providers. Existing local regulations on privacy protection and business privacy policies are not
enough to prevent privacy violations, as shown by the examples of unawareness, lack of control,
and untrusted services.
3 Related Work
Contents
3.1 XACML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Usage Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 TAS3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.4 PrimeLife . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.5 Other Privacy Enforcement Techniques . . . . . . . . . . . . . . . . . . . . . . 23
3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Chapter 3 focuses on existing related work in the domain of privacy enforcement. The chapter
begins with a short introduction to the XACML policy framework in Section 3.1. Afterwards, it
presents some of the relevant research projects involving privacy enforcement, highlighting the
PrimeLife project in Section 3.4.
3.1 XACML
The eXtensible Access Control Markup Language (XACML) is an XML based policy language
standardized by OASIS [4]. The language itself provides a set of well-defined policy building
blocks that facilitate the definition of complex access control scenarios. It supports multiple
policies on a single resource, which are combined in order to provide a single access decision.
The language is attribute based, meaning that actors and resources can be flexibly described by
a set of attributes. Version 3.0 of XACML also supports obligations for extended access control.
Obligations are specific actions that have to be taken on a predefined trigger, usually after the
access decision has been carried out. Its highly extensible design has made it popular among
existing policy language frameworks.
Figure 3.1: Overview of XACML Dataflow
Apart from the language itself, the standard also offers a high-level architecture that describes how
the policy language can be used to build an access control engine. The dataflow of this
architecture can be seen in Figure 3.1. Incoming access requests are routed through a Policy
Enforcement Point (PEP), depicted in point (2) of Figure 3.1, which offers a well-defined
communication interface with the rest of the architecture (3). The Context Handler dispatches the request
to a Policy Decision Point (PDP) (4), which is responsible for returning an access decision. The
PDP combines the relevant policies stored in the Policy Administration Point (1) with the required
attributes (5). The required attributes (specific information on either a Subject, a Resource or the
Environment) are collected by the Policy Information Point (6)(7)(8)(9) and supplied to the PDP
(10). After the PDP successfully combines the relevant policies, it returns an access decision to
the requester via the Context Handler (11)(12). Additional restrictions that might apply in the form
of obligations have to be carried out by the PEP with the help of an Obligation Service (13).
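The dataflow above can be illustrated with a toy Python model of the four XACML roles. This is a rough sketch of the architecture only, not an implementation of the OASIS standard: the class names mirror the XACML components, but the policy representation and the deny-overrides combining rule shown here are simplified assumptions.

```python
# Toy model of the XACML dataflow in Figure 3.1: PEP receives the request,
# the PDP combines policies from the PAP with attributes from the PIP, and
# the PEP carries out obligations. Simplified sketch, not the OASIS standard.

class PAP:  # Policy Administration Point: stores the policies
    def __init__(self, policies):
        self.policies = policies  # callables: attributes -> "Permit"/"Deny"/None

class PIP:  # Policy Information Point: supplies subject/resource/environment attributes
    def __init__(self, attributes):
        self.attributes = attributes

    def collect(self, request):
        merged = dict(self.attributes)
        merged.update(request)  # request attributes override environment defaults
        return merged

class PDP:  # Policy Decision Point: combines applicable policies into one decision
    def __init__(self, pap, pip):
        self.pap, self.pip = pap, pip

    def decide(self, request):
        attrs = self.pip.collect(request)
        # deny-overrides combining: any Deny wins, else Permit if some policy permits
        results = [policy(attrs) for policy in self.pap.policies]
        if "Deny" in results:
            return "Deny"
        return "Permit" if "Permit" in results else "Deny"

class PEP:  # Policy Enforcement Point: entry point, also carries out obligations
    def __init__(self, pdp, obligation_service=None):
        self.pdp = pdp
        self.obligation_service = obligation_service

    def handle(self, request):
        decision = self.pdp.decide(request)
        if decision == "Permit" and self.obligation_service:
            self.obligation_service(request)  # obligation: log the granted access
        return decision

audit_log = []
policies = [lambda a: "Permit" if a.get("role") == "doctor" else None,
            lambda a: "Deny" if a.get("time") == "night" else None]
pep = PEP(PDP(PAP(policies), PIP({"time": "day"})), audit_log.append)
print(pep.handle({"role": "doctor", "resource": "phr"}))  # -> Permit
```

The logging obligation fires only after a Permit decision, matching the observation above that obligations are usually carried out by the PEP after the access decision.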
3.2 Usage Control
There have been many approaches over the years to achieve the safeguarding of valuable
digital objects. Traditional access control solutions offer a way to grant access to protected digital
objects only for authorized entities. These solutions, however, often require a set of predefined
entities in a closed system, such as a company. Trust management offers ways to employ access
control on unknown entities over larger domains.
Digital Rights Management (DRM) solutions are client-side systems that offer the protection
of disseminated digital objects. Each of these mechanisms focuses on different digital object
protection solutions depending on context and requirements.
The Usage Control (UCON) [27][25] research tries to formalize a more extensive solution that
offers digital object protection by combining traditional access control, trust management and
DRM with two novel approaches for data protection. UCON tries to capture the whole
lifecycle of a data object, even after it goes beyond authorization. By focusing on the whole
lifecycle, UCON provides the privacy features that previous systems with digital object protection
lack. The two proposed concepts that allow UCON to provide a more extensive control mechanism
over its predecessors are the mutability of attributes and the continuity of access decision.
UCON is envisioned to follow attribute-based access control, which requires data requesters
to possess a set of attributes that make them eligible for authorization. Attributes are used to
formulate the rights that a given subject has on a given object. Up to this point, this can be realized
through the use of a traditional access control system. The mutability of attributes refers to the
dynamic nature of the attributes, which can be subject to change. Based on these dynamic
changes the authorization rules also have to adapt and be re-evaluated to provide a potentially
new access decision.
Continuity of access decision means that UCON tries to enforce certain security policies not
only during authorization, but also while the object is being used, and after usage, thus covering
the object's whole lifecycle. It carries this out through policies that can take the form of:
Authorizations: a set of required attributes that have to be provided and verified during the
pre-authorization phase. This can include certain identity checks of the requesting party.
Conditions: seen as attributes that describe environmental aspects that can affect the access
decision. For example, an object can only be accessible during a given timeframe of the day. Such
conditions have to be evaluated in the pre-authorization phase and during the ongoing usage of
the object. (Figure 3.1 source: http://ptgmedia.pearsoncmg.com/images/ch7 9780131463073/elementLinks/07fig09.jpg)
Obligations: predefined rules that safeguard a protected object after the authorization phase
has granted access to it. Obligations can be activated in any phase during or after the access
decision, and provide privacy enhancing features.
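The two novel UCON concepts, mutability of attributes and continuity of access decision, can be sketched as follows. This is a hypothetical illustration (the class, field names and the revocation obligation are our own, not from any UCON implementation): access is granted in a pre-authorization phase, and when an attribute later changes, the decision is re-evaluated and obligations fire on revocation.

```python
# Sketch of UCON's continuity of access decision with mutable attributes.
# Hypothetical model for illustration, not code from a UCON implementation.

class UconSession:
    def __init__(self, authorizations, conditions, obligations):
        self.authorizations = authorizations  # checks on subject attributes
        self.conditions = conditions          # checks on environment attributes
        self.obligations = obligations        # actions triggered around usage
        self.active = False

    def pre_authorize(self, subject, environment):
        # Pre-authorization phase: authorizations and conditions must all hold.
        if (all(check(subject) for check in self.authorizations)
                and all(check(environment) for check in self.conditions)):
            self.active = True
        return self.active

    def reevaluate(self, subject, environment):
        # Mutability of attributes: on any change the decision is re-evaluated,
        # and ongoing usage is revoked if the policies no longer hold.
        if self.active and not (all(c(subject) for c in self.authorizations)
                                and all(c(environment) for c in self.conditions)):
            self.active = False
            for obligation in self.obligations:
                obligation()  # e.g. delete the local copy, notify the owner
        return self.active

events = []
session = UconSession(
    authorizations=[lambda s: s["role"] == "doctor"],
    conditions=[lambda e: 8 <= e["hour"] < 18],   # usable only during the day
    obligations=[lambda: events.append("revoked")])

session.pre_authorize({"role": "doctor"}, {"hour": 10})  # access granted
session.reevaluate({"role": "doctor"}, {"hour": 19})     # condition fails, usage revoked
print(events)  # -> ['revoked']
```

The second call shows the key difference from traditional access control: the decision is withdrawn during ongoing usage, not only refused at authorization time.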
3.2.1 UCON in practice
Although there have been many proposed approaches to implement UCON [27][25][7], it is
generally considered a hard problem, given its complex and demanding set of requirements.
In general, UCON tries to realize a data protection framework that relies on the use of certain
enforcement points. These enforcement points can either be present on the server side, providing
a more traditional central approach; or on the client-side, which resembles a DRM system that
is controlling the secure dissemination of digital objects. Hybrid approaches have also been
proposed [27] that try to formalize a symmetric system where both client and server side are
becoming enforcement points. Another proposed solution in [7] is to harness the power of the
quickly growing cloud industry. It proposes the implementation of the UCON framework by shifting
the enforcement point into the cloud. A Software as a Service solution could provide safeguarding
of user data through the policies and mechanisms described by the UCON research.
Another subset of projects focuses on the security aspects of the enforcement points. In order
to guarantee that these nodes are in fact safeguarding digital objects by enforcing policies, different
technical measures can be taken. To provide assurance, [24] proposes monitoring at
different levels of abstraction. In practice, it focuses on how specialized monitors, such as an OS
monitor, can be used to trigger and carry out events described in obligations. Assurance can be
complemented by establishing trust in enforcement points. The authors propose an implementation
that follows the design suggested by the Trusted Computing Group (TCG), which describes how
Trusted Platform Module (TPM) enhanced hardware can be used to guarantee that a remote
system is tamper-proof.
The features described by the UCON research served as one of the bases for the requirements
set for the models presented in this thesis work. The continuity of access decision captures the
idea of maintaining control over shared personal data objects that are no longer under the direct
control of the user. Moreover, some of the enforcement techniques associated with UCON are
also present in some of our proposed data protection models. We diverge, however, from the
broad focus of UCON to a narrower scope involving privacy, which means that we are more
concerned about what happens to shared data after disclosure than about the whole lifecycle of
a digital object.
3.3 Trusted Architecture for Securely Shared Services (TAS3)
Trusted Architecture for Securely Shared Services [6] was a European research project from
the Seventh Framework Programme (FP7), concluded in 2011, which addressed some of the
security and privacy concerns regarding personal data distribution across data collectors. Its main
focus was to specify and design a security and trust framework that is generic enough to
encompass multiple business domains and provides user-centric data management in a completely
heterogeneous setting.
In order to promote user-centrality, it examines the possibility of a PDV-like design where data
is kept under the direct control of the end users, rather than scattered around data collectors. The
interaction model required to support data sharing in such a model is facilitated by the Vendor Re-
lationship Management (VRM) [21]. VRM describes a reverse Customer Relationship Manage-
ment (CRM) model where service providers are the ones who subscribe to the users’ personal
information store to get access to data.
It also addresses the difference between by-me and about-me data. By-me data counts as
a direct form of personal data that is submitted or shared by the data owner explicitly. As an
example, a personal CV containing the professional background information of a person is by me
data. On the other hand, if this person attaches a transcript of grades from an institute, that can
be considered about-me data, since its issuer and verifier is the institute rather than the individual.
Control over about-me data can be considered much more cumbersome than that of by-me data,
since about-me data is often hosted and controlled by entities other than the subject of the data.
A proposed solution is to keep updated links pointing to about-me data such that the data subject
can place a relevant data handling policy next to it.
Other subprojects within TAS3 examine how changes to the policy framework guarding
personal data can promote user-centrality. Today's unilateral policy system does not meet
the requirements of data privacy, since it empowers the data collector to treat personal
data at will. Traditionally, users are concerned about privacy, while service providers are
concerned about access control over their resources. Instead of treating these two concepts
separately, TAS3 tries to encapsulate them in a single bilateral policy framework in which users
formulate privacy policies and service providers formulate access control policies.
these two policy types, a policy negotiation framework is proposed in [23]. This framework is
responsible for the creation of data protection policies, constraining access to the shared data.
These policies are then signed and distributed in a non-refutable manner in order to assure that a
potential privacy violation can be discovered. Every entity is then responsible for evaluating and
respecting these contractual agreements in the processing and usage of every shared object.
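The signing and non-refutable distribution of negotiated policies can be sketched in a few lines. The sketch below is a simplified stand-in, not the TAS3 mechanism: it uses a keyed HMAC from the Python standard library, whereas non-repudiation in practice requires asymmetric signatures; the key, field names and policy contents are all illustrative.

```python
# Sketch of distributing a negotiated policy in a tamper-evident manner.
# An HMAC is used as a stand-in; real non-repudiation needs asymmetric
# signatures. Key and policy fields are illustrative assumptions.
import hashlib
import hmac
import json

def sign_policy(policy: dict, key: bytes) -> dict:
    # Canonical serialization so both parties sign the same bytes.
    payload = json.dumps(policy, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"policy": policy, "signature": tag}

def verify_policy(signed: dict, key: bytes) -> bool:
    payload = json.dumps(signed["policy"], sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])

# The DS and DC agree on a policy during negotiation, then sign it; any later
# tampering (e.g. silently loosening the forwarding rule) becomes detectable.
shared_key = b"negotiated-session-key"  # illustrative only
agreed = {"data": "cv.pdf", "forwarding": "forbidden", "retention_days": 30}
signed = sign_policy(agreed, shared_key)
assert verify_policy(signed, shared_key)

tampered = dict(signed, policy=dict(agreed, forwarding="allowed"))
assert not verify_policy(tampered, shared_key)
```

The point of the sketch is the detectability property: a policy whose terms are changed after agreement no longer verifies, so a privacy violation against the signed contract can be discovered.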
A large part of the research focus is directed towards designing a federated infrastructure
[14][9] which is generic enough to accommodate many different use cases across heterogeneous
systems. The need for high interoperability between independent organizations is partly met
by providing a privacy enhancing solution that does not rely on a specific policy language.
Constraining access to personal data in highly distributed architectures requires a complex
decision-making process that sometimes relies on multiple independent Policy Enforcement Points (PEP),
which are designed in an application dependent manner. The incompatibility between different
policy frameworks used across different entities raises conflicts when a suitable protection policy
for a shared object has to be formulated. To provide interoperability across organizations a conflict
resolution framework is needed.
Policies and security concepts can have different implementations at different sites. The as-
sumption that all organisations use the same terminology when it comes to data protections does
not hold. In situations when two independent parties need to share data in a secure manner, a
policy negotiation phase has to take place. In order to provide an automated solution it proposes
an ontology based policy matching framework [10] which lets every actor express his security
concerns in his own vocabulary and provides a generic way to map between vocabularies.
Another approach [14] tries to solve conflict resolution by introducing a central component
called the MasterPDP which governs and combines the independent access decisions coming
from the stateless Policy Decision Points (PDP).
A version that offers better scalability is proposed in [9]. Instead of having a central decision
point, it introduces multiple application-independent Policy Enforcement Points (PEP) that serve as
wrappers over every application-dependent PEP and mediate the access decisions between the
PEP and the PDP. These application-independent PEPs communicate over a separate
communication channel and serve the resolved policies to their application-dependent PEP.
The requirements set by our proposed models, defined in Section 1.3, can be seen as a subset
of the requirements formulated by TAS3. We specifically offer an evaluation of our proposed mod-
els that takes into account the differences between by-me and about-me data. Although offering
a generic solution greatly increases interoperability, our solutions are not built with federation as
the main focus.
3.4 PrimeLife
The PrimeLife Project [5] was a research project conducted in Europe under the Seventh
Framework Programme (FP7), concerned with privacy and identity management of individuals.
It addresses newly appearing privacy challenges in large collaborative scenarios where
users leave a life-long trail of data behind them as a result of every interaction with services.
Its extensive research domain investigates privacy enhancing techniques in areas such as policy
languages, infrastructure, service federation and cryptography.
The Privacy and Identity Management for Europe (PRIME) project [8], conducted in FP6 as the
predecessor of the PrimeLife project, also offers valuable insight into privacy and identity management. It
uses pseudonymous identities to achieve different levels of unlinkability between users and their
personal data trails in order to avoid profiling and preserve privacy. Moreover, it strives to give
back control to the end user by designing an architecture that enforces pre-agreed data protec-
tion policies of shared objects. The functioning of such a design is highly dependent on the trust
level given by the end users to service providers. PRIME investigates the different layers
of trust. A system that lets individuals share data with a pre-agreed data handling policy needs
to be backed by strong technical measures that provide trust and assurance. Major technical
solutions to achieve trust are rooted in verification of trusted platforms in order to guarantee that
remote services are privacy compliant.
The PrimeLife project follows the work outlined in PRIME. One of its major contributions is the
investigation and design of a suitable policy framework that encompasses the privacy features
which promote user-centrality and control of private data. The proposed solution is centred around
the development of the PrimeLife Policy Language (PPL) [33] which is a proposed extension of
the existing XACML [4] standard.
Figure 3.2: Collaboration Scenario
The core idea of how PrimeLife intends to use PPL to facilitate privacy options can be
described using the simple collaboration diagram in Figure 3.2. The scenario describes the inter-
action between the Data Subject (DS), who is considered the average user or data owner whose
privacy needs protection; Data Controller (DC), which denotes a wide range of service providers
that the user can be interacting with; and the Third Party, who is considered to be another entity
involved in the business process, like an associate of the service provider. The interaction is initi-
ated by the Data Subject who is requesting some sort of resource from the DC. The DC responds
with its own request, describing what kind of information he expects from the user in exchange
for the resource, and how he is willing to treat that information. The description provided by the
DC of how he will treat private personal data is called a Data Handling Policy (DHPol). (Figure
3.2 source: http://primelife.ercim.eu/images/stories/deliverables/d5.3.4-report on design and implementation-public.pdf)
The DS
examines the list of information requested together with the DHPol, and combines it with his own
Data Handling Preference (DHPref). The DHPref is the user’s way to describe how his personal
disclosed information is preferred to be treated. A combination between the DHPol and DHPref
results in a Sticky Policy that is sent together with the requested personal data, in exchange for
the resource. The Sticky Policy contains all the relevant data protection rules which have to be
respected by the DC. The direct collaboration between DS and DC ends here. However, the DC
may decide to forward the collected personal data from the DS to a Third Party. In this case, the
DC has to consult the Sticky Policy first, in order to examine whether he is allowed to forward the
information collected from DS or not, and act accordingly. In order to support such a scenario an
expressive language is needed. The PPL is a highly descriptive and easily extendible language
that can support the collaboration scenario described above.
PPL builds on the concept of the existing Sticky Policy paradigm, which serves as the basis
for many privacy and data security related research projects [5][9][28][22]. Sticky Policies are
data access rules and obligations formalized for machine interpretation that are tied together with
a given data object which they protect. The intuition behind it is that data moves around across
multiple control domains together with its associated Sticky Policy, which in turn describes how the
data can be treated. This requires the data object to be closely coupled with its Sticky Policy. In
order to assure that these policies will not get stripped off and ignored, certain Policy Enforcement
Points (PEP) are required to enforce their usage.
One of the contributions that the PPL brings to the existing Sticky Policy paradigm is the
two-sided data handling policy/preference that lets the DS and DC formulate a sticky policy suitable for
both needs. As PPL is designed to be interpreted by machines, it also comes with an automated
matching engine that resolves conflicts between DHPol and DHPref. It is a symmetric language
that requires both parties of the interaction to formulate their policies in this language.
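The matching step can be illustrated with a deliberately simplified sketch. Reducing a policy to a set of allowed purposes plus a retention limit is our own simplification for illustration, not PPL's actual data model, and all names below are hypothetical:

```python
# Illustrative sketch of a PPL-style matching step (not the actual PPL engine):
# a DHPol and a DHPref are each reduced to a set of allowed purposes and a
# retention limit, and the resulting Sticky Policy is their most restrictive
# combination.
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    purposes: frozenset   # purposes the data may be used for
    retention_days: int   # maximum number of days the data may be kept

def match(dhpol, dhpref):
    """Return a Sticky Policy if the DC's proposal (DHPol) is compatible with
    the user's preference (DHPref); None signals a mismatch that would
    require explicit user consent."""
    purposes = dhpol.purposes & dhpref.purposes
    if not purposes:
        return None  # no purpose that both sides agree on
    return Policy(purposes, min(dhpol.retention_days, dhpref.retention_days))

dhpol = Policy(frozenset({"billing", "marketing"}), retention_days=365)
dhpref = Policy(frozenset({"billing"}), retention_days=90)
sticky = match(dhpol, dhpref)
assert sticky == Policy(frozenset({"billing"}), retention_days=90)
```

The point of the sketch is that the Sticky Policy is never more permissive than either input, which is what makes the automated matching safe to run without user interaction in the common case.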
The language offers a strong expressive nature by which complex policies can be formulated
to accommodate different use case scenarios. Provisional actions and required credentials can
be specified in order to require some authentication before authorization. Data can be protected
by a purpose of usage, which constrains the actions that DCs can take
with the collected data. The language also allows users to express whether their data can be forwarded to
third parties or not, and under what conditions. More complex use cases can be modelled through
the use of obligations. Obligations are a set of actions that have to be taken when triggered by a
specific event. For example, an obligation could require that an acknowledgement be sent back to the
data owner every time his shared personal data gets forwarded to a third party.
Research involving the development of the PPL [32] is also concerned with how individuals
fit in this new policy framework. Novel methods for human-computer interaction are required
in order to ease the task of formulating complex data protection policies for the end user, since
the DHPref fully relies on the assumption that the end user is able to comprehend and formulate
his own policy. Moreover, situations where the policy matching engine is unable to combine
a DHPol with a DHPref require explicit consent and interaction from the end user before the
process can continue. In order to keep the demand for human interaction low, an expressive User
Interface (UI) needs to be provided.
This thesis considers the PrimeLife Policy Language (PPL) as its main tool by which privacy
guarantees are provided. However, instead of focusing on the language components of the PPL,
it targets the enforcement model that can be used together with it.
3.5 Other Privacy Enforcement Techniques
3.5.1 DRM approach
Digital Rights Management (DRM) systems are used in order to offer a protection mechanism
of distributed digital content over the web. They offer technical means, such as cryptography
and access control, to safeguard the access to protected content. To achieve this, specialized
software needs to be deployed on the machines of clients requesting access to these protected
data objects. Once the digital content is distributed to the client side, the DRM system prevents
unauthorized usage of it.
User privacy protection, just like distributed content protection, is concerned with the
safeguarding of personal user data. The valuable resource of privacy protection is the personal data
itself. It is easy to observe the parallel between the requirements of privacy protection and
distributed content protection, since in both cases the protected resource is digital data. DRM-like solutions have
been proposed to overcome the challenges of privacy protection [20]. The client-side DRM
transforms into a Privacy Rights Management (PRM) system deployed at the Data Controller (DC). This new
component is then responsible for safeguarding private user data once it has been disclosed, by
enforcing the data protection policies applicable for the disclosed data.
It is worth mentioning that DRM systems are not bulletproof, in the sense that they fail to
offer any kind of protection once digital data has been disclosed in plain sight. DRM offers only
a limited amount of protection that can sometimes be overcome by technical means. A proposed
Privacy Rights Management system would suffer from the same limitations. Moreover, the operator of
such a PRM system is required to be trusted by the users who are willing to disclose personal
information. Another consideration is that current DRM systems usually require a client-server
scenario, whereas with entities such as PDVs and interconnected service providers, we are facing
a much more distributed peer-to-peer-like structure, where roles such as DS and DC can be
applied interchangeably on a single entity depending on the context.
3.5.2 Trusted platform
Trust is one of the central requirements when it comes to sharing protected data between
unknown entities. The Trusted Computing Group (TCG) defines trust as “the expectation that a
device will behave in a particular manner for a specific purpose” [3]. The TCG offers a range of
technical solutions to accommodate the rising needs for secure systems.
Security is a concern on both the software and the hardware level. They propose an enhanced
hardware extension that serves as the basis of a trusted system. The Trusted Platform
Module (TPM) is a hardware component closely integrated with the motherboard that offers security
features such as RSA key generation, cryptographic operations, and integrity checks. By possessing
an embedded asymmetric keypair the TPM is considered to be the root of trust for the platforms
using it. Being a hardware component it is also considered tamper resistant.
Several solutions have been proposed [22][30] for achieving privacy protection through the
use of trusted hardware, and TPM in turn, by using software attestation techniques. By using
the functionality of the TPM, the integrity of a running application can be attested dynamically.
Checking the current state of an application against an expected value can bring assurance of the
validity of the application, proving that it has not been tampered with. Privacy protection solutions
use remote software attestation to prove that a given software component is in a valid state on
the remote machine. It can, for example, provide proof that a known privacy policy enforcing
software component is in place on a remote server, which brings assurance to the end user that
his protected data is in capable hands.
3.5.3 Cryptographic techniques
Cryptographic techniques are mainly used to ensure secrecy with regard to the safe storage and
transport of sensitive information. There are initiatives researching their use in the
privacy protection domain as well.
One of the proposed cryptographic models for privacy protection is called Type-based Proxy
Re-Encryption (PRE) [36][18]. It assumes a semi-trusted Policy Enforcement Point (PEP) with
an honest but curious nature, meaning that he is trusted to carry out user intentions, but is also
curious about the shared data for his own purposes. The PEP is trusted to hold the data encrypted
with the data owner's public key together with its sticky policy. When a request arrives that asks
for the data, first an authorization is carried out against the Sticky Policy. On permit, the PEP
re-encrypts the data, such that only the recipient can see it. In this setting the PEP becomes the
proxy that performs the re-encryption. They claim that if the receiving party and the PEP are not
conspiring, it is safe to assume that the PEP is not able to decipher the protected data. The scheme
employs asymmetric keys and assumes that key dissemination and identities are managed and
verified by a trusted third party.
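The core proxy re-encryption idea can be shown with a toy sketch in the style of the classic ElGamal-based scheme of Blaze, Bleumer and Strauss. This is not the type-based PRE of [36][18], the group parameters are insecure demo values, and for brevity the re-encryption key is computed here from both secret keys directly, which a real scheme avoids. It only illustrates how a semi-trusted proxy (the PEP) can turn a ciphertext for Alice into one for Bob without ever seeing the plaintext:

```python
# Tiny Schnorr group: p = 2q + 1 with q prime; g generates the order-q subgroup.
# Demo parameters only; real deployments use cryptographically large groups.
p, q, g = 23, 11, 4

def keygen(sk):
    return pow(g, sk, p)                    # public key g^sk mod p

def encrypt(pk, m, r):
    # ciphertext (pk^r, m * g^r) = (g^(a*r), m * g^r) for pk = g^a
    return (pow(pk, r, p), m * pow(g, r, p) % p)

def rekey(a, b):
    # re-encryption key b/a mod q (simplified: uses both secret keys)
    return b * pow(a, -1, q) % q

def reencrypt(rk, c):
    c1, c2 = c
    return (pow(c1, rk, p), c2)             # g^(a*r * b/a) = g^(b*r)

def decrypt(sk, c):
    c1, c2 = c
    gr = pow(c1, pow(sk, -1, q), p)         # recover g^r from g^(sk*r)
    return c2 * pow(gr, -1, p) % p

a, b = 3, 5                                 # Alice's and Bob's secret keys
m = pow(g, 7, p)                            # message encoded as a group element
c_alice = encrypt(keygen(a), m, r=6)
c_bob = reencrypt(rekey(a, b), c_alice)     # done by the proxy (PEP)
assert decrypt(a, c_alice) == m             # Alice can still decrypt
assert decrypt(b, c_bob) == m               # Bob can decrypt the re-encryption
```

Note that the proxy only ever handles `c_alice`, `c_bob` and the re-encryption key, none of which reveal `m`, which is exactly the honest-but-curious guarantee described above.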
They take this solution further with the type-based PRE, which assumes that there are multiple
proxies from which the user can choose depending on the secrecy and security that he or she
needs. The advantage of this is that if one of the proxies gets compromised, there is only a partial
loss of data.
Following their vision, the PEP proxy can be the same as a semi-trusted service provider, who
is responsible for distributing personal data. A web-based health-record system, for example, is
responsible for the safe storage and management of personal health records. In this simplified
scenario there can be a doctor and a pharmaceutical company both requesting a personal health
record for different purposes. Let us assume that the owner of the health record specified in his
policy that data can be forwarded to his doctor, but not to any pharmaceutical company. Since the
health-record system is only semi-trusted, the user stores his data encrypted with his public key,
and only provides a re-encryption key tied to the identity of the trusted doctor. In this scenario
the PEP of the health-record system will only be able to re-encrypt the ciphertext for the eligible
doctor. Even if he tries to examine or forward the personal health record to the pharmaceutical
company, all they will see is the ciphertext. Thanks to the encryption with the data owner's public key, his
privacy will be protected, since neither the health-record system nor the pharmaceutical company
will be able to decipher it.
The solution outlined above, however, is only suitable for a subset of existing use cases. It
does not take into account, for example, service providers who process user data in an
automated manner. Such processing becomes impossible under the PRE model, since the data is
encrypted.
Other research projects [15] investigate the potential of self-destructive data. In order to avoid
the persistence of user data in data copies, the self-destructive data model offers a method to
render data unavailable after some period of time for everybody, even for the owner of the data.
Their motivation is to avoid unauthorized disclosure of information even if it means losing the
information completely. Some private data, such as private emails, does not need any persistence
after it has been received and viewed.
They employ a cryptographic method called threshold-based secret sharing, where a
symmetric encryption key is split into multiple pieces but can be reconstructed from a threshold number
of key pieces. By their design, personal data gets encrypted with a randomly generated key that
gets split into multiple pieces and scattered in pseudorandom locations on a Distributed Hash
Table (DHT). The ciphertext, together with hints about the key pieces, is then transmitted to the
recipient via some service. In order for the receiver to be able to decipher the data, he has to
recompute the shared key from its pieces. The receiver only has to retrieve a subset of the
scattered key pieces from the DHT in order to recompute the encryption key. Once the key is
recomputed, he can access the received object. Their security model relies on the high churn rate
[34] of the DHT, which makes key shares impossible to retain after a given time, either because
responsible nodes leave the system or data expires and gets deleted. The churn rate refers to
the rate at which nodes enter and leave the DHT system.
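The threshold scheme underlying this design is typically Shamir's secret sharing. A minimal sketch follows; the field size is a demo choice, and a real system would scatter the shares across a DHT rather than keep them in one list:

```python
# Minimal sketch of Shamir threshold secret sharing over GF(P): the key is the
# constant term of a random degree-(t-1) polynomial; any t of the n shares
# reconstruct it via Lagrange interpolation at x = 0.
import random

P = 2**61 - 1  # a Mersenne prime, used as the field modulus

def split(secret, n, t):
    """Split `secret` into n shares, any t of which reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

key = random.randrange(P)
shares = split(key, n=10, t=4)          # these would be scattered in the DHT
assert reconstruct(shares[:4]) == key   # any 4 shares suffice
assert reconstruct(shares[3:7]) == key
```

Fewer than `t` shares reveal nothing about the key, which is why losing most shares to DHT churn eventually renders the ciphertext permanently unreadable.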
3.6 Summary
This chapter focuses on the description of existing privacy enhancing techniques. The
eXtensible Access Control Markup Language (XACML) is an accepted standard that comes with
a descriptive resource protection language and a high-level architecture. Given its flexibility, it is
employed as the basis for much privacy-related work.
Usage Control (UCON) represents a vast research area that is focused on the protection
of user data through its whole lifecycle: before authorization is granted, during authorization,
and after authorization. Two main concepts introduced by it are the mutability of attributes and
continuity of access decision.
TAS3 is another initiative focused on multiple aspects of privacy protection, mainly
interoperability and federation. The requirements formulated in Section 1.3 are a subset of the high-level
requirements defined by TAS3.
The PrimeLife project offers a privacy protection model based around the PrimeLife Policy
Language (PPL) and the Sticky Policy paradigm. Given the highly descriptive nature of the PPL, the
presented research focuses on how it can be used together with Personal Data Vaults, and what
kind of enforcement models can be built to support it.
Digital Rights Management (DRM) systems exhibit similarities with privacy protection, although
they do not completely cover every aspect of it. The Trusted Computing Group (TCG) conducted
relevant research in developing a trusted computing platform, which in turn can be used for privacy
protection. Other initiatives include cryptographic methods to protect the correct dissemination of
user data, only granting access to authorized parties.
4 System Design
Contents
4.1 PrimeLife Policy Language (PPL) Integration . . . . . . . . . . . . . . . . . . . 28
4.2 Verifiable Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.3 Trusted Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.4 Mediated Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Chapter 4 is dedicated to describing the policy enforcement models proposed by this thesis.
The chapter begins with an evaluation of the PrimeLife Policy Language (PPL) in Section 4.1 with
regard to its integration into the PDV design. The descriptions of three privacy enforcement models
follow, highlighting the novel solution proposed by this thesis in Section 4.4.
4.1 PrimeLife Policy Language (PPL) Integration
In order to meet the requirements in Section 1.3 we base our approaches on the existence
of a well-defined policy framework. This policy framework has to facilitate an extensible and
descriptive policy language that can easily be adapted in specialized use cases. The XACML
policy framework serves as a suitable choice, since its abstract architecture design and flexible
policy language makes it applicable in a variety of use cases. Unfortunately, XACML
was designed to provide a descriptive access control mechanism, and only comes with a weak
privacy profile. The PrimeLife Policy Language (PPL), however, outlines a
privacy-oriented XACML extension, which allows for a better approach. We will evaluate how the language
features of the PPL can fit our requirements.
Trust between two parties who are about to exchange personal information has to be estab-
lished prior to any access control decision. PPL provides two language features that have to
be fulfilled by the data requester: CredentialRequirements and ProvisionalActions. Credential-
Requirements contains a set of credentials that have to be provided by the requester to attest
a required attribute. These credentials are usually tied to a verifiable identity. By verifying each
other's credentials, both parties can assume a basic trust level. ProvisionalActions can refer
to any action that has to be carried out prior to any access decision. This can refer to signing a
statement or spending some credential (if the requested resource can only be accessed for a
limited time).
Transparency of user data handling refers to the DS’s knowledge of how his personal data will
be treated by the DC. The PPL facilitates the use of the Sticky Policy paradigm through the Data
Handling Policy (DHPol) and Data Handling Preference (DHPref). The DHPol is the DC’s proposal
on how private user data will be used. The final policy, however, that provides transparency is the
sticky policy itself. Sticky Policies are created from resolving the DHPol and DHPref that refer to
the same object, and are composed of Authorizations and Obligations. Authorizations describe a
specific purpose for which a data object can be used, while Obligations can be used to express
more fine-grained control.
Authorizations also contain authorizations on downstream usage together with a purpose.
Downstream usage refers to the disclosure of personal information from the DC to Third Parties.
This language feature allows for a description on how personal data can be forwarded and used
across multiple control domains. The purpose attached to the downstream usage gives the user
an even greater flexibility in describing under what circumstances data can be forwarded. In data
forwarding scenarios the forwarded data copy has to have a Sticky Policy at least as strict as the
original data copy, in order to avoid the degradation of the protection level.
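The "at least as strict" requirement for forwarded copies can be sketched as a simple comparison. Reducing a Sticky Policy to allowed purposes plus a retention limit is again our own simplification, not PPL's actual representation:

```python
# Hypothetical sketch of the downstream-usage check: a forwarded copy's policy
# must permit nothing that the original policy forbids, so the protection
# level can only stay the same or grow stricter.

def at_least_as_strict(forwarded, original):
    """True if `forwarded` allows no more than `original` does."""
    return (forwarded["purposes"] <= original["purposes"]
            and forwarded["retention_days"] <= original["retention_days"])

original = {"purposes": {"research", "billing"}, "retention_days": 180}
copy_ok = {"purposes": {"research"}, "retention_days": 90}
copy_bad = {"purposes": {"research", "marketing"}, "retention_days": 90}

assert at_least_as_strict(copy_ok, original)       # forwarding allowed
assert not at_least_as_strict(copy_bad, original)  # would degrade protection
```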
The main language feature that offers control for the end user is the Sticky Policy itself. Control
over the usage of a specific shared private information can be achieved by the modification of the
attached Sticky Policy. This method allows the user to modify as well as revoke access.
Obligations, being part of Sticky Policies, allow the user to set constraints on data after it
has already been shared. One such Obligation, for example, could require the DC to delete the
collected data after a specified amount of time.
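Obligation handling at the DC's Privacy Manager can be sketched as an event-to-action dispatch. The class and event names below are hypothetical, not PPL's actual API:

```python
# Illustrative sketch of event-triggered obligations: each obligation pairs a
# trigger event with an action, and the engine fires the matching actions
# whenever an event occurs.
from collections import defaultdict

class ObligationEngine:
    def __init__(self):
        self.handlers = defaultdict(list)  # event name -> list of actions

    def add_obligation(self, trigger, action):
        self.handlers[trigger].append(action)

    def notify(self, event, **context):
        for action in self.handlers[event]:
            action(**context)

log = []
engine = ObligationEngine()
# "delete the collected data 30 days after disclosure"
engine.add_obligation("data_disclosed",
                      lambda data_id, **_: log.append(("schedule_delete", data_id, 30)))
# "acknowledge the owner whenever data is forwarded downstream"
engine.add_obligation("data_forwarded",
                      lambda data_id, recipient, **_: log.append(("notify_owner", data_id, recipient)))

engine.notify("data_disclosed", data_id="d42")
engine.notify("data_forwarded", data_id="d42", recipient="third-party")
assert log == [("schedule_delete", "d42", 30),
               ("notify_owner", "d42", "third-party")]
```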
The architecture of the system outlined by the PrimeLife project requires specialized
software, or multiple interconnected software components, responsible for carrying out the
features described by the language. Moreover, it is supposed to do this in a highly automated
manner, working with predefined access control policies, matching DHPol with DHPref, and enforcing
Sticky Policies. This 'always-on' software component can be associated with the traditional access
control systems of service providers, which portray the DC. However, this specialized software
also has to be present at the DS site, which often portrays the end user. The PDV is a suitable
data organization scheme which can integrate any kind of specialized software.
This PrimeLife architecture also shows a high resemblance to the initial XACML
architecture presented in Section 3.1, relying on components such as the Policy Enforcement Point (PEP)
and Policy Decision Point (PDP) to carry out access decisions. The PEP component, however,
becomes a crucial building block that is responsible for evaluating and enforcing Sticky Policies.
We will refer to this specialized software, often residing on a PEP and enforcing privacy policies,
as the Privacy Manager (PM).
As the Sticky Policy is considered to be the main element of user data protection, we also
introduce the abstraction of Protected Data (PD). The PD encapsulates the user data object and
its Sticky Policy under a single unbreakable logical unit. Throughout the formalization of policy
enforcement models we will use the PD terminology when talking about a shared data object
guarded by a Sticky Policy.
The following details the design and description of the enforcement models that are applied to
provide privacy guarantees by Sticky Policy enforcement using the PM.
4.2 Verifiable Privacy
This section presents the Verifiable Privacy (VP) policy enforcement model together with
aspects of its architecture design and its interaction model.
4.2.1 Description
This model relies on remote software verification and monitoring solutions, hence its name:
Verifiable Privacy. This section is dedicated to describing a solution involving enhanced hardware
security. As the software systems running on machines become more complex and
layered, keeping track of security aspects becomes increasingly difficult. Software bugs and
vulnerabilities are an unavoidable side effect of every system in production. In order to mitigate
the problem of insecure software solutions, today's hardware components are built with strong
security aspects in mind.
The Trusted Computing Group (TCG) is a pioneer in the field of secure hardware. They offer
a range of integrated components that can help carry out certain security measures. One of their
main focus areas is the Trusted Platform Module (TPM), which is an embedded hardware
component that provides a root of trust in the system. By providing strong cryptographic functionality
together with key generation, integrity checks, storage and reporting, the TPM provides a form
of attestation on the security measures of the software running on top of it. In more detail, the
TPM provides signed attestations of PCRs (Platform Configuration Registers), which contain
information regarding the integrity, configuration and state of a software component. These signed
attestations can be verified by external parties. The TPM does not provide any security
solution on its own; rather, it serves as a basis of trust between entities.
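The PCR mechanism can be sketched as follows. This is a rough simplification: a real TPM signs quotes with a TPM-resident attestation key, whereas here an HMAC with an assumed shared demo key stands in for the signature:

```python
# Rough sketch of PCR-based attestation: measurements are folded into a PCR
# one by one, and a verifier who knows the expected measurements can check a
# quote over the final PCR value.
import hashlib, hmac

def extend(pcr, measurement):
    """PCR extend: new value = H(old value || H(measurement))."""
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()

def quote(pcr, nonce, attestation_key):
    """Signed report of the PCR value, bound to the verifier's fresh nonce."""
    return hmac.new(attestation_key, pcr + nonce, hashlib.sha256).digest()

measurements = [b"bootloader-v1", b"kernel-v3", b"privacy-manager-v2"]

# Boot-time measurements extend the PCR step by step on the attested machine.
pcr = b"\x00" * 32
for component in measurements:
    pcr = extend(pcr, component)

# A remote verifier recomputes the expected PCR from known-good measurements
# and compares the quotes (here trivially, since the HMAC key is shared).
key, nonce = b"shared-demo-key", b"fresh-nonce"
expected = b"\x00" * 32
for component in measurements:
    expected = extend(expected, component)
assert hmac.compare_digest(quote(pcr, nonce, key), quote(expected, nonce, key))
```

Because the extend operation is one-way and order-sensitive, a single tampered component changes the final PCR value and the quote check fails.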
The Verifiable Privacy relies on a solution that harnesses the power of the security enhanced
hardware technology as an enforcement and trust mechanism. On top of this hardware, a
DRM-like software solution is responsible for attesting and verifying privacy settings of sensitive data.
This DRM-like solution, referred to as the Privacy Manager (PM), intercepts all accesses to
private data from running applications and performs local access control decisions. The correct
functioning of the TPM and the PM components is supported by another mechanism to keep the
running applications in a secure sandbox, isolated from unauthorized actions. In the next sections
we will elaborate on how these components fit together and what their responsibility is.
4.2.2 Prerequisites
Since Verifiable Privacy employs the power of security-enhanced hardware, it is a
prerequisite that every actor and machine involved in the transactions be equipped with a
TPM. We assume that these machines are secured from any physical tampering, rendering the
TPMs tamper-proof.
The TPM is also responsible for key generation and management for multiple purposes. It
generates asymmetric keys for both the Privacy Manager (PM) and any application running on
top of the platform. TPM is also used to verify that the public keys of these software components
are indeed bound to that specific machine. It has an internal safe storage of known keys, which
can be used to re-encrypt data depending on the requester. Apart from the keys that are meant to
be used by software, TPMs come equipped with a root key whose private part is embedded
in the hardware. Encrypting data with the public counterpart of this root key brings assurance
that every data access by any software will have to consult the TPM first in order to release
the private information.
Moreover, the PM should also be present on each PDV and service provider, in order to
facilitate an interface for exchanging privacy-related information between the actors. This component
can be placed at different layers, as we will see later on, but its main purpose remains to carry out
privacy related actions, such as: remote attestation, trust establishment, and policy enforcement.
We are also assuming the existence of a certain Trusted Third Party (TTP), which plays an
important role in the correct functioning of the monitoring and assurance system described below.
4.2.3 Architecture
This solution tries to approach the sticky policy enforcement problem by simply assuming that
every machine that is involved in handling protected user data is essentially a Policy
Enforcement Point (PEP). As such, it focuses on the design of a common architecture for PEPs that
will facilitate the interoperability of the system across multiple nodes, regardless of their control
domain.
When it comes to designing the architecture of a single PEP, we are faced with multiple
choices. The base architecture, however, as depicted in Figure 4.1, stays the
same. As one of our prerequisites, we have the TPM equipped hardware at the bottom layer.
Figure 4.1: Verifiable Privacy: Abstract Architecture of a single PEP node
On top of the hardware layer we have an abstraction called the Common Platform. When
deciding what the Common Platform should be, we have to take into consideration the level of
isolation that we need to provide in such a system. Applications will have to reside in their own
isolated space, such that interactions that happen outside of this isolated space can be monitored.
This restriction becomes especially important when private data objects are transmitted to third
parties. The communication between applications and transmission of data between two separate
isolated spaces should happen with the consent and permission of the PM, who in turn enforces
Sticky Policies. In practice the Common Platform can be two things:
1. A trusted operating system could take the place of the Common Platform. The isolation
space, in this case, would be provided by the process Virtual Machines (VM) of the shared
operating system. Monitoring, in this case, would be done on the hosting operating sys-
tem, since inter process communications and external communications all go through the
operating system.
2. Another solution would be to replace the Common Platform with a hypervisor, and let
standalone services run in their own system virtual machine, thus offering isolation on the
operating system level. Virtualization technology is maturing rapidly, sometimes even achieving
nearly native operating system speeds. Virtualization is also a commonly employed
solution in cloud environments, which in turn host several client-oriented services on the
web. System VMs are much more heavyweight than their process VM counterparts, so
some planning is needed when instantiating new services in order not to waste
resources.
The strength of this model is also its drawback. Having applications running in their isolated
spaces with the PM attached to them assures that they are subjected to continuous monitoring
and verification. The Verifier component makes sure that only eligible applications get access to
personal data, while the Monitor keeps track of ongoing system events to avoid misuse. Through
monitoring and verification the system delivers proof of trust and assurance to its users. This,
however, comes at the price of a strict architecture design.
4.2.4 Privacy Manager Architecture
The Privacy Manager (PM) is the specialized software component, which is responsible for the
localized enforcement of privacy policies. Whenever a Protected Data (PD) object is requested,
either from an internal application or from an external entity, the PM is trusted to evaluate and
enforce the Sticky Policy of the respective PD. Moreover, it is also responsible for delivering trust
and assurance of its correct functioning through the Verifier and the Monitor components.
4.2.4.A Verifier
Verification is a pro-active measure that is taken prior to any data disclosure, which is at
the heart of this model. Trust that the user's intentions are going to be carried out is partially
rooted in the verification system. As complex systems are built in multiple layers it is important to
provide verification from the lowest level (which is the hardware) to the highest one (which are the
applications providing a certain service).
Hardware verification is at the bottom layer and is done by the technology developed by the
TCG. The TPM assists the software verifier in attesting that a specific software component is
indeed running on top of the host platform. The states of different applications are kept hashed
in the TPM registers, and they are signed and transmitted to any requesting party on demand.
This way, the requester can be assured that the machine he is communicating with has a software
component running in the provided state.
The Verifier component of the PM is responsible for carrying out the TPM assisted software
verification. We distinguish two independent software components that need verification: one
being the application providing some service, and the other being the PM itself. In order to build
a trust framework, the verification of these components is carried out by different means.
The PM component has to be verified to be in a valid state, since it is the core policy enforcing
mechanism of the model. To provide assurance of a correctly functioning PM, remote software
verification is needed, where the verifier entity is independent of the verified subject. The
preferred solution would be to make the communicating parties verify each other's PMs. This would
require an open PM specification and design, such that all of its valid states are known prior to
any interaction and are verifiable by anybody. Another alternative would be to outsource the
responsibility of verification to a Trusted Third Party (TTP), which could be the developer of the PM
or any other authority. An additional TTP, however, will affect the scalability and complexity of the
whole system. Further discussion on the identity of the verifier is out of scope for the purpose of
this thesis.
The verification of the application component, on the other hand, can be carried out locally at
every node by the PM. The intuition behind it is that since the PM is remotely verified, it is trusted
to carry out local verifications in a truthful manner. The Verifier component is entrusted to do a
local software verification attested by the TPM of every application that requests access to some
protected resource.
4.2.4.B Monitor
Verification on its own only gives partial assurance about the behaviour of the communication
partner. Certificates confirming the state of a remote software component could be vague or not
descriptive enough. The Monitor component will complement the Verifier by providing a reactive
monitoring service, in order to keep track of ongoing actions in the system, and notify the PEP
whenever a potentially illegal operation is encountered.
Monitoring goes hand in hand with log keeping. Logs are a powerful mechanism for reviewing
past events that serve as evidence for or against a violation. The TPM assists the monitoring
system by providing authenticity for the logs, as long as the monitoring system is trusted to do
proper bookkeeping.
The main reason behind executing applications in their own isolated space is that they can be
monitored from the outside. Both process and system VMs offer solutions to monitor and intercept
system calls and translate them into native behaviour. This way the monitoring service could
attach itself to crucial system calls and monitor their execution. A store operation for example
could be evaluated before execution, in order to verify whether the sticky policy allows the data to
be stored.
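The intercepted store operation could be sketched as follows. This is a minimal, hypothetical Monitor hook, assuming a simple policy shape (`StickyPolicy`, `ProtectedData`, `intercept_store` are all illustrative names, not part of any concrete PM implementation).

```python
# Hypothetical sketch of a Monitor hook that evaluates the Sticky
# Policy attached to a data object before a store operation is
# allowed to execute, logging the decision either way.
from dataclasses import dataclass, field

@dataclass
class StickyPolicy:
    allow_store: bool = False
    allow_forward: bool = False
    allowed_purposes: set = field(default_factory=set)

@dataclass
class ProtectedData:
    payload: bytes
    policy: StickyPolicy

class Monitor:
    def __init__(self):
        self.log = []  # append-only log, assumed TPM-attested elsewhere

    def intercept_store(self, pd: ProtectedData, target: str) -> bool:
        """Evaluate a store operation before it executes."""
        allowed = pd.policy.allow_store
        self.log.append(("store", target, allowed))
        if not allowed:
            # a real system would notify the PEP here; we simply refuse
            return False
        return True
```

A store on data whose policy forbids storage is refused and the attempt is logged, which is the reactive behaviour described above.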
Interactions often require data to be transmitted between different applications. These commu-
nicating parties could either be internal or external, and both need a method of monitoring. Two
applications are internal if they both reside on the same machine, or external otherwise.
4.2.5 Interaction Models
In the following sections we examine the interaction of two separate entities, highlighting the
important parts of the protocol used for exchanging Protected Data (PD) objects. Afterwards, the
case when multiple Data Controllers are requesting the same PD is examined.
4.2.5.A Data Flow
The Privacy Manager is responsible for managing private user data that has been shared
with a remote system. Just like a DRM system, the PM treats the user data as the protected
resource and applies access control on it. Moreover, it goes beyond the standard DRM system,
by providing a fine-grained access control that looks at data accesses on per application basis.
Every application is evaluated independently before granting access to data.
Figure 4.2: Verifiable Privacy: Interaction diagram between a PDV and a Service Provider (SP)
A high-level interaction diagram can be seen in Figure 4.2, which follows a simple scenario with
the PDV playing the role of the DS, and the SP being the DC. The App, who is considered to be
the service running at SP, is requesting some user data from the PDV. The first two steps are part
of the communication protocol between the two actors by which they establish trust and share
protected information. The third step describes the data access by the requester App on the SP
side, while the fourth step depicts a potential forwarding of data to an external entity. Note that an
internal access from a second application, App2, would also happen through the PM. An external
forwarding, on the other hand, will initiate another round of the communication protocol involving
steps 1 and 2, with the SP playing the role of the resource owner. In order to accommodate the
need for a system where the roles of DS and DC can be assigned to PDVs and service providers
interchangeably depending on the context, the communication protocols should be the same, regardless of the real identity of the actors. It follows that an interaction diagram between two PDVs or two service providers obeys the same principles.
The first step of the communication protocol establishes the trust relationship between the two parties through mutual verification of each other's systems. Usually the data requester (the
service provider in our case) initiates the protocol by sending a signed certificate proving the
validity of the Privacy Manager (PM) component running on his machine. This certificate is usually
signed by the TPMSP proving that the PMSP has not been tampered with and it is in a valid state.
It also contains the public key of the PMSP such that the user can encrypt sensitive data with it. A
similar certificate from the PDV is sent back to the SP as proof that PMPDV is also genuine and
it is attested by TPMPDV .
The second step in the communication protocol is the exchange of private information be-
tween the parties. In our case, the PDV is sharing some information with SP. The shared data is
encrypted with a secret key that will also be transferred under the protection of the receiver's public key.
Moreover, the data will be bundled together with its Sticky Policy, forming a Protected Data (PD)
object. The Protected Data is kept in the secure storage of the PMSP .
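The construction of a Protected Data object can be sketched schematically. The `toy_encrypt` and identity key-wrap functions below are stand-ins for a real hybrid scheme (e.g. an authenticated symmetric cipher plus public-key key wrapping); they are illustrative assumptions, not actual cryptography.

```python
# Schematic construction of a Protected Data (PD) object as described
# above: the payload is encrypted under a fresh secret key, the secret
# key is wrapped for the receiver, and the Sticky Policy travels with
# the ciphertext. toy_encrypt is a XOR keystream placeholder, NOT a
# secure cipher.
import os
import hashlib

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    # keystream derived from the key; XOR makes it its own inverse
    stream = hashlib.sha256(key).digest() * (len(data) // 32 + 1)
    return bytes(a ^ b for a, b in zip(data, stream))

def make_protected_data(payload: bytes, sticky_policy: dict, wrap_for_receiver):
    secret_key = os.urandom(32)  # fresh per-object secret key
    return {
        "ciphertext": toy_encrypt(secret_key, payload),
        "wrapped_key": wrap_for_receiver(secret_key),  # under receiver's public key
        "policy": sticky_policy,  # Sticky Policy bundled with the data
    }
```

With the wrap function stubbed as identity, decrypting the ciphertext with the (unwrapped) secret key recovers the original payload.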
After the communication protocol has concluded, the copy residing on the SP side can be re-
quested by applications running on the same machine. Since data is kept encrypted by a secret
key, in order for an application to get access to it, first it needs to be re-encrypted with PubApp
by the PMSP . The PMSP only does this after the state of AppSP is verified to be valid, and in
accordance with the Sticky Policy guarding the data. Once access is granted, the App receives a
copy of the data protected by its public key.
The final step in the interaction diagram is the forwarding of shared data to a third party. This
step is part of the interaction diagram, since data forwarding happens very frequently in every
data processing system.
Whenever data is forwarded another round of the communication protocol described in steps
1 and 2 is initiated by the SP and the Third Party. In this interaction the SP will assume the
role of the data owner (DS) and the Third Party becomes the requester (DC) who initiates the
protocol. Trust is established between the parties just as before, but before the data transfer can
take place the SP has to verify that the forwarding is in accordance with the Sticky Policy. As long
as the PMSP has been verified to be in a correct state, the PM is trusted to carry out the user
preferences.
Every forwarding action on private user data should be logged by every party, such that proof
can be provided to the original data owner that his intentions were enforced during data process-
ing. Logs should be aggregated by the original data requester and provided to the original data
owner on a periodic basis.
4.2.5.B Forwarding Chain
The Data Flow, described in Section 4.2.5.A, only specifies the interaction of two entities. The
Forwarding Chain, on the other hand, describes how data is shared across multiple parties. The
Forwarding Chain is a tree-like structure of nodes that share a copy of a Protected Data (PD)
object. The root of the Forwarding Chain is the source of the private data, such as a PDV.
Figure 4.3 illustrates how a Forwarding Chain might be built up on a single user object, which is
a Personal Health Record (PHR). A follow-up scenario of the one presented in Section 2.1.2 using
the healthcare system can be considered. The owner of the PDV shares his PHR with a Hospital
Service under the protection of a Sticky Policy. The Hospital Service is in close collaboration with
two other entities: a Pharmacy and a Research Center, so it shares collected Protected Data with
them, assuming the Sticky Policy allows it. These two entities in turn can share the Protected Data
themselves, like in the case of the Research Center publishing information on a News Service,
thus creating a chain of forwarded data. It is worth noting that every link between two entities in
this diagram actually represents an interaction based on Section 4.2.5.A.
Figure 4.3: Verifiable Privacy: Example of Forwarding Chain on Personal Health Record
Whenever Protected Data is forwarded to a Third Party we can distinguish three different
scenarios:
1. In the simplest case the Protected Data is shared as a whole, without modification. In this
case the data copy can be considered as a duplicate.
2. DCs might decide to share only a fragment of the original data, in order to promote data
minimization. The Hospital Service, for example, might decide to share only a subset of the
information residing in a PHR with its Pharmacy partner. The data fragment, however, still
has to be protected with the same Sticky Policy to assure that it will not be misused.
3. In some cases the DC might want to disclose protected data under a stronger level of pro-
tection. In our example, the Hospital Service shares PHR with the Research Center under a
stricter Sticky Policy, thus limiting the scope of usage of the data. In case of Protected Data
forwarding, Sticky Policies can always be made stricter, but can never be made weaker. This
rule ensures that the original policies set by the data owner will always be respected.
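The stricter-only rule lends itself to a simple subset check. The sketch below assumes a policy is represented as sets of permitted operations and purposes; this shape is an assumption for illustration, not a prescribed policy format.

```python
# Minimal check, under an assumed set-based policy shape, that a
# policy proposed for forwarding is at least as strict as the
# original: it may only remove permitted operations and purposes,
# never add new ones.
def is_at_least_as_strict(original: dict, proposed: dict) -> bool:
    return (proposed["operations"] <= original["operations"]
            and proposed["purposes"] <= original["purposes"])

original = {"operations": {"read", "store", "forward"},
            "purposes": {"treatment", "research"}}
stricter = {"operations": {"read"},
            "purposes": {"research"}}
weaker = {"operations": {"read", "store", "forward", "sell"},
          "purposes": {"treatment", "research"}}
```

A forwarding party would run this check before disclosing a Protected Data object under a modified Sticky Policy, rejecting any weakening attempt.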
In order to maintain the Forwarding Chain structure, every node is responsible for keeping routing tables with pointers to the previously disclosed data. The maintenance of up-to-date pointers
is a crucial requirement for the logging and the control system described below.
The Monitor component being part of every PM is responsible for keeping logs on every node.
Logs provide traces on the processing of every Protected Data, which can be viewed as assurance
of data protection. Given the distributed nature of the Forwarding Chain, every node holds a
fragment of the logs that are relevant for a single data piece. In our previous example, each
entity keeps logs about the data processing done on the shared PHR. In order to turn logs into
assurance they have to be aggregated and verified. Verification could be carried out locally to
every node as well, thus skipping the step of aggregation. The local solution, however, offers
relatively less assurance than its counterpart.
We delegate the responsibility of log aggregation and verification to a Trusted Third Party
(TTP), which can play the role of an audit company. The TTP has to collect these logs either by
direct collaboration or some other means, and perform a verification on them. A final digest is
then periodically sent to the data owner as the final form of assurance. Policy violations that show
up in the logs are also included in the digest.
In order to maintain control over already shared data objects the Forwarding Chain also assists
in manipulating and revoking accesses on Protected Data (PD). When a data owner wishes to update the Sticky Policy attached to some object, he can do so by using a push method that propagates his updates starting from the initial data requester. It is important that the pointers are kept fresh,
such that the chain is not broken. Every party who has a copy of the shared data has to update
its policy locally in case of an update, and forward the update operation to all of its children in
the chain. Every node is also responsible for collecting acknowledgements of the success of the
operation and notify the user about the process.
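The push-style update over the tree-shaped Forwarding Chain can be sketched as a recursive propagation with aggregated acknowledgements. `ChainNode` and `push_update` are illustrative names under the assumption that each node holds live pointers to its children.

```python
# Sketch of the push method along the Forwarding Chain: each node
# updates its local policy copy, forwards the update to all of its
# children, and returns an aggregated acknowledgement so the data
# owner learns whether every descendant applied the change.
class ChainNode:
    def __init__(self, name: str):
        self.name = name
        self.policy = None
        self.children = []

    def push_update(self, new_policy) -> bool:
        self.policy = new_policy  # update the local copy
        acks = [child.push_update(new_policy) for child in self.children]
        return all(acks)  # aggregated acknowledgement for the parent

# Forwarding Chain from the PHR example above
root = ChainNode("PDV")
hospital = ChainNode("Hospital")
root.children.append(hospital)
hospital.children += [ChainNode("Pharmacy"), ChainNode("ResearchCenter")]
```

A broken pointer would surface as a missing acknowledgement, which is why fresh routing tables are a prerequisite for this scheme.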
4.3 Trusted Privacy
This section describes the architecture and functioning of the model which closely resembles
the design outlined by the PrimeLife project [5], together with its predecessor, the PRIME project
[8]. Both are vast projects with years of research behind them. The scope of this description,
however, is to present the main underlying model and design that these projects follow in order to
provide policy enforcement.
4.3.1 Description
Much like the Verifiable Privacy described in Section 4.2, the Trusted Privacy relies on the use of specialized software: the Privacy Manager (PM). The architecture supporting the PM
component is relaxed by employing a middleware oriented design. Apart from the basic Sticky
Policy enforcement that is guaranteed by the PM, it comes with a different view on the employed
trust framework. As its name suggests, the Trusted Privacy model relies on the correct functioning
of the trust framework, which is the composition of two independent sources of trust.
4.3.2 Prerequisites
The Trusted Privacy (TP) model assumes an active PM system on every participating actor,
both for PDVs and service providers. In order to assure full functionality these components should
be fully compatible with one another and tamper free.
For PDVs, the PM acts like a client-side protection system designed to govern every interaction
on the user’s personal data. Queries on user data are passed through the PM layer which assures
that only requesters that are found eligible can access the protected resource. Incorporating such
a component into the PDVs is a straightforward task. On the other hand, the PM system must also
be present at the service provider. This resembles a server-side system which acts as a DRM
software, protecting shared user data. The server-side PM component holds the responsibility
to act and protect on behalf of the clients, by respecting Sticky Policies. As the PM is a central
component of the model, it has to be fully trusted. Mechanisms describing how this trust can be achieved will follow.
The existence of TTPs, which play an important role in achieving the desired trust level, is also assumed.
4.3.3 Architecture
The PRIME project defines two different PM components: one for the client, and one for the
server. In the original PRIME project the client and server-side components are different systems
with different responsibilities. In the PrimeLife project, however, these two blend into a single
component.
Our scenario needs to accommodate PDVs together with service providers, and multiple in-
teractions between them. This requirement leads to a need for uniformity. DS and DC are clear
abstractions of PDVs and service providers. These roles are not fixed, but rather dynamically as-
sumed, based on the context. For example, if PDV1 requests some data from PDV2, it is clear that
PDV1 is the Data Controller and PDV2 is the Data Subject. However, if PDV1 decides to forward
the collected data to a service provider, PDV1 becomes the Data Subject and service provider
the Data Controller. It is easy to see that PDVs and service providers can assume both the Data
Subject and Data Controller role. The need for uniformity discouraged us from using distinct PM
components, thus the PM residing on the service provider has to have the same functionality as
the one on the PDVs.
Figure 4.4: Trusted Privacy: Abstract Architecture of a single PEP node
Conceptually, the PM sits on top of the persistence layer (or Database), as shown in Figure 4.4.
This way it takes the role of a middleware that governs the access over the database system
underneath. The example of Figure 4.4 depicts a PEP node with the installed PM middleware.
The Database system on top of the OS is entrusted with the safekeeping of stored data, and
only lets itself be queried from the layer sitting right above it, and not any of the higher layers.
This prevents the situation where Apps want to bypass the PM in order to get unrestricted access
to some Protected Data (PD). The PM is a middleware that mediates the access to PD from
the upper Application layer. Apart from safe storage and safe access to stored objects, the PM
middleware also plays the role of a monitoring filter. Since interactions between Apps and remote
or local systems are usually mediated through the OS, the convenient placement of the PM allows
it to track ongoing interactions against policy violations.
4.3.4 Privacy Manager Architecture
The Privacy Manager (PM) middleware closely resembles the PM of the Verifiable Privacy (VP)
model in its functionality, however, the mechanisms by which trust and assurance are provided
are different. The description of the Trust Negotiator and the Monitor components follows.
4.3.4.A Trust Negotiator
Although using the PM would mean that every privacy rule is enforced by the middleware itself,
our model still lacks components that provide trust in the infrastructure. The Trusted Privacy model builds its trust framework by outsourcing trust to TTPs with the use of privacy
seals and reputation systems.
With the introduction of a new component into the PM, called Trust Policy Negotiator, users can
evaluate the trustworthiness of an entity they are about to interact with. It gathers trust information
and compiles it in a meaningful way. If the user is actively taking part in the interaction, this trust
information should be presented to him in an intuitive way through the user interface. On the other
hand, if the user is receiving a query, the PDV should be able to evaluate the trust level provided
by this component in an automatic manner, and carry out a decision based on that. The sources
and mechanism by which trust is evaluated are:
Privacy and Trust Seals offer assurance that the remote party will not violate the privacy
policies previously agreed upon. These seals are usually certified by TTPs, and provide proof that the system run by a remote party lives up to certain security and privacy stan-
dards, or uses a certain software solution. For example, it can provide assurance that a service
provider uses the PM in its backend system. We can distinguish two types of trust seals:
1. Static Trust Seals are simple documents signed by the TTP, which attest the correct state
and functioning of a system at a given moment in time. Since these static trust seals come
with a certain validity window, they need to be re-evaluated and re-issued in order to provide
up-to-date proof. Since today's infrastructure is highly dynamic, these certificates might not always be up to date, as new threats and vulnerabilities surface more frequently than certificates are re-issued.
2. Dynamic Trust Seals are generated in real time by the machine serving the user’s request.
Dynamic Seals are only trustworthy if the process by which they are generated is also trust-
worthy. Usually these documents are generated with the assistance of tamper-proof hard-
ware which attests their validity. The Dynamic Trust Seals highly resemble the verification
certificate that is provided by the Verifiable Privacy in Section 4.2.4.A.
Through the security claims provided by Trust Seals a trust score can be evaluated for every
remote party. It is worth mentioning that a Dynamic Trust Seal, if attested in the correct way,
always provides a higher trust score than its static counterpart. The flexibility of the Trusted
Privacy model lets each individual PEP node decide what form of trust certification it is willing to
provide.
Reputation Systems are considered to be the secondary source of trust in this model. This
model assumes the existence of multiple independent reputation systems, such as customer feed-
back services or blacklist providers. Blacklist providers keep track of constant policy violators
and notify every actor who tries to initiate an interaction with them. A User Feedback system
harnesses the power of the crowd in collecting individual opinions, or experiences of previous
interactions. External reputation providers also have to be trusted to base their rating on a well
defined and relevant scale. In case of feedback from the crowd, on the other hand, the trust is
divided between anonymous users who may or may not provide correct information.
The scores collected from the available Reputation Systems are combined into a reputation
score. The reputation score should have its own scale, independent of the scales used by the
actual sources of the score.
After the interaction with the relevant TTPs the Trust Negotiator combines the trust score
and the reputation score into a final score. Based on this final value different levels of trust can
be quantified, helping the automated decision making. The intuition behind the outsourcing of
trust to multiple sources is that many independent trust scores from independent authorities can
complement or cancel each other out, leaving the end user with a trustworthy estimate. This, of
course, only works under the assumption that TTPs are truly independent and are not conspiring
to provide a pre-agreed score.
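The combination into a final score could, for instance, be a weighted average checked against a trust threshold. The weights, the averaging rule, and the threshold below are assumptions chosen for illustration; the model itself does not prescribe particular values.

```python
# Illustrative combination of a seal-based trust score and several
# independent reputation scores into a final score, followed by a
# threshold decision. Weights and threshold are assumed values.
def final_score(trust_score: float, reputation_scores: list,
                w_trust: float = 0.6) -> float:
    """Weighted average of the trust score and mean reputation score."""
    reputation = sum(reputation_scores) / len(reputation_scores)
    return w_trust * trust_score + (1 - w_trust) * reputation

def trusted(score: float, threshold: float = 0.7) -> bool:
    """Automated decision: proceed with the interaction or not."""
    return score >= threshold
```

Averaging over several independent reputation sources is one simple way to let outliers cancel each other out, in line with the intuition stated above.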
4.3.4.B Monitor
The Monitor component integrated in the PM is built to achieve the same functionality as
described in Section 4.2.4.B. Instead of the isolated spaces, this model uses the middleware
approach to intercept and react to unauthorized operations issued by the application layer.
4.3.5 Interaction Models
The following section presents the interaction model that covers the data flow between two
remote entities, with aspects regarding establishing trust and execution paths.
4.3.5.A Data Flow
The first interaction protocol of the two parties focuses on establishing trust by the use of
the Trust Negotiator. The Trust Negotiator gathers all relevant trust and reputation scores and
computes the final score on the remote party. If the final score satisfies the predefined trust
threshold the interaction continues with the exchange of the desired protected data.
Figure 4.5: Trusted Privacy: Interaction Model of the Data Flow
The PD can take multiple paths once it has been shared with an external entity. Figure 4.5
depicts how PD is handled by the PEP of a service provider. PD is passed through the PM to
the Application Layer which carries out the service provider logic. Two usual use cases include
storing and forwarding the processed data. Both of these operations have to pass through the PM
middleware in order to evaluate whether they are allowed to be stored or forwarded, respectively.
The evaluation is carried out based on the Sticky Policies attached to the data objects. Similarly,
PDs returned as results of a database query are also subject to evaluation. The PM only lets data
through for applications which are authorized to operate on the requested data.
4.3.5.B Forwarding Chain
Since monitoring is carried out individually at every PEP node, we are again faced with the
problem of transforming logs into assurance. Just like the logging system described in Section 4.2.5.B, the Trusted Privacy also relies on the Forwarding Chain for the modification of Sticky Policies by end users and for log verification.
We introduce a slight deviation, however, in the way that logs are aggregated and verified
from the Forwarding Chain. We eliminate the requirement of a TTP that plays the role of an audit
company, and substitute it with a different scheme. The aggregation of logs is the responsibility
of the original data requester who is in direct contact with the PDV. In the example presented
in Section 4.3, the Hospital Service has to aggregate the PHR logs using a pull method. Every
node in the chain is responsible for forwarding the pull request to its children then returning the
gathered logs to its parent.
Verification is carried out by the PDV who is the owner of the shared data on which the logs
were provided. By providing the logs to the end users in a direct manner we intend to achieve a
higher level of assurance than that of a simple digest of an external entity. PDVs are left with the
responsibility to verify aggregated logs and alert the users first-hand of suspicious behaviour.
This offers a much finer granularity of verification of logs, since PDVs can extract any requested
information from the raw logs.
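The pull-based aggregation over the Forwarding Chain can be sketched as a recursive collection. `LogNode` and `pull_logs` are illustrative names; the scheme only assumes each node keeps its own log fragment and pointers to its children.

```python
# Sketch of pull-based log aggregation: the original data requester
# issues a pull, and each node forwards the request to its children,
# returning its own log fragment together with the gathered ones.
class LogNode:
    def __init__(self, name: str, local_logs):
        self.name = name
        self.local_logs = list(local_logs)  # this node's log fragment
        self.children = []

    def pull_logs(self) -> list:
        gathered = list(self.local_logs)
        for child in self.children:
            gathered.extend(child.pull_logs())  # recurse down the chain
        return gathered

# Chain from the PHR example: Hospital -> Pharmacy, Research Center
hospital = LogNode("Hospital", ["store PHR"])
pharmacy = LogNode("Pharmacy", ["read PHR subset"])
research = LogNode("ResearchCenter", ["forward to News"])
hospital.children = [pharmacy, research]
```

The aggregated list is what the Hospital Service would hand back to the PDV, which then verifies the raw logs directly.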
4.4 Mediated Privacy
In the upcoming sections the novel policy enforcement model proposed by this thesis work is
presented, tailored to fit the defined requirements.
4.4.1 Description
The Mediated Privacy sticky policy enforcement model makes use of a mediated space be-
tween DSs and DCs, on which shared data lives. The requirements based on the user-centric
model motivated us to design this mediated space, in order to improve awareness and control
over the disclosed personal information. The mediated space does not belong to a single control-
ling entity, instead it focuses on providing a platform where DSs and DCs can interact on equal
terms.
The idea of a mediated space can easily be captured by the concept of a Distributed Hash
Table (DHT) [34]. DHTs are decentralized overlay networks, where each node is seen as equal.
Nodes forming this overlay are responsible for maintaining a predefined keyspace, meaning that
every node is responsible for a subset of the keyspace, called the keyspace slice. New data is
entered under a key in the DHT, called the LookupKey, which is hashed in order to compute its
place on the keyspace. Its place in the keyspace determines the node which will host the data
physically.
In this model we employ the concept of the DHT as our mediated space. Users are aware
of all existing copies of their personal data throughout the system by simply maintaining a set of
LookupKeys in the DHT. Awareness about who accesses it is also improved by tracking search
queries that are targeted to a LookupKey. By holding the LookupKey for each personal data item,
users are in charge of modifying and deleting them at any given time, greatly improving control.
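The mapping of a LookupKey onto the keyspace can be illustrated with a toy Chord-like ring: the key is hashed, and the responsible node is the first node identifier at or after the hashed position, wrapping around the ring. The ring size and node identifiers below are arbitrary illustrative values.

```python
# Toy illustration of LookupKey placement in a Chord-like DHT: the
# key is hashed onto a small circular keyspace, and responsibility
# falls to the first node whose identifier is >= the key's position
# (wrapping around the ring).
import hashlib

RING_BITS = 8  # 256-slot toy keyspace

def key_position(lookup_key: str) -> int:
    digest = hashlib.sha1(lookup_key.encode()).digest()
    return digest[0] % (2 ** RING_BITS)

def responsible_node(position: int, node_ids: list) -> int:
    for node in sorted(node_ids):
        if node >= position:
            return node
    return min(node_ids)  # wrap around the ring

nodes = [20, 80, 150, 230]  # assumed node identifiers on the ring
```

Because placement depends only on the hash, a DS holding the LookupKey can always locate (and hence modify or delete) every copy inserted under it.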
4.4.2 Prerequisites
One of the base prerequisites for our model is the existence of a DHT overlay network. DHTs
are widely employed distributed data stores in today’s data dominated world, since they scale
well and offer a quick lookup of O(log(N)). On the other hand, there are only a few systems that
consider it as a building block for data privacy [15]. Our design requires both DS and DC entities to actively participate in the DHT as peers.
A follow-up assumption of the Mediated Privacy model states that data introduced in the DHT
should only be queried and distributed through the DHT itself, avoiding the trading of personal
data using outside copies. Distribution of the private data should only happen with the users’
consent. DCs who wish to distribute user data are required to do so via sharing the LookupKey,
under which the specific data can be found. Such requirements rely on the actors of the system
to obey this rule.
4.4.3 Architecture
In the upcoming sections we will present how the DHT overlay network is formed around the
DC and DS peers. Peers of the DHT, regardless of whether they are part of a PDV or a service
provider, all operate on three layers.
Figure 4.6 depicts the high level architecture of a single DHT node, based on three layers. The
bottom layer, which serves as the base for the other two layers, incorporates all the conventional
DHT functionalities. This includes the maintenance of the overlay topology, and the serving of
basic operations, such as insert and retrieve. The Privacy Manager layer, on top of the DHT
layer, is responsible for safeguarding protected data objects and trust establishment. The Logging
layer sits on top of the stack and is responsible for keeping track of every DHT event regarding
operations on private data. The following sections present in detail the functionality of every layer.
Figure 4.6: Mediated Privacy: Architecture of a DHT node
Business Ring
The mediated space, represented by the DHT, is used to store disseminated user data. Be-
cause of this, both PDVs and service providers are part of this network. Since sharing user data
is a frequent operation, we are expecting the deployment of a large shared data structure.
The first important question to address concerns the rules by which a DHT is formed. The first solution
that comes to mind is to have all the actors participate in a single DHT. The largest, currently
active DHT is run on 10-25 million nodes, and in practice can scale further [37]. The performance
of operations like search, insert, or delete are bound by O(log(N)). Even though the DHT is a
highly scalable structure, using a single one will result in some drawbacks. The drawback that
we would like to point out is the requirement for uniformity. The behaviour of the DHT is said to
be uniform across all nodes, since it is a completely decentralized system. This uniformity does not fit our needs, since laws and regulations regarding virtual data handling
and privacy are not uniform across different regions of the world. Moreover, different regulations
can be in place on the business model level as well. Although having a single DHT would be a
simpler solution, it would introduce the problem of handling complicated legal and trust schemas.
Instead of having a single DHT, we introduce the concept of a Business Ring. We propose a
solution where Business Rings are spawned as needed around a group of services that have a
closely integrated business model. Service providers belonging to the same Business Ring are
assumed to have an existing business agreement, which ties them together. In principle, these
Business Rings can be formed around different branches of the existing industries. Competing
service providers can either agree on belonging to the same Business Ring, or start their own.
A mature Business Ring with a clear business model, however, is more likely to be targeted by
users than a less mature one. For example, the Business Ring used in the case of the healthcare
scenario using PHRs presented in Section 4.3, could be formed according to Figure 4.7.
The black nodes represent PDVs while the white nodes represent the service providers. The
ring-like representation of the DHT from Figure 4.7 resembles a Chord network [35]. Every node
of the Chord Business Ring is said to be responsible for the slice of the keyspace lying between itself and its predecessor in the ring.
Figure 4.7: Mediated Privacy: Business Ring formed around a healthcare scenario
The keyspace slices of the service providers can be
seen from the arrow markings. Note that the DHT solution used in an actual implementation can
follow any kind of topology. We refrain from evaluating existing DHT solutions. Rather we try to
describe a system in which any kind of generic DHT solution can be used. For simplicity and
better understanding, however, we will keep talking about a Chord-like structure.
The business model that ties the service providers together in the Business Ring of Figure 4.7
could be the public health services provided to users. Although these service providers offer
independent services, they belong to the same logical ring, since they operate on the same set of
PHRs. Together they form a clear business model, which is used as a basic characteristic of the
Business Ring. These Business Rings can vary from business to business, depending on how
many service providers are part of it, how big the network is, or what kind of general data policies
apply for participants.
Since both PDVs and service providers have to become peers of the DHT, we will investigate
how this requirement fits into their design.
PDV peer
By their design PDVs are abstractions of ’always on’ entities that provide safe user data storage
together with safe data management. The responsibilities of a Business Ring node could easily
be incorporated as an additional component inside the PDV. Since they are always on for high
availability, the downside of high churn rates can also be alleviated. Churn stands for the rate at
which nodes enter and leave a DHT system. High churn rates force the system to focus more on
self-maintenance, while a low churn rate guarantees a more stable system.
Service provider peer
When it comes to our requirement to incorporate service providers as Business Ring peers,
we are faced with a more complex scenario. Given that backend systems of service providers
significantly differ from one another, it is hard to envision a generic solution. There is, however, a
common design practice that can achieve the above mentioned Business Ring design by providing
Privacy as a Service (PaaS). The responsibilities of a single service provider's DHT peer could
be advertised like a service, which in turn can have any flexible design.
Figure 4.8: Mediated Privacy: PaaS design for the Hospital Service Business Ring node
Figure 4.8 depicts the backend system architecture of the Hospital Service with the PaaS as
one of its frontend services. Being a part of the Business Ring, the Hospital Service is required to
maintain control over its assigned keyspace slice depicted by the arrow. In its backend system,
this could be load balanced and supported by multiple machines from its internal system through
the PaaS. This design is flexible enough to be easily implemented in any backend system, while
still maintaining the functionalities of a Business Ring peer.
4.4.4 DHT Peer Layer
The bottom layer that every peer operates on is the DHT layer, which is responsible for
executing all the classical DHT functionality (insert, retrieve, remove). Special considerations
have to be taken, however, for every remote retrieval operation, in order to avoid untraced data
copies; the local retrieval operation maintains its normal behaviour. Apart from the classical
functions, a few other aspects need to be addressed: membership, keyspace assignment, ring
size and ring description.
4.4.4.A The Remote Retrieval Operation
The classical remote retrieval operation of the DHT retrieves a data object stored under
a LookupKey in two phases. In the first phase, an internal search operation locates the host of
the data object, that is, the node responsible for the keyspace slice containing the LookupKey.
After the right host is found, the second phase establishes a direct point-to-point connection
between the requester and the host, over which the requested object is transmitted. This process,
by its nature, creates an untraced data copy of the requested object, the PD in our case. In order
to maintain references to all existing copies of a PD object, the retrieve operation is modified to
act like a retrieve followed by an insert.
Our modified retrieve operation does not return the new data copy directly; instead, it inserts
it back into the Business Ring under the keyspace slice of the requester. The first phase of the
retrieval stays unmodified, but the second one is replaced by a DHT insert operation. The key
for the insertion, called the CopyLookupKey, has to be included in the request by the requester.
The functionality of the retrieval operation stays the same, since in both cases the requester ends
up with his own data copy on his local machine. The difference is that our modified retrieval
keeps track of the data copy via its new CopyLookupKey, while the normal operation is not
concerned with tracking data copies at all. The CopyLookupKey, pointing to the new data copy,
can then be appended to the metadata of the original PD. This guarantees that the DS will be
able to retrieve every CopyLookupKey pointing to a different data copy.
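The behaviour of the modified retrieval can be sketched as follows. The in-memory ring, the dictionary-based lookup and the method names are illustrative assumptions standing in for a real DHT; only the retrieve-then-insert pattern and the CopyLookupKey bookkeeping come from the design above.

```python
# Sketch of the modified remote retrieval. Instead of handing the object
# to the requester directly, the second phase re-inserts the copy under a
# CopyLookupKey supplied by the requester, and records that key in the
# original object's metadata so the DS can enumerate every copy.

class BusinessRingSketch:
    def __init__(self):
        self.store = {}        # LookupKey -> PD object
        self.metadata = {}     # LookupKey -> list of CopyLookupKeys

    def insert(self, key, pd):
        self.store[key] = pd

    def remote_retrieve(self, lookup_key, copy_lookup_key):
        # Phase 1: locate the object (a dict lookup stands in for routing).
        pd = self.store[lookup_key]
        # Phase 2: replaced by an insert under the requester's keyspace slice.
        self.insert(copy_lookup_key, pd)
        # Track the copy next to the original.
        self.metadata.setdefault(lookup_key, []).append(copy_lookup_key)
        return copy_lookup_key

ring = BusinessRingSketch()
ring.insert("K2", {"data": "PHR", "policy": "StickyPolicy1"})
copy_key = ring.remote_retrieve("K2", "K8")
print(ring.metadata["K2"])   # ['K8'] -- the DS can find every copy
```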
4.4.4.B Membership
A Business Ring has to be bootstrapped in the beginning, in order for other peers to join the
network. The most convenient way would be to let service providers bootstrap the DHT overlay,
and advertise their services together with a reference to the Business Ring. After the initial setup,
we have to devise strategies on how PDVs should join the network. As explained later, having a
certain amount of users in an DHT is desirable in order to enable data access tracking. Moreover,
having a large user base can also act as a social incentive in order to establish trust in a given
service. We try to distinguish several strategies:
1. PDVs that are involved with the services provided in a particular ring should be members
of that Business Ring. This strict strategy states that only PDVs that share their data in the
ring are allowed to be part of it. Joining a Business Ring as a result of a successful data
exchange should be an automated process. Leaving the network can be triggered either by the
expiry date of the shared data or by manual intervention of the PDV owner, in case he decides
not to keep track of his shared data any longer. His previously shared data will persist
in the ring unless deleted explicitly by the data owner, the data collector, or a predefined
obligation.
2. The previous strategy assumes that peers will have enough incentive to join and use the
Business Ring. However, unpopular businesses could remain buried: without an initial user
base, nobody considers them safe enough to use. To accommodate this case, a set of randomly
chosen nodes from the existing PDVs could join these rings. Their only duty would be to route
messages and keep small shares of data, without taking part in any other interaction. A system
could function with random nodes, since data owners need not be part of the desired network
for the system to function. Operations on the DHT can also be performed from outside the
system, by executing them via a randomly chosen node that is part of the system.
Since the second strategy would introduce some indirection and complexity, we argue that the
stricter first strategy suits our model better. One of our initial assumptions was that every
entity's identity is verifiable. Following from this assumption, a Business Ring can be
constructed strictly from PDVs that are legitimate data sharers. The impact of anonymous
nodes on a Business Ring is out of the scope of this thesis.
4.4.4.C Keyspace Assignment
An important consideration in the design of the system was to let the service providers decide
the keys under which user data has to be inserted. Since every service provider hosts its own
keyspace slice locally, it is in charge of a set of keys. Whenever a DS wants to share an object
with the said service provider, he does so by inserting it under one of the keys chosen from the
service provider's keyspace.
We also considered a random placement strategy, in which the PDV chooses a random key
under which its data is inserted. Once the service provider receives the chosen random key, it
would have to issue a search on it. This scheme introduces a performance penalty, since a single
interaction would require two DHT lookups.
To avoid this overhead, we decided to put the service provider in charge of the keys under
which the user objects are going to be kept. We argue that this scheme does not grant the service
provider more trust, since it is bound to receive the same data anyway. Moreover, after the data
has been inserted, the service provider can retrieve it from its local machines, without the need
for an extra DHT lookup.
To accommodate our design decision, we also have to take a look at how the keyspace is
divided among the nodes in the Business Ring. Traditional DHT solutions strive to achieve a
uniform distribution of keys among nodes in order to load balance the system. This, however,
is not suitable for our needs, since the service providers are the real hosts of user data, while
PDVs have a different role in the system. We propose an unbalanced key distribution scheme
which favours the service provider nodes. Our key distribution is represented by the arrows in
Figure 4.7. The mechanism that determines how large the keyspace associated with a service
provider can be is closely related to the trust framework of our system, and will be discussed
later on.
4.4.4.D Business Ring Size
As mentioned in Section 4.4.4.B, the size of the Business Ring plays an important role in
the operation tracking and logging system. The logging system presented later on relies on the
existence of routing nodes in the DHT, which route operations such as insert and retrieve. A
predetermined minimum DHT size (counting the PDV nodes) is desirable to maintain, in order
to make sure that every operation gets routed through at least one random router node. This
minimum value can be computed from the size of the routing tables used by the particular DHT
implementation.
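As a rough illustration, assume a Chord-like overlay where each node keeps a routing table of a fixed size and the average lookup path is about half of log2(N) hops. The bound below is a crude illustrative assumption, not a guarantee of the design: once the ring holds more nodes than a single routing table can reference, some lookups must pass through at least one intermediate router node.

```python
import math

def min_ring_size(routing_table_size):
    """Crude lower bound (illustrative): once the ring holds more nodes than
    any single routing table can reference, plus the requester itself, some
    lookups are forced through at least one intermediate router node."""
    return routing_table_size + 2

def expected_hops(n_nodes):
    """Average Chord-style lookup path length, roughly 1/2 * log2(N)."""
    return 0.5 * math.log2(n_nodes)

# With 16-entry routing tables, 18 nodes already force some multi-hop routes;
# a 1024-node ring averages about 5 hops per lookup.
bound = min_ring_size(16)
hops = expected_hops(1024)
```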
A possible solution to achieve this minimum is to combine the membership strategies from
Section 4.4.4.B: use the participants of the business model as a base, and compensate with
random nodes until the minimum desired size is met. This solution, on the other hand, would
require a centralized coordinator entity that governs the memberships.
An alternative strategy, less reliant on a centralized entity, is to start a new Business Ring as
part of an already existing mature Business Ring with a stable user base. The mature Business
Ring can serve as a nursery for the newly created one. After the new Business Ring gathers
enough momentum to build a stable user base of its own, it can be separated from the nursery.
4.4.4.E Business Ring Description
Every Business Ring should offer a description of the network. Nodes that join the network
should have a way to see which service providers are involved in that particular ring.
Service provider nodes could advertise their own descriptions regarding client restrictions and
generally applicable policies.
The Business Ring description should also contain the keyspace sizes assigned to each service
provider within that ring, for trust establishment reasons. The size of the ring also has to be
public information, based on which different trust decisions can be carried out. Details related
to the keyspace sizes and the DHT size do not have to be precise at any given time; an estimate
of these values is sufficient for the workings of the system. Such an estimate can be computed by
a gossip algorithm [31] running on piggybacked routing messages in the system, making sure
that each node has an estimate of both the service providers' keyspace sizes and the DHT size.
However, the design of a system providing accurate estimates is out of the scope of this thesis.
Additionally, one might imagine that different business models have different policies regarding
customer requirements. For example, a user could only join the ring of a bank if he is a customer
there. All such extra policies regarding restrictions from the service provider side can also be
taken into consideration.
4.4.5 Privacy Manager Layer
The Privacy Manager Layer corresponds to the PM component, which is responsible for
safeguarding PD objects by enforcing Sticky Policies. Its main responsibility is to filter the
incoming and outgoing operations happening on the DHT layer. This layer acts as a guard of the
user data objects hosted at every node.
4.4.5.A Sticky Policy Enforcement
The main method of data safeguarding is Sticky Policy enforcement. Business Rings are
required to operate on PD objects, which guarantee the existence of a Sticky Policy next to the
shared data. Sticky Policies cover two major use cases: local data usage and forwarding of data.
When a DC wants to process the collected PD, he simply issues a local retrieval operation to
the Business Ring through one of his own local nodes. Before the local node returns the desired
data, the PM evaluates the Sticky Policy against the requester's attributes and grants or denies
access to it.
Forwarding of collected user data follows the same rules, but uses a remote retrieval
operation. As stated in Section 4.4.2, entities are only allowed to externally forward LookupKeys,
and not the actual PD, since data sharing has to happen through the ring. Third parties interested
in collecting some shared data have to be part of the same Business Ring as the DS and DC.
Only then can a third party issue a remote retrieval request for a PD object. The PM layer of
the hosting entity is responsible for evaluating Sticky Policies before the actual data transfer
can happen.
The PM layer is also in charge of the obligation engine, which makes sure all obligations
are triggered and carried out. An obligation requiring the deletion of a PD object can easily be
implemented by issuing a delete operation on the DHT layer. It is worth mentioning that the
deletion is verifiable by the DS himself, since he also holds a reference to the LookupKey of the
PD. By periodically interrogating the ring for known LookupKeys, the DS can always know which
of his previously shared objects are still there, and which ones have been deleted.
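The DS-side deletion check described above can be sketched as a simple polling loop. The `ring_lookup` callable is an assumed stand-in for a DHT retrieve that returns `None` for keys that no longer resolve; it is not part of the prototype's API.

```python
def verify_deletions(ring_lookup, shared_keys):
    """The DS polls the ring for each LookupKey it previously shared.
    Keys that no longer resolve are reported as deleted; the rest are
    confirmed to still exist. `ring_lookup` is any callable returning
    the stored object or None (an assumed interface)."""
    still_present, deleted = [], []
    for key in shared_keys:
        if ring_lookup(key) is not None:
            still_present.append(key)
        else:
            deleted.append(key)
    return still_present, deleted

# A dict stands in for the ring; K8 was removed by a deletion obligation.
store = {"K2": "PHR"}
present, gone = verify_deletions(store.get, ["K2", "K8"])
```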
4.4.5.B Trust Management
Since trust is a required component of every such framework, the PM layer also offers a trust
negotiation mechanism for peers of the Business Ring.
The unbalanced keyspace assignment described in Section 4.4.4.C reduces the communication
overhead between a DS and a DC, but it can also be used as a measure of trustworthiness.
Based on the size of each service provider's keyspace slice, we can offer a quantification by
which a trust comparison can be carried out. The size of the assigned keyspace slice limits the
set of shared user data a service provider can host.
A keyspace slice consists of a set of lookup keys, each of which can be used to host a PD.
The intuition behind the keyspace slice as a trust measure is that trusted service providers are
allowed to host a larger set of PDs than less trusted ones. In this way, every service provider can
be assigned a trust level based on the size of its keyspace slice.
The establishment of these trust levels is the responsibility of the entity in charge of assigning
keyspace slices, since it decides how big or small a slice can be. Letting service providers
claim their own slices would lead to a greedy scenario, in which the trust measurement loses
its value. A better alternative is to involve the whole Business Ring in deciding the keyspace
slice sizes. A minimum baseline slice size can be assigned to every node, leaving them all at
the same bottom level of trust. This minimum value can vary from use case to use case, and
its establishment is independent of this work. After the initial assignment, a consensus
algorithm can be run across the peers in order to grant more space to, or take away space from,
different entities. Since a majority of nodes is required to achieve consensus, we can assume
that if the majority of the peers are trustworthy, then the keyspace assignment is also
trustworthy. A trustworthy keyspace assignment leads to a quantitative trust measure that can
be used to place each service provider in its own trust level, and to define automated decision
making based on it.
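A minimal sketch of such a quantification, assuming slice sizes are measured in numbers of lookup keys and using an illustrative doubling rule (every doubling of the baseline slice grants one extra trust level). Neither the rule nor the names are prescribed by the design; the only fixed points are the shared baseline and the monotone slice-to-trust mapping.

```python
import math

def trust_levels(slice_sizes, baseline):
    """Map each service provider's keyspace slice size to a trust level.
    Everyone starts at the baseline (level 0); larger slices, granted by
    ring consensus, translate to higher levels. The log2 doubling rule is
    an illustrative assumption, not part of the design."""
    levels = {}
    for provider, size in slice_sizes.items():
        if size < baseline:
            levels[provider] = 0
        else:
            levels[provider] = int(math.log2(size / baseline))
    return levels

# A provider holding four times the baseline sits two levels above a newcomer.
levels = trust_levels({"Hospital": 4096, "NewShop": 1024}, baseline=1024)
```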
A secondary trust source can be derived from the description offered by the Business Ring
itself. With the assumed node identities in place, a list of participating service providers can
be derived from the provided Business Ring description. By looking at the individual service
providers in the list, a DS can set his own custom trust level. For instance, a DS may decide not
to use the services of a Business Ring which has a government agency as a member. On the other
hand, he might be more comfortable sharing data in a ring that has a well-known trusted
service provider as a member.
4.4.6 Logging Layer
The Logging layer is the top layer, offering a wrapper around every operation on a PD.
Being at the top, it is responsible for saving traces of every operation. Logging is an essential
mechanism that verifies the validity of claimed actions, and helps assure the user that his
intentions were carried out. Our logging mechanism focuses on saving data request traces
throughout the Business Ring. Logging happens in an asynchronous manner, such that the
performance of the service itself is not affected. We aim for a relaxed logging mechanism in
which some loss is inevitable, but not fatal.
The request tracking system leverages the already existing lookup functionality of the DHT. As
specified before, data is not meant to leave the DHT without authorization, and it is meant to be
kept under a LookupKey known to the DS. That being said, access to data inside the ring is only
possible via the search mechanism of the DHT.
In order to perform any operation on data, the node responsible for it first has to be found.
Since DHTs aim for high scalability, nodes cannot store references to every other node in the
system; search solutions where nodes only keep routing tables of restricted size are commonly
used. Because of this design, every operation first has to pass through several hops in order to
reach the actual data host. Every such routing node holds valuable information regarding the
identity of the requester, as well as the key of the requested resource. Every such <Requester,
ResourceKey> pair provides useful information for identifying who has been requesting access
to a certain PD. The ResourceKey represents the LookupKey of the requested PD.
In order to have a functioning logging mechanism, we need to make sure that there are in fact
routing nodes in the system, and not just a single node serving all requests. We therefore have
to ensure that the size of the Business Ring is large enough, as addressed in Section 4.4.4.D.
Since every node is responsible for keeping logs of its own routed messages, the log
information referring to a certain key ends up scattered across multiple nodes. Composing a
comprehensive log out of the individual log events scattered throughout the nodes is the next
challenge. Once the logging information is aggregated, we need a way to reveal it to the relevant
data owner, whose data is being kept under the referenced key.
The first intuition is to keep the log object inside the same ring, in order to provide easy
aggregation and quick access to it. The first problem is that this might cause a cascade of
logging messages that could render the system unavailable. We could separate the logging
operations from all other operations, such that logging on logging messages is disabled. This
solves the problem, but introduces a security threat: normal messages masked as logging
messages could be sent to avoid tracing. We need an additional verification step, during which
every router node checks the validity of a log message against some predefined standard, in
order to detect masked messages. For example, log messages can be composed of predefined
fields, where each field can only take a value from a predefined value pool. The verification
step then checks whether each value of the log message has been chosen from the predefined
value pool or not.
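The verification step can be sketched as follows. The field names and value pools are illustrative assumptions; only the rule that a message is rejected unless every field comes from its predefined pool is taken from the text.

```python
# Router-side check that a message claiming to be a log message really is
# one: the field set must match exactly, and every constrained field must
# take a value from its predefined pool. A masked normal message carrying
# an extra payload field is rejected.

VALUE_POOLS = {
    "operation": {"insert", "retrieve", "remove"},
    "layer": {"logging"},
}

def is_valid_log_message(msg, known_requesters):
    expected_fields = {"requester", "resource_key", "operation", "layer"}
    if set(msg) != expected_fields:
        return False                  # extra or missing fields: rejected
    for field, pool in VALUE_POOLS.items():
        if msg[field] not in pool:
            return False              # value outside the predefined pool
    return msg["requester"] in known_requesters

ok = is_valid_log_message(
    {"requester": "ResearchCenter", "resource_key": "K2",
     "operation": "retrieve", "layer": "logging"},
    known_requesters={"ResearchCenter", "HospitalService"})
masked = is_valid_log_message(
    {"requester": "Eve", "resource_key": "K2", "operation": "retrieve",
     "layer": "logging", "payload": "smuggled PD"},
    known_requesters={"ResearchCenter"})
```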
Another problem is under which key to place the <Requester, ResourceKey> log event
chunks, such that they all get aggregated. A deterministic solution is needed, since the data
owner has to be able to figure out where the aggregation location is. Using the ResourceKey
itself to keep logs would take up space from the service provider's keyspace; more importantly,
the service provider would be in charge of hosting the aggregated logs, which is not desired.
Instead, a hash can be computed on the ResourceKey to derive a LogKey deterministically; the
data owner can find the aggregated logs by computing the same hash on the LookupKey he is
about to trace. Using a deterministic hash function places the aggregates at a random node,
depending on the overlay existing at that particular time.
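A minimal sketch of the deterministic LogKey derivation, with SHA-256 as an illustrative choice of hash function (the design only requires that the function be deterministic and spread keys uniformly).

```python
import hashlib

def log_key(lookup_key):
    """Deterministic placement key for <Requester, ResourceKey> log events.
    Router nodes logging requests for a LookupKey and the data owner
    tracing that LookupKey compute the same hash, so the events aggregate
    by collision under one key and the owner knows where to pull from."""
    return hashlib.sha256(lookup_key.encode()).hexdigest()

# Every logger and the DS agree on the placement without coordination:
placement = log_key("K2")
```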
The <Requester, ResourceKey> log objects should be considered immutable objects that
can only be read, but not modified or deleted externally by a request. Log objects should be
designed as short-lived objects with an expiry date, such that every node can clean up its logs
periodically. This assures that the system will not get clogged by logs.
Retrieval of aggregated logs happens via a pull method: every PDV is responsible for
periodically querying the ring for the LogKeys under which its log information is kept. This
way, long-term aggregates can be composed at the PDV site, assuring the persistence of logs.
4.4.7 Interaction Models
The following sections present the interaction models that arise with the employment of a
Business Ring. We examine separately how interactions with multiple Data Subjects and multiple
Data Controllers are handled.
4.4.7.A Data Flow
The first interaction model presented focuses on the data flow between a single DS and DC.
Figure 4.9 depicts the high level interaction diagram between the two.
Figure 4.9: Mediated Privacy: DC - DS interaction model
In the first step, the DC sends his request to the DS together with a Data Handling Policy
(DHPol) and a LookupKey, defining his intentions on data handling and the key under which the
requested data is expected. The LookupKey is a valid key in the Business Ring, residing within
the DC node's keyspace slice. After receiving the request, the DS interrogates the Business Ring
for relevant details about the DC. This can include information on his keyspace size and other
trust measures, which contribute to the reasoning in step 3. Depending on the trust level and
the predefined data policies of the DS, the reasoning can have two outcomes: in step 3.a, access
is granted and a PD object is created; in step 3.b, it is denied. After granting access, the DS
issues an insert operation to the Business Ring in step 3.a. The insert request puts the PD under
the LookupKey provided by the DC. Steps 3.a.1 and 3.a.2, marked with blue arrows, represent
the internal routing steps of the DHT. Once the request reaches the DC's node, the DS sends an
acknowledgement back to the DC with the status of the operation.
When the DC wants to access the PD for processing, he issues a request to his Business Ring
node in the form of a local retrieval operation. The request asks for the PD under LookupKey
with a specified PURPOSE. The PM layer of the DC node checks the PURPOSE attribute against
the Sticky Policy of the PD and, based on its decision, either discloses the PD to the DC or not.
At the end of the interaction, the LookupKey, which serves as a reference for the shared PD,
is known both by the DC and the DS. By means of the mediated space supported by the
Business Ring, both actors share a pointer to the data object, and both can operate on it.
4.4.7.B Multiple DS Interaction Model
A request coming from a DC could target multiple DSs. A single LookupKey could be used
to host multiple data objects from different DSs, by simply appending them on collision. A
collision occurs when two objects are inserted into the DHT under the same key. In this case,
however, we lose the fine-grained control over individual data, since multiple PDs are now
mapped to a single key, and all DSs involved in the transaction share a single LookupKey. We
argue that in order to maintain fine-grained control over user data, as well as secrecy, a
one-to-one mapping of a single PD to a LookupKey is required.
To accommodate this case, instead of sending a single LookupKey as presented in Figure
4.9, the DC sends a set of available lookup keys together with its request. Each DS involved in
the request chooses a single LookupKey from the provided set. Once a key is taken by a DS, it
is considered consumed, so that other DSs cannot use it. Different solutions can be employed to
provide such a LookupKey dissemination mechanism.
Commonly, the role of the DS is associated with a PDV. The network of PDVs, being individual
data stores, can be seen as a social graph with PDVs as nodes and edges based on friendships
or other context-dependent connections. In this setting, the multiple DS interaction model turns
into a distributed query on the social graph. The DC executing the data request only has to talk
to a single PDV node, and let it forward the request to the targeted PDV group. The PDV
targeted directly by the DC's request becomes the entry point, also known as the root, from
which the query is disseminated.
In order to achieve a one-to-one mapping of LookupKeys to PDs, we propose a key dissemination
protocol in two rounds, presented in Figure 4.10. The first round distributes the query among
the target PDVs and collects status messages, shown by the blue arrow. The status messages
flowing back to the root after the first round contain the size of the subtree for which the sender
is the root. After the first round, every node knows how many keys it will need, because it
knows how many nodes its subtree consists of. The second round consists of the LookupKey
dissemination, starting from the root and moving towards the leaves. At each step, a PDV takes
a LookupKey from the provided set and forwards the rest to its children, split according to the
subtree counts from the previous round. During the second round, each PDV inserts its query
result into the Business Ring and replies with a status message, which in turn gets forwarded
to the DC. The extra round introduced into the algorithm causes some performance overhead;
this, however, is necessary to achieve the required functionality.
Figure 4.10: Mediated Privacy: Key Dissemination
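The two rounds can be sketched on a dictionary-based tree. The recursive formulation below is an illustrative simplification of the actual message exchange between PDVs; the names are assumptions.

```python
# Sketch of the two-round LookupKey dissemination over the PDV social
# graph (a tree rooted at the entry-point PDV). Round 1 counts subtree
# sizes bottom-up; round 2 hands each node exactly one key and splits the
# remaining keys among its children by subtree size.

def count_subtree(tree, node):
    """Round 1: every node reports 1 + the sizes of its children's subtrees."""
    return 1 + sum(count_subtree(tree, c) for c in tree.get(node, []))

def disseminate(tree, node, keys, assignment):
    """Round 2: take one key, then forward the rest split by subtree count."""
    assignment[node] = keys[0]
    rest = keys[1:]
    for child in tree.get(node, []):
        need = count_subtree(tree, child)
        disseminate(tree, child, rest[:need], assignment)
        rest = rest[need:]

# Four target PDVs: root -> {A, B}, A -> {C}.
tree = {"root": ["A", "B"], "A": ["C"]}
assignment = {}
disseminate(tree, "root", ["K1", "K2", "K3", "K4"], assignment)
# Each PDV ends up with its own unique LookupKey.
```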
4.4.7.C Multiple DC Interaction Model
The symmetrical case to the one described in Section 4.4.7.B is when multiple DCs are
interested in the same PD. This is also applicable when a single data requester forwards user
data to third parties. From the point of view of the original data owner, both the data requester
and the third party are considered DCs.
Data sharing between multiple DCs relies on the assumption that once a PD object is inserted
under a certain LookupKey in the Business Ring, it is only retrieved and distributed via the
remote retrieval operation. Having a global pointer to the PD, such as the LookupKey, makes it
possible to disseminate the pointer itself without the PD. This requires every DC who receives
the pointer to execute a remote retrieval in the Business Ring. The Sticky Policy guarding the
PD is responsible for filtering the requests of multiple DCs in a proactive manner. By forcing
DCs to execute remote retrievals on the Business Ring, we ensure that every request leaves a
trace through the logging system, and that every data copy has its own dedicated lookup key
reference.
4.4.7.D Log Flow
Figure 4.11: Mediated Privacy: Logging
As described in Section 4.4.6, we employ a distributed logging mechanism, where a request
event log is placed at a deterministic position inside the Business Ring and retrieved by the DS
using a pull method.
Figure 4.11 represents a scenario related to the PHR system presented in Section 4.7. We
assume that the Hospital Service executed a data request, by which it acquired a PD under the
LookupKey from a DS; the DS could be any PDV belonging to the ring. Furthermore, assume
that the Hospital Service wishes to share the collected PD with the Research Center. In order to
do so, the Hospital Service shares the LookupKey under which the PD is kept.
As the Research Center only holds a pointer to the object, it is required to search for the PD
inside the Business Ring by executing a remote retrieval. Assuming a common DHT design,
where messages are routed through intermediate nodes of the system, if the Business Ring is
large enough the encounter of a Router Node is inevitable. The Router Node, being part of the
ring, has the three-layered architecture described in Section 4.6. Its DHT layer is responsible for
the actual routing of the search request, but before it can do so, the Logging layer is triggered
first. The Logging layer of the Router Node inserts an event record in the form of
<ResearchCenter, LookupKey> into the ring, under the key computed by the hash(LookupKey)
function. Note that any subsequent search for the LookupKey, originating from any source, can
be routed through any random router node of the network. Since hash(LookupKey) will always
yield the same key, the log messages belonging to the LookupKey get appended under the same
key by collision.
As a consequence of our logging mechanism with a deterministic hash function, there is no
need to explicitly aggregate logs. Whenever the DS wants to verify the request traces of the
PD shared under a LookupKey, it simply issues a pull request for hash(LookupKey). Multiple
shared PD objects result in multiple pull requests targeted at different keys. By collecting the
logs, the DS gets first-hand assurance derived from the traces.
Since the LookupKeys are chosen randomly by the DC, hash(LookupKey) also results in
a random key for the log placement. This means that any node of the system can end up as
a potential host of logs for any DS. Assuming that the majority of nodes in the system are
well-behaved, we can assume that logs are kept in an orderly fashion. Even if there are some
malicious nodes, who explicitly delete logs or are unwilling to insert trace logs, the loss of a
fraction of the overall logs is acceptable.
It should also be pointed out that the logging system is capable of recording potential
unauthorized data accesses. If the Sticky Policy attached to the PHR hosted at the Hospital
Service disallows any data forwarding, no third party should be able to access it. However, if the
Research Center and the Hospital Service conspire to exchange the PHR through the ring,
regardless of its Sticky Policy, they can succeed in doing so. On the other hand, the random
Router Node is still entrusted to log an entry about the request. As the Router Node is considered
random and independent from both service providers, it is very unlikely to be compromised.
Even without proof of a policy violation, the record registered by the Router Node can raise
suspicion, which can make the service provider lose its trustworthiness.
4.4.7.E Indirect data
So far we have been investigating direct data sharing scenarios between two clearly defined
entities: the DS and the DC. There is, however, another type of data, called indirect data,
described in Section 2.2. When it comes to indirect data, the definition of a clear DS gets fuzzy,
since an indirect data object can have multiple subjects simultaneously. The Mediated Privacy
model, however, sketches a solution based on additional data pointers, used to keep multiple
references to indirect data objects.
Consider the example depicted in Figure 4.12, inspired by the PHR scenario described in
Section 4.7. When the DS shares his PHR with the Hospital Service, the PHR gets inserted in
the Business Ring under key K2. After the Hospital Service runs some tests on the DS, it decides
to share the TestResult with him via the Business Ring, by inserting it under key K8. The
TestResult is produced and controlled by the Hospital Service. On the other hand, the TestResult
also belongs to the DS, since he is the true subject of the test. This makes the TestResult indirect
data.
Figure 4.12: Mediated Privacy: Indirect Data
Mediated Privacy strives to provide a mechanism that lets the DS track and control his
indirect data, to some extent. Indirect data, such as the TestResult, is inserted by the DC into
the same ring under one of its free keys, K8 in our example. In order to communicate this
knowledge to the user, the DC inserts the pointer to the indirect data as metadata next to the
PHR. This connection offers a clear way to identify the indirect relationship between different
data objects.
The StickyPolicy2 attached to the indirect data is a derivative of StickyPolicy1, provided by
the DS. StickyPolicy2 is created according to the forwarding rules described in Section 4.2.5.B.
This gives great flexibility to the user, in case he wishes to further share his TestResult with
anybody else, say his health consultant.
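A minimal sketch of deriving StickyPolicy2 from StickyPolicy1. Representing a policy as sets of purposes is an assumption standing in for the real PPL structures, and the narrowing rule shown (the derived policy may only reuse purposes the original already allowed downstream, never widen them) is one possible reading of the forwarding rules of Section 4.2.5.B.

```python
# Sketch: StickyPolicy2 for the indirect TestResult, derived from
# StickyPolicy1 on the original PHR. The derived policy is restricted to
# the downstream authorizations of the original, so forwarding can only
# narrow, never widen, what the DS originally permitted.

def derive_indirect_policy(sticky_policy1):
    downstream = set(sticky_policy1["downstream_purposes"])
    return {
        # Usable purposes: only what the original allowed downstream.
        "purposes": downstream,
        # Further forwarding: narrowed again by the original purposes.
        "downstream_purposes": downstream & set(sticky_policy1["purposes"]),
    }

policy1 = {"purposes": {"treatment", "research"},
           "downstream_purposes": {"treatment"}}
policy2 = derive_indirect_policy(policy1)
```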
4.4.8 Prototype Implementation Details
A prototype has been implemented in order to demonstrate the viability of our proposed
model, Mediated Privacy (MP). Two separate modules have been implemented: the PPLModule
and the DHTModule. Using these two modules, a minimalistic demo has been developed that
simulates the interaction between two PDVs and two service providers. The implementation of
the PDV and the service providers themselves was outside the scope of this thesis.
The PPLModule has been developed in order to facilitate the functionalities involving the
PrimeLife Policy Language (PPL). The PPL is an extension of the XACML standard proposed
by the PrimeLife project. The functionalities of the XACML language has been supported by
an existing open source implementation, called the Balana engine [1]. The XACML implemen-
tation serves as an access control engine used to carry out authorizations. The PPLModule is
concerned with the creation of the PPL elements: Data Handling Policy (DHPol), Data Handling
Preference (DHPref) and Sticky Policies, which in turn are attached to the XACML policies. For
the purpose of this prototype, the PPL elements only contain two relevant properties: the AuthorizationForPurpose and the AuthorizationsForDownstreamUsage. The AuthorizationForPurpose property defines the purposes under which a DC is authorized to access user data, while the AuthorizationsForDownstreamUsage property defines for what purposes the DC might disclose collected user data to a Third Party. Two additional services have also been implemented into the PPLModule, namely: the PolicyMatchingEngine and the PolicyDecisionPoint. The PolicyMatchingEngine is responsible for creating Sticky Policies by matching a DHPol with a DHPref. A simplified implementation of the PolicyMatchingEngine has been carried out that always prefers
the DHPref of the DS over the DHPol of the DC. In practice, this means that the resulting Sticky
Policy will be a subset of the DHPref. In a full implementation a reasoning engine is needed,
which can provide a more flexible matching. The PolicyDecisionPoint is a service which is used to
evaluate an existing StickyPolicy against an access request. Access requests are accompanied
by an AuthorizationForPurpose property, which represents the purpose of data usage. The PolicyDecisionPoint is in charge of deciding whether a StickyPolicy allows data usage for a specified purpose or not.
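The simplified matching and evaluation rules can be sketched as follows. This is an illustrative Python sketch, not the Balana-based Java prototype; modeling the PPL properties as plain purpose sets and deriving the Sticky Policy as the intersection of the DHPref and the DHPol are simplifying assumptions made for brevity.

```python
# Illustrative sketch (not the actual Balana/PPL implementation):
# policies are reduced to sets of allowed purposes.

def match_policies(dhpref: set, dhpol: set) -> set:
    """PolicyMatchingEngine: create a Sticky Policy from the DS's DHPref
    and the DC's DHPol. The simplified rule always prefers the DHPref,
    so the result is a subset of the DHPref by construction."""
    return dhpref & dhpol

def evaluate(sticky_policy: set, requested_purpose: str) -> bool:
    """PolicyDecisionPoint: allow data usage only if the request's
    AuthorizationForPurpose is covered by the Sticky Policy."""
    return requested_purpose in sticky_policy

# DS allows marketing and research; DC asks for research and profiling.
sticky = match_policies({"marketing", "research"}, {"research", "profiling"})
print(sticky)                         # {'research'}
print(evaluate(sticky, "research"))   # True
print(evaluate(sticky, "profiling"))  # False
```

A full reasoning engine would allow richer matching than a set intersection, but the intersection captures the "DHPref wins" rule described above.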
The DHTModule has been implemented in order to provide the DHT functionality required by
the Business Ring. We decided to use OpenChord [2] as our base DHT implementation because
of its simplicity. The DHTModule uses the Protected Data (PD) data abstraction, encapsulating
data and Sticky Policy. The insert and retrieve DHT operations have been modified in order to sup-
port the logging mechanism outlined in Section 4.4.6. Before routing operations to the responsible
Chord node, the DHTModule issues a customized insertLog operation. The insertLog operation
acts as a normal Chord insert, except that it rehashes the value of the original LookupKey before
the operation is carried out. The insertLog inserts a LogEvent into the Chord DHT. The LogEvent
object contains information about the requested LookupKey, the identity of the requester and the
nature of the operation (insert or retrieve). The LogEvent can then be retrieved by a modified
retrieve operation, called the retrieveLog operation, which rehashes the LookupKey before the
actual retrieval operation is carried out. Furthermore, a safeRetrieve operation has also been
implemented, which additionally to its LookupKey parameter also takes an AuthorizationForPur-
pose parameter, describing the purpose for which data is requested. After the hosting node of
the requested data is found, the safeRetrieve operation evaluates the AuthorizationForPurpose
against the StickyPolicy of the hosted PD. In case of an allow, the safeRetrieve returns with the
PD. In case of a deny, no data is returned.
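The modified DHT operations can be illustrated with a standalone sketch. A plain Python dictionary stands in for the Chord ring (the actual prototype extends OpenChord in Java), the Sticky Policy check is reduced to purpose-set membership, and all names are illustrative.

```python
# Conceptual sketch of the insertLog / retrieveLog / safeRetrieve
# operations; a dict stands in for the Chord DHT.
import hashlib

dht = {}

def h(key: str) -> str:
    """Rehash a LookupKey, so LogEvents live at a different ring
    position than the data they describe."""
    return hashlib.sha1(key.encode()).hexdigest()

def insert_log(lookup_key: str, requester: str, op: str):
    """Store a LogEvent under the rehashed LookupKey."""
    dht.setdefault(h(lookup_key), []).append(
        {"key": lookup_key, "requester": requester, "op": op})

def insert(lookup_key: str, protected_data: dict, requester: str):
    """Insert a PD (data + Sticky Policy), logging the operation first."""
    insert_log(lookup_key, requester, "insert")
    dht[lookup_key] = protected_data

def safe_retrieve(lookup_key: str, purpose: str, requester: str):
    """Return the PD only if its StickyPolicy allows the requested
    AuthorizationForPurpose; on a deny, return nothing."""
    insert_log(lookup_key, requester, "retrieve")
    pd = dht.get(lookup_key)
    if pd and purpose in pd["sticky_policy"]:
        return pd
    return None

def retrieve_log(lookup_key: str):
    """retrieveLog: fetch the LogEvents under the rehashed key."""
    return dht.get(h(lookup_key), [])

insert("K8", {"data": "TestResult", "sticky_policy": {"treatment"}}, "DC1")
print(safe_retrieve("K8", "treatment", "DC2") is not None)  # True: allowed
print(safe_retrieve("K8", "marketing", "DC2"))              # None: denied
print(len(retrieve_log("K8")))                              # 3 logged events
```

Because the log key is a rehash of the data key, any peer holding the LookupKey can pull the access history of the PD without disturbing the data itself.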
4.5 Summary
The first contribution of this thesis work was to investigate how the PrimeLife Policy Language
(PPL) proposed by the PrimeLife project can be used to accommodate a system with Personal
Data Vaults. As the language framework by itself is not enough to provide strong guarantees
without enforcement, we proposed three alternative designs of policy enforcement frameworks.
The Verifiable Privacy (VP) comes with a strict architectural design that facilitates remote
software verification and monitoring of applications. The TPM attested software verification stands
at the basis of this model’s trust framework. We introduced the concept of a Forwarding Chain, a
platform used to maintain control over existing data copies.
The Trusted Privacy (TP) model is a relaxed version of Verifiable Privacy. Its trust framework
relies on trust outsourcing to multiple independent TTPs. A combined trust score is derived for
every entity based on static trust seals and reputation systems. The Forwarding Chain is also part
of this model with an extended responsibility of supporting log aggregation.
The Mediated Privacy (MP) enforcement model is our novel proposed solution, which envi-
sions the design of a mediated space where shared data objects live. The DHT data structure
captures our idea of a mediated space. Herein, the creation of a Business Ring platform used
to accommodate a specific business model is proposed. Both Data Subjects and Data Controllers keep a reference to a shared data object via its LookupKey. The Business Ring comes with its
own integrated trust framework based on keyspace slice sizes and consensus. Moreover, it also
provides request tracking and log aggregation in order to provide an assurance system.
5 Evaluation and Discussion
Contents
5.1 Comparison on Requirements
5.2 Comparison on Feasibility
5.3 Comparison on Trust Models
5.4 Comparison on Vulnerabilities and Weaknesses
5.5 Discussion
5.6 Summary
Table 5.1: Requirements Comparison Table

Requirement                          | Only Privacy Policy               | Verifiable Privacy      | Trusted Privacy                           | Mediated Privacy
-------------------------------------|-----------------------------------|-------------------------|-------------------------------------------|------------------------------------------------
Establishing Trust                   | Static Policy                     | Software Verification   | Static/Dynamic Seals & Reputation Systems | Keyspace Slice Size & Business Ring Description
Transparent User Data Handling       | Static Policy                     | Monitoring and Logging  | Monitoring and Logging                    | Logging
Data Across Multiple Control Domains | Static Policy & Manual Permission | Forwarding Chain        | Forwarding Chain                          | Business Ring
Maintaining Control                  | Only if DC allows                 | Forwarding Chain update | Forwarding Chain update                   | Direct DHT operation
Chapter 5 focuses on the evaluation of the privacy enforcement models outlined in Chapter 4
based on different criteria, namely: defined requirements, feasibility, trust model and vulnerabili-
ties. The chapter concludes with a short discussion on the subject of privacy enforcement.
5.1 Comparison on Requirements
In this section we will go through our initial requirements formulated in Section 1.3 and evaluate
how each of the above presented models fit them. By doing a comparison per requirement, we
can observe some of the relevant tradeoffs between the systems.
Table 5.1 compares all three models together with the Privacy Policy model, which stands for the privacy protection system in place today. The Privacy Policy model is included simply for comparison purposes.
5.1.1 Establishing Trust
In today’s Privacy Policy model, trust establishment is often a step that most of us tend to
disregard. Reading a lengthy and abstract description on data usage often takes too much time
and focus from the end users, delaying their access to the desired functionality. In some situations,
trust is established based on word-of-mouth reputation, such as recommendations from friends and colleagues. This, however, cannot be considered an accurate trust measure, since it relies on people's subjective perception. A more objective, automated trust reasoning process is clearly needed. The quantification of a trust level, however, is not a straightforward task.
Both the Verifiable Privacy (VP) and the Trusted Privacy (TP) models strive to achieve trust by
proving that the overall system run by the Data Controller (DC) is trustworthy and secure. These
proofs are the result of software verification techniques. Software verification follows the idea that
a verified software system should run and behave according to some predefined requirements.
Verification of the integrity of the software components can either be done statically or dynamically,
and both require a verifier entity, which is usually a Trusted Third Party (TTP). The TTP supporting
static verification provides a certificate proving that the DC is following some security standard.
The downside of the static verification is that certificates are only issued periodically, leaving
an open time window for vulnerabilities. Dynamic verification, on the other hand, provides more assurance, since it is carried out in real time. This extra assurance, however, comes with the need for enhanced hardware. The verification of the dynamically issued certificate is carried out by a TTP. While the VP focuses on highly dynamic software verification using TPM enhanced hardware, the TP settles for its static counterpart.
Software verification by itself is not enough to provide extensive privacy guarantees. The TP
model has an outsourced trust model that strives to combine multiple sources of trust into a single
score. While the VP relies heavily on a software verification scheme, the TP compensates with
a mechanism that outsources the responsibility of trust verification. By combining the indepen-
dent trust scores from different TTPs and Reputation Systems, it strives to provide an accurate
measure.
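The idea of a combined trust score can be sketched as a weighted average of independent sources. The weights and the averaging rule below are illustrative assumptions, as the thesis does not prescribe a concrete formula.

```python
# Hedged sketch of the TP model's combined trust score: independent
# scores from TTPs and Reputation Systems, each normalized to [0, 1],
# are merged via a weighted average. Weights are illustrative.

def combined_trust(scores: dict, weights: dict) -> float:
    """Weighted average of the available trust sources."""
    total_w = sum(weights[src] for src in scores)
    return sum(scores[src] * weights[src] for src in scores) / total_w

scores = {"seal_ttp": 1.0,       # static seal: certificate present
          "blacklist_ttp": 1.0,  # not blacklisted
          "reputation": 0.6}     # crowd feedback score
weights = {"seal_ttp": 0.4, "blacklist_ttp": 0.3, "reputation": 0.3}
print(round(combined_trust(scores, weights), 2))  # 0.88
```

Dividing by the total weight of the sources actually present lets the score degrade gracefully when one TTP is unreachable.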
The trust frameworks of both VP and TP rely on the existence of multiple TTPs. The Mediated Privacy (MP) model, on the other hand, comes with a built-in trust measure that is able to provide a reliable quantification of trust. The keyspace slice size is a quantification that can be measured by any peer of the system. Instead of focusing on proving the trustworthiness of some software, the MP is built around the concept of a trustworthy crowd. Along with the keyspace slice size, the userbase size is also measured through the peers of the network. The size of a DC's userbase is a well-adopted social indicator, but it is often misused as an accurate trust measure. The keyspace slice size, on the other hand, is a value established by the users of the system by mutual agreement.
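How a peer might measure keyspace slice sizes can be sketched for a Chord-like ring, under the standard convention that a node is responsible for the arc between its predecessor and itself. The keyspace size, node IDs, and entity names below are made up for illustration.

```python
# Illustrative sketch: an entity's keyspace slice size is the fraction
# of the ring its nodes are responsible for.

KEYSPACE = 2 ** 16  # toy keyspace size

def slice_sizes(ring: dict) -> dict:
    """ring maps node_id -> owning entity; returns the fraction of the
    keyspace each entity is responsible for."""
    ids = sorted(ring)
    totals = {}
    for i, node in enumerate(ids):
        pred = ids[i - 1]              # ids[-1] wraps around for i == 0
        arc = (node - pred) % KEYSPACE  # arc owned by this node
        totals[ring[node]] = totals.get(ring[node], 0) + arc
    return {entity: t / KEYSPACE for entity, t in totals.items()}

ring = {1000: "DS-A", 20000: "DC-1", 40000: "DC-1", 60000: "DS-B"}
print(slice_sizes(ring))
```

Since the slice sizes always sum to 1, any peer that can enumerate the ring membership can recompute and cross-check them, which is what makes the measure verifiable by the crowd.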
Given the different nature of the approaches for trust establishment presented by the three
models, it is hard to highlight a single model that provides a higher trust level than the other two.
We can conclude, however, that the trust establishment mechanism of the VP is best fitted when it comes to establishing trust between two physical machines. The other two solutions, on the other hand, focus on providing proof of trust for a DC entity in a broader sense.
5.1.2 Transparent User Data Handling
The only transparency provided by DCs today is formulated in the static Privacy Policy constructs. Privacy Policies, dictated by the DCs, only provide a one-sided agreement, not considering
Data Subject (DS) preferences. Assurance of the claims of these Privacy Policies is usually not
provided. There is a need for a two-sided agreement solution where the privacy preferences of
both DCs and DSs are considered. Moreover, the new solution also has to deliver assurance to
the DS, proving that his data has been handled according to pre-agreed rules.
All three privacy protection models presented above are based on the usage of the Sticky
Policy paradigm, instead of static privacy policies. Thus transparency in this context translates to
assurance that the pre-agreed Sticky Policies have been met. The most common way of getting
assurance is by verification of logs. Logging is part of all three models, but is realized
in different ways. Keeping event logs is the responsibility of the Monitor component of the PM
of every machine according to the VP and the TP models. The equivalent component in the
MP model is the Logging layer present on every node of the Business Ring. The Monitor of the
VP model offers the most thorough log keeping solution, since applications are running in their
isolated spaces. The middleware approach of the TP model offers a similar monitoring solution, as
long as the application layer does not bypass the middleware. The MP model only offers logging functionalities on Business Ring peers and, unlike the previous models, is not concerned with the application layer.
The similarity of the three models in regard to their logging mechanism can be seen by the fact
that, given the distributed nature of the problem, logs are scattered around multiple nodes. Since
logs only turn into assurance once they are verified, they first need to be collected for verification.
The aggregation of logs is done through different means in all of the models. The VP alleviates
the problem by employing an external trusted audit company, which is in charge of aggregating
the logs. The TP and the MP both provide a pull-based solution, supported by the Forwarding Chain and the Business Ring, respectively. A log-pull operation in the TP triggers a traversal of the Forwarding Chain, whereas the log-pull in the MP only requires a DHT lookup. In both cases the
integrity of the logging platform (the Forwarding Chain and the Business Ring) plays an important
role in the functioning of the logging system. We argue that the Business Ring offers a better
platform than the Forwarding Chain, since the loss of a single node in the Forwarding Chain will
result in the loss of a complete subtree. On the other hand, the problems relating to the loss of a
node in the Business Ring can be alleviated by data replication.
We can also differentiate between the log verification mechanisms, based on where the verifi-
cation is carried out. The VP assumes the existence of a TTP, while the TP and MP both deliver
the raw logs to the DS himself. The solution employed by the VP is more coarse-grained, since the TTP only provides an assurance digest. The solution of the TP and MP, on the other hand, is more fine-grained, since data owners can process raw logs based on their own requirements.
In conclusion we can state that the VP and TP models offer a more localized logging solution
that focuses on the application layer. On the other hand, the solution in MP is built to support a
resilient logging system with log aggregation built in mind. Moreover, the TP and MP both offer
a first-hand log delivery mechanism, which results in a higher assurance level than the one provided by the log digest in the VP.
5.1.3 Data Across Multiple Control Domains
Some Privacy Policies dedicate a section to state that collected user data could potentially be
disclosed to third parties, but they usually fail to address any details related to this transaction.
Not only is the DS asked to agree on such data forwarding, but oftentimes the identity of the third party remains undisclosed. A better data forwarding scheme is indispensable for the sake of
control.
Sticky Policies are the main tools that dictate whether data forwarding can take place, and
under what circumstances. A general rule that applies to all of the models is that data forwarded
to third parties should have the same or a more restrictive Sticky Policy than the original Protected
Data (PD). The Privacy Manager (PM) component of the DCs is entrusted to create new Sticky
Policies according to this rule and make sure that data is only forwarded together with it. On
the other hand, the data protection across multiple domains also relies on the properties of the
dissemination platform on which the data is forwarded. An undefined platform, where entities can
directly connect to one another for data sharing purposes, makes maintaining control over shared
data nearly impossible. The VP and TP models use the Forwarding Chain as their dissemination
platform, while the MP has its own novel solution represented by the Business Ring.
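The general forwarding rule can be sketched directly: modeling a Sticky Policy as a set of allowed purposes (a simplifying assumption), "same or more restrictive" reduces to a subset check.

```python
# Sketch of the forwarding rule shared by all three models: data passed
# to a third party must carry a Sticky Policy at least as restrictive
# as the original one. With purpose sets, this is a subset test.

def may_forward(original: set, derived: set) -> bool:
    """True if the derived policy is the same or more restrictive."""
    return derived <= original

original = {"treatment", "research"}
print(may_forward(original, {"research"}))               # True
print(may_forward(original, {"research", "marketing"}))  # False
```

A PM that refuses to forward any PD failing this check enforces the rule locally, before the data ever reaches the dissemination platform.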
One of the main differences between the two platforms is that, while the Forwarding Chain is a highly dynamic construct with ad-hoc properties, the Business Ring is defined around existing business models. A separate Forwarding Chain is built for every single shared PD object, while
there is only a single DHT encapsulating every internal data exchange in case of the Business
Ring. Moreover, the restriction free nature of the Forwarding Chain allows any DC eligible for PD
exchange to be part of the chain, leaving the DS without any information about the identities of
the potential third parties. The Business Ring, on the other hand, requires every DC who wishes
to share PD to be part of the same ring, because of our requirement that data should only be
exchanged through the DHT itself, never externally. The Business Ring description, provided to
every DS, can contain information regarding the identity of every potential DC and third party. In the case of the MP, a DS can thus assume that his shared data will not be disclosed to DCs outside of the ring, while in the case of the VP and TP there is no restriction with regards to the identities of the DCs in the Forwarding Chain.
In conclusion, the open nature of the Forwarding Chain used by the VP and TP offers a more
flexible solution, but lacks the structured property of the MP. By its design, the Business Ring
offers a clearer data forwarding model than its counterpart.
5.1.4 Maintaining Control
Maintaining Control, being the second aspect of privacy protection, is one of the more complex and harder requirements. In order to provide a more thorough analysis, we take a look at Table 5.2
for a categorization of control, based on the nature of the data.
5.1.4.A Direct Data
Every PD object shared explicitly by the data owner is considered direct data. DCs are left with the choice of whether or not to provide frontend services for manipulating direct user data. This becomes especially important when a user tries to remove some previously shared data. For example, an e-commerce site might save your shipping address long after you stopped using its services. This becomes even more complicated in scenarios where PD has already been shared across multiple third parties.
Both VP and TP offer the functionality of control over direct data via the Forwarding Chain
platform. The DS issues the modified PD to the original data requester, who in turn is responsible for pushing the modification on through the chain. One of the downsides of this solution is that, since every node hosts its own data copy, every modification operation initiated by a DS triggers
a cascade of requests that is flooded over the Forwarding Chain. The MP, on the other hand,
assures that the DS will possess a lookup key associated with every shared PD object. If there are
two live copies of the same PD under different DCs from within the same ring, the DS maintains
a lookup key for each of them. Modification of PD can be achieved via a simple DHT insert that
replaces the old version. The Business Ring not only keeps track of all the existing data copies
throughout the ring, but also lets the DS modify every PD separately.
The benefit of the Business Ring over the Forwarding Chain is much more obvious when we
take a look at how data deletion is handled. A delete operation on a PD might not get propagated
to every DC of the Forwarding Chain due to a broken link. In the Business Ring, however, the
remove is an already implemented and supported operation of the DHT, which can be used with
this exact purpose. DSs hold a LookupKey associated with every existing copy of a previously
shared PD object. In case they wish to remove the previously shared PD, they just have to
perform a remove operation inside the Business Ring. The physical host of a PD only carries out the operation once the identity of the requester is verified to be the original DS associated with
the PD. The mechanism issuing identities and providing verification is not part of this research.
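The remove flow can be sketched as follows. The identity verification is stubbed as a simple owner comparison, since the mechanism issuing and verifying identities is explicitly out of scope; the keys and names are illustrative.

```python
# Minimal sketch of a DS removing previously shared PD copies from the
# Business Ring: the hosting node honours the remove only after the
# requester is verified as the original DS of the PD.

ring = {"K3": {"data": "PHR", "owner": "DS-Alice"},
        "K8": {"data": "PHR", "owner": "DS-Alice"}}

def remove(lookup_key: str, requester: str) -> bool:
    """Delete the PD if it exists and the requester is its owner."""
    pd = ring.get(lookup_key)
    if pd is None or pd["owner"] != requester:  # stubbed identity check
        return False
    del ring[lookup_key]
    return True

# The DS holds a LookupKey per live copy and removes each one directly.
for key in ["K3", "K8"]:
    print(remove(key, "DS-Alice"))  # True, True
print(ring)                         # {}
```

Contrast this with the Forwarding Chain, where the same deletion would have to propagate node by node and a single broken link could leave a copy behind.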
5.1.4.B Indirect Data
A different class of data, called indirect data and described in Section 2.2, is often left out of privacy research. Given the mashup-like structure of the web, however, indirect data is becoming increasingly important. The only model that addresses the handling of indirect data is the MP. Given the ambiguous nature of indirect data, we refrain from deciding who is the
Table 5.2: Detailed Comparison on Maintaining Control

Maintaining Control of | Online Privacy Policy | Verifiable Privacy   | Trusted Privacy      | Mediated Privacy
-----------------------|-----------------------|----------------------|----------------------|------------------
Direct Data            | Only if DC allows     | via Forwarding Chain | via Forwarding Chain | via Business Ring
Indirect Data          | N/A                   | N/A                  | N/A                  | via Metadata
Sticky Policy          | N/A                   | via Forwarding Chain | via Forwarding Chain | via Business Ring
correct DS of such an object; instead, we provide a platform by which users can be made aware of indirect data. Since awareness is the first step towards achieving privacy protection, we consider this an important feature. The Business Ring increases awareness of indirect data by maintaining data pointers in the form of metadata attached to PD objects. The data pointers are simple lookup keys pointing to the shared indirect data inside the ring. Inserting data pointers is the responsibility of the entity who shares the indirect data, while keeping an updated view of the existence of indirect data is the responsibility of every DS node of the ring, through pulling the metadata.
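The metadata-pointer mechanism can be sketched as follows; the dictionary standing in for the DHT, the keys, and the names are illustrative.

```python
# Sketch of indirect-data awareness in the Business Ring: the DC that
# shares indirect data attaches its LookupKey as metadata next to the
# DS's original PD, and the DS discovers it by pulling the metadata.

dht = {
    "K_PHR": {"data": "PHR", "owner": "DS", "metadata": []},
}

def share_indirect(source_key: str, indirect_key: str, indirect_pd: dict):
    """DC inserts the indirect data and a pointer to it next to the PD."""
    dht[indirect_key] = indirect_pd
    dht[source_key]["metadata"].append(indirect_key)

def pull_indirect(source_key: str) -> list:
    """DS pulls the metadata to learn about indirect data derived
    from his PD."""
    return [dht[k]["data"] for k in dht[source_key]["metadata"] if k in dht]

share_indirect("K_PHR", "K8", {"data": "TestResult", "owner": "DC"})
print(pull_indirect("K_PHR"))  # ['TestResult']
```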
5.1.4.C Sticky Policy
Some of the use cases are focusing on the manipulation of the access rights and conditions
regarding the previously shared data. Conceptually, this can be achieved by updating the Sticky
Policy of a PD with a new version. This becomes increasingly relevant in use cases where a DS
wants to revoke the access rights of a PD.
The modification of a Sticky Policy can be achieved in a similar manner to that of direct data, since both the Forwarding Chain and the Business Ring operate on PD objects. Just as
stated before, a potentially broken Forwarding Chain can leave behind unauthorized data copies,
while the Business Ring makes sure to provide a data pointer to every existing copy to the DS.
Thus the MP offers a finer granularity of control, since it allows the DS to modify the Sticky Policies
of individual PD copies.
5.2 Comparison on Feasibility
This section is dedicated to the analysis of the assumptions made by the proposed models, in
order to evaluate their feasibility. In this analysis we will consider criteria such as prerequisites,
architectural constraints, and other assumptions.
The first observation we can make after going through our models is the difference in the approach taken by the VP and TP from that of the MP. Both VP and TP focus on a privacy enforcement
model that is carried out on a per machine basis. Resulting from this, privacy guarantees can
only be provided if the interacting parties are both operating on machines built according to a
specific design. A homogeneous setup, where every machine is designed alike with privacy
features in mind, is a highly demanding assumption. The concept of PDVs is still in its maturing
phase, thus integrating a privacy oriented design into it is still acceptable. This can hardly be
said, however, of the already existing infrastructure. This becomes especially clear when looking at how the backends of information systems are built today. Service providers tend to have highly
optimized and distributed backend systems, designed to fit their own specific use, which differ from
one to another. Achieving homogeneity, by converting every machine into a Policy Enforcement
Point (PEP) node, implies a radical change required from service providers.
The MP, on the other hand, instead of focusing on a per machine schema, comes with a
design which resembles an additional data access layer. The requirement for homogeneity still
holds across PDVs, since they have to be active participants of the Business Ring, in order for
them to share information. Requiring homogeneity across all machines would have the same
implications as presented above. To avoid this, the MP describes a privacy enforcement model
where the privacy guarantees are enforced on a per entity basis. By this requirement every entity, be it DS or DC, PDV or service provider, should have its dedicated set of nodes inside the
Business Ring. These dedicated machines form a data access layer, which is enhanced with
privacy enforcing techniques.
As pointed out above, all three models can be fitted into the design of the loosely defined
PDV concept. Differences arise in the strategies in which they can be incorporated into existing
infrastructure. If we abandon the requirement for homogeneity of VP and TP, we can assume a
per entity design, where each entity has its own dedicated PEP, or set of PEPs. This brings all three models onto similar terms with regards to integration efforts. However, we argue that the MP
is formalized with an integration model in mind, through a PaaS solution, while the other models
are lacking this.
The second concern with regards to feasibility is the architectural requirements of a single PEP
node. The VP model sketches the most demanding PEP design, because apart from specialized
software, it also requires TPM enhanced hardware. In order to achieve the isolation of separate
applications running on top of it, its architecture has to follow either the system VM or the process
VM design. The TP relaxes this architecture model by replacing the TPM with external trust
sources and positioning the PM above the OS. Instead of a whole specialized software system,
the TP only requires the existence of the PM middleware.
The PEP node of the MP corresponds to a Business Ring node, and follows the three layered
architecture design described in Section 4.6. Both TP and MP only require the existence of a
specialized software component in the form of a software layer. But while the software layer of
MP is a purely conceptual data access layer, the middleware of TP is a software layer that has to
fully cover the functionality of the OS, providing a privacy enhanced layer to the application layer
on top.
The architectural requirements of both VP and TP, although more strict in terms of design, are
generic enough to support any application running on top of them. The MP node, however, has the most flexible PEP structure. The strictness of the design is inversely proportional to the level of enforcement that a single node can provide, as presented in Section 5.4.
Another aspect of feasibility is the assumptions made with regards to the existence of TTPs.
The VP uses a TTP to outsource the responsibility of aggregation and verification of logs, while
the TP relies on the existence of several TTPs in order to sustain its trust framework. The MP, on the other hand, does not require a TTP, because it is capable of providing a built-in log aggregation and verification system, together with a self-reliant trust framework.
Important considerations with regards to feasibility are the performance penalties that every
model introduces. We refrain from talking about any quantitative performance measurement,
since the thesis is carried out on a conceptual level. However, we can make estimates regarding
the performance of some aspects.
Operations on maintaining control over previously distributed data copies can be subject to a
rough performance evaluation. This operation can either require all data copies to be modified
or deleted, or it might even target the Sticky Policies of the data copies. The Forwarding Chain
and the Business Ring are the two subjects of comparison, since they are providing the platform
where these operations are carried out. Suppose that there are N copies of a shared PD that
need to be updated. In case of the Forwarding Chain these copies are organized in a tree-like
structure with N nodes, one for each copy. A simple traversal of the tree, which visits every data
copy, has the time complexity of O(N). In case of the Business Ring, on the other hand, data
copies are organized inside the DHT in an unpredictable manner. A single update operation in
the DHT has the time complexity of O(log(U)), where U denotes the size of the userbase (number
of participants in the Business Ring). An operation that modifies all data copies would have the
time complexity of O(N*log(U)), which is slightly worse than O(N).
The MP, however, makes up for this performance penalty in scenarios where only a single
data copy needs to be updated. In case of the Forwarding Chain, the only entry point being the
root of the dissemination tree, a Breadth First Search (BFS) or a Depth First Search (DFS) is
required in order to find the right node holding a particular data copy. The complexity of both search algorithms is O(N), since the whole tree needs to be traversed in the worst case. The
DHT of the Business Ring, on the other hand, only requires a single operation, since every data
copy has its own reference LookupKey. This puts its time complexity to O(log(U)). U is assumed
to be a much larger number than N, since the size of the userbase can reach magnitudes of thousands, while the number of existing data copies is much smaller. This means that for smaller Ns the Forwarding Chain will yield better performance, but the Business Ring will keep a constant performance of O(log(U)), independent of N. Thus for larger Ns the Business Ring will show a better performance.
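The single-copy update comparison above can be illustrated numerically. Constant factors are ignored, so this only shows the asymptotic trend; the userbase size is an assumed example value.

```python
# Back-of-the-envelope comparison of single-copy update costs:
# Forwarding Chain (tree search over N copies) vs Business Ring
# (one DHT lookup over a userbase of size U).
import math

def chain_single_update(n):
    """BFS/DFS over the Forwarding Chain tree: O(N) in the worst case."""
    return n

def ring_single_update(u):
    """One DHT lookup in the Business Ring: O(log(U)) hops."""
    return math.log2(u)

U = 10_000  # assumed ring userbase size
for n in (2, 16, 64):
    print(n, chain_single_update(n), round(ring_single_update(U), 1))
```

With U = 10,000 the ring costs roughly 13 hops regardless of N, so the chain only wins while N stays below that crossover point.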
In conclusion, the VP model presents the most assumptions and requirements such as TPM
enhanced hardware, strict PEP design and existence of a TTP. The TP is considered to be
a relaxed version of VP, but it still needs the presence of multiple TTPs in order to function
reliably. The MP comes with the least requirements with regards to architecture at the cost of
some performance penalty. The source of this performance penalty is the requirement for a DHT operation on every data exchange, which involves hops across several nodes. The VP and TP assume
a point-to-point PEP without any indirection.
5.3 Comparison on Trust Models
This section provides an overview and comparison of the trust frameworks designed by the
three formulated models. Special attention is dedicated towards the way in which the source of
trust is shifting from one model to the other.
In order to provide a comprehensive evaluation of the trust models, first we introduce some
general concepts regarding trust. The PRIME research project [8] describes a categorization of trust models based on layers. Each layer contributes to the final trust score, by which users decide whether or not to trust an entity. These layers are as follows:
1. Socio-cultural layer
This layer refers to the general socio-cultural background of every activity. Cultural background, for example, can have an impact on the attitude adopted towards strangers by different people. While some social groups are more inward-oriented, others are not. This in turn can influence how easily previously unknown service providers are accepted into the trust zone.
2. Institutional layer
This layer refers to the underlying legal regulations, which provide generally structured rules. The data protection laws described in Section 2.2 belong to this layer.
3. Service Area layer
This layer refers to the difference in trust based on the branch of industry. The banking sector, for example, is supposedly more trusted than the social networking sector.
4. Application layer
This layer targets trust in a particular service provider. The user's perception of a direct interaction with a service provider becomes a deciding factor of trustworthiness. Irregular
events during this interaction can degrade trust. For example, a booking website which
offers incomplete information on the booking, but asks for banking details, is unlikely to be
used by anybody.
5. Media layer
This layer refers to the communication channels via which interactions are conducted. Trust in the medium is a strong requirement most of the time. The internet is considered such a medium, where strong trust levels can be achieved via secure encryption.
The VP model’s trust framework is focused on remote software and log verification in order
to provide privacy guarantees. The source of trust is the verification which is carried out by the
TTPs. The user is required to completely trust these entities. This trust framework offers a design
which mostly targets the Application Layer, since verification is carried out on remote parties of
the interaction.
The TP offers a different trust framework which is composed of two components: the trust score and the reputation score. The trust score strives to achieve trust guarantees similar to those of the VP: TTPs attest to the verification of the remote platforms. The reputation score can either assume another TTP, like blacklist authorities, or rely on the crowd, like feedback systems.
This trust framework also focuses on the Application Layer, since trust is evaluated on individual
service providers. Unlike the VP, however, the trust source of TP is scattered among multiple
entities belonging to an independent crowd. The independent crowd is made up of TTPs providing
the trust and reputation score. The independence property is described in Section 4.3.4.A.
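As an illustration of how the TP's two components could interact, the following sketch zeroes out trust when TTP attestation fails or an independent blacklist authority has flagged the party, and otherwise reports the crowd-based reputation. The combination rule and field names are assumptions of the sketch, not taken from the model's specification:

```python
# Illustrative combination of the TP model's trust score (TTP attestation,
# blacklist check) and reputation score (crowd feedback).
from dataclasses import dataclass

@dataclass
class TrustReport:
    attested: bool        # did a TTP verify the remote platform?
    reputation: float     # crowd feedback, normalized to [0, 1]
    blacklisted: bool     # reported by an independent blacklist authority

def tp_score(report: TrustReport) -> float:
    """Zero trust if unverified or blacklisted; otherwise the reputation."""
    if not report.attested or report.blacklisted:
        return 0.0
    return report.reputation

tp_score(TrustReport(attested=True, reputation=0.8, blacklisted=False))  # 0.8
```

Because the attestation, blacklist, and reputation inputs come from distinct entities, no single source can unilaterally raise the final score.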
The MP takes a different approach for its trust framework, which no longer relies on remote platform and software verification. The source of trust shifts from the TTPs to what we call a
collaborating crowd. The collaborating crowd is made up of the peers of a specific Business
Ring, which are collaborating to provide trust measures. The keyspace slice size, which is one
such trust measure defined in Section 4.4.5.B, relies on the correct collaboration of the Business
Ring members achieved through consensus. The MP can partly be seen as part of the Application
Layer of the trust categorization, since it targets trustworthiness of service providers. On the other
hand, it can also be considered as part of the Media Layer, given the construct and properties of
a Business Ring. A Business Ring is regarded as a mediated space, a channel on which different
interactions can be carried out. Of the three models, only the MP is the one which possesses this
property.
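In a Business Ring built on a Chord-like DHT, each node owns the slice of the identifier circle between its predecessor and itself, so the keyspace slice measure can be derived directly from the node identifiers. The following is a small sketch with an illustrative 16-bit keyspace and hypothetical node IDs:

```python
# Compute each node's keyspace slice in a Chord-like identifier circle.
# In Chord, the slice owned by a node is the interval (predecessor, node].
# The 16-bit keyspace and the node IDs are illustrative assumptions.
KEYSPACE = 2 ** 16

def slice_sizes(node_ids):
    """Map each node ID to the size of the keyspace slice it owns."""
    ring = sorted(node_ids)
    sizes = {}
    for i, node in enumerate(ring):
        pred = ring[i - 1]  # for i == 0 this wraps around to the last node
        sizes[node] = (node - pred) % KEYSPACE
    return sizes

sizes = slice_sizes([100, 20000, 41000, 60000])
```

With uniformly random identifiers, slices concentrate around KEYSPACE / n; a node whose slice deviates far from that expectation may be treated with suspicion under the keyspace slice trust measure.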
Trust can also be divided into two major groups: social trust and technological trust. Social trust encapsulates all social aspects of trust, which humans derive from reputation, interaction history, and other social incentives. Technological trust, on the other hand, is focused on trust achieved by technical mechanisms, such as tamper-free hardware, cryptographic techniques, and others.
The VP is focused on providing strong technological trust via remote software verification techniques. The high degree of technological trust compensates for missing or weak social trust. The details of the interacting parties become irrelevant once there is a technological assurance attesting that both are safe to interact with. The technological trust offered by software verification, however, can be broken by the exploitation of vulnerabilities and software bugs. Similarly, the TP provides technological trust through verification, but it also proposes an improvement of social trust by combining independent trust sources.
The MP, on the other hand, is not focused on technological trust, but on social trust via the
collaborating crowd of the Business Ring. The elevated social trust level compensates for the
shortcomings of the technological trust. By trusting the majority of the crowd, the Business Ring
can provide trustworthy assurance in the form of verifiable logs.
5.4 Comparison on Vulnerabilities and Weaknesses
In this section we examine the weak points of every enforcement model and discuss possible ways of overcoming them.
5.4.1 Weaknesses of the Sticky Policy
Sticky Policies serve as the base data protection mechanism underlying each of the presented
models. As described in Section 3.4, Sticky Policies are created after combining a Data Handling
Policy (DHPol) with a Data Handling Preference (DHPref). Combining the two may trigger a negotiation phase, where both parties try to enforce data handling constraints on the other. Creating the logic behind such reasoning is not always straightforward. Imagine the following case: the user wants his data to be deleted after 24 hours, and the service provider states that it will delete the data, but will also keep a backup copy. Such scenarios can result in a misleading agreement.
Moreover, there are other cases when service providers simply refuse to give in to the user’s
data handling preferences. Imagine, again, that the user wants his data to be removed after 24
hours, but the service provider simply denies this request in its DHPol. One might jump to the conclusion that in these cases the user will simply discontinue the interaction with the service provider. In real world use cases, however, users are eager to get services on demand as fast as possible. This can lead the user to make a compromise, letting the service provider decide what the actual Sticky Policy will look like.
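The retention conflict described above can be made concrete with a minimal matching sketch. The field names are illustrative, not actual PPL vocabulary; the sketch merely flags the case where the provider's stated handling, including backups, defeats the user's deletion preference:

```python
# Hypothetical DHPref/DHPol matching for a single retention constraint.
# Field names are illustrative assumptions, not PPL elements.
from dataclasses import dataclass

@dataclass
class DHPref:
    max_retention_hours: int      # user: delete my data after this many hours

@dataclass
class DHPol:
    retention_hours: int          # provider's stated retention period
    keeps_backup: bool            # a backup copy outlives the retention

def match(pref: DHPref, pol: DHPol):
    """Return (ok, reason); a backup copy silently voids a deletion promise."""
    if pol.keeps_backup:
        return False, "backup copy outlives the agreed deletion"
    if pol.retention_hours > pref.max_retention_hours:
        return False, "provider retains data longer than preferred"
    return True, "sticky policy can be formed"

ok, reason = match(DHPref(max_retention_hours=24),
                   DHPol(retention_hours=24, keeps_backup=True))
```

Here `ok` is False: the "delete after 24 hours" promise is misleading, which is exactly the kind of conflict an automated matcher must surface instead of silently accepting.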
The basic requirement of the Sticky Policy paradigm is that the protected data should always be accompanied by its Sticky Policy. Privacy enforcement models, like the ones proposed in this thesis, are required to safeguard the bond between data and policy. Exploits that separate a policy from its data, however, could lead to privacy violations, since the bare data object is no longer associated with its protecting policy.
Moreover, exploits may exist to switch or modify Sticky Policies. A malicious DC could replace the Sticky Policy of a shared Protected Data (PD) object, allowing it to extend the set of operations it can perform on the PD. The switched Sticky Policy can either be a forged version, or an older, deprecated version of the previous policy. The most common method of alleviating this problem is to introduce integrity checks and provide non-repudiation. Integrity checks are necessary to assure that the Sticky Policy attached to the data object is the right one. Non-repudiation guarantees that the DS is the truthful creator of the Sticky Policy.
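These two countermeasures can be sketched together: a digest binds the policy to its data object, and a version counter guards against deprecated policies. In the sketch below an HMAC stands in for the DS's signature; true non-repudiation would require an asymmetric signature, since an HMAC key is shared between the parties:

```python
# Sketch of binding a Sticky Policy to its data object. The record carries
# a digest of the data (integrity of the bond), a version counter (against
# deprecated-policy replay), and a keyed tag standing in for a DS signature.
import hashlib
import hmac
import json

DS_KEY = b"ds-secret"  # illustrative; a real DS would use a private signing key

def bind(data: bytes, policy: dict, version: int) -> dict:
    record = {"policy": policy,
              "data_digest": hashlib.sha256(data).hexdigest(),
              "version": version}
    blob = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(DS_KEY, blob, hashlib.sha256).hexdigest()
    return record

def verify(data: bytes, record: dict, latest_version: int) -> bool:
    sig = record.pop("signature")
    blob = json.dumps(record, sort_keys=True).encode()
    record["signature"] = sig
    expected = hmac.new(DS_KEY, blob, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(sig, expected)
            and record["data_digest"] == hashlib.sha256(data).hexdigest()
            and record["version"] == latest_version)

pd = bind(b"medical record", {"delete_after_hours": 24}, version=2)
verify(b"medical record", pd, latest_version=2)   # True
verify(b"medical record", pd, latest_version=3)   # False: deprecated policy
```

A switched or stripped policy changes the signed blob, and an old policy fails the version check, so both attack variants described above are detectable.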
5.4.2 Malicious Data Controller (DC)
A maliciously behaving DC is one which is consciously operated, or has been tampered with, to bypass restrictions or ignore required operations. A tampered DC system can be considered unsafe to interact with, because of the high potential for privacy violation. Since the technical enforcement mechanisms are no longer protecting the user's data, malicious DCs gain unrestricted access. Moreover, the traces of such a violation can also be disguised by the malicious DC, by refusing to comply with the required logging system. Ignoring logs, or providing compromised logs, can
help the malicious DC mask a policy violation as a correct access. To prevent a malicious DC, all three models have a specialized software component, called the Privacy Manager (PM), which prevents misuse. On the other hand, systems might get compromised even if they
follow one of the presented enforcement models.
The weakest model with regard to the possibility of a malicious DC is the Mediated Privacy (MP) model. Without a remote software verification technique, the DSs interacting with a DC in the Business Ring can only observe its behaviour through its external actions. This gives a large amount of flexibility and freedom to the internal application layer, which processes shared PD objects. Moreover, the MP does not employ any of the strict monitoring schemes of the other two models, leaving the PD vulnerable to exploits in the application layer.
The Trusted Privacy (TP) model, relying on a middleware solution, offers better protection against tampering, because of the software verification techniques employed. TTPs are required to evaluate and issue static certificates attesting the correct state of the PM on the DC's system. A correctly functioning PM, however, cannot guarantee the correctness of the DC. Malicious software from the application layer could bypass the PM middleware, in order to carry out unrestricted and unmonitored operations. The likelihood of this bypass, however, depends on the design of
the middleware itself.
The Verifiable Privacy (VP) model offers a much stronger software verification mechanism,
which provides dynamic assurance. The verification of the PM is carried out in real time at every operation, minimizing the likelihood of a tampered PM. On the other hand, the existence of a
malicious application running on the DC could go unnoticed, even by a fully functioning PM. First,
the malicious application has to pass the verification of the PM and get access to a PD. After
getting the PD, however, the malicious application could open an encrypted channel to an outside
machine, where it can send PD objects unrestricted. The encrypted channel, independent of
any encryption used by the VP itself, leaves the monitor blind to every data exchange with the
outside machine, making unverified data disclosure possible. In this way, the trustworthy PEP of
the DC becomes a proxy for an unverified machine, via a malicious application. Once the external
machine receives the PD, it can do any operation on it.
As presented above, none of the three models is tamper-free, leaving the possibility of a malicious DC a standing threat. The effort required to convert a correct DC into a malicious one varies from model to model. Lacking software verification and internal monitoring, the MP requires the least tampering effort. The TP and VP require higher effort, since bypassing the existing enforcement system is more troublesome, but not impossible.
5.4.3 Platform Vulnerabilities
The distributed platforms on which the presented models operate, namely the Forwarding Chain and the Business Ring, can also have vulnerabilities. Both platforms are responsible for maintaining pointers to existing PD objects, such that the DS can exercise control over them. Moreover, these platforms are also used by the logging system.
As explained in Section 5.1.4, a broken link in the Forwarding Chain can cause the loss of control over a subset of shared PDs. This can lead to new and deprecated PDs coexisting, introducing new security holes. Moreover, a broken link also causes the loss of a subset of the usage logs. The missing logs can lead to weak assurance, followed by a degradation of the trust framework.
The Business Ring platform, essentially being a decentralized structure with more flexible properties than the Forwarding Chain, can be targeted by a botnet attack. A botnet living inside a Business Ring can cause multiple problems, depending on its size. A larger botnet might interfere with the consensus algorithms, rendering the trust framework biased. A smaller botnet can cause message and log disruptions, causing the system to stop functioning. A viable solution against botnets is the use of an identity management system, where each node's identity is verifiable. Moreover, smaller botnets could be discovered by the system itself and eliminated by running a consensus-driven mechanism.
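The consensus-driven elimination mentioned above can be sketched as a simple majority vote among ring members over a suspect node. The sketch assumes that honest nodes form a strict majority; vote collection and node identities are abstractions, not part of the Business Ring specification:

```python
# Majority-vote eviction of a suspected botnet member, under the
# assumption that honest nodes outnumber colluding ones.
def evict_by_consensus(votes: dict) -> bool:
    """votes maps voter ID -> True if the voter flags the suspect."""
    flags = sum(1 for v in votes.values() if v)
    return flags * 2 > len(votes)   # a strict majority evicts the suspect

votes = {"n1": True, "n2": True, "n3": True, "n4": False, "n5": False}
evict_by_consensus(votes)  # True: 3 of 5 members flag the suspect
```

This also makes the size threshold explicit: once colluding nodes reach half the ring, the vote itself becomes biased, which is exactly the larger-botnet failure mode described above.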
Both vulnerabilities can cause major disturbances in the correct functioning of the platforms. We argue, however, that the cost of a botnet attack is much higher than that of a broken link, making it less likely to happen. Botnet attacks need additional resources, put to a specific use, while the breaking of the Forwarding Chain can be caused by multiple events. One such event is a targeted attack on a single node of the chain. On the other hand, a simple machine failure caused by an accident can also create a broken chain; thus the Forwarding Chain is less resilient than the Business Ring.
5.5 Discussion
One of the main conclusions of the above evaluation section is that none of the three formu-
lated models can offer a definite privacy enforcement solution that covers every aspect of privacy
protection. Section 5.4 points out potential weaknesses in the design of the models. Moreover,
software bugs in the implementation of a particular design are inevitable, leaving an even bigger
window opened for vulnerabilities. Since technology is not enough to provide a fully satisfying
solution, alternatives must be considered.
A large subset of privacy violations happen because of unaware or careless use of technology. Sharing private information online is a socially acceptable norm, often even rewarded by means of upward-pointing thumbs. Users often do not realize the implications of such data sharing. This is caused by the failure to recognize the world wide web for what it is: a public domain. Examples of real life privacy invasions can be observed on the open streets of a city, or on public talk shows. A private investigator, for example, is allowed to track a person through the public streets of a city. Similarly, a talk show guest may disclose private personal information in front of the public. In both cases private information of individuals gets uncovered, but these real world examples are hardly considered privacy violations. In the case of public streets, an individual is well aware that his behaviour in public can be subject to observation. Talk show guests might even receive remuneration for their participation in the show, and inherently for sharing private information. The similarities between real world and online interactions cannot be ignored. A shift in the public consciousness towards online interactions has to take place, in order for individuals to realize the implications of online information sharing.
The trust models depicted by the formulated privacy enforcement models can also be evaluated through real life models and interactions. Trust models that rely strongly on TTPs require authoritative trust from end users. This authoritative trust in real life resembles the trust model used for governmental agencies. Both exhibit properties of a semi-closed system which is trusted to carry out some service. Trust in these entities is highly reliant on subjective opinions, which can range to extremes. Trust in TTPs has the same social implications.
The trust model of the Mediated Privacy (MP) model relies on the functioning of a collaborating
crowd. The collaborating crowd exhibits properties of a real life community tied together by a
common interest, which is the provided business model inside a Business Ring in our case. A
trust model which focuses on the collective power of individuals is much more appealing than one based on authoritative trust. Although the technical means of achieving a strong collaborative online crowd that can provide trust are not well defined today, we argue that the MP provides valid ideas in this domain.
5.6 Summary
Chapter 5 is dedicated to the evaluation of our proposed privacy enforcement models consid-
ering multiple different criteria, namely: initial requirements, feasibility, trust source, vulnerabilities
and weaknesses. After evaluating the initial requirements, we can conclude that the VP and TP models focus on a stronger localized solution, with strong PEPs, while the MP is more concerned with distributed interaction models. Moreover, the data dissemination platform of the Business Ring provides a better control model and greater resilience than the Forwarding Chain.
The feasibility evaluation helped us rule out solutions requiring homogeneity, due to integration problems, and also pointed out the high requirements of the VP and TP models, making them less feasible. The trust source evaluation offers a categorization of the proposed solutions based on the traits of their trust framework. We can observe a shift in the trust source moving from the TTPs of the VP and TP towards the collaborative crowd. The evaluation of vulnerabilities shows that the MP model is the most easily compromised, but the TP and VP also exhibit weaknesses. Finally, we conclude the chapter with a discussion of our proposed solutions in light of real life aspects.
6 Conclusion
Contents
6.1 Summary
6.2 Future work
Chapter 6 concludes the thesis with a short summary, highlighting the contributions made by this work. The chapter ends with a section on proposed future work.
6.1 Summary
The goal of this thesis project was to explore different privacy enforcement models employing
the Personal Data Vault (PDV) as the source of personal data. As the related research suggested, we turned our attention towards privacy enhancing models that employ privacy policy languages and the sticky policy paradigm. The PrimeLife project, in particular, offered the
PrimeLife Policy Language (PPL), which formulates Sticky Policies based on a Data Handling
Policy (DHPol) of a Data Controller (DC) combined with the Data Handling Preference (DHPref)
of a Data Subject (DS). Our first contribution consisted of evaluating whether the PPL framework
fits the scenario where PDVs are widely employed. Section 4.1 evaluates the language elements of PPL, which were found to be appropriate as the basis for privacy protection for the PDV.
The second contribution was targeted at the design of several policy enforcement models,
which can guarantee the correct functioning of the PPL framework. A set of requirements was defined in Section 1.3 in order to focus the scope of the enforcement models on four highlighted
aspects of privacy concerns, namely: establishing trust between entities, providing transparency
in data handling, protecting data across multiple control domains in forwarding scenarios, and
maintaining control over previously shared data.
Herein, three different policy enforcement models are proposed, namely: two based on previ-
ous research (Verifiable Privacy (VP) and Trusted Privacy (TP)), and one novel approach called
Mediated Privacy (MP). The VP provides privacy guarantees through remote software verification
methods attested by enhanced hardware solutions. The TP offers a similar design, but a different
trust framework, which relies on the combination of independent trust sources in order to provide
a quantification of trust. The MP is our novel proposed solution for privacy enforcement, that in-
troduces the concept of a mediated space, which serves as a platform for user data exchange.
We chose the Distributed Hash Table (DHT) data structure as our mediated space because of
its completely decentralized properties. Furthermore, we proposed the concept of a Business
Ring, which is used to define a mediated space, based on a business model. Business Rings
are employed to provide data sharing and hosting between Data Subjects and Data Controllers.
References of objects saved inside the ring are kept by both the DS and the DC, making future
interactions with previously shared data possible for both parties. Together with the Business
Ring, we also propose a stand-alone trust framework, which does not require the existence of a Trusted Third Party (TTP). The proposed trust framework quantifies trust based on the keyspace slice assigned to a particular node inside the Business Ring. The keyspace slice property relies on the collaboration of the crowd, thus providing a trust model that is based on a network of equal peers, rather than on TTPs.
The third proposed contribution was to evaluate the models sketched in the second contribution against a set of different criteria. All three models were found to exhibit privacy enhancing features satisfying our initial requirements. However, differences were pointed out between them. We found the MP the most suitable model for maintaining fine-grained control
over shared data. Evaluation on feasibility focused on the integration efforts that the enforcement
models would require. The VP proved to have the most demanding assumptions. Our comparison
on trust source offers a categorization of the models based on the traits of their trust framework.
Finally, the evaluation of vulnerabilities and weaknesses pointed out that all of the models can be subject to exploitation. However, exploitation efforts are considerably higher in the case of the VP and TP models than in the MP model.
The considered problem of user privacy protection spans a large research field. Our proposed solution only accounts for some privacy aspects, given the short time and limited scope of the thesis work. Likewise, our proposed models are only conceptual-level sketches, focusing on selected design aspects. Our evaluation, although not exhaustive, offers a basis for discussion on how privacy protection might be carried out in the future.
6.2 Future work
A more extensive design and evaluation of the proposed Mediated Privacy (MP) should be
considered. Research into security aspects that have not been covered in this work can contribute to developing the Mediated Privacy into a more complete model. The weak guarantees in the face of compromised nodes, pointed out in Section 5.4, could be accounted for with the use of the technologies proposed in the other two models. Investigating whether software verification and monitoring can become part of the MP is the main direction of future research.
Another proposal targets the evaluation of the sticky policy paradigm, which serves as the basis for every enforcement model. The main research topic would be to see how a strong technical bond can be achieved between a sticky policy and a data object, in order to prevent policies from being stripped off. Strategies can include research into digital watermarking and cryptographic solutions to achieve this strong bond.
Bibliography
[1] Balana, open source XACML implementation. URL https://svn.wso2.org/repos/wso2/trunk/commons/balana/.
[2] Open chord, open source chord implementation. URL http://open-chord.sourceforge.
net/.
[3] Trusted computing group. URL http://www.trustedcomputinggroup.org/developers/
glossary.
[4] OASIS eXtensible Access Control Markup Language (XACML) TC. URL https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xacml.
[5] Primelife, 2011. URL http://primelife.ercim.eu/.
[6] Trusted architecture for securely shared services, 2011. URL www.tas3.eu.
[7] T. Ali, M. Nauman, F.-e. Hadi, and F.B. Muhaya. On usage control of multimedia content
in and through cloud computing paradigm. In Future Information Technology (FutureTech),
2010 5th International Conference on, pages 1–5, May 2010. doi: 10.1109/FUTURETECH.
2010.5482751.
[8] Christer Andersson, Jan Camenisch, Stephen Crane, Simone Fischer-Hubner, Ronald
Leenes, Siani Pearson, John Soren Pettersson, and Dieter Sommer. Trust in PRIME.
In Proceedings of the Fifth IEEE International Symposium on Signal Processing and
Information Technology, 2005., pages 552–559. IEEE, 2005.
[9] David W. Chadwick and Stijn F. Lievens. Enforcing ”sticky” security policies throughout
a distributed application. In Proceedings of the 2008 Workshop on Middleware Security,
MidSec ’08, pages 1–6, New York, NY, USA, 2008. ACM. ISBN 978-1-60558-363-1. doi:
10.1145/1463342.1463343. URL http://doi.acm.org/10.1145/1463342.1463343.
[10] I. Ciuciu, Gang Zhao, D.W. Chadwick, Q. Reul, R. Meersman, C. Vasquez, M. Hibbert,
S. Winfield, and T. Kirkham. Ontology based interoperation for securely shared services:
Security concept matching for authorization policy interoperability. In New Technologies,
Mobility and Security (NTMS), 2011 4th IFIP International Conference on, pages 1–5, Feb
2011. doi: 10.1109/NTMS.2011.5721052.
[11] Drummond Reed and Joe Johnston (Connect.Me), and Scott David (K&L Gates). The personal network: A new trust model and business model for personal data. May 2011. URL http://blog.connect.me/whitepaper-the-personal-network/.
[12] Council of the European Union European Parliament. Directive 95/46/ec of the euro-
pean parliament and of the council of 24 october 1995 on the protection of individu-
als with regard to the processing of personal data and on the free movement of such
data, 1995. URL http://europa.eu/legislation_summaries/information_society/
data_protection/l14012_en.htm.
[13] Council of the European Union European Parliament. Commission proposes a comprehen-
sive reform of data protection rules to increase users’ control of their data and to cut costs
for businesses, 2012. URL http://europa.eu/rapid/press-release_IP-12-46_en.htm?
locale=en.
[14] Kaniz Fatema, David W. Chadwick, and Stijn Lievens. A multi-privacy policy enforcement
system. In Simone Fischer-Hubner, Penny Duquenoy, Marit Hansen, Ronald Leenes, and
Ge Zhang, editors, Privacy and Identity Management for Life, volume 352 of IFIP Advances
in Information and Communication Technology, pages 297–310. Springer Berlin Heidelberg,
2011. ISBN 978-3-642-20768-6. doi: 10.1007/978-3-642-20769-3 24. URL http://dx.
doi.org/10.1007/978-3-642-20769-3_24.
[15] Roxana Geambasu, Tadayoshi Kohno, Amit Levy, and Henry M. Levy. Vanish: Increasing
data privacy with self-destructing data. In Proc. of the 18th USENIX Security Symposium,
2009.
[16] Tyrone Grandison, Srivatsava Ranjit Ganta, Uri Braun, and James H. Kaufman. Protecting
privacy while sharing medical data between regional healthcare entities. In Klaus A. Kuhn,
James R. Warren, and Tze-Yun Leong, editors, MedInfo, Studies in Health Technology and
Informatics, pages 483–487. IOS Press. ISBN 978-1-58603-774-1.
[17] Johannes A. Buchmann. Internet Privacy: Options for adequate
realisation. Springer Berlin Heidelberg, 2013. URL http://link.springer.com/book/10.
1007/978-3-642-37913-0.
[18] L. Ibraimi, Q. Tang, P. H. Hartel, and W. Jonker. Exploring type-and-identity-based proxy
re-encryption scheme to securely manage personal health records. International Journal of
Computational Models and Algorithms in Medicine (IJCMAM), 1(2):1–21, 2010. ISSN 1947-
3133.
[19] Personal Inc. Personal data vault definitions, 2003. URL http://hub.
personaldataecosystem.org/wagn/Personal_Data_Vault.
[20] Steve Kenny and Larry Korba. Applying digital rights management systems to privacy rights
management. Computers & Security, 21(7):648 – 664, 2002. ISSN 0167-4048. doi: http://dx.
doi.org/10.1016/S0167-4048(02)01117-3. URL http://www.sciencedirect.com/science/
article/pii/S0167404802011173.
[21] T. Kirkham, S. Winfield, S. Ravet, and S. Kellomaki. The personal data store approach to
personal data security. Security & Privacy, IEEE, 11(5):12–19, Sept 2013. ISSN 1540-7993.
doi: 10.1109/MSP.2012.137.
[22] Gina Kounga and Liqun Chen. Enforcing sticky policies with tpm and virtualization. In Liqun
Chen, Moti Yung, and Liehuang Zhu, editors, Trusted Systems, volume 7222 of Lecture
Notes in Computer Science, pages 32–47. Springer Berlin Heidelberg, 2012. ISBN 978-
3-642-32297-6. doi: 10.1007/978-3-642-32298-3 3. URL http://dx.doi.org/10.1007/
978-3-642-32298-3_3.
[23] U.M. Mbanaso, G.S. Cooper, David Chadwick, and Anne Anderson. Obligations for pri-
vacy and confidentiality in distributed transactions. In Mieso K. Denko, Chi-sheng Shih,
Kuan-Ching Li, Shiao-Li Tsao, Qing-An Zeng, SooHyun Park, Young-Bae Ko, Shih-Hao
Hung, and JongHyuk Park, editors, Emerging Directions in Embedded and Ubiquitous
Computing, volume 4809 of Lecture Notes in Computer Science, pages 69–81. Springer
Berlin Heidelberg, 2007. ISBN 978-3-540-77089-3. doi: 10.1007/978-3-540-77090-9 7.
URL http://dx.doi.org/10.1007/978-3-540-77090-9_7.
[24] Ricardo Neisse, Alexander Pretschner, and Valentina Di Giacomo. A trustworthy usage
control enforcement framework. In ARES, pages 230–235. IEEE, 2011. ISBN 978-1-4577-
0979-1.
[25] Asmund Ahlmann Nyre. Usage control enforcement - a survey. In A Min Tjoa, Gerald Quirch-
mayr, Ilsun You, and Lida Xu, editors, Availability, Reliability and Security for Business,
Enterprise and Health Information Systems, volume 6908 of Lecture Notes in Computer
Science, pages 38–49. Springer Berlin Heidelberg, 2011. ISBN 978-3-642-23299-2. doi:
10.1007/978-3-642-23300-5 4. URL http://dx.doi.org/10.1007/978-3-642-23300-5_4.
[26] Information Commissioner’s Office. Key definitions of the data protection act. URL http:
//ico.org.uk/for_organisations/data_protection/the_guide/key_definitions.
[27] Jaehong Park and Ravi Sandhu. Towards usage control models: Beyond traditional access
control. In Proceedings of the Seventh ACM Symposium on Access Control Models and
Technologies, SACMAT ’02, pages 57–64, New York, NY, USA, 2002. ACM. ISBN 1-58113-
496-7. doi: 10.1145/507711.507722. URL http://doi.acm.org/10.1145/507711.507722.
[28] S. Pearson and Marco Casassa Mont. Sticky policies: An approach for managing privacy
across multiple parties. Computer, 44(9):60–68, Sept 2011. ISSN 0018-9162. doi: 10.1109/
MC.2011.225.
[29] Markus Sabadello. Startup technology report. 2012. doi: http://pde.cc/2012/08/str201201/.
[30] Ravi Sandhu and Xinwen Zhang. Peer-to-peer access control architecture using trusted
computing technology. In Proceedings of the Tenth ACM Symposium on Access Control
Models and Technologies, SACMAT ’05, pages 147–158, New York, NY, USA, 2005. ACM.
ISBN 1-59593-045-0. doi: 10.1145/1063979.1064005. URL http://doi.acm.org/10.1145/
1063979.1064005.
[31] D. Shah. Gossip Algorithms. Foundations and trends in networking. Now Publishers, 2009.
ISBN 9781601982360. URL http://books.google.ee/books?id=EVBoyrxHp_wC.
[32] Simone Fischer-Hubner and Jenny Nilsson (Karlstad University). Trust and assurance control – UI prototypes. PrimeLife, 2009. URL http://primelife.ercim.eu/images/stories/deliverables/d4.2.1-trust_and_assurance_ui_prototypes-public.pdf.
[33] Slim Trabelsi (SAP), Gregory Neven (IBM), and Dave Raggett (W3C). Report on design and implementation. PrimeLife, 2011. URL http://primelife.ercim.eu/images/stories/deliverables/d5.3.4-report_on_design_and_implementation-public.pdf.
[34] R. Steinmetz and K. Wehrle. Peer-to-Peer Systems and Applications. Lecture Notes in Com-
puter Science / Information Systems and Applications, incl. Internet/Web, and HCI. Springer,
2005. ISBN 9783540291923. URL http://books.google.ee/books?id=A8CLZ1FB4qoC.
[35] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord:
A scalable peer-to-peer lookup service for Internet applications. In Proceedings of the ACM
SIGCOMM ’01 Conference, San Diego, California, August 2001.
[36] Q. Tang. On using encryption techniques to enhance sticky policies enforcement. Techni-
cal Report TR-CTIT-08-64, Centre for Telematics and Information Technology University of
Twente, Enschede, 2008.
[37] Liang Wang and J. Kangasharju. Measuring large-scale distributed systems: case of bit-
torrent mainline dht. In Peer-to-Peer Computing (P2P), 2013 IEEE Thirteenth International
Conference on, pages 1–10, Sept 2013. doi: 10.1109/P2P.2013.6688697.