Privacy and security for medical applications
Or: taking responsibility for privacy protection
Guido van 't Noordende, System and Network Engineering, University of Amsterdam
Our future
Dutch Data Hub
Buzzwords / trends
Big Data: extraction of medical data and 'anonymized' data processing for medical / policy research - Examples: Mondriaan / IT-pharma; Dutch medical registration, ...
Open data - NWO Open Data
Using the Grid and clouds for external data processing - Job/VM submissions from AMC to LifeScience Grid, SARA HPC clusters, Amazon EC2, HPC clouds, ...
Combining external data storage with data processing - Example: Dutch Health Hub
Distributed infrastructures for clinical health exchange - Example: Dutch electronic patient record
Our past?
Hippocratic oath (~400 BC):
“Whatever comes to my knowledge in the practice of my profession or in daily dealings with people, and which ought not to be spread abroad, I will keep secret and reveal to no one.”
Obsolete?
Maybe medical privacy is not so obsolete.
- HHS study, US, 2001: 8% of patients avoid care in the (early) stages of disease for fear of privacy breaches or stigma
- 2005 National Consumer Health Privacy Survey
“One out of eight consumers has put their health at risk by engaging in such behaviors as: avoiding their regular doctor, asking their doctor to fudge a diagnosis, paying for a test because they didn’t want to submit a claim, or avoiding a test altogether. Chronically ill, younger, and racial and ethnic minority respondents are more likely than average to practice one or more of these risky behaviors.”
(Note: medical data processing forbidden, unless...)
Medical data processing – possible information flows
Doctor's DOSSIER
Secondary use / research (medical or policy / statistical)
Other health professionals (Health Information Exchange systems)
(treatment team)
Consent or anonymization
Medical confidentiality
Direct communication w/doctors is allowed. For wider access, explicit permission needed
external (hosted) storage
Medical data is considered specially sensitive data under EU data protection legislation. Processing is forbidden by default; security / protection is critical
The fence
(outsourced) storage of medical data (doctor's dossier)
GP / pharmacy systems:
PharmaPartners (approx. 8 million records)
CGM - EuroNED / Microbais
OmniHIS Scipio
Microhis / Tetra / ...
Records centrally stored by IT provider / Encryption of back-end storage system?
Hospital systems: PACS for radiological scans, etc.
Outsourcing / “the cloud”?
Who is responsible for keeping patient's data secure?
GP data registrations for research
Anonymous data?
An IT vendor with a sense of responsibility, in this case.
In fact, only a few records were used here. The goal was to check the implementation of an automated check on the quality of data registration.
But the message is: the data can hardly be called anonymous. We're talking about nearly the whole patient record, sometimes with and sometimes without free text.
Proportionality / data minimization?
Data collection for policy research
LINH dataset
For clarity, it should be noted that the Network's data collection contains no data traceable to individual persons and therefore falls outside the scope of the draft law on personal data registration.
When patient files are automated, only a limited number of data items are included in the data collection. These are, besides an anonymous code number, data on age, sex and type of insurance.
The categories of data registered for each contact between patient and GP / practice assistant / GP in training are:
1. patient data (date of birth, sex, type of insurance);
2. contact data (evening/weekend shift, type of contact, who initiated the contact, nature of the contact, etc.);
3. data on complaints and diagnosis / working hypotheses;
4. data on diagnostic procedures (clinical diagnostics, blood tests, urine, reason for diagnostics, blood chemistry, etc.);
5. treatment data (type and nature of the treatment, vaccination);
6. prescription data (drug, quantity, dose per day);
7. referral data (including admission): (medical specialty, paramedics, who initiated the referral);
8. data on any consultation following the contact: (with whom and for what purpose);
LINH dataset + DIS/Vectis, linked on date of birth, treatment, treatment date: PC4 + date of birth + sex: 80.8% uniquely identifiable. Figure: M. Koot et al., HotPETs, 2010
Anonymity?
Approximately 99.4% of a sample of the Dutch population is unambiguously identifiable using PC6 postal code, gender and date of birth, and 67.0% by PC4 and date of birth alone.
… and we haven't even discussed including other identifying data yet.
[ref. Matthijs Koot et al., 2010]
Latanya Sweeney got similar results in the US around 2000 when linking medical data with the Massachusetts voter list: ZIP / sex / DoB = 'pseudo ID', allowing recombination
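The uniqueness figures above come from counting how many people share each combination of quasi-identifiers. A minimal sketch of that count, assuming records are plain dictionaries; the field names (`pc4`, `dob`, `sex`) and the five toy records are made up for illustration and are not taken from the cited studies:

```python
from collections import Counter

def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination occurs exactly once."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)

# Made-up toy records (hypothetical, not real data).
records = [
    {"pc4": "1012", "dob": "1970-01-01", "sex": "F"},
    {"pc4": "1012", "dob": "1970-01-01", "sex": "M"},
    {"pc4": "1012", "dob": "1982-05-17", "sex": "M"},
    {"pc4": "1012", "dob": "1982-05-17", "sex": "M"},
    {"pc4": "3511", "dob": "1990-11-30", "sex": "F"},
]

# Each added column can only keep or raise the share of unique records.
for columns in (["pc4"], ["pc4", "dob"], ["pc4", "dob", "sex"]):
    print(columns, unique_fraction(records, columns))
```

On this toy data the unique share rises from 0.2 (postal code only) to 0.6 (postal code, date of birth and sex), which is the same effect, in miniature, as the percentages quoted above.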
Privacy barometer
PhD thesis work by Matthijs Koot on microdata set anonymity and re-identifiability, UvA 2012
Theoretical assessment of the uniqueness of subjects in to-be-combined datasets, based on (known or estimated) distributions (e.g., age, height, ...) within columns
Even with a relatively small number of columns, the re-identification probability will likely be high
We must think about risk mitigation: encrypt columns, distribute keys with strict key management protocols, etc.
Legally, you cannot (re)combine disparately collected microdata without consent if the (re)combination crosses some re-identification threshold.
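To make the idea of a theoretical uniqueness estimate concrete, here is a rough sketch under strong assumptions that are mine and not necessarily those of the thesis: the columns are independent, so their combination can be treated as one set of bins with known probabilities, and people fall into bins independently. A person in a bin with probability p is then alone in it with probability (1 - p)^(N - 1). The function names and the example numbers (4,000 people in one postal area, 80 birth years) are hypothetical.

```python
def expected_unique_fraction(bin_probabilities, population_size):
    """Expected share of people who are alone in their bin.

    Assumes each person lands in bin i with probability p_i,
    independently of everyone else (an illustrative assumption).
    """
    return sum(
        p * (1.0 - p) ** (population_size - 1)
        for p in bin_probabilities
        if p > 0
    )

def uniform_bins(k):
    """k equally likely bins, e.g. all (date of birth, sex) combinations."""
    return [1.0 / k] * k

# Hypothetical example: 4,000 residents of one postal area,
# 80 birth years x 365 days x 2 sexes, all treated as equally likely.
bins = uniform_bins(80 * 365 * 2)
print(expected_unique_fraction(bins, 4000))

# Sex alone (2 bins) identifies practically nobody in the same population.
print(expected_unique_fraction(uniform_bins(2), 4000))
```

Real distributions are skewed rather than uniform, so this only shows the shape of the calculation: a few ordinary columns are enough to make most people in a small area unique.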
Trusted Third Parties – consent and anonymization and the Recombination Loophole
Source A
Source B
TTP (recombine with 1 key/pseudonym)
Recombined data set
Risk assessment? Terms of contract?
Recombination loophole
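The loophole can be sketched in a few lines. This is an illustration only, assuming the TTP derives pseudonyms with a keyed hash (HMAC-SHA256) and joins the two sources on that pseudonym; the slides do not say how any actual TTP implements this, and the key, field names and records below are hypothetical.

```python
import hashlib
import hmac

def pseudonym(identifier: str, key: bytes) -> str:
    """Keyed hash of a direct identifier; the same key gives the same pseudonym."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

def pseudonymize(rows, key, id_field="patient_id"):
    """Replace the direct identifier in each row by its pseudonym."""
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != id_field}
        new_row["pseudonym"] = pseudonym(row[id_field], key)
        result.append(new_row)
    return result

def recombine(rows_a, rows_b):
    """Join two pseudonymized sources on the shared pseudonym."""
    by_pseudonym = {r["pseudonym"]: r for r in rows_b}
    return [
        {**r, **by_pseudonym[r["pseudonym"]]}
        for r in rows_a
        if r["pseudonym"] in by_pseudonym
    ]

TTP_KEY = b"demo-key-not-for-real-use"  # hypothetical key held by the TTP

source_a = [{"patient_id": "P1", "pc4": "1012", "dob": "1970-01-01"}]
source_b = [{"patient_id": "P1", "sex": "F", "diagnosis": "example"}]

combined = recombine(pseudonymize(source_a, TTP_KEY),
                     pseudonymize(source_b, TTP_KEY))
print(sorted(combined[0].keys()))
```

Neither source released a name or patient number, yet the recombined row now carries postal code, date of birth and sex next to the diagnosis: exactly the quasi-identifier combination that makes re-identification likely.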
Tracking data and assessing risk / who takes responsibility?
Source A
Source B
TTP (recombine with 1 key/pseudonym)
Recombined data set
(Possible other data)
audit?
Is responsible party in control – or even aware?
Mondriaan – third party recombination loophole
“In the Netherlands we have various good data files with information about health care. But these files are separate from one another. The data are scattered, hard to access and not always complete. As a result, research into the use of medicines and their effects in daily practice is also costly and time-consuming.
In the Mondriaan project, data files from care institutions, health insurers and GP networks are linked to one another via a “Trusted Third Party”. This means that the medical and research data are separated from personal data such as name and address. These data are not recorded in the databases linked by Mondriaan and therefore do not become known to the researchers. This optimally protects patients' privacy.”
At a minimum: tracking and Transparency Enhancing Tools (TETs)
Source A
Source B
TTP (recombine with 1 key/pseudonym)
Recombined data set
Data subject
(Possible other data)
Data-owner centric tools
Hospitals and the cloud: combining external storage with data processing?
Dutch Data Hub
Legal loopholes – new EU data protection regulation?
Medical researcher
BioMedical research data – MRI scans – DNA data – Grids and clouds?
Can analysis be done externally?
censorship
Controller is medical researcher in hospital
In control, but... how much control?
Control where data goes
Assess trustworthiness of Grid nodes / cloud VMs
“declarative” descriptions / host property definitions
Microcontracts
Provenance / auditing
Data processing policies for distributed systems
N.D. Jebessa PhD work @ UvA
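A declarative data processing policy of the kind listed above could look roughly like the following. This is a sketch of the general idea only, not the mechanism from the PhD work mentioned; the property names (`jurisdiction`, `disk_encryption`, `operator`) and the node descriptions are hypothetical.

```python
def node_allowed(policy, node):
    """True if every property the policy names has an allowed value on the node.

    A property the node does not declare counts as not satisfied.
    """
    return all(node.get(prop) in allowed for prop, allowed in policy.items())

def select_nodes(policy, nodes):
    """Names of the nodes the controller's policy permits for this data."""
    return [n["name"] for n in nodes if node_allowed(policy, n)]

# Policy set by the data controller (the medical researcher), hypothetical values.
policy = {
    "jurisdiction": {"NL", "EU"},
    "disk_encryption": {True},
    "operator": {"hospital", "national-hpc"},
}

# Host property definitions as declared by Grid nodes / cloud VMs (hypothetical).
nodes = [
    {"name": "grid-node-1", "jurisdiction": "NL", "disk_encryption": True,
     "operator": "national-hpc"},
    {"name": "cloud-vm-7", "jurisdiction": "US", "disk_encryption": True,
     "operator": "commercial"},
    {"name": "grid-node-2", "jurisdiction": "NL", "operator": "hospital"},
]

print(select_nodes(policy, nodes))
```

Declared properties still have to be trusted or verified, which is where the microcontracts and provenance / auditing items come in; the check itself only decides where a job may be sent.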
Clinical health information exchange systems
Dutch electronic patient record: a pull model in which doctors can retrieve information from other doctors' systems
Protection: smartcard(s) and logging. Where is control? Access control policies?
Can the data controller still take responsibility for, and control, data flows, particularly for sensitive medical data? Where can the data subject go when things go wrong? What about transparency?
Big Data: 'anonymized' data (or identifiable data with consent) are no longer controlled by the data subject or by the party who collected them; we need transparency, proactive risk assessments and risk mitigation
Open data - NWO Open Data: NWO manages and 'owns' the data, not the researcher who collected it
Using the Grid and clouds for external data processing - The data owner (controller) is in control, but where does the data go?
Combining external data storage with data processing - How easily may data be extracted in the future (EU Regulation changes)?
Distributed infrastructures for clinical health exchange - Dutch EPD: who controls policies and access to clinical health data? How much control does a doctor have?
Core question: Can we take responsibility, or do we need to throw data over the fence?
(Is “trust” what we need - because we can't control things?) We need to take responsibility and take control. Do we outsource this, or will we get control back?
Legal obligation to audit, assess, safeguard, and manage transparency and consent for data flows?
- Examples: data transfer auditing throughout the data's lifetime / lifecycle provenance, active risk assessments before disclosing data, transparency
- Need to ask consent again when changing goals or when re-identifiability increases
- Transparency-enhancing or data-subject-centric tools place control with the data subject
Ensure designs and architecture allow control by the person(s) who is (are) responsible; medical data: data controller = doctor, medical researcher. (And don't forget the patient - if possible)
How to move forward?
Construct transparency-enhancing tools and consent management tools to ensure traceability of data flows, to increase control, and to let the data subject assess and prevent recombination loopholes
Don't throw data over the fence: audit, and control – throughout the data's lifetime
Enable active 'privacy risk assessments' or 'security risk assessments' at the source - before releasing or processing data
Ensure access control can be managed by the person who is responsible (controller, e.g., the doctor) in a fine-grained manner
We need much more critical thinking in the systems design phase: avoid jumping on the bandwagon with naive or disruptive design or policy decisions that cut responsible parties out of the loop.
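As an illustration of the access control and auditing points above, here is a minimal sketch in which only the controller (for example the doctor) can grant access, per section of a dossier, and every request is logged whether or not it succeeds. The class, its fields and the section names are hypothetical and do not describe any existing system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Dossier:
    controller: str                               # the responsible doctor
    grants: set = field(default_factory=set)      # (requester, section) pairs
    audit_log: list = field(default_factory=list)

    def grant(self, actor, requester, section):
        """Only the controller may extend access, one section at a time."""
        if actor != self.controller:
            raise PermissionError("only the controller may change access")
        self.grants.add((requester, section))

    def request(self, requester, section):
        """Decide on an access request and record it in the audit log."""
        allowed = (requester == self.controller
                   or (requester, section) in self.grants)
        timestamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((timestamp, requester, section, allowed))
        return allowed

dossier = Dossier(controller="dr_a")
dossier.grant("dr_a", "dr_b", "medication")

print(dossier.request("dr_b", "medication"))   # granted by the controller
print(dossier.request("dr_b", "psychiatry"))   # never granted
print(len(dossier.audit_log))                  # both requests are logged
```

Keeping the log next to the data means denied requests are also visible to the controller, and the same log could later be shown to the data subject as a simple transparency tool.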