Issues in Designing a Confidentiality Preserving Model Server by Philip M Steel & Arnold Reznek.

Talk Outline

Background

Basic design

Description of operation

Confidentiality outline

Constraints on universe formation

Other constraints

Summary

Background

PUBLIC remote access to confidential data

Restriction of queries and responses rather than registering and monitoring the user

Current Population Survey (CPS): employment and economic well-being; demographic supplement

Software development by Synectics

HTML, MySQL, PHP to develop the query …

SAS as the statistical package run against the data

Risk Model for Microdata

• Intruder has access to record linkage software and identified data sources

• Disclosure occurs if the intruder is successful in linking his identified data with the published microdata

Risk Model for a Model Server

• Intruder has access to record linkage software and identified data sources

• Intruder uses model server to reconstruct microdata for both the variables overlapping his data sources and a sensitive variable

• Disclosure occurs if the intruder is successful in linking his identified data with the reconstructed microdata and has valid estimate of a sensitive characteristic or value

Basic Design Choice

Enable: Choose which functions will operate

– Must construct a friendly interface

– Limited to the procedures developed

– Safe from unknown code

Disable: Choose which functions will not operate

– User free to program within disabling constraints

– No limit on complexity

– Must be monitored (human, program or mix)

Operation

User visits web site, chooses data set, explores data, chooses geography, analysis type

User chooses population, constructs model, selects output

Web site constructs code to send behind firewall

Code checked and run against data at Census

Results checked and returned to user

Structure of Confidentiality Rules

Data preparation

Data exploration

Model universe formation

Model Statement

Model Output

Data exploration rules

Users may request tables for categorical variables and numeric recodes up to e1 dimensions. (start e1=4 including geo)

User may transform numeric recodes using a limited set of functions: log, root, square.
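
The two exploration rules above can be sketched as simple request checks. This is only an illustration, not the server's actual code: the function names are hypothetical, "root" is assumed to mean square root, and e1=4 is the starting value given on the slide.

```python
import math

E1 = 4  # starting value from the slide: max table dimensions, geography included

# Assumed whitelist of transforms; "root" is taken to mean square root here.
ALLOWED_TRANSFORMS = {
    "log": math.log,
    "root": math.sqrt,
    "square": lambda x: x * x,
}


def check_table_request(variables, e1=E1):
    """Return True if the requested table has between 1 and e1 distinct dimensions."""
    return 0 < len(set(variables)) <= e1


def apply_transform(name, values):
    """Apply a whitelisted transform to numeric values; refuse anything else."""
    if name not in ALLOWED_TRANSFORMS:
        raise ValueError(f"transform not permitted: {name}")
    return [ALLOWED_TRANSFORMS[name](v) for v in values]
```

Keeping the transforms in a fixed table mirrors the "Enable" design choice: only functions that were explicitly developed can run.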

Universe formation: Categorical Variables

Example: Hispanic heads of household with a college degree.

Conditions: X1=H,X2=1,X3=5 (table cell)

Implication: Data preparation must support safe lower dimensional tables

Universe formation rules: Categorical Variables

Limit on the number of categorical variables (u1=3)

Minimum on the size of the universe selected (u2=75)
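
A minimal sketch of the two categorical rules, assuming records are plain dictionaries and a universe is defined by a mapping of variable to required value. The function name and data layout are hypothetical; u1=3 and u2=75 are the values on the slide.

```python
U1 = 3   # max categorical variables in a universe definition (from the slide)
U2 = 75  # minimum universe size (from the slide)


def form_universe(records, conditions, u1=U1, u2=U2):
    """Select records matching all categorical conditions, or raise if a rule fails.

    records: list of dicts; conditions: dict of variable -> required value.
    """
    if len(conditions) > u1:
        raise ValueError(f"at most {u1} categorical variables may define a universe")
    universe = [
        r for r in records
        if all(r.get(var) == val for var, val in conditions.items())
    ]
    if len(universe) < u2:
        raise ValueError(f"universe must contain at least {u2} observations")
    return universe
```

For the example on the earlier slide, the call would look like `form_universe(records, {"X1": "H", "X2": 1, "X3": 5})`. A real interface would presumably avoid offering unsafe combinations in the first place rather than rejecting them after the fact.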

Universe Formation: Numeric Variables

Example: Families in poverty

Condition: Family income<18,500. Or Family income<18,501?

Implication: Rounding or pre-assigned cutpoints.
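
The concern behind the two nearly identical thresholds can be illustrated as a differencing attack: with free choice of thresholds, two queries isolate the record sitting between them. The incomes below are made-up values for illustration, and `snap_to_cutpoint` is a hypothetical helper showing one way pre-assigned cutpoints could remove the problem.

```python
def universe_size(incomes, threshold):
    """Number of families with income strictly below the threshold."""
    return sum(1 for x in incomes if x < threshold)


# Made-up incomes for illustration only.
incomes = [12_000, 18_500, 18_750, 22_000, 40_000]

# With free thresholds, two nearly identical queries differ by exactly one record.
isolated = universe_size(incomes, 18_501) - universe_size(incomes, 18_500)


def snap_to_cutpoint(threshold, cutpoints):
    """Map a requested threshold to the largest pre-assigned cutpoint not above it."""
    eligible = [c for c in cutpoints if c <= threshold]
    if not eligible:
        raise ValueError("threshold is below the lowest cutpoint")
    return max(eligible)
```

Once both requests are snapped to the same cutpoint, they define the same universe and the difference carries no information.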

Universe formation rules: Numeric variables

Users will select categorical variables first

Numeric variables can be used only at pre-assigned cutpoints.

The number of observations in the whole CPS universe between cutpoints shall be at least u3 for every numeric variable. (start u3=80)
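
One way cutpoints satisfying the u3 rule might be produced during data preparation is a greedy pass over the sorted values. This is a sketch under assumptions, not the authors' procedure: the greedy strategy, the function names, and the convention that a cutpoint c defines the condition "value < c" are all illustrative choices.

```python
from bisect import bisect_left


def choose_cutpoints(values, u3=80):
    """Greedily pick cutpoints so every interval between them holds >= u3 observations."""
    xs = sorted(values)
    cutpoints = []
    count = 0  # observations seen since the last cutpoint
    for i, x in enumerate(xs):
        count += 1
        is_last = i == len(xs) - 1
        # Only cut between distinct values, so tied observations stay together.
        if count >= u3 and not is_last and xs[i + 1] != x:
            cutpoints.append(xs[i + 1])
            count = 0
    # The top interval also needs u3 observations; merge it downward if it is short.
    if cutpoints and count < u3:
        cutpoints.pop()
    return cutpoints


def interval_counts(values, cutpoints):
    """Observation counts in each interval defined by the cutpoints (value < cutpoint)."""
    xs = sorted(values)
    edges = [0] + [bisect_left(xs, c) for c in cutpoints] + [len(xs)]
    return [b - a for a, b in zip(edges, edges[1:])]
```

Per the slide, the counts would be taken over the whole CPS universe for each numeric variable, not over a user-selected subset.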

Universe formation rules (cont)

If a cutpoint is used in universe formation then the difference in the size of the model universe obtained by incrementing the cutpoint up or down cannot be less than u4. (start u4=4)

The universe for the model must have at least u2 observations. (start u2=75)

There will be no cutpoints above the 97th percentile of nonzero points or the last half percentile of all points.
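
The u4 and u2 rules could be checked along the following lines. Several things here are assumptions: "incrementing the cutpoint up or down" is read as moving to the adjacent pre-assigned cutpoint, a cutpoint c is read as the condition "value < c", and the function names are hypothetical. The percentile cap is left out because the slide does not define it precisely enough to code.

```python
from bisect import bisect_left

U2 = 75  # minimum model universe size (from the slide)
U4 = 4   # minimum change in universe size between adjacent cutpoints (from the slide)


def size_below(sorted_values, cutpoint):
    """Number of values strictly below the cutpoint."""
    return bisect_left(sorted_values, cutpoint)


def cutpoint_is_usable(universe_values, cutpoints, index, u2=U2, u4=U4):
    """Check cutpoints[index] against u2 and u4 within an already-restricted universe.

    universe_values: the numeric variable for records that passed the
    categorical conditions; cutpoints: the sorted pre-assigned list.
    """
    xs = sorted(universe_values)
    size = size_below(xs, cutpoints[index])
    if size < u2:
        return False
    neighbours = []
    if index > 0:
        neighbours.append(cutpoints[index - 1])
    if index < len(cutpoints) - 1:
        neighbours.append(cutpoints[index + 1])
    # Moving one cutpoint up or down must change the universe by at least u4 records.
    return all(abs(size_below(xs, c) - size) >= u4 for c in neighbours)
```

The u4 check matters because cutpoints that are well separated in the full CPS file can still be nearly indistinguishable inside a small categorical universe.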

Model statement rules

At most m1 variables may be used in the model statement (start m1=20)

Dummy variables must distinguish at least m2 observations (start m2=20)

No interaction term may involve more than 4 variables. (m3=4)

No model involving 3 or more variables can be fully interacted. (m4=3)
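
A sketch of how the model statement rules might be checked, assuming a model is given as a list of terms, each term a tuple of variable names (a tuple of length greater than one is an interaction). The representation and function names are hypothetical. The m2 check uses one reading of "distinguish at least m2 observations", namely that both groups of a 0/1 dummy contain at least m2 records.

```python
from itertools import combinations

M1, M2, M3, M4 = 20, 20, 4, 3  # starting values from the slide


def is_fully_interacted(terms):
    """True if the terms include every main effect and every interaction of the variables."""
    term_sets = {frozenset(t) for t in terms}
    variables = sorted(set().union(*term_sets))
    if not variables or frozenset(variables) not in term_sets:
        return False  # the highest-order interaction is missing
    return all(
        frozenset(c) in term_sets
        for r in range(1, len(variables) + 1)
        for c in combinations(variables, r)
    )


def model_violations(terms, m1=M1, m3=M3, m4=M4):
    """Return the names of the rules (m1, m3, m4) that the model statement breaks."""
    term_sets = [frozenset(t) for t in terms]
    variables = set().union(*term_sets)
    problems = []
    if len(variables) > m1:
        problems.append("m1")
    if any(len(t) > m3 for t in term_sets):
        problems.append("m3")
    elif len(variables) >= m4 and is_fully_interacted(terms):
        problems.append("m4")
    return problems


def dummy_is_allowed(flags, m2=M2):
    """Assumed reading of m2: both the 1-group and the 0-group have >= m2 records."""
    ones = sum(1 for f in flags if f)
    return min(ones, len(flags) - ones) >= m2
```

A fully interacted model on categorical variables reproduces the cell means of the corresponding table, which is presumably why it is restricted alongside the table rules.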

Model Output

Residuals will be based on synthetic data

Limit on the number of significant digits?

R² cannot be 1?

Rules for other diagnostics

Synthetic Residuals

Users may see synthetic bar charts or distributions and synthetic 2-way plots.

Synthetic data must be generated from fixed random number starts and topcoded (and bottom coded where appropriate) at 4 standard deviations from the mean.
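
The fixed-start and topcoding requirements can be sketched as follows. The slide does not say how the synthetic values are drawn, so the normal distribution, the seed value, and the function name below are assumptions made purely for illustration.

```python
import random
import statistics


def synthetic_residuals(real_residuals, seed=12345, k=4.0):
    """Draw synthetic residuals and clip them at k standard deviations from the mean.

    Assumption: values are drawn from a normal distribution matched to the
    mean and standard deviation of the real residuals.
    """
    mu = statistics.mean(real_residuals)
    sd = statistics.stdev(real_residuals)
    rng = random.Random(seed)  # fixed start: repeated requests give identical output
    lo, hi = mu - k * sd, mu + k * sd
    # Bottom-code at lo and top-code at hi.
    return [min(max(rng.gauss(mu, sd), lo), hi) for _ in real_residuals]
```

A fixed random number start means a user cannot average many independent draws to recover the underlying values, since every request for the same model returns the same synthetic set.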

Data preparation

The topcode for numeric data needs to be calculated

Cutpoints must be determined

Separate lists of variables for exploration, universe formation, dependent and independent variables, model estimation

Standard recodes added

Inference from the collection of all 4-way categorical tables checked

Major Hurdles

Implementing facility for dummy variables

Presentation of geographic options

Implementing synthetic residuals

Architecture for differing variable roles

Future development

Relaxation of top codes

Implementation of model variance estimation (NSO weighting)

Introduction of new dataset

Introduction of new statistical procedures

Facility to add contextual data or merge files

Use of non-sampled data

Overview

• Avoids (as much as possible) tests which accept or reject a user's choice.

• Restricts the dimension of the data access.

• Has some flexibility in setting system confidentiality parameters.

• Changes the intruder model.

• Introduces a modification of k-anonymity.

My thanks to Jerry Reiter, Laura Zayatz and Stephen Wenck

http://204.52.186.190/

Contact: philip.m.steel@census.gov
