Data Masking using Enterprise Manager

Post on 02-Dec-2014

Data Masking using Enterprise Manager: Managing Sensitive Information in Non-Production Environments

Ofir Manor

Senior Technology Specialist, Oracle

ofir.manor@oracle.com

Agenda

• Introduction

• Data Masking Overview

• Data Masking Examples

• Related EM technology

Securing Production Environment

• In recent years, increasing attention has been given to securing the production environment:
  • Regulatory requirements (you know the list…)
  • Internet access everywhere (customers, partners)
  • Increasing threats
  • Increasing awareness of inside and outside threats

• Oracle Database has a lot of functionality for this. For example:
  • Authentication: Advanced Security Option (ASO)
  • Network traffic encryption: ASO
  • Data-at-rest encryption: ASO’s Transparent Data Encryption, Oracle Secure Backup
  • Access control: privileges, roles, VPD, Label Security…
  • Auditing: regular audit, Fine-Grained Audit, Oracle Audit Vault
  • Limiting “super users”: Oracle Database Vault

What About Other Environments?

• Important systems -> many environments
  • Pre-prod, test, dev, training

• Usually more than one of each type

• Sensitive information all over the place

• QA / dev can usually do anything in these environments.

• DBAs / sys admins can usually do anything in these environments

• Sometimes partners have full access to these environments (consultants, outsourcing dev / testing / monitoring etc)

• Are these environments audited?

• Do you practice careful access control?

What Can Be Done?

There are two options:

1. Invest heavily in securing all your database environments

  • Adds IT administrative overhead: auditing, privilege management, etc.
  • Annoys QA / dev (“not fun”)
  • Will always be a lower priority
  • Might be neglected or worked around over time

2. Make sure no sensitive data reaches these environments

  • Mask the data while provisioning these environments
  • Sensitive data cannot leak if it’s not there
  • An elegant, compliant solution

What is data masking?

What

• The act of anonymizing customer, financial, or company confidential data to create new, legible data which retains the data's properties, such as its width, type, and format.

Why

• To protect confidential data in test environments when the data is used by developers or offshore vendors

• To share customer data with 3rd parties without revealing personally identifiable information

Original data:

LAST_NAME   SSN          SALARY
AGUILAR     203-33-3234  40,000
BENSON      323-22-2943  60,000
D’SOUZA     989-22-2403  80,000
FIORANO     093-44-3823  45,000

Masked data:

LAST_NAME   SSN          SALARY
ANSKEKSL    111-23-1111  40,000
BKJHHEIEDK  111-34-1345  60,000
KDDEHLHESA  111-97-2749  80,000
FPENZXIEK   111-49-3849  45,000
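As a rough illustration of masking that keeps width, type, and format, the sketch below replaces letters with random letters and digits with random digits, leaving punctuation and length untouched. It is a minimal sketch, not how the Data Masking Pack works internally; the function names `mask_name` and `mask_ssn` are hypothetical.

```python
import random
import string

def mask_name(name: str, rng: random.Random) -> str:
    """Replace every letter with a random uppercase letter; keep length and punctuation."""
    return "".join(
        rng.choice(string.ascii_uppercase) if ch.isalpha() else ch for ch in name
    )

def mask_ssn(ssn: str, rng: random.Random) -> str:
    """Replace every digit with a random digit; keep the dashes where they are."""
    return "".join(rng.choice(string.digits) if ch.isdigit() else ch for ch in ssn)

# Sample rows taken from the slide; salary is left unmasked, as in the slide.
rows = [
    ("AGUILAR", "203-33-3234", 40000),
    ("BENSON", "323-22-2943", 60000),
    ("D'SOUZA", "989-22-2403", 80000),
]

rng = random.Random(42)  # fixed seed only to make the sketch repeatable
masked = [(mask_name(n, rng), mask_ssn(s, rng), sal) for n, s, sal in rows]

for row in masked:
    print(row)
```

Because only the characters change, the masked values still fit the original column widths and pass the same format checks in the application.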

Major features

• Data mask format library

• Define once; execute multiple times

• View sample data before masking

• Automatic database referential integrity when masking primary keys
  • Implicit: database enforced
  • Explicit: application enforced

• Installed as part of Oracle Enterprise Manager (Grid Control) 10g Release 4 (10.2.0.4)

[Diagram: Enterprise Manager Data Masking Pack. Production is cloned to Staging, the data is masked in Staging, and Staging is then cloned to the Test environments.]

Format Libraries

• Mask Primitives
  • Random Number
  • Random String
  • Random Date within range
  • Shuffle
  • Sub string of original value
  • Table Column
  • User Defined Function

• National Identifiers

• Social Security Numbers

• Credit Card Numbers
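The primitives listed above can be pictured as small value generators. The following is a minimal sketch in Python of four of them (random number, random string, random date within a range, and substring of the original value). All function names and parameters are assumptions made for illustration; they are not the Data Masking Pack's API.

```python
import random
import string
from datetime import date, timedelta

def random_number(rng: random.Random, low: int, high: int) -> int:
    """Random integer in the inclusive range [low, high]."""
    return rng.randint(low, high)

def random_string(rng: random.Random, min_len: int, max_len: int) -> str:
    """Random uppercase string whose length is between min_len and max_len."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choices(string.ascii_uppercase, k=length))

def random_date(rng: random.Random, start: date, end: date) -> date:
    """Random date between start and end, both inclusive."""
    return start + timedelta(days=rng.randint(0, (end - start).days))

def substring(value: str, start: int, length: int) -> str:
    """Keep only part of the original value (start is 0-based here)."""
    return value[start:start + length]

rng = random.Random(7)  # fixed seed only to make the sketch repeatable
print(random_number(rng, 30000, 90000))
print(random_string(rng, 6, 10))
print(random_date(rng, date(1950, 1, 1), date(2000, 12, 31)))
print(substring("203-33-3234", 7, 4))
```

A format in the library would then combine such primitives, for example a fixed prefix followed by random digits.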

Example – Create a New Format

User-defined mask formats

Email notification testing

Masking Definitions

• Associates formats with database
  • Maps formats to table columns being masked
  • Defines dependent columns

• Associated Database target

• Automatically identifies Foreign key relationships

• Can specify undeclared constraints as related columns

• Import-from or export-to XML

• “Create like” to apply to similar databases

Referential Integrity Enforcement

Database-enforced

Application-enforced
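One way to picture referential integrity during masking: build a single old-to-new mapping for the parent key and apply the same mapping to every dependent column, whether the relationship is declared in the database or only known to the application. The sketch below is an illustration under that assumption; the table and column names (`customers`, `orders`, `cust_id`) and the helper `build_key_map` are hypothetical.

```python
import random

def build_key_map(keys, rng: random.Random) -> dict:
    """Map each distinct original key to a distinct random 6-digit masked key."""
    distinct = sorted(set(keys))
    masked = rng.sample(range(100000, 1000000), len(distinct))  # no repeats
    return dict(zip(distinct, masked))

# Hypothetical parent and child tables.
customers = [{"cust_id": 1, "name": "AGUILAR"}, {"cust_id": 2, "name": "BENSON"}]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 2},
    {"order_id": 12, "cust_id": 1},
]

rng = random.Random(3)  # fixed seed only to make the sketch repeatable
key_map = build_key_map([c["cust_id"] for c in customers], rng)

# The same mapping is applied to the parent key and to the dependent column.
masked_customers = [{**c, "cust_id": key_map[c["cust_id"]]} for c in customers]
masked_orders = [{**o, "cust_id": key_map[o["cust_id"]]} for o in orders]

print(masked_customers)
print(masked_orders)
```

Since both tables go through the same mapping, every masked order still points at an existing masked customer.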

Pre-Masking Validation

• Ensure uniqueness can be maintained

• Ensure formats match column data types

• Check space availability

• Warn about check constraints

• Check presence of default partitions
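Two of the checks above come down to simple arithmetic. A format can only keep a column unique if it can produce at least as many distinct values as the column holds, and a generated value must fit the column width. The sketch below illustrates both under those assumptions; the function names are hypothetical and do not correspond to the product's actual validation code.

```python
def random_digits_capacity(num_digits: int) -> int:
    """Number of distinct values a 'random digits' format of this width can produce."""
    return 10 ** num_digits

def can_stay_unique(distinct_values: int, format_capacity: int) -> bool:
    """Uniqueness can be kept only if the format has at least as many possible values."""
    return format_capacity >= distinct_values

def fits_column(max_generated_len: int, column_width: int) -> bool:
    """A generated value must not overflow the target column."""
    return max_generated_len <= column_width

# A unique column with 5,000 rows cannot be masked with 3 random digits
# (only 1,000 possible values), but 6 random digits are enough.
print(can_stay_unique(5000, random_digits_capacity(3)))
print(can_stay_unique(5000, random_digits_capacity(6)))
print(fits_column(max_generated_len=12, column_width=10))
```

Running such checks before the masking job starts avoids discovering the problem halfway through a long bulk operation.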

Masking Workflow

[Diagram: masking workflow with two swimlanes, Security Admin and DBA, across the Prod, Staging, and Test environments. Labelled steps: Identify Sensitive Information, Identify Data Formats, Format Library, Masking Definition, Clone Prod to Staging, Review Mask Definition, Execute Mask, Clone Staging to Test.]

Performance

• Optimizations
  • SQL Parallelism for tables > 1 million rows
  • Statistics collection before & after masking
  • CTAS statement with NOLOGGING

• Test results
  • Case 1
    • 60GB Database
    • 100 tables, 215 columns
    • 20 mins
  • Case 2
    • 6 column, 100 million row table
    • Random Number
    • 1.3 hours

Data Masking Pack feature details

• Data Masking primitives
  • Random numbers
  • Random digits
  • Random strings
  • Random date
  • User defined function (PL/SQL)
  • Exportable and importable format definition (XML-based)

• Masking algorithms
  • Unique value generation
  • Shuffle
  • Constant

• Mask definition
  • Association of masking formats with application schema
  • Related application columns without defined constraints in data dictionary
  • Exportable and importable XML mask definitions
  • “Create Like” to apply mask definition to other databases

• Validation
  • Mask validation with data type
  • Data overflow validation
  • Multiple parent FKs, circular dependency, constraints
  • Automatic exclusion of CLOB, BLOB, NCLOB, LONG, LONG RAW, XML column types
  • Imported mask definition validated against database schema
  • Space availability check

• Efficiency
  • One bulk operation per table regardless of number of masked columns
  • CTAS to recreate masked table
  • Leverage database features, e.g. parallelism, no logging.
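To make the "one bulk operation per table" idea concrete, the sketch below builds a single CREATE TABLE ... AS SELECT (CTAS) statement that masks several columns in one pass. It is only an illustration: the SQL actually generated by the Data Masking Pack is not shown in the slides, and the helper `build_ctas`, the `_masked` suffix, and the `employees` table are assumptions. The mask expressions use Oracle's `DBMS_RANDOM` package as a stand-in for real formats.

```python
def build_ctas(table: str, columns: list, mask_exprs: dict, parallel: bool = True) -> str:
    """Build one CTAS statement that rewrites every masked column in a single pass."""
    select_list = ",\n  ".join(
        f"{mask_exprs[col]} AS {col}" if col in mask_exprs else col
        for col in columns
    )
    options = "NOLOGGING" + (" PARALLEL" if parallel else "")
    return (
        f"CREATE TABLE {table}_masked {options} AS\n"
        f"SELECT\n  {select_list}\n"
        f"FROM {table}"
    )

# Hypothetical table: two masked columns, one untouched column.
sql = build_ctas(
    "employees",
    ["emp_id", "last_name", "salary"],
    {
        "last_name": "DBMS_RANDOM.STRING('U', 10)",
        "salary": "TRUNC(DBMS_RANDOM.VALUE(30000, 90000))",
    },
)
print(sql)
```

However many columns are masked, the table is read and written once, which is where most of the time saving over column-by-column updates would come from.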

Handling First / Last Name

• Using Shuffle
  • Useful if first name and last name are in different columns
  • Preserves real values and real data distribution
  • Bigger data sets minimize leak risk

• Using Random Strings
  • Really random
  • Not real names, different data distribution

• Using a table based lookup
  • Example: fakenamegenerator.com
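The shuffle and lookup options can be sketched as follows. Shuffling each name column independently keeps the real values and their distribution but breaks the link between a first name and a last name; a lookup draws replacements from a separate table of fake names. This is a minimal sketch with hypothetical helper names and made-up sample data, not the product's implementation.

```python
import random

def shuffle_column(values, rng: random.Random) -> list:
    """Return the same values in a random order (the original list is not modified)."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

def lookup_names(count: int, fake_names, rng: random.Random) -> list:
    """Pick replacement values from a lookup table of fake names."""
    return [rng.choice(fake_names) for _ in range(count)]

first_names = ["DANA", "YOSSI", "MAYA", "AVI"]
last_names = ["AGUILAR", "BENSON", "FIORANO", "COHEN"]

rng = random.Random(11)  # fixed seed only to make the sketch repeatable

# Shuffle each column on its own, so first/last name pairs are broken up.
masked_first = shuffle_column(first_names, rng)
masked_last = shuffle_column(last_names, rng)
print(list(zip(masked_first, masked_last)))

# Table based lookup: replacements come from a separate list of fake names.
fake_last_names = ["SMITH", "JONES", "BROWN"]
print(lookup_names(len(last_names), fake_last_names, rng))
```

With only four rows a shuffle reveals a lot, which is the point of the "bigger data sets minimize leak risk" bullet.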

Israeli ID Number

• Israeli ID Number uses a check digit
  • IsraCard, Mastercard etc. also use some kind of check digit

• The check digit protects against:
  • A one-digit error
  • Two adjacent digits swapped

• The algorithm is well documented
  • Easy to write a function to do it

Israeli ID Number Algorithm
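The algorithm slides themselves are images and did not survive in this transcript. The sketch below follows the commonly documented scheme for 9-digit Israeli ID numbers: multiply the digits alternately by 1 and 2 from the left, add up the digits of each product, and require the total to be a multiple of 10. Treat it as an assumption about what the slides showed; the function names are hypothetical.

```python
import random

def check_digit(first8: str) -> int:
    """Compute the 9th (check) digit for the first 8 digits of an ID number."""
    if len(first8) != 8 or not first8.isdigit():
        raise ValueError("expected exactly 8 digits")
    total = 0
    for i, ch in enumerate(first8):
        product = int(ch) * (1 if i % 2 == 0 else 2)  # weights 1,2,1,2,...
        total += product // 10 + product % 10          # sum the digits of the product
    return (10 - total % 10) % 10

def is_valid_id(id_number: str) -> bool:
    """True if the 9-digit number carries the correct check digit."""
    return (
        len(id_number) == 9
        and id_number.isdigit()
        and check_digit(id_number[:8]) == int(id_number[8])
    )

def random_masked_id(rng: random.Random) -> str:
    """Generate a random 9-digit number that still passes the check-digit test."""
    first8 = f"{rng.randrange(10 ** 8):08d}"
    return first8 + str(check_digit(first8))

print(is_valid_id("123456782"))
print(is_valid_id("123456783"))
print(random_masked_id(random.Random(5)))
```

A masking format built this way produces values that the application's own ID validation will still accept, which is the reason to wrap it in a user-defined function.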

Israeli ID Number - Format

Q&A