ISACA New Delhi, India: Privacy and Big Data
Bridging the Gap Between Privacy and Big Data
Ulf Mattsson , CTO
Protegrity
ulf.mattsson AT protegrity.com
20 years with IBM
• Research & Development & Global Services
Inventor
• Encryption, Tokenization & Intrusion Prevention
Involvement
• PCI Security Standards Council (PCI SSC)
• American National Standards Institute (ANSI) X9
• Encryption & Tokenization
• International Federation for Information Processing (IFIP) WG 11.3 Data and Application Security
• ISACA New York Metro chapter
Agenda
1. What is Big Data & Cloud?
2. Risk & Drivers for Data Security
3. The Evolution of Data Security Methods
4. Data De-Identification
5. Off-Shoring & Outsourcing
6. Use Cases & Case Studies
Who is Protegrity?
Proven enterprise data protection software leader since the 1990s.
Business driven by compliance
• PCI (Payment Card Industry)
• PII (Personally Identifiable Information)
• PHI (Protected Health Information) – HIPAA
• State and Industry Privacy Laws
Servicing many Industries
• Retail, Hospitality, Travel and Transportation
• Financial Services, Insurance, Banking
• Healthcare
• Telecommunications, Media and Entertainment
• Manufacturing and Government
Big Data
What is Big Data?
Hadoop
• Designed to handle the emerging “4 V’s”
• Massively Parallel Processing (MPP)
• Elastic scale
• Usually Read-Only
• Allows for data insights on massive, heterogeneous data sets
• Includes an ecosystem of components:
[Diagram: Hive, Pig, Other, MapReduce, HDFS and Physical Storage, grouped into Application Layers and Storage Layers]
Has Your Organization Already Invested in Big Data?
Source: Gartner
Cloud
Cloud Services
Services usually provided by a third party
• Can be virtual, public, private, or hybrid
Increasing adoption: up 12% from 2012*
Often an outsourced solution, sometimes cross-border
Allows for greater accessibility of data and low overhead
*Source: GigaOM
Cloud Services and Models
Source: NIST, CSA
Drivers for Data Security
Regulations & Laws
• Payment Card Industry Data Security Standard (PCI DSS)
• National Privacy Laws
• Cross-Border & Outsourcing Privacy Laws
Expanding Threat Landscape
• Hackers & APT
• Internal Threats & Rogue Privileged Users
• Excessive Privilege or Security Negligence
Sensitive Data Insight & Usability
• Unprotected Sensitive or Restricted Data is Unusable for Marketing, Monetization, Outsourcing, etc.
Vulnerabilities in Emerging Technologies
Regulations & Laws
PCI DSS
PCI Security Standards Council
Founded in 2006 by five major payment card brands (American Express, Discover, JCB, MasterCard and Visa)
Each card brand enforcement program issues fines, fees and schedule deadlines
• Visa's Cardholder Information Security Program (CISP): http://www.visa.com/cisp
• MasterCard's Site Data Protection (SDP) program: http://www.mastercard.com/us/sdp/index.html
• Discover's Discover Information Security and Compliance (DISC) program: http://www.discovernetwork.com/fraudsecurity/disc.html
• American Express Data Security Operating Policy (DSOP): http://www.americanexpress.com/datasecurity
PCI DSS
Build and maintain a secure network.
1. Install and maintain a firewall configuration to protect data
2. Do not use vendor-supplied defaults for system passwords and other security parameters
Protect cardholder data.
3. Protect stored data
4. Encrypt transmission of cardholder data and sensitive information across public networks
Maintain a vulnerability management program.
5. Use and regularly update anti-virus software
6. Develop and maintain secure systems and applications
Implement strong access control measures.
7. Restrict access to data by business need-to-know
8. Assign a unique ID to each person with computer access
9. Restrict physical access to cardholder data
Regularly monitor and test networks.
10. Track and monitor all access to network resources and cardholder data
11. Regularly test security systems and processes
Maintain an information security policy.
12. Maintain a policy that addresses information security
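As an illustration of requirement 3 (protect stored data), the sketch below masks a card number for display. It assumes the common rule of showing at most the first six and last four digits; the function name `mask_pan` and its parameters are hypothetical and not taken from the slides.

```python
def mask_pan(pan: str, keep_first: int = 6, keep_last: int = 4, fill: str = "X") -> str:
    """Mask the middle digits of a card number, leaving separators in place."""
    digits = [c for c in pan if c.isdigit()]
    if len(digits) <= keep_first + keep_last:
        raise ValueError("PAN too short to mask safely")
    masked = []
    seen = 0  # number of digits consumed so far
    for c in pan:
        if c.isdigit():
            visible = seen < keep_first or seen >= len(digits) - keep_last
            masked.append(c if visible else fill)
            seen += 1
        else:
            masked.append(c)  # keep spaces or dashes so the layout is preserved
    return "".join(masked)


print(mask_pan("3872 3789 1620 3675"))  # 3872 37XX XXXX 3675
```

Masking of this kind only limits what is displayed; it does not replace encryption or tokenization of the stored value.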
PCI DSS 3.0
Protection of cardholder data in memory
Clarification of key management dual control and split knowledge
Recommendations on making PCI DSS business-as-usual and best practices
Security policy and operational procedures added
Increased password strength
New requirements for point-of-sale terminal security
More robust requirements for penetration testing
PCI DSS Cloud Guidelines
Relevant to all sensitive data that is outsourced to cloud
1. Clients retain responsibility for the data they put in the cloud
2. Public-cloud providers often have multiple data centers, which may often be in multiple countries or regions
3. The client may not know the location of their data, or the data may exist in one or more of several locations at any particular time
4. A client may have little or no visibility into the controls
5. In a public-cloud environment, one client’s data is typically stored with data belonging to multiple other clients. This makes a public cloud an attractive target for attackers
Regulations & Laws
National Privacy Laws
National Privacy Laws - USA
Health Insurance Portability and Accountability Act (HIPAA)
1. Names
2. All geographical subdivisions smaller than a State
3. All elements of dates (except year) related to individual
4. Phone numbers
5. Fax numbers
6. Electronic mail addresses
7. Social Security numbers
8. Medical record numbers
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers
13. Device identifiers and serial numbers
14. Web Universal Resource Locators (URLs)
15. Internet Protocol (IP) address numbers
16. Biometric identifiers, including finger prints
17. Full face photographic images
18. Any other unique identifying number
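The identifier list above can be applied to a record roughly as follows. This is a simplified sketch: the field names are hypothetical, the schema is assumed to be a flat dictionary, and it only covers dropping listed identifiers and reducing dates to the year.

```python
from datetime import date

# Hypothetical field names; a real schema will differ.
DIRECT_IDENTIFIERS = {
    "name", "street_address", "city", "zip_code", "phone", "fax", "email",
    "ssn", "medical_record_number", "health_plan_number", "account_number",
    "license_number", "vehicle_id", "device_id", "url", "ip_address",
    "biometric", "photo",
}
DATE_FIELDS = {"date_of_birth", "admission_date", "discharge_date"}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without listed identifiers; dates keep only the year."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # drop the identifier entirely
        if field in DATE_FIELDS and isinstance(value, date):
            out[field] = value.year  # item 3: all date elements except the year are removed
        else:
            out[field] = value
    return out


patient = {
    "name": "Joe Smith",
    "state": "CA",
    "date_of_birth": date(1966, 12, 25),
    "ssn": "076-39-2778",
    "diagnosis": "fracture",
}
print(strip_identifiers(patient))
```

Removing fields is the bluntest option; the later slides on tokenization show how identifiers can be replaced instead, so that records remain linkable.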
Privacy Laws
54 International Privacy Laws
30 United States Privacy Laws
National Privacy Laws - India
Information Technology Act – 2000 (IT Act)
• Requires that the corporate body and Data Processor implement reasonable security practices and standards
• IS/ISO/IEC 27001 requirements recognized
Information Technology Act – 2008 (Amended IT Act)
• Damages for negligence and wrongful gain or loss
• Criminal punishment for disclosing Sensitive Personal Information (SPI)
India Privacy Law – 2011
• Expanded definition of SPI to passwords, financial data, health data, medical treatment records, and more
Right to Privacy Bill – 2013 (Proposed)
• Increased jail terms & fines for disclosure of SPI
• Addresses data handled for foreign clients
Regulations & Laws
Cross-Border & Outsourcing Laws
The laws of the sending country apply to data sent across international borders, including outsourced operations
• i.e. National Privacy Laws
APEC Cross-Border Privacy Laws
• Non-binding privacy enforcement in Asia-Pacific region
Expanding Threat Landscape
Cyber Criminals Cost India USD 4 Billion
Source: Symantec 2013
Source: http://www.ey.com/Publication/vwLUAssets/EY_-_2013_Global_Information_Security_Survey/$FILE/EY-GISS-Under-cyber-attack.pdf
Sensitive Data Insight & Usability
Vulnerabilities in Emerging Technologies
Holes in Big Data…
Source: Gartner
Many Ways to Hack Big Data
[Diagram: the Hadoop stack (ETL Tools, BI Reporting, RDBMS; Pig (Data Flow), Hive (SQL), Sqoop; MapReduce (Job Scheduling/Execution System); HBase (Column DB); HDFS (Hadoop Distributed File System); Avro (Serialization); Zookeeper (Coordination)) with attack paths from Hackers, Unvetted Applications or Ad Hoc Processes, and Privileged Users]
Source: http://nosql.mypopescu.com/post/1473423255/apache-hadoop-and-hbase
The Insider Threat
Sensitive Data Insight & Usability
Big Data and Cloud environments are designed for access and deep insight into vast data pools
Data can be monetized not only by marketing analytics, but through sale or use by a third party
The more accessible and usable the data is, the greater this ROI benefit can be
Security concerns and regulations are often viewed as opponents to data insight
Big Data Vulnerabilities and Concerns
Big Data (Hadoop) was designed for data access, not security
Security in a read-only environment introduces new challenges
Massive scalability and performance requirements
Sensitive data regulations create a barrier to usability, as data cannot be stored or transferred in the clear
Transparency and data insight are required for ROI on Big Data
Cloud Vulnerabilities and Concerns
Public cloud security is often not visible to the client, but the client is still responsible for security
Greater access to shared data sets by more users creates additional points of vulnerability
Data redundancy for high availability, often across multiple data centers, increases vulnerability
Virtualization can create numerous security issues
Transparency and data insight are required for ROI
How do you lock this?
Security Improving but We Are Losing Ground
Breach Discovery Methods
Source: Verizon 2013 Data Breach Investigations Report
The Evolution of Data Security Methods
Evolution of Data Security Methods
Coarse Grained Security
• Access Controls
• Volume Encryption
• File Encryption
Fine Grained Security
• Access Controls
• Field Encryption (AES & )
• Masking
• Tokenization
• Vaultless Tokenization
[Diagram: the two groups are plotted along a Time axis]
Use of Enabling Technologies
[Chart: two percentages per technology; the first series is labeled "Evaluating", the second series appears after it]
• Access controls: 1%, 91%
• Database activity monitoring: 18%, 47%
• Database encryption: 30%, 35%
• Backup / Archive encryption: 21%, 39%
• Data masking: 28%, 28%
• Application-level encryption: 7%, 29%
• Tokenization: 22%, 23%
Access Control
Old and flawed: Minimal access levels so people can only carry out their jobs
[Graph: Risk (Low to High) against Access Privilege Level (Low to High)]
Applying the protection profile to the content of data fields allows for a wider range of authority options
How the New Approach is Different
Old: Minimal access levels, Least Privilege to avoid high risks
New: Much greater flexibility and lower risk in data accessibility
[Graph: Risk (Low to High) against Access Privilege Level (Low to High)]
Reduction of Pain with New Protection Techniques
Input Value: 3872 3789 1620 3675
• Strong Encryption (AES, 3DES). Output: !@#$%a^.,mhu7///&*B()_+!@
• Format Preserving Encryption (DTP, FPE). Output: 8278 2789 2990 2789
• Vault-based Tokenization. Output: 8278 2789 2990 2789
• Vaultless Tokenization. Output: 8278 2789 2990 2789 (Format Preserving, Greatly reduced Key Management, No Vault)
[Chart: Pain & TCO (High to Low) over time, with axis marks at 1970, 2000, 2005 and 2010]
Fine Grained Security: Encryption of Fields
[Diagram: Production Systems and Non-Production Systems]
Encryption of fields
• Reversible
• Policy Control (Authorized / Unauthorized Access)
• Lacks Integration Transparency
• Complex Key Management
• Example: !@#$%a^.,mhu7///&*B()_+!@
Fine Grained Security: Masking of Fields
[Diagram: Production Systems and Non-Production Systems]
Masking of fields
• Not reversible
• No Policy, Everyone can access the data
• Integrates Transparently
• No Complex Key Management
• Example: 0389 3778 3652 0038
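A minimal sketch of irreversible masking as characterized above: every digit is replaced by a random one and nothing is stored, so the original cannot be recovered. The function name `mask_digits` is an assumption made for illustration.

```python
import secrets


def mask_digits(value: str) -> str:
    """Replace each digit with a random digit; separators stay where they are."""
    return "".join(
        secrets.choice("0123456789") if c.isdigit() else c
        for c in value
    )


masked = mask_digits("3872 3789 1620 3675")
print(masked)  # a different random value on every call
```

Because the output keeps the length and layout of the input, test and development systems can consume it without schema changes, which matches the "Integrates Transparently" point.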
Fine Grained Security: Tokenization of Fields
Tokenization (Pseudonymization)
• No Complex Key Management
• Business Intelligence
• Example: 0389 3778 3652 0038
Production Systems
• Reversible
• Policy Control (Authorized / Unauthorized Access)
Non-Production Systems
• Not Reversible
• Integrates Transparently
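The reversible, policy-controlled behavior listed for production systems can be sketched with a simple in-memory token vault. This is an illustration only: the class `TokenVault`, the `authorized` flag, and the in-memory dictionaries are assumptions, and a real vault would be a secured, persistent store.

```python
import secrets


class TokenVault:
    """Toy vault-based tokenizer: random tokens plus a lookup table."""

    def __init__(self) -> None:
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        if not any(c.isdigit() for c in value):
            raise ValueError("value must contain at least one digit")
        if value in self._token_by_value:
            return self._token_by_value[value]  # same input, same token
        while True:
            # Random digits in the same layout as the input (format preserving).
            token = "".join(
                secrets.choice("0123456789") if c.isdigit() else c for c in value
            )
            # Retry on a collision with an existing token or with the clear value.
            if token not in self._value_by_token and token != value:
                break
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str, authorized: bool) -> str:
        # Policy control: only authorized callers get clear data back.
        if not authorized:
            raise PermissionError("caller is not authorized to see clear data")
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("3678 2289 3907 3378")
print(token, vault.detokenize(token, authorized=True))
```

The lookup table grows with every new value, which is the property the vault-based versus vaultless comparison later in the deck turns on.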
Fine Grained Data Security Methods
Tokenization and Encryption are Different
Used Approach | Cipher System (Encryption) | Code System (Tokenization)
Cryptographic algorithms | Yes |
Cryptographic keys | Yes |
Code books | | Yes
Index tokens | | Yes
Source: McGraw-Hill Encyclopedia of Science & Technology
Fine Grained Data Security Methods
Vault-based vs. Vaultless Tokenization
Attribute | Vault-based Tokenization | Vaultless Tokenization
Footprint | Large, Expanding. | Small, Static.
High Availability, Disaster Recovery | Complex, expensive replication required. | No replication required.
Distribution | Practically impossible to distribute geographically. | Easy to deploy at different geographically distributed locations.
Reliability | Prone to collisions. | No collisions.
Performance, Latency, and Scalability | Will adversely impact performance & scalability. | Little or no latency. Fastest industry tokenization.
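To make the "small, static footprint" and "no collisions" claims concrete, here is a toy vaultless scheme built from a few fixed substitution tables. It is not the vendor's algorithm: the table construction, the chaining step, and all names (`build_tables`, `tokenize`, `detokenize`) are assumptions, and `random.Random` is used only for repeatability and is not cryptographically secure.

```python
import random

BLOCK = 2  # digits per lookup block, so each table has 10**BLOCK entries


def build_tables(seed: int, positions: int):
    """Build one random permutation (and its inverse) per block position."""
    rng = random.Random(seed)  # NOT cryptographically secure; illustration only
    tables = []
    for _ in range(positions):
        perm = list(range(10 ** BLOCK))
        rng.shuffle(perm)
        inverse = [0] * len(perm)
        for i, p in enumerate(perm):
            inverse[p] = i
        tables.append((perm, inverse))
    return tables


def _apply(value: str, tables, transform) -> str:
    digits = [c for c in value if c.isdigit()]
    if len(digits) % BLOCK or len(digits) // BLOCK > len(tables):
        raise ValueError("unsupported number of digits")
    blocks = [int("".join(digits[i:i + BLOCK])) for i in range(0, len(digits), BLOCK)]
    out = iter("".join(f"{b:0{BLOCK}d}" for b in transform(blocks)))
    # Re-insert the new digits into the original layout.
    return "".join(next(out) if c.isdigit() else c for c in value)


def tokenize(value: str, tables) -> str:
    def forward(blocks):
        result, prev = [], 0
        for (perm, _), block in zip(tables, blocks):
            prev = perm[(block + prev) % len(perm)]  # chain on the previous token block
            result.append(prev)
        return result
    return _apply(value, tables, forward)


def detokenize(token: str, tables) -> str:
    def backward(blocks):
        result, prev = [], 0
        for (_, inverse), tok in zip(tables, blocks):
            result.append((inverse[tok] - prev) % len(inverse))
            prev = tok
        return result
    return _apply(token, tables, backward)


tables = build_tables(seed=2013, positions=8)  # 8 tables of 100 entries each
token = tokenize("3872 3789 1620 3675", tables)
print(token, detokenize(token, tables))
```

Each step is a permutation, so distinct inputs always yield distinct tokens, and the tables never grow no matter how many values are tokenized. Any site holding the same tables can tokenize independently, with no replication of a vault.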
The Future of Tokenization
PCI DSS 3.0
• Split knowledge and dual control
PCI SSC Tokenization Task Force
• Tokenization and use of HSM
Card Brands – Visa, MC, AMEX …
• Tokens with control vectors
ANSI X9
• Tokenization and use of HSM
Security of Different Protection Methods
[Chart: Security Level (Low to High) for Format Preserving Encryption, Vaultless Data Tokenization, AES CBC Encryption Standard, and Basic Data Tokenization]
Speed of Different Protection Methods
[Chart: Transactions per second* (logarithmic axis from 100 to 10 000 000) for Format Preserving Encryption, Vaultless Data Tokenization, AES CBC Encryption Standard, and Vault-based Data Tokenization]
*: Speed will depend on the configuration
Risk Adjusted Data Protection
There is always a trade-off between security and usability.
[Table: each Data Security Method is rated from Best to Worst on Performance, Storage, Security and Transparency]
• System without data protection
• Monitoring + Blocking + Obfuscation
• Data Type Preservation Encryption
• Strong Encryption
• Vaultless Tokenization
• Hashing
• Anonymisation
Data De-Identification
What is de-identification of identifiable data?
The solution to protecting identifiable data is to properly de-identify it.
[Diagram: Personally Identifiable Information | Health Information / Financial Information]
Redact the information: remove it.
The identifiable portion of the record is de-identified with any number of protection methods such as masking, tokenization, encryption, redacting (removing), etc.
The method used will depend on your use case and the reason that you are de-identifying the data.
Identifiable Sensitive Information
Field | Real Data | Tokenized / Pseudonymized
Name | Joe Smith | csu wusoj
Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA
Date of Birth | 12/25/1966 | 01/02/1966
Telephone | 760-278-3389 | 760-389-2289
E-Mail Address | [email protected] | [email protected]
SSN | 076-39-2778 | 937-28-3390
CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378
Business URL | www.surferdude.com | www.sheyinctao.com
Fingerprint | | Encrypted
Photo | | Encrypted
X-Ray | | Encrypted
Healthcare / Financial Services
Healthcare: Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc.
Financial Services: Consumer Products and activities
Protection methods can be equally applied to the actual healthcare data, but not needed with de-identification
De-Identified Sensitive Data
Field | Real Data | Tokenized / Pseudonymized
Name | Joe Smith | csu wusoj
Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA
Date of Birth | 12/25/1966 | 01/02/1966
Telephone | 760-278-3389 | 760-389-2289
E-Mail Address | [email protected] | [email protected]
SSN | 076-39-2778 | 076-28-3390
CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378
Business URL | www.surferdude.com | www.sheyinctao.com
Fingerprint | | Encrypted
Photo | | Encrypted
X-Ray | | Encrypted
Healthcare / Financial Services
Healthcare: Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc.
Financial Services: Consumer Products and activities
Protection methods can be equally applied to the actual data, but not needed with de-identification
How Should I Secure Different Data?
[Chart: Use Case (Simple: PCI, Card Holder Data; PII, Personally Identifiable Information; Complex: PHI, Protected Health Information) against Type of Data (Structured, Un-structured), with Tokenization of Fields and Encryption of Files as the protection methods]
Research Brief
Tokenization Gets Traction
Aberdeen has seen a steady increase in enterprise use of tokenization, rather than encryption, for protecting sensitive data
Nearly half of the respondents (47%) are currently using tokenization for something other than cardholder data
Over the last 12 months, tokenization users had 50% fewer security-related incidents than tokenization non-users
Author: Derek Brink, VP and Research Fellow, IT Security and IT GRC
Vaultless Tokenization & Data Insight
The business intelligence exposed through Vaultless Tokenization can allow many users and processes to perform job functions on protected data
Extreme flexibility in data de-identification can allow responsible data monetization
Data remains secure throughout data flows, and can maintain a one-to-one relationship with the original data for analytic processes
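The one-to-one relationship is what lets analytics run on protected data. The sketch below swaps in a keyed HMAC as a stand-in for a tokenizer (it is deterministic but not reversible) and joins two data sets on the protected key only. The key, the sample records, and the function name `pseudonym` are all made up for the example.

```python
import hashlib
import hmac
from collections import Counter

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key


def pseudonym(value: str) -> str:
    """Deterministic keyed pseudonym: equal inputs give equal outputs."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Two data sets that share an identifier (illustrative records).
customers = [("076-39-2778", "CA"), ("937-28-3390", "NY")]
purchases = [("076-39-2778", 120.0), ("076-39-2778", 35.5), ("937-28-3390", 10.0)]

# Protect the identifier before the data reaches the analytics platform.
protected_customers = {pseudonym(ssn): state for ssn, state in customers}
protected_purchases = [(pseudonym(ssn), amount) for ssn, amount in purchases]

# The join and the aggregation never touch a clear-text identifier.
spend_by_state = Counter()
for key, amount in protected_purchases:
    spend_by_state[protected_customers[key]] += amount

print(dict(spend_by_state))  # {'CA': 155.5, 'NY': 10.0}
```

The aggregate is identical to what the clear-text join would produce, while the analyst only ever sees pseudonyms.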
Use Cases for Coarse & Fine Grained Security
Off-shoring & Outsourcing
Privacy Impacts BPO & Offshore Business Solutions
Business Process Outsourcing (BPO)
• Business Processes
• E.g. Loans, Mortgages, Call Centre, Claims Processing, ERP, etc.
• Application Development
• Need to de-identify Data for Testing and Development
Off-Shoring
• Same as Outsourcing, but data is sent for business functions (like call center, etc.) off-shore.
Laws governing your ability to send real data to 3rd parties are already restrictive, and becoming more so
Penalties for infringement are growing more severe
Risk of data breaches and data theft is increased
Examples
Major Bank in EU wants to centralise EDW operations in a single country and therefore send customer data from country A to country B. Privacy Laws in country A prohibit this.
Private Bank in Europe wants to offshore Finance Operations. Privacy Law prohibits transfer of citizen data to India.
Retail Bank in Scandinavia wants to offshore Customer Services. Privacy law prevents transfer of citizen data to the Far East.
Case Studies
Protegrity Use Case: UniCredit
CHALLENGES The primary challenge was to protect PII – names and addresses, phone and email, policy and account numbers, birth dates, etc. – to the satisfaction of EU Cross Border Data Security requirements. This included incoming source data from various European banking entities, and existing data within those systems, which would be consolidated at the Italian HQ.
Case Study - Large US Chain Store
Reduced cost
• 50 % shorter PCI audit
Quick deployment
• Minimal application changes
• 98 % application transparent
Top performance
• Performance better than encryption
Stronger security
Case Study: Large Chain Store
Why? Reduce compliance cost by 50%
• 50 million Credit Cards, 700 million daily transactions
• Performance Challenge: 30 days with Basic Tokenization to 90 minutes with Vaultless Tokenization
• End-to-End Tokens: Started with the D/W and expanding to stores
• Lower maintenance cost: don’t have to apply all 12 requirements
• Better security: able to eliminate several business and daily reports
• Quick deployment
• Minimal application changes
• 98 % application transparent
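The figures on this slide can be turned into a speed-up factor and an average transaction rate. This is back-of-the-envelope arithmetic only, and the average assumes transactions are spread evenly over 24 hours, which retail traffic is not.

```python
# 30 days with basic tokenization versus 90 minutes with vaultless tokenization.
basic_minutes = 30 * 24 * 60          # 30 days expressed in minutes
vaultless_minutes = 90
speedup = basic_minutes / vaultless_minutes

# 700 million daily transactions, averaged over one day.
daily_transactions = 700_000_000
avg_tps = daily_transactions / (24 * 60 * 60)

print(f"speed-up: {speedup:.0f}x")            # speed-up: 480x
print(f"average rate: {avg_tps:.0f} tx/s")    # average rate: 8102 tx/s
```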
Aadhaar/UID Big Data Use Case
Aadhaar Data Stores
• Mongo cluster (all enrolment records/documents: demographics + photo). Shards 1 to 5. Low latency indexed read (Documents per sec), High latency random search (seconds per read)
• Solr cluster (all enrolment records/documents: selected demographics only). Shards 0, 2, 6, 9, a, d, f. Low latency indexed read (Documents per sec), Low latency random search (Documents per sec)
• MySQL (all UID generated records: demographics only, track & trace, enrolment status). UID master (sharded) and Enrolment DB. Low latency indexed read (milli-seconds per read), High latency random search (seconds per read)
• HDFS (all raw packets). Data Nodes 1 to 20. High read throughput (MB per sec), High latency read (seconds per read)
• HBase (all enrolment biometric templates). Region Servers 1 to 20. High read throughput (MB per sec), Low-to-Medium latency read (milli-seconds per read)
• NFS (all archived raw packets). LUN 1 to LUN 4. Moderate read throughput, High latency read (seconds per read)
Protegrity Summary
Proven enterprise data security software and innovation leader
• Sole focus on the protection of data
• Patented Technology, Continuing to Drive Innovation
Cross-industry applicability
• Retail, Hospitality, Travel and Transportation
• Financial Services, Insurance, Banking
• Healthcare
• Telecommunications, Media and Entertainment
• Manufacturing and Government
Please contact us for more information
www.protegrity.com