SECURITY IMPLICATIONS OF CROSS-AGENCY BIG DATA APPROACHES FOR
TAX COMPLIANCE
Les McMonagle (CISSP, CISA, ITIL) Director & Principal Consultant Teradata – InfoSec COE July 2013
2 Confidential – Do Not Distribute Without Permission
Agenda
• Defining The Problem
• Defining The Solution
• Leveraging Information
• Avoid Common Mistakes
• Wrap-Up / Q & A
Increasing data variety and complexity
[Figure: Big Data: Exponential Growth in Data. Data sources shown: user generated content, mobile web, SMS/MMS, geo-location data, external reference sources, HD video, VOIP, speech to text, sensor data, NetFlow/IPFIX data, business data feeds, user click stream, CDR (phone call records), SIEM logs, web logs, DLP logs, Internet A/B testing, dynamic routing, affiliate networks, search marketing, behavioral targeting, firewall logs, IDS/IPS logs, dynamic routing tables, static routing tables, ARP data, DNS logs, DHCP logs, access logs and LogOnOff logs, with groups labeled Network and Host.]
Closing The Tax Gap Is Crucial
• The IRS estimates that at the Federal level, the tax gap is 15% to 17%
• Electronic filings introduce new fraud opportunity
• Fraud Is Very Easy & Widespread Today
• Even incarcerated felons are in on it!
• Not just a US Federal or State issue
Fraud Is Very Easy & Widespread Today
• “Electronic filing, which was introduced to speed up delivery of refunds, has made the system more vulnerable to fraud”
• Delays in comparing W-2s to 1040s
• "We will not be prosecuting our way out of this …
Big Data Analytics involves many data sources
• Data from multiple disparate sources needs to be combined to provide required insight and machine intelligence
• Additional data sources may contain sensitive or restricted data
• Getting approvals for access to data sources from other agencies can be a challenge
• Poor Data Governance programs impede data sharing
• Intelligent Security can ENABLE these Analytics Opportunities
Trends impacting Data Privacy
Three trends in Big Data Analytics and Enterprise Data Warehousing today are raising privacy concerns and increasing business risk. Only one can be controlled and leveraged to reduce risk:
1. Proliferation of Personally Identifiable Information (PII)
2. Persistence/Pervasiveness of PII in Gov/Corp data
3. Consolidating data sources into a single, central repository
Aligning Data Governance Strategy with emerging technology trends
[Figure: last year's historical data alongside the active data warehouse]
Applying protection at the data layer becomes more critical
Privacy, not technology, becomes the limiting factor
PII – Personally Identifiable Information, PHI – Protected Health Information, IP – Intellectual Property
Access to Alternative Data Sources: Structured and Unstructured Data
Generate Audit and Fraud Investigation Leads
[Figure: sources feeding lead generation: Dept of Justice, Dept of Labor, Dept of Health, Professional Licenses, DMV (Vehicle Registrations/value), Dept of Human Services, Child and Spousal Support Payments, Dept of Revenue, Alignment with W-2 Data]
Data Security Issues With Cross-Agency Data Sharing
Getting Access — Without Getting Access
Leverage native database Semantic Layer Security Controls to provide only required access to other sensitive data sources
The Security of Inclusion versus The Security of Exclusion
Grease the “Data Sharing” Wheels
Leverage Semantic Layer Security Controls
[Figure: applications reach the Customer Base Tables through views and macros. Applications: Routine Application, Marketing Application, Disclosure Application, Analytic User/Application, Single-Row Access. Views and macros: Standard View, Opt-out View, Anonymized View, Opt-out/Anonymized View, Consumer Access Macro. Administrative roles: DBA/System Administrator, Data Protection Officer, Security Admin. Privacy Infrastructure: Databases/Tables, Views, Macros, User Profiles, Logs, Audit Reports. Database Infrastructure.]
Perform Complex Analytics on Multiple Data Sources
[Figure: path analysis and data visualization dashboards, captioned "Common precursors to fraudulent activity" and "Clickstream led to a fraudulent filing"]
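Path analysis over clickstream data can be sketched in a few lines. The example below is only an illustration under stated assumptions: sessions are taken to be ordered lists of page names already labeled as fraudulent or not, and the page names and the `fraud_precursors` helper are hypothetical, not part of any Teradata product.

```python
from collections import Counter

def ngrams(pages, n):
    """Return every run of n consecutive pages in one session."""
    return [tuple(pages[i:i + n]) for i in range(len(pages) - n + 1)]

def fraud_precursors(sessions, n=2, min_sessions=2):
    """Rank n-page paths by the share of sessions containing them that were fraudulent.

    sessions: list of (pages, is_fraud) pairs (hypothetical labeled clickstream).
    Returns a list of (path, fraud_share, session_count), highest share first.
    """
    seen = Counter()
    fraud = Counter()
    for pages, is_fraud in sessions:
        for path in set(ngrams(pages, n)):  # count each path once per session
            seen[path] += 1
            if is_fraud:
                fraud[path] += 1
    ranked = [(p, fraud[p] / seen[p], seen[p]) for p in seen if seen[p] >= min_sessions]
    ranked.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return ranked

# Hypothetical sessions: two fraudulent filings share a "change_bank" step.
SESSIONS = [
    (["login", "change_bank", "file", "refund"], True),
    (["login", "change_bank", "file", "refund"], True),
    (["login", "review", "file", "refund"], False),
    (["login", "review", "file"], False),
]

if __name__ == "__main__":
    for path, share, count in fraud_precursors(SESSIONS):
        print(" > ".join(path), f"{share:.2f}", count)
```

Paths that appear almost only in fraudulent sessions are candidates for the "common precursors" a dashboard would surface; a production system would also need far more data and a statistical significance check.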
Semantic Layer Security Controls
Different types or combinations of Views can be applied to limit access to only required data
Anonymized View(s)
Fraud Investigation Team View(s)
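As a rough illustration of how views can limit access to only the required data, the sketch below builds an anonymized view and a fraud investigation view over one base table. It uses Python's built-in `sqlite3` module purely for convenience; the table, columns, sample rows and the 25% refund rule are all assumptions made for the example. SQLite has no GRANT statement, so the step of granting each user group access to its view alone, which a real warehouse would need, is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical base table; real users would never query it directly.
CREATE TABLE taxpayer_base (
    taxpayer_id INTEGER PRIMARY KEY,
    ssn TEXT, full_name TEXT, zip_code TEXT,
    reported_income REAL, refund_claimed REAL
);
INSERT INTO taxpayer_base VALUES
    (1, '000-00-0001', 'Example Person A', '02139', 52000, 1200),
    (2, '000-00-0002', 'Example Person B', '73301', 18000, 9400);

-- Anonymized view: no name or SSN, ZIP code truncated to three digits.
CREATE VIEW v_anonymized AS
SELECT taxpayer_id AS row_key, substr(zip_code, 1, 3) AS zip3,
       reported_income, refund_claimed
FROM taxpayer_base;

-- Fraud investigation view: identifiers, but only for rows that trip a rule.
CREATE VIEW v_fraud_investigation AS
SELECT taxpayer_id, ssn, full_name, reported_income, refund_claimed
FROM taxpayer_base
WHERE refund_claimed > 0.25 * reported_income;
""")

def columns(view):
    """List the column names a given view exposes."""
    cur = conn.execute(f"SELECT * FROM {view}")
    return [d[0] for d in cur.description]

if __name__ == "__main__":
    print(columns("v_anonymized"))
    print(conn.execute("SELECT * FROM v_fraud_investigation").fetchall())
```

Analysts working from `v_anonymized` can still aggregate by region and income, while identifiers surface only in the narrower investigation view.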
Use sensitive data source without direct access
[Figure: an Integrated Data Warehouse holds Labor, Health, Human Services, Revenue and Vehicle Registrations data. A macro or stored procedure produces standard reports output. Suspected fraud? Yes: initiate audit or investigation. No: drop.]
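The idea of using a sensitive source without direct access can be sketched as a function that answers only yes or no. In the example below, `wages_mismatch` stands in for a macro or stored procedure owned by the source agency; the wage table, the taxpayer keys and the 10% tolerance are hypothetical values chosen for illustration.

```python
# Hypothetical data held by another agency; the tax analyst never reads it directly.
_LABOR_WAGES = {"TP-1": 48000.0, "TP-2": 0.0}

def wages_mismatch(taxpayer_key, reported_wages, tolerance=0.10):
    """Stand-in for a macro or stored procedure owned by the source agency.

    Returns only True/False (suspected mismatch), never the wage figure itself.
    Taxpayers with no record on file return False.
    """
    on_file = _LABOR_WAGES.get(taxpayer_key)
    if on_file is None:
        return False
    allowed = tolerance * max(on_file, reported_wages)
    return abs(on_file - reported_wages) > allowed

def triage(returns):
    """Split returns into audit leads and dropped items, mirroring the Yes/No branch."""
    leads, dropped = [], []
    for taxpayer_key, reported_wages in returns:
        if wages_mismatch(taxpayer_key, reported_wages):
            leads.append(taxpayer_key)
        else:
            dropped.append(taxpayer_key)
    return leads, dropped

if __name__ == "__main__":
    print(triage([("TP-1", 47000.0), ("TP-2", 30000.0), ("TP-9", 10000.0)]))
```

The requesting agency receives a list of leads to audit or investigate, while the underlying wage records stay with their owner.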
Improved Understanding from Internal Network Traffic
• Most network conversations (malicious and benign) have their origins in the intent of a human actor
• The analyst's job is really to infer the intent of the human actor by looking at the packets they generate
[Figure: layers from Actor and True Intent down through Network Conversations and Sessions to Packets. True intent is "what we really care about"; packets are "what we have to work with".]
Monitoring/Detecting Internal Misuse of Data
Analytics Helps with ALL Compliance Issues
[Figure: four sources of non-compliance: lack of understanding of requirements, innocent mistakes, intended fraud (external hackers), and employee misuse or abuse of data access. Different paths, but same revenue impact!]
Advanced Analytic Capabilities
Advanced Analytics (Predictive)
Traditional Analytics (Reactive)
Leverage what private industry is already doing
• Intelligent Credit Card Authorization Checks
• Retailers immediately detect fraudulent product return patterns by comparing and analyzing more data sources prior to providing a refund (has this product, person, card been used recently for a similar refund?)
• Financial institutions, for example, have highly sophisticated fraud-detection processes built on a wide range of data and tools
No need for Tax to reinvent the wheel
Leverage Cross-Agency Data Sharing Sources
State tax and revenue agencies today utilize any or all of the following:
• All internal tax systems data
• Federal IRS data
• Department of Labor Unemployment data
• Workforce Commission data
• Department of Motor Vehicles (DMV) driver's license and vehicle registration data
• Professional Licenses
• Customs data
• Secretary of State
• USCIS (immigration, work permits and visas)
• HHSC data, all agency data from DOL (not just a subset, as is done today), etc.
Other potential external/reference data sources
External reference data sources include the following:
• Source IP address – physical address or neighborhood matching
• Multiple returns from the same source IP address that does not match the taxpayer's address or location
• Credit score data?
• Clickstream data from on-line submissions subjected to path analysis to detect consistent fraudulent submission patterns
• Death notifications
• Fish and game licenses
• FAA
• Others?
Some reference data may be sensitive or regulated
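The source IP check from the list above (several returns from one IP address that does not match the taxpayers' location) can be sketched as a simple grouping. This is a minimal illustration: the `ip_region` lookup stands in for whatever geolocation reference source an agency might license, the threshold of three returns is arbitrary, and the addresses come from the ranges reserved for documentation.

```python
from collections import defaultdict

def shared_ip_leads(filings, ip_region, min_returns=3):
    """Flag source IPs that filed several returns for taxpayers located elsewhere.

    filings: iterable of (return_id, source_ip, taxpayer_region).
    ip_region: dict mapping source_ip -> region (hypothetical geolocation data).
    Returns {source_ip: [return_id, ...]} for IPs meeting the threshold.
    """
    off_region = defaultdict(list)
    for return_id, source_ip, taxpayer_region in filings:
        located = ip_region.get(source_ip)
        # Skip IPs we cannot locate rather than guessing.
        if located is not None and located != taxpayer_region:
            off_region[source_ip].append(return_id)
    return {ip: ids for ip, ids in off_region.items() if len(ids) >= min_returns}

# Hypothetical filings using documentation-only IP addresses.
FILINGS = [
    ("R1", "192.0.2.10", "TX"),
    ("R2", "192.0.2.10", "TX"),
    ("R3", "192.0.2.10", "OK"),
    ("R4", "198.51.100.7", "TX"),
    ("R5", "192.0.2.10", "FL"),
]
IP_REGION = {"192.0.2.10": "FL", "198.51.100.7": "TX"}

if __name__ == "__main__":
    print(shared_ip_leads(FILINGS, IP_REGION))
```

Shared addresses such as tax preparers, libraries or corporate gateways would trip this rule too, so the output is a lead for review rather than a finding.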
Leverage what private industry is already doing
• Utilize many common data analytic tools and algorithms with minimal adaptation or modification
• Detecting employees looking at neighbors', family members', VIPs' or other acquaintances' tax records or data
• Tagging IRS-provided data to ensure compliance with IRS Publication 1075 (Data Classification follows the data)
• Monitoring data access for anomalous or inappropriate access patterns or usage
Avoid Common Mistakes
• Collecting an enormous amount of activity log and other security log data and never using it. The data is then reduced to basic forensic value only, without proactive reporting and alerting on anomalous activity
• Mixing together different data sensitivities (Data Classification follows the most sensitive data)
• Not leveraging activity log data to monitor data access and detect anomalous or inappropriate access patterns or usage
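The rule that data classification follows the most sensitive data can be made concrete with a small sketch. The sensitivity levels and their ordering below are assumptions for illustration (with FTI, Federal Tax Information, placed highest), and the `merge` helper is hypothetical; the point is only that a combined data set inherits the highest label among its inputs.

```python
from enum import IntEnum

class Sensitivity(IntEnum):
    """Illustrative levels only; a real scheme comes from agency policy."""
    PUBLIC = 0
    INTERNAL = 1
    PII = 2
    FTI = 3  # Federal Tax Information received from the IRS

def combined_label(*labels):
    """The combined data set takes the most sensitive label among its inputs."""
    return max(labels)

def merge(left, right):
    """Merge two labeled data sets (dicts with 'label' and 'rows' keys)."""
    return {
        "label": combined_label(left["label"], right["label"]),
        "rows": left["rows"] + right["rows"],
    }

if __name__ == "__main__":
    dmv = {"label": Sensitivity.INTERNAL, "rows": [("TP-1", "sedan")]}
    irs = {"label": Sensitivity.FTI, "rows": [("TP-1", 52000)]}
    print(merge(dmv, irs)["label"].name)
```

Once lower-sensitivity rows are mixed with FTI, the whole result must be handled under the FTI controls, which is why co-mingling raises the cost of every downstream use.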
Leading Misuse of Data or Data Access
• Random curiosity browsing of data: looking at neighbors', family members', VIPs' or other acquaintances' data
• Mixing or co-mingling of IRS data with other sources (Data Classification follows the data)
• Poor application of standard information security best practices, such as granting access on a Least Privilege and Need to Know basis
Monitor user activity to ensure correct or appropriate use
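One simple way to monitor user activity for curiosity browsing is to compare each record access against the employee's assigned caseload. The sketch below assumes access logs reduce to (employee, taxpayer) pairs and that case assignments are available as a lookup; both structures, the identifiers and the threshold are hypothetical.

```python
from collections import Counter

def unassigned_lookups(access_log, assignments):
    """Return accesses where the employee viewed a record outside their caseload.

    access_log: iterable of (employee_id, taxpayer_id).
    assignments: dict employee_id -> set of taxpayer_ids on that caseload.
    """
    return [(e, t) for e, t in access_log if t not in assignments.get(e, set())]

def review_queue(access_log, assignments, threshold=2):
    """List employees with at least `threshold` out-of-caseload lookups."""
    counts = Counter(e for e, _ in unassigned_lookups(access_log, assignments))
    return sorted(e for e, n in counts.items() if n >= threshold)

# Hypothetical log: emp1 opens two records that are not on their caseload.
ACCESS_LOG = [
    ("emp1", "T1"), ("emp1", "T9"), ("emp1", "T8"),
    ("emp2", "T2"), ("emp3", "T5"),
]
ASSIGNMENTS = {"emp1": {"T1"}, "emp2": {"T2"}}

if __name__ == "__main__":
    print(review_queue(ACCESS_LOG, ASSIGNMENTS))
```

A flagged employee is a prompt for supervisor review, not proof of misuse; some roles legitimately touch records outside any fixed caseload.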
Privacy Principles (1/2)
• Accountability – requires that the entity define, document, communicate, and assign accountability for its privacy policies and procedures and be accountable for PII under its control.
• Notice – requires that the entity provide notice about its privacy policies and procedures and identify the purpose for which personal information is collected, used, retained, and disclosed.
• Choice and Consent – requires that the entity describe the choices available to the individual and obtain implicit or explicit consent with respect to the collection, use, and disclosure of personal information.
• Collection Limitation – requires that the entity collect personal information only for the purposes identified in the notice.
• Use Limitation – requires that the entity limit the use of personal information to the purpose identified in the notice and for which the individual has provided implicit or explicit consent.
Comparable lists from: International Security, Trust and Privacy Alliance (ISTPA)
Association of Insurance Compliance Professionals (AICP)
Privacy Principles (2/2)
• Access – requires that the entity provide individuals with access to their personal information for review and update.
• Disclosure – requires that the entity disclose personal information to third parties only for the purposes identified in the notice and only with the implicit or explicit consent of the individual.
• Security – requires that the entity protect personal information against unauthorized access or alteration (both physical and logical).
• Data Quality – requires that the entity maintain accurate, complete, and relevant personal information for the purposes identified in the notice.
• Enforcement – requires that the entity monitor compliance with its privacy policies and procedures and have procedures to address privacy-related inquiries and disputes.
These must be captured in business/technical requirements
Proven Data Privacy Methodology
• Convergence of existing Data Privacy Principles
• Centralized EDWs processing/protecting broadly acquired PII
• Experienced data privacy consultants to advise & assist (International experience, ISTPA, CHP, CISA, CISSP certifications)
• Reduce costs by protecting data in a single, secure repository
• Standardize processes to meet common requirements
Solicit help from external Subject Matter Experts (SMEs) where appropriate
Conclusions
• State tax authorities are behind the curve in efforts to apply big data analytics to the tax fraud/tax gap problem
• Sharing data in a controlled and consistent way, while applying consistent, policy- and regulation-compliant security controls, is easier within a single, centralized data repository or EDW
• Reduce data hosting, data sharing, security control and other operational costs by consolidating data from multiple data marts
• Provide only the minimum access to sensitive information assets required to support each specific business process (Least-Privilege, Need-to-Know basis)
• Ensure original data classification follows the data
Les McMonagle Director & Principal Consultant - Information Security COE
• Les McMonagle is an information security consultant leading the Teradata InfoSec COE
• He has over 20 years of experience in the development and implementation of information security architectures
• During his career he has specialized in computer training, E-Commerce applications, IT Operations, information security architecture, processes, audits and Corporate Risk Management
• Les holds CISSP, CISA, ITIL and other relevant industry certifications
• He has participated in the development of the BITS Financial Institution Shared Assessment Program and delivered executive level presentations on Data Privacy and Security
• Les is also playing a lead role in developing Teradata’s Cyber Security solution strategy and how to leverage Teradata’s Unified Data Architecture (UDA) for CyberSecurity solutions
Les McMonagle (CISSP, CISA, ITIL) Mobile: (617) 501-7144 Email:[email protected]
Contact Information
• If you have further questions or comments:
Les McMonagle (CISSP, CISA, ITIL)
Teradata Information Security, Data Privacy and Regulatory Compliance COE [email protected]
(617) 501-7144 Cell
Les Arnold [email protected]
(512) 930-0135 Office