Big Data Analytics to Enhance Security

Page 1: Big Data Analytics to Enhance Security

Copyright Stelligence Co.,Ltd. 2016 All rights reserved

Big Data Analytics to Enhance Security

Anapat Pipatkitibodee

Technical Manager

[email protected]

Page 2: Big Data Analytics to Enhance Security

Agenda

• Big Data Analytics

• Security Trends

• Example Security Attacks

• Integrated Security Analytics with Open Source

• How to Apply?

Page 3: Big Data Analytics to Enhance Security

Big Data Analytics

Page 4: Big Data Analytics to Enhance Security

Everyone is Claiming Big Data

Page 5: Big Data Analytics to Enhance Security

Traditional vs Big Data

Presenter
Presentation Notes
The term Big Data refers to large-scale information management and analysis technologies that exceed the capability of traditional data processing technologies. Big Data is differentiated from traditional technologies in three ways: the amount of data (volume), the rate of data generation and transmission (velocity), and the types of structured and unstructured data (variety).
Page 6: Big Data Analytics to Enhance Security

Drivers of Big Data

• About 80% of the world’s data are semi-structured or unstructured.

Presenter
Presentation Notes
Big Data analytics, the process of analyzing and mining Big Data, can produce operational and business knowledge at an unprecedented scale and specificity. The need to analyze and leverage trend data collected by businesses is one of the main drivers for Big Data analysis tools.

The technological advances in storage, processing, and analysis of Big Data include (a) the rapidly decreasing cost of storage and CPU power in recent years; (b) the flexibility and cost-effectiveness of datacenters and cloud computing for elastic computation and storage; and (c) the development of new frameworks such as Hadoop, which allow users to take advantage of these distributed computing systems storing large quantities of data through flexible parallel processing.

These advances have created several differences between traditional analytics and Big Data analytics:

1. Storage cost has dramatically decreased in the last few years. Therefore, while traditional data warehouse operations retained data for a specific time interval, Big Data applications retain data indefinitely to understand long historical trends.
2. Big Data tools such as the Hadoop ecosystem and NoSQL databases provide the technology to increase the processing speed of complex queries and analytics.
3. Extract, Transform, and Load (ETL) in traditional data warehouses is rigid because users have to define schemas ahead of time. As a result, after a data warehouse has been deployed, incorporating a new schema might be difficult. With Big Data tools, users do not have to use predefined formats. They can load structured and unstructured data in a variety of formats and can choose how best to use the data.
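
The last point, loading data without a predefined schema and deciding how to use it later, is often called schema-on-read. The following is a minimal sketch in Python rather than anything shown in the slides; the raw records, the field names (`src_ip`, `user`, `bytes`) and the helper `parse_record` are hypothetical and only illustrate the idea.

```python
import json

def parse_record(line):
    """Parse one raw line into a dict, deciding the format at read time."""
    line = line.strip()
    if line.startswith("{"):
        return json.loads(line)
    # Fall back to whitespace-separated key=value pairs;
    # tokens without '=' are skipped.
    return dict(tok.split("=", 1) for tok in line.split() if "=" in tok)

# Raw data is stored untouched; no schema was declared before loading.
raw_lines = [
    '{"src_ip": "10.0.0.5", "user": "alice", "action": "login"}',
    "src_ip=10.0.0.7 user=bob bytes=512",
]

records = [parse_record(line) for line in raw_lines]

# The "schema" is chosen per query: here only src_ip and user are projected.
projection = [(r.get("src_ip"), r.get("user")) for r in records]
print(projection)
```

Fields that one query ignores (such as `bytes` here) stay available for later questions, which is the flexibility a fixed ETL schema gives up.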
Page 7: Big Data Analytics to Enhance Security

Open Source Tools in Big Data

• Hadoop ecosystem

• NoSQL database

Presenter
Presentation Notes
Big Data analytics is the process of examining large data sets containing a variety of data types, i.e., big data. Big Data analytics enables organizations to analyze a mix of structured, semi-structured, and unstructured data in search of valuable information and insights. Tools used in Big Data include the Hadoop ecosystem and NoSQL databases, which increase the processing speed of complex queries and analytics.
Page 8: Big Data Analytics to Enhance Security

Apache Hadoop Stack

Reference: Hadoop Essentials by Swizec Teller

Page 9: Big Data Analytics to Enhance Security

https://whatsthebigdata.com/2016/02/08/big-data-landscape-2016/

Page 10: Big Data Analytics to Enhance Security

Big Data Analytics

• The process of examining large data sets containing a variety of data types i.e., big data.

• Big Data analytics enables organizations to analyze a mix of structured, semi-structured, and unstructured data in search of valuable information and insights.

Page 11: Big Data Analytics to Enhance Security

Security Trends

Page 12: Big Data Analytics to Enhance Security

Data Analytics for Intrusion Detection

• 1st generation: Intrusion detection systems

• 2nd generation: Security information and event management (SIEM)

Presenter
Presentation Notes
● 1st generation: Intrusion detection systems. Security architects realized the need for layered security (e.g., reactive security and breach response) because a system with 100% protective security is impossible.
● 2nd generation: Security information and event management (SIEM). Managing alerts from different intrusion detection sensors and rules was a big challenge in enterprise settings. SIEM systems aggregate and filter alarms from many sources and present actionable information to security analysts.
● 3rd generation: Big Data analytics in security (2nd generation SIEM). Big Data tools have the potential to provide a significant advance in actionable security intelligence by reducing the time for correlating, consolidating, and contextualizing diverse security event information, and also for correlating long-term historical data for forensic purposes.
Page 13: Big Data Analytics to Enhance Security

Limitation of Traditional SIEMs

Storing and retaining a large quantity of data was not economically feasible.

Normalization & datastore schema reduces data

Traditional tools did not leverage Big Data technologies.

Closed platform with limited customization & integration options

Presenter
Presentation Notes
Analyzing logs, network packets, and system events for forensics and intrusion detection has traditionally been a significant problem, because traditional technologies fail to provide the tools to support long-term, large-scale analytics for several reasons:

1. Storing and retaining a large quantity of data was not economically feasible. As a result, most event logs and other recorded computer activity were deleted after a fixed retention period (e.g., 60 days).
2. Performing analytics and complex queries on large, structured data sets was inefficient because traditional tools did not leverage Big Data technologies.
3. Traditional tools were not designed to analyze and manage unstructured data. As a result, traditional tools had rigid, defined schemas. Big Data tools (e.g., Pig Latin scripts and regular expressions) can query data in flexible formats.
4. Big Data systems use cluster computing infrastructures. As a result, the systems are more reliable and available, and provide guarantees that queries on the systems are processed to completion.
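
Point 3 mentions regular expressions as a way to query data in flexible formats. As a rough illustration, and assuming hypothetical log lines in three different formats, a single pattern can pull IP addresses out of all of them without any shared schema. The function name `count_ips` and the sample lines are made up for this sketch.

```python
import re
from collections import Counter

# Matches dotted-quad IPv4 addresses; octet ranges are not validated here.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def count_ips(lines):
    """Count how often each IPv4 address appears across raw log lines."""
    counts = Counter()
    for line in lines:
        counts.update(IPV4.findall(line))
    return counts

# Hypothetical lines: an IDS alert, a key=value record and a JSON event.
sample_logs = [
    "Aug 08 08:26:54 ids01 {TCP} 10.11.36.20:5072 -> 10.11.36.26:443 alert",
    "user=smithe action=login src=10.11.36.20 status=ok",
    '{"event": "dns_query", "client": "10.11.36.31"}',
]

for ip, n in count_ips(sample_logs).most_common():
    print(ip, n)
```

On a real cluster the same pattern would typically run in parallel over many files; the single-process loop here only shows the query itself.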
Page 14: Big Data Analytics to Enhance Security

Security Trend from Y2015 to Y2016

FireEye M-Trends Report 2016

Presenter
Presentation Notes
The number of threats is increasing, and the threats are becoming more advanced. Today’s advanced threats are stealthy and sophisticated, and they evade detection by traditional point security products that look for specific threat signatures. Above are 3 types of advanced threats. They are good at stealing confidential data, whether credit cards or IP, and many of their victims unfortunately end up in the headlines. Cyber-criminal attacks include the credit card thefts at Target and Neiman Marcus. Nation-state attacks include Iran and China attacking governments and private sector companies to steal intellectual property and/or national secrets.

These advanced threats are also commonly called APTs, or Advanced Persistent Threats. APTs are hard to detect because they are not signature-based and hide behind legitimate credentialed activity.

Every year companies like Mandiant produce reports describing the trends identified in the breach investigation work they do as part of their consulting practices. A few metrics from their recent reports stand out:

• 100% is often via stealing password hashes or using keyloggers. Often attackers steal admin-level credentials so they can access many other systems without being detected.
• The 40 implies that even if you see malware in one place, you need to look much further, as there are likely multiple infected machines and backdoors.
• 243 days shows how they can evade detection for months at a time. They move low and slow and do not set off alarms from point, signature-based security products like anti-malware solutions.
• 63% of victims were notified by an external entity. Notification usually starts with customer complaints, such as a drained bank account or a maxed-out credit card. Often the FBI informs them.
Page 15: Big Data Analytics to Enhance Security

Security Trend from Y2015 to Y2016

• Threats are hard to investigate

FireEye M-Trends Report 2016

Page 16: Big Data Analytics to Enhance Security

All Data is Security Relevant = Big Data

Data sources: Servers, Storage, Desktops, Email, Web, Transaction Records, Network Flows, DHCP/DNS, Hypervisor, Custom Apps, Physical Access, Badges, Threat Intelligence, Mobile, CMDB

Traditional: Intrusion Detection, Firewall, Data Loss Prevention, Anti-Malware, Vulnerability Scans, Authentication

Presenter
Presentation Notes
A key part of IT security is protecting confidential data, which means detecting advanced threats, such as cybercriminals or malicious insiders, before they can steal your data. To detect or investigate them, you need both non-security and security data: advanced threats avoid detection by signature-based security products, so their fingerprints are often in the “non-security” data. Most traditional SIEMs focus only on gathering signature-based threat data, which does *not* contain the fingerprints of advanced threats. The scenario is even worse if there is no SIEM. In that case point UIs and grep are used, and aggregating data is very manual and time consuming.
Page 17: Big Data Analytics to Enhance Security

Data Analytics for Intrusion Detection

• 1st generation: Intrusion detection systems

• 2nd generation: Security information and event management (SIEM)

• 3rd generation: Big Data analytics in security (Next generation SIEM)

Page 18: Big Data Analytics to Enhance Security

Example Security Attacks

Page 19: Big Data Analytics to Enhance Security

Advanced Persistent Threats

• Advanced
– The attack can get past traditional security solutions
– In many cases it is based on zero-day vulnerabilities

• Persistent
– The attack has a specific goal
– It remains on the system as long as the attack goal is not met

• Threat
– Collect and steal information (Confidentiality)
– Make the victim's system unavailable (Availability)
– Modify the victim's system data (Integrity)

Page 20: Big Data Analytics to Enhance Security

Example of Advanced Threat Activities

Attack steps (diagram labels):
• Attacker hacks website, steals .pdf files (Web Portal)
• Attacker creates malware, embeds it in .pdf, emails it to the target (MAIL)
• Read email, open attachment
• .pdf executes and unpacks malware, overwriting and running “allowed” programs (Calc.exe, Svchost.exe)
• HTTP (web) session to command & control server (WEB)
• Remote control, steal data, persist in company, rent as botnet

Attack stages: Gain access to system, Create additional environment, Conduct business

Transaction

Data sources: Threat intelligence, Auth - User Roles, Host Activity/Security, Network Activity/Security

Presenter
Presentation Notes
Use the animation to talk through the Zeus attack scenario described in the Zeus demo.

• Recon: find a vulnerability and the method most likely to gain access; locate a vulnerable server with a .pdf.
• Recon: the attacker attacks an extranet portal (the vulnerable server) and steals a known good document (.pdf).
• Weaponization: the attacker creates malware, packages it in a .pdf, and gives it the same name as the document on the portal so that it looks like a good document.
• Delivery: the attacker spoofs a company employee's email address (sends email that looks like it comes from an employee of the company) and sends the document to several targets at the company.
• Exploitation: a user (all it takes is one) reads the email and opens the attachment, which exploits a vulnerability in a document reader that allows programs to run.
• Installation: the program installs several programs that overwrite “good” programs on the computer, such as the calculator program, calc.exe.
• Installation: calc.exe spawns svchost.exe, a generic program on Windows machines.
• Command and Control: svchost.exe establishes communication to a remote command and control server.

Point out that this came from a real example. The left shows the different defensive technologies that might have seen something.
Page 21: Big Data Analytics to Enhance Security

Link Events Together

Data sources: Threat intelligence; Auth - User Roles, Corp Context; Host Activity/Security; Network Activity/Security

Attack stages: Gain access to system, Create additional environment, Conduct business

Artifacts: WebPortal.pdf, .pdf, Calc.exe, Svchost.exe (MAIL, WEB, Transaction)

Linked events (diagram labels):
• Events that contain link to file
• Proxy log: C2 communication to blacklist
• How was process started?
• What created the program/process?
• Process making C2 traffic

Presenter
Presentation Notes
This animation shows how the attack is traced back to the root.
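
Tracing an attack back to its root amounts to following parent links between events. The sketch below is one possible way to express that in Python; it assumes hypothetical process-creation records with `pid`, `ppid` and `name` fields, which are not a format given in the slides.

```python
def trace_to_root(process_events, pid):
    """Walk parent links from pid back to the earliest known ancestor."""
    by_pid = {e["pid"]: e for e in process_events}
    chain = []
    seen = set()  # guards against cycles in malformed data
    while pid in by_pid and pid not in seen:
        seen.add(pid)
        event = by_pid[pid]
        chain.append(event["name"])
        pid = event["ppid"]
    return chain

# Hypothetical process-creation events, e.g. from host monitoring logs.
events = [
    {"pid": 100, "ppid": 1, "name": "reader.exe (opened WebPortal.pdf)"},
    {"pid": 200, "ppid": 100, "name": "calc.exe"},
    {"pid": 300, "ppid": 200, "name": "svchost.exe"},
]

# Start from the process seen making C2 traffic in the proxy log.
print(" <- ".join(trace_to_root(events, 300)))
```

Starting from the process that made the C2 connection, the chain answers "how was the process started?" and "what created the program?" one hop at a time.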
Page 22: Big Data Analytics to Enhance Security

Correlated Security Log

Aug 08 06:09:13 acmesep01.acmetech.com Aug 09 06:17:24 SymantecServer acmesep01: Virus found,Computer name: ACME-002,Source: Real Time Scan,Risk name: Hackertool.rootkit,Occurrences: 1,C:/Documents and Settings/smithe/Local Settings/Temp/evil.tmp,"""",Actual action: Quarantined,Requested action: Cleaned, time: 2009-01-23 03:19:12,Inserted: 2009-01-23 03:20:12,End: 2009-01-23 03:19:12,Domain: Default,Group: My Company\ACME Remote,Server: acmesep01,User: smithe,Source computer: ,Source IP: 10.11.36.20

Aug 08 08:26:54 snort.acmetech.com {TCP} 10.11.36.20:5072 -> 10.11.36.26:443 itsec snort[18774]: [1:100000:3] [Classification: Potential Corporate Privacy Violation] Credit Card Number Detected in Clear Text [Priority: 2]:

20130806041221.000000 Caption=ACME-2975EB\Administrator Description=Built-in account for administering the computer/domain Domain=ACME-2975EB InstallDate=NULL LocalAccount = IP: 10.11.36.20 True Name=Administrator SID =S-1-5-21-1715567821-926492609-725345543 500 SIDType=1 Status=Degraded wmi_type=UserAccounts

Sources:
• Endpoint Security: Malware Found (Source IP)
• Intrusion Detection: Data Loss (Source IP)
• Windows Authentication: Default Admin Account (Source IP)

Time Range: All three occurring within a 24-hour period

Presenter
Presentation Notes
The monitoring use case is about taking thousands of security events that are low severity in isolation and connecting the dots in an automated, policy-driven manner to see when a combination of seemingly low-severity events, once correlated, is actually a high-severity incident that needs immediate attention. There are hundreds of possible cross-product correlations. One is above, and it tells the story of a data loss event detected by signature-based security products:

• On a specific internal IP address running Windows, someone logs in using the default administrative user name “Administrator”, which is not good. All users should have a unique user name (not root or Administrator) so you know exactly who is doing what in the IT environment. The OS logs record this login.
• Endpoint-based anti-malware sees known, bad malware running on that machine. Malware means “malicious software” and is a red flag because it may lead to data being stolen by a hacker.
• A data loss prevention tool (in this case the Snort Intrusion Detection Prevention product) sees unencrypted credit card numbers leaving the organization from the same machine. This loss of credit card data is a major red flag.

These 3 events happening on the same machine in a short time period indicate that a hacker inappropriately logged into the machine, probably using stolen credentials, then put malware on it, perhaps a backdoor for connecting back later, and then exfiltrated stolen credit cards. The credit cards may then have been used for illegal purposes, which ultimately may have resulted in the costs of re-issuing credit cards, bad publicity, unhappy customers taking their business elsewhere, customer lawsuits, fines for PCI non-compliance, etc.

Splunk can correlate on all 3 of these events happening on the same machine within a short time period. It has connected the dots to find the proverbial needle in the haystack. Splunk can detect and/or alert on these sorts of correlations in real time or on a scheduled basis.
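
The correlation rule described in these notes (three event types, same source IP, within 24 hours) can be sketched in plain Python. This is not Splunk's search language or implementation; it assumes the events have already been parsed into dicts, and the type names (`malware_found`, `data_loss`, `default_admin_login`) and timestamps are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

REQUIRED = {"malware_found", "data_loss", "default_admin_login"}

def correlate(events, window=timedelta(hours=24)):
    """Return source IPs where all REQUIRED event types occur within `window`."""
    by_ip = defaultdict(list)
    for e in events:
        by_ip[e["src_ip"]].append(e)

    flagged = []
    for ip, evs in by_ip.items():
        evs.sort(key=lambda e: e["time"])
        for i, first in enumerate(evs):
            # Event types seen inside the window that opens at this event.
            kinds = {e["type"] for e in evs[i:] if e["time"] - first["time"] <= window}
            if REQUIRED <= kinds:
                flagged.append(ip)
                break
    return flagged

# Hypothetical, already-normalized events from three different products.
events = [
    {"src_ip": "10.11.36.20", "type": "default_admin_login", "time": datetime(2016, 8, 8, 4, 12)},
    {"src_ip": "10.11.36.20", "type": "malware_found", "time": datetime(2016, 8, 8, 6, 9)},
    {"src_ip": "10.11.36.20", "type": "data_loss", "time": datetime(2016, 8, 8, 8, 26)},
    {"src_ip": "10.11.36.99", "type": "malware_found", "time": datetime(2016, 8, 8, 7, 0)},
]

print(correlate(events))
```

Only the first address is flagged, because the second one shows a single low-severity event. Grouping by source IP first keeps each window check small, which matters once the event volume grows.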
Page 23: Big Data Analytics to Enhance Security

Incident Analysis & Investigation

• Search historically - back in time
• Watch for new evidence
• Related evidence from other security devices

Presenter
Presentation Notes
Incident investigations typically start with an alert from another product, from the user, or from law enforcement that “something has happened” and requires deeper understanding. The responder/analyst must decide whether the issue is a security risk and determine whether an action is required (or mark it as a non-issue). Tasks include determining the root cause, what happened and why, whether this is related to any other issue previously seen, and whether it is an attacker or threat.
Page 24: Big Data Analytics to Enhance Security

Integrated Security Analytics with Open Source

Page 25: Big Data Analytics to Enhance Security

SQRRL Solution

https://sqrrl.com/

Page 26: Big Data Analytics to Enhance Security

Anomaly Detection in Visualization

https://sqrrl.com/

Page 27: Big Data Analytics to Enhance Security

Prelert Behavioral Analytics for the Elastic Stack

http://info.prelert.com/

Page 28: Big Data Analytics to Enhance Security

Prelert Behavioral Analytics for the Elastic Stack

http://info.prelert.com/

Page 29: Big Data Analytics to Enhance Security

How to Apply?

Page 30: Big Data Analytics to Enhance Security

Determining Data That Can Be Collected

Threat intelligence
• Third-party Threat Intel
• Open source blacklist
• Internal threat intelligence

Network
• Firewall
• IDS / IPS
• Web Proxy
• Vulnerability scanners
• VPNs
• Netflow
• TCP Collector

Host
• OS logs
• Patching
• File Integrity
• Endpoint (AV/IPS/FW)
• Malware detection
• Logins, Logouts log

Auth - User Roles
• Active Directory
• LDAP
• AAA, SSO

Service
• Application logs
• Audit log
• Service / Process

Reference: Network Security Through Data Analysis by Michael S Collins, published by O'Reilly Media, Inc., 2014

Presenter
Presentation Notes
Threat intelligence: attacker, known relay/C2 sites, infected sites, IOCs, attack/campaign intent and attribution.
Network Activity/Security: where they went, who talked to whom, attack transmitted, abnormal traffic, malware download.
Host: what process is running (malicious, abnormal, etc.), process owner, registry mods, attack/malware artifacts, patching level, attack susceptibility.
Auth - User Roles: access level, privileged users, likelihood of infection, where they might be in the kill chain.

This section explains how Big Data is changing the analytics landscape. In particular, Big Data analytics can be leveraged to improve information security and situational awareness. For example, Big Data analytics can be employed to analyze financial transactions, log files, and network traffic to identify anomalies and suspicious activities, and to correlate multiple sources of information into a coherent view.

Data-driven information security dates back to bank fraud detection and anomaly-based intrusion detection systems. Fraud detection is one of the most visible uses for Big Data analytics. Credit card companies have conducted fraud detection for decades. However, the custom-built infrastructure to mine Big Data for fraud detection was not economical to adapt for other fraud detection uses. Off-the-shelf Big Data tools and techniques are now bringing attention to analytics for fraud detection in healthcare, insurance, and other fields.
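
Identifying anomalies in log data can start from something as simple as comparing today's activity with an account's own history. The sketch below uses a plain z-score, which is a basic statistical stand-in and not the method used by any product named in these slides; the account names and counts are hypothetical.

```python
from statistics import mean, pstdev

def is_anomalous(history, today, threshold=3.0):
    """Flag today's count if it lies more than `threshold` standard
    deviations away from the historical mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly flat history: any change counts as a deviation.
        return today != mu
    return abs(today - mu) / sigma > threshold

# Hypothetical daily failed-login counts per account.
baseline = {
    "alice": [2, 3, 1, 2, 4, 3, 2],
    "svc_backup": [0, 0, 1, 0, 0, 0, 1],
}
today = {"alice": 3, "svc_backup": 40}

suspicious = [user for user in baseline if is_anomalous(baseline[user], today[user])]
print(suspicious)
```

Longer retention helps here: the more history is kept per account, the more stable the baseline that new activity is compared against.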
Page 31: Big Data Analytics to Enhance Security

Option 1: Replace All Solutions

• Data sent to new Big Data Analytic Platform
• Big Data Analytic Platform
– Static Visualizations / Reports
– Threat detection, alerts, workflow, compliance
– Incident investigations/forensics
– Non-security use cases

Diagram labels: Raw data; Big Data Analytic Platform; Alerts; Static Visualizations; Forensics / Search Interface

Page 32: Big Data Analytics to Enhance Security

Option 2: Big Data to Traditional SIEM

• Data sent to both systems
• Big Data Analytic Platform
– Incident investigations/forensics
– Non-security use cases
• Traditional SIEM
– Static Visualizations / Reports
– Threat detection, alerts, workflow, compliance

Diagram labels: Raw data; Connectors; Big Data Analytic Platform; Forensics / Search Interface; SIEM; Alerts; Static Visualizations
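
Sending data to both systems could look roughly like the following connector sketch. It assumes, purely for illustration, that the Big Data platform keeps the raw event while the SIEM stores only the fields of a fixed schema; the field names and the functions `normalize` and `forward` are hypothetical, and the two lists stand in for the real systems.

```python
SIEM_FIELDS = ("time", "src_ip", "signature")  # hypothetical fixed SIEM schema

def normalize(raw_event):
    """Keep only the fields the fixed SIEM schema would store."""
    return {field: raw_event.get(field) for field in SIEM_FIELDS}

def forward(raw_event, big_data_store, siem_store):
    """Send one event to both destinations."""
    big_data_store.append(raw_event)          # raw copy for forensics / search
    siem_store.append(normalize(raw_event))   # reduced copy for alerts / reports

big_data_store = []  # stands in for the Big Data analytic platform
siem_store = []      # stands in for the traditional SIEM

forward(
    {"time": "2016-08-08T08:26:54", "src_ip": "10.11.36.20",
     "signature": "Credit card number in clear text", "payload_len": 1432},
    big_data_store,
    siem_store,
)
print(big_data_store[0].keys() - siem_store[0].keys())
```

The difference between the two copies (here `payload_len`) is the kind of detail that remains searchable on the Big Data side during an investigation.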

Page 33: Big Data Analytics to Enhance Security

Factors for Evaluating Big Data Security Analytics Platforms

Factor: Open Source
• Scalable data ingestion: HDFS
• Unified data management platform: Cassandra / Accumulo
• Support for multiple data types: Ready to customize
• Real time: Spark / Storm
• Security analytic tools: No
• Compliance reporting: No
• Easy to deploy and manage: Manage many 3rd party
• Flexible search, report and create new correlation rule: No
