Cisco OpenSOC

Transcript
Page 1: Cisco OpenSOC

OpenSOC: The Open Security Operations Center for Analyzing 1.2 Million Network Packets per Second in Real Time

James Sirota, Big Data Architect, Cisco Security Solutions [email protected]

Sheetal Dolas, Principal [email protected]

June 3, 2014

Page 2: Cisco OpenSOC

Over Next Few Minutes

Problem Statement & Business Case for OpenSOC
Solution Architecture and Design
Best Practices and Lessons Learned
Q & A

Page 3: Cisco OpenSOC

Business Case

Page 4: Cisco OpenSOC

“There's now a growing sense of fatalism: It's no longer if or when you get hacked, but the assumption is that you've already been hacked, with a focus on minimizing the damage.”

Source: Dark Reading / Security’s New Reality: Assume The Worst

Page 5: Cisco OpenSOC

Breaches Happen in Hours… But Go Undetected for Months or Even Years

Source: 2013 Data Breach Investigations Report

Timespan of events by percent of breaches

                                          Seconds  Minutes  Hours  Days  Weeks  Months  Years
Initial Attack to Initial Compromise          10%      75%    12%    2%     0%      1%     1%
Initial Compromise to Data Exfiltration        8%      38%    14%   25%     8%      8%     0%
Initial Compromise to Discovery                0%       0%     2%   13%    29%     54%     2%
Discovery to Containment/Restoration           0%       1%     9%   32%    38%     17%     4%

In 60% of breaches, data is stolen in hours

54% of breaches are not discovered for months
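The two callouts on this slide follow directly from the table rows. A minimal Python sketch that reproduces them from the percentages as transcribed; the dictionary layout and the function name are illustrative assumptions, not part of the original deck.

```python
# Percent of breaches per timespan bucket, copied from the table on this slide.
BUCKETS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]

TIMESPANS = {
    "compromise_to_exfiltration": [8, 38, 14, 25, 8, 8, 0],
    "compromise_to_discovery": [0, 0, 2, 13, 29, 54, 2],
}


def share_within(row, last_bucket):
    """Sum the percentages from the first bucket up to and including last_bucket."""
    end = BUCKETS.index(last_bucket) + 1
    return sum(TIMESPANS[row][:end])


# Data stolen within hours of compromise: seconds + minutes + hours.
stolen_within_hours = share_within("compromise_to_exfiltration", "hours")

# Breaches whose discovery falls in the "months" bucket.
found_in_months = TIMESPANS["compromise_to_discovery"][BUCKETS.index("months")]

print(stolen_within_hours, found_in_months)
```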

Page 6: Cisco OpenSOC

Cisco Global Cloud Index

Source: 2014 Cisco Global Cloud Index

Page 7: Cisco OpenSOC

Introducing OpenSOC: Intersection of Big Data and Security Analytics

Big Data Platform (Hadoop, Storm, Elastic Search, Kafka)
  Multi Petabyte Storage
  Interactive Query
  Real-Time Search
  Scalable Stream Processing
  Unstructured Data
  Data Access Control
  Scalable Compute

OpenSOC
  Real-Time Alerts
  Anomaly Detection
  Data Correlation
  Rules and Reports
  Predictive Modeling
  UI and Applications

Page 8: Cisco OpenSOC

OpenSOC Journey

Sept 2013: First Prototype
Dec 2013: Hortonworks joins the project
March 2014: Platform development finished
April 2014: First beta test at customer site
May 2014: CR Work off
Sept 2014: General Availability

Page 9: Cisco OpenSOC

Solution Architecture & Design

Page 10: Cisco OpenSOC

OpenSOC Conceptual Architecture

Telemetry sources: Raw Network Stream, Network Metadata Stream, Netflow, Syslog, Raw Application Logs, Other Streaming Telemetry
Processing: Parse + Format, Enrich, Alert
Enrichment inputs: Threat Intelligence Feeds, Enrichment Data
Storage: HBase (Raw Packet Store), Hive (Long-Term Store), Elastic Search (Real-Time Index)
Applications + Analyst Tools: Network Packet Mining and PCAP Reconstruction; Log Mining and Analytics; Big Data Exploration, Predictive Modeling

Page 11: Cisco OpenSOC

Key Functional Capabilities

Raw Network Packet Capture, Store, Traffic Reconstruction
Telemetry Ingest, Enrichment and Real-Time Rules-Based Alerts
Real-Time Telemetry Search and Cross-Telemetry Matching
Automated Reports, Anomaly Detection and Anomaly Alerts
Rich Analytics Apps and Integration with Existing Analytics Tools

Page 12: Cisco OpenSOC

The OpenSOC Advantage

Fully-Backed by Cisco and Used Internally for Multiple Customers
Free, Open Source and Apache Licensed
Built on Highly-Scalable and Proven Platforms (Hadoop, Kafka, Storm)
Extensible and Pluggable Design
Flexible Deployment Model (On-Premise or Cloud)
Centralize your processes, people and data

Page 13: Cisco OpenSOC

OpenSOC Deployment at Cisco

Hardware footprint (40u)
  14 Data Nodes (UCS C240 M3)
  3 Cluster Control Nodes (UCS C220 M3)
  2 ESX Hypervisor Hosts (UCS C220 M3)
  1 PCAP Processor (UCS C220 M3 + Napatech NIC)
  2 SourceFire Threat alert processors
  1 Anue Network Traffic splitter
  1 Router
  1 48 Port 10GE Switch

Software Stack
  HDP 2.1
  Kafka 0.8
  Elastic Search 1.1
  MySQL 5.5

Page 14: Cisco OpenSOC

OpenSOC - Stitching Things Together

Source Systems: Passive Tap, Traffic Replicator, Telemetry Sources (Syslog, HTTP, File System, Other)
Data Collection: PCAP, Flume (Agent A, Agent B, Agent N)
Messaging System (Kafka): PCAP Topic, DPI Topic, A Topic, B Topic, N Topic
Real Time Processing (Storm): PCAP Topology, DPI Topology, A Topology, B Topology, N Topology
Storage: Hive (Raw Data, ORC), Elastic Search (Index), HBase (PCAP Table)
Access: Analytic Tools (R / Python, Power Pivot, Tableau), Web Services (Search, PCAP Reconstruction)

Page 15: Cisco OpenSOC

OpenSOC - Stitching Things Together

Same diagram as Page 14, with a "Deeper Look" callout on the PCAP Topology and DPI Topology.

Page 16: Cisco OpenSOC

PCAP Topology

Real Time Processing (Storm): Kafka Spout, Parser Bolt, HDFS Bolt, HBase Bolt, ES Bolt
Storage: Hive (Raw Data, ORC), HBase (PCAP Table), Elastic Search (Index)

Page 17: Cisco OpenSOC

DPI Topology & Telemetry Enrichment

Real Time Processing (Storm): Kafka Spout, Parser Bolt, GEO Enrich, Whois Enrich, CIF Enrich, HDFS Bolt, ES Bolt
Storage: Hive (Raw Data, ORC), HBase (PCAP Table), Elastic Search (Index)

Page 18: Cisco OpenSOC

Enrichments

Stages: Parser Bolt, GEO Enrich, Who Is Enrich, CIF Enrich

RAW Message:
  {"msg_key1": "msg value1", "src_ip": "10.20.30.40", "dest_ip": "20.30.40.50", "domain": "mydomain.com"}

GEO Enrich (Cache, MySQL Geo Lite Data) adds:
  "geo":[ {"region":"CA","postalCode":"95134","areaCode":"408","metroCode":"807","longitude":-121.946,"latitude":37.425,"locId":4522,"city":"San Jose","country":"US"} ]

Who Is Enrich (Cache, HBase Who Is Data) adds:
  "whois":[ {"OrgId":"CISCOS","Parent":"NET-144-0-0-0-0","OrgAbuseName":"Cisco Systems Inc","RegDate":"1991-01-171991-01-17","OrgName":"Cisco Systems","Address":"170 West Tasman Drive","NetType":"Direct Assignment"} ]

CIF Enrich (Cache, HBase CIF Data) adds:
  "cif":"Yes"

Output: Enriched Message
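The enrichment chain on this slide can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the in-memory dictionaries stand in for the MySQL and HBase stores, `functools.lru_cache` stands in for the per-enricher cache, and the choice of which message field each enricher keys on (`src_ip`, `domain`, `dest_ip`) as well as the `"No"` value for a CIF miss are guesses, since the deck does not state them.

```python
import json
from functools import lru_cache

# Hypothetical stand-ins for the backing stores named on the slide.
GEO_STORE = {"10.20.30.40": {"city": "San Jose", "region": "CA", "country": "US"}}
WHOIS_STORE = {"mydomain.com": {"OrgName": "Cisco Systems", "NetType": "Direct Assignment"}}
CIF_STORE = {"20.30.40.50"}


@lru_cache(maxsize=10_000)
def geo_lookup(ip):
    """Cached lookup so repeated IPs do not hit the backing store again."""
    return GEO_STORE.get(ip)


@lru_cache(maxsize=10_000)
def whois_lookup(domain):
    return WHOIS_STORE.get(domain)


@lru_cache(maxsize=10_000)
def cif_lookup(ip):
    return ip in CIF_STORE


def enrich(raw_json):
    """Parse a raw message and attach geo, whois and cif fields."""
    msg = json.loads(raw_json)
    geo = geo_lookup(msg["src_ip"])          # assumption: geo keyed on src_ip
    whois = whois_lookup(msg["domain"])      # assumption: whois keyed on domain
    msg["geo"] = [geo] if geo else []
    msg["whois"] = [whois] if whois else []
    msg["cif"] = "Yes" if cif_lookup(msg["dest_ip"]) else "No"
    return msg


RAW = '{"msg_key1": "msg value1", "src_ip": "10.20.30.40", "dest_ip": "20.30.40.50", "domain": "mydomain.com"}'
enriched = enrich(RAW)
again = enrich(RAW)  # second call is served from the caches
print(json.dumps(enriched, indent=2))
```

Keeping each lookup behind its own cache mirrors the slide's layout, where every enricher has a cache in front of its data store.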

Page 19: Cisco OpenSOC

Applications: Telemetry Matching and DPI

Step1: Search

Step2: Match

Step3: Analyze

Step4: Build PCAP

Page 20: Cisco OpenSOC

Integration with Analytics Tools

Dashboards Reports

Page 21: Cisco OpenSOC

Best Practices and Lessons Learned

Page 22: Cisco OpenSOC

Journey Towards Highly Scalable Application

Page 23: Cisco OpenSOC

Kafka Tuning

Page 24: Cisco OpenSOC

This is where we began

Page 25: Cisco OpenSOC

Some code optimizations and increased parallelism

Page 26: Cisco OpenSOC

Kafka Tuning

Kafka is disk I/O heavy
Kafka 0.8+ supports replication and JBOD
  Better performance compared to RAID
Parallelism is largely driven by the number of disks and partitions per topic
Key configuration parameters:
  num.io.threads - keep it at least equal to the number of disks provided to Kafka
  num.network.threads - adjust it based on the number of concurrent producers, consumers and the replication factor
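As a rough illustration of the sizing rule on this slide, the Python sketch below renders broker settings for a JBOD layout. The property names `log.dirs`, `num.io.threads` and `num.network.threads` are real Kafka broker settings; the disk paths, the function name and the `network_threads=6` value are hypothetical, since the deck gives no concrete numbers for them.

```python
def broker_overrides(data_dirs, network_threads, extra_io_threads=0):
    """Render server.properties lines for a JBOD layout.

    num.io.threads is set to at least one thread per data directory,
    following the rule of thumb on the slide.
    """
    if not data_dirs:
        raise ValueError("at least one data directory is required")
    settings = {
        # JBOD: one log directory per physical disk, comma separated.
        "log.dirs": ",".join(data_dirs),
        "num.io.threads": len(data_dirs) + extra_io_threads,
        # No formula is given in the deck; this value has to be tuned by hand.
        "num.network.threads": network_threads,
    }
    return [f"{key}={value}" for key, value in settings.items()]


# Hypothetical four-disk broker.
disks = [f"/data/disk{i}/kafka" for i in range(1, 5)]
lines = broker_overrides(disks, network_threads=6)
print("\n".join(lines))
```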

Page 27: Cisco OpenSOC

After Kafka Tuning

Page 28: Cisco OpenSOC

Bottleneck Isolation, Resource Profiling, Load Balancing

Page 29: Cisco OpenSOC

HBase Tuning

Page 30: Cisco OpenSOC

This is where we began

Page 31: Cisco OpenSOC

Row Key Design

Row key design is critical (gets or scans or both?)
Keys with IP addresses
  Standard IP addresses have only two variations of the first character: 1 and 2
  Minimum key length will be 7 characters and maximum 15, with a typical average of 12
  Subnet range scans become difficult: keys compare as strings, so a range of 90 to 220 excludes 112
IP converted to hex (10.20.30.40 => 0a141e28)
  Gives 16 variations of the first key character
  Consistently 8-character key
  Easy to search for subnet ranges
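The hex encoding described on this slide can be sketched with Python's standard `ipaddress` module. The function names are illustrative, and the scan range returned here uses the last address of the subnet as an inclusive upper bound; a store whose scans take an exclusive stop row would need the next key instead.

```python
import ipaddress


def ip_to_row_key(ip):
    """Encode a dotted-quad IPv4 address as a fixed-width, 8-character hex key."""
    return format(int(ipaddress.IPv4Address(ip)), "08x")


def subnet_key_range(cidr):
    """Return (first_key, last_key) covering every address in an IPv4 subnet."""
    net = ipaddress.IPv4Network(cidr)
    return (
        format(int(net.network_address), "08x"),
        format(int(net.broadcast_address), "08x"),
    )


key = ip_to_row_key("10.20.30.40")
first_key, last_key = subnet_key_range("10.20.0.0/16")

# Dotted strings sort lexicographically, which breaks numeric ordering;
# the hex keys sort in the same order as the addresses themselves.
ips = ["90.0.0.1", "112.0.0.1", "220.0.0.1"]
dotted_order = sorted(ips)
hex_order = sorted(ips, key=ip_to_row_key)

print(key, first_key, last_key)
print(dotted_order, hex_order)
```

Because every key is exactly 8 characters, a subnet maps to one contiguous key range, which is what makes the range scan straightforward.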

Page 32: Cisco OpenSOC

Experiments with Row Key

Page 33: Cisco OpenSOC

Region Splits

Know your data
Auto split under high workload can result in hotspots and split storms
Understand your data and presplit the regions
Identify how many regions a RegionServer (RS) can have to perform optimally. Use the formula below:
  (RS memory) * (total memstore fraction) / ((memstore size) * (# column families))
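The formula on this slide is easy to evaluate once the four inputs are known. A short Python sketch follows; the input values (16 GB of RegionServer memory, a 0.4 memstore fraction, a 128 MB memstore size) are illustrative assumptions and are not taken from the deck.

```python
import math


def max_regions_per_region_server(rs_memory_mb, memstore_fraction,
                                  memstore_size_mb, column_families):
    """(RS memory) * (total memstore fraction) / ((memstore size) * (# column families))."""
    regions = rs_memory_mb * memstore_fraction / (memstore_size_mb * column_families)
    # Round down: a fractional region cannot be hosted.
    return math.floor(regions)


# Illustrative inputs only.
one_family = max_regions_per_region_server(16 * 1024, 0.4, 128, 1)
two_families = max_regions_per_region_server(16 * 1024, 0.4, 128, 2)
print(one_family, two_families)
```

Doubling the number of column families halves the region budget, which is one reason to keep the family count low for write-heavy tables.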

Page 34: Cisco OpenSOC

With Region Pre-Splits

Page 35: Cisco OpenSOC

Know Your Application

Enable micro batching (client-side buffer)
Smart shuffle/grouping in Storm
Understand your data and situationally exploit various WAL options
Watch for many minor compactions
For heavy 'write' workload, increase hbase.hstore.blockingStoreFiles (we used 200)
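Micro batching with a client-side buffer can be illustrated independently of any HBase client. The Python sketch below is a generic buffer, not the HBase API: `flush_fn` is a hypothetical callable that would perform one round trip to the store per batch, and the batch size of 100 is arbitrary.

```python
class BatchingWriter:
    """Buffer writes on the client and send them in batches."""

    def __init__(self, flush_fn, batch_size):
        self._flush_fn = flush_fn      # hypothetical: one store round trip per call
        self._batch_size = batch_size
        self._buffer = []

    def put(self, row_key, value):
        self._buffer.append((row_key, value))
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered; call this on shutdown so nothing is lost."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._flush_fn(batch)


# Record each round trip instead of talking to a real store.
round_trips = []
writer = BatchingWriter(round_trips.append, batch_size=100)
for i in range(250):
    writer.put(format(i, "08x"), b"payload")
writer.flush()

print(len(round_trips), [len(batch) for batch in round_trips])
```

The trade-off is that buffered rows are lost if the process dies before a flush, so the batch size is a balance between throughput and exposure.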

Page 36: Cisco OpenSOC

And Finally

Page 37: Cisco OpenSOC

Kafka Spout

Page 38: Cisco OpenSOC

Kafka Spout

Parallelism is controlled by the number of partitions per topic
Set Kafka spout parallelism equal to the number of partitions in the topic
Other key parameters that drive performance:
  fetchSizeBytes
  bufferSizeBytes
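Why spout parallelism should match the partition count can be seen with a simplified model. The Python sketch below spreads partitions over spout tasks round-robin; this is an assumed model for illustration and not necessarily how the storm-kafka spout assigns partitions internally.

```python
def assign_partitions(num_partitions, num_spout_tasks):
    """Spread partitions over spout tasks round-robin (simplified model)."""
    assignment = {task: [] for task in range(num_spout_tasks)}
    for partition in range(num_partitions):
        assignment[partition % num_spout_tasks].append(partition)
    return assignment


def idle_tasks(assignment):
    """Tasks that received no partition and therefore read nothing."""
    return [task for task, partitions in assignment.items() if not partitions]


matched = assign_partitions(12, 12)   # one partition per task
too_many = assign_partitions(12, 16)  # extra tasks sit idle
too_few = assign_partitions(12, 4)    # each task reads several partitions

print(len(idle_tasks(matched)), len(idle_tasks(too_many)), too_few)
```

In this model, tasks beyond the partition count do no work, while fewer tasks than partitions means each task has to read several partitions.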

Page 39: Cisco OpenSOC

Mysteriously Missing Data

Page 40: Cisco OpenSOC

Mysteriously Missing Data Root Cause

A bug in the Kafka spout caused it to miss some partitions and lose data
It is now fixed and available from the Hortonworks repository (http://repo.hortonworks.com/content/repositories/releases/org/apache/storm/storm-Kafka)

Page 41: Cisco OpenSOC

Storm

Page 42: Cisco OpenSOC

Storm

Every small thing counts at scale
Even simple string operations can slow down throughput when executed on millions of tuples

Page 43: Cisco OpenSOC

Storm

Error handling is critical
Poorly handled errors can lead to topology failure and eventually loss of data (or data duplication)
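One way to read this advice is to treat permanent and transient errors differently. The Python sketch below is an assumption about what good handling could look like, not the authors' implementation: `RecordingCollector` is a hypothetical stand-in for a Storm output collector, malformed input is acknowledged and routed aside so it is not replayed forever, and a transient write failure is failed so the tuple can be replayed.

```python
import json


class RecordingCollector:
    """Hypothetical stand-in for an output collector; records what happened."""

    def __init__(self):
        self.acked, self.failed, self.errors = [], [], []

    def ack(self, tuple_id):
        self.acked.append(tuple_id)

    def fail(self, tuple_id):
        self.failed.append(tuple_id)

    def emit_error(self, payload):
        self.errors.append(payload)


def execute(tuple_id, payload, collector, write_fn):
    """Process one tuple without letting an exception escape the bolt."""
    try:
        message = json.loads(payload)
    except ValueError:
        # Malformed input will never parse; set it aside and ack it
        # so it is neither silently lost nor replayed in a loop.
        collector.emit_error(payload)
        collector.ack(tuple_id)
        return
    try:
        write_fn(message)
    except ConnectionError:
        # Transient failure: fail the tuple so it can be replayed later.
        collector.fail(tuple_id)
        return
    collector.ack(tuple_id)


stored = []


def flaky_write(message):
    """Demo sink that is unavailable for one specific message."""
    if message.get("seq") == 2:
        raise ConnectionError("store unavailable")
    stored.append(message)


collector = RecordingCollector()
for tuple_id, payload in enumerate(['{"seq": 0}', "not json", '{"seq": 2}']):
    execute(tuple_id, payload, collector, flaky_write)

print(collector.acked, collector.failed, collector.errors)
```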

Page 44: Cisco OpenSOC

Storm

Tune & scale individual spouts and bolts before performance testing/tuning the entire topology
Write your own simple data generator spouts and no-op bolts
Making as many things configurable as possible helps a lot
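The idea of a data generator spout and a no-op bolt can be tried outside Storm first. The Python sketch below is a stand-alone approximation under that assumption: a generator plays the spout, a function that does nothing plays the bolt, and `measure` times a single stage in isolation. The message shape and counts are made up for illustration.

```python
import json
import time


def generate_messages(count):
    """Synthetic 'spout': yields predictable JSON messages."""
    for i in range(count):
        yield json.dumps({"src_ip": f"10.0.{(i >> 8) & 255}.{i & 255}", "seq": i})


def noop_bolt(message):
    """No-op 'bolt': consumes the message and does nothing."""
    return None


def measure(stage, messages):
    """Run one stage over all messages; return (processed_count, seconds)."""
    start = time.perf_counter()
    processed = 0
    for message in messages:
        stage(message)
        processed += 1
    return processed, time.perf_counter() - start


# Baseline: generator cost alone. Then the same input through a real parse step.
baseline = measure(noop_bolt, generate_messages(10_000))
parsing = measure(json.loads, generate_messages(10_000))

print(f"no-op: {baseline[1]:.4f}s, json parse: {parsing[1]:.4f}s")
```

Comparing a stage against the no-op baseline shows how much of the elapsed time belongs to that stage rather than to the data source.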

Page 45: Cisco OpenSOC

Lessons Learned

When it comes to Hadoop… partner up
Separate the hype from the opportunity
Start small then scale up
Design iteratively
It doesn’t work unless you have proven it at scale
Keep an eye on ROI

Page 46: Cisco OpenSOC

How can you contribute?

Technology Partner Program: contribute developers to join the Cisco and Hortonworks team
Looking for Community Partners
Cisco + Hortonworks + Community Support for OpenSOC

Page 47: Cisco OpenSOC

Thank you!

We are hiring:

[email protected]@hortonworks.com