Fundamentally Better Cloud Security - Broadcom Inc.
WHITEPAPER
AUTHOR
Deena Thomchick
SENIOR DIRECTOR OF CLOUD, BLUE COAT
[Cover graphic labels: ContentIQ, Cloud DLP, StreamIQ, ThreatScore, User Behavior Analysis, Data Governance, Contextual Analysis]
Fundamentally Better Cloud Security
TAKING A DATA SCIENCE APPROACH
Introduction
Elastica CloudSOC uses the latest data science techniques, combining machine learning and advanced
math, to provide fundamentally more intelligent and responsive security for the cloud. Our scientists are
continually developing and tuning data science-driven engines and algorithms that take advantage of
expansive processing and storage resources available in the cloud. This highly flexible scientific approach
enables Elastica CloudSOC to keep up with the speed of change while identifying, analyzing, and controlling
more user activity, data, and apps with more accuracy.
Cloud Challenges for Security
Cloud adoption challenges IT departments to secure a constantly changing, vast landscape of cloud
territory that they do not control.
• Cloud providers continually update and change their services without warning (as often as every
other week for some services).
• End users regularly adopt new cloud apps without notifying IT.
• Individual end users control what content they choose to upload and share—often without fully
understanding the risks associated with what they do.
• Third parties opportunistically uncover confidential company data
accidentally shared with the public.
• Cyber criminals target cloud accounts as a means to access data, spread
malware, or exfiltrate data.
• Visibility and control challenges associated with traditional security are
duplicated and intensified in the cloud.
• How do you scale data governance and DLP to the vast amount of content being uploaded and
stored in the cloud? How do you avoid duplicating efforts in defining and tuning policies and ensure
minimal false positives?
• How do you detect cloud threat activity when it occurs outside your network infrastructure? How do
you detect advanced threats or malicious user activity when no signatures exist to identify them?
Profoundly improved visibility, security, and DLP in the cloud by leveraging new data science techniques and elastic cloud resources
Fundamentally Better Cloud Security: Taking a Data Science Approach. Copyright © 2016 Symantec Corp. All Rights Reserved.
Cloud Offers More Resources for Security
On a positive note, the cloud opens up new possibilities for solving
classic security problems. Elastica solutions are built upon innovative data
science-driven engines that break through known constraints of traditional
security solutions by leveraging the power of the cloud itself.
More Processing Power
Cloud resources are elastic and the inherent efficiency of cloud provisioning
means security solutions are no longer constrained by the limited processing
power in an on-prem appliance.
More Storage
The cloud offers flexible, expanded storage options unlike on-prem appliances, making it easy to expand
your storage as needed. Too often security appliances are handicapped by storage limitations—resulting in
security based on incomplete intelligence databases and an inability to save enough log files for effective
incident response investigations.
Critical Intelligence Necessary to Protect Your Organization
A highly effective and accurate cloud security system must
be able to identify key information and to evaluate the
contextual significance of that information in order to turn
it into useful intelligence. Elastica CloudSOC solutions are
based on data science-driven engines to address critical
categories of intelligence essential to effectively protect
your organization.
1. ACTIVITY
a. Knowledge: What is happening between my
users and the cloud? What actions are my users
taking in what cloud services?
b. Security: Is this activity a problem? Could hackers
or malware be getting into my accounts?
2. CONTENT
a. Knowledge: What sensitive data from my organization
is being kept in the cloud?
b. Security: Is any of this sensitive data exposed or
at risk of being exposed?
What IT must do when users adopt cloud apps
Identify and govern confidential data in the cloud to stay compliant and protect intellectual property.
Protect against increasingly sophisticated and damaging cyber crime.
Unsupervised Machine Learning
for when you know you don’t know what you don’t know.
Unsupervised machine learning lets the machines do freeform data discovery. It is a great way to discover source data necessary to guide learning systems to make smart decisions, when you don’t specifically know what that source data should be.
Supervised Machine Learning
for when you know what you don’t know.
Supervised machine learning is a great way to analyze large quantities of source data and sort it into a foundation of knowledge that can be used by systems to make decisions and take actions. It enables a system to use a much larger set of source data, analyze it based on a larger set of characteristics, and process that big data to achieve more effective outcomes.
Data Science for Better Visibility, Control, and Response to Threats
You need deep visibility into real-time traffic: not just which apps users are accessing, but also exactly what
they are doing within each app. Getting to this level of granular and contextual knowledge is difficult. It requires
a system with the ability to read the real meaning in volumes of traffic that uses obscure machine language
identifiers to communicate with disparate systems. Additionally, this system must be adaptive, drawing on a
foundation of knowledge that is continually relearned, because these machine language identifiers
can be changed at any time, without notice or documentation, by third-party cloud service development teams.
Elastica data scientists leverage the unique horsepower of cloud computing and machine learning to build
a rich foundation of knowledge. Based on that foundation, they build contextual algorithms that can deliver
a more detailed understanding of user behavior and cloud activity than possible with other traffic analysis
systems. Then they leverage cloud processing power to execute these advanced algorithms.
StreamIQ™ is the advanced extraction technology that enables Elastica to understand transactions
performed by users in a cloud app in more granular detail than is possible in most traffic analysis systems,
such as Next Generation Firewalls and Secure Web Gateways. This improved ability to identify the who, what,
where, and when in traffic between your users and cloud accounts is critical to identifying and acting on
potential threats to your organization.
Analyzing Cloud Activity with StreamIQ
In traffic analysis you need to track
what activities are being performed
by what users with what cloud apps in
what context. You need details such
as: What actions are being taken? Are
they associated with a specific file with
specific attributes that would make
them important? Are these actions
associated with a cloud app you
consider risky? Is this activity normal
for this user?
New cloud apps are popping up
all the time and existing cloud
apps are continually changing their
programming. Any system would find
it extremely difficult to keep up with
this constantly shifting environment.
StreamIQ Intelligence Fuels Elastica CloudSOC Detect, Protect, and Investigate
Machine learning in StreamIQ drives more accurate and deeper real-time activity tracking for more cloud apps. Elastica solutions use the unique intelligence in StreamIQ to detect more threats, enforce protection with a more granular level of control, and investigate security incidents more effectively.
StreamIQ
• Identifies more details on granular transactions in live traffic
• Analyzes traffic and identifies instructions custom to many apps—sanctioned and unsanctioned
• Automatically updates to accommodate cloud app code changes to stay accurate
• Powers more accurate risk analysis based on better activity intelligence
• Enables more granular policy controls
• Provides more useful data for incident response investigations
Extracting action indicators, object identifiers, and user information from machine-readable text can be
exceedingly difficult. Traditional approaches can’t keep up; as a result, they track only a few
gross identifiers and commonly break without warning when cloud services change their code.
Elastica’s StreamIQ technology leverages both unsupervised and supervised machine learning and very
deep content inspection to extract granular cloud activity, which fuels Elastica’s Protect, Detect and
Investigate applications.
Elastica uses both unsupervised and supervised machine learning to create StreamIQ, the intelligence
engine and algorithms that fuel Elastica traffic analysis, CloudSOC ThreatScores for Detect, Protect
rules for visibility and policy control, and the high quality log data in Investigate for incident response.
Our scientists start with a few significant characteristics known to be associated with important traffic
attributes. They use these as starting points for unsupervised machine learning that can identify significant
instructions in machine code that would be very difficult or maybe impossible to find any other way. This
foundational discovery of significant instructions is then fed into supervised machine learning systems that
provide the content and contextual intelligence needed to turn this data into the foundation of knowledge
in StreamIQ. Then powerful StreamIQ algorithms use this knowledge base to read traffic no other system
can interpret. Because machines do this work fast, the Elastica system can keep up with a continually
changing cloud landscape.
Essentially, StreamIQ figures out what the machine code in cloud traffic actually means thanks to this data
science approach so it can deliver a uniquely granular level of traffic intelligence into the Elastica solutions.
Significance
• The domain is cleared of
portions (“12” and “dl”) that
occlude the actual cloud app
(“filesharing.cloudapp”).
• The action (Downloading as
a ZIP) is not explicitly stated
and must be inferred from
multiple portions of the
URL. For this application,
downloading as a ZIP
indicates that there will be
one or more files comprising
the ZIP, and we should search
for each of them.
• The filenames are embedded
in a hierarchy of data formats
and are not near one another,
increasing the difficulty of
extracting them.
POST https://12.dl.filesharing.cloudapp.com/documents/unshared?session=KSGBYV8TQZX&t=zip&aqs=chrome..69i57.2678j0j1&sourceid=chrome&ie=UTF-8 HTTP/1.1
Host: 12.dl.filesharing.cloudapp.com
cookie: PREF=ID=08DHMNG54O2X:U=2Q7SPLK15OTW
content-length: 126
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2)
content-type: application/x-www-form-urlencoded;charset=UTF-8
accept: */*
token=9YDP70JR5ZCS&payload=[{"file":"passwords.txt","parent":"credentials","confirm":false,"expires":60},{"file":"id_rsa","parent":"credentials","confirm":false,"expires":60}]
Application: filesharing.cloudapp
Action: Downloading multiple files as a ZIP file
Files: passwords.txt and id_rsa
StreamIQ Traffic Analysis
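The extraction illustrated above can be approximated with a few hand-written rules. The sketch below is only an illustration of the three inferences listed under "Significance" (clean the domain, infer the action, collect the filenames); it is not StreamIQ, which the text says is learned rather than hand-coded. The noise-label list, the function names, and the normalization of the payload to valid JSON are all assumptions made for this example.

```python
import json
from urllib.parse import urlsplit, parse_qs

# Sample request modeled on the illustration above. The payload is written as
# valid JSON here; that normalization is an assumption of this sketch.
REQUEST_LINE = (
    "POST https://12.dl.filesharing.cloudapp.com/documents/unshared?"
    "session=KSGBYV8TQZX&t=zip&aqs=chrome..69i57.2678j0j1&sourceid=chrome"
    "&ie=UTF-8 HTTP/1.1"
)
BODY = (
    'token=9YDP70JR5ZCS&payload=[{"file":"passwords.txt","parent":"credentials",'
    '"confirm":false,"expires":60},{"file":"id_rsa","parent":"credentials",'
    '"confirm":false,"expires":60}]'
)

# Hypothetical rule: numeric labels and common service prefixes at the front
# of a hostname hide the real application name.
NOISE_LABELS = {"dl", "www", "api"}


def app_from_host(host: str) -> str:
    """Strip the top-level domain and any noisy leading labels."""
    labels = host.lower().split(".")[:-1]
    while labels and (labels[0].isdigit() or labels[0] in NOISE_LABELS):
        labels.pop(0)
    return ".".join(labels)


def analyze(request_line: str, body: str) -> dict:
    """Return the application, inferred action, and files for one request."""
    method, url, _version = request_line.split(" ", 2)
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    form = parse_qs(body)

    # Filenames are nested inside a JSON value inside a form field.
    entries = json.loads(form["payload"][0])
    files = [entry["file"] for entry in entries if "file" in entry]

    # The action is never stated outright; infer it from several URL parts.
    if method == "POST" and query.get("t") == ["zip"]:
        if len(files) > 1:
            action = "Downloading multiple files as a ZIP file"
        else:
            action = "Downloading a file as a ZIP file"
    else:
        action = "Unknown"

    return {
        "application": app_from_host(parts.hostname or ""),
        "action": action,
        "files": files,
    }


result = analyze(REQUEST_LINE, BODY)
```

Hand-written rules like these are exactly what breaks when a cloud service changes its request format, which is the motivation the text gives for learning the rules instead.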
Security that Recognizes Risky Activity
Once you know what is happening in cloud apps, you must be able to identify if that activity poses a risk.
The key to activity-based security analysis lies in the ability to identify when activity represents abnormal
user behavior likely indicating a threat. Cloud activity that follows typical user behavior patterns indicates
everything is probably normal. Malicious activity, whether caused by a malware attack, a hijacked account,
or a malicious insider, usually manifests as abnormal activity that can be identified; for example, more
frequent logins or uploads than normal for a particular user can indicate an account takeover. It may sound
simple, but implementing it effectively requires extensive foundational knowledge and smart adaptive tracking
systems. Otherwise, you have a system that doesn’t identify abnormalities very well, or requires too much
manual babysitting because it creates a lot of false positives.
User Behavior Analysis
Generic user-behavior-based security controls rely on manually
set event thresholds and simple defined actions. This is not
true user behavior analysis because these simplistic controls
are not set based on individual user behavior. Controls based
on gross assumptions are relatively easy to set up but not
very accurate unless used judiciously and balanced by more
nuanced user behavior analysis. An example of a useful generic
behavior threshold control would be a rule to freeze access to
an account if there were three failed user login attempts within
a short period of time.
Another common generic threshold control is to trigger a
response if a user uploads more than a certain number of files
within a particular time period. But how do you decide what
number constitutes larger than normal when some users hardly
ever upload files and others upload lots of files? If this arbitrary
threshold is too high, it won’t catch genuinely malicious
activity, and if it is too low, it will trigger lots of false positives,
creating extra work for IT and frustration for users.
For example, User A may typically batch upload 50 files every Friday to Salesforce but never upload files to Google Drive, except one day when they batch upload 15 files. User B may rarely upload files to Google Drive, except one afternoon when they suddenly upload 10 files. A generic user behavior threshold based on 20 uploads in 10 minutes would falsely flag User A’s Salesforce behavior as potentially malicious, would not flag User A’s Google Drive uploads, and would not register User B’s behavior as abnormal at all.
[Figure: Usual behavior (repeated weekly) versus anomalous behavior (single instance) for User A and User B. Values shown: > 50, < 3, 15, 10]
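The User A and User B example can be worked through in a few lines. The sketch below compares the generic "20 uploads" rule with a per-user, per-app baseline. The history values, the z-score test, and its cutoff are assumptions chosen for illustration; the text does not describe how Elastica models a baseline.

```python
from statistics import mean, pstdev

# Hypothetical upload history: (user, app) -> uploads per session in past weeks.
HISTORY = {
    ("A", "Salesforce"): [50, 48, 52, 50, 51],
    ("A", "Google Drive"): [0, 0, 0, 0, 0],
    ("B", "Google Drive"): [1, 0, 2, 0, 1],
}

GENERIC_THRESHOLD = 20  # the "20 uploads in 10 minutes" rule from the text


def generic_flag(count: int) -> bool:
    """One fixed threshold applied to every user and every app."""
    return count >= GENERIC_THRESHOLD


def baseline_flag(user: str, app: str, count: int,
                  z_cutoff: float = 3.0, min_spread: float = 1.0) -> bool:
    """Flag a count that is far from this user's own history with this app."""
    past = HISTORY.get((user, app), [])
    if not past:
        return True  # no history at all: treat first-time activity as unusual
    center = mean(past)
    spread = max(pstdev(past), min_spread)  # floor avoids dividing by ~0
    return abs(count - center) / spread > z_cutoff


EVENTS = [("A", "Salesforce", 50), ("A", "Google Drive", 15), ("B", "Google Drive", 10)]
comparison = [
    (user, app, count, generic_flag(count), baseline_flag(user, app, count))
    for user, app, count in EVENTS
]
```

With these assumed histories, the generic rule flags only the routine Salesforce batch, while the per-user baseline flags only the two unusual Google Drive uploads.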
User Behavior Analysis Intelligence Feeds Elastica CloudSOC Detect and Protect
Machine learning enables highly granular personal user behavior profiles to more accurately identify risky activity in cloud apps. Elastica solutions use intelligence based on user behavior analysis and ThreatScores to detect threats, automatically enforce policies, and provide better visibility into risky activity.
Elastica UBA Intelligence
• More aware of abnormal activity due to more granular understanding of typical user behavior
• Minimizes false positives through individualized and contextualized user behavior modeling
• Faster response with automated ThreatScore calculations
The Elastica system is designed to identify individual user behavioral patterns in context with app, time,
objects, access method, etc. This user-specific, context-based method is much more accurate for
identifying potentially malicious activity. However, a system that can provide a unique baseline for individual
user behavioral patterns requires the ability to classify, analyze, and maintain a large volume of intelligence
data. It requires a system able to adapt to changing patterns over time, and able to interpret the significance
of deviations from normal and translate that deviation as usable, actionable information.
Individualized & Contextualized User Behavior Profiles
Elastica uses machine learning with expansive cloud processing and storage resources to power a
self-training User Behavioral Analysis (UBA) engine. The UBA engine uses computational analysis algorithms to
analyze transactional data from StreamIQ. UBA algorithms develop a confidence curve for normal behavior
customized to individual users in context with specific actions, apps and other attributes to create and
maintain collections of highly accurate user behavior profiles.
This baseline of normal activity opens up many more opportunities to
accurately identify abnormal and potentially malicious activity without creating a deluge of false positives
at the same time.
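One requirement stated above is that a baseline must adapt as a user's normal pattern changes. A minimal sketch of that idea, assuming an exponentially weighted running mean and variance (a common general-purpose technique, not something the text attributes to the UBA engine), looks like this. The class name, `alpha`, and `min_spread` are hypothetical.

```python
class AdaptiveBaseline:
    """Running estimate of 'normal' for one user, app, and action."""

    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha   # weight given to the newest observation
        self.mean = None     # no estimate until the first observation
        self.var = 0.0

    def deviation(self, value: float, min_spread: float = 1.0) -> float:
        """Distance of a value from the current baseline, in spread units."""
        if self.mean is None:
            return 0.0
        spread = max(self.var ** 0.5, min_spread)
        return abs(value - self.mean) / spread

    def update(self, value: float) -> None:
        """Fold a new observation into the running mean and variance."""
        if self.mean is None:
            self.mean = float(value)
            return
        diff = value - self.mean
        step = self.alpha * diff
        self.mean += step
        self.var = (1 - self.alpha) * (self.var + diff * step)


baseline = AdaptiveBaseline()
for _ in range(10):
    baseline.update(5)                 # the user normally uploads about 5 files
first_jump = baseline.deviation(40)    # a sudden jump looks abnormal
for _ in range(30):
    baseline.update(40)                # the new level persists for weeks
settled = baseline.deviation(40)       # and is eventually treated as normal
```

The trade-off in any such scheme is the adaptation rate: a larger `alpha` follows legitimate change quickly but also lets a slow, deliberate drift go unnoticed.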
Identifying Suspicious Activity with ThreatScore
Once a system can identify what is normal, it becomes possible to identify what is abnormal and therefore
suspicious. If only it were so simple. How far from normal must behavior drift before it becomes abnormal?
How do you evaluate increasing levels of risk as abnormal activity increases? How can you enable the
solution to automatically respond with appropriate levels of security controls?
Our scientists tackled this problem with another layer of data
science to identify and measure the severity of activity that
deviates from normal. CloudSOC Detect uses computational
analysis of user behavior to identify and score the severity
of incidents representing risk. It then correlates this user
behavior score with threshold-defined triggers and detection
of suspicious sequences of events to calculate a dynamic,
continually updated ThreatScore for each user and action.
CloudSOC Detect displays a dynamic map of user behavior
events with granular event ThreatScores and color coding to
identify levels of risk severity for each user.
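To make the three inputs named above concrete (a behavior deviation score, threshold-defined triggers, and suspicious sequences of events), here is a toy running score. Every weight, the decay factor, and the example sequence are invented for illustration; the actual ThreatScore calculation is not described in the text.

```python
from dataclasses import dataclass, field


@dataclass
class UserRisk:
    score: float = 0.0                          # running score, kept in 0..100
    recent: list = field(default_factory=list)  # recent action names


# Hypothetical sequence suggesting a brute-forced login followed by bulk export.
SUSPICIOUS_SEQUENCE = ("login_failed", "login_failed", "login_ok", "bulk_download")


def sequence_hit(recent: list) -> bool:
    """True when the most recent actions match the suspicious sequence."""
    n = len(SUSPICIOUS_SEQUENCE)
    return tuple(recent[-n:]) == SUSPICIOUS_SEQUENCE


def update_score(risk: UserRisk, action: str, deviation: float,
                 threshold_tripped: bool, decay: float = 0.9) -> float:
    """Fold one event into the user's running score and return it."""
    risk.recent.append(action)
    points = min(deviation, 10.0) * 4.0             # behavior deviation, up to 40
    points += 25.0 if threshold_tripped else 0.0    # fixed-threshold trigger
    points += 35.0 if sequence_hit(risk.recent) else 0.0
    # Older evidence fades, so the score tracks current risk.
    risk.score = min(100.0, risk.score * decay + points)
    return risk.score


risk = UserRisk()
history = [
    update_score(risk, "login_failed", 0.0, False),
    update_score(risk, "login_failed", 0.0, False),
    update_score(risk, "login_ok", 1.0, False),
    update_score(risk, "bulk_download", 8.0, True),
]
```

The decay term is one simple way to get a "continually updated" score: a user who stops behaving unusually drifts back toward zero without any manual reset.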
Data Science for More Accurate Data Governance
Your organization already has content in the cloud, probably much more than you realize. With the heavy
adoption of Office 365, Google Apps, Box, Dropbox, Salesforce, AWS, etc., it is foreseeable that most of
your content will eventually be housed in the cloud. You need to know what files and accounts contain
sensitive, confidential, and/or compliance-governed content, who has access to that content, which users are
associated with that content, and how exposed it is to risk. This is important because exposure can result
in material losses for an organization through the loss of intellectual property and/or compliance violations.
The Elastica ContentIQ™ technology uses data science to tackle DLP and data governance employing both
unsupervised and supervised machine learning techniques as well as computational linguistics analysis to
achieve more accurate content identification and classification.
Industry-leading solutions such as Symantec DLP offer layers of sophisticated technologies to accurately
identify data, but most DLP offerings are not very accurate at identifying sensitive content due to a number
of factors:
ContentIQ Starts by Building a Better Foundation of Knowledge
Interpreting language, design formats and expression structures in a document, email, database field, etc. is
key to effective DLP. To accurately identify and classify content you need to know a lot about characteristics
that indicate specific types of data. Even the most brilliant algorithms are only as good as the underlying
source data.
Today it is possible to identify a far more extensive volume of content indicators by
leveraging newer data science techniques combined with bigger computational
resources. Our data scientists use unsupervised machine learning for indicator
discovery. We access public collections of big data containing different types of
content for data mining. We apply the discovery and clustering capabilities of
unsupervised machine learning to these collections to identify many more class-related
terms, expressions, and characteristics than would be possible manually. Using
cloud resources and automated systems, our programs maintain an up-to-date,
robust foundation of ContentIQ indicator knowledge to feed our content
identification and classification engine.
Limited Knowledge of Indicators
To identify indicators of sensitive data, some solutions
rely solely on limited dictionaries for regular
expression matching. This makes it difficult to
identify data containing industry or topic specific
terminology or terms in various languages.
Limited or No Contextual Analysis
Regular expression matching in many DLP
solutions has zero or minimal ability to effectively
evaluate an indicator or multiple indicators
in context. For example, a 16-digit number can
represent any number of things other than a
credit card number so a system that flags every
file containing a 16-digit number will create
many false positives. The same goes for phone
numbers, social security numbers, etc. Without
context, how accurately can a system identify
data with even broader types of indicators such
as source code or legal content?
Lack of Customized Intelligence
Basic DLP solutions do not have the capability to
customize their analysis of files based on typical
form structures used by a particular organization.
Symantec DLP and Elastica CASB are rare in their
ability to offer this capability.
Performance Constraints
Sometimes a DLP solution is limited to small
dictionaries and simple matching or can only
scan a subset of files because it runs on an
appliance with inherent processing, memory,
and/or storage constraints that inhibit the ability
to scan content in real time.
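The 16-digit example under "Limited or No Contextual Analysis" above can be made concrete. The sketch below adds two checks on top of a plain pattern match: the Luhn checksum that payment card numbers satisfy, and a search for card-related words near the match. The regular expression, the word list, and the window size are assumptions for illustration, not a description of any product's rules.

```python
import re

# Sixteen digits, optionally grouped with spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")

# Hypothetical context vocabulary; a real system would use a far larger set.
CONTEXT_TERMS = ("card", "visa", "mastercard", "expir", "cvv")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str, window: int = 40) -> list:
    """Return 16-digit matches that pass the checksum AND have nearby context."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        start = max(0, match.start() - window)
        nearby = text[start:match.end() + window].lower()
        if luhn_ok(digits) and any(term in nearby for term in CONTEXT_TERMS):
            hits.append(digits)
    return hits
```

Requiring both checks is a deliberately strict choice: it will miss a bare card number with no surrounding words, which is the kind of trade-off between false positives and false negatives that the text argues needs richer contextual models.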
ContentIQ Uses Contextual Analysis to Deliver True Positives
Once a system has a rich source of foundational knowledge, the next step is to build models that can
analyze the relationships between multiple indicators in order to evaluate their significance in context.
Sophisticated contextual analysis will greatly increase a system’s ability to identify true positives that could
easily be missed with traditional systems and it will greatly reduce incidence of false positives.
ContentIQ’s contextual analysis enables a system to accurately classify content by first identifying strong
class indicators as well as terms, structures, and characteristics that may only be weakly connected to a
classification category, then analyzing the collection of indicators in relation to each other. Many weak
indicators with strong relationships help identify difficult-to-target content such as financial, legal, or design
documents. They also help to prevent false positives by narrowing the likely meaning of a term or expression
based on its relationship to additional identifiers that may be present in the same file.
For example, how do you classify a file that contains a 10-digit number?
To most systems that number looks like this ##########. Is that a
phone number? IP address? Swedish or Danish national ID number?
Record number? Account number? Part number? Inventory volume?
Without contextual analysis this data will be flagged and the probability of
a false positive is high, so manual analysis of this file will be necessary.
Take this example a little further to show how vast the non-contextual
identification problem can be. Say a multinational company with customers
located globally wants to make sure they don’t expose any national ID
numbers. The characteristics of these numbers vary by country. U.S. and
U.K. IDs contain 9 digits; in Sweden and Denmark IDs are 10 digits; in
Russia, Turkey, and Norway they contain 11 digits; in Japan and Malaysia IDs
contain 12 digits; in many other countries IDs are 13 digits. So this company
needs a system that identifies ID numbers that could contain from 9 to
13 digits. A 9-, 10-, 11-, or 12-digit item could also be a phone number. An IP
address can also appear as a 10-digit number. And any of these multi-digit numbers
could be part of an address or they could just be some random record
number, part number, inventory volume number, or account number.
This problem is mitigated when expressions are analyzed in context
with a collection of other indicators. For example, if a 10-digit number
is identified AND it is structurally associated with a name AND Swedish
language indicators appear, it is more likely to be a national ID. If a 10-digit
number is identified AND computer or engineering indicators appear, it is
more likely an IP address. If a 10-digit number is identified AND it is associated
with a name AND a phone-related indicator appears AND/OR what
structurally looks like an address appears, then it is likely a phone number.
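The AND rules in the paragraph above translate directly into a small classifier. In this sketch the indicator vocabularies are tiny, hand-picked, and hypothetical, and context is taken from the whole document rather than from structural proximity, so it is far cruder than the analysis the text describes.

```python
import re

TEN_DIGITS = re.compile(r"\b\d{10}\b")

# Hypothetical indicator vocabularies, one per candidate meaning.
INDICATORS = {
    "national_id": {"personnummer", "namn", "födelsedatum"},
    "ip_address": {"server", "subnet", "gateway", "host"},
    "phone_number": {"phone", "tel", "call", "address", "street"},
}


def classify_ten_digit_numbers(text: str) -> list:
    """Label each 10-digit number using indicators found in the same text."""
    words = set(re.findall(r"[^\W\d_]+", text.lower()))
    scores = {label: len(words & terms) for label, terms in INDICATORS.items()}
    best = max(scores, key=scores.get)
    label = best if scores[best] > 0 else "unclassified"
    return [(match.group(), label) for match in TEN_DIGITS.finditer(text)]


swedish = classify_ten_digit_numbers("Namn: Anna Svensson, personnummer 8112189876")
network = classify_ten_digit_numbers("Gateway host 1921681025 on subnet")
contact = classify_ten_digit_numbers("Call Jane, phone 5555551212, street address follows")
unknown = classify_ten_digit_numbers("Part 1234567890")
```

A number with no supporting indicators is returned as "unclassified" rather than guessed, which mirrors the point that, without context, such matches need manual review.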
[Figure: A sample patient registration form ("[NAME OF PRACTICE] REGISTRATION FORM") with fields for patient information, insurance information, and emergency contact, containing sample values such as 555 555-1212, 222-22-2222, and North Capitol St. NW, Washington DC 20001. Callouts show how surrounding indicators determine the classification of a number: Swedish language indicators point to a National ID Number; computer language indicators point to an IP Address; name, phone, and address indicators point to a Phone Number.]
Simple Relationship Rules Are Not Enough
Some systems perform a very basic contextual analysis
by using simple assumption-based rules such as if a file
contains “SSN” or “Social Security Number” followed by a
9-digit number, identify it as containing PII. This approach
is better but not sufficient. These systems will not correctly
identify a significant amount of content because they rely
on structural relationship definitions that are too simple.
Sophisticated Relationship Models Are Required
The Elastica scientists take a computational approach to
the challenge of analyzing relationships for ContentIQ.
They leverage supervised machine learning to identify,
codify and prioritize relationships between a broad range
of both strong and weak indicators. Then they distill this
foundational relationship knowledge into computational
models to power ContentIQ algorithms to accelerate
identifying and classifying sensitive data and delivering
true positives without triggering false positives.
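As a rough illustration of learning how much each indicator should count, the sketch below derives a log-odds weight per indicator from a handful of labeled examples, using a presence-only, naive-Bayes-style estimate with add-one smoothing. The training data, indicator names, and the choice of technique are all assumptions; the text says only that supervised machine learning is used, not which model.

```python
import math
from collections import Counter

# Hypothetical labeled examples: (indicators found in a file, is_sensitive).
TRAINING = [
    ({"ssn_label", "nine_digits", "name"}, True),
    ({"nine_digits", "name", "birth_date"}, True),
    ({"nine_digits", "birth_date", "insurance"}, True),
    ({"nine_digits", "invoice", "part_number"}, False),
    ({"nine_digits", "inventory"}, False),
    ({"name", "invoice"}, False),
]


def learn_weights(examples):
    """Estimate a log-odds weight for each indicator from labeled examples."""
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for indicators, sensitive in examples:
        if sensitive:
            n_pos += 1
            pos.update(indicators)
        else:
            n_neg += 1
            neg.update(indicators)
    weights = {}
    for indicator in set(pos) | set(neg):
        # Add-one smoothing keeps unseen combinations from producing log(0).
        p = (pos[indicator] + 1) / (n_pos + 2)
        q = (neg[indicator] + 1) / (n_neg + 2)
        weights[indicator] = math.log(p / q)
    bias = math.log(n_pos / n_neg)
    return weights, bias


def score(indicators, weights, bias):
    """Positive totals lean 'sensitive'; negative totals lean 'not sensitive'."""
    return bias + sum(weights.get(indicator, 0.0) for indicator in indicators)


weights, bias = learn_weights(TRAINING)
no_label_form = score({"nine_digits", "name", "birth_date"}, weights, bias)
invoice = score({"nine_digits", "invoice"}, weights, bias)
```

In this toy data a 9-digit number ends up as a weak indicator on its own, yet a file with no "SSN" label still scores as sensitive once name and birth date indicators appear alongside it, which is the case simple label-plus-number rules miss.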
Remediate and Control with ContentIQ
Once you can confidently identify sensitive content that
is either already in the cloud or on its way to the cloud,
you need to perform a risk analysis. How sensitive do you
consider this class of content? Is it content that falls under
compliance regulations? Is it valuable intellectual property?
Is it data so sensitive that it should not be shared even
within your own organization? ContentIQ will help answer
these questions, making it easier to set your guidelines on
these issues. Once you decide what you want to do, you
can define customized ContentIQ profiles and build Elastica CloudSOC policies to automatically remediate
and control where this content can be stored, enforce encryption on this data, set limitations on accessing
or sharing this data, etc.
Identifying Content Unique to an Organization
Some organizations use specific types of data in formats unique to their organization. The ContentIQ machine learning capabilities used to discover content indicators from publicly available data sources can also be used to discover and leverage content indicators unique to an organization. The learning system is smart enough that CloudSOC customers need to feed just a few examples specific to their organization into their Elastica account as training profiles and ContentIQ will learn to look for content in those formats.
ContentIQ Intelligence Enables Accurate Data Governance and DLP with Elastica CloudSOC Detect, Protect, and Investigate
Machine learning in ContentIQ powers a sophisticated computational linguistics approach to content analysis to more accurately identify and classify sensitive data. Elastica solutions use the unique intelligence in ContentIQ to detect sensitive data, protect against data leakage, and investigate security incidents more effectively.
• Identify compliance-related content such as PII, HIPAA, and PCI with fewer false positives, even when in nontraditional formats
• Identify classes of sensitive content (source code, design documents) with more accuracy
• Track sensitive data stored in sanctioned apps
• Identify sensitive data in traffic to many different unsanctioned and sanctioned apps
• Power more accurate identification of risky activity
• Enable more accurate data governance policy controls
• Provide useful data for incident response investigations
[Figure: A 9-digit number can be a ZIP Code, a date, a dollar amount, an SSN, or a routing number. Sample values shown include "ID: 123-45-6789", "Jane Doe", "10/10/2010", "ten million", and "10,000,000.00".]
The Information You Need for Incident Response
Security incidents will occur. That’s the reality of today’s cloud threat landscape and IT departments will at
some point be scrambling to figure out what happened. This type of investigation can be challenging if not
impossible with traditional perimeter security.
Typical Challenges Faced by Incident Response Investigations
Traditional data sources used to investigate security incidents offer up some big challenges, such as:
• Appliances with limited historical data due to storage resource constraints
• Log data that doesn’t include enough granular information to answer important questions
• Vast quantities of redundant or irrelevant logs requiring lots of manual effort to glean useful
information
• Logs full of data designed to be read by machines, not humans, making it difficult to
interpret the data they contain
• Inability of most on-premises appliances to monitor cloud usage or activities by mobile users
Well Designed Intelligence Engines and Cloud Resources to the Rescue
The limitations of traditional systems for incident response can be solved by leveraging the cloud, applying
data science-driven intelligence gathering, designing algorithms that can interpret the data, and building a
system that presents that data in an intuitive, easy-to-interpret format. This is what Elastica Investigate,
the unsung hero of CloudSOC, delivers.
Logs, your foundation of knowledge discovery for incident response, can only include the activity data
that the original security system can read, so the quality of this data ultimately depends on the
intelligence of your firewall, proxy, IPS, CASB, or other system. If the underlying intelligence of a system
can only read coarse details in its traffic analysis, that’s all you’ll get from those logs. This is where the
power of StreamIQ and ContentIQ really shines.
StreamIQ picks up detailed activity data that other traffic analysis systems can’t identify. It then correlates
activity details with multiple related attributes for contextual analysis and translates the result from
machine-readable data into human language. The resulting logs are uniquely full of useful information and
easy to understand.
Elastica logs automatically consolidate multiple related, less important actions under the one action of
related contextual significance. For example, StreamIQ data in logs tells you which user was involved instead
of just presenting IP addresses, and it creates a single record that this user logged into a particular account
instead of creating multiple records separately tracking each step of the login process. ContentIQ gives you
names and attributes of files that were involved in a transaction instead of presenting unintelligible object
identifiers. In combination, you get logs that make it easy to track who was accessing what file in what app,
what the attributes of that file were, what changes the user made to that file, and what permission settings
were changed related to that file or account.
Sample Elastica log entries:
• Office 365: Bob Jones sent an email to Alice Smith with the subject “Billings” using Exchange on April 12, 2016, 11:32 AM
• Dropbox: ALERT [email protected] attempted to Share book.xlsx using Linux and Firefox v43 on April 12, 2016 11:34 AM
• Box: File “book.xlsx” has risk of PII and PCI violations from user [email protected]
• Google Drive: ALERT Bob Jones shared document “book.xlsx” on April 12, 2016 11:45 AM
• Office 365: Bob Jones user ThreatScore is now 97, changed for “Too many suspicious location changes” on April 12, 2016 11:59 AM
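The consolidation described above, one readable record per login rather than one record per login step, and user names in place of IP addresses, can be sketched as follows. This is an illustration under stated assumptions, not the StreamIQ implementation: the `RawEvent` shape, the step names, and the `IP_TO_USER` lookup table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RawEvent:
    timestamp: str  # ISO 8601 string, e.g. "2016-04-12T11:30:00"
    source_ip: str
    app: str
    step: str       # hypothetical step name, e.g. "login_page", "share"


# Hypothetical identity lookup; a real system would resolve users from
# directory or session data rather than a static table.
IP_TO_USER = {"10.0.0.7": "Bob Jones"}

# Hypothetical low-level steps that together make up one login.
LOGIN_STEPS = {"login_page", "credentials_posted", "session_created"}


def consolidate(events):
    """Collapse login steps into one line and name users instead of IPs."""
    lines = []
    open_logins = {}  # (ip, app) -> timestamp of the first login step
    for event in events:
        user = IP_TO_USER.get(event.source_ip, event.source_ip)
        key = (event.source_ip, event.app)
        if event.step in LOGIN_STEPS:
            open_logins.setdefault(key, event.timestamp)
            # Emit a single record only when the login actually completes.
            if event.step == "session_created":
                started = open_logins.pop(key)
                lines.append(f"{user} logged in to {event.app} at {started}")
        else:
            lines.append(
                f"{user} performed {event.step} in {event.app} at {event.timestamp}"
            )
    return lines


if __name__ == "__main__":
    raw = [
        RawEvent("2016-04-12T11:30:00", "10.0.0.7", "Box", "login_page"),
        RawEvent("2016-04-12T11:30:02", "10.0.0.7", "Box", "credentials_posted"),
        RawEvent("2016-04-12T11:30:03", "10.0.0.7", "Box", "session_created"),
        RawEvent("2016-04-12T11:34:00", "10.0.0.7", "Box", "share"),
    ]
    for line in consolidate(raw):
        print(line)
```

Four raw events become two readable lines here, which is the kind of reduction that keeps an investigator from wading through every intermediate step.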
Pulling it All Together
The best threat intelligence in the world is useless if it can’t find threats or interpret them in a timely manner.
The first thing you’ll notice when you get to the Investigate dashboard is a Query function. This is key,
because wading through lots of irrelevant logs to find the ones you need is a waste of time. Investigate has
a powerful but easy-to-use query function in which you can combine a wide range of intuitive query terms
with keywords to search by app, user, action, file, etc. Or you can skip the query and use the rich set of
data filtering options just beside the query feature.
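To make the idea of combining field terms with keywords concrete, here is a minimal sketch of such a search over structured log records. The `field:value` syntax, the `parse_query` and `search` functions, and the record fields are assumptions made for illustration; the paper does not document the actual Investigate query syntax.

```python
def parse_query(query: str) -> dict:
    """Split a query into 'field:value' filters and bare keywords."""
    filters, keywords = {}, []
    for term in query.split():
        field, sep, value = term.partition(":")
        if sep and value:
            filters[field.lower()] = value.lower()
        else:
            keywords.append(term.lower())
    return {"filters": filters, "keywords": keywords}


def search(logs, query: str):
    """Return records matching every field filter and every keyword."""
    parsed = parse_query(query)
    hits = []
    for record in logs:
        # Field filters must match the named field exactly (case-insensitive).
        fields_ok = all(
            str(record.get(field, "")).lower() == value
            for field, value in parsed["filters"].items()
        )
        # Keywords may appear anywhere in the record's values.
        text = " ".join(str(v) for v in record.values()).lower()
        keywords_ok = all(keyword in text for keyword in parsed["keywords"])
        if fields_ok and keywords_ok:
            hits.append(record)
    return hits


if __name__ == "__main__":
    logs = [
        {"app": "Dropbox", "user": "Bob Jones", "action": "share", "file": "book.xlsx"},
        {"app": "Box", "user": "Alice Smith", "action": "upload", "file": "notes.txt"},
    ]
    print(search(logs, "app:dropbox action:share book"))
```

Because the records are already structured by app, user, action, and file, the search itself stays simple; the hard work is done when the logs are produced.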
The Investigate interface pivots based on the data returned from your query or filter settings. It automatically
populates data visualizations and presents relevant logs full of drill-down details, thanks to the intelligence
work done by StreamIQ and ContentIQ.
Data Science-Based Policy Controls
Layers of data science-driven systems, from StreamIQ and ContentIQ, to User Behavior Analysis, to
ThreatScores, make it possible for CloudSOC to provide visibility and control over cloud apps with an
accuracy not possible with previous CASB technologies.
In Elastica CloudSOC, policies can be defined with a unique level of granularity due to the detailed
intelligence provided by StreamIQ and ContentIQ. In an optimal world, policy enforcement would be both
automated and nuanced. Elastica Protect enables you to use the dynamic user ThreatScore rating system
to trigger policy controls in a manner appropriate to varying levels of risk severity, from monitoring to
alerts to blocking specific traffic to full user quarantines.
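The escalation from monitoring to quarantine can be pictured as a mapping from score bands to actions. The sketch below is a minimal illustration and not how Elastica Protect is configured: the 0 to 100 scale, the threshold values, and the `action_for` function are assumptions.

```python
from bisect import bisect_right

# Hypothetical score bands on an assumed 0-100 scale:
#   0-29 monitor, 30-59 alert, 60-84 block, 85-100 quarantine.
THRESHOLDS = [30, 60, 85]
ACTIONS = ["monitor", "alert", "block", "quarantine"]


def action_for(threat_score: int) -> str:
    """Pick the policy action whose band contains the given score."""
    if not 0 <= threat_score <= 100:
        raise ValueError("threat_score must be between 0 and 100")
    # bisect_right counts how many thresholds the score has reached.
    return ACTIONS[bisect_right(THRESHOLDS, threat_score)]


if __name__ == "__main__":
    for score in (12, 45, 70, 97):
        print(score, "->", action_for(score))
```

Keeping the thresholds in a single sorted list makes it easy to tune the bands without touching the enforcement logic.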
Conclusion
Elastica Cloud Security is built on a foundation of data science and cloud resources to deliver fundamentally
better cloud security. This approach enables Symantec solutions to move beyond many well-known
limitations of traditional security systems. Layers of machine learning, computational analysis, and intelligent
algorithms go into building the highly accurate and adaptive ContentIQ, StreamIQ, and UBA engines at
the core of Elastica CloudSOC. ThreatScores are calculated based on these engines to facilitate practical
everyday security management, big data visualization, and automated controls.
About the Author
Deena Thomchick is Senior Director of Cloud at Blue Coat. She’s spent more than 25 years in technology
with a particular focus on security. Her background includes work on encryption, advanced threat protection,
network security, and endpoint security.
About Blue Coat & Elastica Cloud Security
Blue Coat, Inc. is a leading provider of advanced web security solutions for global enterprises and
governments, protecting 15,000 organizations including over 70 percent of the Fortune Global 500. Through the
Blue Coat Security Platform, Blue Coat unites network, security and cloud, protecting enterprises and their
users from cyber threats, whether they are on the network, on the web, in the cloud or mobile. Blue Coat
was acquired by Bain Capital in May 2015. On June 12, 2016, Symantec and Blue Coat, Inc. announced they
have entered into a definitive agreement under which Symantec will acquire Blue Coat for approximately
$4.65 billion in cash. The transaction has been approved by the Boards of Directors of both companies and
is expected to close in the third calendar quarter of 2016.
Elastica, acquired by Blue Coat in November 2015, is the leader in Data Science Powered™ Cloud Access
Security. Its CloudSOC™ platform empowers companies to confidently leverage cloud applications and
services while staying safe, secure and compliant. A range of Elastica Security Apps deployed on the
extensible CloudSOC™ platform deliver the full life cycle of cloud application security, including auditing
of shadow IT, real-time detection of intrusions and threats, protection against intrusions and compliance
violations, and investigation of historical account activity for post-incident analysis.
For additional information, please visit elastica.net.
Copyright © 2016 Symantec Corp. All rights reserved. Symantec, the Symantec Logo, the Checkmark Logo, Blue Coat, and the Blue Coat logo are trademarks or
registered trademarks of Symantec Corp. or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This document
is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied,
are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice.