Advanced Cyber-Threat Intelligence, Detection and Mitigation Platform for a Trusted Internet of Things
Mining for cyber-threat intelligence to improve cyber-security risk mitigation
Panel on Cyber-security Intelligence
2019 Community of Users Workshop
Nicholas Kolokotronis
Department of Informatics and Telecommunications
University of Peloponnese • [email protected]
Cyber-threat intelligence
▪ From unstructured (textual) high-volume data to
o Vulnerabilities/exploits
o Links to CVE/other VDB IDs
o Threat actors TTPs
o Specific products/platforms
o Popularity, price, …
o CVSS => measurable
▪ CTI needs to be compliant with legal requirements
Cyber-defense goals
▪ Accurate modelling of the attack strategies
▪ Determine the attackers’ capabilities
o constrained resources (budget, tools, etc.)
▪ The attackers’ goals vary depending on the target
o access level, degrade QoS, …
▪ Define the defender’s available actions
o possible counter-measures
o highlight parameters
▪ Cyber-defense needs to minimize the attack surface
Dynamic risk analysis
Security properties should be measurable
Dynamic risk analysis: attack models
Example: exploitation probability
▪ Needs to be measurable
o Estimated from CVSS metrics
o $P(e_i) = 2 \times AV \times AC \times Au$
▪ Likewise for an attack’s attempt probability
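The estimate above can be sketched in a few lines. AV, AC and Au are read here as the CVSS v2 Access Vector, Access Complexity and Authentication metrics, which is an assumption since the slide does not name the CVSS version. The numeric weights come from the CVSS v2 specification, where the exploitability subscore is 20 × AV × AC × Au; the factor 2 rescales that 0-10 score to roughly [0, 1]. The function and dictionary names are illustrative.

```python
# CVSS v2 base-metric weights (values from the CVSS v2 specification).
ACCESS_VECTOR = {"local": 0.395, "adjacent": 0.646, "network": 1.0}
ACCESS_COMPLEXITY = {"high": 0.35, "medium": 0.61, "low": 0.71}
AUTHENTICATION = {"multiple": 0.45, "single": 0.56, "none": 0.704}


def exploitation_probability(av: str, ac: str, au: str) -> float:
    """Return P(e_i) = 2 * AV * AC * Au, capped at 1.0."""
    p = 2.0 * ACCESS_VECTOR[av] * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au]
    return min(1.0, p)


if __name__ == "__main__":
    # Remotely exploitable, low complexity, no authentication required.
    print(exploitation_probability("network", "low", "none"))
    # Local access, high complexity, multiple authentications required.
    print(exploitation_probability("local", "high", "multiple"))
```

The easiest case (network, low, none) gives a value just below 1, so the cap is only a safeguard.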
ML – from CTI to structured TTPs
▪ Conversion of CTIs to a semi-structured format (JSON, XML)
▪ Filtering for specific information (TTPs, exploits) has the following benefits:
o More easily processed in an automated way
o Only condensed information will be available
o Reports will still be readable
▪ A known format for attack patterns is STIX v2.1
▪ The conversion of CTIs into actionable information can be achieved using ML techniques
Threat actions identification
CTI generation process
A classifier is needed with a number of features, such as:
▪ Word count (CTIs with elaborate TTPs tend to be longer)
▪ Security action word density (security-correlated verbs)
▪ Security target word density (security-correlated nouns)
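A minimal sketch of how these three features might be computed for one article. The two word lists are hypothetical placeholders, matching is exact (no stemming or lemmatization), and the function name is illustrative; the classifier that consumes the features is not shown.

```python
import re

# Hypothetical vocabularies, used only for illustration.
SECURITY_VERBS = {"exploit", "inject", "overwrite", "execute", "download", "encrypt"}
SECURITY_NOUNS = {"payload", "registry", "process", "file", "credential", "malware"}


def article_features(text: str) -> dict:
    """Compute word count and security verb/noun densities for one article."""
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words)
    if total == 0:
        return {"word_count": 0, "action_density": 0.0, "target_density": 0.0}
    actions = sum(1 for w in words if w in SECURITY_VERBS)
    targets = sum(1 for w in words if w in SECURITY_NOUNS)
    return {
        "word_count": total,
        "action_density": actions / total,  # share of security-correlated verbs
        "target_density": targets / total,  # share of security-correlated nouns
    }


if __name__ == "__main__":
    print(article_features("The malware will inject a payload and overwrite the file"))
```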
Data pre-processing
1. A crawler is needed that gathers all pages from the web
o CTI vendors (e.g. Symantec)
o Forums, blogs, etc.
2. Sanitize content and keep all textual information as articles
o Remove HTML tags, images, etc.
3. Automated decision on the CTI value of each article
o articles without CTI value are dropped
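The sanitization step (step 2) could look roughly like the sketch below, which uses only Python's standard `html.parser` module. The class and function names are illustrative, and a production pipeline would likely need extra handling for navigation menus, comments and malformed markup.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the content of script and style tags."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Tags such as <img> carry no data, so they vanish automatically.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def sanitize(html: str) -> str:
    """Return the textual content of an HTML page as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```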
[CT] CTI crawling and classification
▪ Crawling components used in Cyber-Trust
[CT] CTI crawling and classification
▪ Clear/Deep/Forum web crawling in Cyber-Trust
o Implement topic-specific crawling on publicly available web sites
▶︎ focus on Deep/Dark web sites that don’t require authentication
o Model Builder is responsible for creating the classification model; needs a set of positive and negative URLs.
o Seed Finder identifies the initial seed of URLs to crawl based on a user-defined query, e.g. on “IoT vulnerabilities”
o The crawled websites go through the Article/Forum Parser, which extracts the useful text part of each one
▶︎ internally forums are structured in a different way compared to websites
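To illustrate the idea of topic-specific crawling, here is a toy best-first crawler over a hypothetical in-memory link graph. It is not the Cyber-Trust implementation: a simple keyword-overlap score stands in for the trained classification model, and `PAGES`, `relevance` and `focused_crawl` are made-up names. No network access is involved.

```python
import heapq

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "seed": ("iot vulnerabilities overview", ["a", "b"]),
    "a": ("cooking recipes", ["c"]),
    "b": ("new iot botnet exploits router vulnerabilities", ["d"]),
    "c": ("holiday photos", []),
    "d": ("iot firmware vulnerabilities disclosed", []),
}


def relevance(text, topic_terms):
    """Fraction of topic terms that occur in the page text."""
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def focused_crawl(seeds, topic_terms, threshold=0.5, max_pages=10):
    """Best-first crawl that only follows links found on on-topic pages."""
    frontier = [(-1.0, url) for url in seeds]  # negated score gives a max-heap
    heapq.heapify(frontier)
    seen = set(seeds)
    kept = []
    fetched = 0
    while frontier and fetched < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = PAGES.get(url, ("", []))
        fetched += 1
        score = relevance(text, topic_terms)
        if score < threshold:
            continue  # off-topic page: drop it and do not follow its links
        kept.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits its parent's score as crawl priority.
                heapq.heappush(frontier, (-score, link))
    return kept
```

In this toy graph, page `c` is never fetched because its only parent `a` is judged off-topic, which is the saving that focused crawling aims for.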
Dynamic risk analysis (enhanced)
Security properties should be measurable
Data pre-processing
▪ Security-correlated verbs/nouns are extracted from the CVE, CAPEC, and CWE repositories using NLP techniques
o Used on each article to find all OVS (Object, Verb, Subject) triplets; these are candidate threat actions
▪ CTIs contain strings that an NLP parser may not understand, such as IoCs
o To remedy this, we temporarily substitute these using RegEx
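One way the temporary substitution could work is sketched below: each IoC is replaced by a placeholder token before parsing and restored afterwards. The patterns cover only a few IoC types (hashes, IPv4 addresses, CVE IDs) and are illustrative rather than exhaustive; the token format and function names are assumptions.

```python
import re

# Illustrative patterns only; real IoC grammars are broader than these.
IOC_PATTERNS = [
    ("HASH", re.compile(r"\b[a-fA-F0-9]{64}\b|\b[a-fA-F0-9]{40}\b|\b[a-fA-F0-9]{32}\b")),
    ("IPV4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("CVE", re.compile(r"\bCVE-\d{4}-\d{4,}\b")),
]
TOKEN_RE = re.compile(r"\b(?:HASH|IPV4|CVE)_\d+\b")


def mask_iocs(text):
    """Replace each IoC with a placeholder token and remember the original."""
    mapping = {}
    for label, pattern in IOC_PATTERNS:
        def repl(match, label=label):
            token = f"{label}_{len(mapping)}"
            mapping[token] = match.group(0)
            return token
        text = pattern.sub(repl, text)
    return text, mapping


def unmask_iocs(text, mapping):
    """Restore the original IoC strings after NLP processing."""
    return TOKEN_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


if __name__ == "__main__":
    masked, mapping = mask_iocs("The dropper contacts 192.168.1.10 and exploits CVE-2017-0144.")
    print(masked)
    print(unmask_iocs(masked, mapping))
```

The placeholder tokens are plain words from the parser's point of view, so sentence structure is preserved while the IoC values stay recoverable.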
TTP specific ontology
▪ An ontology created from the TTPs provided by the ATT&CK and CAPEC repositories (MITRE)
Class name       | Class description                      | Example
Kill chain phase | Phase information, e.g. name or order  | Control or 5
Tactic           | Description of how to achieve a phase  | Privilege escalation
Technique        | Description of how to achieve a tactic | DLL injection
Threat action    | Verb associated with malicious action  | Overwrite, Terminate
Object           | The action’s target                    | File, Process
Pre-condition    | Action prerequisites that have to hold | User access
Intent           | Goal/subgoal of an action              | Run malicious code
Towards threat actions
▪ Find similarity of candidate actions with all records in ontology
▪ Information Retrieval (IR) scoring vs. threshold
▪ Vocabulary based on synonyms (e.g. by WordNet) or custom
▪ Best scoring class is assigned to the threat action
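The matching procedure above might be sketched as follows. Cosine similarity over bag-of-words vectors is used here as a simple stand-in for whichever IR scoring function is actually applied, and the ontology records, synonym map and threshold are hypothetical.

```python
import math
from collections import Counter

# Hypothetical ontology records: class name -> descriptive text.
ONTOLOGY = {
    "DLL injection": "inject dll into process",
    "File deletion": "delete file from disk",
    "Input capture": "log keystrokes to obtain credentials",
}

# Hypothetical synonym map standing in for WordNet or a custom vocabulary.
SYNONYMS = {"remove": "delete", "erase": "delete", "record": "log"}


def vectorize(text):
    """Bag-of-words vector after mapping synonyms to a canonical word."""
    return Counter(SYNONYMS.get(w, w) for w in text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify_action(candidate, threshold=0.5):
    """Return (best class, score), or (None, score) if below the threshold."""
    query = vectorize(candidate)
    best, best_score = None, 0.0
    for name, description in ONTOLOGY.items():
        score = cosine(query, vectorize(description))
        if score > best_score:
            best, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best, best_score
```

A candidate such as "erase file from disk" maps onto the "File deletion" record through the synonym map, while text with no overlap falls below the threshold and is left unassigned.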
[CT] CTI classification
▪ Topic vocabulary in Cyber-Trust
o XML docs converted into text via XML Data Retriever
o Normalizer drops symbols, converts to lowercase, etc.
o Collected tags that are multi-word terms are given to the Multi-Word Expression Tokenizer
▶︎ “exploit kits” => “exploit-kits”
o Word2Vec finds the similarity
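The normalization and multi-word tokenization steps can be approximated with the standard library alone, as in this sketch. The tag list and function names are assumptions; the resulting token lists are the kind of input a Word2Vec model would then be trained on, which is not shown here.

```python
import re

# Hypothetical tag list; multi-word tags become single hyphenated tokens.
TAGS = ["exploit kits", "denial of service", "ddos"]


def normalize(text):
    """Lowercase the text and drop everything except letters, digits, spaces."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def merge_multiword(text, tags=TAGS):
    """Join each multi-word tag into one token, longest tags first."""
    for tag in sorted(tags, key=len, reverse=True):
        if " " in tag:
            pattern = r"\b" + re.escape(tag) + r"\b"
            text = re.sub(pattern, tag.replace(" ", "-"), text)
    return text


def tokenize(text):
    return merge_multiword(normalize(text)).split()


if __name__ == "__main__":
    print(tokenize("New Exploit Kits used in DDoS attacks!"))
```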
[CT] CTI classification
▪ Example top terms in the Cyber-Trust collection for the tag ddos
CTI sharing: using STIX
▪ Structured language for any CTI
o supports a wide range of use cases
o can focus on relevant aspects
▪ High level of recognition by CSIRTs and LEAs
▪ Combined with TAXII 2.0
o OSS implementations
▪ Supported by MISP
Attack pattern SDO
{
  "type" : "attack-pattern",
  "id" : "attack-pattern-xyz…",
  "created" : "2017-06-08T08:17:27.000Z",
  "modified" : "2017-06-08T08:17:27.000Z",
  "name" : "Input Capture",
  "description" : "Adversary logs keystrokes to obtain credentials",
  "kill_chain_phases" : [ { "phase_name" : "Maintain" } ],
  "external_references" :
  [ {
    "source_name" : "ATT&CK",
    "external_id" : "T1056"
  } ]
}
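For comparison, the following sketch builds a dictionary shaped like a STIX 2.1 attack-pattern SDO using only the standard library. In STIX 2.1 the identifier has the form `attack-pattern--<UUID>`, and each kill chain phase is an object with `kill_chain_name` and `phase_name`. The default `kill_chain_name` value and the function name are placeholders, and no schema validation is performed.

```python
import json
import uuid
from datetime import datetime, timezone


def make_attack_pattern(name, description, phase_name, external_id,
                        kill_chain_name="example-kill-chain",  # placeholder
                        source_name="ATT&CK"):
    """Build a dict shaped like a STIX 2.1 attack-pattern SDO."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "attack-pattern",
        "spec_version": "2.1",
        "id": f"attack-pattern--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "name": name,
        "description": description,
        "kill_chain_phases": [
            {"kill_chain_name": kill_chain_name, "phase_name": phase_name}
        ],
        "external_references": [
            {"source_name": source_name, "external_id": external_id}
        ],
    }


if __name__ == "__main__":
    sdo = make_attack_pattern(
        "Input Capture",
        "Adversary logs keystrokes to obtain credentials",
        "maintain",
        "T1056",
    )
    print(json.dumps(sdo, indent=2))
```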
CTI sources’ quality aspects
▪ Existence of conflicting data among sources
▪ Techniques can be used to assess the credibility of a source
o Using special-purpose ranking engines (e.g. SimilarWeb)
▶︎ A combination of metrics (page views, unique site users, web traffic, etc.)
▶︎ Include some Dark Web sites
o Number of users (useful for Dark Web sites)
o Number of posts per day
o Number of CVEs per day
▶︎ More than 3/4 of vulnerabilities are publicly reported online about 7 days before they appear in the NVD
▶︎ Mainly concerns Dark Web, paste sites, and cyber-criminal forums
Use of CTI in Cyber-Trust
[Figure labels: CTI sharing; dark web; deep web; clear web]
Conclusions - challenges
▪ ML can be used to extract CTI into structured and actionable formats
▪ Technical challenges in coping with the heterogeneity and volume of cyber-threat data
o Need for (semi-)automated means of processing
o Focused and topic-based crawling can improve performance
o Deep/dark web exploration presents additional challenges
o Big data management and NoSQL stores for efficiency
▪ Legal compliance and privacy-preserving data mining?