OPEN-CSAM INFORMATION AGGREGATOR AND REPORTING … · Latent Dirichlet Allocation (LDA)...

Georgios Chatzichristos Operational Security Unit - ENISA 6 11 2018 OPEN-CSAM INFORMATION AGGREGATOR AND REPORTING TOOL USING AI AND NATURAL LANGUAGE PROCESSING

Page 1:

Georgios Chatzichristos

Operational Security Unit - ENISA

6 11 2018

OPEN-CSAM

INFORMATION AGGREGATOR AND REPORTING TOOL USING AI AND NATURAL LANGUAGE PROCESSING

Page 2:

THE GOAL

Help decision makers take better decisions!

Page 3:

THE TRIGGER

Open Cyber Security Awareness Machine

[Diagram labels: Technical, Operational]

Page 4:

Overview

Develop a tool based on the latest technologies that will enhance situational awareness and help threat analysts advise decision makers.

Page 5:

The process

• Monitor (machine)

• Search (analyst)

• Report (machine + analyst)

Page 6:

NLP

What is Natural Language Processing?

A field of study focused on making sense of language using statistics and computers.

Basic tasks of NLP:

• Topic identification

• Text classification

NLP applications include:

• Chatbots

• Translation, fake news detection, text summarization

• Sentiment analysis -> social media, customer reviews, etc.

• Spam detection

Page 7:

Information aggregation

• News aggregator that monitors a set of news sources and tweets 24/7

• Uses NLP to isolate trending terms

• Creates clusters of relevant terms using AI

• Searches ENISA’s own publications

• Searches ENISA’s own recommendations
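
The idea of isolating trending terms can be illustrated with a minimal sketch. It is not taken from the tool itself: the stop-word list, the function names, and the scoring rule (recent document frequency divided by a smoothed baseline frequency) are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real deployment would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "on", "for", "is", "by"}


def tokenize(text):
    """Lowercase the text and keep word-like tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z][a-z0-9-]+", text.lower())
            if t not in STOP_WORDS]


def term_counts(documents):
    """Count in how many documents each term appears (document frequency)."""
    counts = Counter()
    for doc in documents:
        counts.update(set(tokenize(doc)))
    return counts


def trending_terms(recent_docs, baseline_docs, top_n=5):
    """Rank terms by how much more often they appear recently than before."""
    recent = term_counts(recent_docs)
    baseline = term_counts(baseline_docs)
    n_recent = max(len(recent_docs), 1)
    n_base = max(len(baseline_docs), 1)
    scores = {}
    for term, count in recent.items():
        recent_rate = count / n_recent
        # Add-one smoothing so terms unseen in the baseline do not divide by zero.
        baseline_rate = (baseline[term] + 1) / (n_base + 1)
        scores[term] = recent_rate / baseline_rate
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Made-up headlines, for illustration only.
baseline = [
    "Phishing campaign targets bank customers",
    "New phishing kit sold online",
    "Patch released for browser vulnerability",
]
recent = [
    "Ransomware outbreak hits hospital network",
    "Hospital pays ransomware demand",
    "Phishing campaign continues",
]
print(trending_terms(recent, baseline, top_n=2))
```

Comparing against a baseline rather than counting raw frequency keeps terms that are always common, such as "phishing" in this made-up data, from dominating the ranking.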

Page 8:

NLP

Continuous monitoring

Daily/Weekly/Monthly/Yearly Stats

Trending terms in Tweets

Trending terms in News

ENISA’s terms

ENISA’s topics
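
As a rough illustration of daily, weekly, monthly, and yearly statistics, the sketch below buckets term mentions by period. The data layout (a list of date and term pairs) and the function names are assumptions, not details from the slides.

```python
from collections import Counter
from datetime import date


def bucket_key(day, period):
    """Map a date to a label for the requested period."""
    if period == "daily":
        return day.isoformat()
    if period == "weekly":
        iso_year, iso_week, _ = day.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if period == "monthly":
        return f"{day.year}-{day.month:02d}"
    if period == "yearly":
        return str(day.year)
    raise ValueError(f"unknown period: {period}")


def term_stats(mentions, term, period):
    """Count mentions of one term per period; mentions holds (date, term) pairs."""
    return Counter(bucket_key(day, period) for day, t in mentions if t == term)


# Made-up mentions, for illustration only.
mentions = [
    (date(2019, 1, 7), "ransomware"),
    (date(2019, 1, 8), "ransomware"),
    (date(2019, 1, 21), "ransomware"),
    (date(2019, 2, 4), "ransomware"),
    (date(2019, 2, 4), "phishing"),
]
print(term_stats(mentions, "ransomware", "monthly"))
```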

Page 9:

AI

Continuous monitoring

Daily/Weekly/Monthly/Yearly Stats

Page 10:

Knowledge Graph

• Hardcoded

• Used to drive AI
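
The slides do not say how the hardcoded knowledge graph drives the AI. One plausible use, sketched here purely as an assumption, is expanding a term into related terms before searching or clustering. The graph contents and the `expand_term` helper are hypothetical.

```python
from collections import deque

# Hypothetical hardcoded graph: each term maps to directly related terms.
KNOWLEDGE_GRAPH = {
    "malware": ["ransomware", "trojan"],
    "ransomware": ["cryptolocker"],
    "trojan": [],
    "cryptolocker": [],
    "phishing": ["spear phishing"],
    "spear phishing": [],
}


def expand_term(graph, term, max_depth=1):
    """Return the term plus related terms within max_depth hops (breadth-first)."""
    seen = {term}
    order = [term]
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow edges beyond the depth limit
        for neighbour in graph.get(current, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append((neighbour, depth + 1))
    return order


print(expand_term(KNOWLEDGE_GRAPH, "malware", max_depth=2))
# prints ['malware', 'ransomware', 'trojan', 'cryptolocker']
```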

Page 11:

Searching

Page 12:

Searching

Page 13:

Reporting

Page 14:

Latent Dirichlet Allocation (LDA)

Non-negative Matrix Factorization (NMF)

Training data / features

User inputs

Spiders

Scrapers

Elasticsearch

Kibana

Jenkins

Knowledge Graph

Sources

[Diagram status labels: "Done" appears seven times]

Page 15:

Latent Dirichlet Allocation (LDA)

Non-negative Matrix Factorization (NMF)

Training data / features

Users

Spiders

Scrapers

Update of Knowledge Graph and sources

Elasticsearch

Kibana

Jenkins

Knowledge Graph

Sources
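
The architecture names NMF as one of its topic-modelling components. As a hedged illustration of the technique only (not the tool's implementation), the sketch below factorizes a tiny made-up document-term matrix with the standard multiplicative update rules, in pure Python. The matrix, the term list, and all function names are assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_topics, n_iter=500, seed=0, eps=1e-9):
    """Approximate V (docs x terms) as W (docs x topics) times H (topics x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    # Strictly positive random start, so multiplicative updates can move every entry.
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h


TERMS = ["ransomware", "hospital", "phishing", "bank"]
# Rows are documents, columns are term counts in the order of TERMS (made-up data).
V = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
]
W, H = nmf(V, n_topics=2)
for k, row in enumerate(H):
    top = sorted(range(len(TERMS)), key=lambda j: -row[j])[:2]
    print(f"topic {k}:", [TERMS[j] for j in top])
```

Each row of `H` can be read as a topic (a weighting over terms) and each row of `W` as a document's mix of topics. In practice a library implementation such as scikit-learn's would normally be preferred over hand-written updates.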

Page 16:

WAY FORWARD

Page 17:

Page 18:

The Vision

Develop a dynamic knowledge graph, fed by threat analysts and AI, that will keep itself up to date by adding new terms and deleting obsolete ones.

[Figure: knowledge graph; node labels include "Hacktivism", with numeric values ranging from 3,4 to 9,8]
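
A self-updating term set could work along the following lines. This is a sketch under stated assumptions: the slides give no update rule, so the admission threshold, the retention window, the score scale, and the `update_terms` helper are all hypothetical.

```python
from datetime import date, timedelta


def update_terms(known_terms, observations, today, add_threshold=8.0, max_age_days=90):
    """Return an updated {term: date last seen} mapping.

    known_terms:  {term: date last seen}
    observations: {term: relevance score} for terms seen today (hypothetical scores)
    """
    updated = dict(known_terms)
    for term, score in observations.items():
        if term in updated or score >= add_threshold:
            # Refresh existing terms; admit new ones only above the threshold.
            updated[term] = today
    cutoff = today - timedelta(days=max_age_days)
    # Drop terms that have not been observed since the cutoff.
    return {t: seen for t, seen in updated.items() if seen >= cutoff}


# Made-up data, for illustration only.
today = date(2019, 1, 1)
known = {
    "hacktivism": date(2018, 12, 20),
    "conficker": date(2018, 1, 15),
}
seen_today = {"cryptojacking": 9.1, "hacktivism": 3.4, "weather": 2.0}
print(sorted(update_terms(known, seen_today, today)))
# prints ['cryptojacking', 'hacktivism']
```

Here "cryptojacking" is admitted because its score clears the threshold, "weather" is ignored, and "conficker" is pruned because it has not been seen within the retention window.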

Page 19:

The Vision

Develop a dynamic pool of sources fed by threat analysts and AI.

[Figure: sources annotated with numeric values ranging from 3,4 to 9,8]

• Originality

• Authenticity

• Popularity

• Quality

Also: new types of sources such as the dark web and Pastebin, plus sentiment analysis!
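
The four criteria (originality, authenticity, popularity, quality) suggest rating each source and ranking the pool by a combined score. The sketch below assumes a 0 to 10 scale and an equally weighted average; neither is stated in the slides, and the source names and ratings are invented.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    originality: float
    authenticity: float
    popularity: float
    quality: float

    def score(self, weights=(0.25, 0.25, 0.25, 0.25)):
        """Weighted average of the four criteria (equal weights by default)."""
        values = (self.originality, self.authenticity, self.popularity, self.quality)
        return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def rank_sources(sources, min_score=0.0):
    """Return the sources scoring at least min_score, best first."""
    kept = [s for s in sources if s.score() >= min_score]
    return sorted(kept, key=lambda s: s.score(), reverse=True)


# Hypothetical sources and ratings on a 0-10 scale.
pool = [
    Source("vendor-blog", 9.0, 9.0, 7.0, 9.0),
    Source("news-site", 6.0, 8.0, 9.0, 7.0),
    Source("anonymous-paste", 4.0, 2.0, 3.0, 3.0),
]
for s in rank_sources(pool, min_score=5.0):
    print(f"{s.name}: {s.score():.2f}")
```

A minimum score gives a simple way to drop weak sources from the pool, while the weights leave room for analysts to emphasize one criterion over another.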

Page 20:

The Vision

Make ENISA an open source information hub, with good training data for AI, available to all:

• Threat analysts

• Academia

• Essential services providers

• Researchers

• Cyber security professionals

• CSIRTs

[Diagram labels: Training data for AI, Use services, Contribute to QoS]

Page 21:

EPILOGUE

Beta testers welcome. Let us know if you are interested!

[email protected]

https://github.com/enisaeu/OpenCSAM

Page 22:

THANK YOU FOR YOUR ATTENTION

Vasilissis Sofias Str 1, Maroussi 151 24, Attiki, Greece

+30 28 14 40 9711

[email protected]

www.enisa.europa.eu