Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log:...

75
Spell: Streaming Parsing of System Event Logs Min Du, Feifei Li School of Computing, University of Utah

Transcript of Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log:...

Page 1: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Spell: Streaming Parsing of System Event Logs

Min Du, Feifei Li

School of Computing,

University of Utah

Page 2: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs2

Page 3: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs3

System Event Log

Page 4: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs4

System Event Log

Exists practically on

every computer system!

Page 5: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs5

System Event Log

Exists practically on

every computer system!

Page 6: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs6

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

Page 7: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs7

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

L O G A N A L Y S I S

Page 8: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs8

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

Message count vector:

Xu’SOSP09, Lou’ATC10, Lin’ICSE16, etc.

L O G A N A L Y S I S

Page 9: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs9

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

Message count vector:

Xu’SOSP09, Lou’ATC10, Lin’ICSE16, etc.

Build workflow model:

Lou’KDD10, Beschastnikh’ICSE14,

Yu’ASPLOS16, etc.

L O G A N A L Y S I S

Page 10: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs10

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

L O G P A R S I N G

Page 11: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs11

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

L O G P A R S I N G

Use source code as template to parse logs:

Xu’SOSP09

Page 12: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs12

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

L O G P A R S I N G

Use source code as template to parse logs:

Xu’SOSP09

Problem: What if we don’t have source code?

Page 13: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs13

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

L O G P A R S I N G

Use source code as template to parse logs:

Xu’SOSP09

Problem: What if we don’t have source code?

Directly parse from raw system logs:

Makanju’KDD09, Fu’ICDM09, Tang’ICDM10, Tang’CIKM11, etc.

Page 14: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Background

Spell: Streaming Parsing of System Event Logs14

System

Event

Log

Strucuted Data Anomaly

DetectionMessage/Event type

Log key

……

printf(“Started service

%s on port %d”, x, y);

L O G P A R S I N G

Use source code as template to parse logs:

Xu’SOSP09

Problem: What if we don’t have source code?

Directly parse from raw system logs:

Makanju’KDD09, Fu’ICDM09, Tang’ICDM10, Tang’CIKM11, etc.

Problem: Offline batched processing, some very slow.

Page 15: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Our approach

Spell: Streaming Parsing of System Event Logs15

Spell, a structured Streaming Parser for Event Logs using an

LCS (longest common subsequence) based approach.

Page 16: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Our approach

Spell: Streaming Parsing of System Event Logs16

Spell, a structured Streaming Parser for Event Logs using an

LCS (longest common subsequence) based approach.

Two log entries:

Temperature (41C) exceeds warning threshold

Temperature (42C, 43C) exceeds warning threshold

Example:

Page 17: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Our approach

Spell: Streaming Parsing of System Event Logs17

Spell, a structured Streaming Parser for Event Logs using an

LCS (longest common subsequence) based approach.

Two log entries:

Temperature (41C) exceeds warning threshold

Temperature (42C, 43C) exceeds warning threshold

LCS:

Temperature * exceeds warning threshold

Example:

Page 18: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Our approach

Spell: Streaming Parsing of System Event Logs18

Spell, a structured Streaming Parser for Event Logs using an

LCS (longest common subsequence) based approach.

Two log entries:

Temperature (41C) exceeds warning threshold

Temperature (42C, 43C) exceeds warning threshold

LCS:

Temperature * exceeds warning threshold

Naturally a message type!

printf(“Temperature %s exceeds warning threshold”)

Example:

Page 19: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Basic workflow

Spell: Streaming Parsing of System Event Logs19

Add new log entry into LCSMap in a streaming fashion, update existing message type if

length(LCS) > 0.5 * length(new log entry)

LCSMap

Page 20: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Basic workflow

Spell: Streaming Parsing of System Event Logs20

LCSMap

new log entry: Temperature (41C) exceeds warning threshold

Page 21: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Basic workflow

Spell: Streaming Parsing of System Event Logs21

LCSMap

LC

SO

bje

ct

LCSseq: Temperature (41C) exceeds warning threshold

lineIds: {0}

paramPos: {empty}

new log entry:

Page 22: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Basic workflow

Spell: Streaming Parsing of System Event Logs22

LCSMap

LC

SO

bje

ct

LCSseq: Temperature (41C) exceeds warning threshold

lineIds: {0}

paramPos: {empty}

new log entry: Temperature (43C) exceeds warning threshold

Page 23: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Basic workflow

Spell: Streaming Parsing of System Event Logs23

LCSMap

LC

SO

bje

ct

LCSseq: Temperature * exceeds warning threshold

lineIds: {0, 1}

paramPos: {1}

new log entry:

Page 24: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Basic workflow

Spell: Streaming Parsing of System Event Logs24

LCSMap

LC

SO

bje

ct

LCSseq: Temperature * exceeds warning threshold

lineIds: {0, 1}

paramPos: {1}

new log entry: Command has completed successfully

Page 25: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Basic workflow

Spell: Streaming Parsing of System Event Logs25

LCSMap

new log entry:

LC

SO

bje

ct

LCSseq: Temperature * exceeds warning threshold

lineIds: {0, 1}

paramPos: {1}

LC

SO

bje

ct

LCSseq: Command has completed successfully

lineIds: {2}

paramPos: {empty}

Page 26: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Basic workflow

Spell: Streaming Parsing of System Event Logs26

LCSMap

new log entry: ……

……

LC

SO

bje

ct

LCSseq: Temperature * exceeds warning threshold

lineIds: {0, 1}

paramPos: {1}

LC

SO

bje

ct

LCSseq: Command has completed successfully

lineIds: {2}

paramPos: {empty}

Page 27: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs27

To compute LCS of two log entries, each one has 𝑶(𝒏) length:

Page 28: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs28

To compute LCS of two log entries, each one has 𝑶(𝒏) length:

Naïve way: Dynamic Programing

Page 29: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs29

To compute LCS of two log entries, each one has 𝑶(𝒏) length:

Naïve way: Dynamic Programing

Time complexity:

To compare a log entry with an existing message type: 𝑂(𝑛2)To compare a new log entry with 𝑂(𝑚) existing message types: 𝑂(𝑚𝑛2)

Page 30: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs30

To compute LCS of two log entries, each one has 𝑶(𝒏) length:

Naïve way: Dynamic Programing

Time complexity:

To compare a log entry with an existing message type: 𝑂(𝑛2)To compare a new log entry with 𝑂(𝑚) existing message types: 𝑂(𝑚𝑛2)

Can we do better?

Page 31: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs31

Observation. For a complex system,

number of log entries: millions

number of message types: hundreds

Page 32: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs32

Observation. For a complex system,

number of log entries: millions

number of message types: hundreds

For example:Blue Gene/L log:

4,457,719 log entries, 394 message types

Hadoop log used in Xu’SOSP09:

11,197,705 log entries, only 29 message types

Page 33: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs33

Observation. For a complex system,

number of log entries: millions

number of message types: hundreds

For example:Blue Gene/L log:

4,457,719 log entries, 394 message types

Hadoop log used in Xu’SOSP09:

11,197,705 log entries, only 29 message types

For a majority of new log entries, their message types already exist in LCSMap!

Page 34: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs34

Improvement 1: Prefix Tree

Existing message types:

A B C

A C D

A D

E F

Page 35: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs35

Improvement 1: Prefix Tree

Existing message types:

A B C

A C D

A D

E F

ROOT

A E

FB C D

C D

Page 36: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs36

Improvement 1: Prefix TreeROOT

A E

FB C D

C D

New log entry: A B P C

Page 37: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs37

Improvement 1: Prefix TreeROOT

A E

FB C D

C D

New log entry: A B P C

Page 38: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs38

Improvement 1: Prefix TreeROOT

A E

FB C D

C D

New log entry: A B P C

Page 39: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs39

Improvement 1: Prefix TreeROOT

A E

FB C D

C D

New log entry: A B P C

Parameter:

Page 40: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs40

Improvement 1: Prefix TreeROOT

A E

FB C D

C D

New log entry: A B P C

Parameter:

Page 41: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs41

Improvement 1: Prefix TreeROOT

A E

FB C D

C D

Time Complexity:

𝑶(𝒏) for each log entry

Page 42: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs42

Improvement 1: Prefix TreeROOT

A D

AB

C

Problem:

New log entry: D A P B C

E

F

Page 43: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs43

Improvement 1: Prefix TreeROOT

D

A

Problem:

New log entry: D A P B C

Matches D A

A

B

C

E

F

Page 44: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs44

Improvement 1: Prefix Tree

Problem:

New log entry: D A P B C

Matches D A

Should be: A B C

ROOT

D

A

A

B

C

E

F

Page 45: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs45

Improvement 2: Inverted Index

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

Existing message types:

A B C

A E F

D A

Page 46: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs46

Improvement 2: Inverted Index

New log entry:D

A

B

P

C

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

Page 47: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs47

Improvement 2: Inverted Index

New log entry:D

A

B

P

C

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

1

Page 48: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs48

Improvement 2: Inverted Index

New log entry:D

A

B

P

C

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

1

2

Page 49: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs49

Improvement 2: Inverted Index

New log entry:D

A

B

P

C

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

1

2

3

Page 50: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs50

Improvement 2: Inverted Index

New log entry:D

A

B

P

C

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

1

2

3

No match,

parameter position

Page 51: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs51

Improvement 2: Inverted Index

New log entry:D

A

B

P

C

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

1

2

3

No match,

parameter position

4

Page 52: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs52

Improvement 2: Inverted Index

New log entry:D

A

B

P

C

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

1

2

3

No match,

parameter position

4

Page 53: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs53

Improvement 2: Inverted Index

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

Time complexity: 𝑂(𝑐𝑛)for each log entry, 𝑐 < 𝑚

Page 54: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs54

Improvement 2: Inverted Index

A (1, 1) (2, 1) (3, 2)

B (1, 2)

C (1, 3)

D (3, 1)

E (2, 2)

F (2, 3)

Time complexity: 𝑂(𝑐𝑛)for each log entry, 𝑐 < 𝑚

For remaining log entries, compare it with each message type

using simple DP.

Page 55: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on effectiveness

Spell: Streaming Parsing of System Event Logs55

We could be wrong.

More heuristics to adjust the result.

Example:boot (command 2359) Error: Console-Busy Port already in use

wait (command 3964) Error: Console-Busy Port already in use

Initial message type:* (command *) Error: Console-Busy Port already in use

Page 56: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on effectiveness

Spell: Streaming Parsing of System Event Logs56

We could be wrong.

More heuristics to adjust the result.

Example:boot (command 2359) Error: Console-Busy Port already in use

wait (command 3964) Error: Console-Busy Port already in use

Initial message type:* (command *) Error: Console-Busy Port already in use

Solution: Split heuristic

If a parameter position has very few unique tokens:boot (command *) Error: Console-Busy Port already in use

wait (command *) Error: Console-Busy Port already in use

Page 57: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on effectiveness

Spell: Streaming Parsing of System Event Logs57

We could be wrong.

More heuristics to adjust the result.

Example:Fan speeds ( 3552 3534 3375 4354 3515 3479 )

Fan speeds ( 3552 3534 3375 4299 3515 3479 )

Fan speeds ( 3552 3552 3391 4245 3515 3497 )

Fan speeds ( 3534 3534 3375 4245 3497 3479 )

Fan speeds ( 3534 3534 3375 4066 3497 3479 )

Initial message type:Fan speeds ( * 3552 * 3515 * )

Fan speeds ( 3534 3534 3375 * 3497 3479 )

Page 58: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on effectiveness

Spell: Streaming Parsing of System Event Logs58

We could be wrong.

More heuristics to adjust the result.

Example:Fan speeds ( 3552 3534 3375 4354 3515 3479 )

Fan speeds ( 3552 3534 3375 4299 3515 3479 )

Fan speeds ( 3552 3552 3391 4245 3515 3497 )

Fan speeds ( 3534 3534 3375 4245 3497 3479 )

Fan speeds ( 3534 3534 3375 4066 3497 3479 )

Initial message type:Fan speeds ( * 3552 * 3515 * )

Fan speeds ( 3534 3534 3375 * 3497 3479 )

Solution: Merge heuristic

Merge similar message types together:Fan speeds: ( * )

Page 59: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Evaluation

Spell: Streaming Parsing of System Event Logs59

IPLoM (Makanju’KDD09):

Partition log file using 3-step heuristics (log entry length, etc.)

CLP (Fu’ICDM09)

Cluster similar logs together based on weighted edit distance

Log dataset:

Log type Count Message type ground truth

Los Alamos HPC log 433,490 Available online

BlueGene/L log 4,747,963 Available online

Openstack Cloud log 87,519 Manually parsed from source

code

Methods to compare:

Page 60: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Evaluation - Efficiency

Spell: Streaming Parsing of System Event Logs60

Page 61: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Evaluation - Efficiency

Spell: Streaming Parsing of System Event Logs61

Openstack:

Page 62: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Evaluation - Effectiveness

Spell: Streaming Parsing of System Event Logs62

Page 63: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Evaluation - Effectiveness

Spell: Streaming Parsing of System Event Logs63

Page 64: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Thank you!

Page 65: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Evaluation - Effectiveness

Spell: Streaming Parsing of System Event Logs65

Page 66: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Evaluation

Spell: Streaming Parsing of System Event Logs66

IPLoM (Makanju’KDD09):

Partition log file using 3-step heuristics (log entry length, etc.)

CLP (Fu’ICDM09)

Cluster similar logs together based on weighted edit distance

Log dataset:

Log type Source Count Message type ground

truth

Los Alamos HPC log Available online 433,490 Available online

BlueGene/L log Available online 4,747,963 Available online

Openstack Cloud log Generated using

CloudLab

87,519 Manually parsed from

source code

Methods to compare:

Page 67: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Our approach

Spell: Streaming Parsing of System Event Logs67

Spell, a structured Streaming Parser for Event Logs using an

LCS (longest common subsequence) based approach.

The longest subsequence common to both sequences.

LCS of two sequences:

Page 68: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Our approach

Spell: Streaming Parsing of System Event Logs68

Spell, a structured Streaming Parser for Event Logs using an

LCS (longest common subsequence) based approach.

The longest subsequence common to both sequences.

E.g. LCS of:

1, 3, 5, 7, 9

1, 5, 7, 10

equals:

1, 5, 7

LCS of two sequences:

Page 69: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

Evaluation - Effectiveness

Spell: Streaming Parsing of System Event Logs69

Page 70: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs70

Improvement 2: Inverted Index

New log entry: A (1, 1) (2, 1) (3, 1)

B (1, 2)

C (1, 3) (2, 2)

D (2, 3) (3, 2)

E (4, 1)

F (4, 2)

A

B

P

C

1

Page 71: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs71

Improvement 2: Inverted Index

New log entry: A (1, 1) (2, 1) (3, 1)

B (1, 2)

C (1, 3) (2, 2)

D (2, 3) (3, 2)

E (4, 1)

F (4, 2)

A

B

P

C

1

2

Page 72: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs72

Improvement 2: Inverted Index

New log entry: A (1, 1) (2, 1) (3, 1)

B (1, 2)

C (1, 3) (2, 2)

D (2, 3) (3, 2)

E (4, 1)

F (4, 2)

A

B

P

C

1

2

No match,

parameter position

Page 73: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs73

Improvement 2: Inverted Index

New log entry: A (1, 1) (2, 1) (3, 1)

B (1, 2)

C (1, 3) (2, 2)

D (2, 3) (3, 2)

E (4, 1)

F (4, 2)

A

B

P

C

1

2

3No match,

parameter position

Page 74: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs74

Improvement 2: Inverted Index

New log entry: A (1, 1) (2, 1) (3, 1)

B (1, 2)

C (1, 3) (2, 2)

D (2, 3) (3, 2)

E (4, 1)

F (4, 2)

A

B

P

C

1

2

3No match,

parameter position

Page 75: Spell: Streaming Parsing of System Event Logslifeifei/papers/spell_icdm16_full.pdfBlue Gene/L log: 4,457,719 log entries, 394 message types Hadoop log used in Xu’SOSP09: 11,197,705

SPELL – Improvement on efficiency

Spell: Streaming Parsing of System Event Logs75

Improvement 2: Inverted Index

A (1, 1) (2, 1) (3, 1)

B (1, 2)

C (1, 3) (2, 2)

D (2, 3) (3, 2)

E (4, 1)

F (4, 2)

Time complexity: 𝑶(𝒄𝒏)for each log entry, 𝒄 < 𝒎

For remaining log entries, compare it with each message type

using simple DP.