Testing Heuristic Detections

Testing Heuristics

Andrew Lee CISSP

Chief Research Officer

ESET LLC

[email protected]

What do you need?

• The appropriateness of the methodology (or its correct application)

• Repeatability

• Independent verifiability

• Validated sample sets

• Adherence to safe and ethical practices in handling and testing samples

• Understanding of what heuristic detection is (and what it’s not)

A quick word on FP testing

• No ‘tricks’!
  – Appropriate “ItW” false positive set
  – Evaluation of FPs
  – ‘Grey’/unusual or very strange, unlikely files will tend to penalize heuristic-based products

• Defaults

• Best settings

Junk / Corrupt files

• Poor sample sets simply reinforce the cycle - the more junk added, the more detected

• Using AV products to determine maliciousness is silly; it simply reinforces the cycle (Kaminski - EICAR 2006?)

“Time to Update”

Product   Actual Time to Update / % missed (20 samples)   Average TtU
X1        1 hour at 100% (20 upd)                         1 hour
X2        4 hours at 5% (1 upd)                           4 hours
X3        8 hours at 50% (10 upd)                         4 hours
X4        30 hours at 20% (5 upd)                         6 hours

Actual TtU

Product   Actual Time to Update / % missed   Average TtU (zero removed)
X1        1 hour at 100%                     1 hour
X2        4 hours at 5%                      4 hours
X3        8 hours at 50%                     8 hours
X4        30 hours at 20%                    30 hours
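The gap between the two tables comes down to whether proactively detected samples count as zero-hour entries in the average. The following is a minimal sketch of that arithmetic, not the method used by any particular tester. The function name `ttu_averages` is hypothetical, every missed sample is assumed to take the same number of hours to be covered, and the X4 row assumes 4 missed samples (20% of 20).

```python
from statistics import mean


def ttu_averages(hours_per_update, missed, total=20):
    """Return (mean over all samples, mean over missed samples only).

    Samples detected proactively count as 0 hours in the first figure
    and are dropped entirely from the second ("zero removed").
    """
    times = [hours_per_update] * missed + [0] * (total - missed)
    with_zeros = mean(times)
    nonzero = [t for t in times if t > 0]
    zero_removed = mean(nonzero) if nonzero else 0
    return with_zeros, zero_removed


# (hours per update, number of samples that needed an update)
# X4 uses 4 missed samples: an assumption, taken as 20% of 20.
products = {"X1": (1, 20), "X3": (8, 10), "X4": (30, 4)}

for name, (hours, missed) in products.items():
    with_zeros, zero_removed = ttu_averages(hours, missed)
    print(f"{name}: average {with_zeros} h, zero removed {zero_removed} h")
```

Under these assumptions a product that misses half the samples and takes 8 hours per update reports a 4-hour average, while the zero-removed figure stays at 8 hours.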

Mean time

Statistical problems in mean comparison

[Figure: scatter plot of the number of updates in a week and the number of proactive detections in that week; axis ranges 0 to 3000 and 0% to 70%. Each dot represents a different product.]

Lies, Damned Lies and Statistics

• The statistics are biased: the means of the more successful products are (necessarily) calculated over fewer samples. This is not good for comparisons.

• Concentrating on speed of update surely sends the wrong message to consumers, giving them the false impression that buying a product that releases a lot of updates very quickly is going to protect them better.

Retrospective (Frozen Update)

• Selection of time period
  – 6 months?
  – 3 months?
  – 1 day?
  – 1 hour?

• Verification (is it possible to do real time?)

Frozen Update Pt II

• What samples are important?

• Is this a recursive process?
  – Single snapshot is not necessarily the most useful information
  – Performance over time
  – Sound statistical model
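How the choice of time period changes a retrospective result can be sketched as follows. This is an illustration under stated assumptions, not a description of any tester's procedure: the function `proactive_rate`, the sample names, the dates, and the detection set are all hypothetical. The product is assumed to be frozen at `freeze`, and only samples first seen after that date count toward proactive detection.

```python
from datetime import date, timedelta


def proactive_rate(samples, detected, freeze, window_days):
    """Share of samples first seen within window_days after the freeze
    date that the frozen product detects.

    Returns None when no sample falls inside the window.
    """
    end = freeze + timedelta(days=window_days)
    in_window = [
        name for name, first_seen in samples.items()
        if freeze < first_seen <= end
    ]
    if not in_window:
        return None
    hits = sum(1 for name in in_window if name in detected)
    return hits / len(in_window)


# Hypothetical data: sample name -> date first seen.
freeze = date(2007, 1, 1)
samples = {
    "a": date(2006, 12, 20),  # seen before the freeze, so excluded
    "b": date(2007, 1, 2),
    "c": date(2007, 1, 15),
    "d": date(2007, 3, 1),
    "e": date(2007, 5, 30),
}
detected = {"a", "b", "d"}  # samples the frozen product flags

for days in (1, 90, 180):
    print(days, "days:", proactive_rate(samples, detected, freeze, days))
```

The same frozen product scores differently for each window, which is one reason a single snapshot may say less than repeated measurements over time.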

To quote Dr Alan Solomon.

• 1. If something is superb at detecting viruses, it's no use if it gives a lot of false alarms.

• 2. Anything that relies on the user to make a correct decision, on matters that he is not likely to be able to decide about, is useless.

• 3. You can receive something that is *exactly* what the salesman promised to deliver, and it's nevertheless useless.

Shameless plug

AVIEN Guide to Managing Malware in the Enterprise

http://www.smallblue-greenworld.co.uk/pages/avienguide.html
