Testing Heuristic Detections
Uploaded by frisksoftware
What do you need?
• The appropriateness of the methodology (or its correct application)
• Repeatability
• Independently verifiable
• Validated sample sets
• Adherence to safe and ethical practices in handling and testing samples
• Understanding of what heuristic detection is (and what it’s not)
A quick word on FP testing
• No ‘tricks’!
– Appropriate “ItW” false positive set
– Evaluation of FPs
– ‘Grey’/unusual or very strange, unlikely files will tend to penalize heuristic-based products
• Defaults
• Best settings
Junk / Corrupt files
• Poor sample sets simply reinforce the cycle - the more junk added, the more detected
• Using AV products to determine maliciousness is silly; it simply reinforces the cycle (Kaminski - Eicar 2006?)
“Time to Update”
Product   Actual time to update / % missed (20 samples)   Average TtU
X1        1 hour at 100% (20 upd)                          1 hour
X2        4 hours at 5% (1 upd)                            4 hours
X3        8 hours at 50% (10 upd)                          4 hours
X4        30 hours at 20% (5 upd)                          6 hours
Actual TtU
Product   Actual time to update / % missed   Average TtU (zero removed)
X1        1 hour at 100%                     1 hour
X2        4 hours at 5%                      4 hours
X3        8 hours at 50%                     8 hours
X4        30 hours at 20%                    30 hours
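The difference between the two averages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mean_ttu` and its parameters are hypothetical, a proactively detected sample is assumed to count as a time to update of zero hours, and every missed sample is assumed to take the same number of hours. The inputs are the X1 and X3 rows above.

```python
def mean_ttu(hours, missed, total, include_zeros=True):
    """Average time to update (hypothetical helper, not from the slides).

    hours  -- hours taken to ship an update for each missed sample
    missed -- number of samples the product missed (needed an update)
    total  -- number of samples in the test set
    """
    times = [hours] * missed
    if include_zeros:
        # Assumption: a sample detected proactively counts as zero hours.
        times += [0] * (total - missed)
    if not times:
        return None  # nothing missed and zeros removed: no mean exists
    return sum(times) / len(times)


# X3: 8 hours for each of the 10 missed samples out of 20
x3_with_zeros = mean_ttu(8, 10, 20)                          # 4.0
x3_zeros_removed = mean_ttu(8, 10, 20, include_zeros=False)  # 8.0

# X1: 1 hour for each of the 20 missed samples out of 20
x1_with_zeros = mean_ttu(1, 20, 20)                          # 1.0
x1_zeros_removed = mean_ttu(1, 20, 20, include_zeros=False)  # 1.0
```

X1 is unaffected by removing the zeros because it detected nothing proactively, while the average for X3 doubles.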
Mean time
Statistical problems in mean comparison
[Figure: scatter plot in which each dot represents a different product. Axis labels: "No. of proactive detections in week" and "No. of updates in week"; tick ranges 0 to 3000 and 0% to 70%.]
Lies, Damned Lies and Statistics
• Statistical integrity is compromised: the means of more successful products are (necessarily) calculated over fewer samples. This is not good for comparisons.
• Concentrating on speed of update surely sends the wrong message to consumers, giving them the false impression that buying a product that releases a lot of updates very quickly is going to protect them better.
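The sample-size problem can be illustrated with a short Python sketch. The data below are invented for illustration only, and `summarize` is a hypothetical helper: it reports how many missed samples a mean rests on and, where possible, the standard error of that mean.

```python
import math
import statistics


def summarize(update_hours):
    """Return (n, mean, standard error) for a product's update times.

    update_hours holds one entry per missed sample (hypothetical data).
    """
    n = len(update_hours)
    mean = statistics.mean(update_hours)
    # The sample standard deviation needs at least two observations,
    # so a product that missed one sample has no error estimate at all.
    sem = statistics.stdev(update_hours) / math.sqrt(n) if n > 1 else None
    return n, mean, sem


# Invented figures: hours to update for each sample the product missed.
weak_product = [1, 2, 1, 3, 2, 1, 2, 3, 1, 2]  # missed 10 samples
strong_product = [4]                            # missed only 1 sample

weak_summary = summarize(weak_product)
strong_summary = summarize(strong_product)
```

Here the product that missed only one sample gets a "mean" built from a single observation, with no way to estimate its uncertainty, yet it would be ranked against a mean built from ten.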
Retrospective (Frozen Update)
• Selection of time period
– 6 months?
– 3 months?
– 1 day?
– 1 hour?
• Verification (is it possible to do real time?)
Frozen Update Pt II
• What samples are important?
• Is this a recursive process?
– Single snapshot is not necessarily the most useful information
– Performance over time
– Sound statistical model
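A retrospective test of this kind might be scored roughly as follows. This is a minimal sketch, not a description of any tester's actual procedure: the function `retrospective_rate`, the sample records, and the dates are all hypothetical, and "first seen" dates are assumed to be known and trustworthy.

```python
from datetime import date


def retrospective_rate(samples, freeze_date, detected_names):
    """Fraction of samples first seen after the freeze that the frozen
    product detects; None if no sample is newer than the freeze."""
    new = [s for s in samples if s["first_seen"] > freeze_date]
    if not new:
        return None
    hits = sum(1 for s in new if s["name"] in detected_names)
    return hits / len(new)


# Invented sample set for illustration.
samples = [
    {"name": "a", "first_seen": date(2007, 1, 10)},
    {"name": "b", "first_seen": date(2007, 2, 5)},
    {"name": "c", "first_seen": date(2007, 2, 20)},
    {"name": "d", "first_seen": date(2007, 3, 1)},
]
detected_by_frozen_product = {"a", "b", "d"}

# Product frozen on 1 February: only b, c and d count as "new".
rate = retrospective_rate(samples, date(2007, 2, 1), detected_by_frozen_product)
```

Repeating the calculation for several freeze dates gives the performance-over-time view rather than a single snapshot.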
To quote Dr Alan Solomon:
• 1. If something is superb at detecting viruses, it's no use if it gives a lot of false alarms.
• 2. Anything that relies on the user to make a correct decision, on matters that he is not likely to be able to decide about, is useless.
• 3. You can receive something that is *exactly* what the salesman promised to deliver, and it's nevertheless useless.
Shameless plug
AVIEN Guide to Managing Malware in the Enterprise
http://www.smallblue-greenworld.co.uk/pages/avienguide.html