Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY...

24
Malware Characterization using Windows API Call Sequences Sanchit Gupta, Sarvjeet Kaur and Harshit Sharma SPACE - 2016 Scientific Analysis Group DRDO Metcalfe House, Delhi 110054

Transcript of Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY...

Page 1: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Malware Characterization using

Windows API Call Sequences

Sanchit Gupta, Sarvjeet Kaur and Harshit Sharma

SPACE - 2016

Scientific Analysis Group

DRDO

Metcalfe House, Delhi – 110054

Page 2: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

AIM

• Extraction of run-time behaviour of Malware by monitoring

system calls

• Categorize unknown Malware

SCOPE

Operating System : Windows OS

Category of Malware : Five Classes

SPACE-2016 : Malware Characterization using Windows API Call Sequences

1. Worm

2. Trojan-Downloader

3. Trojan-Spy

4. Trojan-Dropper

5. Backdoor

2 of 20

Page 3: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Windows-API

(Kernel32.dll, Advapi32.dll & other Windows DLL)

Kernel API

Native API (Ntdll.dll)

HARDWARE

Operating

System

USER APPLICATION

User Level

Kernel Level

SPACE-2016 : Malware Characterization using Windows API Call Sequences

Application Execution in Windows OS

3 of 20

Page 4: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Windows-API

(Kernel32.dll, Advapi32.dll & other Windows DLL)

Kernel API

Native API (Ntdll.dll)

HARDWARE

Operating

System

User Level

Kernel Level

SPACE-2016 : Malware Characterization using Windows API Call Sequences

Application Execution in Windows OS

MALWARE

4 of 20

Page 5: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

CreateFile(..) (Kernel32.dll)

NtCreateFile (SSDT)

NtCreateFile() (ntdll.dll)

DISK Write

Operating

System

User Level

Kernel Level

SPACE-2016 : Malware Characterization using Windows API Call Sequences

Application Execution in Windows OS

Create a File

IRP_MJ_Write Driver

5 of 20

Page 6: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

CreateFile(..) (Kernel32.dll)

NtCreateFile (SSDT)

NtCreateFile() (ntdll.dll)

HARD DISK

Operating

System

User Level

Kernel Level

SPACE-2016 : Malware Characterization using Windows API Call Sequences

Application Execution in Windows OS

Create a File

IRP_MJ_Write Driver

Well Documented

Version Independent Many Hooking Libraries

6 of 20

Page 7: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Some Malicious Win-API Patterns

Malicious Activity API Pattern

Key Logger (FindWindowA, ShowWindow, GetAsyncKeyState) (SetWindowsHookEx,

RegisterHotKey, GetMessage,UnhookWindowsHookEx)

Screen Capture (GetDC, GetWindowDC), CreateCompatibleDC, CreateCompatibleBitmap,

SelectObject, BitBlt, WriteFile

Antidebugging (IsDebuggerPresent, CheckRemoteDebuggerPresent, OutputDebugStringA,

OutputDebugStringW)

Downloader URLDownloadToFile, (WinExec,ShellExecute)

DLL Injection OpenProcess, VirtualAllocEx, WriteProcessMemory, CreateRemoteThread

Dropper FindResource, LoadResource, SizeOfResource

For each Malicious Activity:

1. Identification of such Win-API Call sequences

2. Extraction of same

7 of 20

Page 8: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Identification & Categorization of Win-APIs

SPACE-2016 : Malware Characterization using Windows API Call Sequences

1 2 3 4 ... 533 534

OpenFile WriteFile Send GetHostbyName … Connect GetSystemTime

534 Win –API Calls

A B C D E ... Y Z

I/O

Create

I/O

Open

I/O

Write

I/O

Find

I/O

Read

… Win-

Service

System

Info

26 Category

Win-APIs in ‘I/O create’ category is used to create

I/O objects like file, folder, Stdin & Stdout.

8 of 20

Page 9: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

CATEGORY SOME EXAMPLES Code No. of API

1 I/O Create CreatefileA, CreatePipe A 14

2 I/O Open OpenFile ,OpenFileMappingA B 10

3 I/O Write WriteFile, WriteConsoleW, WriteFileEx C 25

4 I/O Find FindFirstFileA, FindNextFileW D 13

5 I/O Read ReadFile, ReadFileEx, ReadConsoleA E 18

6 I/O Acces SetFileAttributesW, SetConsoleMode, F 19

7 Loading Library LoadLibraryExW ,FreeLibrary G 7

8 Registry Read RegOpenKeyExW, RegQueryValueA H 15

9 Registry Write RegSetValueA, RegSetValueW, I 13

….. ……….

22 Internet Open/ Read InternetOpenUrlA, InternetReadFile V 13

23 Internet Write InternetWriteFile, TransactNamedPipe W 2

24 Win-Service Create CreateServiceW, CreateServiceA X 2

25 Win-Service Other StartServiceW, ChangeServiceConfigA Y 11

26 System Information GetSystemDirectoryW, GetSystemTime Z 35

TOTAL APIs 26 534

Page 10: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Win-API Call Extraction of Malware

Host Machine: UBUNTU 14.01

APP-MON

Execute Malware

Win-API Call Sequence

Higher Level Category Sequence

SPACE-2016 : Malware Characterization using Windows API Call Sequences

520 for each Malware Class

Total Sample: 2,600

Time Taken: 40 days

Guest Machine : Win-XP SP2

10 of 20

Page 11: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Tool Used: AntConc

N-gram Analysis of final Call sequence

SPACE-2016 : Malware Characterization using Windows API Call Sequences

LIMITATION: Finds exactly same consecutive patterns

11 of 20

Page 12: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

ssdeep (Fuzzy Hash) Matches inputs that have homologies.

Properties:

• Non-Propagation

• Alignment Robustness and

• Signature Matching Criteria

Analysis of sequence: ssdeep Analysis

SPACE-2016 : Malware Characterization using Windows API Call Sequences

12 of 20

Page 13: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

SPACE-2016 : Malware Characterization using Windows API Call Sequences SPACE-2016 : Malware Characterization using Windows API Call Sequences

FILE LENGTH : 26 KB

ssdeep Hash: 768:9tshU99FMiEHvIbDtNKm2tWHl5DXhAfQPLJzOmu:9UY+iXnnKqhXEQPl3u

Analysis of sequence: ssdeep Analysis (2)

13 of 20

Page 14: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

CHANGE SSDEEP HASH MATCH

Removed ‘Annual’ from first

paragraph

768:8tshU99FMiEHvIbDtNKm2tWHl5DXhAfQPL

JzOmu:8UY+iXnnKqhXEQPl3u 99

Replace 2016 with 2017 (15

replacements)

768:XtshU9EFMiEHv1bDtNKm2tlfl5DXhAfQPL

JzObv:XUh+iinnK3hXEQPlCv 83

Replace cryptography with

cryptology (25 replacements)

768:TtshG99FMiEHvcbDtNUq2twHl5DXhA9QVL

JzOIu:TUu+ijnnUshXGQVlVu 79

Removes first 500 bytes 768:TtshU99FMiEHvIbDtNKm2tWHl5DXhAfQPL

JzOmu:TUY+iXnnKqhXEQPl3u 99

Removes first 1500 bytes and place

them in end

768:1tshU99FMiEHvIbDtNKm2tWHl5DXhAfQPL

JzOm3:1UY+iXnnKqhXEQPl33 96

Remove first 1500 bytes and place

them in middle

768:1tshU99FMiEHvUbDtNKm2tWHl5DXhAfQPL

JzOmu:1UY+i5nnKqhXEQPl3u 96

Remove 2000 bytes from middle 768:9tshU99FMA&&vIbDtNKm2tWHl5DXhAfQPL

JzOmu:9UY+/&nnKqhXEQPl3u 96

Replace 800 bytes from middle with

random string

768:9tshU99FM91HvIbDtNKm2tWHl5DXhAfQPL

JzOmu:9UY+6n&nKqhXEQPl3u 97

SPACE-2016 : Malware Characterization using Windows API Call Sequences SPACE-2016 : Malware Characterization using Windows API Call Sequences

Analysis of sequence: ssdeep Analysis (3)

ssdeep Hash: 768:9tshU99FMiEHvIbDtNKm2tWHl5DXhAfQPLJzOmu:9UY+iXnnKqhXEQPl3u

Page 15: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Analysis of sequence : ssdeep Analysis (2)

SPACE-2016 : Malware Characterization using Windows API Call Sequences

matching score as malware classification criteria.

15 of 20

Page 16: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Classification Results: Best Matching Class

SPACE-2016 : Malware Characterization using Windows API Call Sequences

MALWARE

CLASS Worm Backdoor Trojan -

Dropper

Trojan -

Downloader

Trojan -

Spy

Worm 109 1 6 3 1

Backdoor 1 98 6 5 10

Trojan -

Dropper 6 6 101 4 3

Trojan -

Downloader 3 5 4 108 0

Trojan -

Spy 1 10 3 0 106

Training Data : 2000 Samples

Testing Data : 120 samples per Malware Class

16 of 20

Page 17: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Classification Results Comparison

Existing (Malware vs Benign)

Our Model (One Malware Class vs All)

Page 18: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Proposed Malware Classification Framework

18 of 20 SPACE-2016 : Malware Characterization using Windows API Call Sequences

Page 19: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

FUTURE DIRECTION

• More number of samples

• More Malware Categories like rootkit, botnet etc.

• Other Operating systems like Linux, Android etc.

SPACE-2016 : Malware Characterization using Windows API Call Sequences

19 of 20

Page 20: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

References

SPACE-2016 : Malware Characterization using Windows API Call Sequences

1. Shafiq, M.Z., Tabish, S.M., ,Mirza, F., Farroq, M.: Pe-Miner: Mining structural information

to detect malicious executable in real time. In: 12th international symposium on recent

advances in intrusion detection (2009)

2. Moskovitch, R. et al: Unknown Malcode Detection Using OPCODE Representation.

Intelligence and Security Informatics, Volume LNCS 5376, 204-215 (2008)

3. Moskovitch, R. et al: Unknown Malcode Detection via text categorization and the imbalance

problem.In:IEEE International Conference on Intelligence and Security Informatics, 156-

161 (2008)

4. Santos, I. et al.: Opcode sequences as representation of executables for data-mining based

unknown malware detection. Information Science. Vol 231. 64-82 (2013)

5. Egele, M., Scholte, T., Kirda, E., Kruegel, C.: A Survey on Automated Dynamic Malware

Analysis Techniques and Tools. ACM Computing Surveys, V. 44 n. 2,1-42 (2012)

6. Santos, I. et al.: OPEM: A Static-Dynamic Approach for Machine-learning-based Malware

Detection. In: International Conference CISIS12-ICEUTE12, 189, 271-280 (2013)

7. Ye,Y. et al.: SBMDS: an interpretable string based malware detection system using SVM

ensemble with bagging. Journal in Computer Virology. 5(4).283-293 (2009)

8. Zolkipli, M.F., Jantan, A.: Approach for Malware Behavior Identification and Classification.

In: 3rd International Conference on Computer Research and Development, Shanghai, 191-

194 (2011)

9. Islam, M.R., Tian, R., Batten, L., Versteeg, S.: Classification of malware based on integrated

static and dynamic features. Journal of Network and Computer Applications 36, 646–656

(2013)

10. Gandotra, E. , Bansal, D., Sofat, S.: Malware Analysis and Classification: A Survey. Journal

of Information Security, 5, 56-64 (2014)

11. Ranveer, S., Hiray, S.: Comparative Analysis of Feature Extraction Methods of Malware

Detection. International Journal of Computer Applications 120 (5),1-7 (2015)

12. Youngjoon, K., Eunjin, K., HuyKang, K.: A Novel approach to detect Malware based on

API call sequence analysis. International Journal of Distributed Sensor Networks, Article

No. 4, (2015)

13. Park, Y., Reeves, D., Mulukutla, V., Sundaravel,B.: Fast malware classification by

automated behavioural graph matching, In: Sixth Annual Workshop on Cyber Security and

Information Intelligence Research (2010)

14. Nari, S., Ghorbani, A.A.: Automated Malware Classification based on Network Behavior,

In: International Conference on Computing, Networking and Communications (ICNC)

(2013).

15. VxVault,http://www.vxvault.com

16. Vxheaven, http://www.vxheaven.org

17. VirusSign, http://www.virussign.com

18. VirusTotal, http://www.virustotal.com

19. Kornblum, J.: Identifying almost identical files using context triggered piecewise hashing.

Digital Investigation Journal, 91-97 (2006)

20. Hunt, G., Brubacher, D.: Detours: Binary Interception of Win32 Functions. 3rd Conference

on USENIX Windows NT Symposium, 135-143 (1999)

21. Firdausi, I. et al. : Analysis of Machine Learning Techniques used in Behavior-Based

Malware Detection. In: Second International Conference on Advances in Computing,

Control and Telecommunication Technologies (ACT). IEEE. 201-203. (2010)

Page 21: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Thank You

Page 22: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

CTPH: Workflow

Page 23: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

CTPH: Workflow

• Output of FNV hash -> 6 bits (LSB)

• BASE64 encoding -> [A..Z][a..z][0..9][+,/]

• Rather then generating single hash of file, hashes of different blocks of variable lengths are recorded.

Page 24: Malware Characterization using Windows API Call Sequencesmath-sa-sara0050/space16/... · CATEGORY SOME EXAMPLES Code No. of API 1 I/O Create CreatefileA, CreatePipe A 14 2 I/O Open

Worm A worm is a type of malicious program that spreads copies of itself to other devices

over a network.

Trojan-Downloader This type of trojan secretly downloads malicious files from a remote server, then

installs and executes the files.

Trojan-Spy This type of trojan secretly installs spy programs and/or keylogger programs.

Trojan-Dropper This type of trojan contains one or more malicious programs, which it will secretly

install and execute.

Backdoor A remote administration utility that bypasses normal security mechanisms to secretly

control a program, computer or network.a