Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.
![Page 1: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/1.jpg)
Reversing Data Formats: What Data Can Reveal
by Anton Dorfman ZeroNights 0x03, Moscow, 2013
![Page 2: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/2.jpg)
About me
• Fan of & Fun with Assembly language
• Researcher
• Scientist
• Teaching reverse engineering since 2001
• Candidate of Technical Sciences
![Page 3: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/3.jpg)
Observed topics
• Samples
• Basics
• Thoughts
• Fields
• Related works
• Applications
![Page 4: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/4.jpg)
Samples
![Page 5: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/5.jpg)
What is it?
![Page 6: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/6.jpg)
What is it? Hint 1
![Page 7: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/7.jpg)
What is it? Hint 2
![Page 8: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/8.jpg)
What is it?
![Page 9: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/9.jpg)
What is it?
![Page 10: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/10.jpg)
What is it? Hint 1
![Page 11: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/11.jpg)
What is it? Hint 2
![Page 12: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/12.jpg)
Basics
![Page 13: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/13.jpg)
Standard way
• Hex editor
• Researcher
• Brain (as important as the hex editor)
• Basic knowledge of how data can be organized (in the brain)
• Analysis of the executable file that manipulates the data format
![Page 14: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/14.jpg)
3 ways of researching
• Dynamic analysis – reverse engineer the application that manipulates the data format
• Static analysis – try to extract the header, structures and fields, and find relationships between the data
• Statistical analysis – when a number of samples is available, use statistics on how values change and which ranges they take
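The statistical approach can be sketched in a few lines. This is a minimal illustration, not the author's tool: the function name `classify_offsets` and the three sample blobs are hypothetical. It compares the same byte offset across several samples and reports which offsets never change (candidates for signatures or reserved fields) and what range the varying ones take.

```python
from typing import Dict, List


def classify_offsets(samples: List[bytes]) -> List[Dict[str, int]]:
    """For each byte offset shared by all samples, report the observed values."""
    length = min(len(s) for s in samples)  # compare only the common prefix
    report = []
    for off in range(length):
        values = {s[off] for s in samples}
        report.append({
            "offset": off,
            "constant": len(values) == 1,  # same byte in every sample
            "min": min(values),
            "max": max(values),
            "distinct": len(values),
        })
    return report


# Hypothetical samples: 4 constant bytes, 1 varying byte, 1 constant byte.
samples = [b"FMT0\x01\x00", b"FMT0\x02\x00", b"FMT0\x07\x00"]
report = classify_offsets(samples)
constant_offsets = [r["offset"] for r in report if r["constant"]]
print(constant_offsets)
```

The more samples are available, the more reliable the constant/varying split becomes; with only a few samples a field may look constant by coincidence.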
![Page 15: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/15.jpg)
Breaking the rules
• There is a rule: RTFM (Read The F**king Manual)
• Nobody likes it
• I’m no exception
• First I start my own research, and only then look for related works and generalize the ideas from them
![Page 16: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/16.jpg)
Definitions
• Field – a value used to describe the data format
• Structure – a way of organizing various fields
• Data – the information that the data format was developed to represent
• Header – a common structure that precedes the data; may contain substructures
![Page 17: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/17.jpg)
Thoughts
![Page 18: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/18.jpg)
Main Idea
• Data formats are developed by humans (not pets or aliens)
• Sometimes it looks as if no human worked on it…
• Format developers use common data organization concepts and think along similar lines when creating new data formats
• If we find regularities in how data formats are organized, we can automate the search for them
![Page 19: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/19.jpg)
Three types of data format
• File formats – multimedia formats, database formats, internal formats for exchange between program components, etc.
• Protocols – network protocols, hardware device interaction protocols, protocols for interaction between a driver and a user-space application, etc.
• Structures in memory – OS structures, application structures, etc.
![Page 20: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/20.jpg)
Levels of abstraction
• Bit Order
• Byte Order
• Field Size
• Field Basic Type
• Field Type
• Structure
• Field Semantics
![Page 21: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/21.jpg)
Tasks to solve
• Extract the header, separate it from the data
• Find field boundaries
• Find structures and substructures
• Find types of fields
• Detect bit and byte ordering
• Determine semantics of fields
• Interpret the purpose of each field – its semantics
![Page 22: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/22.jpg)
Bit order
• There is a most significant bit (MSB) and a least significant bit (LSB)
• MSB as the left-most bit – commonly used – the value is 10010101b = 95h = 149
• LSB as the left-most bit – the same bits read in reverse order give 10101001b = A9h = 169
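The two readings of the same bit sequence can be checked with a short sketch. The helper name `reverse_bits` is an illustrative choice, not something from the talk; it reverses the bit order of a value of a given width.

```python
def reverse_bits(value: int, width: int = 8) -> int:
    """Return `value` with the order of its lowest `width` bits reversed."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)  # move the lowest bit to the other end
        value >>= 1
    return result


msb_first = 0b10010101                # bits read with the MSB left-most
lsb_first = reverse_bits(msb_first)   # same bits read with the LSB left-most
print(f"{msb_first:08b}b = {msb_first:02X}h = {msb_first}")
print(f"{lsb_first:08b}b = {lsb_first:02X}h = {lsb_first}")
```

When a field decodes to implausible values, trying the reversed bit order is a cheap test before looking for more exotic encodings.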
![Page 23: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/23.jpg)
Byte order
• Little-endian – Intel byte order
• Big-endian – network byte order
• Middle-endian or mixed-endian – mixed byte order – occurs when the length of a value is greater than the default machine word of the processor
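A small sketch of how the same four bytes decode under each byte order, using Python's `struct` module. The mixed layout shown here (two little-endian 16-bit words, most significant word first) is only one possible mixed order and is an assumption for illustration.

```python
import struct

raw = bytes.fromhex("12345678")  # four bytes as they appear in a hex editor

little = struct.unpack("<I", raw)[0]  # least significant byte first
big = struct.unpack(">I", raw)[0]     # most significant byte first

# Assumed mixed layout: each 16-bit word is little-endian,
# but the high word is stored before the low word.
high_word, low_word = struct.unpack("<HH", raw)
mixed = (high_word << 16) | low_word

print(f"little-endian: {little:08X}h")
print(f"big-endian:    {big:08X}h")
print(f"mixed-endian:  {mixed:08X}h")
```

In practice the order is usually recognized by which interpretation yields small, plausible numbers for fields such as sizes and counts.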
![Page 24: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/24.jpg)
Fields
![Page 25: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/25.jpg)
Types of fields
• Service fields – describe the data format itself (size of a structure, etc.)
• Common fields – “fields from life” (time, date, etc.)
• Specific fields – a value range can be found for the type (bit flags, etc.)
• Application-specific fields – can be interpreted only by the application; dynamic analysis is required
![Page 26: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/26.jpg)
Field Size and Levels of field interpretation
• Commonly a field is a byte sequence
• Sometimes a field is a bit sequence
• Fixed size in bytes (1, 2, 3, 4, …)
• Fixed size in bits (1, 2, 3, 4, …)
• Variable size in bytes
• Variable size in bits
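Fields measured in bits rather than bytes can be read with a helper like the one below. This is a sketch under stated assumptions: the name `read_bits`, the MSB-first bit numbering, and the example layout (3-bit, 5-bit and 8-bit fields) are all hypothetical.

```python
def read_bits(data: bytes, bit_offset: int, bit_length: int) -> int:
    """Read an unsigned field of `bit_length` bits starting `bit_offset` bits
    from the start of `data`, counting bits MSB-first (an assumption)."""
    total_bits = len(data) * 8
    if bit_offset < 0 or bit_length <= 0 or bit_offset + bit_length > total_bits:
        raise ValueError("field lies outside the buffer")
    as_int = int.from_bytes(data, "big")
    shift = total_bits - bit_offset - bit_length  # bits to the right of the field
    return (as_int >> shift) & ((1 << bit_length) - 1)


# Hypothetical layout: 3-bit version, 5-bit kind, 8-bit length.
packed = bytes([0b10110100, 0b11000001])
version = read_bits(packed, 0, 3)
kind = read_bits(packed, 3, 5)
length = read_bits(packed, 8, 8)
print(version, kind, length)
```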
![Page 27: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/27.jpg)
Basic field types
• Hex value – by default
• Decimal value
• Character value (up to 4 symbols)
• String (ASCII or Unicode)
• Float value
• Bits value
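The basic types are different views of the same bytes. The sketch below shows several of them for one 4-byte field; the function name `interpretations` and the choice of `b"RIFF"` as input are illustrative assumptions, and longer strings are left out for brevity.

```python
import struct


def interpretations(raw: bytes) -> dict:
    """Show one 4-byte field under several basic type interpretations."""
    if len(raw) != 4:
        raise ValueError("this sketch handles 4-byte fields only")
    printable = all(0x20 <= b < 0x7F for b in raw)
    return {
        "hex": raw.hex().upper(),
        "decimal_le": int.from_bytes(raw, "little"),
        "decimal_be": int.from_bytes(raw, "big"),
        "chars": raw.decode("ascii") if printable else None,
        "float_le": struct.unpack("<f", raw)[0],
        "bits": format(int.from_bytes(raw, "big"), "032b"),
    }


views = interpretations(b"RIFF")
for name, value in views.items():
    print(f"{name:10} {value}")
```

Usually only one of the views looks meaningful to a human, which is the point of trying them side by side.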
![Page 28: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/28.jpg)
Types of field
• Text
• Offset
• Size
• Quantity
• ID
• Flag
• Counter
• DateTime
• Protect
![Page 29: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/29.jpg)
Text
• Usually fixed size; the remaining space after the text is filled with 0
• Variable-sized text ends with 0, or its size is stored in a separate field
• Text strings usually reveal their own meaning
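The three storage conventions for text can be sketched as three small readers. All names and the example blob are hypothetical, and the one-byte length prefix is just one assumed way of storing the size.

```python
def read_fixed(data: bytes, offset: int, size: int) -> str:
    """Fixed-size text field: keep everything before the zero padding."""
    raw = data[offset:offset + size]
    return raw.split(b"\x00", 1)[0].decode("ascii")


def read_cstring(data: bytes, offset: int) -> str:
    """Variable-sized text terminated by a zero byte."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("ascii")


def read_prefixed(data: bytes, offset: int) -> str:
    """Variable-sized text preceded by a one-byte size (assumed layout)."""
    size = data[offset]
    return data[offset + 1:offset + 1 + size].decode("ascii")


# Hypothetical blob: 8-byte padded field, zero-terminated string, prefixed string.
blob = b"name\x00\x00\x00\x00" + b"hello\x00" + b"\x03abc"
print(read_fixed(blob, 0, 8), read_cstring(blob, 8), read_prefixed(blob, 14))
```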
![Page 30: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/30.jpg)
Offset
• Offset to another field
• Offset to data
• Pointer to another structure (may be a child or a parent)
• Pointer to another instance of the same structure (next or previous in a linked list)
• Can be relative or absolute
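The difference between relative and absolute offsets can be shown with a short sketch. The function `resolve_offset` is hypothetical, and treating a relative offset as counted from the position of the offset field itself is an assumption; real formats may count from the end of the field or from the start of the enclosing structure.

```python
import struct


def resolve_offset(data: bytes, field_pos: int, relative: bool) -> int:
    """Read a little-endian dword at `field_pos` and return the position it
    points to. Relative offsets are assumed to count from `field_pos`."""
    (value,) = struct.unpack_from("<I", data, field_pos)
    target = field_pos + value if relative else value
    if not 0 <= target < len(data):
        raise ValueError("offset points outside the data")
    return target


# Hypothetical blob: both fields hold 8, but they point to different places.
blob = struct.pack("<II", 8, 8) + b"AAAA" + b"BBBB"
absolute_target = resolve_offset(blob, 0, relative=False)
relative_target = resolve_offset(blob, 4, relative=True)
print(blob[absolute_target:absolute_target + 4])
print(blob[relative_target:relative_target + 4])
```

A value that falls inside the file under one interpretation and outside it under the other is a useful hint about which kind of offset the format uses.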
![Page 31: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/31.jpg)
Export in PE-EXE
[Diagram: finding the exported function LoadLibraryExA in a PE module; every offset in the chain is added to the module ImageBase]
• IMAGE_DOS_HEADER: e_lfanew (DWORD) at +03Ch leads to IMAGE_NT_HEADERS
• IMAGE_NT_HEADERS: Signature (DWORD), FileHeader (IMAGE_FILE_HEADER), OptionalHeader (IMAGE_OPTIONAL_HEADER); IMAGE_DIRECTORY_ENTRY_EXPORT at +078h leads to IMAGE_EXPORT_DIRECTORY
• IMAGE_EXPORT_DIRECTORY: AddressOfFunctions at +01Ch, AddressOfNames at +020h, AddressOfNameOrdinals at +024h
• Names (Offsets): offsets of the function names (elem size – dword), e.g. ‘LoadLibraryA’,0 ... ‘LoadLibraryExA’,0 ... ‘LoadLibraryExW’,0
• Name Ordinals: indexes into Functions Addresses (elem size – word)
• Functions Addresses: FuncAddr X, FuncAddr X+1, FuncAddr X+2 ... (elem size – dword)
• Find the name in Function Names, take the entry with the same index from Name Ordinals, use it as the index into Functions Addresses
• LoadLibraryExA EP: mov edi,edi; push ebp; ...
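The chain of offsets in this slide can be followed in code. The sketch below is an illustration under stated assumptions: it expects a 32-bit (PE32) module laid out as it is in memory, so that each offset can be used directly as an index into the buffer, and it performs no validation. The function name `find_export` and the tiny synthetic image are hypothetical; the +018h position of NumberOfNames comes from the PE format, not from the slide.

```python
import struct


def find_export(image: bytes, name: str) -> int:
    """Return the offset (relative to ImageBase) of an exported function."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)          # IMAGE_DOS_HEADER
    (export_dir,) = struct.unpack_from("<I", image, e_lfanew + 0x78)
    (number_of_names,) = struct.unpack_from("<I", image, export_dir + 0x18)
    functions, names, ordinals = struct.unpack_from("<III", image, export_dir + 0x1C)
    wanted = name.encode("ascii")
    for i in range(number_of_names):
        (name_off,) = struct.unpack_from("<I", image, names + 4 * i)  # dword elements
        end = image.index(b"\x00", name_off)
        if image[name_off:end] == wanted:
            (index,) = struct.unpack_from("<H", image, ordinals + 2 * i)  # word elements
            (func_off,) = struct.unpack_from("<I", image, functions + 4 * index)
            return func_off
    raise KeyError(name)


# Synthetic image with only the fields the walk needs (all values made up).
buf = bytearray(0x200)
struct.pack_into("<I", buf, 0x3C, 0x40)                  # e_lfanew
struct.pack_into("<I", buf, 0x40 + 0x78, 0x100)          # export directory offset
struct.pack_into("<I", buf, 0x100 + 0x18, 2)             # NumberOfNames
struct.pack_into("<III", buf, 0x100 + 0x1C, 0x130, 0x140, 0x150)
struct.pack_into("<II", buf, 0x130, 0x1000, 0x2000)      # function addresses
struct.pack_into("<II", buf, 0x140, 0x160, 0x170)        # name offsets
struct.pack_into("<HH", buf, 0x150, 1, 0)                # name ordinals
buf[0x160:0x166] = b"Alpha\x00"
buf[0x170:0x175] = b"Beta\x00"
image = bytes(buf)
print(hex(find_export(image, "Alpha")), hex(find_export(image, "Beta")))
```

For a file on disk rather than a loaded module, each offset would first have to be translated through the section table, which this sketch leaves out.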
![Page 32: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/32.jpg)
Size
• Size of data
• Size of structure
• Size of substructure
• Size of field
• Can be expressed not only in bytes, but also in bits, words (2 bytes), double words, paragraphs (16 bytes), etc.
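One way to look for size fields is to test whether a value, scaled by a plausible unit, matches a length that is already known. The heuristic below is a sketch, not a method from the talk: the name `size_field_candidates`, the unit table, and the two lengths it compares against (whole buffer, or everything after the field) are assumptions.

```python
import struct

# Size units named on the slide, in bytes (bits are left out of this sketch).
UNITS = {"bytes": 1, "words": 2, "dwords": 4, "paragraphs": 16}


def size_field_candidates(data: bytes):
    """Yield (position, unit, value) for little-endian words whose scaled value
    equals the total length or the length of the data after the field."""
    width = 2
    for pos in range(len(data) - width + 1):
        (value,) = struct.unpack_from("<H", data, pos)
        if value == 0:
            continue
        remaining = len(data) - pos - width
        for unit, scale in UNITS.items():
            if value * scale in (len(data), remaining):
                yield pos, unit, value


# Hypothetical blob: a word holding 2, followed by 2 paragraphs (32 bytes) of data.
blob = struct.pack("<H", 2) + bytes(32)
hits = list(size_field_candidates(blob))
print(hits)
```

On real files such a scan produces false positives, so the candidates need to be confirmed against several samples of different sizes.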
![Page 33: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/33.jpg)
Quantity
• Quantity of substructures
• Quantity of elements
• IMAGE_NT_HEADERS.IMAGE_FILE_HEADER.NumberOfSections
• IMAGE_EXPORT_DIRECTORY.NumberOfFunctions
![Page 34: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/34.jpg)
ID
• Identifier of the data format, often called a magic number
• ID of structure
• ID of substructure
• Often the ID is a value consisting of character symbols
• Often the ID looks like “magic” – BE BA FE CA
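Both kinds of ID can be recognized with very little code. In this sketch the helper `looks_like_text_id` is a hypothetical name; the byte sequence is the one from the slide, and `b"RIFF"` is used only as a familiar character-based example.

```python
import struct


def looks_like_text_id(raw: bytes) -> bool:
    """True if every byte is a printable ASCII character."""
    return all(0x20 <= b < 0x7F for b in raw)


magic_bytes = bytes.fromhex("BEBAFECA")            # as seen in the hex editor
magic_value = struct.unpack("<I", magic_bytes)[0]  # read as a little-endian dword
print(f"{magic_value:08X}h")                       # the "magic" becomes readable

print(looks_like_text_id(b"RIFF"), looks_like_text_id(magic_bytes))
```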
![Page 35: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/35.jpg)
Flag
• Bit flags
• Enum values as flags
• Special case – Bool value
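Once a field is suspected to hold bit flags, listing the set bits is the first step, and naming them comes later. The flag names in `Attr` below are hypothetical placeholders, not taken from any real format.

```python
from enum import IntFlag
from typing import List


class Attr(IntFlag):
    """Hypothetical flag names assigned after analysis."""
    READ_ONLY = 0x01
    HIDDEN = 0x02
    SYSTEM = 0x04


def set_bits(value: int) -> List[int]:
    """Return the positions of all bits set in `value`, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


raw_value = 0x05
flags = Attr(raw_value)
print(set_bits(raw_value))
print(Attr.READ_ONLY in flags, Attr.HIDDEN in flags, Attr.SYSTEM in flags)
```

A field whose observed values are only ever combinations of a few bits is a strong flag candidate, while a field taking consecutive small values is more likely an enum.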
![Page 36: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/36.jpg)
Counter
• Increments (or possibly decrements)
• Starts at 0 or at another value
• Changes by 1 or by another value
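A counter can be detected by checking whether the values of one field, taken from consecutive samples or records, change by a constant step. The function `counter_step` is a hypothetical sketch and ignores wrap-around at the maximum value of the field.

```python
from typing import List, Optional


def counter_step(values: List[int]) -> Optional[int]:
    """Return the constant non-zero step between consecutive values, or None."""
    if len(values) < 3:
        return None  # too few observations to call it a counter
    steps = {b - a for a, b in zip(values, values[1:])}
    if len(steps) != 1:
        return None
    step = steps.pop()
    return step if step != 0 else None


print(counter_step([5, 6, 7, 8]))      # increments by 1, does not start at 0
print(counter_step([100, 96, 92]))     # decrements by 4
print(counter_step([1, 2, 4]))         # not a constant step
```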
![Page 37: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/37.jpg)
DateTime
• Storage format
• Resolution
• Starting moment (epoch)
• Range
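Two widely used formats illustrate how resolution and starting moment differ: Unix time counts seconds from 1970-01-01, and Windows FILETIME counts 100-nanosecond ticks from 1601-01-01. The sketch below converts both; the plausibility window in `plausible` (1980 to 2040) is an arbitrary assumption for filtering candidates.

```python
from datetime import datetime, timedelta, timezone


def from_unix(seconds: int) -> datetime:
    """Seconds since 1970-01-01 00:00:00 UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def from_filetime(ticks: int) -> datetime:
    """100-nanosecond ticks since 1601-01-01 00:00:00 UTC."""
    start = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return start + timedelta(microseconds=ticks // 10)


def plausible(moment: datetime, first_year: int = 1980, last_year: int = 2040) -> bool:
    """Assumed sanity window: does the decoded date look like a real timestamp?"""
    return first_year <= moment.year <= last_year


print(from_unix(1384000000))
print(from_filetime(116444736000000000))
```

Trying each known format on a candidate field and keeping only the interpretations that land in a sensible date range narrows the options quickly.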
![Page 38: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/38.jpg)
Protect
• CRC value of data
• CRC value of substructure
• Various hash functions, etc.
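If the boundary between header and data is already known, a protection field can be found by computing a checksum over the data and searching the header for it. The sketch below tries only the standard CRC-32 from `zlib`, stored as a little-endian dword; both choices, and the name `find_crc32_field`, are assumptions, and a real search would try several algorithms, byte orders and covered ranges.

```python
import struct
import zlib


def find_crc32_field(data: bytes, payload_start: int) -> int:
    """Return the header position holding CRC-32 of the payload, or -1."""
    crc = zlib.crc32(data[payload_start:]) & 0xFFFFFFFF
    needle = struct.pack("<I", crc)
    return data.find(needle, 0, payload_start)  # search the header only


# Hypothetical blob: 2-byte tag, CRC-32 of the payload, 2 reserved bytes, payload.
payload = b"example payload"
header = b"HD" + struct.pack("<I", zlib.crc32(payload) & 0xFFFFFFFF) + b"\x00\x00"
blob = header + payload
position = find_crc32_field(blob, len(header))
print(position)
```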
![Page 39: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/39.jpg)
Related works
![Page 40: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/40.jpg)
Projects, articles and authors
• Protocol Informatics Project – based on “Network Protocol Analysis using Bioinformatics Algorithms” by Marshall A. Beddoe
• Discoverer – “Discoverer: Automatic Protocol Reverse Engineering from Network Traces” by Weidong Cui, Jayanthkumar Kannan, Helen J. Wang
• Laika – “Digging For Data Structures” by Anthony Cozzie, Frank Stratton, Hui Xue, and Samuel T. King
• RolePlayer – “Protocol-Independent Adaptive Replay of Application Dialog” by Weidong Cui, Vern Paxson, Nicholas C. Weaver, Randy H. Katz
• Tupni – “Tupni: Automatic Reverse Engineering of Input Formats” by Weidong Cui, Marcus Peinado, Karl Chen, Helen J. Wang, Luiz Irun-Briz
![Page 41: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/41.jpg)
Protocol Informatics Project
• Global and local sequence alignment – Needleman–Wunsch and Smith–Waterman algorithms
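To show what global alignment does with protocol messages, here is a minimal Needleman–Wunsch sketch over byte strings. It is not the Protocol Informatics implementation: the scoring values (match +1, mismatch -1, gap -1) and the two sample messages are assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

Aligned = List[Optional[int]]  # a byte value, or None for a gap


def needleman_wunsch(a: bytes, b: bytes, match: int = 1, mismatch: int = -1,
                     gap: int = -1) -> Tuple[int, Aligned, Aligned]:
    """Globally align two byte strings; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    out_a: Aligned = []
    out_b: Aligned = []
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(None); i -= 1
        else:
            out_a.append(None); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], out_a[::-1], out_b[::-1]


total, aligned_a, aligned_b = needleman_wunsch(b"GET /a", b"GET /ab")
render = lambda seq: "".join("-" if x is None else chr(x) for x in seq)
print(total, render(aligned_a), render(aligned_b))
```

Columns where all aligned messages agree suggest constant fields, while columns with gaps or differing bytes suggest variable fields. Smith–Waterman differs mainly in clamping scores at zero and starting the traceback from the best-scoring cell, which gives a local alignment.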
![Page 42: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/42.jpg)
Discoverer
![Page 43: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/43.jpg)
Laika
• Using Bayesian unsupervised learning
• For fixed-size structures only
![Page 44: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/44.jpg)
Tupni
• Based on a taint tracking engine
![Page 45: Reversing Data Formats: What Data Can Reveal by Anton Dorfman ZeroNights 0x03, Moscow, 2013.](https://reader038.fdocuments.us/reader038/viewer/2022110116/5516e8ad550346f5558b4831/html5/thumbnails/45.jpg)
Applications
• RE any program
• RE undocumented/proprietary file formats
• RE undocumented/proprietary network protocols
• RE undocumented structures in memory
• Fuzzing
• Examination of protocol implementation
• Replay network interaction
• Zero-day vulnerability signature generation