Hatkit Project - Datafiddler

Developed by Martin Holst Swende 2010-2011 Twitter: @mhswende [email protected]

description

Presentation of the Hatkit Datafiddler, which is part of the Hatkit Project (Http Analysis Toolkit)

Transcript of Hatkit Project - Datafiddler

Page 1: Hatkit Project - Datafiddler

Developed by Martin Holst Swende 2010-2011

Twitter: @mhswende

[email protected]

Page 2: Hatkit Project - Datafiddler

This presentation is just a quick and steep dive into the Datafiddler. It does not cover much, but hopefully gives a bit of understanding about what the Datafiddler is capable of.

The Datafiddler operates on data stored by the Hatkit Proxy in a MongoDB database. The proxy is not covered in this presentation.

Two primary views exist: the table view and the aggregator.

A third view, 3rd party plugins, is planned but not implemented in the UI.

Page 3: Hatkit Project - Datafiddler

Dynamic display of data in a table-based layout

(1:1 mapping)

Page 4: Hatkit Project - Datafiddler

This is the data that is fetched from each document ('row') in the database.

The variable 'v1' will contain request.time

These are the column definitions: Python code which is evaluated. The definitions have access to the variables and to a library of 'transformations'.

date(millis) takes a UTC timestamp (in milliseconds) and converts it to a human-readable format. The second column will be titled Date and contain the result of date(v1).
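The variable-and-column mechanism described on this page can be sketched roughly as follows. This is only an illustration under stated assumptions, not Datafiddler's actual code: the helpers fetch and evaluate_column are hypothetical names, and this date is a stand-in for the real transformer, whose exact output format is not shown in the presentation.

```python
from datetime import datetime, timezone


def date(millis):
    """Stand-in transformer: format a UTC timestamp in milliseconds as text."""
    moment = datetime.fromtimestamp(millis / 1000.0, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def fetch(document, dotted_path):
    """Hypothetical helper: follow a dotted path such as 'request.time'."""
    value = document
    for key in dotted_path.split("."):
        value = value[key]
    return value


def evaluate_column(expression, variables, transformers):
    """Hypothetical helper: evaluate one column definition as Python code,
    with the variables and the transformer library in scope."""
    scope = dict(transformers)
    scope.update(variables)
    return eval(expression, {"__builtins__": {}}, scope)


# One stored document ('row') with a millisecond timestamp.
document = {"request": {"time": 1300000000000}}
variables = {"v1": fetch(document, "request.time")}
cell = evaluate_column("date(v1)", variables, {"date": date})
print(cell)
```

Evaluating the definitions as plain Python is what makes the columns flexible: any expression over the variables and transformers is a valid column.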

Page 5: Hatkit Project - Datafiddler

The v0 parameter is the object id. This column uses 'Coloring', which means that the value is not displayed; instead, a color is calculated from the hash of the value.

This is particularly useful, e.g., when values are long but not interesting. Cookie values take a lot of screen real estate, but often it is only interesting to see when they change, which is shown by the color.
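A minimal sketch of the coloring idea, assuming the color is simply taken from the first bytes of a hash digest. The function name color_for and the choice of SHA-256 are assumptions made for this illustration; the presentation does not say which hash Datafiddler uses.

```python
import hashlib


def color_for(value):
    """Map any value to a stable '#rrggbb' color derived from its hash."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    red, green, blue = digest[0], digest[1], digest[2]
    return "#{:02x}{:02x}{:02x}".format(red, green, blue)


# Equal cookie values get the same color; a changed value gets another one.
cookies = ["JSESSIONID=abc123", "JSESSIONID=abc123", "JSESSIONID=zzz999"]
colors = [color_for(cookie) for cookie in cookies]
print(colors)
```

The cell then needs only a narrow colored box, and a change in a long value shows up as a change of color from one row to the next.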

Page 6: Hatkit Project - Datafiddler

There are a lot of predefined 'transformers' which can be used when defining the columns.

For example, the function below makes it possible to display both URL parameters and POST parameters in the same column.

showparams(url,form)

Sorts parameters by keys. You can send in two dicts, and get the combined result. This makes it easier to show both form-data and url-data in the same column. Example:

variable v2: request.url
variable v3: request.data
column: sortparams(v2, v3)

// Another version
variable v1: request
column: sortparams(form=v1.data, url=v1.url)
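What such a transformer might look like is sketched below. This is not the actual Datafiddler implementation: it assumes both arguments are plain dicts of parameter names to values, that form values win when a key occurs in both, and that the result is rendered as one text string.

```python
def sortparams(url=None, form=None):
    """Sketch: merge two parameter dicts and render them sorted by key."""
    combined = {}
    combined.update(url or {})
    combined.update(form or {})  # assumption: form values win on a key clash
    return ", ".join(
        "{}={}".format(key, combined[key]) for key in sorted(combined)
    )


# Positional and keyword use, mirroring the two column examples above.
print(sortparams({"b": "2", "a": "1"}, {"c": "3"}))
print(sortparams(form={"x": "1"}, url={"a": "0"}))
```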

Page 7: Hatkit Project - Datafiddler

It is simple to write the kind of view you need for the particular purpose at hand.

Some example scenarios:

- Analysing user interaction using several accounts with different browsers:
  * Color cookies
  * Color user-agent
  * Parameters
  * Response content type (?)

- Analysing server infrastructure
  * Color server headers
  * Server header value for X-powered-by, Server etc.
  * File extension
  * Cookie names

- Searching for reflected content (e.g. for XSS)
  * Parameter values
  * True/False if parameter value is found in response body (simple python hack)

- Analyzing brute-force attempt
  * Request parameter username
  * Request parameter password
  * Response delay
  * Response body size
  * Response code
  * Response body hash
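The "simple python hack" for reflected content could be as small as the sketch below. The function name reflected and the min_length cut-off are assumptions for illustration; the cut-off is there only so that very short values such as "1" do not match by accident.

```python
def reflected(parameters, response_body, min_length=4):
    """Return True if any sufficiently long parameter value occurs in the body."""
    for value in parameters.values():
        text = str(value)
        if len(text) >= min_length and text in response_body:
            return True
    return False


body = "<p>You searched for <script></p>"
print(reflected({"q": "<script>"}, body))
print(reflected({"page": "1"}, body))
```

Used as a column, this gives one True/False cell per request, which is quick to scan for candidates worth testing by hand.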

After you write some good column definitions for a particular purpose, save them for next time.

Page 8: Hatkit Project - Datafiddler

This is an example of how an object (request-response) is stored in the database. Each individual field can be used in database queries; more advanced functionality can be achieved using JavaScript which is executed inside the database.

Since MongoDB does not impose a schema, these structures were dynamically generated by the writer (Hatkit proxy) on the fly.

Dynamic properties such as headers and parameters can be used for selection just like any ’static’ property, such as response.rtt, which will always be there.

This enables semantics like ”Select request.url.parameters.z from x where request.url.parameters.z exists”.

…(but just to be clear: all keys/values are dynamic)
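In MongoDB terms, the pseudo-SQL above corresponds to a filter using the $exists operator plus a projection on the same field. The sketch below only builds the two dicts, so it runs without a database; the helper name exists_query is hypothetical.

```python
def exists_query(field):
    """Build a MongoDB filter and projection selecting `field` where it exists."""
    query_filter = {field: {"$exists": True}}
    projection = {field: 1}
    return query_filter, projection


query_filter, projection = exists_query("request.url.parameters.z")
print(query_filter)
print(projection)

# With pymongo and a running database, these could be passed on as
#   collection.find(query_filter, projection)
# where `collection` is the collection the proxy writes to (name not shown here).
```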

Page 9: Hatkit Project - Datafiddler

Displays aggregated data in a tree structure (1:N mapping)

Page 10: Hatkit Project - Datafiddler

Aggregation (grouping) is a feature of MongoDB. It is like a specialized Map/Reduce with a size limit: in the MongoDB versions of the time, a group operation could return at most 10 000 unique groups.

You provide the framework with a couple of directives, and the database returns the results, which are different kinds of sums. This enables useful kinds of queries whose results can be displayed as a tree.

Example: A sitemap can be easily generated
Example: Show all HTTP response codes, sorted by host/path
Example: Show all unique HTTP header keys, sorted by extension
Example: Show all request parameter names, grouped by host
Example: Show all unique request parameter values, grouped by host
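To make the shape of such a result concrete, here is an in-memory sketch of the "response codes by host/path" example. It does not call MongoDB; it only mimics the kind of grouped sums the database would return, and the field names request.host, request.path and response.code are assumptions about the stored documents.

```python
from collections import defaultdict


def group_response_codes(documents):
    """Count response codes per host and path, as a nested dict (a tree)."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for doc in documents:
        host = doc["request"]["host"]
        path = doc["request"]["path"]
        code = doc["response"]["code"]
        tree[host][path][code] += 1
    # Convert to plain dicts so the result is easy to print and compare.
    return {
        host: {path: dict(codes) for path, codes in paths.items()}
        for host, paths in tree.items()
    }


documents = [
    {"request": {"host": "example.com", "path": "/"}, "response": {"code": 200}},
    {"request": {"host": "example.com", "path": "/"}, "response": {"code": 200}},
    {"request": {"host": "example.com", "path": "/admin"}, "response": {"code": 403}},
]
tree = group_response_codes(documents)
print(tree)
```

Each nesting level of the dict maps naturally onto one level of the tree view: host, then path, then code with its count.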

Page 11: Hatkit Project - Datafiddler
Page 12: Hatkit Project - Datafiddler
Page 13: Hatkit Project - Datafiddler
Page 14: Hatkit Project - Datafiddler

Provides capabilities to use existing frameworks, libraries and applications for analysing captured data

Page 15: Hatkit Project - Datafiddler

3rd party analysis: the idea is to use plugins that take the stored traffic and ’replay’ it through other frameworks. Status: API defined, no UI exists. Runnable through console.

W3af plugin

Plugin which uses the ’greppers’ in w3af to analyse each request/response pair. Requires w3af to be installed, calls relevant parts of the w3af code directly.

Status: Code works, but not feature complete.

Ratproxy plugin

Plugin which starts ratproxy (by lcamtuf) and opens a port (X) for listening. It sets ratproxy to use port X as forward proxy, then replays all traffic through ratproxy, while capturing the output from the process.

Status: PoC performed, but not nearly finished

Httprint plugin

Plugin which uses httprint to fingerprint remote servers.

Status: Idea-stage, unsure if httprint is still alive

Page 16: Hatkit Project - Datafiddler
Page 17: Hatkit Project - Datafiddler

For ’breakers’: Datafiddler is very useful for analyzing remote servers and applications, from a low-level infrastructure point of view to high-level application flow.

For ’defenders’: Hatkit proxy can be set up as a reverse proxy, logging all incoming traffic. Datafiddler can be used as a tool to analyze user interaction, e.g. to detect malicious activity and perform post-mortem analysis. The proxy is very lightweight on resources (using Rogan Dawes’ OWASP Proxy), and the backend (MongoDB) has great potential to scale and can handle massive amounts of data.

Page 18: Hatkit Project - Datafiddler

Hatkit proxy requirements:
• Java
• (optional**: MongoDB)
• (MongoDB Java drivers included in binary release)

** Can be used in interception-only mode, where data is not stored.

Datafiddler requirements (only tested on Linux / Ubuntu):
• Python
• Qt4
• PyQt4 bindings
• Python MongoDB driver
• MongoDB
• (optional: w3af)
• (optional: ratproxy)

To get up and running, grab Hatkit proxy:
Src: http://martin.swende.se/hgwebdir.cgi/hatkit_proxy/
Bin: http://martin.swende.se/hgwebdir.cgi/hatkit_proxy/raw-file/tip/hatkit.zip

And Datafiddler:
Src: http://martin.swende.se/hgwebdir.cgi/hatkit_fiddler/