Information Assurance in a Distributed Forensic Cluster

Information Assurance in a Distributed Forensic Cluster

Nick Pringle a*, Mikhaila Burgess a

a University of South Wales (formerly University of Glamorgan), Treforest, CF37 1DL, UK

Good afternoon.

My name is Nick Pringle from the University of South Wales in the UK.

This presentation is about Information Assurance in a Distributed Forensic Cluster.

Introduction

As data quantities increase we will need to adopt alternative models in our forensic processing environments.

We believe that Distributed Processing will play a key part in this.

We believe existing practice breaks down in a distributed system.

We're going to show our design for a framework that provides data assurance in a distributed storage environment.

We all know that, because of the relentless advance in technology, the nature of our work is changing, and that we need new approaches and tools for us to continue to investigate digital evidence effectively.

We believe that, at least for large cases, distributed processing will play a very important part in our future.

In the next 20 minutes I'll explain the problem, revisit one of the foundations of our subject, and then see that it breaks down when we move to a distributed environment.

I'll show you FCluster and FClusterfs, which we propose as a solution.

And then there'll be 10 minutes for any questions.

Forensic Soundness

It's a key part of our discipline
It's quite hard to define
Existing standards and frameworks are a little vague
It's all down to accepted Best Practice
It's achieved by implementing controls


One of the most important concepts in forensics is that of Forensic Soundness.

It would be nice if there were an explicit list of things to do to achieve this, but we don't really have one.

There are documents like the ACPO guidelines, standards like ISO 27037, and various books written over the years, but these often use rather vague language like 'adequate' or 'appropriate'.

What we can say is that it comes down to what is accepted as Best Practice.

Internal Controls on the Forensic Process

By Property, e.g. cryptographic hashes, sizes, names
By Location, e.g. on specific media, network storage
By Authority, e.g. order and response forms
By Access Control, e.g. write blocker, password
By Separation of Process, e.g. crime scene and lab work
By Checklist, e.g. have all the tasks been completed?
By Audit, but this is after the process

If I tried to define exactly what IS needed, I would certainly include these.

In the business world these are called Internal Controls

These are a selection of some of them. I'm not going to go through them one by one, as I'm sure you'll recognise them.

I'm sure you could suggest more. There should be training, testing and calibration of equipment.

We could probably devote an entire conference to the subject.

At the Bedrock of Forensics: The Forensic Image

It's a snapshot at a point in time
It is complete, including boot sectors, unallocated space, HPA and HPC areas
Rather like the pieces of a jigsaw, the parts form a whole
We can measure it with SHA-1 etc.


We can further clarify Soundness when we look at taking a forensic image of storage media. Forensic Soundness is at the very heart of The Forensic Image.

Other forensic disciplines must be envious that we can, sometimes in a single action, capture the entire crime scene.

The Image is complete in very specific ways. It is created with special software that has been tested and approved, and this can be differentiated from ordinary products like Partition Magic.

We can measure the image. Cryptographically hash it and test it for change.
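As a minimal illustration of that measurement (the image path and chunk size below are assumptions of this sketch, not part of the authors' tooling), hashing an image in fixed-size chunks with Python's standard hashlib scales to arbitrarily large media:

```python
# Minimal sketch: compute the SHA-1 of a forensic image in fixed-size chunks
# so that very large images can be hashed without loading them into memory.
import hashlib

def sha1_of_image(path, chunk_size=1024 * 1024):
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Re-hashing later and comparing against the recorded value demonstrates
# that the image has not changed since acquisition.
print(sha1_of_image("/evidence/suspect_disk.dd"))
```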

The image has become the bedrock of our process.

Traditional Architectures - Based around a Forensic Image

Forensic Image Stores

Of course we then need to analyse the Image.

These are the architectures that we've used for 20 years.

The basic architecture is a single PC, where we store the image on a local drive and analyse its contents on the local processor.

The advanced version is a network, with a central file store, connected to workstations via Ethernet. We store the image on the file server and analyse it on the workstations.

It's all fairly straightforward.

A Time of Great Change

In the Golden Age life was so simple (Simson Garfinkel, 2010)
3V: Volume, Variety and Velocity (Gartner, 2007)
We now have desktops, notebooks, netbooks, virtualisation, cloud storage, cloud processing, smart phones, tablets, SatNav, USB sticks, memory cards, terabyte drives, games machines, cameras, etc.
We find it difficult to cope with the sheer volume of data
We have a backlog


Well, it was straightforward until recently. I don't need to tell anyone here that we are undergoing huge changes in technology at the moment.

Simson Garfinkel referred to the period from 1999-2007 as the Golden Age. Remember when a computer crime was one PC with one hard disk running Windows 98 or XP? (Digital Forensic Research: The Next 10 Years, 2010)

In 2007 the Gartner organisation used the phrase 3V: Volume (it's bigger), Variety (it comes in many more forms) and Velocity (this new stuff just keeps on coming).

We need to analyse even bigger media, and more of it. Cross-drive forensics and multi-agency access are becoming more important.

Traditional Architectures - Based around a Forensic Image

This is limited to SATA3 and 32 cores?

Forensic Image Stores
SATA3 = 600 MB/s
SSD read = 500 MB/s
If 1 core can process 4 MB/s, this could occupy 150 cores (processor constrained)
Gigabit network = 100 MB/s; this can supply only 50 cores (network constrained)

When we look at how traditional architectures cope with this increase in workload, we uncover a couple of problems.

If we use FTK as a standard, we find that no matter which processor we consider (i3, i5, i7 or Xeon), a single core can process ON AVERAGE about 4 MB/s. This is based on today's expectation of processing.

On the left, we have the single PC which has an SSD. The SSD can deliver data fast enough to feed about 150 cores. But as we add more cores to a single PC, costs rise exponentially: 64-core processors are disproportionately more expensive than eight 8-core processors.

When we try to solve this by moving to a cluster of workstations, the connection between the file storage and the network switch becomes the bottleneck. Surprisingly, this makes it worse! 10 Gb networking does exist, but again it's disproportionately expensive, and will only allow us to occupy perhaps fifteen 8-core processing hosts.

More automation would mean more processing, which increases the number of cores required to keep the same processing time.

Anticipated Developments

Multi-terabyte crime scenes
Multi-agency access
Multi-device analysis
Complex processing, image and object recognition
Semantic meaning of text
Usage profiling
Google had the same type of problem

Existing architectures struggle with large-scale investigations.

This research is not aimed at the analysis of a single mobile phone or a single memory stick.

It looks forward to a time when our software is more ambitious, and consequently more demanding on processing, where it uses artificial intelligence or advanced statistics to assist us with automatic processing. This would extend processing times even further with existing architectures.

Google had the same type of problem 15 years ago.

Google/Apache Hadoop

A processing model - Map/Reduce
A file system - HDFS
Split the data as whole files (SIPs/DEBs) across the cluster

and

Don't move the data
Run the program where the data is stored

Google wanted to process vast quantities of data coming in from the log files which were generated when their search engine was used.

They solved their problem with a processing model called Map/Reduce and a custom file system. The complete system is called Hadoop.

For a number of reasons that I'll not explain in detail now, Hadoop is not suitable for our purposes. Let's just say that it does not really support random access, and as HDFS splits large files into chunks over separate hosts, it is not an efficient solution for our large image files that require random access.

Hadoop was designed to solve some very specific problems when analysing long streams of data, but we can learn two valuable lessons that we can implement in a system specifically designed for digital forensic processing.

1. We should split the data as WHOLE FILES over multiple machines.
2. We shouldn't move the data; we should run the program where the data is stored.
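A minimal sketch of those two lessons, using invented names (catalogue, schedule) rather than anything from Hadoop or FCluster: a catalogue records which hosts hold a complete copy of each whole file, and a task is dispatched to one of those hosts instead of pulling the data across the network.

```python
# Sketch of "don't move the data": look up which hosts already hold a whole
# file and dispatch the task to one of them. All names here are illustrative.
import random

# host name -> set of whole files (SIPs/DEBs) stored locally on that host
catalogue = {
    "node01": {"sip_0001", "sip_0002"},
    "node02": {"sip_0003"},
    "node03": {"sip_0002", "sip_0003"},
}

def hosts_holding(filename):
    return [host for host, files in catalogue.items() if filename in files]

def schedule(filename, task):
    local_hosts = hosts_holding(filename)
    if not local_hosts:
        raise LookupError(f"{filename} is not stored anywhere in the cluster")
    chosen = random.choice(local_hosts)      # trivial load spreading
    print(f"run {task} on {chosen} against its local copy of {filename}")

schedule("sip_0002", "bulk_extractor")
```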

Solutions and Opportunities

Distributed processing is one that interests me


The key to Google's successful implementation of their design is that it's middleware.

I need to make it clear that it's middleware that I'm working on.

It's not the operating system, and it's not an application program.

Distributed Architecture

Encrypted and secure VPN

This raises the possibility of large clusters that might share processing.

There are many issues to be addressed, but I believe that the opportunities are worth the effort of solving them.

We Lose The Image

Distributed storage of acquired information packages is in direct conflict with the image
The image's integrity comes, primarily, from its wholeness

We lose the integrity we have enjoyed for 20 years

We need to re-establish Assurance

There is a major problem with storing Images in a distributed storage system.

It's the size of the image file that's the problem: to fully exploit distributed processing we need to store data across the cluster, but if we break images up into small fragments, we lose the integrity we hold so dear.

We need to address this as a priority issue.

This seems to lead us to adopt Digital Evidence Bags or Submission Information Packages.

Distributed Data needs to appear as a single file system

If we adopt the same approach as Google we can make a distributed file system appear to be uniform from the point of view of the user.

Google have the resources to allow them to build a completely new directory service.

I don't have the resources to do the same, but I can use something which has been suggested before in forensics.

A FUSE file system.

FUSE File-Systems

[Diagram: an application program calling the kernel's virtual file system, which chains on to the driver for a specific file system]

Under normal circumstances, a program accesses the file system through a virtual file system interface.

There are just over 40 functions that form the interface between an application program and a file system.

A call to the Virtual file system is chained on to the code to perform a function for a specific file system format.

FUSE File-Systems

[Diagram: a FUSE file system inserted into the chain between the virtual file system and the real file system]

A File System in User Space allows an ordinary user program to be inserted in the chain.

We can open out the chain and slide in a new set of functions to modify the behaviour of the file system.

In this illustration you can see that I've replaced any function which would modify the file system with 'no operation' code, and in doing so I have created a read-only file system.
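As a minimal sketch of that idea, here is a read-only passthrough file system written against the Python fusepy binding (the choice of binding is this sketch's assumption, not the authors'). Read calls pass straight through to the real file system, while every modifying call is replaced so the mounted view is read only.

```python
# Read-only passthrough FUSE file system, sketched with the fusepy binding.
# Read operations are chained through to the underlying file system; every
# operation that would modify it raises EROFS ("read-only file system").
import errno
import os
import sys
from fuse import FUSE, FuseOSError, Operations

class ReadOnlyPassthrough(Operations):
    def __init__(self, root):
        self.root = root

    def _real(self, path):
        return os.path.join(self.root, path.lstrip("/"))

    # --- read side: pass calls through to the real file system ---
    def getattr(self, path, fh=None):
        st = os.lstat(self._real(path))
        return {key: getattr(st, key) for key in (
            "st_mode", "st_nlink", "st_size", "st_uid", "st_gid",
            "st_atime", "st_mtime", "st_ctime")}

    def readdir(self, path, fh):
        return [".", ".."] + os.listdir(self._real(path))

    def open(self, path, flags):
        return os.open(self._real(path), os.O_RDONLY)

    def read(self, path, size, offset, fh):
        os.lseek(fh, offset, os.SEEK_SET)
        return os.read(fh, size)

    def release(self, path, fh):
        return os.close(fh)

    # --- write side: "no operation", so the mount is effectively read only ---
    def write(self, path, data, offset, fh):
        raise FuseOSError(errno.EROFS)

    def unlink(self, path):
        raise FuseOSError(errno.EROFS)

    def mkdir(self, path, mode):
        raise FuseOSError(errno.EROFS)

    def rename(self, old, new):
        raise FuseOSError(errno.EROFS)

if __name__ == "__main__":
    # usage: python rofs.py <real_directory> <mount_point>
    FUSE(ReadOnlyPassthrough(sys.argv[1]), sys.argv[2], foreground=True)
```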

FUSE File-Systems

[Diagram: the FUSE file system in place, with the underlying real file system substituted]

Here you can see the FUSE file system in place, and we can substitute the real file system, in this case from EXT3 to NTFS.

And ALL OF THIS was done without changing the application program.

FUSE File System in Forensics

Forensic discovery auditing of digital evidence containers - Richard, Roussev & Marziale (2007)
Selective and intelligent imaging using digital evidence bags - Turner, P. In: Proceedings of the Sixth Annual Digital Forensics Research Workshop (DFRWS), Lafayette, IN, Aug 2006
Affuse (Simson Garfinkel)
MountEWF
Xmount, for VirtualBox or VMware format disk images

As I said previously, I'm not the first person to propose using FUSE file systems in digital forensics.

FClusterfs - A Wish List

The ability to store extended directory/file metadata
We want unaltered legacy software to run; new software requires no new skillset. Sculptor, bulk_extractor etc. will still work
Gives access to files on remote servers where they're stored as whole files
The ability to handle multiple storage volumes from different media
Has end-to-end encryption built in
Tracks movements and processing: logging
Is read only to the user
Highly tailorable access control at volume, directory and file levels

So if we're going to design a file system for our forensic work, what do we want it to do?

These are just some of the things I'd like to see in a forensic file system.

I'd certainly want it to store the metadata from the original file system.

The ability to run legacy software like sculptor, bulk_extractor or AFF would be a bonus.

It would be an advantage to be able to handle multiple file system volumes within one real file system; this would enable us to conduct cross-device forensics.

Encryption, logging and, of course, being read only would be important too.

I'm sure there are others.

Existing FUSE File-Systems

MySQLfs - substitutes an SQL database for the file system
CurlFTPfs - mounts an ftp/ssh/sftp/http/https server
Loggedfs - records all file access activity
eCryptfs - encrypts and decrypts data per file on the fly
ROfs - a read-only file system

There are about 50 FUSE file systems at the moment, of which five are of particular interest. The most interesting is MySQLfs, which replaces the entire file system normally found on the storage media with tables within an SQL database. The directory is stored in an SQL table: there is a tree table that stores the directory hierarchy, and a data_blocks table that actually holds the data.

CurlFTPfs enables an FTP or SSH server to be mounted as if it were a local file system, with full read/write functionality.

Loggedfs creates an audit trail when a file is accessed.

eCryptfs introduces code that enables on-the-fly encryption and decryption during file read/write actions, using strong algorithms like AES-256.

ROfs creates a read-only file system that doesn't generate error messages.

Distributed Data appearing as a single file system

[Diagram: a MySQLfs directory listing files A to F, with the file data distributed across several storage servers and accessed through the eCryptfs, CurlFTPfs, ROfs and Loggedfs layers]

This is how the five FUSE file systems we saw on the previous screen would fit into a new forensic file system.

An SQL database is used to store directory information for many file systems. The actual data is stored on many storage servers running FTP or SSH. All access to data is read only. All access to data is logged. All data is encrypted in storage and in transit.

FClusterfs

[Diagram: Hosts A and B each run an ftp server and an FClusterfs mount presenting Files 1-3. The remote connection (Ethernet) is slower, the local connection (SATA) is faster, and a RAM connection is even faster (bus speed).]

fclusterfs --mysql_user=me --mysql_password=mypassword --mysql_host=25.63.133.244 --mysql_database=fclusterfs --volume=74a8f0f627cc0dc6 --audituser='Investigator Name' /home/user/Desktop/fsmount

Another important feature of FClusterfs should be that, like Hadoop, processing should normally take place on local data, which is fastest to access.

A local SSD on SATA3 is faster than Ethernet; RAM is faster than SATA3.

FClusterfs should be peer to peer, and we should be able to have multiple SQL databases.

FClusterfs MySQL Tables

[Diagram: tables VolumeListing, inodes, serveraccessinfo, metadata and tree]

We start with a couple of tables intended to accommodate the basic data held in a file system. [Click] We extend this to include extra fields just for our needs: the SHA-1 cryptographic hash, the route the data took to come into the cluster, and its replicated storage locations. We also have a table to store the hierarchical tree structure. [Click] We need a table to keep track of the multiple original source file systems. [Click] We store the access information for the data storage servers within the database; we don't want the user to have this information. We want access control limited to the FUSE file system itself. [Click] The metadata for each file is held in a separate table because it is very variable in length.
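A hypothetical sketch of what such an extended inodes table might look like, expressed as DDL issued through the mysql-connector-python driver. Every column beyond the standard file attributes is an illustrative assumption based on the fields just described, not the authors' actual schema; the connection details are the ones shown on the mount command later in the talk.

```python
# Illustrative (assumed) schema for an "inodes" table that carries both the
# usual file system attributes and the extra forensic fields described above.
import mysql.connector

ddl = """
CREATE TABLE IF NOT EXISTS inodes (
    inode             BIGINT PRIMARY KEY,
    volume_id         CHAR(16),       -- which original source file system
    name              VARCHAR(255),
    st_mode           INT,
    st_uid            INT,
    st_gid            INT,
    st_size           BIGINT,
    st_mtime          DATETIME,
    -- forensic extensions (assumed names)
    sha1              CHAR(40),       -- cryptographic hash of the file data
    acquisition_route VARCHAR(255),   -- how the data entered the cluster
    storage_node_1    VARCHAR(64),    -- primary storage server
    storage_node_2    VARCHAR(64),    -- first replica
    storage_node_3    VARCHAR(64),    -- second replica
    state             VARCHAR(32)     -- e.g. allocated / moved / unpacked
)
"""

conn = mysql.connector.connect(host="25.63.133.244", user="me",
                               password="mypassword", database="fclusterfs")
conn.cursor().execute(ddl)
conn.commit()
```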

Our Submission Information Package (SIP/DEB)

Header Section

Nick Pringle
A Villainous Crime
12/May/2013 14:25:23
This is a small 1GB memory stick taken from the desk of the suspect
Friday, November 29 2013. 13:42:52 GMT
Friday, November 29 2013. 13:42:52 GMT
74a8f0f627cc0dc6
My Label
/mhash/lib/keygen_s2k.c

Dumping attribute $STANDARD_INFORMATION (0x10) from mft record 150 (0x96)
Resident: Yes
Attribute flags: 0x0000
0x00 ARCHIVE (0x00000020)
Dumping attribute $FILE_NAME (0x30) from mft record 150 (0x96)
Resident: Yes
Resident flags: 0x01
Parent directory: 136 (0x88)
File Creation Time: Sat Jul 20 18:25:53 2013 UTC
File Altered Time: Sat Jul 20 18:25:53 2013 UTC
MFT Changed Time: Sat Jul 20 18:25:53 2013 UTC
Last Accessed Time: Sat Jul 20 18:25:53 2013 UTC
Dumping attribute $DATA (0x80) from mft record 150 (0x96)
Resident: No
Attribute flags: 0x0000
Attribute instance: 2 (0x2)
Compression unit: 0 (0x0)
Actual Data size: 6066 (0x17b2)
Allocated size: 8192 (0x2000): 6066 (0x17b2)
111242416A8724ACDB2135FE66EB7BE554CCF16091FBC26641242417D7A6B1A3F17E33A1F15BF8B815EC4B13410EFED3FC0198EF2F7782EF9EA8568853E6E3A48B86256D

Data Section

begin-base64 777 FC0198EF2F7782EF9EA8568853E6E3A48B86256D.cptDlevh4eFxd761tZ1zaPShNPDvGkB1FZn8UJiMY3zLCOAWKyj5CiPQSQOEGdUKzhQCN3oG0Xh27lSyydHHwA7cCSeRS012Sv74NF16GixZ4f8qx7fMwtV73LdW9K53EwHUGnbHUw6WEOm0wh9ch8QvJcPcPvW3oIdQAA0HEBaB45I3XOaAr95Yq37pBkMblDlC+/fu5ueFt6voIcPM9tD53GrO0G0T/6wAaPAqNEDWcCZTztjbRH+FELEM9rxZidX8/glPd/UBXbgZ/IjSIsknIsZG+KMZhJg1AWxmniKj633A0qeD/Fny9gj1i7f2RhCWrd78y2fXKt4YA/nM4osibDh1o9QsiGTjtrkdFM4fy4rHA6w98UdIwvROiH+roMKx0twdiDgy+zlvgvSohF9PKMn5Nq7Y4KLw19kp53JixBHilkoKefebVTybKNxNMh6c4QiNZucKQqRQWvVlYMgwqVbzqWiJQPM5Mzhks7gDqZCx5s5QIl99w9fczGwurXn9yMjnNzGurFG32fo8ve/hoEAgsO6slJ3/suViTtD+L97BrPqrsnkSv/gOr3aldEfstRgiA0A/v7ApAP6zDOe0TXDHHZ3OkRfopu4HAy+k234k6HQRkvveoS2T53Jz6HrCSplAh2xqpMjRiTl5PF+EpiHiyy3w8zX5oAgNMdkm/Nwv+CwESi8JnAbaCkcOEbiusNfjtxsF/SnaDPqCzX2ezaKu9ElvLcqYDJA2ycQFw4MXy3Vr4gXNdg456Ael7nJbtfARZFrchg8/bhN5itxLOda8/BjMlsA9zE9cXAPUM3W5bANniu75AXkbrI6yQDpsO5Kdf0Y

Although we see no reason why we couldn't use existing formats like AFF, for simplicity in our prototype we designed our own simple Submission Information Package or Digital Evidence Bag structure, consisting of a header and data sections.

It's the header that goes into the metadata table in SQL.
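Purely to illustrate that header-plus-data layout, the sketch below writes a toy SIP: a small plain-text header followed by a base64-encoded copy of the file data, with the bag named after the data's SHA-1. The field names, layout and '.sip' suffix are assumptions of this sketch, not the prototype's actual format; the example values are taken from the slide above.

```python
# Toy Submission Information Package writer: a header section followed by a
# base64-encoded data section, with the bag named volume-SHA1.sip.
import base64
import hashlib
import os
from datetime import datetime, timezone

def write_sip(source_path, investigator, case_name, volume_id, out_dir):
    with open(source_path, "rb") as f:
        data = f.read()
    sha1 = hashlib.sha1(data).hexdigest()

    header = "\n".join([
        f"Investigator: {investigator}",
        f"Case: {case_name}",
        f"Created: {datetime.now(timezone.utc).isoformat()}",
        f"VolumeID: {volume_id}",
        f"OriginalPath: {source_path}",
        f"Size: {len(data)}",
        f"SHA1: {sha1}",
    ])

    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, f"{volume_id}-{sha1}.sip")
    with open(out_path, "w") as bag:
        bag.write("Header Section\n" + header + "\n")
        bag.write("Data Section\n")
        bag.write(base64.b64encode(data).decode("ascii") + "\n")
    return out_path

print(write_sip("/mhash/lib/keygen_s2k.c", "Nick Pringle",
                "A Villainous Crime", "74a8f0f627cc0dc6", "/tmp/staging"))
```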

FClusterfs MySQL Tables

[Diagram: tables VolumeListing, inodes, audit, serveraccessinfo, WorkflowTasks, metadata, tree and nodestate, with fields such as Password varchar(45)]

We have a table to store Audit information.

[Click] We have a table that stores the automatic workflow actions that will be initiated when a Digital Evidence Bag arrives containing a specific type of file. It could, for example, routinely extract EXIF information when a JPG arrives.

[Click] Finally, we have a table, periodically updated with information about the cluster status, used for load balancing.


FCluster Architecture - Roles and Zones

Acquisition Authority
Imager
FClusterfs metadata storage
Metadata Import/Load Balancer
Replicator
Data Storage Server
Processing Server
Results Meta Data table


The FClusterfs file system is the core of the FCluster system.

FCluster is divided into 4 zones.

Rather like a security perimeter, the progress of data through the zones is dependent on the successful completion of the previous zone.

If a Package is simply copied onto a storage node it will be ignored as it will not have a verifiable chain of evidence through the FClusterfs database.

Assurance Zones Acquisition - Overview

The cluster issues an authority to image. This includes a one-time-use key to be used to encrypt the evidence.
The imaging device creates the image, a SIP/DEB of the file directory and SIP/DEBs of the file data, which are encrypted using the one-time-use key.
SIP/DEBs are pushed/pulled to the cluster.

The first zone is Acquisition.

In FCluster, Imaging is considered to be part of the system, not an extension.

Assurance is established here by an authority / action / completion loop.

Digital evidence bags are always encrypted with a key that must have been created and issued by the cluster.
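A minimal sketch of that authority step, with assumed table and column names: the cluster generates one-time keys with an expiry date, records them against the imaging device, and writes a key file to hand to that device.

```python
# Sketch of issuing acquisition authority: generate one-time encryption keys
# with an expiry date, record them in the cluster database, and write a key
# file for the imaging device. Table and column names are assumptions.
import secrets
from datetime import date

import mysql.connector

def issue_keys(device, count, expires, keyfile):
    conn = mysql.connector.connect(host="25.63.133.244", user="me",
                                   password="mypassword",
                                   database="fclusterfs")
    cur = conn.cursor()
    keys = []
    for _ in range(count):
        key = secrets.token_hex(32)            # 256-bit one-time key
        keys.append(key)
        cur.execute(
            "INSERT INTO volume_authority (device, created, expires, onetime_key) "
            "VALUES (%s, CURDATE(), %s, %s)",
            (device, expires, key))
    conn.commit()
    with open(keyfile, "w") as f:
        f.write("\n".join(keys) + "\n")        # handed to the imaging device

issue_keys("imager-01", 5, date(2014, 1, 31), "/tmp/imager-01.keys")
```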

Assurance Zones Acquisition Detail 1 of 6

Imaging is controlled by records stored in the VolumeID table.

Here, at the start of imaging, we can see 11 existing records.

Assurance Zones Acquisition Detail 2 of 6

Acquisition authority is granted by creating keys using a simple dialog box.

We connect to the MySQL database

Select the device to authorise

Enter the number of keys to generate

Set the expiry date

And press Generate


Assurance Zones Acquisition Detail 3 of 6

We can see that 5 extra records have been added to the database.

The device name, creation date, expiry date and the key for the cryptography can be seen.

A corresponding file containing the keys has also been created and would be sent to the imaging device.

Assurance Zones Acquisition Detail 4 of 6

This is the acquisition program dialog screen on the imaging device.

We can set various parameters as appropriate but, importantly, we need to select the key number to use for encryption.

Assurance Zones Acquisition Detail 5 of 6

The acquisition program creates selective Submission Information Packages (SIPs).

In this case only 9 out of 124 files are chosen to be stored as SIPs.

Assurance Zones Acquisition Detail 6 of 6

The resulting SIPs are stored in a hierarchical structure because of the potentially large number of them.

Each one is uniquely named using the file system number and the SHA-1 of the file.

Assurance Zones Metadata Import/Load Balancing - Overview

The file directory SIP is imported by decrypting it with the one-time key generated for the acquisition.
The VolumeID table is updated, but only if the directory SIP import was successful.
Storage locations are chosen and the inodes table is updated accordingly.
Each data SIP file is opened and the load balancer reads the header and chooses primary storage based on the ability of the node to deal with this SIP type.
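A minimal sketch of that last decision, with an invented node-state structure: the balancer reads the SIP type from the bag's header and picks the least-loaded node that advertises support for that type.

```python
# Sketch: choose a primary storage node for a SIP based on its type and the
# current node-state information. The node_state structure and the notion of
# "supported types" are illustrative assumptions of this sketch.
node_state = [
    {"host": "node01", "load": 0.20, "supports": {"jpg", "doc", "generic"}},
    {"host": "node02", "load": 0.75, "supports": {"generic"}},
    {"host": "node03", "load": 0.40, "supports": {"jpg", "generic"}},
]

def choose_primary(sip_type):
    candidates = [n for n in node_state if sip_type in n["supports"]]
    if not candidates:
        candidates = [n for n in node_state if "generic" in n["supports"]]
    return min(candidates, key=lambda n: n["load"])["host"]

print(choose_primary("jpg"))   # -> node01: supports jpg and is least loaded
```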


The second zone is Metadata Import / Load Balancing.

The file system directory metadata is imported from the file that was created during imaging.

The corresponding record in the VolumeID table is updated and records are added to the inodes table, one for each individual file on the original media.

Storage locations are chosen based upon the current cluster status information stored in the node-status table.

The inodes table is updated with this information.

Assurance Zones Metadata Import Detail 1 of 6

This is the Metadata Import screen.

On the right is a dynamic list of the nodes in the cluster, showing which are active and an indication of their current loading.

On the left we have the outline of the host file system where we have placed the evidence bags, staged ready to be imported.

Assurance Zones Metadata Import Detail 2 of 6

First we must import the directory structure SIP.

This is used to create a skeleton directory structure, ready and waiting for the actual data to arrive.

Assurance Zones Metadata Import Detail 3 of 6

Here, we can see that 9 SIPs have been identified in the staging folder and have already been allocated to the storage/processing nodes.

Assurance Zones Metadata Import Detail 4 of 6

The dynamic status shows that the movedata program, running in the background, has yet to actually move the data to its primary destination.

Assurance Zones Metadata Import Detail 5 of 6

And finally the movefile program initiates and succeeds in the transfer.


Assurance Zones Metadata Import Detail 6 of 6

This is the inodes table after metadata import and load balancing.

You can see the first fields, up to the file size, contain data you'd expect in a typical file system. Beyond that are extra fields we identified as being of interest in forensics.

They include the SHA-1 hash, the original source location and the locations of the storage servers.

You should also see that, whereas there is a record for every file in the original file system, only those records where data has been imported have the additional information.

Assurance Zones Distribution - Overview

Each SIP/DEB is read and, only if it is expected (i.e. found in the inodes table), it is copied to the location recorded in the inodes table.
The SIP is unpacked, decrypted and the header data added to the meta_data table.
The inodes table is updated with the storage data status.
In due course, the SIP/DEB will be replicated to two other locations and the inodes table updated accordingly.

In the third zone, the Distribution zone, the movefile and replicate programs constantly run in the background, replicating data to its first, second and third storage locations.

The inodes table is updated after every replication.

On the first storage server for a file, the unpack program reads data from the inodes table and tries to find the corresponding SIP.

It extracts the metadata and inserts it into the meta_data table. It then extracts the actual data, decrypts it and saves it in local storage.

Finally, it updates the inodes table with the latest status change.
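A sketch of the verification side of that unpack step, with an in-memory stand-in for the inodes table and a placeholder decrypt function: a bag is accepted only if its volume/SHA-1 name is expected, and the decrypted data must hash back to the recorded SHA-1 before it is stored.

```python
# Sketch of accepting an arriving SIP only if it is expected: its name must
# match a record in the inodes table, and the decrypted data must re-hash to
# the SHA-1 recorded at acquisition. The in-memory set and identity decrypt()
# stand in for the real database and cipher.
import hashlib

inodes = {("74a8f0f627cc0dc6", hashlib.sha1(b"example data").hexdigest())}

def decrypt(blob, key):
    return blob          # placeholder for decryption with the one-time key

def accept_sip(bag_name, blob, key):
    volume_id, sha1 = bag_name.rsplit(".", 1)[0].split("-", 1)
    if (volume_id, sha1) not in inodes:
        return None                       # unexpected bag: ignore it
    data = decrypt(blob, key)
    if hashlib.sha1(data).hexdigest() != sha1:
        raise ValueError("decrypted data does not match recorded SHA-1")
    return data                           # safe to unpack, store and process

bag = "74a8f0f627cc0dc6-" + hashlib.sha1(b"example data").hexdigest() + ".sip"
print(accept_sip(bag, b"example data", key=None) is not None)   # True
```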

Assurance Zones Distribution Detail 1 of 2

Here we can see the complete record for one SIP.

Assurance Zones Distribution Detail 2 of 2

For each file imported into the cluster, the forensic metadata extracted from the SIP is added to the meta_data table.

Assurance Zones Processing - Overview

Using the processing table, a standard set of tasks is run on the data stored locally on the host.
Results are usually recorded as XML-formatted data in the results table within the same database, referenced by inode number.
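As a rough sketch of such a standard task set (the tool names and the store_result stand-in are assumptions of this example, not the prototype's task list), a dispatch table maps the type of a newly arrived file to the programs run automatically against it:

```python
# Sketch of the automatic workflow: a dispatch table maps the type of a newly
# arrived file to the standard tasks run against it; each task's output would
# then be written to the results table keyed by inode number.
import subprocess

WORKFLOW = {
    "jpg": [["exiftool", "-X"]],          # EXIF extraction, XML output
    "doc": [["strings"]],
    "*":   [["file", "--brief"]],         # fallback for any other type
}

def store_result(inode, tool, output):
    # stand-in for the SQL insert into the results/meta_data table
    print(f"inode {inode}: {tool} produced {len(output)} bytes of results")

def run_workflow(inode, path, file_type):
    for command in WORKFLOW.get(file_type, WORKFLOW["*"]):
        completed = subprocess.run(command + [path],
                                   capture_output=True, check=False)
        store_result(inode, command[0], completed.stdout)

run_workflow(150, "/data/local/photo.jpg", "jpg")
```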

The final zone is Processing.

The workflow program is constantly scanning for new data arriving on a storage node.

As soon as it arrives, it triggers a standard set of programs to run and the results are inserted into the meta_data table.

Assurance Zones Processing Detail 1 of 1

Here we see that, after the arrival of a JPG file, an EXIF extraction program was run and the results stored in the corresponding record in the meta_data table.

Audit

Throughout all of this, every time a file is opened a record is made in the audit table.
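A minimal sketch of that audit step, with assumed table and column names: each file access writes one row recording who touched which inode, what they did and when.

```python
# Sketch of audit logging on every file access: before data is returned, a
# row recording who touched which file, when, and how is written to the
# audit table. Table and column names are illustrative assumptions.
from datetime import datetime, timezone

import mysql.connector

conn = mysql.connector.connect(host="25.63.133.244", user="me",
                               password="mypassword", database="fclusterfs")

def audit(user, inode, action):
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO audit (who, inode, action, at) VALUES (%s, %s, %s, %s)",
        (user, inode, action, datetime.now(timezone.utc)))
    conn.commit()

audit("Investigator Name", 150, "read")   # e.g. called from the read() path
```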

So each move, unpack and read is recorded.

Mounting the file system

1. Connect to a database
2. Select a file system from multiple available
3. Choose a mount point
4. Mount

The mounted FClusterfs file system is indistinguishable from a normal file system

Entire file systems, and multiple file systems, can be mounted using the mount command.

This simple dialog box allows a user to connect, select and mount a file-system.

Mounted file systems are indistinguishable from a normal file system.

Latency, Multi-threading and Parallel Processing

Linear Processing vs Multi-Threading/Parallel/Distributed Processing

[Diagram: bars showing the task setup, processing and task closure stages for linear versus multi-threaded/parallel/distributed processing]
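To put rough numbers on that picture (all figures below are assumptions chosen purely for illustration), a quick comparison of a linear run against a distributed run that pays extra setup and closure costs:

```python
# Back-of-the-envelope comparison of linear vs distributed processing times.
# All figures are assumed, chosen only to illustrate the trade-off: the
# setup/closure overhead grows, but the processing section shrinks with the
# number of hosts working concurrently.
def linear_time(setup, processing, closure):
    return setup + processing + closure

def distributed_time(setup, processing, closure, hosts, overhead_per_host):
    extra = hosts * overhead_per_host          # task distribution costs
    return setup + extra + processing / hosts + closure + extra

one_host = linear_time(setup=1, processing=600, closure=1)
ten_hosts = distributed_time(setup=1, processing=600, closure=1,
                             hosts=10, overhead_per_host=2)
print(one_host, ten_hosts)   # 602 vs 102: concurrency wins despite overhead
```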

It may seem that all this involves a great deal of additional processing, but this illustration, hopefully, explains why this is acceptable.

Overall, despite having much longer times to set up and close the tasks, the concurrency of the processing section yields an overall advantage.

Why is this the right approach?

This could be achieved within an application program, but each package would have to implement it and gain approval
Working at file system level, the efficacy is global
Interaction with FClusterfs is unavoidable
FClusterfs controls data access and maintains Assurance

We think this is a good approach because it acts as near to the operating system level as is possible without us writing our own specialist operating system.

It is not possible for an application program to bypass the FClusterfs file system when gaining access to the data.

We have control.

And we've done it without changing the application programs.

In Summary

Distributed processing is a prime candidate to reduce the backlog, but there are problems
We lose the image, one of the foundations that has evolved in digital forensics over the last 20 years
We can replace it by learning from, not adopting, Hadoop

So what's the take-home message?

We are sure that Distributed Processing can help reduce the processing backlog

and that a major problem, namely that we lose The Image and the integrity we need, can be solved.

Funded by


This research was funded by the European Social Fund

and administered by the Knowledge Economy Skills Scholarships.

Information Assurance in a Distributed Forensic Cluster

Questions?