Saturday, May 10, 2014

The Master File Table - Part 2


In the $MFT - Part 1 post we covered the most basic information about what the $MFT is and how we can view it in hex.  In this post I am going to discuss the basic architecture of the $MFT and the individual entries that it contains.  The first thing we need to know is that each entry in the $MFT is 1024 bytes long, and we will be viewing those bytes in hex.  If you are unfamiliar with hex I have written a blog post about it here.  We are going to be using the ghex program that we installed in Part 1 and the mft.raw file that was also created in Part 1.

$ ghex mft.raw

Yeah, that's the easy part.  We are going to use the first 1024 byte entry to go over the basics and then we are going to move to a different entry to cover some of the more complicated aspects of parsing the $MFT.  First let's cover some of the information about ghex and some of the terminology we'll need to continue.


I have placed a red box and number next to some important components of this application.  Number 1 shows the offset or location (in bytes) that is currently selected.  It also shows how many bytes and which bytes we have selected.

As you can see I have selected the entire first entry of the $MFT.  The entry is 1024 bytes and starts at offset 0 (just about everything digital starts at 0, not 1, so get used to it).  Number 2 is important because it shows the interpretation of the hex that we have selected (which is little-endian).  Big-endian was used by the old PowerPC G5 processors in Apple machines so we don't see much of that any more, but be aware that certain architectures may need to be interpreted as big-endian.  The names describe which end of a multi-byte number is stored first: little-endian stores the least significant byte first and big-endian stores the most significant byte first.  When interpreting multiple bytes stored in little-endian with ghex you will select the rightmost byte first then select the ones to the left to get the correct number displayed in the "Signed" bit boxes in the bottom left hand corner of the software.
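To make the byte order idea concrete, here is a minimal Python sketch.  Python is only my choice for illustration and is not part of ghex.  The four bytes are the way the number 1024 (the size of one entry) would appear from left to right in a hex editor if it were stored as a 4 byte little-endian value.

```python
# The same four bytes interpreted in both byte orders.
raw = bytes([0x00, 0x04, 0x00, 0x00])  # left to right, as a hex editor shows them

little = int.from_bytes(raw, byteorder="little")  # least significant byte first
big = int.from_bytes(raw, byteorder="big")        # most significant byte first

print(little)  # 1024
print(big)     # 262144
```

Reading the bytes in the wrong order gives a very different number, which is why the byte order setting matters when interpreting multi-byte values.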

One thing you have probably already noticed about the previous screenshot is that the second $MFT entry starts with the same FILE that we see at the beginning of the entry we are working with.  All $MFT entries start with the FILE identifier.  This makes it easy to identify the beginning of an entry.  We call this the "FILE" identifier and it is always the first four bytes (offset 0-3) of the entry and is the start of this $MFT entry's header.
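As a rough sketch of how the FILE identifier can be used to find entries, the Python below steps through a buffer 1024 bytes at a time and counts the entries that begin with FILE.  The function name count_file_entries and the synthetic sample buffer are made up for illustration.  The sample stands in for mft.raw so the example runs without the image.

```python
ENTRY_SIZE = 1024  # entry size used throughout this post

def count_file_entries(data: bytes, entry_size: int = ENTRY_SIZE) -> int:
    """Count entries whose first four bytes are the ASCII string FILE."""
    count = 0
    for offset in range(0, len(data), entry_size):
        if data[offset:offset + 4] == b"FILE":
            count += 1
    return count

# Synthetic stand-in for mft.raw: two entries that start with FILE
# followed by one zero-filled entry.
sample = (b"FILE" + bytes(1020)) * 2 + bytes(1024)
print(count_file_entries(sample))  # 2
```

To try it on the real file, the bytes could be read with open("mft.raw", "rb").read() and passed to the same function.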



The following two bytes (offset 4 and 5) are the offset to the Fix-Up Code.  I am not going to spend any time on this at the moment, other than to note that in this case the 2 byte fix-up code starts at offset 48.



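Offset 6 and 7 are the Size of the Fix-Up Code.  This value is counted in 2 byte entries rather than in single bytes.  In this case the value is 3: the 2 byte fix-up code itself plus one 2 byte replacement value for each of the two 512 byte sectors that make up the entry.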

Offset 8-15 are the $LogFile Sequence Number.  This is directly related to the $LogFile on NTFS volumes.  I am not going to discuss this further in this post but you can expect a future post about this.

Offset 16 and 17 are the Sequence Number.  Each time this entry location in the $MFT is reused for a new file or directory the Sequence Number will increase by one.  For entries 1 - 16 (these files are all created when an NTFS volume is formatted) the Sequence Number will always match the entry location (1-16).  The $MFT entry (0) will always display 1.

Offset 18 and 19 are the Hard Link Count.  Hard links are created at the file system level on NTFS volumes and are different from the .lnk shortcut files we often see.  There may be a future post discussing this as well.

Offset 20 and 21 are the Offset to the Start of Attributes.  We will be looking at attributes in the next post on the $MFT.  This will always be h/30 /00 or h/38 /00.  When interpreted as little-endian these numbers will be either 48 or 56, respectively.

Offset 22 and 23 are the Allocation Flag.  These bytes tell us if the file or directory is still allocated or if it has been deleted.
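Here is a small Python sketch of how these two bytes could be decoded.  The bit values are not given in this post.  They are the values commonly documented for NTFS (0x0001 for an entry that is in use and 0x0002 for a directory), so treat them as an assumption.  The function name describe_flags is made up for illustration.

```python
IN_USE = 0x0001     # assumed: set when the entry is allocated
DIRECTORY = 0x0002  # assumed: set when the entry describes a directory

def describe_flags(raw: bytes) -> str:
    """Describe the two flag bytes found at offsets 22 and 23."""
    value = int.from_bytes(raw, "little")
    kind = "directory" if value & DIRECTORY else "file"
    state = "allocated" if value & IN_USE else "deleted"
    return f"{state} {kind}"

print(describe_flags(bytes([0x01, 0x00])))  # allocated file
print(describe_flags(bytes([0x02, 0x00])))  # deleted directory
```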


Offset 24-27 is the $MFT Actual Record Size.  This shows how many of the 1024 bytes are actually being used by the entry.  In this case it is 416 bytes.  Remember to select the hex from right to left because this data is stored in little-endian.


Offset 28-31 is the $MFT Physical Record Size.  This will always be 1024 bytes because that is always the size allocated for each entry. (NOTE: This 1024 byte number is consistent for NT 4 and above.  It is not likely that you will see anything older than NT 4 unless you go looking for it).

Offset 32-39 is the $MFT Base Reference.  I'm not going to discuss this today.

Offset 40 and 41 are the Next Attribute Number.  This indicates the number that will be assigned to the next attribute added to this entry.  It gives us the total number of attributes for this entry, plus 1.  In this case the next attribute added will be number 5, meaning that attributes 1, 2, 3, and 4 are already being used.
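The header fields walked through above can be pulled out in one step with Python's struct module.  This is only a sketch: the field names are my own labels, and the sample entry is built by hand from the values mentioned in this post (fix-up offset 48, fix-up size 3, sequence number 1, actual size 416, physical size 1024, next attribute number 5).  The $LogFile sequence number, hard link count, attribute offset, flags, and base reference in the sample are placeholders, not values read from mft.raw.

```python
import struct
from collections import namedtuple

# "<" means little-endian with no padding; the fields cover offsets 0-41.
HEADER_FORMAT = "<4sHHQHHHHIIQH"

MftHeader = namedtuple("MftHeader", [
    "signature",          # offset 0-3, always FILE
    "fixup_offset",       # offset 4-5
    "fixup_count",        # offset 6-7
    "lsn",                # offset 8-15
    "sequence_number",    # offset 16-17
    "link_count",         # offset 18-19
    "attrs_offset",       # offset 20-21
    "flags",              # offset 22-23
    "used_size",          # offset 24-27
    "allocated_size",     # offset 28-31
    "base_reference",     # offset 32-39
    "next_attribute_id",  # offset 40-41
])

def parse_header(entry: bytes) -> MftHeader:
    """Unpack the first 42 bytes of a 1024 byte entry."""
    size = struct.calcsize(HEADER_FORMAT)
    return MftHeader(*struct.unpack(HEADER_FORMAT, entry[:size]))

# Hand-built sample entry; several values are placeholders (see above).
sample = struct.pack(HEADER_FORMAT, b"FILE", 48, 3, 0, 1, 1, 56, 1, 416, 1024, 0, 5)
sample = sample.ljust(1024, b"\x00")

header = parse_header(sample)
print(header.signature, header.used_size, header.allocated_size)
```

A real entry could be parsed the same way by reading 1024 bytes from mft.raw in binary mode and passing them to parse_header.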

This covers the basics of the $MFT Record Header.  In the next post I will cover some of the $MFT attributes.

Thursday, May 8, 2014

Creating a Timeline with The Sleuth Kit


The Sleuth Kit is a very powerful set of tools.  When performing a forensic exam, dates and times are often the only way to prove who was behind the keyboard or to isolate the events leading up to a breach and attack.  One thing that the Sleuth Kit provides us is the ability to create a timeline that we can review as a spreadsheet.  We are going to be using the same E01 (nps-2008-jean) file that we have been using in the other posts.  There are a few steps to making timelines with the Sleuth Kit.  The first is the ils command.

$ ils -em -o63 nps-2008-jean.E01 | tee nps-2008-jean.bodyfile

The ils command lists inode information.  With the -em options we are asking the machine to display all inodes in the partition (-e) and to display them in the format that the mactime tool reads (-m).  The -o63 option points the command at the partition that starts at sector 63.  Finally, we used the Linux | to push the output to another command, in this case the tee command.  This command prints the output to the screen and also writes it to a new file (which we have called nps-2008-jean.bodyfile).  The contents of this new file look something like what you see below, but much, much larger:

md5|file|st_ino|st_ls|st_uid|st_gid|st_size|st_atime|st_mtime|st_ctime|st_crtime
0|<nps-2008-jean.E01-$MFT-alive-0>|0|-/rr-xr-xr-x|0|0|33636352|1210717123|1210717123|1210717123|1210717123
0|<nps-2008-jean.E01-$MFTMirr-alive-1>|1|-/rr-xr-xr-x|0|0|4096|1210717123|1210717123|1210717123|1210717123
0|<nps-2008-jean.E01-$LogFile-alive-2>|2|-/rr-xr-xr-x|0|0|55738368|1210717123|1210717123|1210717123|1210717123
0|<nps-2008-jean.E01-$Volume-alive-3>|3|-/rr-xr-xr-x|48|0|0|1210717123|1210717123|1210717123|1210717123
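To show how one of these lines breaks down, here is a minimal Python sketch that splits a line on the | character and converts the four time fields from Unix epoch seconds to UTC.  The function name parse_body_line is made up for illustration, and the sketch assumes the file name itself does not contain a | character.

```python
from datetime import datetime, timezone

# Field names taken from the header line of the bodyfile shown above.
FIELDS = ["md5", "file", "st_ino", "st_ls", "st_uid", "st_gid", "st_size",
          "st_atime", "st_mtime", "st_ctime", "st_crtime"]

def parse_body_line(line: str) -> dict:
    """Split one bodyfile line and convert its timestamps to UTC datetimes."""
    record = dict(zip(FIELDS, line.rstrip("\n").split("|")))
    for key in ("st_atime", "st_mtime", "st_ctime", "st_crtime"):
        record[key] = datetime.fromtimestamp(int(record[key]), tz=timezone.utc)
    return record

line = ("0|<nps-2008-jean.E01-$MFT-alive-0>|0|-/rr-xr-xr-x|0|0|33636352|"
        "1210717123|1210717123|1210717123|1210717123")
record = parse_body_line(line)
print(record["file"], record["st_mtime"].isoformat())
# <nps-2008-jean.E01-$MFT-alive-0> 2008-05-13T22:18:43+00:00
```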

We aren't going to be working directly with this file so let's continue with the process.  Next we will be using the fls command.  This command lists file information and just like the last command we will be outputting the data from this command to the same bodyfile.

$ fls -o63 -r -m "/" nps-2008-jean.E01 >> nps-2008-jean.bodyfile

We can see a very similar command here.  We are still pointing the command at the correct partition and we are still using the -m option, which this time is followed by "/" so that every path in the output is prefixed with the root directory.  We have also added the -r option to recurse through the directories.  We have used the >> in place of the | to append the output of this command to the same bodyfile that we created with the last command.

If you were to review the file now you would see the new data is very similar to the data generated with the ils command.  During the final step of creating the timeline we will be using the mactime command to interpret the bodyfile and generate a file that can be opened with Excel or similar spreadsheet software.

$ mactime -b nps-2008-jean.bodyfile -d > nps-2008-jean.timeline.csv

In this command we are using the -b option allowing us to point the command at a file (like the bodyfile we created).  The -d option outputs the results of the mactime command in comma delimited format.  We used the > to specify the name and location of the csv.  Why do we need this?  Let's take a look.  I am using OpenOffice to view the file.


We can see that the mactime command has organized the dates and times into a chronological order making it very simple for us to see the relationship of events based on the location in the timeline.  If we look under the "Meta" column we can see the inode number making it easier for us to identify the files we are interested in or even display the contents of the file with the icat command easily.

$ icat -o63 nps-2008-jean.E01 17015
[General]
Display Name=Outlook Forms Redirector
Description=Redirects Exchange/Outlook forms to whichever of the two is running.
Path="frmrdrct.dll"
Entry Point=1
Client Version=4.0
Misc Flags=Disabled;NoUserEdit

[Exchange Client Compatibility]
Exchange Registry=1
Exchange Extension Key="Outlook/Exchange Forms Coexistence"

So it's pretty obvious that if dates and times are a key part of your investigation it may be very helpful to create a timeline.
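For anyone curious about what the mactime step is doing with the bodyfile, here is a rough Python sketch of the idea.  It is not how mactime is actually written.  It simply groups each line's four timestamps, labels them in a "macb" style (modified, accessed, changed, born), and sorts the results chronologically.  The function name timeline_events is made up for illustration.

```python
def timeline_events(body_lines):
    """Turn bodyfile lines into (timestamp, flags, name) tuples, oldest first."""
    events = []
    for line in body_lines:
        fields = line.rstrip("\n").split("|")
        name = fields[1]
        atime, mtime, ctime, crtime = (int(v) for v in fields[7:11])
        # Group the flags that share the same timestamp.
        by_time = {}
        for flag, stamp in (("m", mtime), ("a", atime), ("c", ctime), ("b", crtime)):
            by_time.setdefault(stamp, set()).add(flag)
        for stamp, flags in by_time.items():
            label = "".join(f if f in flags else "." for f in "macb")
            events.append((stamp, label, name))
    return sorted(events)

# Two lines taken from the bodyfile output shown earlier.
lines = [
    "0|<nps-2008-jean.E01-$MFTMirr-alive-1>|1|-/rr-xr-xr-x|0|0|4096|1210717123|1210717123|1210717123|1210717123",
    "0|<nps-2008-jean.E01-$MFT-alive-0>|0|-/rr-xr-xr-x|0|0|33636352|1210717123|1210717123|1210717123|1210717123",
]
events = timeline_events(lines)
for stamp, label, name in events:
    print(stamp, label, name)
```

Because all four timestamps on these two lines are identical, each line collapses into a single event with all four flags set.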

HEX - The Basics


This blog post is just an explanation of Hexadecimal.

Hexadecimal or hex is a common way to view digital data and is directly related to the 1's and 0's that you always hear about when people discuss computer data.  Hex characters include:

0 1 2 3 4 5 6 7 8 9 A B C D E or F

Each hex character represents four bits.  A bit is the smallest form of digital data and is represented by either a 0 or 1.  Think of it like a switch.  It is either off (0) or on (1).  You can think of a hard drive as an incredible number of microscopic switches.  Four bits can be arranged in 16 possible combinations, and each hex character stands for one of them.  A group of four bits is called a nibble.  A group of two nibbles is called a byte.  Below you can see a diagram of how the 0 and 1 combinations can make one of the 16 possible hex characters.  Take some time to look this diagram over to truly understand what is happening.  It's not complicated but if you aren't mathematically oriented (which I'm not) it takes a minute to digest this.


So you can see that (if reading from left to right) the first bit is equivalent to eight.  The second is equivalent to four, the third is equivalent to two, and the fourth is equivalent to one.  The rest is basic arithmetic.  Please take a quick note that this is the usual way of writing the bits, with the largest value on the left.  The order in which whole bytes are stored can differ based on the architecture of the device, but the majority of what we see is interpreted in this manner.
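The arithmetic can be sketched in a few lines of Python.  The function name nibble_to_hex is made up for illustration.  It adds up the weights (8, 4, 2, 1) for every bit that is switched on and looks up the matching hex character.

```python
WEIGHTS = (8, 4, 2, 1)  # value of each bit, reading left to right

def nibble_to_hex(bits: str) -> str:
    """Convert a string of four 0/1 characters to one hex character."""
    total = sum(weight for weight, bit in zip(WEIGHTS, bits) if bit == "1")
    return "0123456789ABCDEF"[total]

print(nibble_to_hex("1010"))  # A  (8 + 2 = 10)
print(nibble_to_hex("1111"))  # F  (8 + 4 + 2 + 1 = 15)
```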

When viewing hex you will usually see the hex displayed in bytes with two nibbles next to one another and a space between each byte (pair of nibbles), like we see in this ghex screenshot:


As the user we usually encounter data with an extra layer of conversion (unless we are viewing the data with a hex editor).  Our interaction with a text file will help us understand this.  Go ahead and create a text file on your desktop (with notepad or gedit or whatever text editor you prefer).  Call it HEXTest.txt and type "Hello World!" in the file, then save it.  Now open that file with your hex editor.


We can see our hex data and to the right we can see our plain text data.  So the byte 48 that is the first byte of our text file is interpreted by the text editor as the capital letter H.  The extra layer of interpretation that we are seeing is called ASCII or the American Standard Code for Information Interchange.  Essentially ASCII maps specific hex bytes to English characters, numbers, and punctuation.  Other software may interpret the hex in a different manner.  Below is a chart showing how hex characters are interpreted with ASCII.

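The mapping between the text and the hex bytes can also be checked with a short Python sketch.  It encodes the same "Hello World!" text as ASCII and prints each byte as two hex characters, the way a hex editor lays them out.

```python
text = "Hello World!"
data = text.encode("ascii")  # one byte per character

# Two hex characters per byte with a space between bytes.
hex_view = " ".join(f"{byte:02x}" for byte in data)
print(hex_view)   # 48 65 6c 6c 6f 20 57 6f 72 6c 64 21
print(chr(0x48))  # H
```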

So what happens when you add data to your text file at the hex level using your hex editor?  Let's find out.


As you can see I've added a few zeros to the hex and also the word FAB.  Now let's open the file again with the text editor.


The text editor initially stated that the file had "unsupported" characters and then displayed the hex plainly.  As you can see, understanding what you are seeing in hex can greatly improve your understanding of the file.

More later.