In the $MFT - Part 1 post we covered the most basic information about what the $MFT is and how we can view it in hex. In this post I am going to discuss the basic architecture of the $MFT and the individual entries that it contains. The first thing we need to know is that each entry in the $MFT is 1024 bytes long, and we will be viewing those bytes in hex. If you are unfamiliar with hex I have written a blog post about it here. We are going to be using the ghex program that we installed in Part 1 and the mft.raw file that was also created in Part 1.
$ ghex mft.raw
Yeah, that's the easy part. We are going to use the first 1024 byte entry to go over the basics and then we are going to move to a different entry to cover some of the more complicated aspects of parsing the $MFT. First let's cover some of the information about ghex and some of the terminology we'll need to continue.
I have placed a red box and number next to some important components of this application. Number 1 shows the offset or location (in bytes) that is currently selected. It also shows how many bytes and which bytes we have selected.
As you can see I have selected the entire first entry of the $MFT. The entry is 1024 bytes and starts at offset 0 (just about everything digital starts at 0, not 1, so get used to it). Number 2 is important because it shows how the hex we have selected is being interpreted (which is little endian). Big endian was used by the old PowerPC G5 processors in Apple machines so we don't see much of that any more, but be aware that data from certain architectures may need to be interpreted as big endian. I don't know who chose the names or why, but when interpreting multiple bytes stored in little endian with ghex you will select the rightmost byte first and then select the ones to its left to get the correct number displayed in the "Signed" boxes in the bottom left hand corner of the software.
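To make the byte order idea concrete, here is a small Python sketch (Python is my choice for illustration, it is not part of ghex). The two bytes are just sample values; the point is that the same bytes give different numbers depending on which end you read from.

```python
# Two bytes exactly as they appear, left to right, in the hex editor.
raw = bytes([0x30, 0x00])

# Little endian: the rightmost byte is the most significant one.
little = int.from_bytes(raw, byteorder="little")

# Big endian: the leftmost byte is the most significant one.
big = int.from_bytes(raw, byteorder="big")

print(little)  # 48
print(big)     # 12288
```

Reading the bytes right to left (00 then 30) gives hex 0030, which is 48. Reading them left to right gives hex 3000, which is 12288.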
One thing you have probably already noticed about the previous screenshot is that the second $MFT entry starts with the same FILE that we see at the beginning of the entry we are working with. All $MFT entries start with FILE, which makes it easy to identify the beginning of an entry. We call this the "FILE" identifier. It is always the first four bytes (offset 0-3) of the entry and it is the start of this $MFT entry's header.
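If you would rather check this with a script than by eye, a rough Python sketch could look like the one below. The function names are made up for this example, and the demo data is three fake 1024 byte entries built in memory rather than a real $MFT.

```python
ENTRY_SIZE = 1024     # size of each $MFT entry, as described in this post
SIGNATURE = b"FILE"   # the first four bytes (offset 0-3) of an entry


def iter_entries(data, entry_size=ENTRY_SIZE):
    """Yield (entry_number, entry_bytes) for every complete entry in data."""
    for start in range(0, len(data) - entry_size + 1, entry_size):
        yield start // entry_size, data[start:start + entry_size]


def has_file_signature(entry):
    """Return True if the entry begins with the FILE identifier."""
    return entry[0:4] == SIGNATURE


# Fake data for demonstration only: two entries that start with FILE
# and one that starts with something else.
demo = b"FILE" + bytes(1020) + b"FILE" + bytes(1020) + b"XXXX" + bytes(1020)

results = [(number, has_file_signature(entry)) for number, entry in iter_entries(demo)]
print(results)  # [(0, True), (1, True), (2, False)]
```

To run it against the file from Part 1 you would replace demo with the contents of mft.raw, read with open("mft.raw", "rb").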
The following two bytes (offset 4 and 5) are the offset to the Fix-Up Code. I am not going to spend any time on this at the moment, other than to say that in this case the 2 byte fix-up code starts at offset 48.
Offset 6 and 7 are the Size of Fix-Up Code. In this case the value is 3. Note that this value counts 2 byte entries rather than single bytes, so a value of 3 covers 6 bytes.
Offset 8-15 are the $Logfile Sequence Number. This is directly related to the $Logfile on NTFS volumes. I am not going to discuss this further in this post but you can expect a future post about this.
Offset 16 and 17 are the Sequence Number. Each time a new file or directory reuses this entry location in the $MFT the Sequence Number will increase by one. For entries 1 - 16 (these files are all created when an NTFS volume is formatted) the Sequence Number will always match the entry location (1-16). The $MFT entry itself (entry 0) will always display 1.
Offset 18 and 19 are the Hard Link Count. Hard links are specific to NTFS volumes and are different than the .lnk files we often see. There may be a future post discussing this as well.
Offset 20 and 21 are the Offset to the Start of Attributes. We will be looking at attributes in the next post on the $MFT. This will always be hex 30 00 or hex 38 00. When interpreted as little endian these numbers will be either 48 or 56, respectively.
Offset 22 and 23 are the Allocation Flag. These bytes tell us if the file or directory is still allocated or if it has been deleted.
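Here is a rough Python sketch of how those two bytes could be read. The flag values are not covered in this post: 0x0001 for "in use" and 0x0002 for "directory" are the values commonly documented for NTFS, so treat them as an assumption here. The function name and the sample entries are made up for the example.

```python
import struct

FLAG_IN_USE = 0x0001     # assumption: set while the entry is still allocated
FLAG_DIRECTORY = 0x0002  # assumption: set when the entry describes a directory


def describe_flags(entry):
    """Read the 2 byte little endian Allocation Flag at offset 22."""
    (flags,) = struct.unpack_from("<H", entry, 22)
    state = "allocated" if flags & FLAG_IN_USE else "deleted"
    kind = "directory" if flags & FLAG_DIRECTORY else "file"
    return f"{state} {kind}"


# Fake 1024 byte entries with only the flag bytes filled in.
allocated_file = bytearray(1024)
allocated_file[22:24] = b"\x01\x00"

deleted_directory = bytearray(1024)
deleted_directory[22:24] = b"\x02\x00"

print(describe_flags(allocated_file))     # allocated file
print(describe_flags(deleted_directory))  # deleted directory
```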
Offset 24-27 is the $MFT Actual Record Size. This shows how many of the 1024 bytes are actually being used by the entry. In this case it is 416 bytes. Remember to select the hex from right to left because this data is stored in little endian.
Offset 28-31 is the $MFT Physical Record Size. This will always be 1024 bytes because that is always the size allocated for each entry. (NOTE: This 1024 byte number is consistent for NT 4 and above. It is not likely that you will see anything older than NT 4 unless you go looking for it).
Offset 32-39 is the $MFT Base Reference. I'm not going to discuss this today.
Offset 40 and 41 is the Next Attribute Number. This indicates what number the next attribute will be given if any new attributes are added. It gives us the total number of attributes for this entry, plus 1. In this case the next attribute added will be number 5, meaning that attributes 1, 2, 3, and 4 are already being used.
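Putting the offsets above together, the whole 42 byte header can be unpacked in one go. This is only a sketch using Python's struct module: the field names are my own, and the sample entry is built in memory from the values mentioned in this post (fix-up offset 48, size 3, actual size 416, physical size 1024, next attribute 5). The values the post does not give (the $Logfile Sequence Number, hard link count, attribute offset, flag and base reference) are placeholders.

```python
import struct
from collections import namedtuple

# "<" = little endian, no padding. 4s = 4 bytes, H = 2 bytes,
# I = 4 bytes, Q = 8 bytes. Field order follows offsets 0 through 41.
HEADER_FORMAT = "<4sHHQHHHHIIQH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 42 bytes

MftHeader = namedtuple("MftHeader", [
    "signature",                # offset 0-3
    "fixup_offset",             # offset 4-5
    "fixup_size",               # offset 6-7
    "logfile_sequence_number",  # offset 8-15
    "sequence_number",          # offset 16-17
    "hard_link_count",          # offset 18-19
    "attributes_offset",        # offset 20-21
    "allocation_flag",          # offset 22-23
    "actual_size",              # offset 24-27
    "physical_size",            # offset 28-31
    "base_reference",           # offset 32-39
    "next_attribute_number",    # offset 40-41
])


def parse_header(entry):
    """Unpack the first 42 bytes of a $MFT entry into named fields."""
    return MftHeader._make(struct.unpack_from(HEADER_FORMAT, entry, 0))


# Fake entry for demonstration; placeholder values are 0, 1, 1, 56, 1 and 0.
sample = struct.pack(HEADER_FORMAT, b"FILE", 48, 3, 0, 1, 1, 56, 1,
                     416, 1024, 0, 5).ljust(1024, b"\x00")

header = parse_header(sample)
print(header.signature)              # b'FILE'
print(header.actual_size)            # 416
print(header.next_attribute_number)  # 5
```

With the real file you would pass the first 1024 bytes of mft.raw to parse_header instead of sample.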
This covers the basics of the $MFT Record Header. In the next post I will cover some of the $MFT attributes.