Sunday, July 27, 2014

Physical Disks and Logical Volumes

At the beginning of last week I took a leap into the world of open source forensics at a new level.  My goal is to complete a full case using only open source tools.  As a result, expect more posts.

The purpose of this post is to give a basic overview of how physical disks relate to logical volumes and file systems, specifically the New Technology File System (NTFS).  Let's start with physical disks, by which I mean hard drives.


Hard drives have historically been magnetic disks, or platters, that spin.  More recently we have seen flash memory/solid state drives that store data on microchips.  Both kinds of drive typically store data in 512-byte sectors.  These sectors are the smallest physical storage unit on the drive.  Sectors are tracked with a factory-set scheme controlled by the hard drive's circuit board.

Ideally, all of a file's data would be stored contiguously, or in a linear fashion.  However, this is impractical because files are continuously being moved around, added, or deleted.  Another consideration is that files are NOT perfectly sized to fit into these disk sectors.  Also, we haven't addressed logical volumes yet.

Logical volumes are data sets on the disk, and each contains a file system (like NTFS, FATxx, EXTx, ZFS, JFS, UFS, XFS, HFS+, and many, many others).  These file systems refer to and track data in clusters.  A cluster contains one or more sectors, so its minimum size is 512 bytes, but clusters are commonly larger on bigger data sets such as multiple-terabyte volumes.  These logical volumes are contained in partitions.  Partitions can be listed with the Sleuth Kit's mmls command.  You can often see the file system associated with each partition as well, as shown below:

$ mmls -B nps-2008-jean.E01
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

     Slot    Start        End          Length       Size    Description
00:  Meta    0000000000   0000000000   0000000001   0512B   Primary Table (#0)
01:  -----   0000000000   0000000062   0000000063   0031K   Unallocated
02:  00:00   0000000063   0020948759   0020948697   0009G   NTFS (0x07)
03:  -----   0020948760   0020971519   0000022760   0011M   Unallocated

For more information on the mmls command and the Master Boot Record refer to my previous post here.
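
To make the sector units concrete, here is a minimal Python sketch that turns the rows above into byte offsets.  It assumes the rows have been pasted into a string exactly as mmls printed them.  The names MMLS_ROWS and parse_mmls_rows are my own for illustration and are not part of the Sleuth Kit.

```python
# Rows copied from the mmls output above (header lines left out).
MMLS_ROWS = """\
00:  Meta    0000000000   0000000000   0000000001   0512B   Primary Table (#0)
01:  -----   0000000000   0000000062   0000000063   0031K   Unallocated
02:  00:00   0000000063   0020948759   0020948697   0009G   NTFS (0x07)
03:  -----   0020948760   0020971519   0000022760   0011M   Unallocated
"""

SECTOR_SIZE = 512  # from the "Units are in 512-byte sectors" line


def parse_mmls_rows(text):
    """Parse mmls table rows into dictionaries (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        # Seven columns: index, slot, start, end, length, size, description.
        # The description may contain spaces, so split at most six times.
        fields = line.split(None, 6)
        if len(fields) != 7:
            continue
        _index, slot, start, end, length, _size, description = fields
        rows.append({
            "slot": slot,
            "start": int(start),
            "end": int(end),
            "length": int(length),
            "description": description,
        })
    return rows


rows = parse_mmls_rows(MMLS_ROWS)
ntfs = next(r for r in rows if r["description"].startswith("NTFS"))

# Sector numbers become byte positions by multiplying by the sector size.
byte_offset = ntfs["start"] * SECTOR_SIZE
byte_length = ntfs["length"] * SECTOR_SIZE

print("NTFS partition starts at byte", byte_offset)
print("NTFS partition is", byte_length, "bytes long")
```

With the rows above, the NTFS partition starts at sector 63, which is byte 32256 of the image.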

In short, data is stored physically in 512-byte sectors.  Each sector belongs to one and only one logical cluster.  That cluster may or may not contain other sectors.  The cluster, in turn, is tracked by the file system.
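
As a rough illustration of that sector-to-cluster relationship, the Python sketch below maps a sector number on the disk to a cluster number inside the volume.  It rests on two assumptions: cluster 0 begins at the first sector of the partition (true of NTFS, but not of FAT, which has a reserved area first), and the cluster is 8 sectors long.  The 8 is a made-up value for the example, since the real figure comes from the file system's boot sector.  The start sector 63 comes from the mmls output above.

```python
PARTITION_START = 63      # first sector of the NTFS partition (from mmls)
SECTORS_PER_CLUSTER = 8   # hypothetical: 8 x 512 = 4096-byte clusters


def sector_to_cluster(disk_sector,
                      partition_start=PARTITION_START,
                      sectors_per_cluster=SECTORS_PER_CLUSTER):
    """Return (cluster number, sector index inside that cluster).

    Assumes cluster 0 starts at the first sector of the partition.
    """
    if disk_sector < partition_start:
        raise ValueError("sector lies before the start of the partition")
    relative_sector = disk_sector - partition_start
    # divmod gives the whole clusters skipped and the leftover sectors.
    return divmod(relative_sector, sectors_per_cluster)


print(sector_to_cluster(63))   # first sector of the partition
print(sector_to_cluster(70))   # last sector of the first cluster
print(sector_to_cluster(71))   # first sector of the next cluster
```

Every sector lands in exactly one cluster, and several neighboring sectors share the same cluster number.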

Now let's talk, loosely, about how files are stored on a disk.  Files are single entities and are very important to forensic exams.  Files can be smaller than a sector or larger than a cluster.  So how do file systems store these odd sizes?  It's really quite simple.

Because file systems track information at the logical cluster level, every file occupies at least one cluster, and only one file can be stored in each cluster.  For example, if we have a file system with a logical cluster size of 1024 bytes (or two sectors) and a file that is 1411 bytes, the file system will be forced to allocate two full clusters to the file.  The first cluster will contain the first 1024 bytes of the file.  The remaining 387 bytes will go into the second cluster.  This leaves 637 bytes in that second cluster that aren't being used.

The second cluster in this example still contains two physical sectors.  Of those two sectors, only the first is being used, and only 387 bytes of it hold file data.  This leaves 125 bytes unused in that first sector, plus all 512 bytes of the second sector, for the 637 unused bytes mentioned above.  This is generally how files are stored.  There are a few exceptions to this rule, including NTFS, which may store files under a certain size inside the Master File Table.
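
The arithmetic in this example can be checked with a short Python sketch.  The function name allocation and the dictionary keys are my own, and the defaults simply mirror the 1024-byte cluster and 512-byte sector used above.  It does not model the NTFS exception for small files.

```python
def allocation(file_size, cluster_size=1024, sector_size=512):
    """Work out how a file of file_size bytes fills whole clusters."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if cluster_size % sector_size != 0:
        raise ValueError("a cluster must be a whole number of sectors")

    # Round up: a partly filled cluster is still fully allocated.
    clusters = -(-file_size // cluster_size)
    bytes_in_last_cluster = file_size - (clusters - 1) * cluster_size

    # Sectors of the last cluster that hold at least one byte of the file.
    used_sectors = -(-bytes_in_last_cluster // sector_size)

    return {
        "clusters": clusters,
        "bytes_in_last_cluster": bytes_in_last_cluster,
        "unused_in_last_cluster": clusters * cluster_size - file_size,
        "unused_in_last_used_sector":
            used_sectors * sector_size - bytes_in_last_cluster,
        "untouched_sectors": cluster_size // sector_size - used_sectors,
    }


result = allocation(1411)
for name, value in result.items():
    print(name, "=", value)
```

For the 1411-byte file this reports 2 clusters, 387 bytes in the last cluster, 637 unused bytes in that cluster, 125 unused bytes in the last used sector, and 1 sector left completely untouched.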

Files will always start at the beginning of a cluster and, in turn, will also always start at the beginning of a sector.  This doesn't mean that the beginning of each cluster or sector always contains the beginning of a file.  Hopefully, this example made that clear.  In the next post I am going to discuss data carving and things like RAM slack and file slack.