Saturday, December 19, 2009

Recovering Deleted JPEGs from a FAT File System - Part 4

Part 4 in a series of posts on recovering deleted JPEG files from a FAT file system.

Now that we've created a test data set and covered the high-level structure of a FAT file system, it's time to get our hands dirty with some code. Enough talk, more action!!!

Follow the "read more" link for a detailed description of how to read and interpret a FAT file system boot sector.

The following C structure defines the first 36 bytes of the FAT boot sector which is common across all three variants - FAT12, FAT16, and FAT32.

// COMMON BOOT SECTOR
#define FAT_COMBS_OEMSZ (8)

typedef struct {
  uint8_t  jmpInstruction[3];            // 02-00
  uint8_t  oemName[FAT_COMBS_OEMSZ];     // 10-03
  uint16_t bytesPerSector;               // 12-11
  uint8_t  sectorsPerCluster;            // 13-13
  uint16_t reservedSectors;              // 15-14
  uint8_t  numberOfFATs;                 // 16-16
  uint16_t numberOfRootDirEntries;       // 18-17
  uint16_t numberOfSectorsShort;         // 20-19
  uint8_t  mediaType;                    // 21-21
  uint16_t sectorsPerFAT;                // 23-22
  uint16_t sectorsPerTrack;              // 25-24
  uint16_t numberOfHeads;                // 27-26
  uint32_t numberHiddenSectors;          // 31-28
  uint32_t numberOfSectorsLong;          // 35-32
} __attribute__ ((packed))  fatBootSectorCommon_t;

The packed attribute instructs the compiler to place the structure members adjacent to each other in memory. Without this, the compiler might insert padding between structure members to optimize their alignment. In this case such padding would cause a mismatch between the in-memory and on-disk structure layouts making reading the boot sector more cumbersome.

As might be expected, the common part of the boot sector provides the information needed to understand the file system's on-disk structure. Particular fields of interest are:

  • oemName ASCII description of the OS used to format the file system. OS dependent, no null termination.
  • bytesPerSector: sector size in bytes - see part 3 for a brief explanation.
  • sectorsPerCluster: data area cluster size in sectors - see part 3 for a brief explanation.
  • reservedSectors: number of sectors consumed by the boot sector and any additional reserved space.
  • numberOfFATs: multiple copies of the file allocation table may be kept to guard against disk errors, this field indicates the number of copies in the FAT area.
  • numberOfRootDirEntries: size of the root directory area in terms of the maximum number of directory entries. Recall that in both FAT12 and FAT16 space is reserved after the FAT area to store the root directory's contents. The size of a directory entry, 32 bytes, is needed to compute the size of the root directory area from this parameter.
  • numberOfSectorsShort: total number of sectors in the volume, only valid if the count is less than 64K. Otherwise the value of this field will be 0 and numberOfSectorsLong will contain the correct value.
  • sectorsPerFAT: size in sectors of a single copy of the file allocation table in the FAT area.
  • numberOfSectorsLong: total number of sectors in the volume, only valid if the count is greater than or equal to 64K. Otherwise the value of this field will be 0 and numberOfSectorsShort will contain the correct value.

The remainder of a FAT12 or FAT16 boot sector has the following format:

// FAT12 & FAT16 EXTENDED BOOT SECTOR
#define FAT_EXTBS_START (36)
#define FAT_EXTBS_VLSZ  (11)
#define FAT_EXTBS_FSSZ  (8)

typedef struct {
  uint8_t  driveNumber;                    // 36-36
  uint8_t  reserved;                       // 37-37
  uint8_t  bootSignature;                  // 38-38
  uint32_t volumeID;                       // 42-39
  uint8_t  volumeLabel[FAT_EXTBS_VLSZ];    // 53-43
  uint8_t  filesystemType[FAT_EXTBS_FSSZ]; // 61-54
  uint8_t  reserved2[448];                 // 509-62
  uint16_t signature;                      // 511-510
} __attribute__ ((packed))  fatBootSectorExt1216_t;

Much of the extended boot sector is reserved for the possible inclusion of boot code. For this project, only the following extended boot sector fields are of marginal interest:

  • volumeID: numerical ID - value dependent on the OS/tool used to create the file system.
  • volumeLabel: ASCII name for the file system, used by some OSs for the mount-point name. No null termination.
  • fileSystemType: ASCII description of file system type. No standard values, examples include "FAT", "FAT12", and "FAT16". No null termination.

The following C code fragment demonstrates the use of these structures to read the common and extended boot sectors from an image file opened with the descriptor fdImage.

fatBootSectorCommon_t  bootSectorCommon;
fatBootSectorExt1216_t bootSectorExt1216;

fseek(fdImage, 0, SEEK_SET);
fread(&(bootSectorCommon), 
      sizeof(fatBootSectorCommon_t), 1, fdImage);

fseek(fdImage, FAT_EXTBS_START, SEEK_SET);
fread(&(bootSectorExt1216), 
      sizeof(fatBootSectorExt1216_t), 1, fdImage); 

Thanks to the use of the packed attribute, the on-disk structures can be read directly into their in-memory counterparts and accessed using the standard C structure idioms.

The following code fragment shows how to use the boot sector information to calculate the location and size of the major file system areas.

#define ALIGN(size, alignment) \
 (((size) + (alignment-1)) / (alignment))

size_t volSizeSectors;
size_t fatFirstSector;
size_t fatSizeSectors;
size_t rootdirFirstSector;
size_t rootdirSizeSectors;
size_t clustersFirstSector;
size_t clustersSizeClusters;

if (bootSectorCommon.numberOfSectorsShort > 0) {
  volSizeSectors = 
    bootSectorCommon.numberOfSectorsShort;
  
} else {
  volSizeSectors = 
    bootSectorCommon.numberOfSectorsLong;
}


fatFirstSector = 
  bootSectorCommon.reservedSectors;

fatSizeSectors = 
  bootSectorCommon.sectorsPerFAT * 
  bootSectorCommon.numberOfFATs;

  
rootdirFirstSector = 
  fatFirstSector + 
  fatSizeSectors;

rootdirSizeSectors = 
  ALIGN((bootSectorCommon.numberOfRootDirEntries * 32,
         bootSectorCommon.bytesPerSector);


clustersFirstSector = 
  rootdirFirstSector + 
  rootdirSizeSectors;

clustersSizeClusters = 
  (volSizeSectors - clustersFirstSector) /
  bootSectorCommon.sectorsPerCluster;

As mentioned above, calculating the size of the root directory area requires knowing the size of a directory entry. We'll discuss the root directory in a future post - for now it is sufficient to know that each entry is 32 bytes.

Printing the boot sector's contents to a console is a fairly simply matter with the only complication being that the ASCII information should first be copied to null-terminated strings.

char oemName[FAT_COMBS_OEMSZ+1];
char volumeLabel[FAT_EXTBS_VLSZ+1];
char filesystemType[FAT_EXTBS_FSSZ+1];

strncpy(oemName,        
        (char *)pBootSectorCommon->oemName, 
        FAT_COMBS_OEMSZ);
strncpy(volumeLabel,    
        (char *)pBootSectorExt1216->volumeLabel, 
        FAT_EXTBS_VLSZ);
strncpy(filesystemType, 
        (char *)pBootSectorExt1216->filesystemType, 
        FAT_EXTBS_FSSZ);

oemName[FAT_COMBS_OEMSZ]       = 0;
volumeLabel[FAT_EXTBS_VLSZ]    = 0;
filesystemType[FAT_EXTBS_FSSZ] = 0;

Voila! This should be enough information to write a simple program capable of reading a FAT file system boot sector, locating the major areas, and printing the information to a console.

As an example, below is the output of the program I am (re)developing alongside these posts from which the code fragments above were taken. The output was generated by running the program against the test disk image created in part 2 of this series. Recall that my development platform is an Apple MacBookPro running OS X Snow Leopard (10.6.2).

$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.6.2
BuildVersion:   10C540

$ wc -l fatrecover.c 
     306 fatrecover.c

$ make all
gcc -g -Wall fatrecover.c -o fatrecover

$ ./fatrecover frtest1.dmg 

=== FAT RECOVER v0.0 ===

Opening file frtest1.dmg................OK
Reading boot sector.....................OK
Processing boot sector..................OK

BOOT SECTOR:
          OEMName: (BSD  4.4)
         volumeID: 63260cf5
     Volume Label: FRTEST1    
           FSType: FAT16   
     Bytes/Sector: 512
  Sectors/Cluster: 4
 Reserved Sectors: 1
   Hidden Sectors: 0
 Sectors/Vol (2B): 32768
 Sectors/Vol (4B): 0
      Sectors/FAT: 32
        # of FATs: 2
  RootDir Entries: 512

MAJOR STRUCTURES:
      AREA    OFFSET      SIZE
----------  --------  --------
 BOOT+RESV  00000000  0x000001 (sectors)
       FAT  0x000001  0x000040 (sectors)
   ROOTDIR  0x000041  0x000020 (sectors)
      DATA  0x000061  0x001fe7 (clusters)

Note that the structure offsets are in units of sectors.

From the output we can see that the:

  • volume name is "FRTEST1", matches the name specified when the disk image was created.
  • sector and cluster sizes are 512 and 2048 bytes respectively.
  • FAT area starts at sector 1 and is 64 sectors long (2 * 32).
  • root directory can contain up to 512 entries.
  • data area contains 0x1fe7 or 8167 clusters.
  • volume contains a total of 32768 sectors - 16MB as expected.

To sanity check the output, let's validate the size of the data area.

Total Sectors327680x8000
Less Boot Sector327670x7FFF
Less FATs327030x7FBF
Less RootDir326710x7F9F
Divided by 481670x1FE7
(remaining sectors)3

The check looks good. The remaining three sectors further validate the calculations. Only the boot sector consumed an odd number of sectors - one - which implies that there should be three or less unused sectors in the data area (since the cluster size is four sectors).

As a second sanity check, let's make sure the file allocation table's are an appropriate size. Since FAT entries are 16 bits (2 bytes), we know that each FAT contains (32*512)/2 = 8192 entries which is enough to represent the 8167 clusters in the data area.

The sanity checks appear to pass - it looks like the above code works. In the next post we'll skip ahead to processing the root directory before tacking the FAT.

Addendum

One important detail omitted from the main-line discussion above is that all FAT file system data structures are stored on-disk in little endian order. As I am using a little endian machine (Intel Core 2 Duo) the example code above uses the value of all multi-byte fields directly. A big endian machine, however, would have to convert all of the multi-byte field values from little to big endian before using them.

When I originally did this project in 2004, I used ah PowerPC Apple PowerBook which had a big endian architecture. As a result, my original program had to do the endian conversion. Below are the macros I wrote to do the necessary byte swapping.

//  Macros to byte swap to change endianess
//
#if BYTE_ORDER == BIG_ENDIAN


#define LEToHost16(_x)                          \
do {                                            \
    unsigned char *buff = (unsigned char *) _x; \
    unsigned char tmp;                          \
    tmp     = buff[1];                          \
    buff[1] = buff[0];                          \
    buff[0] = tmp;                              \
   } while(0)


#define LEToHost24(_x)                          \
do {                                            \
    unsigned char *buff = (unsigned char *) _x; \
    unsigned char tmp;                          \
    tmp     = buff[2];                          \
    buff[2] = buff[0];                          \
    buff[0] = tmp;                              \
   } while(0)


#define LEToHost32(_x)                          \
do {                                            \
    unsigned char *buff = (unsigned char *) _x; \
    unsigned char tmp;                          \
    tmp     = buff[0];                          \
    buff[0] = buff[3];                          \
    buff[3] = tmp;                              \
    tmp     = buff[1];                          \
    buff[1] = buff[2];                          \
    buff[2] = tmp;
   } while(0)

#else

#define LEToHost16(_x) 
#define LEToHost24(_x) 
#define LEToHost32(_x) 

#endif