Part 6 in a series of posts on recovering deleted JPEG files from a FAT file system.
In this post we'll tackle the last major FAT file system data structure - the file allocation table. Compared to the previous post on the root directory, this post is much less complex. Follow the "read more" link for an overview of the file allocation table and the code required to process it.
As mentioned in previous posts, a FAT16 file allocation table consists of one 16 bit entry for each cluster in the data area. The value stored in each entry indicates the associated cluster's allocation state and, if needed, the next cluster containing data for the same file system object.
Although table entries exist for clusters 0
and 1
, they are
reserved for the system's use. The first byte of entry 0
is used to
store a copy of the mediaType
field from the common boot sector -
the entry's remaining bits are set to 1
. Entry 1
is sometimes used
to record if the volume was unmounted cleanly. Because of these
reservations, the data area cluster information beings with entry 2
.
Initially, table entries 2
and higher are initialized to a value of
0x0000
indicating that the associated clusters are available for
use. Bad clusters are signified by the value 0xFFF7
while reserved
clusters have a value between 0xFFF0
and 0xFFF6
(inclusive).
Table entries for allocated clusters contain either:
- The number of the next cluster containing additional data for the associated file system object (file or directory).
-
A value between
0xFFF8
and0xFFFF
(inclusive) indicating that this is the last cluster containing data for the associated file system object. Collectively, these values are known as end-of-chain (EOC
) values.
Essentially, allocated table entries form a singly linked-list specifying the location and order of the clusters containing a file system object's data. To illustrate this, consider the following diagram:
ROOT FAT DIR TABLE +--------+------+ +-------+-------+ | FNMAME | 1ST C| | ENTRY | VALUE | +========+======+ +=======+=======+ | FILEA | 0x02 +-------+ | 0X00 | RESVD | | | | | | | | +--------+------+ | +-------+-------+ | FILEB | 0x03 +-----+ | | 0X01 | RESVD | | | | | | | | | +--------+------+ | | +-------+-------+ | | | | +--->| 0X02 | EOC | | | | | | | | | +-------+-------+ +----->| 0X03 | 0X04 | | | +----+ +-------+-------+ | | 0X04 | 0X06 |<---+ | | +----+ +-------+-------+ | | 0X05 | 0X00 | | | | | | +-------+-------+ | | 0X06 | EOC |<---+ | | | +-------+-------+ | | | | | |
The diagram depicts two examples, FILEA
and FILEB
, and their
associated entries in the root directory and file allocation table.
FILEA
's root directory entry states that cluster 0x02
contains the
first portion of its data. To reflect this allocation, file allocation
table entry 0x02
has a non-zero value. Furthermore, entry 0x02
contains an EOC
value indicating that no other clusters contain data
for FILEA
- all of FILEA
's data fits within the single cluster.
FILEB
is a longer file requiring three clusters to store its
data. Its root directory entry identifies cluster 0x03
as the first
and, appropriately, file allocation table entry 0x03
contains a
non-zero value. In this case, entry 0x03
contains the value 0x04
indicating that FILEB
's second cluster of data resides in cluster
0x04
. File allocation table entry 0x04
in turn contains the value
of 0x06
identifying cluster 0x06
as the third holding FILEB
's
data. Finally, file allocation table 0x06
contains an EOC
value
indicating that it is the last cluster containing FILEB
's data. Note
that cluster 0x05
, although between clusters holding FILEB
's data,
remains unallocated and is available for use.
The example illustrates how the file allocation table is used to both track cluster allocations and construct linked-lists describing the location of file system object data. In FAT parlance, these linked lists are called cluster chains.
The file allocation table forms the heart of the file system - a fact reflected by the file system's name, FAT. Before discussing the code required to process the on-disk table, it's worth noting a couple of things:
- The smallest allocation unit is a cluster - 2KB in our test image's case. File system objects smaller than a cluster consume the whole thing. As a result, a large number of small files will result in severe internal fragmentation.
- The file allocation table only contains forward pointers. There are are no back pointers to prior clusters or the directory entry for the file system object the chain belongs to.
- The file allocation table alone describes the location of file system object data - erasing the table essentially erases the file system although the data remains unchanged.
Recall from the first post that the third point was the basis for the original project in 2004. File systems that erase files by only clearing their directory and file allocation table entries leave the data in place thus making it theoretically possible to recover them.
Compared to the root directory, processing the file allocation table is trivial. The following code fragment provides all of the required constants, macros, and type definitions:
#define FAT_FAT16ENT_FREESECTOR (0x0000U) #define FAT_FAT16ENT_RESERVED_START (0xfff0U) #define FAT_FAT16ENT_RESERVED_END (0xfff6U) #define FAT_FAT16ENT_BADSECTOR (0xfff7U) #define FAT_FAT16ENT_LASTSECTOR_START (0xfff8U) #define FAT_FAT16ENT_LASTSECTOR_END (0xffffU) #define FAT_FAT16ENT_ISFREESECTOR(x) \ ((x) == FAT_FAT16ENT_FREESECTOR) #define FAT_FAT16ENT_ISBADSECTOR(x) \ ((x) == FAT_FAT16ENT_BADSECTOR) #define FAT_FAT16ENT_ISLASTSECTOR(x) \ ((x) >= FAT_FAT16ENT_LASTSECTOR_START) #define FAT_FAT16ENT_ISRESERVEDSECTOR(x)\ (((x) >= FAT_FAT16ENT_RESERVED_START) && \ ((x) <= FAT_FAT16ENT_RESERVED_END)) typedef uint16_t fatFAT16Entry_t;
Reading a cluster entry requires calculating the associated sector number and offset, loading the sector into memory, and extracting the correct two bytes.
fatTableOffset = clusterNum * sizeof(fatFAT16Entry_t); sectorNum = (fatTableOffset / bytesPerSector) + pimageInfo->fatFirstSector; sectorOffset = fatTableOffset % bytesPerSector; frSectorRead(pimageInfo, sectorNum, 1, pSectorBuff); memcpy(pfatEntry, (pSectorBuff + sectorOffset), sizeof(fatFAT16Entry_t));
Alternatively, a portion of the table loaded into a memory buffer can be iterated over as an array.
fatFAT16Entry_t *fatTable; fatFAT16Entry_t entry; fatTable = (fatFAT16Entry_t *) pfatTableBuffer; for(entryIndex=0; entryIndex < numEntries; entryIndex++ ) { entry = fatTable[entryIndex]; .... }
Using the code fragments above, I enhanced the program I am
(re)developing alongside these posts to process and printout the file
allocation table. Running it against the test disk image,
frtest1.dmg
, created in the second post and used to produce the
examples in the prior two posts yielded:
$ sw_vers ProductName: Mac OS X ProductVersion: 10.6.2 BuildVersion: 10C540 $ wc -l fatrecover.c 829 fatrecover.c $ make clean all rm -r fatrecover fatrecover.dSYM gcc -g -Wall fatrecover.c -o fatrecover $ ./fatrecover frtest1.dmg === FAT RECOVER v0.0 === Opening file frtest1.dmg................OK Reading boot sector.....................OK Processing boot sector..................OK Reading root dir........................OK BOOT SECTOR: OEMName: (BSD 4.4) volumeID: 63260cf5 Volume Label: FRTEST1 FSType: FAT16 Bytes/Sector: 512 Sectors/Cluster: 4 Reserved Sectors: 1 Hidden Sectors: 0 Sectors/Vol (2B): 32768 Sectors/Vol (4B): 0 Sectors/FAT: 32 # of FATs: 2 # RootDir Entries: 512 MAJOR STRUCTURES: AREA OFFSET SIZE ---------- -------- -------- BOOT+RESV 00000000 0x000001 (sectors) FAT 0x000001 0x000040 (sectors) ROOTDIR 0x000041 0x000020 (sectors) DATA 0x000061 0x001fe7 (clusters) LISTING DIR: ROOT ATTR SIZE NAME 1ST CLUSTER ------ -------- -------------------- ----------- .....A 92889 4.2.04.jpg (0x00000a) .H..D. 0 .Trashes (0x000002) .H...A 4096 ._.Trashes (0x000003) ...V.A 0 FRTEST1. (00000000) FAT TABLE ( KEY: .=free B=bad -=used X=last R=reserved) 00000000-0000001F: RXX-X.....---------------------- 00000020-0000003F: -----------------------X........ 00000040-0000005F: ................................ 00000060-0000007F: ................................ <OUTPUT FOR EMPTY ENTRIES OMITTED> CLUSTER CHAIN FOR ENTRY 0xa: 0x000a->0x000b->0x000c->0x000d->0x000e->0x000f->0x0010-> 0x0011->0x0012->0x0013->0x0014->0x0015->0x0016->0x0017-> 0x0018->0x0019->0x001a->0x001b->0x001c->0x001d->0x001e-> 0x001f->0x0020->0x0021->0x0022->0x0023->0x0024->0x0025-> 0x0026->0x0027->0x0028->0x0029->0x002a->0x002b->0x002c-> 0x002d->0x002e->0x002f->0x0030->0x0031->0x0032->0x0033-> 0x0034->0x0035->0x0036->0x0037->EOC (46 clusters)
As discussed previously, file allocation table entries 0
and 1
are
reserved by the system. The program output indeed shows that entry 0
contains a RESERVED
value but entry 1
instead contains an EOC
value. A little research1 uncovered that EOC
values are used to both
prevent entry 1
from being used and to encode the volume's last
unmount state - the output thus makes sense.
Table entry 2
is also allocated and the root directory listing tells
us that it belongs to the .Trashes
subdirectory. Furthermore, the
EOC
value indicates that the directory's contents fit within the
single sector.
Entry 4
is allocated to the ._.Trashes
file which requires two
clusters to store its 4096 bytes of data. Presumably, entry 5
also belongs to this file and is the last in its cluster chain.
Finally, entries 0x0a
to 0x37
are all allocated to the test JPG
file 4.2.04.jpg
. This is suggested by the root directory listing and
allocation table summary and confirmed by the additional output
showing the cluster chain starting at entry 0x0a
. The cluster
chain's length, 46, provides a sanity check as it is the smallest
number of clusters required to store the file's 92889
bytes of data
(46 * 2KB -> 94208B
).
With that we now have all the code required to examine a FAT file system disk image. With a little work, the code fragments provided in this and prior posts can be used to create a program capable of inspecting file system images of varying size and complexity.
In the next post, we'll regroup by using the functionality developed thus far to inspect various file system images to determine the best approach for recovering deleted JPG files.