Part 9 in a series of posts on recovering deleted JPEG files from a FAT file system.
A month ago (!), in part 8, we looked at the JPEG file format specification to determine if there was sufficient determinism in the on-disk layout to allow the recovery of deleted files through analyzing the residual data in the file system. The answer was mixed:
- GOOD: Uniquely valued markers, discoverable through data inspection, identify the beginning and type of the segments that constitute a JPEG file.
- GOOD: the metadata segments have a pre-defined size
- BAD: the length of the entropy encoded image data is, to the best of my knowledge, unspecified in the START-OF-SCAN segment header. Instead, an END-OF-IMAGE marker is used to identify the end of the entropy encoded data. The theory is that this is done to allow JPEG files to be written as the image is processed.
Essentially, this means that there is no way to determine through data inspection the length or location of the clusters containing the encoded image data. The only clue available is the END-OF-IMAGE marker at the end of the entropy encoded data.
One option is to discover and analyze latent directory entries in the data area - doing so could provide valuable clues to the start and length of erased JPEG files. The downsides to this approach are added complexity (recovering deleted directory entries) and incompleteness (directory entries for deleted JPEG files may not exist due to reuse).
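To give a feel for what the directory-entry approach involves, here is a minimal sketch of a test for a latent short-name entry that looks like a deleted JPEG. It relies on the standard FAT directory entry layout (32 bytes, first name byte 0xE5 for a deleted entry, attributes at offset 11, first cluster low word at offset 26, size at offset 28). The function name `deletedJpegEntry` is hypothetical and not part of fatrecover, and the sketch assumes FAT12/FAT16, so it ignores the FAT32 high cluster word.

```c
#include <assert.h>
#include <string.h>

#define DIRENT_DELETED  0xE5   /* first name byte of a deleted entry */
#define ATTR_LONG_NAME  0x0F   /* attribute value of long file name entries */

/* Hypothetical helper: returns 1 if the 32-byte buffer e looks like a deleted
 * short-name directory entry with a JPG extension, else 0. On success the
 * first cluster and file size recorded in the entry are returned. */
int deletedJpegEntry(const unsigned char *e,
                     unsigned long *firstCluster,
                     unsigned long *sizeBytes)
{
    assert(e != NULL && firstCluster != NULL && sizeBytes != NULL);

    if (e[0] != DIRENT_DELETED)
        return 0;                        /* not marked as deleted */
    if ((e[11] & 0x3F) == ATTR_LONG_NAME)
        return 0;                        /* long file name fragment */
    if (memcmp(&e[8], "JPG", 3) != 0)
        return 0;                        /* extension is not JPG */

    /* Both fields are stored little-endian. */
    *firstCluster = (unsigned long)e[26] | ((unsigned long)e[27] << 8);
    *sizeBytes    = (unsigned long)e[28]
                  | ((unsigned long)e[29] << 8)
                  | ((unsigned long)e[30] << 16)
                  | ((unsigned long)e[31] << 24);
    return 1;
}
```

A real scan would need further sanity checks (valid name characters, plausible cluster and size values), because arbitrary data can begin with 0xE5.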
A simpler approach is to inspect each cluster in the data area to see if it begins with a START-OF-IMAGE marker or contains an END-OF-IMAGE marker. Any extent of clusters bounded by START-OF-IMAGE and END-OF-IMAGE markers stands a good chance of being the data for a contiguous JPEG file - the very kind of file we've been trying to recover in this series. In this post, I'll implement this simple method and test the results. Follow the "Read more" link for the rest of the post.
I added the following code to the fatrecover program being developed alongside these posts to perform the marker scan (excerpted from a larger procedure).
    #define JPEG_MARKER_COMMON (0xFF)
    #define JPEG_MARKER_SOI    (0xD8)
    #define JPEG_MARKER_EOI    (0xD9)

    int firstCluster;
    int lastCluster;
    int clusterIndex;
    int byteOffset;
    unsigned char lastMarkerType;
    int lastMarkerCluster;

    printf("Scanning clusters:\n");
    lastMarkerType = 0;
    lastMarkerCluster = 0;
    for(clusterIndex = firstCluster; clusterIndex < lastCluster; clusterIndex++)
    {
        frClusterRead(pimageInfo, clusterIndex, 1, pimageInfo->tmpBuff);

        // N.B. - the following code does not cover the
        //        case of markers spanning cluster boundaries.
        //        Not sure if this is a legal condition for
        //        JPEG files.
        for (byteOffset = 0;
             byteOffset < (pimageInfo->clusterSizeBytes-1);
             byteOffset++)
        {
            // N.B. - SOI markers should be in the first two bytes of the
            //        cluster. This constraint can be relaxed to defeat
            //        attempts to hide images by adding a preamble.
            if ((byteOffset == 0) &&
                (JPEG_MARKER_COMMON == pimageInfo->tmpBuff[byteOffset]) &&
                (JPEG_MARKER_SOI == pimageInfo->tmpBuff[byteOffset+1]))
            {
                printf("SOI cluster: %#08x\n",clusterIndex);
                lastMarkerType = JPEG_MARKER_SOI;
                lastMarkerCluster = clusterIndex;
            }

            if ((JPEG_MARKER_COMMON == pimageInfo->tmpBuff[byteOffset]) &&
                (JPEG_MARKER_EOI == pimageInfo->tmpBuff[byteOffset+1]))
            {
                printf("EOI cluster: %#08x offset: %#x\n",
                       clusterIndex, byteOffset);
                if (JPEG_MARKER_SOI == lastMarkerType)
                {
                    printf("** CONTIG JPEG? start: %#08X length: %d\n",
                           lastMarkerCluster,
                           (clusterIndex-lastMarkerCluster+1));
                }
                lastMarkerType = JPEG_MARKER_EOI;
                lastMarkerCluster = clusterIndex;
            }
        }
    }
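The N.B. comment in the scan code flags one gap: an END-OF-IMAGE marker whose 0xFF byte is the last byte of one cluster and whose 0xD9 byte is the first byte of the next. Below is a minimal sketch of one way to cover that case by remembering the final byte of the previously scanned cluster. The names `EoiScanState` and `eoiScanCluster` are hypothetical and not part of fatrecover, and the straddling check only makes sense when the two clusters really are consecutive pieces of the same file.

```c
#include <assert.h>
#include <stddef.h>

#define JPEG_MARKER_COMMON (0xFF)
#define JPEG_MARKER_EOI    (0xD9)

/* Hypothetical scan state carried from one cluster to the next. */
typedef struct {
    int havePrev;            /* 0 until a cluster has been scanned */
    unsigned char prevLast;  /* final byte of the previous cluster */
} EoiScanState;

/* Scans one cluster for the first END-OF-IMAGE marker that ends in it.
 * Returns the number of bytes in this cluster up to and including the
 * marker, or 0 if no marker ends here. A return value of 1 means the
 * marker straddled the boundary with the previous cluster. */
size_t eoiScanCluster(EoiScanState *st, const unsigned char *buf, size_t len)
{
    size_t i;
    size_t used = 0;

    assert(st != NULL && buf != NULL && len > 0);

    /* Case 1: 0xFF ended the previous cluster, 0xD9 begins this one. */
    if (st->havePrev &&
        st->prevLast == JPEG_MARKER_COMMON &&
        buf[0] == JPEG_MARKER_EOI) {
        used = 1;
    }

    /* Case 2: both marker bytes lie inside this cluster. */
    for (i = 0; used == 0 && i + 1 < len; i++) {
        if (buf[i] == JPEG_MARKER_COMMON && buf[i + 1] == JPEG_MARKER_EOI)
            used = i + 2;
    }

    /* Remember the last byte for the next call. */
    st->havePrev = 1;
    st->prevLast = buf[len - 1];
    return used;
}
```

Unlike the loop above, this sketch stops at the first marker in each cluster rather than reporting every one, which is enough to bound an extent.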
Processing the post-deletion test disk image from post 7 with this new code, via fatrecover's jpgscan command, results in:
    $ sw_vers
    ProductName:    Mac OS X
    ProductVersion: 10.6.2
    BuildVersion:   10C540
    $ wc -l fatrecover.c
    1431 fatrecover.c
    $ make all
    gcc -g -Wall fatrecover.c -o fatrecover
    $ ./fatrecover frtest2.dmg
    === FAT RECOVER v0.0 ===
    Opening file frtest2.dmg................OK
    Reading boot sector.....................OK
    Processing boot sector..................OK
    Reading root dir........................OK
    AT YOUR COMMAND:
    >> ls
    LISTING DIR: ROOT
    ATTR   SIZE     NAME                 1ST CLUSTER
    ------ -------- -------------------- -----------
    .H..D.        0 .fseventsd           (0x000006)
    .H..D.        0 .Trashes             (0x000002)
    .H...A     4096 ._.Trashes           (0x000003)
    ...V.A        0 FRTEST2.             (00000000)
    >> fat
    FAT TABLE ( KEY: .=free B=bad -=used X=last R=reserved)
    00000000-0000001F: RXX-XXX.........................
    00000020-0000003F: ................................
    00000040-0000005F: ................................
    00000060-0000007F: ................................
    00000080-0000009F: ................................
    000000A0-000000BF: ................................
    [output omitted for brevity]
    >> jpgscan
    Scanning clusters:
    SOI cluster: 0x000007
    EOI cluster: 0x00000c offset: 0x7f4
    ** CONTIG JPEG? start: 0X000007 length: 6
    SOI cluster: 0x00000d
    EOI cluster: 0x000013 offset: 0x564
    ** CONTIG JPEG? start: 0X00000D length: 7
    SOI cluster: 0x000014
    EOI cluster: 0x00001c offset: 0x6f4
    ** CONTIG JPEG? start: 0X000014 length: 9
    SOI cluster: 0x00001d
    EOI cluster: 0x000027 offset: 0x47c
    ** CONTIG JPEG? start: 0X00001D length: 11
    SOI cluster: 0x000028
    EOI cluster: 0x000032 offset: 0x7e6
    ** CONTIG JPEG? start: 0X000028 length: 11
    SOI cluster: 0x000033
    EOI cluster: 0x00003e offset: 0x195
    ** CONTIG JPEG? start: 0X000033 length: 12
    SOI cluster: 0x00003f
    EOI cluster: 0x00004a offset: 0x37d
    ** CONTIG JPEG? start: 0X00003F length: 12
    SOI cluster: 0x00004b
    EOI cluster: 0x00005c offset: 0x27e
    ** CONTIG JPEG? start: 0X00004B length: 18
    SOI cluster: 0x00005d
    EOI cluster: 0x000082 offset: 0x6ac
    ** CONTIG JPEG? start: 0X00005D length: 38
    SOI cluster: 0x000083
    EOI cluster: 0x0000ac offset: 0x552
    ** CONTIG JPEG? start: 0X000083 length: 42
In part 7, we used the contig command to list the location and size of the contiguous JPEG files before they were deleted. For convenience, I copied the information from post 7 below:
    >> contig
    LAYOUT STATUS OF FILES IN DIR: ROOT
    NAME            1ST      LENGTH   CONTIGUOUS?
    --------------- -------- -------- -------------
    4.2.05.jpg      0x000083       42 CONTIGUOUS
    4.2.01.jpg      0x00005d       38 CONTIGUOUS
    4.1.06.jpg      0x00004b       18 CONTIGUOUS
    4.1.01.jpg      0x00003f       12 CONTIGUOUS
    4.1.05.jpg      0x000033       12 CONTIGUOUS
    4.1.04.jpg      0x000028       11 CONTIGUOUS
    4.1.02.jpg      0x00001d       11 CONTIGUOUS
    4.1.08.jpg      0x000014        9 CONTIGUOUS
    4.1.07.jpg      0x00000d        7 CONTIGUOUS
    4.1.03.jpg      0x000007        6 CONTIGUOUS
    ._.Trashes      0x000003        2 CONTIGUOUS
Comparing the contig and jpgscan outputs confirms that the CONTIG JPEG? lines accurately reflect the location and size of the contiguous JPEG files before they were deleted - success, the simple marker scan method worked!
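The scan output also gives an estimate of each file's length in bytes, because the EOI offset marks where the data ends inside the last cluster. Here is a small worked sketch, assuming 2048-byte clusters (my inference from the listing, where the 4096-byte ._.Trashes file occupies 2 clusters) and assuming nothing follows the END-OF-IMAGE marker in the file. The helper name `jpegExtentBytes` is hypothetical. The reported offset is that of the marker's 0xFF byte, so two bytes are added for the marker itself.

```c
#include <assert.h>

/* Hypothetical helper: bytes from the start of the first cluster through
 * the end of the END-OF-IMAGE marker found in the last cluster.
 *   clusterCount     - the "length" value printed by jpgscan
 *   clusterSizeBytes - bytes per cluster (assumed 2048 for this image)
 *   eoiOffset        - offset of the marker's 0xFF byte in the last cluster */
unsigned long jpegExtentBytes(unsigned long clusterCount,
                              unsigned long clusterSizeBytes,
                              unsigned long eoiOffset)
{
    assert(clusterCount >= 1);
    assert(eoiOffset + 2 <= clusterSizeBytes);

    /* Full clusters before the last one, plus the used part of the last. */
    return (clusterCount - 1) * clusterSizeBytes + eoiOffset + 2;
}
```

For the first extent (start 0x000007, length 6, EOI offset 0x7f4) this gives 5 * 2048 + 2036 + 2 = 12278 bytes. The original file sizes are not listed in this post, so the figure is an estimate rather than a checked result.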
The contiguous files discovered by the scan can be recovered manually using the fatrecover utility's extract command that was implemented in part 7. With a little more effort, we can automate the recovery of such extents into files for further analysis. This is fairly straightforward and shouldn't require explanation.
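For readers who want a starting point anyway, here is a minimal sketch of copying one extent out of a disk image. It is not the fatrecover implementation: `extractExtent` and its parameters are hypothetical, and it assumes the standard FAT convention that the first cluster of the data area is numbered 2, so the caller must supply the byte offset of the data area within the image.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: copies clusterCount whole clusters, starting at
 * startCluster, from the image to out. Returns 0 on success, -1 on error. */
int extractExtent(FILE *image, long dataAreaOffset, size_t clusterSizeBytes,
                  unsigned long startCluster, unsigned long clusterCount,
                  FILE *out)
{
    unsigned char *buf;
    unsigned long i;
    long start;
    int rc = 0;

    assert(image != NULL && out != NULL);
    if (startCluster < 2 || clusterSizeBytes == 0)
        return -1;                       /* clusters 0 and 1 hold no data */

    /* Assumption: cluster 2 is the first cluster of the data area. */
    start = dataAreaOffset + (long)((startCluster - 2) * clusterSizeBytes);
    if (fseek(image, start, SEEK_SET) != 0)
        return -1;

    buf = malloc(clusterSizeBytes);
    if (buf == NULL)
        return -1;

    /* The extent is contiguous, so the clusters can be read sequentially. */
    for (i = 0; i < clusterCount; i++) {
        if (fread(buf, 1, clusterSizeBytes, image) != clusterSizeBytes ||
            fwrite(buf, 1, clusterSizeBytes, out) != clusterSizeBytes) {
            rc = -1;
            break;
        }
    }

    free(buf);
    return rc;
}
```

Calling this once per CONTIG JPEG? line, with the reported start and length, would write each candidate file in whole clusters. The output could then be trimmed to the END-OF-IMAGE offset if exact sizes matter.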
In the next post, we'll take a look at what happens when there are non-contiguous files in the file system. After that, I'll try to fix the long file name code from post 5 and wrap up this part of the series.