Part 7 in a series of posts on recovering deleted JPEG files from a FAT file system.
After a hiatus due to the holidays, personal matters, and work, I'm ready to continue the FAT Recover project. In this post I'll:
- finally test the two principal assumptions that this project is based on.
- actually recover deleted files using a manual approach.
But first a mea culpa - I've discovered that the code for long file name support in post 5 is utterly broken. The code works for files that use only a single long file name entry but does not correctly process file names spanning multiple entries. For now, I'll side-step the issue and post a fix at a later time. I'll also update post 5 to warn future readers. Apologies for the error - that's the danger of hacking in the wee hours with minimal unit testing.
By this point, it's likely apparent that this project is based on two assumptions:
- Deleting a file from a FAT file system often does not erase the associated data.
- In simple cases - like a digital camera saving pictures on a fresh memory card - files are stored contiguously in FAT file systems.
To test if these assumptions hold, let's create and analyze a new disk image with multiple JPEG files.
Expanding on the method from post 2, the following shell script creates a new disk image and copies ten images from the test corpus to it.
    #!/bin/bash
    # Build a 16 MB FAT16 disk image and copy ten test JPEGs onto it.

    IMAGENAME=frtest2
    VOLNAME=FRTEST2
    VOLSIZE='-megabytes 16'
    MOUNTPATH=/Volumes/${VOLNAME}
    TESTPATH=/Users/jcardent/projects/fatrecover2/test/images
    # The ten smallest files matching 4*.jpg (-S sorts by size, -r reverses)
    TESTFILES=`ls -1Sr ${TESTPATH}/4*.jpg | head -10`

    hdiutil create \
        -fs "MS-DOS FAT16" \
        ${VOLSIZE} \
        -layout NONE \
        -volname ${VOLNAME} ${IMAGENAME}

    hdiutil attach ${IMAGENAME}.dmg

    for file in $TESTFILES
    do
        echo "Copying file " ${file}
        cp ${file} ${MOUNTPATH}
    done

    hdiutil detach /Volumes/${VOLNAME}
Mounting the image and listing the root directory confirms that ten image files were copied to the encapsulated FAT file system.
    $ hdiutil attach frtest2.dmg
    /dev/disk1    /Volumes/FRTEST2
    $ ls -ao /Volumes/FRTEST2/
    total 712
    drwxrwxrwx  1 jcardent  16384 Jan 26 07:18 .
    drwxrwxrwt@ 4 root        136 Jan 26 07:18 ..
    drwxrwxrwx@ 1 jcardent   2048 Jan 26 07:18 .Trashes
    -rwxrwxrwx  1 jcardent   4096 Jan 26 06:30 ._.Trashes
    drwxrwxrwx  1 jcardent   2048 Jan 26 07:18 .fseventsd
    -rwxrwxrwx  1 jcardent  23423 Jan 26 06:30 4.1.01.jpg
    -rwxrwxrwx  1 jcardent  21630 Jan 26 06:30 4.1.02.jpg
    -rwxrwxrwx  1 jcardent  12278 Jan 26 06:30 4.1.03.jpg
    -rwxrwxrwx  1 jcardent  22504 Jan 26 06:30 4.1.04.jpg
    -rwxrwxrwx  1 jcardent  22935 Jan 26 06:30 4.1.05.jpg
    -rwxrwxrwx  1 jcardent  35456 Jan 26 06:30 4.1.06.jpg
    -rwxrwxrwx  1 jcardent  13670 Jan 26 06:30 4.1.07.jpg
    -rwxrwxrwx  1 jcardent  18166 Jan 26 06:30 4.1.08.jpg
    -rwxrwxrwx  1 jcardent  77486 Jan 26 06:30 4.2.01.jpg
    -rwxrwxrwx  1 jcardent  85332 Jan 26 06:30 4.2.05.jpg
Since the last post, I've made a number of enhancements to the fatrecover program that I am developing for this series. One new feature is a simple shell that allows interactively analyzing a disk image. Building and running the program results in:
    $ sw_vers
    ProductName:    Mac OS X
    ProductVersion: 10.6.2
    BuildVersion:   10C540
    $ wc -l fatrecover.c
    1348 fatrecover.c
    $ make all
    gcc -g -Wall fatrecover.c -o fatrecover
    $ ./fatrecover frtest2.dmg
    === FAT RECOVER v0.0 ===
    Opening file frtest2.dmg................OK
    Reading boot sector.....................OK
    Processing boot sector..................OK
    Reading root dir........................OK
    AT YOUR COMMAND:
    >> help
    quit      exits program
    help      displays commands
    bsector   prints boot sector
    fsareas   prints size and location of fs areas
    fat       prints file allocation table
    ls        prints current directory
    chain     prints cluster chain
    contig    prints file continuity status
    checksum  prints cluster checksums
    extract   extracts clusters to file
    >>
Listing the root directory within fatrecover also shows the ten copied files (and the volume name entry, which is usually hidden).
    >> ls
    LISTING DIR: ROOT
    ATTR    SIZE      NAME                  1ST CLUSTER
    ------  --------  --------------------  -----------
    .....A     85332  4.2.05.jpg            (0x000083)
    .....A     77486  4.2.01.jpg            (0x00005d)
    .....A     35456  4.1.06.jpg            (0x00004b)
    .....A     23423  4.1.01.jpg            (0x00003f)
    .....A     22935  4.1.05.jpg            (0x000033)
    .....A     22504  4.1.04.jpg            (0x000028)
    .....A     21630  4.1.02.jpg            (0x00001d)
    .....A     18166  4.1.08.jpg            (0x000014)
    .....A     13670  4.1.07.jpg            (0x00000d)
    .....A     12278  4.1.03.jpg            (0x000007)
    .H..D.         0  .fseventsd            (0x000006)
    .H..D.         0  .Trashes              (0x000002)
    .H...A      4096  ._.Trashes            (0x000003)
    ...V.A         0  FRTEST2.              (00000000)
Printing the FAT table suggests a well-ordered layout of the JPEG files in the data area.
    >> fat
    FAT TABLE ( KEY: .=free B=bad -=used X=last R=reserved)
    00000000-0000001F: RXX-X..-----X------X--------X---
    00000020-0000003F: -------X----------X-----------X-
    00000040-0000005F: ----------X-----------------X---
    00000060-0000007F: --------------------------------
    00000080-0000009F: --X-----------------------------
    000000A0-000000BF: ------------X...................
    000000C0-000000DF: ................................
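The one-character key in this listing can be derived from the raw 16-bit FAT entries. The following is a minimal sketch of such a mapping, assuming the conventional FAT16 value ranges; `fat16_symbol` and `fat16_render` are hypothetical helper names, not code taken from fatrecover.c. It classifies purely by value, whereas the listing above shows entry 0 as reserved, so the real tool presumably treats the first entries specially.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a raw FAT16 entry to the key used in the listing:
 * .=free  B=bad  -=used  X=last  R=reserved.
 * Hypothetical helper; classification is by value only. */
char fat16_symbol(uint16_t entry)
{
    if (entry == 0x0000) return '.';                     /* free cluster */
    if (entry == 0xFFF7) return 'B';                     /* bad cluster */
    if (entry >= 0xFFF8) return 'X';                     /* end-of-chain marker */
    if (entry == 0x0001 || entry >= 0xFFF0) return 'R';  /* reserved values */
    return '-';                        /* in use: holds the next cluster number */
}

/* Render n entries into out, which must have room for n + 1 bytes. */
void fat16_render(const uint16_t *fat, size_t n, char *out)
{
    assert(fat != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = fat16_symbol(fat[i]);
    out[n] = '\0';
}
```

Calling `fat16_render` on 32 entries at a time would produce one row of the table.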
Inspecting the cluster chains of the first three files confirms their contiguous layout.
    >> chain 0x7
    0x0007 -> 0x0008 -> 0x0009 -> 0x000a -> 0x000b -> 0x000c -> EOC
    (6 clusters, check 6, Contiguous)
    >> chain 0xd
    0x000d -> 0x000e -> 0x000f -> 0x0010 -> 0x0011 -> 0x0012 -> 0x0013 -> EOC
    (7 clusters, check 7, Contiguous)
    >> chain 0x14
    0x0014 -> 0x0015 -> 0x0016 -> 0x0017 -> 0x0018 -> 0x0019 -> 0x001a -> 0x001b -> 0x001c -> EOC
    (9 clusters, check 9, Contiguous)
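A chain walk like the one printed above can be sketched in a few lines of C. This is an illustration only, assuming the FAT has already been read into memory as an array of 16-bit entries; the function name `fat16_chain_length` and its parameters are hypothetical and not taken from fatrecover.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT16_EOC_MIN 0xFFF8u   /* entries >= this value end a chain */

/* Walk the chain starting at cluster `first` in an in-memory FAT16 table
 * of `nentries` entries. Returns the chain length in clusters and sets
 * *contiguous to 1 if every link points at the very next cluster.
 * Returns 0 for a malformed chain (link to an invalid cluster, or a chain
 * longer than the table, which would indicate a loop). */
size_t fat16_chain_length(const uint16_t *fat, size_t nentries,
                          uint16_t first, int *contiguous)
{
    size_t length = 0;
    uint16_t cur = first;

    assert(fat != NULL && contiguous != NULL);
    *contiguous = 1;
    while (length < nentries) {
        if (cur < 2 || cur >= nentries)
            return 0;                      /* not a valid data cluster */
        length++;
        uint16_t next = fat[cur];
        if (next >= FAT16_EOC_MIN)
            return length;                 /* reached end of chain */
        if (next != (uint16_t)(cur + 1))
            *contiguous = 0;               /* chain jumps elsewhere */
        cur = next;
    }
    return 0;                              /* too long: must be a loop */
}
```

The loop bound on `nentries` is a cheap guard against corrupted tables whose chains loop back on themselves.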
For convenience, I implemented the contig shell command, which lists each file and reports whether it is stored contiguously in the file system.
    >> contig
    LAYOUT STATUS OF FILES IN DIR: ROOT
    NAME             1ST       LENGTH    CONTIGUOUS?
    ---------------  --------  --------  -------------
    4.2.05.jpg       0x000083        42  CONTIGUOUS
    4.2.01.jpg       0x00005d        38  CONTIGUOUS
    4.1.06.jpg       0x00004b        18  CONTIGUOUS
    4.1.01.jpg       0x00003f        12  CONTIGUOUS
    4.1.05.jpg       0x000033        12  CONTIGUOUS
    4.1.04.jpg       0x000028        11  CONTIGUOUS
    4.1.02.jpg       0x00001d        11  CONTIGUOUS
    4.1.08.jpg       0x000014         9  CONTIGUOUS
    4.1.07.jpg       0x00000d         7  CONTIGUOUS
    4.1.03.jpg       0x000007         6  CONTIGUOUS
    ._.Trashes       0x000003         2  CONTIGUOUS
The contig command confirms that all of the copied files are indeed stored contiguously. This evidence supports the second assumption - contiguous storage - but what about the first assumption that file data doesn't get erased after deletion?
I've also enhanced fatrecover with the ability to checksum a series of clusters using the FNV hash algorithm. Hashing the clusters for the first three files yields:
    >> checksum 0x7 6
    0007-000A: 621102b1 5f618b21 d2855912 f5e1af11
    000B-000C: 5587dd40 8988110b
    >> checksum 0xd 7
    000D-0010: e736c48d 2554ee35 e613f172 fe82cc2f
    0011-0013: de7bf92b 515eea02 acefd6c3
    >> checksum 0x14 9
    0014-0017: 73e7cfea cdc2d024 5625892b 560cd9b1
    0018-001B: e5257e13 49fddb98 44610606 6011d7c7
    001C-001C: eadda707
To validate these checksums, I also implemented a standalone FNV hash utility to checksum the original image files in 2KB chunks (last chunk padded with 0s).
    $ ./fnvtest 4.1.03.jpg
    0000-0003: 621102b1 5f618b21 d2855912 f5e1af11
    0004-0007: 5587dd40 8988110b
    $ ./fnvtest 4.1.07.jpg
    0000-0003: e736c48d 2554ee35 e613f172 fe82cc2f
    0004-0007: de7bf92b 515eea02 acefd6c3
    $ ./fnvtest 4.1.08.jpg
    0000-0003: 73e7cfea cdc2d024 5625892b 560cd9b1
    0004-0007: e5257e13 49fddb98 44610606 6011d7c7
    0008-000B: eadda707
They match! This indicates that we have indeed located the data for these files in the disk image.
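For readers who want to reproduce this kind of check, here is a rough sketch of per-chunk hashing with zero padding. It assumes the 32-bit FNV-1a variant, which is a guess on my part since only "FNV" is named above, so its output should not be expected to match the checksums shown. The names `fnv1a32` and `hash_chunks` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 2048u   /* 2 KB, the chunk size used above */

/* 32-bit FNV-1a over a buffer (variant choice is an assumption). */
uint32_t fnv1a32(const unsigned char *buf, size_t len)
{
    uint32_t hash = 2166136261u;           /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= buf[i];
        hash *= 16777619u;                 /* FNV prime */
    }
    return hash;
}

/* Hash fp in CHUNK_SIZE pieces, zero-padding a short final chunk, and
 * store at most `max` hashes in `out`. Returns the number of chunks. */
size_t hash_chunks(FILE *fp, uint32_t *out, size_t max)
{
    unsigned char chunk[CHUNK_SIZE];
    size_t n = 0, got;

    assert(fp != NULL && out != NULL);
    while (n < max && (got = fread(chunk, 1, CHUNK_SIZE, fp)) > 0) {
        memset(chunk + got, 0, CHUNK_SIZE - got);   /* pad the tail */
        out[n++] = fnv1a32(chunk, CHUNK_SIZE);
    }
    return n;
}
```

Padding the final chunk matters because a file's last cluster on a freshly formatted volume is only partly filled; without the padding, the last hash of the original file could never match the hash of the whole cluster.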
Now let's mount the disk image and delete the JPEG files.
    $ hdiutil attach frtest2.dmg
    /dev/disk1    /Volumes/FRTEST2
    $ rm /Volumes/FRTEST2/4*.jpg
    $ ls -ao /Volumes/FRTEST2/
    total 48
    drwxrwxrwx  1 jcardent  16384 Jan 28 06:59 .
    drwxrwxrwt@ 4 root        136 Jan 28 06:59 ..
    drwxrwxrwx@ 1 jcardent   2048 Jan 28 06:59 .Trashes
    -rwxrwxrwx  1 jcardent   4096 Jan 26 06:30 ._.Trashes
    drwxrwxrwx  1 jcardent   2048 Jan 28 06:59 .fseventsd
    $ hdiutil detach /Volumes/FRTEST2/
    "disk1" unmounted.
    "disk1" ejected.
Re-examining the image in fatrecover confirms that the files are no longer listed in the root directory and that their associated clusters have been marked as unused in the FAT table.
    >> ls
    LISTING DIR: ROOT
    ATTR    SIZE      NAME                  1ST CLUSTER
    ------  --------  --------------------  -----------
    .H..D.         0  .fseventsd            (0x000006)
    .H..D.         0  .Trashes              (0x000002)
    .H...A      4096  ._.Trashes            (0x000003)
    ...V.A         0  FRTEST2.              (00000000)
    >> fat
    FAT TABLE ( KEY: .=free B=bad -=used X=last R=reserved)
    00000000-0000001F: RXX-XXX.........................
    00000020-0000003F: ................................
    00000040-0000005F: ................................
    00000060-0000007F: ................................
    00000080-0000009F: ................................
    000000A0-000000BF: ................................
    000000C0-000000DF: ................................
Checksumming the same clusters as before results in:
    >> checksum 0x7 6
    0007-000A: 621102b1 5f618b21 d2855912 f5e1af11
    000B-000C: 5587dd40 8988110b
    >> checksum 0xd 7
    000D-0010: e736c48d 2554ee35 e613f172 fe82cc2f
    0011-0013: de7bf92b 515eea02 acefd6c3
    >> checksum 0x14 9
    0014-0017: 73e7cfea cdc2d024 5625892b 560cd9b1
    0018-001B: e5257e13 49fddb98 44610606 6011d7c7
    001C-001C: eadda707
The same checksums! Although the files were deleted, their data remains unchanged in the file system: only the directory entries and the FAT entries were updated. This supports the first assumption that deleting files does not erase their associated data.
To recover the deleted files, all that is needed is to extract the data from the disk image and write it to a new file. I added the extract command to do just that:
    >> extract 0x7 6 recover1.jpg
    Extracting clusters: 0x07 0x08 0x09 0x0a 0x0b 0x0c
    >> extract 0xd 7 recover2.jpg
    Extracting clusters: 0x0d 0x0e 0x0f 0x10 0x11 0x12 0x13
    >> extract 0x14 9 recover3.jpg
    Extracting clusters: 0x14 0x15 0x16 0x17 0x18 0x19 0x1a 0x1b 0x1c
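Extracting a contiguous run boils down to one seek and a copy loop. The sketch below is an assumption-laden illustration, not the code from fatrecover.c: `extract_clusters` is a hypothetical name, and `data_offset` (the byte offset of cluster 2 in the image) and `cluster_size` are passed in directly, where a real tool would derive them from the boot sector.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Copy `count` consecutive clusters starting at cluster `first` from
 * `image` to `out`. Data clusters are numbered from 2, so cluster N
 * begins at data_offset + (N - 2) * cluster_size.
 * Returns 0 on success, -1 on a seek, read, or write error. */
int extract_clusters(FILE *image, FILE *out, long data_offset,
                     size_t cluster_size, uint32_t first, uint32_t count)
{
    unsigned char buf[4096];

    assert(image != NULL && out != NULL);
    assert(first >= 2);

    long start = data_offset + (long)(first - 2) * (long)cluster_size;
    if (fseek(image, start, SEEK_SET) != 0)
        return -1;

    size_t remaining = (size_t)count * cluster_size;
    while (remaining > 0) {
        size_t want = remaining < sizeof buf ? remaining : sizeof buf;
        if (fread(buf, 1, want, image) != want)
            return -1;                 /* short read: image truncated */
        if (fwrite(buf, 1, want, out) != want)
            return -1;
        remaining -= want;
    }
    return 0;
}
```

This only works for contiguous files. A fragmented file would need the cluster chain, which is exactly what deletion wipes from the FAT.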
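The recovered files are slightly larger than the originals because extract writes whole clusters, including the unused tail of each file's last cluster. Each recovered size is therefore a multiple of the 2 KB cluster size (for example, 12288 = 6 x 2048).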
    $ ls -ao images/4.1.0[378].jpg
    -rw-r--r--  1 jcardent  12278 Dec  4 21:16 images/4.1.03.jpg
    -rw-r--r--  1 jcardent  13670 Dec  4 21:16 images/4.1.07.jpg
    -rw-r--r--  1 jcardent  18166 Dec  4 21:16 images/4.1.08.jpg
    $ ls -ao recover*.jpg
    -rw-r--r--  1 jcardent  12288 Jan 28 10:50 recover1.jpg
    -rw-r--r--  1 jcardent  14336 Jan 28 10:50 recover2.jpg
    -rw-r--r--  1 jcardent  18432 Jan 28 10:50 recover3.jpg
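The size difference is simple rounding arithmetic. As a small sketch (the helper names `clusters_needed` and `slack_bytes` are hypothetical, and the 2048-byte cluster size is inferred from the sizes above):

```c
#include <assert.h>
#include <stdint.h>

/* Number of clusters needed to hold file_size bytes, rounded up. */
uint32_t clusters_needed(uint32_t file_size, uint32_t cluster_size)
{
    assert(cluster_size > 0);
    /* 64-bit intermediate avoids overflow for sizes close to 4 GB */
    return (uint32_t)(((uint64_t)file_size + cluster_size - 1) / cluster_size);
}

/* Unused bytes at the end of the file's last cluster. */
uint32_t slack_bytes(uint32_t file_size, uint32_t cluster_size)
{
    uint64_t allocated =
        (uint64_t)clusters_needed(file_size, cluster_size) * cluster_size;
    return (uint32_t)(allocated - file_size);
}
```

For 4.1.03.jpg, `clusters_needed(12278, 2048)` is 6 and `slack_bytes(12278, 2048)` is 10, which accounts for the difference between 12278 and 12288 bytes.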
Comparing the original and recovered files with ImageMagick's identify and compare utilities results in:
    $ identify images/4.1.0[378].jpg
    images/4.1.03.jpg JPEG 256x256 256x256+0+0 8-bit DirectClass 12kb
    images/4.1.07.jpg[1] JPEG 256x256 256x256+0+0 8-bit DirectClass 13.3kb
    images/4.1.08.jpg[2] JPEG 256x256 256x256+0+0 8-bit DirectClass 17.7kb
    $ identify recover[123].jpg
    recover1.jpg JPEG 256x256 256x256+0+0 8-bit DirectClass 12kb
    recover2.jpg[1] JPEG 256x256 256x256+0+0 8-bit DirectClass 14kb
    recover3.jpg[2] JPEG 256x256 256x256+0+0 8-bit DirectClass 18kb
    $ compare -metric RMSE images/4.1.03.jpg recover1.jpg null:
    0 (0)
    $ compare -metric RMSE images/4.1.07.jpg recover2.jpg null:
    0 (0)
    $ compare -metric RMSE images/4.1.08.jpg recover3.jpg null:
    0 (0)
Success! An RMSE of 0 means the recovered images are pixel-for-pixel identical to the originals, even though the recovered files carry a few extra trailing bytes. For further confirmation, I opened these files with Apple's Preview application and visually compared the recovered images to the originals - they were indeed identical.
It's important to note that this manual recovery process worked because we knew where the files originally were in the file system. In a real recovery scenario, the locations of the original files are unknown, so a method is needed to discover where each deleted file begins and ends. In the next post, I'll begin developing such a method.