Hooking IO Calls for Multi-Format Image Support
This article has been published in the Sleuthkit Informer #19: http://www.sleuthkit.org/informer/index.php

Revised in April 2005 to use the new command line switches for PyFlag 0.76.

Overview

Often when analysing hard disk images, the image is provided in a slightly different format from the expected partition dd image. This may happen because the image was split into multiple files, or because it was acquired using Encase (TM), which uses its own proprietary image file format.

Many forensic tools require the image to be in a specific format. For example, the Sleuthkit requires an uncompressed partition image, such as that obtained using the dd command line:

dd if=/dev/hda1 of=image.dd

If the raw disk was imaged instead, i.e. /dev/hda, the investigator is forced to use dd to "slice" the original image into partitions according to the partition table (note that the 63 sector skip is normally found from the partition table, using sfdisk, mmls or a similar tool):

dd if=disk_image.dd of=partition_image.dd bs=512 skip=63

If the original disk was very large to start with, this is a time consuming operation. It would be nice to have an abstraction layer which converts between the different image formats (a partition image vs. a disk image) on the fly, without having to copy the image again.

This functionality becomes even more desirable when considering the analysis of images which have been stored using compression. For example, the popular forensic package Encase (TM) stores images in a proprietary format called the Expert Witness Compression Format. This format provides compression as well as splitting large images into manageable parts. By providing a transparent abstraction layer it is possible to enable any tool to automatically support the image format.

Hooking IO for fun and profit

The PyFlag forensic package used to have an IO Subsystem patch for the Sleuthkit which enabled it to operate on a number of different file formats.
Although the Sleuthkit is an excellent tool, it soon became obvious that the same functionality was also required of other tools, like strings, sfdisk etc. Modifying the source code of an application also increased the code maintenance burden, since the IO subsystem patch had to be retrofitted as each version of the Sleuthkit was released. The developers of PyFlag had to find a better way. Ideally the solution would involve no source code modification, and allow arbitrary programs to handle the supported file formats transparently.

The obvious solution to this problem was an abstraction layer based on library hooking techniques. When a program wishes to perform an IO operation on a file (for example open, read or write the file), it is very rare that the program issues the kernel system call directly. Instead, most programs call the C library's open(), read() and write() functions as required. Since most programs are dynamically linked rather than statically linked, the C library code is linked in at run time by the dynamic linker.

Most dynamic linker implementations (and in particular the GNU libc dynamic loader) allow a library to be loaded first, before loading other system libraries. Also, once a library provides a required symbol, the linker stops searching for that symbol in other libraries. This property allows a library to "hook" a library function by simply masking it with a locally defined function of the same name.

An example serves to illustrate the technique. Assume we have the following program, written in pseudo C code:
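#include <fcntl.h>
#include <unistd.h>
#define SIZE 512                 /* arbitrary buffer size for illustration */
int main(void) {
    char buffer[SIZE];
    int fd = open("somefile", O_RDONLY);  /* resolved by the dynamic linker */
    read(fd, buffer, SIZE);
    close(fd);
    return 0;
}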
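To make the masking idea concrete, the following is a minimal sketch of a shared object that hooks open() and read() for a program like the one above. It is not PyFlag's implementation: the counters and the plain pass-through behaviour are illustrative assumptions, and it relies on the glibc call dlsym(RTLD_NEXT, ...) to locate the real libc function. A real hooker would return data from the wrapped image instead of passing the call straight through.

```c
#define _GNU_SOURCE            /* RTLD_NEXT is a GNU extension; define before any include */
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative counters only; they are not part of PyFlag. */
static unsigned long hooked_open_calls;
static unsigned long hooked_read_calls;

typedef int (*open_fn)(const char *, int, ...);
typedef ssize_t (*read_fn)(int, void *, size_t);

/* Same name and signature as libc's open(), so this definition masks it. */
int open(const char *path, int flags, ...)
{
    static open_fn real_open;
    mode_t mode = 0;

    if (!real_open)   /* look up the next definition of open(), i.e. libc's */
        real_open = (open_fn)dlsym(RTLD_NEXT, "open");
    if (!real_open) {
        errno = ENOSYS;
        return -1;
    }

    if (flags & O_CREAT) {     /* the mode argument is only passed with O_CREAT */
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    hooked_open_calls++;
    /* A real hooker would decide here whether path names a wrapped image. */
    return real_open(path, flags, mode);
}

/* Same idea for read(): intercept, then hand over to the real function. */
ssize_t read(int fd, void *buf, size_t count)
{
    static read_fn real_read;

    if (!real_read)
        real_read = (read_fn)dlsym(RTLD_NEXT, "read");
    if (!real_read) {
        errno = ENOSYS;
        return -1;
    }

    hooked_read_calls++;
    /* A real hooker would translate the request into the image format here. */
    return real_read(fd, buf, count);
}
```

Built as a shared object (for example with gcc -shared -fPIC hook.c -o hook.so -ldl, where hook.c and hook.so are placeholder names) and named in the LD_PRELOAD environment variable, these definitions are found by the dynamic linker before the ones in libc.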
When the example program is executed, it calls the C library's open function (which performs the actual system call). The program then reads some data from the file descriptor by calling the C library's read function, and finally calls the library's close function to close the file descriptor.

In the glibc implementation of the dynamic loader (the one used in most Linux systems), the environment variable LD_PRELOAD tells the linker that the named library should be loaded before any other libraries. If the desired symbol is present within the named library, it masks functions with the same name in other libraries.

In our case, we wish to hook the open(), read() and close() functions, hence we need to create a shared object (a library - we shall call it the hooker object) with these functions defined. After setting LD_PRELOAD to the location of the hooker object we have created, our library will trap all calls to the specified functions:

External program ---> Hooker object ---> real libc functions

The result is that, as far as the external program is concerned, it is operating on a simple partition image as would have been obtained using dd. In practice, however, the hooker object is able to read more complex images, emulating a simple partition image to the external program.

Implementation

The PyFlag iohooker tool implements this technique. Not only does it hook open, read, write etc., it also hooks the stream functions fopen, fread, fwrite etc. It currently supports many different external programs, such as dd, disktype, all Sleuthkit executables, strings and many more.

IOHooker is distributed in two components. The main component is a shared object called libio_hooker.so. In order to control this object, environment variables are set by a wrapper program: iowrapper.

For the purposes of demonstration we download the binary version of PyFlag. We untar the distribution in our home directory, and change directory into it.
The first step, before the iowrapper can be used, is to set the LD_LIBRARY_PATH environment variable. This is required to allow the dynamic linker to find libio_hooker.so. If we fail to set this properly, the linker cannot run the iowrapper:

~/pyflag$ ./bin/iowrapper -h
./bin/iowrapper: error while loading shared libraries: libio_hooker.so: cannot open shared object file: No such file or directory

After setting the LD_LIBRARY_PATH environment variable, we are able to run the iowrapper normally:
~/pyflag$ export LD_LIBRARY_PATH=`pwd`/libs/
~/pyflag$ ./bin/iowrapper
This program wraps library calls to enable binaries to operate
on images with various formats. NOTE: Ensure that libio_hooker.so
is in your LD_LIBRARY_PATH before running this wrapper.
Usage: ./bin/iowrapper -i subsys -o option prog arg1 arg2 arg3...
-i subsys: The name of a subsystem to use (help for a list)
-o optionstr: The option string for the subsystem (help for an example)
-f wrapped filename: All wrapped filenames will start
with this string. This is useful for programs that need to
open other files as well as the target file (for example
/usr/bin/file needs to open magic files as well).
Loading library now for hooking
The final message "Loading library now for hooking" confirms that the hooker object is properly initialised and ready. Let us first check to see what IO Subsystems are supported by the iowrapper:
~/pyflag$ ./bin/iowrapper -i help
Loading library now for hooking
Available Subsystems:
standard - Standard Sleuthkit IO Subsystem
advanced - Advanced Sleuthkit IO Subsystem
sgzip - Seekable Gzip format
ewf - Expert Witness Compression format
raid - Raid 5 implementation
Unhandled Exception(IO Error): No such IO subsystem: help
Each subsystem requires specific options that make sense for it. The Advanced IO subsystem allows users to specify arbitrary offsets, as well as multiple split image sets. We can get a more detailed explanation of these options:
~/pyflag$ ./bin/iowrapper -i advanced -o help
Loading library now for hooking
Advanced io subsystem options
offset=bytes Number of bytes to seek to in
the image file. Useful if there is some extra data at the start
of the dd image (e.g. partition table/other partitions)
file=filename Filename to use for split files.
If your dd image is split across many files, specify this parameter
in the order required as many times as needed for seamless
integration
A single word without an = sign represents a filename
to use
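As a rough illustration of what the offset and file options imply, the sketch below maps a position in the virtual image (the one the wrapped program believes it is reading) to a split file and a position within that file. The struct and function names are hypothetical and are not taken from PyFlag.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a split image: segment sizes in order. */
struct split_image {
    const uint64_t *segment_sizes; /* size in bytes of each split file */
    size_t segment_count;          /* number of split files */
    uint64_t base_offset;          /* bytes to skip at the start (offset=) */
};

/*
 * Translate virtual_pos (the offset seen by the wrapped program) into a
 * segment index and an offset inside that segment.
 * Returns 0 on success, -1 if the position lies beyond the last segment.
 */
int locate_segment(const struct split_image *img, uint64_t virtual_pos,
                   size_t *segment_index, uint64_t *segment_offset)
{
    /* Position within the concatenation of all segments.
       Overflow of this addition is not handled in this sketch. */
    uint64_t pos = img->base_offset + virtual_pos;

    for (size_t i = 0; i < img->segment_count; i++) {
        if (pos < img->segment_sizes[i]) {
            *segment_index = i;
            *segment_offset = pos;
            return 0;
        }
        pos -= img->segment_sizes[i];  /* skip past this segment */
    }
    return -1;
}
```

A hooked read() could use such a lookup to pick the right file, and then continue into the next segment whenever a request crosses a segment boundary.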
For our first example, we use the Sleuthkit's fls tool to list the files present in partition 6 of a hard disk image. The fls tool does not provide the option of selecting an offset into the image for the start of the filesystem, hence we need to wrap it. First we calculate the offset where the partition starts:

/pyflag# sfdisk -uS -l /tmp/test.dd
Disk /tmp/test.dd: cannot get geometry
Disk /tmp/test.dd: 0 cylinders, 0 heads, 0 sectors/track
read: Inappropriate ioctl for device
Warning: The partition table looks like it was made
for C/H/S=*/255/63 (instead of 0/0/0).
For this listing I'll assume that geometry.
Units = sectors of 512 bytes, counting from 0

Device         Boot      Start        End   #sectors  Id  System
/tmp/test.dd1               63      96389      96327  de  Dell Utility
/tmp/test.dd2  *         96390   19647494   19551105   7  HPFS/NTFS
/tmp/test.dd3         19647495   58733639   39086145   c  W95 FAT32 (LBA)
/tmp/test.dd4         58733640  117210239   58476600   5  Extended
/tmp/test.dd5         58733703   59328044     594342  82  Linux swap
/tmp/test.dd6         59328108  117210239   57882132  83  Linux

Partition 6 starts at sector 59328108 (in sectors of 512 bytes). We can therefore use the wrapper to force fls to read the file system located at that offset (note that the offset is specified in sectors):
~/pyflag$ ./bin/iowrapper -i advanced -offset 59328108s \
-filename /tmp/test.dd -- fls foo
Set file to read from as /tmp/test.dd
d/d 11: lost+found
d/d 32769: etc
l/l 12: cdrom
d/d 131073: var
...
d/d 3211272: opt
d/d 3555336: initrd
l/l 16: vmlinuz
Note that as far as fls is concerned it is opening and reading the file foo. It does not realise that foo does not exist, since the wrapper provides it with valid data.

For the next example, we used Encase (TM) to create an evidence file of a floppy disk. The file command is unable to determine what is stored inside the image, because it is encoded in the proprietary EWF format:

~/pyflag$ file test.e01
test.e01: data
~/pyflag$ hexdump -C test.e01 | head
00000000  45 56 46 09 0d 0a ff 00  01 01 00 00 00 68 65 61  |EVF...ÿ......hea|
00000010  64 65 72 00 00 00 00 00  00 00 00 00 00 b2 00 00  |der..........²..|
00000020  00 00 00 00 00 a5 00 00  00 00 00 00 00 80 00 10  |.....¥..........|

Let's wrap the hexdump program to show the contents of the raw image:

~/pyflag$ ./bin/iowrapper -i ewf -filename test.e01 -- hexdump -C test.e01 | head
00000000  eb 3c 90 4d 53 44 4f 53  35 2e 30 00 02 01 01 00  |ë<.MSDOS5.0.....|
00000010  02 e0 00 40 0b f0 09 00  12 00 02 00 00 00 00 00  |.à.@.ð..........|
00000020  00 00 00 00 00 00 29 fc  02 29 08 4e 4f 20 4e 41  |......)ü.).NO NA|
00000030  4d 45 20 20 20 20 46 41  54 31 32 20 20 20 33 c9  |ME    FAT12   3É|

From this hexdump it looks like the image is that of a FAT12 floppy disk. To confirm, we can run the file command over the image. Since file opens files other than the image (it needs to open the magic file), we need to prevent the hooker from hooking those other files (otherwise, when the file program tries to open its magic file, it would get the image instead). To this end we can use the -f flag to restrict hooking to files of a given name:

~/pyflag$ ./bin/iowrapper -i ewf -f test.e01 -filename test.e01 -- file test.e01
test.e01: x86 boot sector, code offset 0x3c, OEM-ID "MSDOS5.0", root entries 224, sectors 2880 (volumes <=32 MB) , sectors/FAT 9, serial number 0x82902fc, unlabeled, FAT (12 bit)

Sleuthkit's fls can be used on this Encase image:
~/pyflag$ ./bin/iowrapper -i ewf -f foo -filename test.e01 -- \
./bin/fls foo
r/r 9: gunzip.exe
r/r 11: Hiew.exe
r/r 12: tar.exe
r/r 22: cygwin1.dll
..
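The raw hexdump of test.e01 shown earlier starts with the bytes 45 56 46 09 0d 0a ff 00. Assuming these eight bytes act as the signature of the format (an assumption drawn from that one dump, not from a specification quoted in this article), a tool could test for them before choosing the ewf subsystem. The function name below is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* First eight bytes of the test.e01 hexdump: "EVF" 09 0d 0a ff 00. */
static const unsigned char EWF_SIGNATURE[8] = {
    0x45, 0x56, 0x46, 0x09, 0x0d, 0x0a, 0xff, 0x00
};

/* Returns 1 if the buffer starts with the assumed signature, 0 otherwise. */
int looks_like_ewf(const unsigned char *buf, size_t len)
{
    if (len < sizeof EWF_SIGNATURE)
        return 0;   /* too short to contain the signature */
    return memcmp(buf, EWF_SIGNATURE, sizeof EWF_SIGNATURE) == 0;
}
```

Applied to the two dumps above, the first bytes of test.e01 match, while the unwrapped FAT12 boot sector (eb 3c 90 ...) does not.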
Finally we wish to extract the Encase image into a standard dd image. We wrap dd and redirect the output to a file:
~/pyflag$ ./bin/iowrapper -i ewf -f test.e01 -filename test.e01 -- \
dd if=test.e01 of=/tmp/test.dd
Note: Encase images often span many individual segments, each in its own file. To enable users to specify all segments at the same time, shell globbing may be used. In the following example, the -f option specifies that files called foo should be hooked (i.e. when fls attempts to open foo, it will get the Encase image):

~/pyflag$ ./bin/iowrapper -i ewf -f foo -filename test.e* -- ./bin/fls foo

Remote Access to live systems

Sometimes we wish to analyse a live unix system remotely, for example to quickly check whether the system is compromised without having to acquire the entire image first. We can use our forensic tools to examine the remote raw device by using the remote IO subsystem.

Note: This type of analysis is quite fragile because the system is still live, and using its file system. The forensic tools access the raw device while it is being modified, which makes them susceptible to race conditions. For example, if a file is removed just as the forensic utility is accessing its directory inode, inconsistent data may be obtained. As a result, forensic tools may crash or provide inconsistent results. It is impossible, however, for the IO subsystem to alter the live system in any way (since the raw device is opened read only).

One of the common problems with accessing a remote system is authentication and encryption. Access to the raw device over the network could easily lead to a root compromise by disclosing sensitive system information (e.g. the shadow file). The problem of authentication and encryption is best left to dedicated programs, such as Secure Shell (ssh). This is the approach taken by the remote access IO subsystem. The only requirements on the live system are an ssh server, and the remote_server program (which may be compiled statically).

These are the steps required to access remote raw devices over the network:
The following is an example of a session which might be run on a remote target machine:
~/pyflag$ ./bin/iowrapper -i remote -host target \
-server_path /path/to/remote_server -device /dev/hda -- \
mmls -t dos foo
DOS Partition Table
Units are in 512-byte sectors
Slot Start End Length Description
00: ----- 0000000000 0000000000 0000000001 Primary Table (#0)
01: ----- 0000000001 0000000062 0000000062 Unallocated
02: 00:00 0000000063 0000096389 0000096327 Dell Utilities FAT (0xde)
03: 00:01 0000096390 0019647494 0019551105 NTFS (0x07)
04: 00:02 0019647495 0058733639 0039086145 Win95 FAT32 (0x0C)
05: 00:03 0058733640 0117210239 0058476600 DOS Extended (0x05)
06: ----- 0058733640 0058733640 0000000001 Extended Table (#1)
07: ----- 0058733641 0058733702 0000000062 Unallocated
08: 01:00 0058733703 0059328044 0000594342 Linux Swap / Solaris x86 (0x82)
09: 01:01 0059328045 0117210239 0057882195 DOS Extended (0x05)
10: ----- 0059328045 0059328045 0000000001 Extended Table (#2)
11: ----- 0059328046 0059328107 0000000062 Unallocated
12: 02:00 0059328108 0117210239 0057882132 Linux (0x83)
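The Start and Length columns above are in 512-byte sectors, while some tools expect byte offsets. The sketch below (the function names are illustrative) does the conversion with 64-bit arithmetic, which matters here because the Linux partition starts at byte 30375991296, far beyond the 4 GiB that a 32-bit offset can represent. The second helper checks a row of the table for consistency.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u  /* mmls reports "Units are in 512-byte sectors" */

/* Byte offset of a partition that starts at start_sector. */
uint64_t sector_to_byte_offset(uint64_t start_sector)
{
    return start_sector * SECTOR_SIZE;
}

/* Last sector of a partition, as printed in the End column. */
uint64_t last_sector(uint64_t start_sector, uint64_t length_sectors)
{
    return start_sector + length_sectors - 1;
}
```

For the NTFS row, a start of 96390 sectors corresponds to byte 49351680, and 96390 + 19551105 - 1 gives the listed end sector 19647494.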
We can now list the contents of the Windows partition:

~/pyflag$ ./bin/iowrapper -i remote -host target \
-server_path /path/to/remote_server -device /dev/hda \
-offset 0000096390s -- fls foo
d/d 12763-144-4: Documents and Settings
d/d 6672-144-3: DRIVERS
d/d 6941-144-6: I386
r/r 6915-128-3: IO.SYS
d/d 62628-144-5: LDIR
r/r 6916-128-3: MSDOS.SYS
d/d 16844-144-1: My Music
r/r 6671-128-3: NTDETECT.COM
r/r 6670-128-3: NTLDR
d/d 13231-144-4: Program Files
...

In the above analysis we use the following parameters:
Note: This analysis would easily reveal hidden files or directories, even in cases where kernel level rootkits are installed. This is because most kernel level rootkits trap system calls accessing files on the filesystem, but do not filter access to raw devices. Since fls reads the filesystem structures on the raw device, it is independent of the kernel's filesystem driver and filesystem related system calls. Although it is conceivable that a rootkit could filter the raw device to hide files, this would dramatically increase the complexity of the rootkit.

Conclusions

Library hooking is a powerful technique which enables a wrapper to be inserted between an arbitrary executable and the image. PyFlag has developed an image abstraction layer which allows arbitrary programs to automatically support a variety of forensic image formats transparently. The remote IO subsystem allows for the remote access and analysis of raw devices by forensic tools, making it possible to detect some kernel level rootkits remotely.