A fairly modern definition: A filesystem is a collection of files that we can treat as a unit.
An older definition: a contiguous area of disk space that we can use to store files, usually in a hierarchical fashion.
The primary place to find a filesystem is still a contiguous area of disk space on a single disk drive.
Other places include memory ("RAM" disks), RAID devices, and "virtual" filesystems such as those created by tools like sshfs using FUSE. Filesystems on flash memory devices are also becoming prevalent.
Even tape devices can be used to store and read filesystems, though generally not to write.
The steady march of technology has been pronounced in the area of spinning hard disk technology.
Even so, spinning drives still generally come in one of three common interface types: SCSI, ATA, and SATA.
SCSI drives are often found on high-performance systems, and were designed to be placed in long daisy chains. Unfortunately, a confusing welter of standards has been created in the SCSI world, and you have to be very careful about plugging random drives into random setups. The original type was "single-ended" and used a largish 50-pin connector (the infamous "Centronics" connector). Then came the 68-pin and 80-pin connectors, and the idea of "differential" voltage checks rather than
While ATA drives are now in strong decline, with sales falling every year, they still make up a large percentage of the installed base of drives. They inherited an old ribbon cable format, and actually use parallel wires to deliver data.
SATA drives are now the strongest sellers, and are increasing rapidly as a percentage of the installed base of drives. Generally, cables from a new SATA generation will work with older drives (not a feature commonly found with SCSI drives).
While most of the Linux/Unix world uses direct access to hard drives, the Microsoft world has generally relied on BIOS access using INT 0x13. The practical side of this is that the BIOS methods are often not reliable, particularly when trying to acquire disk sizes. It's safer to use tools that actually do direct access to hard disk drives. Modern drives all use LBA, and CHS has been a figment of the imagination for many, many years.
One thing to be aware of when trying to acquire a disk image is that more modern drives can have an HPA (Host Protected Area). This is a section of the disk that has been reserved for other purposes, and standard tools often don't see it. The most common reason for having an HPA is for a computer maker to have somewhere to store recovery programs and data. The HPA is located at the logical end of the drive (the highest LBA numbers), unless there is a DCO.
The way to detect an HPA is to compare the results of two ATA commands, READ_NATIVE_MAX_ADDRESS and IDENTIFY_DEVICE. Both of these report a maximum sector count, but the former always reports the device limit, whereas the latter reports only the number of sectors that are currently available. If the two differ, an HPA is present.
Another thing to be aware of is that newer drives can also have a DCO (Device Configuration Overlay). This was designed to make physically different drives appear to be the same size by the expedient of wasting space. A DCO can be detected by comparing the results of READ_NATIVE_MAX_ADDRESS, IDENTIFY_DEVICE, and DEVICE_CONFIGURATION_IDENTIFY.
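The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name and its three arguments are hypothetical, and it assumes you have already obtained the three values with a tool of your choice and normalized them to sector counts. It does not issue any ATA commands itself.

```python
def hidden_areas(dco_max, native_max, identify_max):
    """Return (hpa_sectors, dco_sectors) that ordinary tools would not see.

    All three arguments are sector counts (hypothetical inputs):
      dco_max      - reported by DEVICE_CONFIGURATION_IDENTIFY
      native_max   - reported by READ_NATIVE_MAX_ADDRESS
      identify_max - reported by IDENTIFY_DEVICE (what the OS normally sees)
    """
    if not (identify_max <= native_max <= dco_max):
        raise ValueError("expected identify_max <= native_max <= dco_max")
    hpa_sectors = native_max - identify_max   # hidden by an HPA
    dco_sectors = dco_max - native_max        # hidden by a DCO
    return hpa_sectors, dco_sectors
```

If both numbers come back as zero, neither an HPA nor a DCO is hiding space; anything else means a plain "dd" of the visible device would miss sectors.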
A very important subject in the "dead" analysis world is that of write blocking. Generally, you want to do hardware level write blocking. Fortunately, in the last few years, USB write blockers have been coming into the market which simplify the process of write blocking. Relatively inexpensive products such as WiebeTech's blockers and Digital Intelligence's are now available.
Acquisition of an image can be as simple as using the "dd" command on a drive. First, you would want to check for an HPA or DCO; then, using a write-blocker, copy the image to somewhere.
"Somewhere?" That's a good question. The old standby of writing to optical media, which is inherently less modifiable than most other technologies, has unfortunately been up-ended in many cases by the very rapid growth in the size of other media. You can walk into Costco and buy a 2 terabyte drive for $149.00, but the largest generally available optical technology is still stuck in the 25 gigabyte range.
Wherever you end up placing your image, you will want to make multiple "integrity" hashes of it. Those hashes need to be recorded somewhere, and some people, such as Brian Carrier, recommend jotting them down in your notebook. Programs such as md5sum and sha1sum are still adequate for simple error checking, but MD5 in particular is not trustworthy for anything other than that: MD5 is vulnerable to collision attacks.
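As a rough sketch of making several hashes in one pass over an image, here is a small Python function built on the standard hashlib module. The function name, the default algorithm list, and the chunk size are illustrative choices, not part of any particular forensic tool.

```python
import hashlib

def hash_image(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1024 * 1024):
    """Read the file at `path` once and return {algorithm: hex digest}."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as image:
        while True:
            chunk = image.read(chunk_size)   # read in chunks; images are large
            if not chunk:
                break
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Calling hash_image on the acquired image file gives values you can compare against md5sum and sha1sum output and copy into your notes. Including SHA-256 alongside MD5 is a hedge against the collision weakness mentioned above.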
For historical reasons, we have used the idea of a "partition", a logically contiguous (and maybe even physically so) area of drive space. While abstraction schemes such as IBM's LVM and even modern filesystems such as ZFS have been moving away from this idea, the old disk partition is still the most common means of storing filesystems.
The most common partitioning scheme is the MBR (or DOS) scheme. The MBR is physically in the first LBA sector (well, "the first 512 bytes" is more accurate now that 4k-sector drives are coming onto the market...)
$ od -x -Ad /tmp/firstblock
0000000 48eb d090 00bc fb7c 0750 1f50 befc 7c1b
0000016 1bbf 5006 b957 01e5 a4f3 bdcb 07be 04b1
0000032 6e38 7c00 7509 8313 10c5 f4e2 18cd f58b
0000048 c683 4910 1974 2c38 f674 b5a0 b407 0203
0000064 0080 8000 0841 0006 0800 90fa f690 80c2
0000080 0275 80b2 59ea 007c 3100 8ec0 8ed8 bcd0
0000096 2000 a0fb 7c40 ff3c 0274 c288 f652 80c2
0000112 5474 41b4 aabb cd55 5a13 7252 8149 55fb
0000128 75aa a043 7c41 c084 0575 e183 7401 6637
0000144 4c8b be10 7c05 44c6 01ff 8b66 441e c77c
0000160 1004 c700 0244 0001 8966 085c 44c7 0006
0000176 6670 c031 4489 6604 4489 b40c cd42 7213
0000192 bb05 7000 7deb 08b4 13cd 0a73 c2f6 0f80
0000208 f084 e900 008d 05be c67c ff44 6600 c031
0000224 f088 6640 4489 3104 88d2 c1ca 02e2 e888
0000240 f488 8940 0844 c031 d088 e8c0 6602 0489
0000256 a166 7c44 3166 66d2 34f7 5488 660a d231
0000272 f766 0474 5488 890b 0c44 443b 7d08 8a3c
0000288 0d54 e2c0 8a06 0a4c c1fe d108 6c8a 5a0c
0000304 748a bb0b 7000 c38e db31 01b8 cd02 7213
0000320 8c2a 8ec3 4806 607c b91e 0100 db8e f631
0000336 ff31 f3fc 1fa5 ff61 4226 be7c 7d7f 40e8
0000352 eb00 be0e 7d84 38e8 eb00 be06 7d8e 30e8
0000368 be00 7d93 2ae8 eb00 47fe 5552 2042 4700
0000384 6f65 006d 6148 6472 4420 7369 006b 6552
0000400 6461 2000 7245 6f72 0072 01bb b400 cd0e
0000416 ac10 003c f475 00c3 0000 0000 0000 0000
0000432 0000 0000 0000 0000 738c d0f4 0000 0180
0000448 0001 fe83 1e3f 003f 0000 9920 0007 0000
0000464 1f01 fe05 ffff 995f 0007 f4db 1d12 0000
0000480 0000 0000 0000 0000 0000 0000 0000 0000
0000496 0000 0000 0000 0000 0000 0000 0000 aa55
0000512
Important note: The final two bytes (0x55 0xaa, which od -x prints as the little-endian word "aa55") are the "magic number" identifying this as an MBR.
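A minimal Python sketch of that check, assuming you have the first 512 bytes of the drive or image in a bytes object. The function name is made up for illustration.

```python
def has_mbr_signature(sector):
    """True if a 512-byte first sector ends with the bytes 0x55 0xAA."""
    # On disk the order is 0x55 then 0xAA; od -x shows them swapped ("aa55")
    # because it prints 16-bit little-endian words.
    return len(sector) >= 512 and sector[510:512] == b"\x55\xaa"
```

For example, reading 512 bytes from /tmp/firstblock in binary mode and passing them to has_mbr_signature should return True for the dump shown above.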
$ udcli /tmp/firstblock
0000000000000000 eb48 jmp 0x4a
0000000000000002 90 nop
0000000000000003 d0bc007cfb5007 sar byte [eax+eax+0x750fb7c], 1
000000000000000a 50 push eax
000000000000000b 1f pop ds
000000000000000c fc cld
000000000000000d be1b7cbf1b mov esi, 0x1bbf7c1b
0000000000000012 06 push es
0000000000000013 50 push eax
0000000000000014 57 push edi
0000000000000015 b9e501f3a4 mov ecx, 0xa4f301e5
000000000000001a cb retf
000000000000001b bdbe07b104 mov ebp, 0x4b107be
0000000000000020 386e00 cmp [esi+0x0], ch
0000000000000023 7c09 jl 0x2e
0000000000000025 7513 jnz 0x3a
0000000000000027 83c510 add ebp, 0x10
000000000000002a e2f4 loop 0x20
000000000000002c cd18 int 0x18
000000000000002e 8bf5 mov esi, ebp
0000000000000030 83c610 add esi, 0x10
0000000000000033 49 dec ecx
0000000000000034 7419 jz 0x4f
0000000000000036 382c74 cmp [esp+esi*2], ch
0000000000000039 f6a0b507b403 mul byte [eax+0x3b407b5]
000000000000003f 028000008041 add al, [eax+0x41800000]
0000000000000045 0806 or [esi], al
0000000000000047 0000 add [eax], al
0000000000000049 08fa or dl, bh
000000000000004b 90 nop
000000000000004c 90 nop
000000000000004d f6c280 test dl, 0x80
0000000000000050 7502 jnz 0x54
0000000000000052 b280 mov dl, 0x80
0000000000000054 ea597c000031c0 jmp dword 0xc031:0x7c59
000000000000005b 8ed8 mov ds, eax
000000000000005d 8ed0 mov ss, eax
000000000000005f bc0020fba0 mov esp, 0xa0fb2000
0000000000000064 40 inc eax
0000000000000065 7c3c jl 0xa3
0000000000000067 ff740288 push dword [edx+eax-0x78]
000000000000006b c252f6 ret 0xf652
000000000000006e c28074 ret 0x7480
0000000000000071 54 push esp
0000000000000072 b441 mov ah, 0x41
0000000000000074 bbaa55cd13 mov ebx, 0x13cd55aa
0000000000000079 5a pop edx
000000000000007a 52 push edx
000000000000007b 7249 jb 0xc6
000000000000007d 81fb55aa7543 cmp ebx, 0x4375aa55
0000000000000083 a0417c84c0 mov al, [0xc0847c41]
0000000000000088 7505 jnz 0x8f
000000000000008a 83e101 and ecx, 0x1
000000000000008d 7437 jz 0xc6
000000000000008f 668b4c10be mov cx, [eax+edx-0x42]
0000000000000094 057cc644ff add eax, 0xff44c67c
0000000000000099 01668b add [esi-0x75], esp
000000000000009c 1e push ds
000000000000009d 44 inc esp
000000000000009e 7cc7 jl 0x67
00000000000000a0 0410 add al, 0x10
00000000000000a2 00c7 add bh, al
00000000000000a4 44 inc esp
00000000000000a5 0201 add al, [ecx]
00000000000000a7 006689 add [esi-0x77], ah
00000000000000aa 5c pop esp
00000000000000ab 08c7 or bh, al
00000000000000ad 44 inc esp
00000000000000ae 06 push es
00000000000000af 007066 add [eax+0x66], dh
00000000000000b2 31c0 xor eax, eax
00000000000000b4 89440466 mov [esp+eax+0x66], eax
00000000000000b8 89440cb4 mov [esp+ecx-0x4c], eax
00000000000000bc 42 inc edx
00000000000000bd cd13 int 0x13
00000000000000bf 7205 jb 0xc6
00000000000000c1 bb0070eb7d mov ebx, 0x7deb7000
00000000000000c6 b408 mov ah, 0x8
00000000000000c8 cd13 int 0x13
00000000000000ca 730a jae 0xd6
00000000000000cc f6c280 test dl, 0x80
00000000000000cf 0f84f000e98d jz dword 0xffffffff8de901c5
00000000000000d5 00be057cc644 add [esi+0x44c67c05], bh
00000000000000db ff00 inc dword [eax]
00000000000000dd 6631c0 xor ax, ax
00000000000000e0 88f0 mov al, dh
00000000000000e2 40 inc eax
00000000000000e3 6689440431 mov [esp+eax+0x31], ax
00000000000000e8 d288cac1e202 ror byte [eax+0x2e2c1ca], cl
00000000000000ee 88e8 mov al, ch
00000000000000f0 88f4 mov ah, dh
00000000000000f2 40 inc eax
00000000000000f3 89440831 mov [eax+ecx+0x31], eax
00000000000000f7 c088d0c0e80266 ror byte [eax+0x2e8c0d0], 0x66
00000000000000fe 890466 mov [esi], eax
0000000000000101 a1447c6631 mov eax, [0x31667c44]
0000000000000106 d266f7 shl [esi-0x9], cl
0000000000000109 3488 xor al, 0x88
000000000000010b 54 push esp
000000000000010c 0a6631 or ah, [esi+0x31]
000000000000010f d266f7 shl [esi-0x9], cl
0000000000000112 7404 jz 0x118
0000000000000114 88540b89 mov [ebx+ecx-0x77], dl
0000000000000118 44 inc esp
0000000000000119 0c3b or al, 0x3b
000000000000011b 44 inc esp
000000000000011c 087d3c or [ebp+0x3c], bh
000000000000011f 8a540dc0 mov dl, [ebp+ecx-0x40]
0000000000000123 e206 loop 0x12b
0000000000000125 8a4c0afe mov cl, [edx+ecx-0x2]
0000000000000129 c108d1 ror dword [eax], 0xd1
000000000000012c 8a6c0c5a mov ch, [esp+ecx+0x5a]
0000000000000130 8a740bbb mov dh, [ebx+ecx-0x45]
0000000000000134 00708e add [eax-0x72], dh
0000000000000137 c3 ret
0000000000000138 31db xor ebx, ebx
000000000000013a b80102cd13 mov eax, 0x13cd0201
000000000000013f 722a jb 0x16b
0000000000000141 8cc3 mov ebx, es
0000000000000143 8e06 mov es, [esi]
0000000000000145 48 dec eax
0000000000000146 7c60 jl 0x1a8
0000000000000148 1e push ds
0000000000000149 b900018edb mov ecx, 0xdb8e0100
000000000000014e 31f6 xor esi, esi
0000000000000150 31ff xor edi, edi
0000000000000152 fc cld
0000000000000153 f3a5 rep movsd
0000000000000155 1f pop ds
0000000000000156 61 popad
0000000000000157 ff26 jmp dword near [esi]
0000000000000159 42 inc edx
000000000000015a 7cbe jl 0x11a
000000000000015c 7f7d jg 0x1db
000000000000015e e84000eb0e call 0xeeb01a3
0000000000000163 be847de838 mov esi, 0x38e87d84
0000000000000168 00eb add bl, ch
000000000000016a 06 push es
000000000000016b be8e7de830 mov esi, 0x30e87d8e
0000000000000170 00be937de82a add [esi+0x2ae87d93], bh
0000000000000176 00eb add bl, ch
0000000000000178 fe4752 inc byte [edi+0x52]
000000000000017b 55 push ebp
000000000000017c 42 inc edx
000000000000017d 2000 and [eax], al
000000000000017f 47 inc edi
0000000000000180 656f outsd
0000000000000182 6d insd
0000000000000183 004861 add [eax+0x61], cl
0000000000000186 7264 jb 0x1ec
0000000000000188 20446973 and [ecx+ebp*2+0x73], al
000000000000018c 6b0052 imul eax, [eax], 0x52
000000000000018f 6561 popad
0000000000000191 640020 add [fs:eax], ah
0000000000000194 45 inc ebp
0000000000000195 7272 jb 0x209
0000000000000197 6f outsd
0000000000000198 7200 jb 0x19a
000000000000019a bb0100b40e mov ebx, 0xeb40001
000000000000019f cd10 int 0x10
00000000000001a1 ac lodsb
00000000000001a2 3c00 cmp al, 0x0
00000000000001a4 75f4 jnz 0x19a
00000000000001a6 c3 ret
00000000000001a7 0000 add [eax], al
00000000000001a9 0000 add [eax], al
00000000000001ab 0000 add [eax], al
00000000000001ad 0000 add [eax], al
00000000000001af 0000 add [eax], al
00000000000001b1 0000 add [eax], al
00000000000001b3 0000 add [eax], al
00000000000001b5 0000 add [eax], al
00000000000001b7 008c73f4d00000 add [ebx+esi*2+0xd0f4], cl
00000000000001be 800101 add byte [ecx], 0x1
00000000000001c1 0083fe3f1e3f add [ebx+0x3f1e3ffe], al
00000000000001c7 0000 add [eax], al
00000000000001c9 0020 add [eax], ah
00000000000001cb 99 cdq
00000000000001cc 07 pop es
00000000000001cd 0000 add [eax], al
00000000000001cf 0001 add [ecx], al
00000000000001d1 1f pop ds
00000000000001d2 05feffff5f add eax, 0x5ffffffe
00000000000001d7 99 cdq
00000000000001d8 07 pop es
00000000000001d9 00db add bl, bl
00000000000001db f4 hlt
00000000000001dc 121d00000000 adc bl, [0x0]
00000000000001e2 0000 add [eax], al
00000000000001e4 0000 add [eax], al
00000000000001e6 0000 add [eax], al
00000000000001e8 0000 add [eax], al
00000000000001ea 0000 add [eax], al
00000000000001ec 0000 add [eax], al
00000000000001ee 0000 add [eax], al
00000000000001f0 0000 add [eax], al
00000000000001f2 0000 add [eax], al
00000000000001f4 0000 add [eax], al
00000000000001f6 0000 add [eax], al
00000000000001f8 0000 add [eax], al
00000000000001fa 0000 add [eax], al
00000000000001fc 0000 add [eax], al
00000000000001fe 55 push ebp
00000000000001ff aa stosb
0000446 80   byte 0 is the boot flag
0000447 01   bytes 1-3 are the beginning CHS
0000448 01
0000449 00
0000450 83   byte 4 is partition type (Linux, in this case)
0000451 fe   bytes 5-7 are the ending CHS
0000452 3f
0000453 1e
0000454 3f   bytes 8-11 are the starting LBA address
0000455 00
0000456 00   Note that the first partition "always" starts
0000457 00   at 0x3f = sector 63...
0000458 20   bytes 12-15 are the size in sectors
0000459 99
0000460 07   0x00079920 = 497,952 512-byte sectors = 248,976 1k blocks
0000461 00
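The same byte-by-byte decoding can be done with Python's struct module. The sketch below is only an illustration: the function name and dictionary keys are made up, and ENTRY_1 is simply the 16 bytes at offsets 446-461 of the od dump above, typed in by hand.

```python
import struct

# First partition table entry, copied from offsets 446-461 of the dump.
ENTRY_1 = bytes.fromhex("8001010083fe3f1e3f00000020990700")

def parse_partition_entry(entry):
    """Decode one 16-byte MBR partition table entry into a dict."""
    if len(entry) != 16:
        raise ValueError("an MBR partition entry is exactly 16 bytes")
    # "<" = little-endian; B = 1 byte, 3s = 3 raw bytes, I = 4-byte unsigned
    boot, chs_start, ptype, chs_end, lba_start, sectors = struct.unpack(
        "<B3sB3sII", entry
    )
    return {
        "bootable": boot == 0x80,
        "type": ptype,
        "chs_start": chs_start,
        "chs_end": chs_end,
        "lba_start": lba_start,
        "sectors": sectors,
    }

info = parse_partition_entry(ENTRY_1)
print(hex(info["type"]), info["lba_start"], info["sectors"])
# prints: 0x83 63 497952
```

Halving the sector count gives 248,976 1k blocks, the same figure worked out by hand.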
[root@sophie ~]# fdisk -l

Disk /dev/sda: 250.0 GB, 250000000000 bytes
255 heads, 63 sectors/track, 30394 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xd0f4738c

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          31      248976   83  Linux
/dev/sda2              32       30394   243890797+   5  Extended
/dev/sda5              32       30394   243890766   8e  Linux LVM
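The CHS bytes in a partition entry can be cross-checked against the LBA fields using the geometry fdisk reports (255 heads, 63 sectors/track). The following Python sketch assumes the conventional MBR packing of CHS values (head in the first byte, sector in the low six bits of the second, cylinder in the remaining ten bits). The function names are illustrative.

```python
def decode_chs(raw):
    """Unpack the 3-byte CHS field of an MBR entry into (cylinder, head, sector)."""
    head = raw[0]
    sector = raw[1] & 0x3F                       # low 6 bits
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]   # high 2 bits + next byte
    return cylinder, head, sector

def chs_to_lba(cylinder, head, sector, heads=255, sectors_per_track=63):
    """Convert CHS to LBA; CHS sectors are numbered from 1, LBAs from 0."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)

start = chs_to_lba(*decode_chs(bytes([0x01, 0x01, 0x00])))
end = chs_to_lba(*decode_chs(bytes([0xFE, 0x3F, 0x1E])))
print(start, end)
# prints: 63 498014
```

The start agrees with the starting LBA of 0x3f, and the end equals 63 + 497,952 - 1. Keep in mind that CHS tops out at 1024 cylinders, so on a drive this size the CHS fields of later partitions are typically pegged at a maximum value and only the LBA fields are meaningful.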
Unfortunately, extended partitions are far more complicated.
Let's look at the second partition:
0000462 00   Bootable flag
0000463 00   Starting CHS
0000464 01
0000465 1f
0000466 05   Partition type (05 means "extended")
0000467 fe   Ending CHS
0000468 ff
0000469 ff
0000470 5f   Starting LBA
0000471 99
0000472 07
0000473 00
0000474 db   Size in sectors
0000475 f4
0000476 12   0x1d12f4db = 487,781,595 512-byte sectors
0000477 1d   = 243,890,797.5 1k blocks (fdisk shows this as "243890797+")
The most important thing to note is that sectors 1-62 are often not used by the operating system: 62 512-byte sectors come to 31k, which is plenty of room for bad stuff!
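One way to look at that gap is to scan it for sectors that are not all zeros. The Python sketch below is a simple illustration that works on an in-memory copy of the start of an image. The function name is hypothetical, and a non-zero sector is only a lead, not proof of anything, since boot loaders also commonly keep code in this region.

```python
def nonzero_gap_sectors(image, first_partition_lba, sector_size=512):
    """Return the LBAs between the MBR and the first partition that hold data.

    `image` is a bytes-like object containing at least the first
    `first_partition_lba` sectors of the drive image.
    """
    found = []
    for lba in range(1, first_partition_lba):       # sector 0 is the MBR itself
        sector = image[lba * sector_size:(lba + 1) * sector_size]
        if any(sector):                              # any non-zero byte
            found.append(lba)
    return found

gap_sectors = 63 - 1
print(gap_sectors, gap_sectors * 512)
# prints: 62 31744
```

With the first partition starting at sector 63, the gap is 62 sectors, or 31,744 bytes.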
Other things to note are that malware can play a *lot* of MBR tricks, if it's clever enough. Hiding sectors by making partition table entries unparseable by ordinary software is an uncommon but possible one.
Probably the most common abstraction is that of RAID (Redundant Array of Inexpensive Disks.) Here we take a number of disk devices, and aggregate them into a common RAID pool, and then partition that aggregation.
How can we do this? There are two main ways: hardware RAID, with some sort of RAID controller doing the work and presenting logical drives to the system, and software RAID, where the logical drives are maintained by the system.
RAID, and particularly hardware-based RAID-5 and its variants, makes forensics on just the drives very difficult. If you have a hardware RAID situation, the best recourse BY FAR is to keep those drives connected to that RAID controller!
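To give a feel for why a single RAID-5 member drive is hard to interpret on its own, here is a toy Python illustration of XOR parity, the general idea behind RAID-5. It is a sketch only: real controllers add striping, rotating parity placement, and vendor-specific layouts, none of which are modeled here, and the function name and sample blocks are made up.

```python
def xor_blocks(*blocks):
    """XOR equal-length blocks together byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)

data_0 = b"EVIDENCE"                      # block stored on drive 0
data_1 = b"FRAGMENT"                      # block stored on drive 1
parity = xor_blocks(data_0, data_1)       # block stored on drive 2

# If drive 1 is missing, its block can be rebuilt from the other two.
rebuilt = xor_blocks(data_0, parity)
print(rebuilt == data_1)
# prints: True
```

Each drive holds only every Nth chunk of the data plus parity that looks like noise, so without knowing the controller's stripe size and layout, the pieces are hard to put back in order.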