Please read pp. 379-488 of MF.
Working in the Linux environment is a lot easier. Running most programs, such as md5sum and sha1sum, does not generally require installing any new software. Depending on the distribution, less common programs such as ssdeep may well be in the distribution's repository.
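If you would rather script the hashing than call md5sum and sha1sum by hand, here is a minimal sketch using Python's standard hashlib module. The function name file_digests and the chunk size are just illustrative choices, not anything from MF.

```python
import hashlib

def file_digests(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at `path`.

    The file is read in chunks so large binaries are not loaded
    into memory all at once.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python digests.py SUSPECT_FILE ...
    for name in sys.argv[1:]:
        md5_hex, sha1_hex = file_digests(name)
        print(f"{md5_hex}  {sha1_hex}  {name}")
```

The hex digests should match what md5sum and sha1sum print for the same file.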
The old standby for dumping a binary was "od" (octal dump); we now have other programs such as "xxd", "ghex", and "hexdump" that might also be in a given distribution.
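To see what these dump programs are doing, here is a rough sketch of a hex dump in Python. The layout (offset, hex bytes, printable ASCII) only approximates what tools like xxd print; the function name hexdump and its width parameter are assumptions for illustration.

```python
def hexdump(data, width=16):
    """Format bytes as rows of: offset, hex bytes, printable ASCII."""
    pad = width * 3 - 1  # width of the hex column, including spaces
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in row)
        # Non-printable bytes are shown as "." in the text column.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)

if __name__ == "__main__":
    import sys
    # Dump the first 64 bytes of each file named on the command line.
    for name in sys.argv[1:]:
        with open(name, "rb") as fh:
            print(hexdump(fh.read(64)))
```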
Of course, one of the handiest is the old "file" program, which does a very credible job of identifying quite a few file types. Also of some use are "readelf -a" and of course "objdump -a". You can look at symbol tables (namelists) with "nm"...
For antivirus scanning, generally the easiest thing to do is to install packages like "clamav" and "clamav-update". You can then use "freshclam" to get the latest signatures, and "clamscan" to scan individual files.
Please ignore the "Japanese" translation on page 413 of MF; it's not really correct. The definitions for "Kaiten" and "Goraku" are close enough, but "wa" in this case is a particle (not a noun, as the MF book seems to think), and it marks the topic of the sentence ("Kaiten"). Most likely, it's referring to the malware named "kaiten"; see McAfee and PacketStorm's version of kaiten.c, for instance.
(If not, then it most likely refers to a sushi place ("kaiten sushi" refers to a sushi restaurant that puts the sushi trays on a circular course, such as a conveyor belt or maybe a circular water course); even less likely, but still conceivably, it might also be a reference to a WWII Japanese torpedo system called the "kaiten".)
The Google and AltaVista translators are okay for most languages; for Japanese, though, the best tend to be in Japanese.
Running "strings" over a suspect binary is almost always worth doing. While it's possible that a good programmer or packing/encryption has removed or obfuscated all strings, it's also possible that it hasn't.
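What "strings" does is simple enough to sketch in a few lines of Python: look for runs of printable ASCII bytes of some minimum length. This is only an approximation of the real program (which has more options and encodings); the name extract_strings and the default of 4 characters are assumptions for illustration.

```python
import re

def extract_strings(data, min_len=4):
    """Return runs of at least `min_len` printable ASCII bytes in `data`."""
    # 0x20 (space) through 0x7e (~) is the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

if __name__ == "__main__":
    import sys
    # Print the printable strings found in each file named on the command line.
    for name in sys.argv[1:]:
        with open(name, "rb") as fh:
            for s in extract_strings(fh.read()):
                print(s)
```

For instance, extract_strings(b"\x00\x01hello\x00ab\x00/bin/sh\xff") gives ["hello", "/bin/sh"]; the two-byte run "ab" is too short to be reported.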
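Using "ldd" shows which shared libraries a binary needs and where each one is actually found on the system. Run it only in a sandbox, since "ldd" may execute code from the binary it inspects. You can use "file" (if you like) to learn more about a shared library.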
Using "nm" lists a binary's symbols and their addresses, so you can figure out exactly where in memory a variable is located (provided the binary has not been stripped). If it shows a lot of information, you can use "gdb" (GNU debugger) to get a real feel for what the program is doing.
You can do an "strace -f PROGRAMNAME" in a sandbox (not a machine that you care about!) to see what's going on; this will show you all of the system calls made by this program and its children. If you find a lot of tasks created (not likely), you can use something like "strace -ff -o FILE PROGRAMNAME" to have the strace output for each task written to a separate file.
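strace output gets long quickly, so a first pass that just counts which system calls were made can be handy. Here is a minimal sketch that tallies syscall names from saved strace text. It assumes the usual line shapes (a bare call, a "[pid N]" prefix, or a leading pid number); the function name count_syscalls is an assumption, and the sample lines in the usage note are made up for illustration, not real output.

```python
import re
from collections import Counter

# Optional "[pid 1234] " or "1234 " prefix, then the syscall name and "(".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")

def count_syscalls(text):
    """Count syscall names in strace output; other lines are skipped.

    Signal lines ("--- SIG... ---"), exit lines ("+++ ... +++"), and
    "<... resumed>" continuations do not match and are ignored.
    """
    counts = Counter()
    for line in text.splitlines():
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Usage: strace -f -o trace.txt PROGRAMNAME, then pass trace.txt here.
    for name in sys.argv[1:]:
        with open(name, errors="replace") as fh:
            for syscall, n in count_syscalls(fh.read()).most_common():
                print(f"{n:8d}  {syscall}")
```

Given a hypothetical trace containing one execve line, one "[pid 4242] openat(...)" line, and two read lines, the counter would report execve 1, openat 1, read 2.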
We can install "upx" easily in the Unix/Linux world. This is a simple packer/compressor that can actually save space.
Using upx:
[langley@host Slidy]$ cp /usr/bin/emacs .
[langley@host Slidy]$ pwd
/media/disk/CIS-4930r/2010-01/Slidy
[langley@host Slidy]$ file emacs
emacs: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
[langley@host Slidy]$ upx emacs
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2008
UPX 3.03        Markus Oberhumer, Laszlo Molnar & John Reiser   Apr 27th 2008

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
  11102144 ->   2780472   25.04%   linux/ElfAMD   emacs

Packed 1 file.
[langley@host Slidy]$ file emacs
emacs: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, stripped
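Notice that "file" no longer tells you much after packing. One quick heuristic for spotting a upx-packed binary is to look for the ASCII marker "UPX!" that upx normally leaves in its output. Here is a minimal sketch; the function name find_upx_markers is an assumption, and anyone trying to hide the packing can alter or remove the marker, so a miss proves nothing.

```python
def find_upx_markers(data, marker=b"UPX!"):
    """Return the byte offsets at which `marker` occurs in `data`."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets

if __name__ == "__main__":
    import sys
    # Report marker offsets for each file named on the command line.
    for name in sys.argv[1:]:
        with open(name, "rb") as fh:
            hits = find_upx_markers(fh.read())
        verdict = "possibly UPX-packed" if hits else "no UPX marker found"
        print(f"{name}: {verdict} {hits}")
```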
Like the PE format in Windows, the Linux ELF format (also used by other folks) is related to COFF. You can look at /usr/include/elf.h for the exact layout, and use "readelf" and "objdump" to look at ELF files.
There are a number of options that are useful with "readelf":
With "objdump", try "-p" and "-a".
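The layout in /usr/include/elf.h is simple enough that you can read the first few header fields yourself. Here is a minimal sketch that decodes the class, byte order, type, and machine from the first 20 bytes of an ELF file. The function name parse_elf_ident is an assumption, and the lookup tables cover only a handful of common values from elf.h, not the full lists.

```python
import struct

# A few common values from elf.h; anything else is reported as "unknown".
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}           # e_ident[EI_CLASS]
ELF_DATA = {1: "LSB", 2: "MSB"}                    # e_ident[EI_DATA]
ELF_TYPES = {1: "relocatable", 2: "executable",
             3: "shared object", 4: "core"}        # e_type
ELF_MACHINES = {3: "x86", 62: "x86-64"}            # e_machine

def parse_elf_ident(header):
    """Decode basic ELF header fields, or return None if not an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    ei_class, ei_data = header[4], header[5]
    # e_type and e_machine are 16-bit fields at offsets 16 and 18,
    # stored in the byte order that EI_DATA announces.
    endian = ">" if ei_data == 2 else "<"
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "class": ELF_CLASSES.get(ei_class, "unknown"),
        "data": ELF_DATA.get(ei_data, "unknown"),
        "type": ELF_TYPES.get(e_type, "unknown"),
        "machine": ELF_MACHINES.get(e_machine, "unknown"),
    }

if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        with open(name, "rb") as fh:
            print(name, parse_elf_ident(fh.read(20)))
```

Compare the result against what "file" and "readelf -h" report for the same binary.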
One possibility for trying out a suspect binary is to use an emulation environment to see what, if anything, you can tell about the program's execution. In the Linux world, there is "wine" for running Windows binaries; try "strace -f wine BINARY" and see what happens.