Compilation and linking both (generally) introduce recognizable structure, such as symbol tables, linkage maps, and debugging information. These can help you distinguish what compilers and linkers might have been involved in the creation of an executable.
Of course, malware authors often remove identifying information (though apparently not in this case!). The main page for analyzing this trojan is also worth reading: http://www.skullsecurity.org/blog/?p=627.
As we have mentioned before, file "types" as indicated by extensions such as .exe or .jpg are completely unreliable for identifying files.
In the Unix/Linux world, we have long had file (and now we also have the very useful objdump for looking at executable characteristics), but there's no completely equivalent program from Microsoft for the Windows world. Outside of Microsoft's port of Unix file, MF lists GT2, which appears to offer quite a bit of functionality.
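The idea behind file, identifying content by its leading "magic" bytes rather than by its extension, can be sketched in a few lines of Python. This is only an illustration: the signature table below is a tiny hand-picked subset rather than a real magic database, and the names identify_bytes and identify_file are made up for this example.

```python
# Minimal sketch of content-based identification (the idea behind `file`).
# MAGIC_SIGNATURES is a small illustrative subset, not a real magic database.
MAGIC_SIGNATURES = [
    (b"MZ", "DOS/Windows executable (MZ header)"),
    (b"\x7fELF", "ELF executable or object"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"%PDF-", "PDF document"),
]


def identify_bytes(data: bytes) -> str:
    """Return a description based on the leading bytes, or 'unknown'."""
    for magic, description in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))
```

A file named holiday.jpg whose first two bytes are MZ would be reported as an executable here, which is exactly the mismatch worth noticing.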
Because of the vast proliferation of malware, many businesses and other types of organizations have sprung up to help fight the problem.
These can be good sources of signatures for identifying malware; for instance, the databases that ClamAV (aka ClamWin) provides can be used to identify problematic software.
I don't personally recommend using any online engines. You have no idea what's in the suspected malware, and you have no idea who you are giving the information to. If it turns out that keylogging information, for instance, has been buffered in the file you submit, you can be exposing sensitive information to unknown parties.
Of course, one thing to look for (as always) is strings. What might be in the strings that you find?
From MF, pp. 314-316:
MF points out on page 316 that strings can also be intentionally misleading.
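For experimenting with string extraction outside of the strings utility, here is a rough Python sketch. It assumes "string" means a run of printable ASCII characters of some minimum length; the second function looks for the same characters stored as UTF-16LE, which Windows binaries commonly use. The function names and the default minimum length of 4 are choices made for this example.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return runs of at least min_length printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def extract_wide_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return runs of printable ASCII characters stored as UTF-16LE."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    return [match.group().decode("utf-16-le") for match in pattern.finditer(data)]


def strings_in_file(path: str, min_length: int = 4) -> list[str]:
    """Collect both narrow and wide strings from a file on disk."""
    with open(path, "rb") as handle:
        data = handle.read()
    return extract_strings(data, min_length) + extract_wide_strings(data, min_length)
```

Keep MF's warning in mind when reading the output: anything found this way was put there by the author of the binary and may be deliberately misleading.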
While some malware is written in straight assembly language that potentially has very little linkage (and none if the writer can figure out how to have it executed directly rather than having the operating system load the executable into a process), other malware is written in higher-level languages that almost always require some linkage. (There aren't any malware writers using Forth or Factor, I guess!)
For instance, take a look at a variation on the classic web page A Whirlwind Tutorial on Creating Really Teensy ELF Executables in Linux
In the Windows world, DUMPBIN can be a lot of help with a standard binary produced by Microsoft's programming tools. It can list sections and what is linked in, and it can also list symbolic information, much as nm can in the Unix/Linux world. The Linux program ldd has been ported to the Windows world, and it is helpful for showing what is dynamically linked (but note that your own environment's PATH information can also influence dynamic linking characteristics!).
Metadata is not limited to Windows executables (although those binaries certainly can contain a surprising amount of metadata). Microsoft Office files can also embed a lot of metadata, although it is often very "stale", since many people simply re-use Office files in order to keep their embedded formatting. Other formats can also contain useful metadata, such as the EXIF information in JPEG and TIFF files. Adobe's formats may also contain metadata that can be useful.
What to look for:
There is even (at least occasionally) an Obfuscated C Code Contest, so obfuscation is not solely the province of hackers. Other folks, such as Zend, the PHP folks, have provided mechanisms such as bytecoding for the explicit purpose of obfuscation.
But the malware malefactors have taken such obfuscation to entirely new levels, creating "digital armor", using packing, encryption, and "binders".
The idea of packing is actually an old one. Back in the day, we had limited resources, be those memory, disk, or bit transportation (often via the U.S. mail and sneakernet.) Packing anything and everything conserved these resources. Even these days we use programs like zip, gzip, and other compressors to reduce the size of files. We even can build compression into a binary with upx.
But malefactors use packing for an entirely different purpose. They want to obscure their malware (and perhaps minimize its footprint), and packing is an efficient method to do so. They also use "embedded" packers, which are far less common in legitimate applications today. The decompression routine typically appears at the end of the file; the routine decompresses the executable and then executes it.
Encryption accomplishes roughly the same goal for malware as packing: it obscures the nature of the program.
Unlike packers, which remove redundancy and thus produce output that is detectably different from ordinary binaries (which typically have plenty of redundancy), encryptors don't (generally) set out to remove redundancy, although a strong cipher can reduce it as a side effect.
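One hedged way to put a number on the redundancy discussed above is the Shannon entropy of the byte values in a file or section: highly redundant data scores low, while compressed or strongly encrypted data approaches the maximum of 8 bits per byte. The sketch below is an illustration only. The function names are invented for this example, and the single-byte XOR is an assumed stand-in for a very weak encryptor, not something taken from MF.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy of the byte-value distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def xor_single_byte(data: bytes, key: int) -> bytes:
    """Hypothetical weak 'encryptor': XOR every byte with one key byte."""
    return bytes(b ^ key for b in data)


if __name__ == "__main__":
    sample = b"This program cannot be run in DOS mode." * 50
    print("plain :", shannon_entropy(sample))
    print("xored :", shannon_entropy(xor_single_byte(sample, 0x5A)))
```

Because a single-byte XOR only relabels byte values, the two printed figures come out the same: the bytes look different, but the redundancy is untouched. Running the same measurement over a packed binary gives a figure much closer to 8.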
The concept here is very reminiscent of the old days of computing, when we used "overlays" to separate and recombine bits of binaries in order to conserve memory space.
COFF, which has been used in several operating systems, was superseded by PE (portable executable) format in the Windows world.
This format has the typical ideas that one expects: sectioning of binaries into areas such as .text, .data, and .bss sections. Unfortunately, in its quest for backward compatibility, it also kept things such as the "MS-DOS stub" area, which can be used by attackers.
CFF Explorer is used extensively in the MF book for extracting data from Windows executables.
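To make the layout concrete, here is a small Python sketch that walks the start of a PE file by hand. It relies on these parts of the published format: the file begins with "MZ", the 4-byte little-endian value at offset 0x3C gives the offset of the "PE\0\0" signature, a 20-byte COFF header follows the signature, and the section table consists of 40-byte entries that each begin with an 8-byte name. The function names are invented for this example, and real tools such as DUMPBIN or CFF Explorer do far more validation.

```python
import struct


def find_pe_header(data: bytes):
    """Return the offset of the PE signature, or None if this is not a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        return None
    return e_lfanew


def dos_stub(data: bytes) -> bytes:
    """Bytes between the 64-byte DOS header and the PE signature."""
    offset = find_pe_header(data)
    return b"" if offset is None else data[0x40:offset]


def section_names(data: bytes) -> list[str]:
    """Names from the section table, e.g. .text and .data."""
    offset = find_pe_header(data)
    if offset is None:
        return []
    coff = offset + 4  # the COFF file header follows the 4-byte signature
    if len(data) < coff + 20:
        return []
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (optional_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + optional_size
    names = []
    for index in range(num_sections):
        entry = data[table + 40 * index: table + 40 * index + 40]
        if len(entry) < 40:
            break  # truncated or malformed section table
        names.append(entry[:8].rstrip(b"\x00").decode("ascii", errors="replace"))
    return names
```

Comparing the output of dos_stub against the stub that the standard toolchains emit is one quick way to spot a binary whose stub area has been put to other uses.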