Rooting Out Corrupted Code
Jon Lasser, 2002-12-11

Is there a backdoor on your system? A flawed but timely project from the Shmoo Group could help network administrators spot altered programs.

Sometimes it's easy to tell when you're dealing with an imposter. That Mona Lisa at your neighbor's yard sale is unlikely to be the real thing. When you see Elvis at the mall, you can be pretty sure that he's a fake, too.

Even on a computer it can be obvious. When you run strings against your ls binary and, among all of the other data it returns, you find gcc -shared -o /tmp/own.so /tmp/own.c;rm -f /tmp/own.c, you can be pretty sure that's not the real ls command. A fellow in my local Linux Users Group reported this recently, and he didn't need to be told that the system had been rooted.

Sometimes, however, it's more difficult to tell if there's a problem. The most common way to verify the integrity of a binary on a Unix system is by comparing a checksum of the actual file to the checksum of a known-good copy of that file. Tripwire and AIDE are popular system validation tools built on this premise.
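The comparison itself is simple. A minimal sketch in Python, assuming you already hold a trusted hex digest for the file; the function names and the choice of SHA-1 as the default are illustrative, not taken from Tripwire or AIDE:

```python
import hashlib

def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large binaries need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_known_good(path, known_good_hex, algorithm="sha1"):
    """True if the file's digest equals the recorded known-good digest."""
    return file_digest(path, algorithm) == known_good_hex.strip().lower()
```

A call such as matches_known_good("/bin/ls", recorded_digest) answers the question only if recorded_digest itself came from a source you trust.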

The checksum concept has its roots in the bad old days of 300 baud modems without built-in error correction, when the xmodem protocol guarded against downloads corrupted by line noise. On the sender's side, xmodem would break a file up into 128-byte chunks and run a check against each block that would produce a single number. After downloading a block of data, the receiver's system would run the same check against it. If both sides got the same answer, the receiver would ask for the next block of data; if the answers differed, then the receiver asked the sender to retransmit the corrupted data.
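The block-and-check scheme can be sketched in a few lines of Python. This is a simplified model, not a protocol implementation: it leaves out framing, block numbers, and acknowledgements, and it assumes the classic check of summing a block's bytes modulo 256, with the final short block padded with 0x1A. The function names are invented for the illustration.

```python
BLOCK_SIZE = 128
PAD = 0x1A  # assumed filler byte (Ctrl-Z) for a short final block

def split_blocks(data):
    """Break data into 128-byte blocks, padding the last one."""
    blocks = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        blocks.append(block.ljust(BLOCK_SIZE, bytes([PAD])))
    return blocks

def checksum8(block):
    """Eight-bit checksum: the sum of all bytes, modulo 256."""
    return sum(block) % 256

def blocks_to_resend(received_blocks, sent_checksums):
    """Indices of blocks whose checksum differs from what the sender computed."""
    return [
        i
        for i, (block, sent) in enumerate(zip(received_blocks, sent_checksums))
        if checksum8(block) != sent
    ]
```

Note the weakness built into a plain sum: two errors that cancel each other out, such as two bytes swapping places, leave the checksum unchanged.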

When Ward Christensen wrote xmodem in 1977, an eight-bit checksum seemed reasonable: there was only a 1 in 256 chance that a corrupted packet would be mistaken for the real thing. As file sizes edged upwards and the number of packets transmitted grew, it became clear that 8 bits were not enough to protect against even accidental changes. Later modem protocols used 16 or even 32 bits, lowering the chances of undetected corruption exponentially.
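To put rough numbers on that, here is a back-of-the-envelope calculation in Python. It rests on a simplifying assumption that is not part of any protocol specification: each corrupted block ends up with an effectively random checksum, so a k-bit check misses it with probability 1 in 2^k, independently of other blocks.

```python
def undetected_probability(bits, corrupted_blocks=1):
    """Chance that at least one corrupted block slips past a `bits`-wide check.

    Assumes each corrupted block independently matches the expected
    checksum with probability 2**-bits.
    """
    per_block = 2.0 ** -bits
    return 1.0 - (1.0 - per_block) ** corrupted_blocks

for bits in (8, 16, 32):
    print(bits, undetected_probability(bits), undetected_probability(bits, 100))
```

Under this model, a transfer that suffers 100 corrupted blocks has roughly a one in three chance of delivering at least one bad block unnoticed with an eight-bit check, which is why the growth in file sizes mattered.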

Delivering the Goods
Modern cryptographically-strong checksums work only a little differently: the whole file is reduced to a single fixed-length checksum or "hash," 128 bits long for MD5 and 160 bits long for SHA-1. But the MD5 and SHA-1 algorithms used for file integrity checksums are designed to be resistant not only to accidental modification but to intentional trickery: it is nearly impossible to produce two files with the same MD5 checksum, and it is mathematically infeasible to produce two meaningful files with the same checksum.
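A short Python illustration using the standard hashlib module shows the two properties in play: the digest length is fixed regardless of input size (128 bits for MD5, 160 for SHA-1), and a one-character change to the input yields an unrelated digest. The two sample scripts are made up for the demonstration.

```python
import hashlib

def digest_bits(algorithm):
    """Length in bits of the digest produced by the named algorithm."""
    return hashlib.new(algorithm).digest_size * 8

# Hypothetical file contents differing in a single character.
original = b"#!/bin/sh\necho hello\n"
tampered = b"#!/bin/sh\necho hellO\n"

for name in ("md5", "sha1"):
    print(name, digest_bits(name), "bits")
    print("  original:", hashlib.new(name, original).hexdigest())
    print("  tampered:", hashlib.new(name, tampered).hexdigest())
```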

Many vendors provide a method to validate checksums of installed files against the "correct" numbers: Red Hat's RPM package manager provides a verify option that can inform the user if the MD5 checksum of a file doesn't match what Red Hat expects. This is a wonderful tool if you are concerned with accidental corruption or an inexperienced attacker, but an experienced attacker can easily install an "updated" package that contains mangled checksums or could directly modify the MD5 checksums in the RPM database.

Recently, the Shmoo Group put together Known Goods, an Internet-accessible list of file hashes for various operating systems. MD5 and SHA-1 checksums are available for each file on 26 different Linux, FreeBSD, Mac OS X, and Solaris versions. If you're not sure that your system integrity is intact, you can just go and validate your file checksums against those at Known Goods.

That's the idea, anyway. In practice, first you have to boot from trusted media. Otherwise, if the system has been compromised with a kernel module rootkit, any trojaned binaries will be hidden from casual inspection, and if your md5 checksum program has itself been compromised, it will return checksums of the attacker's choosing.

Unfortunately, the system provided by Known Goods is of very limited utility. Checksums are available for only one revision of a file per operating system release: when Red Hat releases an update for OpenSSH, there's no way to tell whether Known Goods' checksum is for the original or the patched version. Without this information, it is impossible to tell whether the checksum doesn't match because you've been hacked or merely because you're up-to-date on your patches.
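The difference between a single-revision list and one that tracks vendor updates can be sketched as two lookup tables. Everything here is hypothetical: the file contents are stand-ins, the labels are invented, and this is not how Known Goods stores its data.

```python
import hashlib

def sha1_hex(data):
    return hashlib.sha1(data).hexdigest()

# Hypothetical contents standing in for two vendor builds of one binary.
ORIGINAL_SSHD = b"sshd build shipped with the release"
PATCHED_SSHD = b"sshd build from the vendor security update"

# One digest per file per release: a patched file looks like a trojan.
single_revision_db = {"/usr/sbin/sshd": sha1_hex(ORIGINAL_SSHD)}

# Every known-good digest per file, each labeled with its revision.
multi_revision_db = {
    "/usr/sbin/sshd": {
        sha1_hex(ORIGINAL_SSHD): "original release",
        sha1_hex(PATCHED_SSHD): "vendor update",
    }
}

def check_single(path, data):
    return "match" if single_revision_db.get(path) == sha1_hex(data) else "mismatch"

def check_multi(path, data):
    return multi_revision_db.get(path, {}).get(sha1_hex(data), "unknown: investigate")

print(check_single("/usr/sbin/sshd", PATCHED_SSHD))  # mismatch
print(check_multi("/usr/sbin/sshd", PATCHED_SSHD))   # vendor update
```

With the second table, a mismatch means something: the file is neither the shipped version nor any update the list knows about.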

More seriously, how do you know that you can trust this data? I've met several folks from Shmoo, and they seem to me to be great people. But do you know that you can trust them? If I say that you can trust them, can you trust me?

A Strong Start
In reality, I'm confident that Shmoo isn't trying to put one over on me. And of course I'm entirely trustworthy. But the Known Goods page isn't SSL-protected, and the data isn't cryptographically signed, so I can't be sure that it hasn't been modified either in transit or by someone who compromised their server.

The system could be dramatically improved with smart use of digital signatures.

There is a substantial difference between a cryptographically strong hash and a document digitally signed with a public-key cryptography tool such as GNU Privacy Guard. When an attacker puts a trojaned package on a Web server, he or she can generate a valid checksum for that package too. It'll be different from the old one, but it will be valid.
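The point is easy to demonstrate. In this Python sketch a dictionary stands in for a download site that serves a package alongside its checksum; the names and contents are invented for the illustration.

```python
import hashlib

def sha1_hex(data):
    return hashlib.sha1(data).hexdigest()

def publish(package_bytes):
    """Model a download site that serves a package next to its checksum."""
    return {"package": package_bytes, "checksum": sha1_hex(package_bytes)}

def naive_verify(site):
    """Check the package against the checksum served from the same site."""
    return sha1_hex(site["package"]) == site["checksum"]

site = publish(b"genuine release tarball")
genuine_checksum = site["checksum"]

# An attacker with write access replaces the package AND regenerates the checksum.
site = publish(b"trojaned release tarball")

print(naive_verify(site))                    # True: the trojan still "verifies"
print(site["checksum"] == genuine_checksum)  # False: but the checksum has changed
```

Only someone who recorded the genuine checksum beforehand, or who obtains it through a channel the attacker does not control, would notice the swap.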

In order to sign a file, however, you also need access to the signer's private key and his or her passphrase. Once signed, anyone with access to the signer's public key can verify that the packages were created or approved by the person who signed the files. Not only does Red Hat provide digital signatures within packages they distribute, but they also sign messages listing MD5 checksums of updates and ISO images that they make available.
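The asymmetry between signing and verifying can be shown with a deliberately toy model of an RSA-style signature in Python. This is an illustration only and is not secure: the key is built from two small, publicly known primes, there is no padding, and it is not how GNU Privacy Guard or RPM implement signatures. All names are invented for the sketch, and the modular-inverse form of pow requires Python 3.8 or later.

```python
import hashlib

# Toy key material: two well-known Mersenne primes. Real keys use far larger,
# randomly generated secret primes together with a padding scheme.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, known only to the signer

def digest_as_int(data):
    """Hash the data and reduce it to an integer below the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data, private_exponent=D):
    """Only the holder of the private exponent can produce this value."""
    return pow(digest_as_int(data), private_exponent, N)

def verify(data, signature):
    """Anyone holding the public values N and E can run this check."""
    return pow(signature, E, N) == digest_as_int(data)

checksum_list = b"a9993e364706816aba3e25717850c26c9cd0d89d  example.tar.gz\n"
signature = sign(checksum_list)
print(verify(checksum_list, signature))                 # True
print(verify(checksum_list + b"tampered", signature))   # False
```

An attacker who alters the checksum list can recompute hashes freely but cannot produce a matching signature without the private exponent.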

Similarly, Tripwire (but not AIDE) can sign its checksum database so that you can have a high level of confidence in its data.

MD5 checksums are better than nothing: lazy and incompetent attackers frequently fail to replace them when installing Trojans. Indeed, many of the recent trojaned software packages were caught when the checksums failed to verify. But the next generation of attackers will automate the checksum modification process, and we should be ready. That means using public-key cryptography to digitally sign our checksum information, or the files themselves.

Known Goods is a strong start, but it needs to support multiple file versions and some sort of digital signatures before we can rely on it. It would be even more useful if their database included popular source packages distributed over the Internet: attackers would have to hack not only the software distribution site but the checksum distribution site to fool people into using their code.

People rely on Internet distribution of software. It's time that we learned to tell the real thing from the dangerous imitation.

SecurityFocus columnist Jon Lasser is the author of Think Unix (2000, Que), an introduction to Linux and Unix for power users. Jon has been involved with Linux and Unix since 1993 and is project coordinator for Bastille Linux, a security hardening package for various Linux distributions. He is a computer security consultant in Baltimore, MD.

