Federico Biancuzzi interviews OpenSSH developer Damien Miller to discuss features included in the upcoming version 4.3, public key crypto protocol details, timing-based attacks and anti-worm measures.
Could you introduce yourself?
Damien Miller: I am one of the developers of OpenSSH and OpenBSD. I have been working on OpenSSH since starting the project to port it to other platforms (initially Linux) back in 1999, but found myself working more and more on the native OpenBSD version of OpenSSH and on the OpenBSD operating system itself as time went on. I also maintain a couple of other free software projects, most notably a collection of NetFlow tools (pfflowd, flowd and softflowd).
The upcoming OpenSSH version 4.3 will add support for tunneling. What type of uses is this feature suited for?
Damien Miller: Reyk and Markus' new tunneling support allows you to make a real VPN using OpenSSH without the need for any additional software. This goes well beyond the TCP port forwarding that we have supported for years - each end of an ssh connection that uses the new tunnel support gets a tun(4) interface which can pass packets between them. This is similar to the type of VPN supported by OpenVPN or other SSL-VPN systems, only it runs over SSH. It is therefore really easy to set up and automatically inherits the ability to use all of the authentication schemes supported by SSH (password, public key, Kerberos, etc.).
The tunnel interfaces that form the endpoints of the tunnel can be configured as either a layer-3 or a layer-2 link. In layer-3 mode you can configure the tun(4) interfaces with IP or IPv6 addresses and route packets over them like any other interface - you could even run a dynamic routing protocol like OSPF over them if you were so inclined. In layer-2 mode, you can make them part of a bridge(4) group to bridge raw ethernet frames between the two ends.
A practical use of this might be securely linking back to your home network while connected to an untrusted wireless net, being able to send and receive ICMP pings and to use UDP based services like DNS.
Like any VPN system that uses a reliable transport like TCP, an OpenSSH tunnel can alter packet delivery dynamics (e.g. a dropped transport packet will stall all tunnelled traffic), so it probably isn't so good for things like VOIP over a lossy network (use IPsec for that), but it is still very useful for most other things.
Some companies have included crypto features in their hardware, for example Intel included a PRNG in some chipsets, and VIA bundled a full hardware set of crypto functions in its recent CPUs. How and when can OpenSSH take advantage of specific types of hardware like these?
Damien Miller: OpenSSH depends on OpenSSL for cryptographic services and therefore depends on OpenSSL to take advantage of hardware facilities. On OpenBSD at least, this support is seamless - OpenSSL has hooks to directly use Via Padlock instructions (which are amazingly fast) or go via the crypto(4) device to use co-processors like hifn(4) or ubsec(4).
On other operating systems, OpenSSL needs some application support to tell it to load "engine" modules to provide access to hardware services. Darren Tucker has posted patches to portable OpenSSH to get it to do this, but we haven't received any test reports back yet.
Why did you increase the default size of new RSA/DSA keys generated by ssh-keygen from 1024 to 2048 bits?
Damien Miller: Firstly, increasing the default size of DSA keys was a mistake (my mistake, corrected in the next release) because unmodified DSA is limited by a 160-bit subgroup and SHA-1 hash, obviating most of the benefit of using a larger overall key length, and because we don't accept modified DSA variants with this restriction removed. There are some new DSA standards on the way that use larger subgroups and longer hashes, which we could use once they are standardized and included in OpenSSL.
We increased the default RSA keysize because of recommendations by the NESSIE project and others to use RSA keys of at least 1536 bits in length. Because host and user keys generated now will likely be in use for several years we picked a longer and more conservative key length. Also, 2048 is a nice round (binary) number.
Do you plan to add any other algorithm to generate/exchange keys? For example, why didn't you include an implementation of ECC, used by the NSA?
Damien Miller: ECC (Elliptic Curve Cryptography) has some speed and key size advantages, but there are two impediments to us using it. First, no ECC key exchange method has been specified for the SSH protocol. This isn't too much of a problem as the protocol has a great extension mechanism that allows us to define new methods without breaking other implementations or having to go begging to the IANA for a number reservation.
The second reason is more of a killer: many ECC methods are patented. The NSA made the press recently for licensing these patents, something that we have neither the means nor the desire to do. There are ECC methods that are not patented, but the whole area is a minefield that we don't really want to navigate. Also, some of the ECC methods that are patented are the optimizations that give ECC its performance advantage.
On modern machines, the key exchange isn't much of a delay anyway and can often be avoided by using the connection multiplexing support that has been available since openssh-3.9 (reusing the one SSH connection for multiple commands, file transfers or login sessions).
The recent version 4.2 "added support for the improved arcfour cipher modes from draft-harris-ssh-arcfour-fixes-02. The improves the cipher's resistance to a number of attacks by discarding early keystream output". Could you tell us something more?
Damien Miller: Remember that RC4 is a stream cipher: it generates a stream of random-looking bytes (based on the key you feed it) that you XOR with the data that you want to encrypt. Fluhrer, Mantin and Shamir found that the early part of this keystream can be correlated with the original key. This unfortunate property may be used to construct an attack that recovers the original key. In its strongest form, this attack is devastating (it is the basis of the 802.11 WEP crack for example) - fortunately the use of RC4 in the SSH protocol has been better engineered, but it still needed to be fixed.
An easy and computationally cheap way to avoid this attack is to simply discard this early keystream. These new cipher modes discard the first 1.5KB of keystream. This doesn't slow down the cipher at all, and so these modes are recommended for people who want to use a faster, but weaker cipher than AES. I.e. use these in favour of the original 'arcfour' cipher. A future release of OpenSSH will probably remove the old method from the default list of accepted ciphers.
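To make the discard idea concrete, here is a minimal Python sketch of RC4 with an optional keystream drop. It is an illustration only, not OpenSSH's code: the function names are hypothetical, and the 1536-byte constant is simply the "first 1.5KB" mentioned above expressed in bytes.

```python
DISCARD_BYTES = 1536  # the "first 1.5KB" of keystream, in bytes


def rc4_keystream(key: bytes, length: int, drop: int = 0) -> bytes:
    """Return `length` keystream bytes after throwing away the first `drop`."""
    # Key-scheduling algorithm: permute the state table using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Output generation: the first `drop` bytes are produced but never used.
    out = bytearray()
    i = j = 0
    for n in range(drop + length):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        if n >= drop:
            out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def rc4_crypt(key: bytes, data: bytes, drop: int = 0) -> bytes:
    """XOR `data` with the keystream; the same call encrypts and decrypts."""
    stream = rc4_keystream(key, len(data), drop)
    return bytes(a ^ b for a, b in zip(data, stream))
```

The discard costs a one-off 1536 iterations when the cipher is set up; the per-byte work afterwards is unchanged, which is consistent with the point that the cipher itself is not slowed down.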
Looking at the stats on your website, it seems that more than 70% of the servers print the banner SSH-1.5 or SSH-1.9. This means that they accept SSHv1. I think this is a bad thing. Why don't you remove the support for version 1 of the protocol from OpenSSH?
Damien Miller: We definitely need support for protocol 1 in the client, as lots of devices still only support that version. It is useful to have it in the server, at least so we can test the client. I would like to turn protocol 1 off by default in the server some time in the next few years, but we don't have any desire to remove it altogether.
Another statistic suggests that more than 80% of the SSH servers on the Internet run OpenSSH. I'm wondering if you have ever verified which version they are running, and what is the average behaviour of an OpenSSH administrator. Do people update the server as soon as a new release is available?
Damien Miller: Funny you mention this, we just completed another version survey with the assistance of Mark Uemura from OpenBSD Support Japan. The results of this should be going up on OpenSSH.com soon.
I don't have detailed OpenSSH version histories for usage surveys before last year's. Certainly the use of paleolithic versions (such as 2.x) is very infrequent, but beyond this it is difficult to tell how quickly users update - many vendors will keep relatively ancient versions (such as 3.1p1) on life-support with spot security fixes. This will avoid known security problems, but it doesn't give their users the benefit of any of the proactive work that we do, nor any of the new features.
It is worth noting that OpenBSD, which has a very conservative policy on its stable trees, typically updates supported OpenBSD releases to the latest OpenSSH version when it is released.
Talking about Microsoft Windows, we often hear the theory that the most used software is targeted by exploit writers because of its market share. However looking back at OpenSSH history I don't see a lot of big exploits, so I'm wondering if this happened because you focus on code quality and security, or maybe because OpenSSH is open source and for some reasons there is less honor in announcing a public exploit for it?
Damien Miller: I don't think being open source is any discouragement to exploit writers, quite the opposite - it is easier to find bugs and exploit them when you have the code. OpenSSH is a very attractive target too - it is ubiquitous and exploit-writers take some pride (or at least schadenfreude) in finding OpenBSD holes.
We are painfully cognisant of the impact that any security vulnerability would have across our users, so we take a lot of care to avoid them. Beyond the obvious practicalities of not making mistakes, we have procedural safeguards (every commit must be cross-checked and OKed by at least one other developer) and we use technical safeguards such as privilege separation.
Being very popular means also being a good platform for a worm. Did you adopt any specific measures to fight automated attacks?
Damien Miller: Privilege separation alone probably makes a worm targeting a bug in sshd impractical. An attacker would need to break into the unprivileged sshd process that deals with network communications and, because this just gives them access to an unprivileged and chrooted account, then exploit a second vulnerability to either break the privileged monitor sshd or escalate privilege via a kernel bug. This would add a fair amount of complexity, fragility and size to a worm - it would probably need to implement a fair chunk of the SSH protocol just to propagate.
We also implemented self re-execution at the c2k4 Hackathon. This changes sshd so that instead of forking to accept a new connection, it executes a separate sshd process to handle it. This ensures that any run-time randomizations are reapplied to each new connection, including ProPolice/SSP stack canary values, shared library randomizations, malloc randomizations, stack gap randomizations, etc.
Without re-exec, all sshd child processes would share the same randomizations. This would allow an attacker to exhaustively search for the right offsets and values for their exploit by making many connections (millions probably) to the server. With re-exec, each time they connect the values will all be different so there is no guarantee that they will ever stumble upon the right combination.
Another security improvement, just introduced in openssh-4.2 was the "zlib@openssh.com" compression method. This was an idea that Markus Friedl had after the last zlib vulnerability was published.
The SSH protocol has supported zlib compression for a long time, but the standard "zlib" protocol method requires this to be started early in the protocol: after key exchange, but (critically) before user authentication successfully completed. This exposes the compression code to unauthenticated users.
Our solution is to define a new compression method that still performs zlib compression, but delays its start until after user authentication has finished, so only authenticated users get to see it. This is another significant reduction in attack surface with effectively zero performance impact. This also makes the writing of a worm that targets the zlib code in OpenSSH impossible.
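The difference between the two methods comes down to when the compression code becomes reachable. The following Python sketch models only that gating decision; the `Connection` class and its attributes are hypothetical, and it uses one-shot `zlib.decompress` per payload purely for brevity rather than modelling how SSH actually streams compressed data.

```python
import zlib

METHODS = ("none", "zlib", "zlib@openssh.com")


class Connection:
    """Toy model of when a server is willing to run zlib on incoming data."""

    def __init__(self, method: str):
        if method not in METHODS:
            raise ValueError(f"unknown compression method: {method}")
        self.method = method
        self.kex_done = False
        self.authenticated = False

    def compression_active(self) -> bool:
        if self.method == "zlib":
            # Standard method: starts right after key exchange, so
            # unauthenticated peers can reach the zlib code.
            return self.kex_done
        if self.method == "zlib@openssh.com":
            # Delayed method: starts only after user authentication.
            return self.kex_done and self.authenticated
        return False

    def receive(self, payload: bytes) -> bytes:
        # zlib is only ever invoked once the gate above is open.
        if self.compression_active():
            return zlib.decompress(payload)
        return payload
```

With the delayed method, a peer that has completed key exchange but not authentication never causes `zlib.decompress` to run, which is the attack-surface reduction described above.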
What about the hashing of host names and addresses added to known_hosts files?
Damien Miller: Well, that is a defence against worms using SSH as a vector rather than exploiting bugs in OpenSSH itself. This came out of a study from Schechter, Jung, Stockwell, and McLain early this year that found that the hostnames in known_hosts files could be used to build a topological worm, in other words one that can find a fairly optimal path from an infected system to another vulnerable system. In this case, the worm would just use normal trust relationships (Kerberos/GSSAPI trusts, public keys or harvested passwords) to spread from system to system - no bugs in SSH implementations were required.
Hashing the hostnames in known_hosts removes a good source of target information for such potential worms at the cost of some convenience - it is difficult to manually edit the known_hosts file when the hostnames are gibberish. To reduce this inconvenience, we added some extra commands to ssh-keygen so it can look up, remove or rehash known_hosts entries. The HashKnownHosts option is still off by default, but we might consider turning it on by default if we hear enough success stories.
Note that there are plenty more sources of information to a worm that are outside of our control - shell histories, netstat output, etc. Once again, good fundamental security practices such as not sharing accounts and limiting the range of account trust (especially transitive trust) are still required.
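For readers curious what a hashed known_hosts entry looks like, here is a minimal Python sketch of the host field. It assumes the `|1|salt|hash` layout, with a random salt used as the HMAC-SHA1 key over the hostname and both parts base64-encoded; the function names are hypothetical, and the key type and public key that follow the host field on a real line are left out.

```python
import base64
import hashlib
import hmac
import os
from typing import Optional


def hash_host(hostname: str, salt: Optional[bytes] = None) -> str:
    """Build a hashed host field of the assumed form |1|<salt>|<hmac>."""
    if salt is None:
        salt = os.urandom(20)  # fresh salt per entry
    digest = hmac.new(salt, hostname.encode(), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode(),
        base64.b64encode(digest).decode(),
    )


def host_matches(hashed: str, hostname: str) -> bool:
    """Check a candidate hostname against a hashed host field."""
    parts = hashed.split("|")
    if len(parts) != 4 or parts[0] != "" or parts[1] != "1":
        return False
    salt = base64.b64decode(parts[2])
    expected = base64.b64decode(parts[3])
    actual = hmac.new(salt, hostname.encode(), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)
```

Someone who already knows a hostname can still confirm that it is present, which is what a lookup does, but the file can no longer simply be read as a list of targets.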
Did you develop any measure to fight timing based attacks?
Damien Miller: There are two classes of timing attacks, one of which matters and the other is not so important.
The not so important timing attacks allow active detection of which usernames are valid by differing timings in authentication failure, e.g. a valid username might take a little while to return (as the authentication method does the work of verifying their supplied credentials) where an invalid username might return quickly (as the authentication method returns early because it knows the username is invalid and destined to fail). We implement defences against these attacks by sending a fake username and credentials to the authentication backends. This hasn't been 100% effective when we delegate authentication to external libraries (e.g. PAM in portable OpenSSH) as they can do their own checks which return early anyway. I don't think these "attacks" matter that much because all they do is reveal the existence of something that isn't much of a secret anyway.
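The general idea of doing the same work for valid and invalid usernames can be sketched in Python as follows. This is an illustration of the technique rather than OpenSSH's implementation: the user table, the PBKDF2 parameters and the function names are all assumptions, and the sketch substitutes a fake stored record where the text describes passing fake credentials to the authentication backend.

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # hypothetical work factor


def _derive(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


# Hypothetical account database: username -> (salt, derived key).
_ALICE_SALT = os.urandom(16)
USERS = {"alice": (_ALICE_SALT, _derive("correct horse", _ALICE_SALT))}

# Fake record that is checked whenever the username does not exist.
_FAKE_SALT = os.urandom(16)
_FAKE_KEY = _derive("no such password", _FAKE_SALT)


def check_login(username: str, password: str) -> bool:
    known = username in USERS
    salt, stored = USERS[username] if known else (_FAKE_SALT, _FAKE_KEY)
    # The expensive derivation runs whether or not the user exists.
    candidate = _derive(password, salt)
    matched = hmac.compare_digest(candidate, stored)
    # An unknown user can never log in, even if the fake record matched.
    return matched and known
```

As noted above, this only helps when the check is under the server's control; an external library that returns early for unknown users would reintroduce the timing difference.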
The other class of timing attacks are more scary - these are attacks that allow a passive observer to recover information relating to authentication secrets such as passwords. Attacks of this type have been found by Solar Designer and independently by Song, Wagner and Tian.
A simple attack of this type is watching the early parts of the protocol for a packet which contains a response to a password request. With some knowledge of the protocol these are fairly easy to spot or guess and once an attacker has obtained one, they can directly recover the length of the password. To prevent this, OpenSSH pads passwords up to a minimum of 64 characters. After 64 characters, brute forcing attacks are infeasible anyway, unless you have picked an utterly stupid password like "a" x 64.
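The effect of padding on what an observer can learn is easy to show with a few lines of Python. This sketch assumes that the padded size is rounded up to the next multiple of 64 bytes, which is one way to realize the 64-character minimum described above; the function name and the exact rounding rule are assumptions, not a description of OpenSSH's code.

```python
BLOCK = 64  # assumed padding granularity, in bytes


def padded_length(secret_len: int, block: int = BLOCK) -> int:
    """Size an observer would see for a secret of `secret_len` bytes."""
    if secret_len < 0:
        raise ValueError("length cannot be negative")
    blocks = max(1, -(-secret_len // block))  # ceiling division, at least 1
    return blocks * block
```

Under this assumption, a 6-character and a 40-character password look the same on the wire, so the observed size no longer reveals the length of any password up to 64 characters.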
A stronger attack involves watching the protocol *after* the user has authenticated and established a session, looking for occasions where they type a password in, such as running "su" or ssh'ing to another host. Without countermeasures, these can be clearly distinguished in the protocol as a one-way stream of short packets (keystrokes) without replies because the server will disable TTY echo. This will give a passive observer information about password length and inter-keystroke timing. To defeat this attack, OpenSSH sends back fake replies when TTY echo is turned off.
There have been some attacks based on timing. For example you could spot a valid username, and then start a password guessing attack. Then a lot of system administrators started to get their logs full of failed logins. This is still an annoying problem and it's solved in very different ways; some people use the firewall to limit the number of active TCP connections to the box, others filter by source IP, and then there are all those tricks you can do on the system itself. Is there a plan to stop these annoying bots from being so efficient? Why don't you delay the login prompt by an increasing time like a standard console does?
Damien Miller: I wouldn't say that these attacks are too efficient - OpenSSH only allows a couple of guesses before the session is hung up and this is only after a fairly computationally expensive key exchange, so there is something of a natural rate limit already. As such, I don't believe that the protocol supports effective brute-force attacks against remotely sane passwords.
So what we see are worms that go for low hanging fruit - stupid password combinations like passwords that are the same as the username and are simple dictionary words - in other words, the stuff that people have been warned repeatedly about for over 20 years. These worms are only a nuisance for any site that implements even basic password complexity requirements, or implements retrospective password auditing (e.g. using John the Ripper).
As far as defences, I'm more interested in adding better authentication controls (such as, untrusted hosts need to authenticate with a key before a password) than failure delays. Failure delays don't help much anyway - a smart attacker will just run multiple attacks in parallel to defeat them. There might be a case for forcing some sort of client-puzzle to be solved between failed password authentication attempts, but that is more complexity.
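For readers unfamiliar with client puzzles, the idea can be sketched in Python as a hashcash-style proof of work. Nothing here is part of OpenSSH or the SSH protocol: the functions, the use of SHA-256 and the difficulty parameter are all hypothetical choices made for illustration.

```python
import hashlib
import itertools
import os


def make_challenge() -> bytes:
    """Server side: a fresh random challenge for each failed attempt."""
    return os.urandom(16)


def leading_zero_bits(digest: bytes) -> int:
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(challenge: bytes, nonce: int) -> bytes:
    return hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: search for a nonce, about 2**difficulty hashes on average."""
    for nonce in itertools.count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty:
            return nonce


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty
```

Unlike a fixed delay, the cost of a puzzle is paid in client CPU time per attempt, so running attempts in parallel does not make each one cheaper. The price is the extra protocol complexity mentioned above.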