Friday, April 25, 2008

Automatic patch-based exploit generation

This paper promises "automatic patch-based exploit generation". That claim is overstated: what the title describes isn't possible. By "exploit", the paper does not mean "working exploit", and that's an important difference. Generating a fully functional exploit by reverse engineering a patch takes a lot of steps. This paper automates only one of them, and only in certain cases.

Hackers have been reverse-engineering patches for a while now. In the old days, hackers and vendors would fully disclose all the information about a vulnerability. Those who discovered the bug would often publish a "proof of concept" exploit. This made a lot of people angry because they would get exploited before being able to patch their systems. Therefore, the industry has changed to be largely anti-disclosure, and few details of new vulnerabilities are now reported.

In response, the industry figured out how to get that information anyway. By comparing the patch against the previous version of the software, people can figure out what changed. Software makers are conservative and change the minimum necessary to fix the bug, so comparing the differences is pretty easy.

A good case study of this was the bug that led to Blaster. The discoverers ("Last Stage of Delirium") recognized that this was a major new class of bugs ("remote root in the default install of the Windows desktop") and refused to disclose any details about it. This was a major break from the industry norms at the time.

This frustrated me. I developed signatures for the ISS Proventia intrusion-prevention system. Without full disclosure or a proof-of-concept exploit, I could not create signatures to protect against the bug.

So we reverse-engineered the details anyway. We used a tool called "bindiff" to find out what changed, created our own proof-of-concept exploit from that, and then wrote signatures to protect against it. (For intrusion-detection buffs: these were "vulnerability" signatures, not "exploit" signatures.)

Nowadays, this is rather common. Many companies in our industry reverse-engineer patches on a regular basis. Black-hat groups in China and Eastern Europe also do this. For a major new vulnerability, chances are pretty good that somebody, somewhere, has a working exploit within the first 24 hours of a patch being released. Indeed, Errata Security does this a couple of times a month.

The steps are pretty straightforward.

First, you find the differences. Unfortunately, while the logic of the software is mostly unchanged, the way software build tools lay out the software changes a lot. You need tools like "bindiff" that can find the "logical" differences rather than the "exact" differences, and you often need some experience to quickly sort the irrelevant changes from the relevant ones.
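As a rough illustration of "logical" versus "exact" differences, here is a toy sketch in Python. It is not how bindiff works. The function names, the instruction listings, and the choice to compare only mnemonics are all assumptions made up for this example. The point is only that addresses shift between builds even when the logic doesn't, so you compare a normalized form instead of raw bytes.

```python
def normalize(instructions):
    """Keep only the mnemonic of each instruction (hypothetical normalization).

    Addresses and operands tend to shift between builds even when the
    logic is unchanged, so they are dropped before comparing.
    """
    return tuple(line.split()[0] for line in instructions)


def logical_diff(old_funcs, new_funcs):
    """Return the names of functions whose normalized bodies differ."""
    changed = []
    for name in sorted(old_funcs.keys() & new_funcs.keys()):
        if normalize(old_funcs[name]) != normalize(new_funcs[name]):
            changed.append(name)
    return changed


# Made-up disassembly for two builds of the same program.
OLD = {
    "parse_header": ["mov eax, [0x401000]", "call 0x401200", "ret"],
    "copy_name": ["mov ecx, [ebp+8]", "rep movsb", "ret"],
}
NEW = {
    # Same logic, but the build tools moved everything around.
    "parse_header": ["mov eax, [0x402000]", "call 0x402340", "ret"],
    # The patch added a length check before the copy.
    "copy_name": ["mov ecx, [ebp+8]", "cmp ecx, 0x40", "ja 0x402400",
                  "rep movsb", "ret"],
}

CHANGED = logical_diff(OLD, NEW)
print(CHANGED)
```

An exact comparison would flag both functions here. The normalized comparison flags only copy_name, which is the one a human would want to look at. Real tools use far better matching than this, and you still need experience to sort out the irrelevant changes.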

Second, you need to find out how to reach the changed code inside the function. This is what the proposed automated tool does. Given a function, it searches for inputs that cause the NEW code to execute, and consequently cause the OLD code to fail. Normal brute force over all possible inputs wouldn't work. The paper figured out clever ways to narrow the combinations that need to be tested.
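To make this step concrete, here is a minimal sketch in Python. It is not the paper's method, which I haven't described here. It swaps in a much simpler idea: instead of trying every input, probe only values near the constants that show up in the patch. The buffer size, both check functions, and the helper names are hypothetical.

```python
BUF_SIZE = 64  # hypothetical constant found in the patched code


def old_check(length):
    """Unpatched code: accepts any length."""
    return True


def new_check(length):
    """Patched code: rejects lengths that would overflow the buffer."""
    return length <= BUF_SIZE


def candidate_inputs(constants):
    """Yield values just around each constant seen in the patch.

    This stands in for whatever narrowing a real tool does; it avoids
    testing every possible input.
    """
    seen = set()
    for c in constants:
        for v in (c - 1, c, c + 1):
            if v >= 0 and v not in seen:
                seen.add(v)
                yield v


def find_trigger(old, new, constants):
    """Return the first input the old code accepts but the new code rejects."""
    for v in candidate_inputs(constants):
        if old(v) and not new(v):
            return v
    return None


TRIGGER = find_trigger(old_check, new_check, [BUF_SIZE])
print(TRIGGER)
```

The value it finds is an input to that one function, nothing more. Turning it into a packet or a file that actually gets there is the next step.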

Third, you need to reach that function. In my experience, this is the hardest part of exploit development: how do you create a network packet or a file formatted in just the precise way to reach that function?

Fourth, you need to figure out how to get code execution from the bug. For example, if the NX bit is stopping you, then you need a return-to-libc style payload to get around it. There are a lot of exploit frameworks that make this easy, but "easy" shellcode only works about half the time. The other half it's really, really tough, and it can often take several days to get right.

Fifth, you have to try it out on a bunch of different systems. Your exploit needs to be different for different versions of the target. You typically create versions for the specific targets you are interested in and leave the remaining targets unexploited. Otherwise you'd spend forever on each minor difference in potential targets.

The second step, the step this paper automates, takes the least amount of time. Indeed, I never would have identified it as a separate step (it's really part of the next step) because it's usually very easy. Sure, there are probably some difficult cases where this step might take more than a couple of minutes, but those cases are fairly rare. It's certainly an interesting development, but in the real world, it wouldn't significantly reduce the amount of time it takes to make fully functional exploits from reverse-engineered patches. However, that time is already worrisomely short, which means that while this paper shouldn't make you more scared, you should already be very scared.

The above article quotes "Microsoft has not taken adequate steps to make such attempts more difficult". I'm not sure I'd phrase it that way. Microsoft is under no obligation to do anything about this problem. It's true that they "don't do anything", but "don't do enough" is a different claim.

The reason is that fixing this is harder than people think. Users already complain that it takes too long for Microsoft to patch flaws, and that the patches are often buggy. Obfuscating changes would make this worse. Moreover, it would be an arms race: there are many simple things Microsoft could do to make reversing harder, but it would be easy to write tools to counteract them. Still, I would hope that Microsoft is researching this area heavily. While obfuscation is a bad idea for most bugs, it's something to consider for certain critical bugs.
