Friday, August 09, 2013

When did we start trusting bad code?

As we all know: programmers can't be trusted to do cryptography right. They'll make rookie mistakes like screwing up the random number generator.

But the converse is also true: cryptographers can't be trusted to write code.

We see that with Silent Circle. The company was founded by two of the most trustworthy cryptographers, Jon Callas and Phil Zimmermann. In terms of crypto, these guys rock. But in terms of code, they are untrustworthy. Silent Circle's coding practices are 15 years out of date. In other words, they are using the coding practices of 1990s-era Microsoft.

Take their recent bug, which was of the form:
memcpy(dst, src, ntohs(*(short*)pkt));
This is the classic buffer overflow you read about in the paper "Smashing the Stack for Fun and Profit". It reads a length from the input packet and trusts it, blindly copying that number of bytes from the source to the destination buffer. When a hacker puts a large number in the packet, larger than the size of the destination buffer, an overflow occurs, and the hacker can hack Silent Circle. So why aren't we, the cybersec community, mocking Silent Circle for this "rookie" mistake? There is no better-known vulnerability in C/C++ code than this, so why don't we point that out?
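To make the pattern concrete (the packet layout and function names here are hypothetical illustrations, not Silent Circle's actual code), here is a sketch of the same copy done with explicit bounds checking, clamping the claimed length against both the bytes actually received and the destination's size:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical packet layout: a big-endian 16-bit length field, then payload.
// A hand-rolled big-endian read stands in for ntohs() to keep this portable.
static uint16_t read_be16(const uint8_t* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

// Vulnerable pattern (don't do this): trusts the attacker-controlled length.
//     memcpy(dst, pkt + 2, read_be16(pkt));
//
// Safer pattern: never copy more than was received, and never more than
// the destination can hold. Returns the number of bytes copied.
static size_t safe_copy(uint8_t* dst, size_t dst_size,
                        const uint8_t* pkt, size_t pkt_size) {
    if (pkt_size < 2) return 0;                       // no length field present
    size_t claimed = read_be16(pkt);                  // attacker-controlled
    size_t avail = pkt_size - 2;                      // bytes actually received
    size_t n = std::min({claimed, avail, dst_size});  // clamp to every bound
    std::memcpy(dst, pkt + 2, n);
    return n;
}
```

Whether to clamp (as here) or reject the packet outright is a design choice; rejecting is usually better, since a mismatched length is itself a protocol violation.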

This isn't even the real problem, though. Everybody has bugs. In hindsight, all bugs look simple. No matter how simple or obvious, we shouldn't criticize someone for a single bug. What's important is how such bugs are handled. Does a company have a "secure development" process? Does a company have a "vulnerability response" process? Those are the important questions. (Update: I've had vulns as simple as this -- the point I'm trying to make is how vulns are handled).

And the answer for Silent Circle is that it doesn't really have such processes. It's had vulnerabilities, but lists no "advisories" on its website. They described the fix as "bug fixes" and not as "vulnerability fixes" (as shown in the picture above). Nowhere do they discuss their "secure development" strategies, and their behavior suggests that they don't have any.

Jon Callas even says he is "hoping that next time [there is a vuln] will be long enough from now that we don’t actually acquire skill at handling such problems". Uh, no. In the modern age, trustworthy companies practice vulnerability handling -- regardless if they have ever had one. You should run from any company that describes their vulnerability strategy as "hope".

As it stands today, Silent Circle is untrustworthy. While you can trust their cryptography, you can't trust that they won't hide vulnerabilities. While their crypto is robust, their coverup of their vulnerability has been like that of a snake-oil company.

Here are some things Silent Circle can do in order to win back our trust:
  1. Create an advisory page on their website that lists all their past vulnerabilities.
  2. Offer bounties for vulnerabilities.
  3. Label updates that fix vulnerabilities as "fix vulnerabilities" rather than "bug fixes".
  4. Document their vulnerability handling processes.
  5. Document their "secure development" processes.
In other words, to earn our trust, they need to behave like Microsoft of 2013, not Microsoft of 1998.

By the way, this blog post comes out of a discussion I had while drinking with Marsh Ray. He was defending Silent Circle, and I was yelling at him "since when did this behavior become acceptable?". Since when did "no advisories" become acceptable? Since when did "no SDL" become acceptable? Since when did covering up vulnerabilities, calling them "bug fixes", become acceptable?

Disclosure: I'm biased toward Silent Circle. When Congress comes for our encryption keys, and it will, it'll be Silent Circle's founders defending us. This makes me all the more frustrated with their untrustworthy code.

Here is their actual bug fix from last month:
void ZRtp::storeMsgTemp(ZrtpPacketBase* pkt) {
-  int32_t length = pkt->getLength() * ZRTP_WORD_SIZE;
+  uint32_t length = pkt->getLength() * ZRTP_WORD_SIZE;
+  length = (length > sizeof(tempMsgBuffer)) ? sizeof(tempMsgBuffer) : length;
    memset(tempMsgBuffer, 0, sizeof(tempMsgBuffer));
    memcpy(tempMsgBuffer, (uint8_t*)pkt->getHeaderBase(), length);

where "getLength()" is defined as:

    uint16_t getLength()           { return zrtpNtohs(zrtpHeader->length); };

    uint16_t zrtpNtohs (uint16_t net) { return ntohs(net); }


Anonymous said...

When did we start trusting bad code? -- When have we had any other choice? How much good code is there to trust?

Robert Graham said...

In answer to the above question: the issue isn't so much the code itself, but the vendor's responsible handling of vulnerabilities in the code.

Marc Durdin said...

Of course, for this to work, we have to stop rating companies on how secure their code is by counting the number of security advisories they issue. Otherwise, there is a real incentive to paper over security bug fixes.

Jerry Leichter said...

It's worth pointing out that things aren't necessarily as simple as they seem. I'm not going to defend code I haven't seen, but the fragment shown - whether the actual original code at the end, or the simplified example - might be fine. You really have to know the context.

The number of bytes being copied isn't "anything an attacker wants to insert". It's an unsigned short - at most 65535 bytes. Given current machines, that's a fairly small amount of memory. The surrounding code could be organized to guarantee that there's always 65K of "safe" space. For example, I've written fast copying code where one uses malloc() to allocate a large (say, a couple of meg) buffer and then sub-allocates sequential segments until there's no room left. Allocate "a couple of meg usable plus 65K ignored" and code like this is safe and maybe even reasonable.

Consider the repaired code. Is it definitely safe? Hardly: It's simply assuming that the passed pointer points to a memory region with some pre-defined constant number of bytes that can be safely written to. Is that assumption checked? How? Why is it inherently a better assumption than that there's always 65K of space that can be safely written to? Conversely, why is it safe to assume that the input length is correct - even if short enough to fit in some fixed output buffer? There have been (multiple) bugs in which an input length was fiddled leading to sensitive data located just after the intended input made it into an output buffer.

One way or the other, you're making assumptions about the inputs to a function that manipulates data - i.e., any non-trivial function. Are those assumptions at least written down so that a careful programmer can try to write calls consistent with them - and a reviewer can check if (a) the code works correctly if the assumptions are true; (b) calling code ensures the assumptions are valid? What assumptions can be enforced with certainty? (We know with certainty that the length is no more than 65K!)

Ideally, assumptions should be checked in small, centralized pieces of code. C++ has all sorts of issues when it comes to secure programming, but if you (say) defined a vector type with a length, you can make sure that the amount of memory allocated to the vector really is consistent with that length and that all accesses are properly bounds-checked. (Assuming that the code doesn't play nasty casting games - checkable - and that the compiler generates correct code, a much safer assumption than many others!)

-- Jerry
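Jerry's "small, centralized check" idea could be sketched like this (a minimal illustration, with hypothetical names): every write into the buffer goes through a single append() that enforces the capacity invariant, so no caller can repeat the unchecked-memcpy mistake.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// A buffer whose one write path enforces its capacity invariant.
// Callers cannot overflow it; a too-large write raises an exception
// at the single, auditable check point.
class CheckedBuffer {
    std::vector<uint8_t> data_;
    size_t cap_;
public:
    explicit CheckedBuffer(size_t cap) : cap_(cap) { data_.reserve(cap); }

    void append(const uint8_t* src, size_t n) {
        if (data_.size() + n > cap_)
            throw std::length_error("append would exceed buffer capacity");
        data_.insert(data_.end(), src, src + n);
    }

    size_t size() const { return data_.size(); }
};
```

The check lives in one place, so a reviewer can verify the invariant once instead of auditing every memcpy call site.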

Anonymous said...

Suggest you go read up on what "Smashing the Stack" means.

The memcpy thing is a buffer overflow, yes. But it's not smashing the stack.

Carlos said...

Speaking of reading, maybe, just maybe, before someone tells someone else to go read something, they should read it themselves.

Yes, the memcpy code can (assuming, as others have said, that the surrounding code doesn't prevent it in some way) be used to cause a buffer overrun. A buffer overrun that can be exploited to smash the stack.

That's actually the point of the "... for fun & profit" article...