Friday, August 09, 2013

The Rob Test: 12 Steps to Safer Code

Joel Spolsky has a famous list of "12 Steps to Better Code". I thought I'd create a similar list for safer, more secure code that's resilient against hackers.

The Rob Test
1. Do you use source control, bug tracking, and planning (i.e. GitHub basics)?
2. Do you have automated (one step, daily) builds?
3. Do you have automated regression/unit testing? Can you fix/release in 24 hours?
4. Do you reward testers for breaking things? (like fuzz testing)
5. Do your coders know basic vulns? (buffer-overflows, OWASP Top 10) Do you train them? Do you test new hires?
6. Do you know your attack surface? threat model?
7. Do you sniff the wire to see what's going on? (including sslstrip)
8. Do you have detailed security specifications as part of requirements/design?
9. Do you ban unsafe practices? (strcpy, SQL pasting, clear-text)
10. Do you perform regular static/dynamic analysis on code?
11. Do you have, and practice, an incident response plan? (secure@, bounties, advisories, notification)
12. Are your processes lightweight and used, or heavyweight and ignored?

1. Do you have source control and bug tracking?

The first question is whether you have "standard" development practices as described by Joel, such as source control and bug tracking. I use GitHub as an example because it has all the basics, without too much unnecessary fluff. If you don't do these basics, then you can't hope to have anything close to secure code.

Sadly, big corporate processes often don't come close to GitHub. They end up being so cumbersome and difficult to use that internal developers avoid them.

2. Do you have automated builds?

Your build process needs to be as simple as the standard "configure; make" combo. If you can't do this, you have serious development issues that'll prevent much of the rest of secure development. In particular, it'll hinder your ability to ship fixes in 24 hours.

3. Do you have automated regression/unit testing? Can you fix/release in 24 hours?

When a vulnerability is discovered in Chrome, Google can fix the bug and ship an update within hours. That's an update for Windows, Mac OS X, Linux, and Android -- and multiple versions of each, such as 64-bit/32-bit.

This should be considered the norm.

Firstly, in the long run, it saves an enormous amount of time even without considering security. It drastically shortens the feedback loop for developers, who find bugs instantly, even before they check in their code. They quickly learn to write better code, and some classes of bugs simply stop happening.

Secondly, it's an essential part of security response. Once a vulnerability has been announced, your tradeoff is between shipping slowly, leaving customers exposed while hackers exploit the bug, and shipping quickly, risking that your fix introduces a new bug that breaks other customers. Automated regression testing is what lets you pick the fast option with confidence.

The new norm needs to be more than "configure; make; make install". It needs to be:
$ configure
$ make
$ make install
$ make regress
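What does "make regress" actually run? At a minimum, a batch of automated checks covering past bugs and boundary conditions. Here's a minimal sketch in C using plain assert() -- the parse_username() function is a made-up stand-in for whatever you actually ship, and a real suite will be far bigger, but it doesn't need to be any fancier than this:

/* regress.c -- a minimal regression-test sketch. parse_username() is a
 * hypothetical stand-in for real product code. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* stand-in for real product code: accept usernames up to 100 chars */
static int parse_username(const char *input, char *out, size_t outlen)
{
    size_t len = strlen(input);
    if (len > 100 || len + 1 > outlen)
        return -1;              /* reject, don't silently truncate */
    memcpy(out, input, len + 1);
    return 0;
}

int main(void)
{
    char out[128];
    char toolong[200];

    /* normal case: a valid username is accepted */
    assert(parse_username("alice", out, sizeof(out)) == 0);

    /* boundary case: exactly 100 characters is still legal */
    memset(toolong, 'A', 100);
    toolong[100] = '\0';
    assert(parse_username(toolong, out, sizeof(out)) == 0);

    /* regression for a past bug: 101+ characters must be rejected */
    memset(toolong, 'A', 101);
    toolong[101] = '\0';
    assert(parse_username(toolong, out, sizeof(out)) != 0);

    printf("regress: all tests passed\n");
    return 0;
}

Hook something like this into "make regress" and the 24-hour fix becomes realistic: you change the code, the old bugs stay fixed, and you ship.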

4. Do you reward testers for breaking things?

Your quality-assurance efforts are broken, and the fault lies with the name: "quality assurance" means testers try to prove the system works, when they should be trying to prove it doesn't.

For example, if your requirement spec says "allows usernames up to 100 characters", your testing procedures need to include "enter username of 10 million characters". Sadly, most "quality assurance" processes test up to the 100 character limit, but nobody enters 101 characters to see what will happen.

This should include the classic "fuzz testing", where your testers/developers write tools with the express intent of trying to break the product by sending random junk at it.
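The crudest fuzzer is just a loop that throws random garbage at the same functions your testers already exercise, and counts a crash as a failure. A rough sketch, again with a made-up parse_username() standing in for your real code:

/* fuzz.c -- a crude fuzzer sketch: hammer the parser with random junk
 * and let crashes (segfaults, asserts, sanitizer reports) flag the bugs. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* stand-in for the real code under test -- swap in your own parser */
static int parse_username(const char *input, char *out, size_t outlen)
{
    size_t len = strlen(input);
    if (len >= outlen)
        return -1;
    memcpy(out, input, len + 1);
    return 0;
}

int main(void)
{
    char in[4096], out[128];
    srand((unsigned)time(NULL));

    for (long i = 0; i < 100000; i++) {
        size_t len = (size_t)rand() % (sizeof(in) - 1);
        for (size_t j = 0; j < len; j++)
            in[j] = (char)(rand() % 256);      /* random bytes, printable or not */
        in[len] = '\0';
        parse_username(in, out, sizeof(out));  /* a crash here is a test failure */
    }
    printf("fuzz: survived 100000 random inputs\n");
    return 0;
}

Run something like this under AddressSanitizer or valgrind and it will catch memory corruption even when the process doesn't visibly crash.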

5. Do your coders know the basics, like buffer overflows and SQL injection?

SQL injection is by far the most dangerous bug on the Internet, the one hackers use most often to break into websites. It's also the easiest to fix. Unfortunately, most web developers either don't know it exists, don't understand it, or don't believe it's a real threat. Hence, every time you hire a new developer or new consultant, they will add SQL injection flaws to your website. You need a process to stop this from happening, such as testing new hires, or training old developers.

It's not just SQL injection, it's the entire OWASP Top 10. Every one of your website developers must understand the OWASP vulnerabilities, or they will be doomed to repeat them.

The same is true of other languages. When developing code in C/C++/ObjectiveC, your developers need to know why buffer overflows are a big deal.
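If a developer can't explain why the first function below is dangerous and the second one isn't, they shouldn't be writing C for you. A minimal illustration (copy_name_bad()/copy_name_good() are made up for the example):

#include <stdio.h>
#include <string.h>

/* DANGEROUS: if 'name' is longer than 15 characters (plus NUL), strcpy()
 * writes past the end of 'buf', smashing the stack -- the classic buffer
 * overflow that attackers turn into code execution. */
void copy_name_bad(const char *name)
{
    char buf[16];
    strcpy(buf, name);
    printf("hello %s\n", buf);
}

/* SAFER: the copy is bounded by the destination size, and overlong
 * input is rejected instead of silently truncated. */
int copy_name_good(const char *name)
{
    char buf[16];
    if (strlen(name) >= sizeof(buf))
        return -1;                  /* reject untrusted overlong input */
    memcpy(buf, name, strlen(name) + 1);
    printf("hello %s\n", buf);
    return 0;
}

int main(void)
{
    copy_name_good("alice");
    /* copy_name_bad("AAAAAAAAAAAAAAAAAAAAAAAAAAAA");  -- don't */
    return 0;
}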

6. Do you know your attack surface and threat model?

I hate these terms because they sound complicated. It comes down to this: your code has two parts, the part that deals with the untrusted, hackable external world, and the part that deals only with internal matters. You need to understand the difference.

That's the problem with developers: they don't understand the difference between internal and external. For example, they think that if they put a hidden field in a web form, hackers can't change it, because it's "internal" to the app. Nope, it's external, and everything external can be manipulated by hackers.

Unfortunately, the languages/libraries that programmers use try very hard to hide this distinction (internal vs. external). Therefore, programmers have to unlearn a lot.
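Concretely: anything that arrives off the wire, out of a file, or from a form field is external, and has to be validated before you trust it. Here's a small sketch of the habit I mean, using a made-up length-prefixed message format:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A made-up wire format: [2-byte length][payload]. The length field came
 * off the network, so it's attacker-controlled -- external, not internal. */
int read_message(const uint8_t *pkt, size_t pktlen, uint8_t *out, size_t outlen)
{
    if (pktlen < 2)
        return -1;                          /* too short to even hold a header */

    uint16_t claimed = (uint16_t)((pkt[0] << 8) | pkt[1]);

    /* Never trust the claimed length: check it against what actually
     * arrived AND against the size of our own buffer. */
    if (claimed > pktlen - 2 || claimed > outlen)
        return -1;

    memcpy(out, pkt + 2, claimed);
    return (int)claimed;
}

int main(void)
{
    /* attacker claims 65535 bytes but only sent 3 -- must be rejected */
    uint8_t evil[5] = { 0xff, 0xff, 'a', 'b', 'c' };
    uint8_t out[64];
    printf("evil packet -> %d\n", read_message(evil, sizeof(evil), out, sizeof(out)));
    return 0;
}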

7. Do you sniff yourself? ...including with SSLstrip?

Everything communicates with the external world across a network. As part of your development processes, you need to eavesdrop on that network traffic and see what it's really doing. When you do this, you'll find obvious problems, such as passwords going across in the clear that any hacker can see. Sniffing your traffic is one of the first things a hacker will do -- you need to do it too.

This seems like a narrow/small requirement, but in my experience it's huge. What really happens on the network is a black box to most developers, yet it's the first thing hackers look at when attacking a product. Developers need more visibility into this.
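You don't need fancy tools to start: tcpdump or Wireshark on a developer's desk is enough. If you want it wired into your own test harness, a few lines of libpcap will do; here's a bare-bones sketch (it assumes libpcap is installed and that "eth0" is the right interface -- adjust for your setup, and run it with enough privileges to capture):

/* sniff.c -- print the length of every packet on the wire.
 * Build with: cc sniff.c -lpcap */
#include <pcap.h>
#include <stdio.h>

static void on_packet(u_char *user, const struct pcap_pkthdr *h, const u_char *bytes)
{
    (void)user; (void)bytes;
    printf("packet: %u bytes\n", h->len);
    /* real usage: look for cleartext passwords, unexpected hosts, etc. */
}

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *p = pcap_open_live("eth0", 65535, 1, 1000, errbuf);
    if (p == NULL) {
        fprintf(stderr, "pcap_open_live: %s\n", errbuf);
        return 1;
    }
    pcap_loop(p, -1, on_packet, NULL);   /* loop forever, printing packets */
    pcap_close(p);
    return 0;
}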

8. Is security part of your requirements and design?

Avoid heavyweight process here. Too much security in the requirement/design phase kills projects. But you need some.

The best thing to add to your requirements is the vulnerabilities already found in your products: take each one and turn it into a requirement for future projects. If you got burned because some coder added a "backdoor password" to one of your products, add "no backdoor passwords" to all future requirements.

With that said, some of the items listed above need to be part of the requirements spec. Instead of simply saying "usernames up to 100 characters in length", you also need to add things like "and safely rejects anything longer". Put enough language in the requirements spec to support that sort of development/testing downstream, so that budget/time gets allocated for it.

9. Do you forbid unsafe practices?

If you are writing C code, it shouldn't have "strcpy()" anywhere. It's easy to create header files that ban the function, and easy to automate a regression test that prevents it from ever being used.
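For example, GCC and clang let you "poison" an identifier so that any later use of it is a compile error. A header like the following, force-included into every compilation unit (e.g. with gcc's -include flag), is one way to do it; "banned.h" is just my made-up name for it:

/* banned.h -- force-include this everywhere (e.g. gcc -include banned.h)
 * so that unsafe functions become compile errors instead of code-review
 * arguments. Extend the list over time. */
#ifndef BANNED_H
#define BANNED_H

#include <string.h>   /* pull in the real declarations first */
#include <stdio.h>

#pragma GCC poison strcpy strcat sprintf gets

#endif /* BANNED_H */

Anything that truly must call one of the old functions gets isolated in a single audited file that's built without the force-include.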

If you are writing web code, it's easy to use parameterized queries instead of pasting strings together. Static analyzers easily find this problem.
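The web stack doesn't matter; every database API has a parameterized form. Since C is what I can show compactly, here's the idea against SQLite's C API -- the users table and the db handle are assumed to exist, and your own framework will have an equivalent:

#include <sqlite3.h>
#include <stdio.h>

/* WRONG: pasting the username into the SQL string. If 'username' is
 *   anything' OR '1'='1
 * the attacker rewrites your query.
 *
 * RIGHT: the query text is fixed, the username is bound as data, and no
 * amount of quoting tricks turns it back into SQL. */
int lookup_user(sqlite3 *db, const char *username)
{
    sqlite3_stmt *stmt;
    int found = 0;

    if (sqlite3_prepare_v2(db,
            "SELECT id FROM users WHERE name = ?", -1, &stmt, NULL) != SQLITE_OK)
        return -1;

    sqlite3_bind_text(stmt, 1, username, -1, SQLITE_TRANSIENT);

    while (sqlite3_step(stmt) == SQLITE_ROW) {
        printf("found user id %d\n", sqlite3_column_int(stmt, 0));
        found = 1;
    }
    sqlite3_finalize(stmt);
    return found;
}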

Your apps need to use SSL to communicate at a minimum. Encryption may be useful in other areas, too.

These are just examples to start with. Get these issues fixed in your development process, then over time start adding other unsafe practices to the banned list.

10. Do you perform regular static analysis on code?

Static analyzers find the most obvious bugs, like strcpy() or SQL string pasting. They have a lot of false positives, so in the long run you'll be changing your code simply to keep the analyzer quiet. This actually isn't a bad thing: even when a warning is a "false" positive, the flagged construct is usually worth avoiding anyway. Giving coders fewer options, and making things more obviously correct, pays off in the long run.

I wouldn't run static analyzers on every build, but I would run them on a "regular" basis.
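A typical example of the kind of change I mean: the analyzer complains that a variable might be used uninitialized because it can't prove every branch assigns it. Whether or not the warning is "real" today, the rewrite is more obviously correct (rate_before()/rate_after() are invented for the illustration):

/* Before: the analyzer can't prove 'rate' is always set, so it warns.
 * Maybe it's a false positive today -- until someone adds a new type. */
int rate_before(int type)
{
    int rate;
    if (type == 1) rate = 10;
    else if (type == 2) rate = 20;
    else if (type == 3) rate = 30;
    return rate;                 /* warning: 'rate' may be uninitialized */
}

/* After: initialize a safe default and the code is obviously correct,
 * to both the analyzer and the next human who reads it. */
int rate_after(int type)
{
    int rate = 0;                /* safe default for unknown types */
    if (type == 1) rate = 10;
    else if (type == 2) rate = 20;
    else if (type == 3) rate = 30;
    return rate;
}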

11. Do you have a response plan?

Bugs happen, no matter how smart you are. You need to have a response plan for when that happens.

The problem with your organization is that your response plan is likely the "Seven Stages of Denial". Your organization will work hard to deny that something is a major vulnerability. When a hacker calls tech support, your response will be "but you aren't a customer, therefore we don't care". No, the hacker isn't your customer -- he's just breaking into all your customers. When you pass the vulnerability up the chain, your project managers and developers will push back, trying to explain why it isn't a bug, or why it doesn't need to be fixed immediately, and why the person who reported it is evil and has a secret agenda.

Another problem you'll have is surprise. Every time an incident happens, it'll be unlike any incident that happened before. There's some process you can put in place to guide people, but the best thing to do is get somebody high up in the company who understands the problem and can push aside the various barriers in order to get it fixed in a timely manner. For example, you need a VP of engineering who can tell everyone to stay late, or a VP of marketing with enough clout to ship the fix even though it hasn't been fully tested.

Lastly, your problem is that you'll mislead customers. Whether it's on your website, or your marketing/sales underlings dealing one-on-one with customers, there is going to be a lot of lying to downplay the severity of the event, especially because your competitors will keep bringing it up. These lies will not serve you well. Instead, they will destroy your credibility with the market/customers. Your message should be "problems happen to everyone, the measure of a company is how well it responds, and we are responding responsibly", not "there is no severe problem".

You should have an "advisory" page that airs your dirty laundry, listing all the bad problems you've had in the past. It doesn't need to be obvious; you can bury it like you do your privacy policy. What this does is tell customers that you are honest and trustworthy, willing to tell the truth no matter how much it hurts. When your competitors use this against you, ask where their advisory page is. Every reputable company has one -- if your competitors don't, they aren't reputable. Ideally, when your competitors point out that you have problems, they'll be sending the potential customer to your website.

If your marketing brochures advertise any sort of security feature, such as encryption or anti-hacker defense, then you really should have a "bounty" program that pays people who can break it. Make it a small amount, like $1000, because no matter how much confidence you have in these features, hackers will break them. Moreover, hackers will break your product in ways that you think are "outside the rules" -- pay up anyway, fix the problem, and proudly display the effectiveness of your bug bounty program on your advisories page.

12. Are your processes lightweight and used, or heavyweight and ignored?

This is the biggest issue of all. Do you have the previous 11 processes in place in your company, but they aren't being followed because they require too much training or too much effort? Then dismantle them! Lightweight process that people actually follow is better than heavyweight process that is ignored.

I see a lot of "top down" process that fails in companies. The problem is that it attracts process wonks who want to create big solutions to the problems, but who never get their hands dirty with the details. I've seen complicated "safe coding" practices that never solve specific issues like "strcpy()". That's why I mention "strcpy()" specifically in this document: start with the details you know and build process from the bottom up, distrust nonsense process coming from the top down.




