Wednesday, April 21, 2021

Ethics: University of Minnesota's hostile patches

The University of Minnesota (UMN) got into trouble this week over a study in which researchers submitted deliberately vulnerable patches to open-source projects, in order to test whether hostile actors could do the same to hack things. After a UMN researcher submitted a crappy patch to the Linux kernel, kernel maintainers decided to rip out all recent UMN patches.

Both things can be true:

  • Their study was an important contribution to the field of cybersecurity.
  • Their study was unethical.

It's like Nazi medical research on victims in concentration camps, or U.S. military research on unwitting soldiers. Research can be wildly unethical and still produce useful knowledge.

I'd agree that their paper is useful. I would not be able to immediately recognize their patches as adding a vulnerability -- and I'm an expert at such things.

In addition, the sorts of bugs it exploits show a way forward in the evolution of programming languages. It's not clear that a "safe" language like Rust would be the answer. Linux kernel programming requires tracking resources in ways that Rust would consider inherently "unsafe". Instead, the C language needs to evolve with better safety features and better static analysis. Specifically, we need to be able to annotate the parameters and return values of functions. For example, if a pointer can't be NULL, then it needs to be documented as a non-nullable pointer. (Imagine if pointers could be "signed" and "unsigned", meaning they can sometimes be NULL or can never be NULL.)
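As a rough sketch of what such annotations can look like today, GCC and Clang already offer the nonnull and returns_nonnull function attributes as compiler extensions. The macro and function names below are hypothetical, made up for illustration; this is not how the kernel does it, and the attributes are hints to the compiler and static analyzers rather than a guarantee enforced at runtime.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro names. The attributes themselves are GCC/Clang
 * extensions, so they expand to nothing on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#define NONNULL_ARGS    __attribute__((nonnull))
#define RETURNS_NONNULL __attribute__((returns_nonnull))
#else
#define NONNULL_ARGS
#define RETURNS_NONNULL
#endif

/* The caller promises that name is never NULL, so a NULL check in the
 * body would be noise. The compiler can warn when a literal NULL is
 * passed at a call site. */
NONNULL_ARGS
size_t name_length(const char *name)
{
    return strlen(name);
}

/* No annotation: name may legitimately be NULL, so the check is required. */
size_t name_length_or_zero(const char *name)
{
    return name ? strlen(name) : 0;
}

/* The result is never NULL, so callers don't need to check it. */
RETURNS_NONNULL
const char *name_or_default(const char *name)
{
    return name ? name : "unknown";
}
```

With the contract written into the signature, a patch that adds a NULL check inside name_length() is visibly redundant, while one that removes the check from name_length_or_zero() is visibly wrong. A reviewer no longer has to reconstruct that from the surrounding code.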

So I'm glad this paper exists. As a researcher, I'll likely cite it in the future. As a programmer, I'll be more vigilant in the future. In my own open-source projects, I should probably review some previous pull requests that I've accepted, since many of them have been the same crappy quality of simply adding a (probably) unnecessary NULL-pointer check.

The next question is whether this is ethical. Well, the paper claims to have sign-off from their university's IRB -- their Institutional Review Board that reviews the ethics of experiments. IRBs were created to deal with the fact that many medical experiments were done on either unwilling or unwitting subjects, such as the Tuskegee Syphilis Study. All medical research must have IRB sign-off these days.

However, I think IRB sign-off for computer security research is stupid. Things like masscanning of the entire Internet are undecidable with traditional ethics. I regularly scan every device on the IPv4 Internet, including your own home router. If you paid attention to the packets your firewall drops, some of them would be from me. Some consider this a gross violation of basic ethics and get very upset that I'm scanning their computer. Others consider this to be the expected consequence of the end-to-end nature of the public Internet, that there's an inherent social contract that you must be prepared to receive any packet from anywhere. Kerckhoffs's Principle from the 1800s suggests that the core ethic of cybersecurity is exposure to such things rather than trying to cover them up.

The point isn't to argue whether masscanning is ethical. The point is to argue that it's undecided, and that your IRB isn't going to be able to answer the question better than anybody else.

But here's the thing about masscanning: I'm honest and transparent about it. My very first scan of the entire Internet came with a tweet "BTW, this is me scanning the entire Internet".

A lot of ethical questions in other fields come down to honesty. If you have to lie about it or cover it up, then there's a good chance it's unethical.

For example, the West suffers a lot of cyberattacks from Russia and China. Therefore, as a lone wolf actor capable of hacking them back, is it ethical to do so? The easy test is this: when discovered, would you say "yes, I did that, and I'm proud of it", or would you lie about it? I admit this is a difficult question, because it's posed in terms of whether you'd want to evade the disapproval of other people, when the reality is that you might not want to get novichoked by Putin.

The above research is based on a lie. Lying has consequences.

The natural consequence here is that now that UMN did that study, none of the patches they submit can be trusted. It's not just this one submitted patch. The kernel maintainers are taking a scorched-earth response, reverting all recent patches from the university and banning future patches from them. It may be a little hysterical, but at the same time, this is a new situation that no existing policy covers.

I partly disagree with the kernel maintainer's conclusion that the patches "obviously were _NOT_ created by a static analysis tool". This is exactly the sort of noise static analyzers have produced in the past. I reviewed the source file for how a static analyzer might come to this conclusion, and found it's exactly the sort of thing it might produce.

But at the same time, it's obviously noise and bad output. If the researcher were developing a static analyzer tool, they should understand that this is crap noise and bad output from the static analyzer. They should not be submitting low-quality patches like this one. The main concern that researchers need to focus on for static analysis isn't increasing detection of vulns, but decreasing noise.

In other words, the debate here is whether the researcher is incompetent or dishonest. Given that UMN has practiced dishonesty in the past, it's legitimate to believe they are doing so again. Indeed, "static analysis" research might also include research into automated ways to find subversive bugs. One might create a static analyzer to search code for ways to insert a NULL pointer check to add a vuln.

Now incompetence is actually a fine thing. The point of research is to learn things. Starting fresh without all the preconceptions of old work is also useful. That researcher has problems today, but a year or two from now they'll be an ultra-competent expert in their field. That's how one achieves competence -- making mistakes, lots of them.

But either way, the Linux kernel maintainer response of "we are not part of your research project" is valid. These patches are crap, regardless of which research project they are pursuing (static analyzer or malicious patch submissions).


Conclusion

I think the UMN research into bad-faith patches is useful to the community. I reject the idea that their IRB, which is focused on biomedical ethics rather than cybersecurity ethics, would be useful here. Indeed, it's done the reverse: IRB approval has tainted the entire university with the problem rather than limiting the fallout to just the researchers that could've been disavowed.

The natural consequence of being dishonest is that people can't trust you. In cybersecurity, trust is hard to win and easy to lose -- and UMN lost it. The researchers should have understood that "dishonesty" was going to be a problem.

I'm not sure there is a way to ethically be dishonest, so I'm not sure how such useful research can be done without the researchers or sponsors being tainted by it. I just know that "dishonesty" is an easily recognizable issue in cybersecurity that needs to be avoided. If anybody knows how to be ethically dishonest, I'd like to hear it.

Update: This person proposes a way this research could be conducted to ethically be dishonest:

1 comment:

Chuck Pergiel said...

How much open source software is secure? Has anyone checked?