Friday, October 21, 2016

Some notes on today's DNS DDoS

Some notes on today's DNS outages due to DDoS.

We lack details. As a techy, I want to know the composition of the traffic. Is it blindly overflowing incoming links with junk traffic? Or is it cleverly sending valid DNS requests, overloading the ability of servers to respond, and overflowing outgoing links (as responses are five times or more as big as requests)? Such techy details and more make a big difference. Was Dyn the only target? Why were non-Dyn customers affected?

Nothing to do with the IANA handover. So this post blames Obama for handing control of DNS to the Russians, or some such. It's silly, and there's not a shred of truth to it. For the record, I'm (or was) a Republican and opposed handing over the IANA. But the handover was a symbolic transition of a minor clerical function to a body that isn't anything like the U.N. The handover has nothing to do with either Obama or today's DDoS. There's no reason to blame this on Obama, other than the general reason that he's to blame for everything bad that happened in the last 8 years.

It's not a practice attack. A Bruce Schneier post created the idea of hackers doing "practice" DDoS. That's not how things work. Using a botnet for DDoS always degrades it, as owners of machines find the infections and remove them. The people getting the most practice are the defenders, who learn more from the incident than the attackers do.

It's not practice for Nov. 8. I tweeted a possible connection to the election because I thought it'd be self-evidently a troll, but a lot of good, intelligent, well-meaning people took it seriously. A functioning Internet is not involved in counting the votes anywhere, so it's hard to see how any Internet attack can "rig" the election. DDoSing news sources like CNN might be fun -- a blackout of news might make some people go crazy and riot in the streets. Imagine if Twitter went down while people were voting. With this said, we may see DDoS anyway -- lots of kids control large botnets, so it may happen on election day because they can, not because it changes anything.

Dyn stupidly uses BIND. According to "version.bind" queries, Dyn (the big DNS provider that is a major target) uses BIND. This is the most popular DNS server software, but it's the wrong choice. It's 10x to 100x slower than alternatives, meaning that they need 10x to 100x more server hardware in order to deal with DDoS attacks. BIND is also 10x more complex -- it strives to be the reference implementation that contains all DNS features, rather than a simple bit of software that just handles this one case. BIND should never be used for Internet-facing DNS; packages like Knot DNS and NSD should be used instead.
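
For readers who want to repeat the check, here is a minimal sketch of what a "version.bind" query looks like on the wire. It is a TXT question in the CHAOS class rather than the usual Internet class. The helper names (encode_name, build_version_bind_query) and the transaction ID are made up for illustration, and the sketch only builds the packet; it doesn't send it.

```python
import struct

QTYPE_TXT = 16    # TXT record type
QCLASS_CHAOS = 3  # CHAOS class, used for server-info names like version.bind


def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed DNS labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_version_bind_query(txid: int = 0x1234) -> bytes:
    """Build a DNS query asking for the TXT record of version.bind in class CHAOS."""
    # Header fields: id, flags (plain query), QDCOUNT=1, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, 0)
    question = encode_name("version.bind") + struct.pack("!HH", QTYPE_TXT, QCLASS_CHAOS)
    return header + question


packet = build_version_bind_query()
```

Sending these bytes over UDP to port 53 of a name server and reading the TXT string in the reply is the rest of the check. Operators can hide or change that string, so treat the answer as a hint rather than proof.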

Fixing IoT. The persistent rumor is that an IoT botnet is being used. So everybody is calling for regulations to secure IoT devices. This is extraordinarily bad. First of all, most of the devices are made in China and shipped to countries other than the United States, so there's little effect our regulations can have. Except that they would essentially kill the Kickstarter community coming up with innovative IoT devices. Only very large corporations can afford the regulatory burden involved. Moreover, it's unclear what "security" means. There's no real bug/vulnerability being exploited here other than default passwords -- something even the US government has at times refused to recognize as a security "vulnerability".

Fixing IoT #2. People have come up with many ways default passwords might be solved, such as having a sticker on the device with a randomly generated password. Getting the firmware to match a printed sticker during manufacturing is a hard, costly problem. I mean, they do it all the time for other reasons, but it starts to become a burden for cheaper devices. But in any event, the correct solution is connecting via Bluetooth. That seems to be the most popular solution these days, from Wemo to Echo. Most of the popular WiFi chips come with Bluetooth, so it's really no burden to make devices this way.

It's not IoT. The Mirai botnet primarily infected DVRs connected to security cameras. In other words, it didn't infect baby monitors or other IoT devices inside your home, which are protected by your home firewall anyway. Instead, Mirai infected things that were outside in the world that needed their own IP address.

DNS failures cause email failures. When DNS goes down, legitimate email gets reclassified as spam, and dropped by spam filters.

It's all about that TTL. You don't contact a company's DNS server directly. Instead, you contact your ISP's "cache". How long something stays in that cache is determined by what's known as the TTL or "time to live". Long TTLs mean that if a company wants to move servers around, they'll have to wait until caches have finally aged out old data. Short TTLs mean changes propagate quickly. Any company that had 24 hours as their TTL was mostly unaffected by the attack. Twitter has a TTL of 205 seconds, meaning it takes less than 4 minutes of DDoS against the DNS server to take Twitter offline. One strategy, which apparently Cisco OpenDNS uses, is to retain old records in its cache if it can't find new ones, regardless of the TTL. Using their servers, instead of your ISP's, can fix DNS DDoS for you.
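
To make the "retain old records" strategy concrete, here is a toy cache that behaves that way. It is only a sketch of the idea, not how OpenDNS or any real resolver is implemented. The class name ServeStaleCache, the fake upstream, and the 192.0.2.1 address are all invented for the example. Time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Entry:
    value: str
    expires_at: float  # absolute time, in seconds


class ServeStaleCache:
    """Toy cache that keeps answering from expired data when upstream is down."""

    def __init__(self, fetch: Callable[[str], Tuple[str, float]]) -> None:
        self._fetch = fetch  # returns (value, ttl_seconds) or raises OSError
        self._entries: Dict[str, Entry] = {}

    def resolve(self, name: str, now: float) -> str:
        entry = self._entries.get(name)
        if entry is not None and now < entry.expires_at:
            return entry.value  # still fresh: no upstream traffic at all
        try:
            value, ttl = self._fetch(name)
        except OSError:
            if entry is not None:
                return entry.value  # expired, but better than nothing
            raise  # never seen this name: nothing to fall back on
        self._entries[name] = Entry(value, now + ttl)
        return value


# Fake authoritative server that can be switched off to imitate the DDoS.
state = {"up": True}


def fake_fetch(name: str) -> Tuple[str, float]:
    if not state["up"]:
        raise OSError("authoritative server unreachable")
    return "192.0.2.1", 205.0  # 205 s is the Twitter TTL quoted above


cache = ServeStaleCache(fake_fetch)
first = cache.resolve("example.com", now=0.0)          # fetched from upstream
state["up"] = False                                    # the attack starts
during_ttl = cache.resolve("example.com", now=100.0)   # served from fresh cache
after_ttl = cache.resolve("example.com", now=1000.0)   # served from stale cache
```

A strict cache would have failed on the last lookup, since the record expired at 205 seconds. The trade-off is that stale answers can point at servers the company has since moved away from.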

Why not use anycast?

The attack took down only east-coast operations, attacking only part of Dyn's infrastructure located there. Other DNS providers, such as Google's famed 8.8.8.8 resolver, do not have a single location. They instead use anycasting, routing packets to one of many local servers, in many locations, rather than a single server in one location. In other words, if you are in Australia and use Google's 8.8.8.8 resolver, you'll be sending requests to a server located in Australia, and not in Google's headquarters.

The problem with anycasting is it technically only works for UDP. That's because each packet finds its own way through the Internet. Two packets sent back-to-back to 8.8.8.8 may, in fact, hit different servers. This makes it impossible to establish a TCP connection, which requires all packets be sent to the same server. Indeed, when I test it here at home, I get back different responses to the same DNS query done back-to-back to 8.8.8.8, hinting that my request is being handled by different servers.

Historically, DNS has used only UDP, so that hasn't been a problem. It still isn't a problem for "root servers", which serve only simple responses. However, it's becoming a problem for normal DNS servers, which give complex answers that can require multiple packets to hold a response. This is true for DNSSEC and things like DKIM (email authentication). That TCP might sometimes fail therefore means things like email authentication sometimes fail. That it will probably work 99 times out of 100 means that 1% of the time it fails -- which is unacceptable.
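
The usual mechanism behind this is truncation: when an answer doesn't fit in a UDP reply, the server sets the TC ("truncated") bit in the DNS header, and the client is expected to retry the same query over TCP. A minimal sketch of the client-side check follows. The function name needs_tcp_retry and the two sample headers are made up for illustration.

```python
import struct

TC_BIT = 0x0200  # "truncated" flag in the 16-bit flags word of the DNS header


def needs_tcp_retry(udp_response: bytes) -> bool:
    """Return True if a UDP DNS response is marked truncated."""
    if len(udp_response) < 12:
        raise ValueError("shorter than a DNS header")
    _txid, flags = struct.unpack("!HH", udp_response[:4])
    return bool(flags & TC_BIT)


# Two hand-built 12-byte headers: a normal response and a truncated one.
normal = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
truncated = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 0, 0, 0)

normal_needs_tcp = needs_tcp_retry(normal)
truncated_needs_tcp = needs_tcp_retry(truncated)
```

It is that TCP retry which has to land on the same server for every packet.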

There are ways around this. An anycast system could handle UDP directly and pass all TCP to a centralized server somewhere, for example. This allows UDP at max efficiency while still correctly dealing with the uncommon TCP. The point is, though, that for Dyn to make anycast work requires careful thinking and engineering. It's not a simple answer.
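
The UDP/TCP split described above boils down to a dispatch decision at each anycast node. The sketch below only illustrates that decision. The site names and the choose_handler function are hypothetical, and nothing here reflects how Dyn or anyone else actually builds it.

```python
from enum import Enum


class Transport(Enum):
    UDP = "udp"
    TCP = "tcp"


def choose_handler(transport: Transport, local_site: str, central_site: str) -> str:
    """Answer UDP at the node that received it; hand TCP to one fixed place."""
    if transport is Transport.UDP:
        return local_site   # each packet stands alone, so any node can answer
    return central_site     # a connection needs every packet at the same server


udp_handler = choose_handler(Transport.UDP, "anycast-node-sydney", "central-tcp-cluster")
tcp_handler = choose_handler(Transport.TCP, "anycast-node-sydney", "central-tcp-cluster")
```

The hard engineering is in everything this leaves out, such as how the TCP traffic is tunneled to the central site and what happens when that site is itself the target.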

6 comments:

  1. "The market can't fix this because neither the buyer nor the seller cares." - Bruce Schneier

  2. Thanks god dailystreamz.com is still working ! I can watch online movies all this weekend :D


  3. Do we have any info on the DVRs that were affected, then? There must be quite a large number of them to be used for such an attack, yes?

  4. A slight quibble - TCP anycast is a reasonable proposition (refer to https://www.nanog.org/meetings/nanog37/presentations/matt.levine.pdf).

    It's unlikely that two packets from the same flow will hit different servers.

  5. Rob, good point about BIND. I'm not sure about your claim that it's not a practice attack. You say:

    "A Bruce Schneier post created the idea of hacking doing "practice" DDoS. That's not how things work. Using a botnot for DDoS always degrades it, as owners of machines find the infections and remove them. The people getting the most practice are the defenders, who learn more from the incident than the attackers do."

    1. "That's not how things work." -- You're referring to other people's behavior, such as an APT based in another country, who will have motivations, considerations, and plans that we cannot fully know. Things can work any kind of way.

    2. "Using a botnet degrades it" -- It doesn't need to last forever, and it's assumed that it won't. And there are probably some things you can never be sure of without real-world testing. Also see Point 1 about different people and organizations having goals and motivations we're not privy to.

    3. "The people getting the most practice are the defenders, who learn more from the incident than the attackers do." -- This claim doesn't logically intersect with the question of whether this was a practice run. There's no reason to assume that the attackers would not do it if they thought that the defenders would get "more practice" than they would. It might not matter who learns more, and see Point 1.
