Thursday, November 03, 2016

In which I have to debunk a second time

So Slate is doubling down on their discredited story of a secret Trump server. Tip for journalists: if you are going to argue against an expert debunking your story, try to contact that expert first, so they don't have to do what I'm going to do here: point out obvious flaws. Also, pay attention to the data.


The experts didn't find anything

The story claims:
"I spoke with many DNS experts. They found the evidence strongly suggestive of a relationship between the Trump Organization and the bank".
No, he didn't. He gave experts limited information and asked them whether it was consistent with a conspiracy theory. He didn't ask whether it was "suggestive" of the conspiracy theory, or whether this was the best theory that fit the data.

This is why "experts" quoted in the press need to go through "media training": to avoid getting their reputations harmed by bad journalists who try their best to put words in their mouths. The training teaches you to recognize bad journalists like this, and how not to get sucked into their fabrications.


Jean Camp isn't an expert

On the other hand, Jean Camp isn't an expert. I've never heard of her before. She gets details wrong. Take, for example, this blog post, where she discusses lookups for the domain mail.trump-email.com.moscow.alfaintra.net. She says:
This query is unusual in that is merges two hostnames into one. It makes the most sense as a human error in inserting a new hostname in some dialog window, but neglected to hit the backspace to delete the old hostname.
Uh, no. It's normal DNS behavior with non-FQDNs. If the lookup for a name fails, computers will try again, pasting the local domain on the end. In other words, when Twitter's DNS was taken offline by the DDoS attack a couple weeks ago, those monitoring DNS saw a zillion lookups for names like "www.twitter.com.example.com".

I've reproduced this on my desktop by configuring the suffix moscow.alfaintra.net.



I then pinged "mail1.trump-email.com" and captured the packets. As you can see, after the initial lookups fail, Windows tried appending the suffix.
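The retry behavior can be modeled in a few lines of Python. This is only a simplified sketch, not how Windows or any other operating system implements it: the function name candidate_lookups is made up for illustration, and real resolvers apply extra rules (such as dot-count thresholds) that are ignored here.

```python
def candidate_lookups(name, search_suffixes):
    """Return the names a stub resolver might try, in order.

    Simplified model (an assumption, not any OS's exact algorithm):
    a name ending in "." is fully qualified and is tried alone;
    otherwise the name is tried as given, then with each suffix appended.
    """
    if name.endswith("."):
        return [name]
    candidates = [name + "."]
    for suffix in search_suffixes:
        candidates.append(name + "." + suffix.strip(".") + ".")
    return candidates


# The suffix is the one configured in the experiment described above.
for query in candidate_lookups("mail1.trump-email.com", ["moscow.alfaintra.net"]):
    print(query)
```

The second name printed is the "merged" hostname: nobody typed it, the resolver built it.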



I don't know what Jean Camp is an expert in, but this is sorta a basic DNS concept. It's surprising she'd get it wrong. Of course, she may be an expert in DNS who simply had a brain fart (this happens to all of us), but looking across her posts and tweets, she doesn't seem to be somebody who has a lot of experience with DNS. Sorry for impugning her credibility, but that's the way the story is written. It demands that we trust the quoted "experts".

Call up your own IT department at Slate. Ask your IT nerds if this is how DNS operates. Note: I'm saying your average, unremarkable IT nerds can debunk an "expert" you quote in your story.

Understanding "spam" and "blacklists"

The new article has a paragraph noting that the IP address doesn't appear on spam blocklists:
Was the server sending spam—unsolicited mail—as opposed to legitimate commercial marketing? There are databases that assiduously and comprehensively catalog spam. I entered the internet protocal address for mail1.trump-email.com to check if it ever showed up in Spamhaus and DNSBL.info. There were no traces of the IP address ever delivering spam.
This is a profound misunderstanding of how these things work.

Colloquially, we call those sending mass marketing emails, like Cendyn, "spammers". But those running blocklists have a narrower definition. If emails contain an option to "opt-out" of future emails, then it's technically not "spam".

Cendyn is constantly getting added to blocklists when people complain. They spend considerable effort contacting the many organizations maintaining blocklists, proving they do "opt-outs", and getting "white-listed" instead of "black-listed". Indeed, the entire spam-blacklisting industry is a bit of a scam -- getting white-listed often involves a bit of cash.

Those maintaining blacklists only keep records going back a few months. The article is in error saying there's no record ever of Cendyn sending spam. Instead, if an address comes up clean, it means there's no record for the past few months. And, if Cendyn is in the white-lists, there would be no record of "spam" at all, anyway.

As somebody who frequently scans the entire Internet, I'm constantly getting on/off blacklists. It's a real pain. At the moment, my scanner address "209.126.230.71" doesn't appear to be on any blacklists. Next time a scan kicks off, it'll probably get added -- but only by a few, because most have white-listed it.
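For readers who haven't seen how such a check works: DNS-based blocklists are conventionally queried by reversing the octets of the IPv4 address and looking that up under the list's zone. An answer means "listed right now"; no answer means "not listed right now", which says nothing about history. The sketch below only builds the query name and sends nothing. The helper name dnsbl_query_name is made up, and zen.spamhaus.org is used purely as an example zone.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to ask a DNS blocklist about an IPv4 address."""
    # IPv4Address() validates the input; the octets are then reversed.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


# Example zone only; each blocklist publishes its own zone name.
print(dnsbl_query_name("209.126.230.71", "zen.spamhaus.org"))
```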


There is no IP address limitation

The story repeats the theory, which I already debunked, that the server has a weird configuration that limits who can talk to it:
The scientists theorized that the Trump and Alfa Bank servers had a secretive relationship after testing the behavior of mail1.trump-email.com using sites like Pingability. When they attempted to ping the site, they received the message “521 lvpmta14.lstrk.net does not accept mail from you.”
No, that's how Listrak (the company that actually controls the server) configures all their marketing servers. Anybody can confirm this themselves by pinging all the servers in this range:


In case you don't want to do scans yourself, you can look up on Shodan and see that there are at least 4000 servers around the Internet that give the same error message.
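For those who do want to check a single server by hand, the "ping" here is nothing more than connecting to the mail port and reading the first line the server sends. Below is a minimal sketch under stated assumptions: the function names are made up for illustration, it assumes the rejection arrives as the SMTP greeting, and it is not the scanner used for this post. Reply code 521 is the standard SMTP code for a host that does not accept mail.

```python
import socket


def read_smtp_banner(host, port=25, timeout=5.0):
    """Connect to an SMTP port and return the first line the server sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = b""
        # Read until the end of the first line or until the server hangs up.
        while b"\n" not in data:
            chunk = conn.recv(1024)
            if not chunk:
                break
            data += chunk
    return data.split(b"\n", 1)[0].decode("ascii", errors="replace").strip()


def refuses_all_mail(banner):
    """True if the greeting starts with reply code 521 (host does not accept mail)."""
    return banner.startswith("521")


# Hypothetical usage (placeholder hostname, not run here):
# print(read_smtp_banner("mail.example.com"))
```

Run against every address in a provider's range, this shows whether the "does not accept mail from you" greeting is special to one server or the default for all of them.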


Again, go back to Chris Davis in your original story and ask him about this. He'll confirm that there's nothing nefarious or weird going on here, that it's just how Listrak has decided to configure all its spam-sending engines.

Either this conspiracy goes much deeper, with hundreds of servers involved, or this is a meaningless datapoint.


Where did the DNS logs come from?

Tea Leaves and Jean Camp are showing logs of private communications. Where did these logs come from? This information isn't public. It means somebody has done something like hack into Alfa Bank. Or it means researchers who monitor DNS (for maintaining DNS, and for doing malware research) have broken their NDAs and possibly the law.

The data is incomplete and inconsistent. Those who work for other companies, like Dyn, claim it doesn't match their own data. We have good reason to doubt these logs. There's a good chance that the source doesn't have as comprehensive a view as "Tea Leaves" claims. There's also a good chance the data has been manipulated.

Specifically, I have a source who claims records for trump-email.com were changed in June, meaning either my source or Tea Leaves is lying.

Until we know more about the source of the data, it's impossible to believe the conclusions that only Alfa Bank was doing DNS lookups.

By the way, if you are a company like Alfa Bank, and you don't want the "research" community seeing leaked intranet DNS requests, then you should probably reconfigure your DNS resolvers. You'll want to look into RFC 7816 "query minimization", supported by the Unbound and Knot resolvers.
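To see why this helps: without minimization, a resolver sends the full name to every server it asks, including the root and top-level-domain servers. With it, each server sees only as many labels as it needs to answer. The sketch below is a rough illustration of the idea, not the exact algorithm in the RFC. The function name is made up, and it assumes every label is its own delegation, which real zones don't guarantee.

```python
def minimized_queries(name):
    """List the progressively longer names a minimizing resolver would reveal,
    one per step down the DNS hierarchy (shortest first).

    Simplification (an assumption): every label is treated as its own zone cut.
    """
    labels = name.strip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


for step in minimized_queries("mail1.trump-email.com"):
    print(step)
```

Only the last server in the chain ever sees the full hostname, so whoever logs queries higher up learns much less.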


Do the graphs show interesting things?

The original "Tea Leaves" researchers are clearly acting in bad faith. They are trying to twist the data to match their conclusions. For example, in the original article, they claim that peaks in the DNS activity match campaign events. But looking at the graph, it's clear these are unrelated. It displays the common cognitive bias of seeing patterns that aren't there.

Likewise, they claim that the timing throughout the day matches what you'd expect from humans interacting back and forth between Moscow and New York. No. This is what the activity looks like, graphing the number of queries by hour:

As you can see, there's no pattern. When workers go home at 5pm in New York City, it's midnight in Moscow. If humans were involved, you'd expect an eight hour lull during that time. Likewise, when workers arrive at 9am in New York City, you expect a spike in traffic for about an hour until workers in Moscow go home. You see none of that here. What you instead see is a random distribution throughout the day -- the sort of distribution you'd expect if this were DNS lookups from incoming spam.
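The working-hours arithmetic is easy to check. The sketch below assumes 9-to-5 workdays and fixed offsets of UTC-4 for New York (daylight time, which matches the 5pm/midnight figure above) and UTC+3 for Moscow. Those offsets and the function name are assumptions for illustration, not something taken from the logs.

```python
def working_hours_utc(utc_offset, start=9, end=17):
    """Return the set of UTC hours during which an office at the given
    UTC offset is open (local hours start..end-1)."""
    return {(local_hour - utc_offset) % 24 for local_hour in range(start, end)}


new_york = working_hours_utc(-4)  # assumption: daylight time, UTC-4
moscow = working_hours_utc(3)     # assumption: UTC+3

overlap = sorted(new_york & moscow)
print("UTC hours when both offices are open:", overlap)
```

Under these assumptions the only shared hour starts at 13:00 UTC, which is 9am in New York and 4pm in Moscow. Human back-and-forth would bunch up around that hour rather than spread evenly across the day.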

The point is that we know the original "Tea Leaves" researchers aren't trustworthy, that they've convinced themselves of things that just aren't there.


Does Trump control the server in question?

OMG, this post asks the question, after I've debunked the original story, and still gets the answer wrong.

The answer is that Listrak controls the server. Not even Cendyn controls it, really, they just contract services from Listrak. In other words, not only does Trump not control it, the next level company (Cendyn) also doesn't control it.


Does Trump control the domain in question?

OMG, this new story continues to make the claim that the Trump Organization controls the domain trump-email.com, despite my showing that Cendyn controls the domain.

Look at the WHOIS info yourself. All the contact info goes to Cendyn. It fits the pattern Cendyn chooses for their campaigns.
  • trump-email.com
  • mjh-email.com
  • denihan-email.com
  • hyatt-email.com

Cendyn even spells "Trump Orgainzation" wrong.
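If you don't have a whois tool handy, the protocol is simple enough to speak directly: open a TCP connection to port 43, send the domain followed by CRLF, and read until the server closes the connection. This is a bare-bones sketch with a made-up function name. The server in the usage comment is the .com registry's WHOIS host as far as I know, so treat it as an assumption; the registry typically returns only a pointer to the registrar's own WHOIS server, and the contact details come from that second server.

```python
import socket


def whois_query(domain, server, port=43, timeout=10.0):
    """Send a WHOIS query (the domain followed by CRLF) and return the reply text."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        # The server signals the end of the reply by closing the connection.
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Hypothetical usage (server name is an assumption, not run here):
# print(whois_query("trump-email.com", "whois.verisign-grs.com"))
```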


There's a difference between a "server" and a "name"

The article continues to make trivial technical errors, like confusing what a server is with what a domain name is. For example:
One of the intriguing facts in my original piece was that the Trump server was shut down on Sept. 23, two days after the New York Times made inquiries to Alfa Bank
The server has never been shut down. Instead, the name "mail1.trump-email.com" was removed from Cendyn's DNS servers.

It's impossible to debunk everything in these stories because they garble the technical details so much that it's impossible to know what the heck they are claiming.


Why did Cendyn change things after Alfa Bank was notified?

It's a curious coincidence that Cendyn changed their DNS records a couple days after the NYTimes contacted Alfa Bank.

But "coincidence" is all it is. I have years of experience with investigating data breaches. I know that such coincidences abound. There are always weird coincidences that you are certain are meaningful, but which by the end of the investigation just aren't.

The biggest source of coincidences is that IT is always changing things and always messing things up. It's the nature of IT. Thus, you'll always see a change in IT that matches some other event. Those looking for conspiracies ignore the changes that don't match, and focus on the one that does, so it looms suspiciously.

As I've mentioned before, I have a source that says Cendyn changed things around in June. This makes me believe that "Tea Leaves" is editing changes to highlight the one in September.

In any event, many people have noticed that the registrar email "Emily McMullin" has the same last name as Evan McMullin, who is running against Trump in Utah. This supports my point: when you do hacking investigations, you find irrelevant connections all over the freakin' place.


"Experts stand by their analysis"

This new article states:
I’ve checked back with eight of the nine computer scientists and engineers I consulted for my original story, and they all stood by their fundamental analysis
Well, of course, they don't want to look like idiots. But notice the subtle rephrasing of the question: the experts stand by their analysis. It doesn't mean the same thing as standing behind the reporter's analysis. The experts made narrow judgements, which even I stand behind as mostly correct, given the data they were given at the time. None of them were asked whether the entire conspiracy theory holds up.

What you should do is ask people like Chris Davis or Paul Vixie whether they stand behind my analysis in the past two posts. Or really, ask any expert. I've documented things in sufficient clarity. For example, go back to Chris Davis and ask him again about the "limited IP address" theory, and whether it holds up against my scan of that data center above.


Conclusion

Other major news outlets all passed on the story, because even non-experts know it's flawed. The data means nothing. The Slate journalist nonetheless went forward with the story, tricking experts, and finding some non-experts.

But as I've shown, given a complete technical analysis, the story falls apart. Most of what's strange is perfectly normal. The data itself (the DNS logs) is untrustworthy. The story treats unknown things (like how the mail server rejects IP addresses) as "unknowable" things that confirm the conspiracy, when they are in fact simply things unknown at the current time, which can become knowable with a little research.

What I show in my first post, and this post, is more data. This data shows context. This data explains the unknowns that Slate presents. Moreover, you don't have to trust me -- anybody can replicate my work and see for themselves.









7 comments:

  1. This comment has been removed by the author.

  2. This comment has been removed by the author.

  3. Great work. The fallacious behavior to which you are referring is called confirmation bias.

  4. And illusory correlation.

  5. FWIW, Camp has been a security professor at IU-Bloomington for over a decade now. Her work is more focused on the HCI aspects of security, though, so I'm not surprised you hadn't heard of her.

  6. My searching of Paul Vixie's Farsight Passive DNS db suggests that "changed in June" is flat out wrong.

    Prior to Sep 23rd this year:

    - The NS for the trump-email.com domain was unchanged since mid 2010 (and possibly earlier, this is probably the limit of the pDNS data)
    - The SOA for the domain was unchanged since December 2014
    - The MX (incoming.cdcservices.com) was unchanged since December 2011
    - The TXT ("Internet Solution from Cendyn.com.", "v=spf1 ip4:198.91.42.0/23 ip4:64.135.26.0/24 ip4:64.95.241.0/24 ip4:206.191.130.0/24 ip4:63.251.151.0/24 ip4:69.25.15.0/24 mx ~all" ) was unchanged since November 2014
    - The CNAMES www.trump-email.com, mail.trump-email.com., _client._smtp.trump-email.com. and links.trump-email.com. were all unchanged since at least 2012 (some date back to 2010)
    - The A record mail1.trump-email.com. A 66.216.133.29 goes back to Fri Jul 2 19:20:22 2010

    One thing that is missing is that there is no record of anyone actually querying for the A record for trump-email.com and I can't (from the farsight data) see where it comes from

    The only oddity is that three machine generated CNAMEd subdomains show up briefly on Sep 23. All redirected to trump-email.com

    dw6w3yzfw6.trump-email.com.
    s4ddlkd49j.trump-email.com.
    t59hykhmfc.trump-email.com.

    There is no record of these or any other strange subdomains in the Farsight pDNS. I suspect that either this was a test by some researcher to confirm that *.trump-email.com redirected to trump-email.com or someone was in the process of setting up a different email tracking system when it was decided to drop the domain entirely

  7. Good work. I despise it when bits and pieces of science (each with very limited scope) get strung together and thus imply these individual pieces work together and are some sort of proof of some idea--put forth by a layperson!

    Some of the so-called experts also declare that they have "never seen this before" when in some industries, it is very common, in fact, in some industries it is the norm, not the exception. So, always keep in mind, some scientist's "experience" is not science--it is opinion based on the limitations of the scope of that scientist's experience.

    More directly, email servers used to distribute marketing emails, don't accept inbound email and bounce everything that comes their way. So, if you declare that it's just unheard of (by "experts") to configure an email server that way, your experts have limited scope of experience AND limited commonsense because everybody, everyday, gets emails from marketers using servers that don't accept and may bounce any/all inbound emails. (e.g. no-reply@SomeMarketersDomain.com). Sheesh!

    Finally, in analyzing the data (assuming it is legit) we can postulate not a single explanation, but a whole array of possibilities that might produce these data. One that comes to mind for me is, wouldn't it be ingenious to use such an email account to communicate to a list of a single "subscriber"? You wouldn't have to be particularly tech savvy (if at all) to operate it. You'd fly under the radar of the NSA et al, who probably ignore these IPs while culling some of the traffic they analyze. So, if I were going to postulate nefarious Trump dealings with Russia via this data regarding this email server, that would be the direction I would turn and the theory I would follow. Makes much better sense, right?

    But, proof of nothing.

