Thursday, January 26, 2017

Is 'aqenbpuu' a bad password?

Press secretary Sean Spicer has twice tweeted a random string, leading people to suspect he's accidentally tweeted his Twitter password. One of these was 'aqenbpuu', which some have described as a "shitty password". Is is actually bad?

No. It's adequate. Not the best, perhaps, but not "shitty".


It depends upon your threat model. The common threats are password reuse and phishing, where the strength doesn't matter. When the strength does matter is when Twitter gets hacked and the password hashes stolen.

Twitter uses the bcrypt password hashing technique, which is designed to be slow. A typical desktop with a GPU can only crack bcrypt passwords at a rate of around 321 hashes-per-second. Doing the math (26 to the power of 8, divided by 321, divided by one day) it will take 20 years for this desktop to crack the password.

That's not a good password. A botnet with thousands of desktops, or a somebody willing to invest thousands of dollars on a supercomputer or cluster like Amazon's, can crack that password in a few days.

But, it's not a bad password, either. A hack of a Twitter account like this would be a minor event. It's not worth somebody spending that much resources hacking. Security is a tradeoff -- you protect a ton of gold with Ft. Knox like protections, but you wouldn't invest the same amount protecting a ton of wood. The same is true with passwords -- as long as you don't reuse your passwords, or fall victim to phishing, eight lower case characters is adequate.

This is especially true if using two-factor authentication, in which case, such a password is more than adequate.

I point this out because the Trump administration is bad, and Sean Spicer is a liar. Our criticism needs to be limited to things we can support, such as the DC metro ridership numbers (which Spicer has still not corrected). Every time we weakly criticize the administration on things we cannot support, like "shitty passwords", we lessen our credibility. We look more like people who will hate the administration no matter what they do, rather than people who are standing up for principles like "honesty".



The numbers above aren't approximations. I actually generated a bcrypt hash and attempted to crack it in order to benchmark how long this would take. I'll describe the process here.

First of all, I installed the "PHP command-line". While older versions of PHP used MD5 for hashing, the newer versions use Bcrypt.

# apt-get install php5-cli

I then created a PHP program that will hash the password:


I actually use it three ways. The first way is to hash a small password "ax", one short enough that the password cracker will actually succeed in hashing. The second is to hash the password with PHP defaults, which is what I assume Twitter is using. The third is to increase the difficulty level, in case Twitter has increased the default difficulty level at all in order to protect weak passwords.

I then ran the PHP script, producing these hashes:
$ php spicer.php
$2y$10$1BfTonhKWDN23cGWKpX3YuBSj5Us3eeLzeUsfylemU0PK4JFr4moa
$2y$10$DKe2N/shCIU.jSfSr5iWi.jH0AxduXdlMjWDRvNxIgDU3Cyjizgzu
$2y$15$HKcSh42d8amewd2QLvRWn.wttbz/B8Sm656Xh6ZWQ4a0u6LZaxyH6

I ran the first one through the password cracking known as "Hashcat". Within seconds, I get the correct password "ax". Thus, I know Hashcat is correctly cracking the password. I actually had a little doubt, because the documentation doesn't make it clear that the Bcrypt algorithm the program supports is the same as the one produced by PHP5.

I run the second one, and get the following speed statistics:


As you can seem, I'm brute-forcing an eight character password that's all lower case (-a 3 ?l?l?l?l?l?l?l?l). Checking the speed as it runs, it seems pretty consistently slightly above 300 hashes/second. It's not perfect -- it keeps warning that it's slow. This is because the GPU acceleration works best if trying many password hashes at a time.

I tried the same sort of setup using John-the-Ripper in incremental mode. Whereas Hashcat uses the GPU, John uses the CPU. I have a 6-core Broadwell CPU, so I ran John-the-Ripper with 12 threads.


Curiously, it's slightly faster, at 347 hashes-per-second on the CPU rather than 321 on the GPU.

Using the stronger work factor (the third hash I produced above), I get about 10 hashes/second on John, and 10 on Hashcat as well. It takes over a second to even generate the hash, meaning it's probably too aggressive for a web server like Twitter to have to do that much work every time somebody logs in, so I suspect they aren't that aggressive.

Would a massive IoT botnet help? Well, I tried out John on the Raspbery PI 3. With the same settings cracking the password (at default difficulty), I got the following:


In other words, the RPi is 35 times slower than my desktop computer at this task.

The RPi v3 has four cores and about twice the clock speed of IoT devices. So the typical IoT device would be 250 times slower than a desktop computer. This gives a good approximation of the difference in power.


So there's this comment:
Yes, you can. So I did, described below.

Okay, so first you need to use the "Node Package Manager" to install bcrypt. The name isn't "bcrypt", which refers to a module that I can't get installed on any platform. Instead, you want "bcrypt-nodejs".

# npm install bcrypt-nodejs
bcrypt-nodejs@0.0.3 node_modules\bcrypt-nodejs

On all my platforms (Windows, Ubuntu, Raspbian) this is already installed. So now you just create a script spicer.js:

var bcrypt = require("bcrypt-nodejs");
console.log(bcrypt.hashSync("aqenbpuu");

This produces the following hash, which has the same work factor as the one generated by the PHP script above:

$2a$10$Ulbm//hEqMoco8FLRI6k8uVIrGqipeNbyU53xF2aYx3LuQ.xUEmz6

Hashcat and John then are the same speed cracking this one as the other one. The first characters $2a$ define the hash type (bcrypt). Apparently, there's a slightly difference between that and $2y$, but that doesn't change the analysis.



The first comment below questions the speed I get, because running Hashcat in benchmark mode, it gets much higher numbers for bcrypt.

This is actually normal, due to different iteration counts.

A bcrypt hash includes an iteration count (or more precisely, the logarithm of an iteration count). It repeats the hash that number of times. That's what the $10$ means in the hash:

$2y$10$.........

The Hashcat benchmark uses the number 5 (actually, 2^5, or 32 times) as it's count. But the default iteration count produced by PHP and NodeJS is 10 (2^10, or 1024 times). Thus, you'd expect these hashes to run at a speed 32 times slower.

And indeed, that's what I get. Running the benchmark on my machine, I get the following output:

Hashtype: bcrypt, Blowfish(OpenBSD)
Speed.Dev.#1.....:    10052 H/s (82.28ms)

Doing the math, dividing 1052 hashes/sec by 321 hashes/sec, I get 31.3. This is close enough to 32, the expected answer, giving the variabilities of thermal throttling, background activity on the machine, and so on.

Googling, this appears to be a common complaint. It's be nice if it said something like 'bcrypt[speed=5]'  or something to make this clear.




Spicer tweeted another password, "n9y25ah7". Instead of all lower-case, this is lower-case plus digits, or 36 to the power of 8 combinations, so it's about 13 times harder (36/26)^8, which is roughly in the same range.


BTW, there are two notes I want to make.

The first is that a good practical exercise first tries to falsify the theory. In this case, I deliberately tested whether Hashcat and John were actually cracking the right password. They were. They both successfully cracked the two character password "ax". I also tested GPU vs. CPU. Everyone knows that GPUs are faster for password cracking, but also everyone knows that Bcrypt is designed to be hard to run on GPUs.

The second note is that everything here is already discussed in my study guide on command-lines [*]. I mention that you can run PHP on the command-line, and that you can use Hashcat and John to crack passwords. My guide isn't complete enough to be an explanation for everything, but it's a good discussion of where you need to end up.

The third note is that I'm not a master of any of these tools. I know enough about these tools to Google the answers, not to pull them off the top of my head. Mastery is impossible, don't even try it. For example, bcrypt is one of many hashing algorithms, and has all by itself a lot of complexity, such as the difference between $2a$ and $2y$, or the the logarithmic iteration count. I ignored the issue of salting bcrypt altogether. So what I'm saying is that the level of proficiency you want is to be able to google the steps in solving a problem like this, not actually knowing all this off the top of your head.


5 comments:

ekasperc said...

Are you sure your hashcat is using your gpu ?
An RX480 will hash around 8kh/s (https://hashcat.net/forum/thread-5557-post-30508.html#pid30508)

Robert Graham said...

Yes. Hashcat uses a log-iteration count of 5 (2^5 iterations), whereas the passwords in question use a log-iteration count of 10 (2^10 iterations). As you can see, the difference in speed is indeed roughly that difference.

I just ran the benchmark to be sure, and got:

Hashtype: bcrypt, Blowfish(OpenBSD)
Speed.Dev.#1.....: 10052 H/s (82.28ms)

Doing the division (10052 divided by 321) I get 31.3, which is close to the expected difference of 32 times (dividing 2^10 by 2^5).

_NKT_ said...

A fascinating little blog post. Thanks for the analysis.

Of course, it doesn't rule out re-use or sites that don't use bcrypt (there's plenty using MD5 still, because "legacy") but for looking at Twitter alone, this all adds up.

He didn't pick that password though, bet you there was a grownup to do it.

Gary Howard said...

good tech. Thanks.

Gary W. Deason said...

Good tips to protect your personal information. This is very important in our time. I advise you to take this https://www.refog.com/ into account it is perfectly helps to monitor activities on your PC, especially if your PC is not always with you.