Every once in a while, I hear someone making comments about the strength of things like long passwords.
For example, if you have a 255 character password that just uses the 26 roman upper and lower case letters, plus the numeric digits. That means that your password has 62^255 possible values, if you can try a million million passwords per second, the time required would exceed the heat death of the universe.
Wow, that's cool - it means that you can never break my password if I use a long enough password.
The odds are very good that something in the system's going to take your password and apply a one-way hash to that password - after all, it wouldn't do to keep that password lying around in clear text where an attacker could see it. But the instant you take a hash of a secret, the strength of the secret degrades to the strength of the hash.
It's another example of the pigeonhole principle in practice - if you put N+M items into N slots, you're going to have some slots with more than one entry. The pigeonhole principle applies in this case as well.
In other words, if the password database that holds your password uses a hash algorithm like SHA-1, your 62^255 possible character password just got reduced in strength to a 256^20 possible value hash. That means that any analysis that you've done on your password doesn't matter, because all an attacker needs to do is to find a different password that hashes to the same value as your password and they've broken your password. Since your password strength exceeds the strength of the hash code, you know that there MUST be a collision with a weaker password.
The bottom line is that when you're calculating the strength of a password, it's important that you understand what your password looks like to an attacker. If your password is saved as an SHA-1 or MD5 hash, that's the true maximum strength of your password.
To be fair, 256^20 is something like 1.4E48, so even if you could still try a million million passwords per second, you're still looking at something like a million million years to brute force that database, but 256^20 is still far less than 62^255.
 explains a reason why longer or more complicated passwords are more secure though. Rather than spending a million million years trying to brute force a password you go the other way. You spend a little bit of time and lots of memory to generate a database of encrypted words, combinations of words, and words combined with symbols and/or numbers (crossed w/ a salt).
Then instead of trying to brute force one password you might try and search all available (higher privileged) hashed passwords and find one that matches. While there's some chance that a more secure password would collide with one of these passwords that chance is lower than a short or simple password colliding with one of these passwords (given that ALL short or simple passwords will likely collide). Therefore you've likely increased your security by having your password hiding in the shadows where the hackers aren't looking.
What might be interesting (and probably done somewhere) would be taking password verification up a level: Instead of just disallowing simple passwords also check a database of hashed passwords to see if there's any hits. If so also reject that password even if the password would otherwise meet the security requirements. Of course I guess that only ups the antity: Now the hackers may know where NOT to look. Sigh...
My work has a password-protected web site (maintained by an outsourced company) for some function. The site has some requirements on the password - at least 8 chars, one numeric, one non-alphanumric, etc. But I've found out that they only keep (or hash?) the first 8 characters! So you can type a new password of "abcdefgh@$FGWB$%Y^@", and then login with just "abcdefgh".
I should've expected that though - took them a while to move their site to HTTPS, so the passwords are not sent as clear-text cookies!
Not directly relevant to the original post, but I guess "truncating" is also a possible answer to "what's done with it".
What about storing 2 hash values for password? E.g. SHA1 and MD5 both. I think there shouldn't be many strings that end up with similar SHA1 value AND similar MD5 value to my password.
Is this why it is recommended to disable generation of Lanman hashes - the strength of Lanman hashes is much lesser* than that offerend by NTLM2, meaning a hacker can easily generate a hash that matches the Lanman hash?
* The uppercasing and truncating to 14 chars reduces the possible combinations.
Yes. But it's not just 14 characters -- LM hashes are split in two parts of 7, reducing the number of combinations even further, to the point where brute forcing the hash is simple on even modest hardware. LM hashes also don't use salts, which means that, if you obtain the password database, you can crack all passwords simultaneously. Wikipedia has a good overview at http://en.wikipedia.org/wiki/LM_hash
In short, nobody should use LM hashes anymore, unless they just care about backwards compatibility more than security (which is more people than you'd expect).
There MUST be a collision, but
1) In practice, people have passwords no longer than 20 characters,
2) So, there are good chances that the collision passwords will be longer than 100 characters.
So, theoretically, you must be right, but that doesn't change practice anyhow.
You could do that, but you would still have the same problem. Now, instead of there being N_sha1 possible passwords, there are now N_sha1*N_md5 possible passwords (2^160 * 2^128 == 2^288) which is still much smaller than the pre-hashed password space originally proposed of 62^255 ~ 2^1518.32.
Also, you don't want to be using md5 for anything.
@Przemyslaw Wlodarczak: storing two different hashes of the same password doubles the surfaces area of an attack. An attacker only needs to crack one of the hashes in order to get the correct password.
@Larry Osterman: One little thing about this article: not only would the attacker need to find a collision, but that attacker would need to find a collision that falls withing the proper character set (in most cases standard ASCII). Therefore, the numbers aren't quite as clear as you list. Due to the nature of the hash function, it is likely that the first 256^20 password combinations do not fill the set of 256^20 possible hashes, and it is possible (though unlikely in the case of a 255 character password) that the first collision with your has happens after your password has already been discovered.
"Since your password strength exceeds the strength of the hash code, you know that there MUST be a collision with a weaker password."
Actually there's only a small chance of that. There's a high chance that there must be collisions with gazillions of equally strong passwords, but a collision with a teeny tiny 30-character password will still be rare.
Now, rarity still means it'll happen by Tuesday (even though we don't know where), so Foo had a good suggestion. If the hash value matches a known broken hash value then warn the user that they'd better change their password again immediately.
Regarding LM hashing, it's not enough to just not use them. Each machine's administrator has to actively set Windows to stop storing them. Just like Windows XP SP2 turned on the firewall by default, let's hope SP3 will turn off storage of LM hashes by default.
Norman, you're right - that's essentially Motoma's point (and a similar point was made offline).
On the other hand, given the difference in keyspace sizes (2^20 vs 62^255) the chances of there being a collision with your password are rather high.
Vista turned off LM Hashes by default, I don't know if SP3 does.
Btw, the easiest way I know of disabling LM hashes is to use a passphrase - since most passphrases are longer than 14 characters long, they disable the LM hash.
Aaron: Maybe I'm just showing my ignorance here, but what's wrong with MD5?
Nothing wrong with md5 except that it was not designed as an encryption hash algorithm replacement but rather as a replacement for crc32.
@JM: Thanks for the reference.
@Larry: since most passphrases are longer than 14 characters long, they disable the LM hash
Hmm, JM's link indicates that the password is truncated to 14 characters, so wouldn't that be a case of what Jonathan described?
arun.philip: If you try to set a password longer than 14 characters on XP, Windows pops up some text indicating that you won't be able to access older resources.
That message is an indication that the LM hash is being disabled.
But I still don't get it... what would the trouble be in digesting passwords into a hash that far exceed the strength of even the "strongest" passwords? Surely, in this era of terabyte HD's and quad core CPU's, even the allocation of a whole kilobyte to store a user's password is not going to be a problem, I would expect?