Fabulous Adventures In Coding
Eric Lippert is a principal developer on the C# compiler team.
Consider the following scheme:
I have some client software which I sell. When the client software starts up for the first time, it obtains a “token” from the user. The token string can be anything; the user can choose their name, their cat’s name, their password, the contents of some disk file, whatever. How the client software obtains the token from the user is not particularly relevant; perhaps they type it in, perhaps it is set in a registry key, but whatever it is, it is the user’s choice what information they use to identify themselves.
Anyway, the first time the client software starts up it obtains the token, and then phones home over the internet to my server. My server does some unspecified expensive and painful process by which I satisfy myself that the user is authorized to use my client software. (Assume for the sake of argument that such a process exists.)
I encrypt the token message with my RSA private key and send it back to the client. The client writes the encrypted message out to disk.
I do not want to go through the expensive and painful process of contacting the server again every subsequent time the software starts up. Instead, the client software decrypts the stored encrypted token using my RSA public key, and then compares it byte-by-byte to the user’s token. (Again, perhaps they type it in, or perhaps it is in a file or registry key, or whatever.) If the user’s token matches the decrypted token then I know that I’ve authorized this user at least once already, so I don’t need to do it again.
If the token cannot be authenticated then I go into some error path and deny access to the software.
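For concreteness, here is a minimal sketch of the scheme as described, in Python. It assumes "textbook" RSA with no padding and a toy key (p = 61, q = 53) that is far too small for real use. The integer token and the names server_issue and client_check are hypothetical, chosen only for illustration.

```python
# Toy textbook RSA key. ASSUMPTION: tiny primes and no padding, illustration only.
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)

def server_issue(token: int) -> int:
    """Server side: 'encrypt' the client-chosen token with the private key."""
    return pow(token, D, N)

def client_check(token: int, stored: int) -> bool:
    """Client side: 'decrypt' the stored value with the public key and compare."""
    return pow(stored, E, N) == token

token = 1234                         # whatever the user chose, as a number below N
stored = server_issue(token)         # written to disk after the first run

print(client_check(token, stored))       # True: this token was authorized
print(client_check(token + 1, stored))   # False: a different token is rejected
```

Note that everything in client_check runs on the machine of the person being checked, and that server_issue applies the private key to a value the client picked.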
Does any of this seem like a good idea to you? Because none of this seems like a good idea to me.
The fundamental problem here is that I am attempting to build my own security system instead of purchasing a solution off the shelf that was written by security experts. There are plenty of third-party software licensing solutions available; just buy one.
But it gets worse. Consider the attacks that are possible against this system.
First off, clearly there is nothing stopping any hostile client from simply decompiling my client software, removing the security checks, recompiling, and partying on. The premise of the scenario is that the client is hostile; doing the security check on the client and trusting the result seems ill-advised.
Second, suppose the attacker simply obtains a valid encrypted token once, and then puts the pair up on the internet. Any hostile user can now use the unmodified software without contacting the server. If I notice, I suppose I can check my records to see who put their token up on the internet and sue that person. Or can I? As we’ll see later, a clever attacker can get around this.
Third, suppose there is a hostile eavesdropper Eve, listening in on the connection between a benign client (Alice) and the server (Bob). Alice’s token goes out, the encrypted token comes back, and the eavesdropper has now captured enough information to use the software.
One could mitigate this attack by encrypting the token with the public key first. Then only the server can determine the contents of the token, right? As we’ll see, that’s maybe not true either…
Fourth, there’s no mechanism proposed whereby encrypted tokens expire or can otherwise be treated as “known to be compromised” in some way. There’s no way to identify hostile clients and revoke their ability to run the software.
All of these are pretty fundamental problems with this licensing scheme. Those aren’t the specific problem I want to get at today though. There’s also an extremely dangerous mistake in the design of the crypto mechanism.
The expression “use the right tool for the job” never applies more than when considering crypto solutions to real-world problems. Cryptosystems are designed to solve very specific problems very well, but as soon as you deviate even slightly from the by-design scenario of a cryptosystem, you are off the trail and skiing downhill through the trees; you hit a tree, that’s your fault – you shouldn’t have gone off the trail.
The RSA cryptosystem is extremely good at solving two problems:
1) Alice (again, the benign client) wishes to send a message of her choice to Bob (the server), encrypted with Bob’s public key. The message can be decrypted by Bob, but not by Eve, the hostile eavesdropper.
2) Bob wishes to publish a message of his choice that could only have come from Bob. The message is encrypted with Bob’s private key and can be decrypted by anyone with his public key. The fact that the message can be decrypted by Bob’s public key is evidence that the message came from Bob.
(And obviously these scenarios can be combined; Alice’s message to Bob could be encrypted with both her private key and Bob’s public key, so that Bob is the only one who can read it, and Alice is the only one who could have sent it. Provided, of course, that Alice and Bob have found some secure mechanism by which to reliably exchange public keys. We assume that such a mechanism exists and has been used. This might not be a valid assumption in the real world!)
Does the scheme outlined above actually fall into either of these scenarios? No, absolutely not. It might appear that we are in scenario two, but we’re not. Scenario two is that Bob is encrypting a message of Bob’s choice with Bob’s private key. But in the scheme outlined here, the server is encrypting a message of the presumed-hostile client’s choice! The RSA cryptosystem was not designed to be secure against attacks wherein you allow an attacker to encrypt an arbitrary message with your private key. Such attacks are called “chosen ciphertext” attacks because the attacker chooses a text which is then “decrypted” -- that is, encrypted with the key that Bob would normally use to decrypt a message from Alice.
And in fact, RSA is not secure against such attacks. If you allow an arbitrary attacker to send you an arbitrary message, which you then encrypt with your private key, then the attacker can carefully craft messages which allow her to do all kinds of crazy things:
* Most obviously, an attacker can simply make a token which is a document that says something that Bob would never willingly say. The existence of such a service means that no one can use the fact that a message is decryptable with Bob's public key to conclude that this message is vouched for by Bob. But it is far worse than that.
* Suppose Eve has captured a token message from Alice to Bob, encrypted with Bob's public key. Since encrypting a message with the private key decrypts messages encrypted with the public key, Eve simply sends that as her token to Bob. Bob decrypts Alice's token for Eve. Now Eve knows Alice's token. Of course, Bob can detect that Eve has submitted a token which looks suspiciously like it decrypts to Alice's token, so this might not be the smartest thing for Eve to do. But...
* An attacker can contrive a novel message which, when encrypted with the private key, results in enough information to decrypt a previously-captured message that was encrypted with the public key. That is: Alice encrypts her token with Bob's public key and sends it to the server (Bob), who decrypts it with the private key, encrypts the result with the private key, and sends it back. Eve intercepts both messages. Eve then crafts a very special message based on a transformation of the captured content and asks Bob to encrypt it with Bob’s private key. The contents of the result can be used by Eve to determine what Alice’s token is. Bob has no easy way of knowing that Eve’s special message is in fact an attempt to decrypt Alice’s token.
* A hostile authenticated client can choose a token she would like to be encrypted with Bob’s private key, but she does not want Bob to actually see the token. It is possible to contrive a random-seeming token such that you can figure out what a chosen token’s encryption with Bob's private key would be. The hostile client can use this attack in order to generate a token/encrypted token pair that can be distributed to other attackers, but which does not point back to the authenticated client.
* And so on. We haven’t even gotten into issues like correct padding of the message to prevent replay attacks. There are potentially many more issues here.
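The third and fourth attacks above both rely on the same algebraic trick, usually called blinding. The following is a hedged sketch of that trick, not a description of any real product: it assumes unpadded textbook RSA with a toy key, integer tokens, and a hypothetical server_issue function that applies the private key to whatever it is sent.

```python
import math
import secrets

# Toy textbook RSA key. ASSUMPTION: tiny primes and no padding, illustration only.
P, Q = 61, 53
N, E = P * Q, 17
D = pow(E, -1, (P - 1) * (Q - 1))    # known only to the server (Bob)

def server_issue(token: int) -> int:
    """Bob's service: apply the private key to any token a client submits."""
    return pow(token, D, N)

def blinding_factor() -> int:
    """Pick a random r in [2, N-1] that is invertible modulo N."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return r

# Attack 1: Eve recovers Alice's token from a captured ciphertext.
alice_token = 1234
captured = pow(alice_token, E, N)            # Alice -> Bob, under Bob's public key
r = blinding_factor()
blinded = (captured * pow(r, E, N)) % N      # does not look like Alice's message
reply = server_issue(blinded)                # equals alice_token * r (mod N)
recovered = (reply * pow(r, -1, N)) % N      # divide the blinding factor back out
print(recovered == alice_token)              # True

# Attack 2: Eve gets a valid pair for a token Bob never saw.
chosen = 2024
r2 = blinding_factor()
disguised = (chosen * pow(r2, E, N)) % N     # what Bob actually sees
reply2 = server_issue(disguised)             # equals chosen**D * r2 (mod N)
forged = (reply2 * pow(r2, -1, N)) % N       # equals chosen**D (mod N)
print(pow(forged, E, N) == chosen)           # True: the pair passes the client check
```

In both cases Bob only ever sees a random-looking number, which is why he has no easy way of noticing what is going on.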
I don’t know much about cryptography; the most important thing that I do know on the subject is that I don’t know nearly enough about cryptography to safely design or implement a crypto-based security system.
So, what have we learned?
0) If you possibly can, simply don’t go there. Encryption is extremely difficult to get right and is frequently the wrong solution in the first place. Use other techniques to solve your security problems.
1) If the problem is an untrustworthy client then don’t build a security solution which requires trusting the client.
2) If you can use off-the-shelf parts then do so.
3) If you cannot use off-the-shelf parts and do have to use a cryptosystem then don’t use a cryptosystem that you don’t fully understand.
4) If you have to use a cryptosystem that you don’t fully understand, then at least don’t use it to solve problems it was not designed to solve.
5) If you have to use a cryptosystem to ski through the trees, then at least don’t allow the presumably hostile client to choose the message which is encrypted. Choose the token yourself. If the token must include information from the client then sanitize it in some way; require it to be only straight ASCII text, insert random whitespace, and so on.
6) If you have to allow the client to choose the token, then don’t encrypt the token itself. Sign a cryptographically-secure hash of the token. It’s much harder for the attacker to choose a token which produces a desired hash.
7) Don’t use the same key pair for encrypting outgoing messages as you do for protecting incoming messages. Get a key pair for each logically different operation you're going to perform.
8) Encrypt the communications both ways.
9) Consider having a revocation mechanism, so that once you know that Eve is attacking you, you can at least revoke her license. (Or you can revoke a known-to-be-compromised license, and so on.)
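Lesson 6 can be sketched as follows. This is an illustration under loud assumptions, not a recommended implementation: the key is built from two small Mersenne primes, the SHA-256 digest is simply reduced modulo N, and there is no signature padding. A real system would use a vetted library and a standard padded signature scheme such as RSA-PSS. The names digest_as_int, server_sign and client_verify are hypothetical.

```python
import hashlib

# Toy key from two Mersenne primes. ASSUMPTION: far too small for real use.
P, Q = 2**89 - 1, 2**107 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)

def digest_as_int(token: bytes) -> int:
    """Hash the token and map the digest to a number below N (toy reduction)."""
    return int.from_bytes(hashlib.sha256(token).digest(), "big") % N

def server_sign(token: bytes) -> int:
    """Server applies the private key to the hash, never to the raw token."""
    return pow(digest_as_int(token), D, N)

def client_verify(token: bytes, signature: int) -> bool:
    """Client recomputes the hash and compares it with the opened signature."""
    return pow(signature, E, N) == digest_as_int(token)

signature = server_sign(b"name of Alice's cat")
print(client_verify(b"name of Alice's cat", signature))   # True
print(client_verify(b"some other token", signature))      # False
```

Because the server hashes whatever it receives before applying the private key, the attacker no longer controls the number being exponentiated, so multiplying a submission by a chosen factor does not carry through to the result in any useful way.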
And with those cheerful thoughts, I'm off on my annual flight south to Canada to spend a whirlwind Christmas vacation with family and friends back in Waterloo. I hope you all have a festive holiday season!
The problem with revocation is how to check for a revoked cert. Do you contact the revocation server every time you start? That makes startup slow -- or impossible if you're offline. Do you cache revocation lists? If so, the user can delete the cache, not to mention that it can grow without bound.
Indeed. A fundamental problem with this scheme that I did not call out in the article is that the scheme is intended to protect the assets of the benign software provider from misuse by hostile users. Security features such as cert revocation lists are designed to protect the assets of benign users from attacks by hostile software providers (that is, virus writers, and so on.) So again, we have an example of a situation where a security system designed for one thing causes difficulties when used to solve a completely different problem. -- Eric
Not just individuals, but entire industries have fallen into these traps.
Happy holidays to you too and all the best wishes.
Software protection and licensing schemes are hard to do right. They are so hard that it's impossible to have a 100% foolproof licensing scheme. People have tried hardware keys, kernel-level protection hooks and various other things. But all of those can be broken, worked around, or decompiled and then patched.
In my opinion the hardest licensing method to crack is one that combines many protection methods, preferably with checks scattered through the code that fire at random events and intervals, and with the checks themselves varying, such as different checksums or partial key checks. Another thing that strengthens the licensing mechanism is binding the license to hardware IDs such as the CPU ID or BIOS serial number. If an executable signing system that protects against tampering is also used, then the licensing system gets harder to break.
You can have a pretty strong licensing system if more than one layer of protection is used, and people will soon come to the conclusion that it's not worth the effort to try to work around the licensing mechanism, simply because it's too hard to do so.
Thanks a lot for this post. Since I subscribed to your RSS feed, this is absolutely my favorite blog entry!!! More security stuff, encryption and so on!!!
P.S. It seems I'm Bond: my code was 007!!!
Happy Holidays to everyone!
Also, for lesson learned number 6: "If you have to allow the client to choose the token, then don’t encrypt the token itself. Sign a cryptographically-secure hash of the token. It’s much harder for the attacker to choose a token which produces a desired hash."
There are dictionaries out there where if you know the hash value, you can find a value which produces that hash. So unless you also protect the hash value and use one that is suitably collision resistant, you may leave yourself open.
I think one of the most important things about security is that it be placed in layers. That way, the failure of any single layer doesn't mean the failure of the system. It may be weakened/broken in one area, but hopefully reinforcements in another can shore it up so that nothing is compromised.
The second most important thing about security is to know that there is no such thing as a completely secure system. All security systems are somehow fallible; the question is at what cost. The purpose of safeguards isn't so much to make it impossible to get the protected item; it is to raise the cost of doing so to the point that it exceeds the value of the protected item. The more "impossible" things a malicious entity has to do to get the item, the higher the cost. Of course, if you mistake a trivial operation for an "impossible" one, such as seems to be the case in this scenario, you could have a very serious problem.
Personally, I'm of the opinion that since you're going to have to trust your users to some extent anyway, you may as well TRUST them. That's not to say you don't have ANY licensing mechanism, but all that engineering effort you've put into a "secure" authentication system could've been much better spent on other USEFUL (to the end-user) features.
Having said that, the ultimate security mechanism is, of course, to put a significant amount of logic for your application directly on the server. That way, you CAN'T use the software without an authenticated connection to the server. Just look at World of Warcraft, for example - the client is free to download, and they make all of their money selling access to their servers. And they make A LOT of money!
Dean says much of what I always say when this topic comes up.
This quote from a different comment sums up what seems to be a general philosophy among those copy-protecting their software:
"You can have a pretty strong licensing system if more than one layer of protection is used, and people will soon come to the conclusion that it's not worth the effort to try to work around the licensing mechanism, simply because it's too hard to do so."
And in reply to that, here's one of the other things I always say when this topic comes up:
"If you are protecting something so valuable that it's worth your while to invest a huge amount of time and effort in the protection, it is also so valuable that it's worth someone else's time and effort to bypass your protection. Conversely, if you put so much time and effort into your protection that it's literally 'not worth the effort to try to work around the licensing mechanism', then by definition you have invested far too much time and effort implementing the licensing scheme".
Everything Eric wrote is (AFAIK) completely true. But the fact is, the broken licensing scheme is much better, because it requires almost no investment to create, and will still stop anyone that would only violate the license when it's trivial to do so (which frankly is a large proportion of those who are legitimate users).
On the other hand, there's basically no evidence that the bulk of people who would violate the license when it takes some effort to do so would ever actually be legitimate users of the software. They don't represent lost sales, and if anything they still may act as advertising for the product.
And if all that wasn't enough, there are the actual legitimate users to consider. Complicated licensing schemes are more prone to breakage, and when the schemes do break, the only users who are affected are the legitimate ones. So, not only does the developer spend all this time and effort to implement the scheme, rather than putting that time and effort into things that make the product better for legitimate users, legitimate users then suffer when the licensing scheme breaks. The legitimate users aren't harmed once, but TWICE! (As a legitimate user who has several times been harmed by a licensing scheme gone bad, I am really sick and tired of these licensing schemes).
And don't forget the normal, non-stealing end user in all of this. Make it too hard to use the product normally, and you end up alienating your paying customer base who will shop elsewhere.
@pete.d there doesn't even need to be any financial incentive - there's very little you can do against an army of bored teenagers.
Happy holidays Eric!
Very good post, this is sound advice as well as entertaining.
Is there anything in particular that prompted you to make this post? I presume you didn't just choose the topic at random ;-)
I haven't done a post on security in a while. And someone recently proposed such a scheme on StackOverflow, so I'd already been thinking about chosen ciphertext attacks. -- Eric
By the way, I love the fact that your list started from zero :-)
The number is the distance from the beginning of the list. ;-) -- Eric
Is that Windows Activation you're talking about? That one's definitely flawed...
Windows Activation solves the problem that it was designed to solve reasonably well. -- Eric
How do you fly south to Canada, though?
Easily enough. Take a look at a map of North America and you'll figure it out. -- Eric
>> If the token cannot be authenticated then I go into some error path and deny access to the software.
Another thought - maybe the default should be no access, and a special case should be to allow access, rather than assume safe, and only deny access via error code.
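A small sketch of that default-deny idea, under the assumption that the licence check is some callable that may fail in arbitrary ways. The names is_authorized and check are hypothetical.

```python
from typing import Callable

def is_authorized(check: Callable[[], object]) -> bool:
    """Fail closed: access is granted only if the check explicitly returns True."""
    allowed = False                   # the default is no access
    try:
        allowed = check() is True     # anything other than literal True is a denial
    except Exception:
        allowed = False               # an error in the check is also a denial
    return allowed

print(is_authorized(lambda: True))     # True
print(is_authorized(lambda: "yes"))    # False: truthy is not good enough
print(is_authorized(lambda: 1 / 0))    # False: the check blew up
```

The point of the shape is that every path through the function that does not positively confirm the licence ends in a denial, rather than relying on an error code to take access away.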
Yes, these scenarios are not secure.
Because this is not how RSA was designed to be used! You are not supposed to encrypt messages with an RSA private key at all, only short session keys that you generate randomly yourself!
When the server receives a message to be encrypted, it
- generates a random symmetric-algorithm session key, say a 256-bit AES key
- encrypts the message with this 256-bit key using the AES algorithm
- encrypts the 256-bit key with the _client's public key_
You see, the client will not receive any hints about the server's private key.
- if a digital signature is required, the server can also attach a copy of that 256-bit key (possibly combined with a hash of the message) encrypted with its private key.
In this case, the client will get some hint about the server's private key, but it would be just 256 bits of encrypted random (not client-suggested) numbers. Not enough to crack the private key. Useless for many of the attack scenarios listed above.
Anyway, I have a better piece of advice. People figured out how to use crypto algorithms a while ago. Learn that very simple know-how (plus a little bit of common sense) and use it.
Of course, if there is a solution already, like HTTPS/SSL - just use it.
Along the same lines, perhaps the issue is that most crypto discussions start with ultra-high-level fun with Alice, Bob and Mary, and then quickly do a deep dive into the nuances of one-way hashes, salts and IVs. Poor coders like me spend days on some stupid requirement from someone else to do it way x, and somehow have to figure out how to make it all work and be secure. Where's the middle ground?
You're perfectly correct to say "RSA wasn't built for that" but that presumes that I know what RSA was built for, or that I can easily determine what RSA is built for, and what it's not built for. To many people encrypted = encrypted. There's a reason XORing bits is still performed by some people.
Places like Stack Overflow are, for better or worse, idea exchanges, and I've seen many a doozy there, and we all know what opinions are like .... because we all have 'em.
Don't get me wrong, but there's a reason why there is either (a) no crypto or (b) broken crypto! The problem is between the keyboard and the chair.
Hmm .. maybe I should google 'crypto wiki'
Anon: don't sell short XORing bits. The most effective crypto possible is a one-time pad (a long stream of predetermined random bits), XORed with your data.
Let's examine this bold claim. Were you to make the claim that a one-time pad shared private key the same length as the plaintext XORed with the plaintext is the most unbreakable crypto scheme ever, I would agree with you completely. It is utterly unbreakable, assuming the secrecy of the shared private key and the randomness of the key. But you didn't say "unbreakable", you said "effective". If this scheme is the most effective, and is unbreakable, then why don't people use it?
Because it is completely impractical, that's why. This scheme requires that the keys be enormous, that they be secret, and that you have a secure, tamperproof mechanism for exchanging secret keys between trusted parties. But if you have as a requirement of your system that you have a secure way to exchange long secret messages, then why do you need the cryptosystem in the first place? Just send your actual messages via the secure mechanism!
The benefit of course is that you can exchange keys when there is a secure communications mechanism -- like, Alice and Bob are in a room together -- and then leverage that into providing security over an insecure channel, like Alice wants to send an email to Bob a day later. But in modern communications, we don't often have that secure communications mechanism in the first place. It's not like Amazon.com and I get into a room together and exchange keys once a year. We need to have a system where parties can exchange short session keys over an insecure channel; public key cryptosystems do that way, way better than one-time-pad shared-key systems.
My definition of "effective" includes "practical". One-time pads are simply not effective or practical for most modern IT applications. -- Eric
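For readers who have not seen one, a one-time pad is short enough to sketch in a few lines. This assumes the pad is truly random, exactly as long as the message, kept secret, and never reused. The helper name xor_bytes is hypothetical.

```python
import secrets

def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR each byte of data with the matching byte of the pad."""
    if len(pad) != len(data):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(a ^ b for a, b in zip(data, pad))

message = b"attack at dawn"
pad = secrets.token_bytes(len(message))    # the shared secret, as long as the message

ciphertext = xor_bytes(message, pad)       # encrypt
plaintext = xor_bytes(ciphertext, pad)     # decrypt: the same operation again
print(plaintext == message)                # True
```

The code is trivial. The impractical part is the line that creates the pad: both parties need that same secret, of the same size as everything they will ever send, delivered over a channel that is already secure.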
Having your software run in an untrusted environment is never hack-proof, but there's one advantage to trying - the DMCA gives you a bigger "stick" as it criminalises the act of circumventing your licensing scheme.