Bitwise Evolution

Musings of a Portland-area hacker bent on improving digital lifestyles.

Creating a Secure Webauth System: Part 1 -- HMAC

This is the first in an n-part series about web authentication for a system where user identification and attribution is important, but content protection is not. This entry assumes that a secure method has been used to negotiate a shared secret – as the result of username / password authentication over https, for example.

Obviously the user login / account registration portion of a web auth system will require some secure connection, but once that authentication is completed we’d like to make use of a more efficient open protocol. (eg: http vs. https). There are many reasons for this: better performance, client-side caching, etc.. I’m not going into those details here. Neither will I address the initial authentication step other than to say that part of a successful login is the negotiation of a shared secret other than the user’s password. Ideally this is a 64-byte (or larger) id with a high probability of uniqueness. A GUID, essentially. (It is critical that the secret used is NOT the user’s password!) The actual secured login and secret negotiation will be addressed in another entry . At least, that’s the plan :).

Since our primary goal with this system is to ensure that people are who they say they are, and we’ve punted on the initial authentication (for now), the only place left for an attack is for some one to spoof a user who has already logged in. With no additional work, our login system would be useless – some one could simply skip the entire authentication process and issue an RPC with instructions to do evil things as Alice’s user without needing to know Alice’s password. To prevent this, we need to ensure that the same user who authenticated initially is the user who issued the unsecured RPC. This is where the shared secret comes into play. Only Alice and the server know her shared secret, so if the secret is passed along as a parameter of the RPC, then that is a strong indication that Alice is who she says she is.

But wait! We can’t just pass the secret as an RPC parameter, because these communications aren’t secure. Charlie could lurk on the ‘net, waiting for an RPC from Alice to the server, sniff out the secret, and then proceed to impersonate Alice. We could encrypt the secret, but then we just have a different secret – Charlie doesn’t need to know the unencrypted secret if the encrypted one works just as well. Alice and the server also need an agreed upon way to change the secret so it is different for each RPC, and this must be done in a way that Charlie can’t take the changed secret and either (1) get the initial secret out, or (2) generate the next changed secret.

HMAC: keyed-Hash Message Authentication Code

HMAC is a method of ensuring that a message (an RPC in our case) was generated by someone with access to a shared secret. HMAC makes use of some sort of one-way hashing function (like MD5 or SHA-1) to encrypt the secret along with a message. This generates a short digest of 16-20 bytes that acts as a fingerprint of the message+secret combination. When the digest is sent along with the message, the receiver (our server) can re-generate the hash with the same HMAC calculation and compare the locally generated digest with the digest that came along with the message. Remember: the server has the secret too, so it has enough information to confirm the digest.

So, back to our problem. Alice can now use the shared secret to create a digest of every RPC, and send that along with the RPC as a parameter. The server can then take the digest of the RPC and secret to compare, and then verify that the RPC actually originated with Alice, right?

Not quite yet…. there are still a couple holes in our plan.

Charlie could still sit back and snoop on Alice’s traffic and save an entire RPC, complete with digest, and reissue that RPC later. This is better than letting Charlie do whatever he wants, but there are still some things that could be quite dangerous. Say Alice accidentally deletes something, and undoes the deletion. Charlie could re-issue the deletion and Alice would loose data. The server needs to know not to accept the same request twice (but what if Alice wants to do something twice, you ask? Well, we have to make Alice’s second request a little bit different from the first one, which we can do!).

What if we create a digest of some sequence identifier and pass the sequence ID along with the RPC? Since the digest, ID, and RPC are inseparable (the digest and ID are obviously linked, and the RPC can’t be sent without a valid digest, which Charlie can’t reproduce, since the accompanied digest+id pair is only good once) then we don’t need to create a digest of the entire RPC (it wouldn’t hurt, it’s just a fairly complex thing to do). By incrementing the sequence id and recalculating a digest of it and the secret, then we can keep from issuing the same request more than once, and the server will know to ignore duplicates.

So, this is where we’re at:

  • Charlie can’t snoop the secret, since it’s encrypted with a changing message (the sequence id)
  • Charlie can’t re-issue a “recorded” RPC invocation, because the digest can’t be reversed and Charlie can’t create a valid (digest, sequence) pair without the secret.
  • Charlie can’t change a RPC, again because of the trouble with creating a digest, sequence pair.

Charlie’s only recourse is to try and find a secret which generates the same digests as the secret that Alice is using. This is theoretically possible, since Charlie could probably figure out the hashing algorithm used, and run a brute-force attack, hoping to luck out and find the secret quickly. The possibility of this happening is extremely low, however. Furthermore, each session will use a new secret, so Charlie will only have one session’s worth of time to crack each secret. Even creating a rainbow table will fail if the secrets aren’t of trivial length. (A 64-bit secret will be to large, and we’re using secrets 8 times that size.)

Technical details and further reading

When implementing an approach like this, make sure to guard the secret. It would be easy to accidentally store the secret on the web client as a plain cookie which will then be transmitted in the clear with each RPC, and therefore defeat the purpose. Use a secure cookie, or some other storage method to prevent this.

The HMAC RFC describes the algorithm in detail (and it’s a fairly short, easy to read RFC.) and the Wikipedia page gives a nice description too: HMAC on WikiPedia