[Dovecot] CRAM-MD5 Password Generation Algorithm

Sun Apr 13 23:12:41 EEST 2008

At 11:57 AM +0100 4/13/08, Douglas Willcocks  imposed structure on a 
stream of electrons, yielding:
>On Sun, 13 Apr 2008 01:39:44 -0400, Bill Cole
><dovecot-20061108 at billmail.scconsult.com> wrote:
>>  At 9:32 PM +0100 4/12/08, Douglas Willcocks  imposed structure on a
>>  stream of electrons, yielding:
>>  [...]
>>>I've read through the document, and I now understand where exactly the
>>>precomputed context sits in the whole picture, but I'm still unsure how
>>  to
>>>reproduce it without dovecotpw.
>>
>>  In looking a little closer at RFC1321 that defines MD5, I think I see
>>  the trick.  The stored contexts are not full MD5's of Ki and Ko, they
>>  are just the 16 bytes of state after hashing those single key blocks.
>>  A full MD5 hash gets finalized by padding out the data with nulls and
>>  a 64-bit count of the hashed data. If you  were to do a normal MD5 of
>>  Ki and Ko, the result would include the result of the finalization
>>  step.
>>
>>  You can see this in the hmac_md5_get_cram_context and
>>  hmac_md5_set_cram_context functions in the Dovecot code. The former
>>  extracts just the 4 32-bit state values, and the latter sets them and
>>  sets the bit count in the context structure to 512 (i.e. 64 bytes)
>>
>>  Maybe that provides a helpful clue...
>>
>
>Thank you very much for the time you spent looking into this.

Some people spend their weekends doing crosswords or Sudoku...

>There are still some wholes in the algorithm, but so far:
>
>1. Do the first part of HMAC as normal up to XOR'ing with 0x5c, including
>strangely enough what looks like a full (i.e. including nulling and
>padding) md5 hash if the passphrase is longer than 64 (this step should be
>feasible to replicate)

The HMAC specification is clear on that: if the shared secret (i.e. 
the password/passphrase) is longer than the hash function blocksize, 
it is hashed first and the hash output used as the key instead of the 
secret itself.

>2. Do the first step of md5 hashing (again, using the Perl library you
>suggested, it should be feasible to replicate the required functionality)
>
>-- From now on, it gets a bit sketchy
>
>3. Using the intermediate states of Ko and Ki, extract the various segments
>into a char[] so you end up with an array of 32 8bit segments. I'm unsure
>exactly what the bitwise AND operator is used for, I thought that a bitwise
>AND with 0xFF would leave everything the same?

Are you looking at the RFC1321 reference code's Encode function? The 
point there is that the input is an array of 32-bit unsigned integers 
(UINT4), the output is an array of (8-bit) characters. Doing an AND 
with 0xFF (an 8-bit value) on bit-shifted UINT4's is a way of 
clearing  everything but the 8 bits of interest and making the 
typecast formally correct. In a compilation and runtime environment 
that will let you play freely with pointer types, that function would 
be unneeded because you could just address the input as a char * 
instead of a UINT * and be done with it, but the formal approach 
Rivest uses assures that the bits are cleanly copied from an array of 
32-bit members to an array of 8-bit members even on systems where a 
pointer cast would not give correct results.

It is also important to keep the distinction between the HMAC and MD5 
layers clear. HMAC can be used with any blockwise hash function, and 
the internal details of the hash function are not important to the 
HMAC layer.

I'd break it up this way, with a warning that I'm hand-waving the 
mathematical innards of MD5:

1. If the password is longer than 64 bytes, run a full normal MD5 on 
it and use the 16-byte result as your shared secret. Otherwise, the 
password IS the shared secret. In practice, it would be a good idea 
to simply restrict passwords in size and content because 
authentication clients (i.e. MUA's) are going to have issues with 
reproducing passwords that have non-ASCII, whitespace, and control 
characters.

2. Generate the 2 MD5 "keys" from the shared secret. In Perl this 
might look like this:

$Ki = "$secret" ^ (chr(0x36) x 64);
$Ko = "$secret" ^ (chr(0x5c) x 64);

Keep in mind that the keys are special to HMAC, but as far as MD5 is 
concerned they are just the first 64-byte/512-bit blocks fed into the 
hash.

3. Initialize a fresh MD5 hash structure (if you are using a full MD5 
implementation ) or just a 'state' array containing the 4 32-bit 
values of the standard MD5 initialization vector (if you are doing 
your own implemetation or have direct access to a MD5Transform 
equivalent.)

4. Add Ki to the hash. In reference code terms, this would be a 
MD5Update call with Ki as the input. Note that most of the logic in 
the reference code MD5Update is about supporting the addition of data 
to the hash in chunks that don't fill 64-byte transformation blocks 
perfectly. If you are doing your own implementation just for this 
application, MD5Update is logically unnecessary because you can just 
pass your MD5Transform-equivalent a state array with the standard MD5 
initialization vector and Ki (which is a 64-character array)

5. The state array now holds the 'inner' context.

6. Repeat 3-4 with another fresh hash structure or state array and Ko.

7. The state array now holds the 'outer' context,

>4. Display a hex representation of the array contents.

Rendering those numbers as a hex string in the right order may not be 
obvious...

>I'll have a go later today and report back any findings. Of course, I could
>have completely misunderstood one of the steps, and that'll screw up the
>others, but still.

If OO Perl is your thing, I think that Digest::Perl::MD5 has 
everything you need except a public interface to the state array. You 
can still get to it however, and to the subroutine that does the hex 
string rendering...

In fact, look here:

http://www.scconsult.com/bill/crampass.pl

It does have a dependency issue, in that it requires Digest::Perl::MD5.

-- 
Bill Cole
bill at scconsult.com