[Dovecot] weird ssl-parameters.dat regeneration error
I just had a strange problem with my Dovecot 1.0 installation. After about two weeks, it logged these messages:
dovecot: Error: ssl-build-param:
rename(/usr/local/var/lib/dovecot/ssl-parameters.dat.tmp,
/usr/local/var/lib/dovecot/ssl-parameters.dat) failed: No such
file or directory
dovecot: Error: child 30689 (ssl-build-param) returned error 89
And thereafter, all login attempts resulted in this:
dovecot: Error: imap-login: read(ssl-parameters.dat) failed:
Unexpected EOF
dovecot: Error: child 19036 (login) returned error 89
Thankfully, deleting the ssl-parameters.dat file (/var/run/dovecot/memoryhole.net/login/ssl-parameters.dat) and restarting dovecot fixed the problem, but it seems odd to me that dovecot expects /usr/local/var/lib/dovecot to exist (because, of course, it doesn't). I'm not defining that directory in the config file anywhere. Does anyone have any idea what's going on?
~Kyle
I'm sick of following my dreams. I'm just going to ask them where they're going and hook up with them later. -- Mitch Hedberg
On Tuesday, May 1 at 08:33 PM, quoth Kyle Wheeler:
I just had a strange problem with my Dovecot 1.0 installation. After about two weeks, it logged these messages:
dovecot: Error: ssl-build-param:
rename(/usr/local/var/lib/dovecot/ssl-parameters.dat.tmp,
/usr/local/var/lib/dovecot/ssl-parameters.dat) failed: No such file or directory
dovecot: Error: child 30689 (ssl-build-param) returned error 89
For what it's worth, here is my dovecot -n output:
# /service/dovecot-memoryhole.net/dovecot.conf
base_dir: /var/run/dovecot/memoryhole
log_path: /dev/stderr
log_timestamp:
listen: imap.memoryhole.net:143
ssl_listen: imap.memoryhole.net:993
ssl_cert_file: /etc/ssl/certs/imap.memoryhole.net.pem
ssl_key_file: /etc/ssl/private/imap.memoryhole.net.key
login_dir: /var/run/dovecot/memoryhole/login
login_executable: /usr/local/libexec/dovecot/imap-login
login_greeting: There was suppose to be an earth-shattering KA-BOOM!!!
login_greeting_capability: yes
valid_chroot_dirs: /var/lib/vpopmail/domains
verbose_proctitle: yes
first_valid_uid: 64020
last_valid_uid: 64020
first_valid_gid: 64020
last_valid_gid: 64020
mail_location: maildir:%h/Maildir
dotlock_use_excl: yes
maildir_copy_with_hardlinks: yes
mbox_write_locks: fcntl
mail_drop_priv_before_exec: yes
mail_executable: /usr/local/bin/relay-ctrl-allow-wrapper.sh /usr/local/libexec/dovecot/imap
mail_log_max_lines_per_sec: 0
auth default:
  mechanisms: plain login
  default_realm: memoryhole.net
  user: vpopmail
  passdb:
    driver: ldap
    args: /var/lib/dovecot/dovecot-ldap.conf
  userdb:
    driver: static
    args: uid=64020 gid=64020 home=/var/lib/vpopmail/domains/%Ld/%Ln
~Kyle
Always forgive your enemies; nothing annoys them so much. -- Oscar Wilde
On Tuesday, May 1 at 08:33 PM, quoth Kyle Wheeler:
I just had a strange problem with my Dovecot 1.0 installation. After about two weeks, it logged these messages:
dovecot: Error: ssl-build-param:
rename(/usr/local/var/lib/dovecot/ssl-parameters.dat.tmp,
/usr/local/var/lib/dovecot/ssl-parameters.dat) failed: No such file or directory
dovecot: Error: child 30689 (ssl-build-param) returned error 89
And thereafter, all login attempts resulted in this:
dovecot: Error: imap-login: read(ssl-parameters.dat) failed:
Unexpected EOF
dovecot: Error: child 19036 (login) returned error 89
My problem seems to be recurrent - about once a week (whenever Dovecot decides to regenerate the ssl-parameters.dat file), Dovecot goes belly up.
If it helps, this didn't seem to happen with 1.0rc26, which is what I was using before moving to 1.0.
Please, does *anyone* know what the problem might be?
~Kyle
Three things in human life are important. The first is to be kind. The second is to be kind. And the third is to be kind. -- Henry James
On Monday, May 7 at 07:55 AM, quoth Kyle Wheeler:
dovecot: Error: ssl-build-param: rename(/usr/local/var/lib/dovecot/ssl-parameters.dat.tmp, /usr/local/var/lib/dovecot/ssl-parameters.dat) failed: No such file or directory
dovecot: Error: child 30689 (ssl-build-param) returned error 89
And thereafter, all login attempts resulted in this:
dovecot: Error: imap-login: read(ssl-parameters.dat) failed:
Unexpected EOF
dovecot: Error: child 19036 (login) returned error 89
Ahhh, I think I figured out what's going on.
I run several different instances of dovecot, one for each of my domains (i.e. each one has a different SSL key, and a different auth_default_realm, and a different base_dir, but otherwise the config files are identical).
When Dovecot regenerates its ssl-parameters.dat file, there is a race condition between the multiple instances of dovecot, because they all regenerate the file in the same compile-time-defined $statedir directory: /usr/local/var/lib/dovecot. Because of that, the ssl-parameters.dat gets stolen by one of the dovecot instances, and so the other dovecot instances end up wondering what happened to their ssl-parameters.dat file.
Take, for example, this timeline:
Dovecot1                         Dovecot2
create ssl-parameters.dat.tmp
                                 create ssl-parameters.dat.tmp
rename to ssl-parameters.dat
                                 rename to ssl-parameters.dat
                                 ERROR: tmp file missing!
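The timeline above can be replayed deterministically. What follows is only a rough sketch in Python, not Dovecot code: the function name and the placeholder file contents are made up for illustration, and a temporary directory stands in for the shared $statedir. The two "instances" are simply interleaved by hand in the same order as the timeline.

```python
import errno
import os
import tempfile


def simulate_shared_statedir_race():
    """Replay the timeline: two instances share one .tmp path (illustrative only)."""
    events = []
    # A temporary directory stands in for the compile-time $statedir.
    with tempfile.TemporaryDirectory() as statedir:
        tmp_path = os.path.join(statedir, "ssl-parameters.dat.tmp")
        final_path = os.path.join(statedir, "ssl-parameters.dat")

        # Dovecot1 creates the temporary file.
        with open(tmp_path, "wb") as f:
            f.write(b"placeholder parameters from Dovecot1")
        events.append(("Dovecot1", "create", 0))

        # Dovecot2 creates "its" temporary file, which is the same path.
        with open(tmp_path, "wb") as f:
            f.write(b"placeholder parameters from Dovecot2")
        events.append(("Dovecot2", "create", 0))

        # Dovecot1 renames the temporary file into place.
        os.rename(tmp_path, final_path)
        events.append(("Dovecot1", "rename", 0))

        # Dovecot2 tries the same rename, but the .tmp file is already gone.
        try:
            os.rename(tmp_path, final_path)
            events.append(("Dovecot2", "rename", 0))
        except FileNotFoundError as exc:
            events.append(("Dovecot2", "rename", exc.errno))
    return events


if __name__ == "__main__":
    for who, action, err in simulate_shared_statedir_race():
        status = "ok" if err == 0 else os.strerror(err)
        print(f"{who}: {action}: {status}")
```

The last rename fails with ENOENT, which corresponds to the "No such file or directory" in the log.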
Now, in old 0.99 versions of dovecot, I understand that you could, in
the config file, change the name of the ssl-parameters.dat file. If I
could still do that, I think it would fix my issue. Or, if I could
change the $statedir in the config file.
Does anyone have any good solutions? Is my only option to maintain separate compiled versions of dovecot for every domain? (This seems idiotic, not to mention a lot of hassle.)
~Kyle
I believe that every human has a finite number of heart-beats. I don't intend to waste any of mine running around doing exercises. -- Neil Armstrong
On Mon, 2007-05-07 at 12:46 -0600, Kyle Wheeler wrote:
When Dovecot regenerates its ssl-parameters.dat file, there is a race condition between the multiple instances of dovecot, because they all regenerate the file in the same compile-time-defined $statedir directory: /usr/local/var/lib/dovecot.
This should fix it: http://dovecot.org/list/dovecot-cvs/2007-May/008756.html
On Sunday, May 13 at 04:47 PM, quoth Timo Sirainen:
On Mon, 2007-05-07 at 12:46 -0600, Kyle Wheeler wrote:
When Dovecot regenerates its ssl-parameters.dat file, there is a race condition between the multiple instances of dovecot, because they all regenerate the file in the same compile-time-defined $statedir directory: /usr/local/var/lib/dovecot.
This should fix it: http://dovecot.org/list/dovecot-cvs/2007-May/008756.html
*Almost*; my several dovecot instances don't all use the same base_dir, because they got mad and started fighting over who got to put a dict-server and master.pid file in there, and who got to play with the login directory (I haven't figured out how to get them all to share an auth server yet). With this patch, some of my dovecot instances just won't regenerate the ssl-parameters.dat file (though they will fail more gracefully, which is nice).
~Kyle
Liberty means responsibility. That is why most men dread it. -- George Bernard Shaw
On Sun, 2007-05-13 at 09:08 -0600, Kyle Wheeler wrote:
On Sunday, May 13 at 04:47 PM, quoth Timo Sirainen:
On Mon, 2007-05-07 at 12:46 -0600, Kyle Wheeler wrote:
When Dovecot regenerates its ssl-parameters.dat file, there is a race condition between the multiple instances of dovecot, because they all regenerate the file in the same compile-time-defined $statedir directory: /usr/local/var/lib/dovecot.
This should fix it: http://dovecot.org/list/dovecot-cvs/2007-May/008756.html
*Almost*; my several dovecot instances don't all use the same base_dir, because they got mad and started fighting over who got to put a dict-server and master.pid file in there, and who got to play with the login directory (I haven't figured out how to get them all to share an auth server yet). With this patch, some of my dovecot instances just won't regenerate the ssl-parameters.dat file (though they will fail more gracefully, which is nice).
Only one of them needs to regenerate the file. The rest of them should just copy it to their login_dir. But yes, the patch didn't fix it completely, this should now make it really work (although still untested, it's a bit annoying to test..):
On Sunday, May 13 at 06:27 PM, quoth Timo Sirainen:
Only one of them needs to regenerate the file. The rest of them should just copy it to their login_dir.
Hmm, okay. How do they know when the file is fully regenerated?
But yes, the patch didn't fix it completely, this should now make it really work (although still untested, it's a bit annoying to test..):
One minor thing; in the first patch, in ssl-init-main.c, in main(), ret should probably be initialized to zero.
Thanks very much for looking into this, Timo; I really appreciate it.
~Kyle
And thou shalt smite the house of Ahab thy master, that I may avenge the blood of my servants the prophets, and the blood of all the servants of the LORD, at the hand of Jezebel. For the whole house of Ahab shall perish. -- Bible, II Kings (9:7-8)
On Sunday, May 13 at 09:57 AM, quoth Kyle Wheeler:
On Sunday, May 13 at 06:27 PM, quoth Timo Sirainen:
Only one of them needs to regenerate the file. The rest of them should just copy it to their login_dir.
Hmm, okay. How do they know when the file is fully regenerated?
Oh! I think I see; file_try_lock() blocks until the lock is obtained or fails, correct?
~Kyle
Reliability means never having to say you're sorry. -- Dr. Daniel J. Bernstein
On Sun, 2007-05-13 at 10:07 -0600, Kyle Wheeler wrote:
On Sunday, May 13 at 09:57 AM, quoth Kyle Wheeler:
On Sunday, May 13 at 06:27 PM, quoth Timo Sirainen:
Only one of them needs to regenerate the file. The rest of them should just copy it to their login_dir.
Hmm, okay. How do they know when the file is fully regenerated?
Oh! I think I see; file_try_lock() blocks until the lock is obtained or fails, correct?
No. It goes something like this:
- see if global ssl-parameters.dat's mtime is higher than in login dir
  - if yes, copy the file to login dir preserving its mtime
- check if login/ssl-parameters.dat's mtime is older than configured
  regeneration time
  - if not, try again in 10 mins
- open ssl-parameters.dat.tmp file
- try to lock it
  - if it fails someone's already rebuilding it. check again in 10 mins.
- write the new parameters to the .tmp file
- rename() .tmp to ssl-parameters.dat
- copy to login/
So the processes that failed to lock the .tmp file will just copy the ssl-parameters.dat after 10 minutes.
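For readers following along, the steps described above can be sketched roughly as below. This is a Python illustration under stated assumptions, not the actual patch: refresh_ssl_parameters, its arguments, and its return strings are hypothetical names, flock() stands in for whatever locking Dovecot's file_try_lock() uses, and the generate callback is a placeholder for the real parameter generation. It is Unix-only because of fcntl.

```python
import fcntl
import os
import shutil
import time

PARAMS = "ssl-parameters.dat"


def _mtime_ns(path):
    """Return the file's mtime in nanoseconds, or None if it does not exist."""
    try:
        return os.stat(path).st_mtime_ns
    except FileNotFoundError:
        return None


def refresh_ssl_parameters(state_dir, login_dir, max_age_secs, generate):
    """One pass of the check; the caller is assumed to retry later on
    "fresh", "copied" or "busy" (every 10 minutes in the description above)."""
    global_path = os.path.join(state_dir, PARAMS)
    login_path = os.path.join(login_dir, PARAMS)
    tmp_path = global_path + ".tmp"

    # Pick up a newer global file, preserving its mtime (copy2 does that).
    copied = False
    global_mtime = _mtime_ns(global_path)
    login_mtime = _mtime_ns(login_path)
    if global_mtime is not None and (login_mtime is None or global_mtime > login_mtime):
        shutil.copy2(global_path, login_path)
        copied = True

    # If the login copy is still young enough, there is nothing to rebuild.
    login_mtime = _mtime_ns(login_path)
    if login_mtime is not None and time.time() - login_mtime / 1e9 < max_age_secs:
        return "copied" if copied else "fresh"

    # Open the .tmp file without truncating it: another instance may hold
    # the lock and be writing to it right now.
    fd = os.open(tmp_path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+b") as tmp:
        try:
            # Non-blocking: fail immediately instead of waiting for the lock.
            fcntl.flock(tmp, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return "busy"  # someone else is already rebuilding
        tmp.truncate(0)
        tmp.write(generate())
        tmp.flush()
        os.rename(tmp_path, global_path)

    # Publish the fresh file to this instance's login directory.
    shutil.copy2(global_path, login_path)
    return "regenerated"
```

In this sketch an instance that gets "busy" never waits on the lock; on a later pass it sees the newer global file and simply copies it, which is the behaviour described above for the processes that failed to lock the .tmp file.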
On Sunday, May 13 at 08:27 PM, quoth Timo Sirainen:
On Sun, 2007-05-13 at 10:07 -0600, Kyle Wheeler wrote:
On Sunday, May 13 at 09:57 AM, quoth Kyle Wheeler:
On Sunday, May 13 at 06:27 PM, quoth Timo Sirainen:
Only one of them needs to regenerate the file. The rest of them should just copy it to their login_dir.
Hmm, okay. How do they know when the file is fully regenerated?
Oh! I think I see; file_try_lock() blocks until the lock is obtained or fails, correct?
No. It goes something like this:
Ahh, I see. Excellent - thanks very much!
~Kyle
I see the pain on your face when you say the word intellectual, because it has so many syllables in it. -- Clive James
Participants (2): Kyle Wheeler, Timo Sirainen