[Dovecot] Dovecot LDA (deliver) stopped working
Hello,
One day a week or so ago, a server that's Postifx w/ Local users, dovecot w/ LDA decided to stop working. The only thing that stopped is the LDA portion, the rest all works fine. I also posted to the postfix mailing list yesterday but wasn't able to debug any further. Here's the situation:
if I use mailbox_command to point to deliver, postfix will always put the message in the deferred queue. However, if I run deliver from the command line, it works fine. I've recompiled Dovecot and Postfix cleanly and reinstalled a few times to ensure permissions are OK. It's not a virtual setup, so I do NOT have setuid root on the deliver binary.
Postfix people suggested placing a wrapper script for mailbox_command, which I did, and have it log to a text file, so I do know that deliver is being called. However, although I have LDA debug logging on, and it DID log prior to this, when postfix supposedly calls deliver, dovecot does NOT log anything. It DOES however log if I call deliver from any account (root or not) from the command line (ls / | /path/to/deliver -d localaccountname). Any way I can debug this further? I would assume that deliver would log something even if it's called, but it appears that it does not.
My postfix related details are here:
http://marc.info/?l=postfix-users&m=120119147917409&w=2
I followed the postfix/lda wiki (http://wiki.dovecot.org/LDA/Postfix) when setting it up, and it worked perfectly for months. I did indeed only have to change mailbox_command and it worked (I'm also using sasl auth, but that should be unrelated). Obviously something happened on my system, but I would assume recompiling/reinstalling would reset things. Nothing in the config files for postfix or dovecot changed, from what I can see, as generally I am the only one making changes.
Thanks, --falz
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Fri, Jan 25, 2008 at 08:42:09AM -0600, falz wrote:
Hello,
One day a week or so ago, a server that's Postifx w/ Local users, dovecot w/ LDA decided to stop working [...]
Postfix people suggested placing a wrapper script for mailbox_command, which I did, and have it log to a text file, so I do know that deliver is being called.
To get you right here: the wrapper is being called by Postfix, right?
However, although I have LDA debug logging on, and it
DID log prior to this, when postfix supposedly calls deliver, dovecot does NOT log anything.
...and LDA (called from the wrapper) doesn't run (or doesn't run to the point of logging anything).
It DOES however log if I call deliver from any
account (root or not) from the command line (ls / | /path/to/deliver -d localaccountname). Any way I can debug this further? I would assume that deliver would log something even if it's called, but it appears that it does not.
That's rather bizarre. I would up the verbosity of the wrapper (f.ex. putting a couple of line 'which /path/to/deliver' or 'ldd /path/to/deliver'
- -- or even 'strace /path/to/deliver'). Something in the environment must be different to the command lines, right?
Regards
- -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux)
iD8DBQFHmf6gBcgs9XrR2kYRAkDBAJ40oLz6KG8sThDwmY29j90RJKtS1wCfYwdF sbY2JuoGCcsIkKLGnZ8/GJM= =RrwT -----END PGP SIGNATURE-----
On Jan 25, 2008 9:22 AM, tomas@tuxteam.de wrote:
Postfix people suggested placing a wrapper script for mailbox_command, which I did, and have it log to a text file, so I do know that deliver is being called.
To get you right here: the wrapper is being called by Postfix, right?
Yes, postfix calls a wrapper. The wrapper is a shell script that calls deliver. This is the wrapper. $LOGNAME comes from postfix, it's a built in variable it populates as the local username:
#!/bin/sh
DELIVER=/usr/local/libexec/dovecot/deliver
DATE=date
echo "==" >> /tmp/foo.txt
echo $DATE >> /tmp/foo.txt
echo $USER >> /tmp/foo.txt
$DELIVER -d "$LOGNAME"
echo $DATE >> /tmp/foo.txt
However, although I have LDA debug logging on, and it
DID log prior to this, when postfix supposedly calls deliver, dovecot does NOT log anything.
...and LDA (called from the wrapper) doesn't run (or doesn't run to the point of logging anything).
Well, it's hard for me to tell. I know for sure that the wrapper ran, as I get my temp log does get the tiemstamp that's before/after deliver. However, I guess I dont know the best way to log the output/exit status of the deliver command.
It DOES however log if I call deliver from any
account (root or not) from the command line (ls / | /path/to/deliver -d localaccountname). Any way I can debug this further? I would assume that deliver would log something even if it's called, but it appears that it does not.
That's rather bizarre. I would up the verbosity of the wrapper (f.ex. putting a couple of line 'which /path/to/deliver' or 'ldd /path/to/deliver'
- -- or even 'strace /path/to/deliver'). Something in the environment must be different to the command lines, right?
Hmm, perhaps I should somehow log the entire status of SETENV or something. What I probably really need to do is figure out the appropriate pipes after the deliver line to find out if it's spitting anything back to stderr or something. This is usually.. >> 2&1 or something similar? Anyone have suggestions for that?
--falz
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Fri, Jan 25, 2008 at 10:03:30AM -0600, falz wrote:
On Jan 25, 2008 9:22 AM, tomas@tuxteam.de wrote:
Postfix people suggested placing a wrapper script for mailbox_command, which I did, and have it log to a text file, so I do know that deliver is being called.
To get you right here: the wrapper is being called by Postfix, right?
Yes, postfix calls a wrapper. The wrapper is a shell script that calls deliver. This is the wrapper. $LOGNAME comes from postfix, it's a built in variable it populates as the local username:
#!/bin/sh DELIVER=/usr/local/libexec/dovecot/deliver DATE=
date
echo "==" >> /tmp/foo.txt echo $DATE >> /tmp/foo.txt echo $USER >> /tmp/foo.txt $DELIVER -d "$LOGNAME" echo $DATE >> /tmp/foo.txt
I'd typically do
#!/bin/sh { DELIVER=... echo "==" echo $DATE ... } >> /tmp/foo.txt 2>&1
...so you have the redirection in one place. Plus it catches stdout and stderr of other calls (especially the one from deliver, you are interested in).
Maybe an "strace $DELIVER", although definitely overkill, might lead you to the problem.
[...]
Well, it's hard for me to tell. I know for sure that the wrapper ran, as I get my temp log does get the tiemstamp that's before/after deliver. However, I guess I dont know the best way to log the output/exit status of the deliver command.
As for the output, see above. The exit status is in the special shell variable $?
[...]
Hmm, perhaps I should somehow log the entire status of SETENV or something.
That would be env (/usr/bin/env)
What I probably really need to do is figure out the
appropriate pipes after the deliver line to find out if it's spitting anything back to stderr or something. This is usually.. >> 2&1 or something similar? Anyone have suggestions for that?
See above. I'd group everything in { ... } and do the redirection at the end. Less to write (and less to fix when you change your mind :-)
Regards
- -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux)
iD8DBQFHmh/nBcgs9XrR2kYRAtpcAJ91Y43K5UtN+W7+JskD4VVa0EVsvQCfW9tm qJCCdNitfJFnarvPUU/0shg= =th7z -----END PGP SIGNATURE-----
On Jan 25, 2008 11:44 AM, tomas@tuxteam.de wrote:
I'd typically do
#!/bin/sh { DELIVER=... echo "==" echo $DATE ... } >> /tmp/foo.txt 2>&1
Thanks, I wasn't aware of the curly bracket trick. STRACE found the problem for me! It was something completely unexpected. Patsing my script here for future reference for anyone:
#!/bin/sh
{
DELIVER=/usr/local/libexec/dovecot/deliver
STRACE=/usr/local/bin/strace
DATE=date
echo "=="
/usr/bin/env
echo $DATE
echo $LOGNAME
$STRACE $DELIVER -d "$LOGNAME"
echo $DATE
} >> /tmp/foo.txt 2>&1
So, doing this, near the end of the STRACE gave me:
open("/var/log/dovecot-lda.log", O_WRONLY|O_APPEND|O_CREAT, 0666) = 6 lseek(6, 0, SEEK_END) = 51237618 fcntl(6, F_GETFD) = 0 fcntl(6, F_SETFD, FD_CLOEXEC) = 0 open("/var/log/dovecot-lda.log", O_WRONLY|O_APPEND|O_CREAT, 0666) = 7 lseek(7, 0, SEEK_END) = 51237618 fcntl(7, F_GETFD) = 0 fcntl(7, F_SETFD, FD_CLOEXEC) = 0 fstat(7, {st_mode=0, st_size=0, ...}) = 0 gettimeofday({0, 0}, NULL) = 0 access("/etc/localtime", R_OK) = 0 open("/etc/localtime", O_RDONLY) = 8 fstat(8, {st_mode=S_IFCHR|S_ISUID|S_ISGID|S_ISVTX|045, st_rdev=makedev(37, 842006560), ...}) = 0 read(8, "\377\377\377\377\377\377\377\377\377\377\377\377\377\377"..., 7944) = 1279 close(8) = 0 write(7, "deliver(username): Jan 25 12:58:"..., 100) = -1 EFBIG (File too large) --- SIGXFSZ (Filesize limit exceeded: 25) --- --- SIGXFSZ (Filesize limit exceeded: 25) --- syscall_417(0xbfbfe820) = -1 (errno 27) exit(75) = ?
So, the issue was that SOMETHING is enforcing /var/log/dovecot-lda/log size! I was able to get this working by rotating, and the solution will be to constantly rotate, but I still need to figure out what's enforcing this.
I have no filesystem quotas enabled (FreeBSD) and never have.
I could not find any 'max file size' type of settings in dovecot. Even if there were, I don't know why this would be the default behavior!
So, anyone have thoughts as to what's enforcing the "Filesize limit exceeded: 25"?
--falz
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Fri, Jan 25, 2008 at 01:14:15PM -0600, falz wrote:
[...] STRACE found the problem for me! [...]
Good to hear!
write(7, "deliver(username): Jan 25 12:58:"..., 100) = -1 EFBIG (File too large)
Hmmm...
So, the issue was that SOMETHING is enforcing /var/log/dovecot-lda/log size! I was able to get this working by rotating, and the solution will be to constantly rotate, but I still need to figure out what's enforcing this.
Check ulimit (in the shell, say ulimit -a). Maybe some script is over-cautious and is setting user limits.
Regards
- -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux)
iD8DBQFHmsi8Bcgs9XrR2kYRApy7AKCBDdClW/u06eRQqbwbFlYFV8krrQCff3gN AECZmgT574uOGMMk7tFSYFQ= =+SDt -----END PGP SIGNATURE-----
On Jan 25, 2008 11:44 PM, tomas@tuxteam.de wrote:
So, the issue was that SOMETHING is enforcing /var/log/dovecot-lda/log size! I was able to get this working by rotating, and the solution will be to constantly rotate, but I still need to figure out what's enforcing this.
Check ulimit (in the shell, say ulimit -a). Maybe some script is over-cautious and is setting user limits.
This seems fine (on FreeBSD it's "limits"):
# limits Resource limits (current): cputime infinity secs filesize infinity kB datasize 524288 kB stacksize 65536 kB coredumpsize infinity kB memoryuse infinity kB memorylocked infinity kB maxprocesses 5547 openfiles 11095 sbsize infinity bytes vmemoryuse infinity kB
write(7, "deliver(username): Jan 25 12:58:"..., 100) = -1 EFBIG (File too large) .. So, anyone have thoughts as to what's enforcing the "Filesize limit exceeded: 25"?
Postfix's mailbox_size_limit setting most likely.
Postfix does have this set to 50MB:
mailbox_size_limit = 51200000
..which is how large the LDA log was. Very odd, because the issue was writing the LOG, not writing to the mailbox. So, mailbox_size_limit must restrict the size of whatever is forked off of mailbox_command if I'm understanding it correctly. I'll ask further about that on the potsfix mailing list, but it sounds to me like that's it. Thanks much for your help!
--falz
participants (3)
-
falz
-
Timo Sirainen
-
tomas@tuxteam.de