Hi Steffen,
Em 26/06/14 11:06, Steffen Kaiser escreveu:
maildir_copy_with_hardlinks is set on my server, too. It's the default now. I do not see these errors.
Can you check if the failed recipient is _never_ the first one? I remember some discussions about that the Dovecot LDA has to re-open the storred message of the first recipient, in order to spool it for the other ones. That caused problems because the server used an user-specific encryption key.
If that applies to LMTP, too, this might become problematic, if the first recipient deletes the message very quickly. Therefore my question, if the error pops up for the second recipient and maybe all following ones.
Just did some more scripting and got a total of 1413 distinct
emails (distinct queueids) that suffered from this 'message was expunged' problem on the 36 hours i got the problems. As noted before, not even a single one of these 1413 were nrcpt=1. Problems were from nrcpt=2 or greater. So, whatever this is, it really seems to NOT affect single recipient messages.
As you suspected, on those 1413 the very first recipient ALWAYS got
a status=sent from dovecot/lmtp. "Message was expunged" never happened on the very first recipient delivery.
About the first user (the first delivered message) checking the
message right after delivery, i tried to confirm that by the logs. To make things a little easier, i used nrcpt=2 and nrcpt=3 situations, just to avoid posting lots of logs here.
Case #1
1st recipient delivery OK, failed on 2nd recipient, just delivered successfully on 2nd try
Jun 25 17:08:53 correio postfix/qmgr[3059]: 2AFA01BF7496F: from=<alejandro.miguel@domain.com.br>, size=49366, nrcpt=2 (queue active) Jun 25 17:09:52 correio postfix/lmtp[6029]: 2AFA01BF7496F: to=<fabio.kado@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=59, delays=0.01/47/0/12, dsn=2.0.0, status=sent (250 2.0.0 <fabio.kado@domain.com.br> UbjnEywqq1MTPwAAHvf8vg Saved) Jun 25 17:09:55 correio postfix/lmtp[6029]: 2AFA01BF7496F: to=<marcelo.roberto@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=63, delays=0.01/47/0/16, dsn=4.2.0, status=deferred (host correio.domain.com.br[private/dovecot-lmtp] said: 451 4.2.0 <marcelo.roberto@domain.com.br> Message was expunged (received-date) (in reply to end of DATA command)) Jun 25 17:24:11 correio postfix/lmtp[26595]: 2AFA01BF7496F: to=<marcelo.roberto@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=918, delays=908/0/0/10, dsn=2.0.0, status=sent (250 2.0.0 <marcelo.roberto@domain.com.br> hWS4JMorq1OIdAAAHvf8vg Saved)
and in fact the first recipient was checking its email during the
LMTP delivery. This is the actual order of the logs:
Jun 25 17:09:47 correio dovecot: pop3-login: Login: user=<fabio.kado@domain.com.br>, method=PLAIN, rip=10.253.x.x, lip=10.252.x.x, mpid=21164 Jun 25 17:09:52 correio postfix/lmtp[6029]: 2AFA01BF7496F: to=<fabio.kado@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=59, delays=0.01/47/0/12, dsn=2.0.0, status=sent (250 2.0.0 <fabio.kado@domain.com.br> UbjnEywqq1MTPwAAHvf8vg Saved) Jun 25 17:09:52 correio dovecot: pop3(fabio.kado@domain.com.br), rip=10.253.x.x, lip=10.252.x.x: Disconnected: Logged out top=0/0, retr=1/49582, del=1/1, size=49550
and by the size= of the 'disconnected' pop3 log (size=49550) and
the size from qmgr (size=49366), user really seems to have checked (and deleted after that) exactly the message right BEFORE the 2nd recipient delivery was tried by LMTP
Case #2
1st and 2nd recipient delivery OK, failed on 3rd recipient, just delivered successfully on 2nd try
Jun 25 10:00:31 correio postfix/qmgr[804]: EFEB31A4BB227: from=<andrei.lunkes@domain.com.br>, size=26667, nrcpt=3 (queue active) Jun 25 10:00:47 correio postfix/lmtp[6549]: EFEB31A4BB227: to=<marcos.pereira@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=16, delays=0/0/0/16, dsn=2.0.0, status=sent (250 2.0.0 <marcos.pereira@domain.com.br> uTsNO+/HqlOHGQAAHvf8vg Saved) Jun 25 10:00:54 correio postfix/lmtp[6549]: EFEB31A4BB227: to=<mario.cabreira@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=22, delays=0/0/0/22, dsn=2.0.0, status=sent (250 2.0.0 <mario.cabreira@domain.com.br> uTsNO+/HqlOHGQAAHvf8vg Saved) Jun 25 10:00:55 correio postfix/lmtp[6549]: EFEB31A4BB227: to=<wellington.camargo@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=24, delays=0/0/0/24, dsn=4.2.0, status=deferred (host correio.domain.com.br[private/dovecot-lmtp] said: 451 4.2.0 <wellington.camargo@domain.com.br> Message was expunged (received-date) (in reply to end of DATA command)) Jun 25 10:03:57 correio postfix/lmtp[7312]: EFEB31A4BB227: to=<wellington.camargo@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=205, delays=202/1/0/2.2, dsn=2.0.0, status=sent (250 2.0.0 <wellington.camargo@domain.com.br> j1dRAU/IqlN6QAAAHvf8vg Saved)
1st recipient in fact checked email (and deleted the message) right after the delivery. Actual log sequence is:
Jun 25 10:00:40 correio dovecot: pop3-login: Login: user=<marcos.pereira@domain.com.br>, method=PLAIN, rip=10.253.22.56, lip=10.252.38.2, mpid=7233 Jun 25 10:00:47 correio postfix/lmtp[6549]: EFEB31A4BB227: to=<marcos.pereira@domain.com.br>, relay=correio.domain.com.br[private/dovecot-lmtp], delay=16, delays=0/0/0/16, dsn=2.0.0, status=sent (250 2.0.0 <marcos.pereira@domain.com.br> uTsNO+/HqlOHGQAAHvf8vg Saved) Jun 25 10:00:48 correio dovecot: pop3(marcos.pereira@domain.com.br), rip=10.253.22.56, lip=10.252.38.2: Disconnected: Logged out top=0/0, retr=1/26872, del=1/1, size=26849
2nd recipient did NOT checked email near the delivery
Jun 25 09:54:08 correio dovecot: pop3-login: Login: user=<mario.cabreira@domain.com.br>, method=PLAIN, rip=10.253.22.72, lip=10.252.38.2, mpid=29125 Jun 25 09:54:12 correio dovecot: pop3(mario.cabreira@domain.com.br), rip=10.253.22.72, lip=10.252.38.2: Disconnected: Logged out top=0/0, retr=11/1408320, del=11/11, size=1408058
Jun 25 10:06:49 correio dovecot: pop3-login: Login: user=<mario.cabreira@domain.com.br>, method=PLAIN, rip=10.253.22.72, lip=10.252.38.2, mpid=13391 Jun 25 10:06:50 correio dovecot: pop3(mario.cabreira@domain.com.br), rip=10.253.22.72, lip=10.252.38.2: Disconnected: Logged out top=0/0, retr=10/1923222, del=10/10, size=1923004
this is interesting ... logs seems to show clearly that right after
the message was delivered to the 1st recipient, it was checked and deleted. Despite of that, 2nd recipient got the message successfully delivered. But we have also a delay=22 ... maybe lmtp instance had already opened the message BEFORE it was erased by the 1st user ?
this delay is high, i know that and, usually, server do NOT suffer
from this delays on deliveries. Anyway, during this problem period, queues were pretty large and, thus, load was very high, which i think explains this unusual delays.
With these two cases, it really seems that this is somehow related to:
lmtp as delivery agent
multiple recipient messages
first (or previous users in fact) checked and deleted the message before all the recipients got the message delivery
About the two parameters, maildir_copy_with_hardlinks and pop3_fast_size_lookups, i'll try to change them one at a time and try to reproduce the problems. Fact is i only had the problem with both set to yes and problems were completly vanished when both were set to no.
--
Atenciosamente / Sincerily,
Leonardo Rodrigues
Solutti Tecnologia
http://www.solutti.com.br
Minha armadilha de SPAM, NÃO mandem email
gertrudes@solutti.com.br
My SPAMTRAP, do not email it