[Dovecot] v2.2.4 released
http://dovecot.org/releases/2.2/dovecot-2.2.4.tar.gz http://dovecot.org/releases/2.2/dovecot-2.2.4.tar.gz.sig
OK, this should be a pretty good and stable version.
+ doveadm: Added "flags" command to modify message flags.
+ doveadm: Added "deduplicate" command to expunge message duplicates.
+ dsync: Show the state in process title with verbose_proctitle=yes.
- imap/pop3 proxy: Master user logins were broken in v2.2.3
- sdbox/mdbox: A corrupted index header with wrong size was never automatically fixed in v2.2.3.
- mbox: Fixed assert-crashes related to locking.
On 2013-06-24 7:56 PM, Timo Sirainen tss@iki.fi wrote:
+ doveadm: Added "deduplicate" command to expunge message duplicates.
Hey Timo,
2 questions on this new 'deduplicate' capability of doveadm...
Obviously this could be scripted with a cron job, but I was wondering if it wouldn't make sense to do this automatically whenever messages are being moved around in the mailstore?
An interesting 'feature' of gmail is that if/when you are copying lots of messages from a non gmail account to a gmail account through IMAP, if the folder you are copying from contains duplicate messages, gmail will silently discard the duplicates after the first one is successfully copied up...
I discovered this a long time ago the first time I encountered an anomaly where I copied an entire folder, but the number of messages on the gmail account didn't match the number in the source folder. After comparing, I discovered that there were duplicates in the source folder, which accounted for the discrepancy.
Thanks,
--
Best regards,
Charles
On 25.6.2013, at 14.14, Charles Marcus CMarcus@Media-Brokers.com wrote:
Obviously this could be scripted with a cron job, but I was wondering if it wouldn't make sense to do this automatically whenever messages are being moved around in the mailstore?
There's currently no efficient way to do that automatically in Dovecot. Also, there are several potential problems. For example, if two messages have the same Message-ID: header but different bodies, should they count as duplicates? What if the body is the same but the headers differ, e.g. in the Subject line (maybe it's just the [Dovecot] prefix)? What if only the Received: headers are different? And so on.
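To make the questions above concrete, here is a minimal Python sketch of three possible definitions of "duplicate". It is not how Dovecot works; the function names and the sample messages are made up for illustration, and it assumes simple single-part text messages.

```python
import hashlib
from email import message_from_string
from email.message import Message


def key_message_id(msg: Message) -> str:
    """Duplicate means: same Message-ID header, whatever the body says."""
    return (msg.get("Message-ID") or "").strip()


def key_body(msg: Message) -> str:
    """Duplicate means: same body text, headers ignored entirely."""
    return hashlib.sha256(msg.get_payload().encode("utf-8")).hexdigest()


def key_headers_and_body(msg: Message, ignored=("received",)) -> str:
    """Duplicate means: same body and same headers, except the ignored ones."""
    digest = hashlib.sha256()
    for name, value in msg.items():
        if name.lower() in ignored:
            continue  # e.g. Received: differs per delivery path
        digest.update(f"{name.lower()}:{value.strip()}\n".encode("utf-8"))
    digest.update(msg.get_payload().encode("utf-8"))
    return digest.hexdigest()


# Hypothetical sample messages, one per case raised in the text.
a = message_from_string(
    "Message-ID: <1@example.org>\nSubject: hello\nReceived: from mx1\n\nbody text\n")
b = message_from_string(  # only the Received: header differs from a
    "Message-ID: <1@example.org>\nSubject: hello\nReceived: from mx2\n\nbody text\n")
c = message_from_string(  # same Message-ID as a, different body
    "Message-ID: <1@example.org>\nSubject: hello\n\nother text\n")
d = message_from_string(  # same body as a, different Message-ID and Subject prefix
    "Message-ID: <2@example.org>\nSubject: [Dovecot] hello\n\nbody text\n")

same_id_different_body = key_message_id(a) == key_message_id(c) and key_body(a) != key_body(c)
same_body_different_headers = key_body(a) == key_body(d) and key_message_id(a) != key_message_id(d)
only_received_differs = key_headers_and_body(a) == key_headers_and_body(b)
```

Each key gives a different answer for the same pair of messages, which is the reason there is no obviously right automatic policy.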
Anyway, copy&pasting what I just wrote to another reply about doveadm deduplicate:
The main idea behind it is to be able to revert some (more or less) accidental duplication of emails due to something that admin did, or possibly due to some bug in Dovecot (e.g. dsync). There are two modes of operation, both work only for duplicates within the same folder:
1. Deduplicate by message GUID. These duplicates could have only been caused by copying the mail (IMAP COPY, doveadm copy) or by "doveadm import" that imports old messages from e.g. a backup.
2. Deduplicate by Message-Id: header (-m parameter). I added this just because some people had asked for it previously. I'm not sure how/when it's actually useful.
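The two modes can be sketched in Python as the same loop with a different key. This is only an illustration, not Dovecot's code: the Mail record, its field names and the rule of keeping the lowest UID are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Mail:
    uid: int          # position within a single folder
    guid: str         # hypothetical stand-in for the mail store's message GUID
    message_id: str   # value of the Message-Id: header


def find_duplicates(folder: Iterable[Mail], key: Callable[[Mail], str]) -> List[int]:
    """Return the UIDs whose key was already seen earlier in the same folder.

    Assumption: the copy with the lowest UID is the one that is kept.
    """
    seen = set()
    duplicates = []
    for mail in sorted(folder, key=lambda m: m.uid):
        k = key(mail)
        if not k:
            continue  # mails without a key are never treated as duplicates
        if k in seen:
            duplicates.append(mail.uid)
        else:
            seen.add(k)
    return duplicates


inbox = [
    Mail(1, "g-aaa", "<a@example.org>"),
    Mail(2, "g-aaa", "<a@example.org>"),  # copy of UID 1: same GUID
    Mail(3, "g-bbb", "<a@example.org>"),  # separate delivery: new GUID, same Message-Id
    Mail(4, "g-ccc", "<c@example.org>"),
]

by_guid = find_duplicates(inbox, key=lambda m: m.guid)
by_message_id = find_duplicates(inbox, key=lambda m: m.message_id)
```

With this sample folder, the GUID key flags only UID 2, while the Message-Id key flags UIDs 2 and 3, so the second mode is the broader one.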
On 25.06.2013 at 15:28, Timo Sirainen wrote:
Also there are several potential problems.. Like if there are duplicate Message-ID: headers, but the body is different, should that be a duplicate?
the answer is simply *yes*: different messages must not share the same Message-ID, and the words "single unique message identifier" are pretty clear
RFC2822
Though optional, every message SHOULD have a "Message-ID:" field. Furthermore, reply messages SHOULD have "In-Reply-To:" and "References:" fields as appropriate, as described below.
The "Message-ID:" field contains a single unique message identifier. The "References:" and "In-Reply-To:" field each contain one or more unique message identifiers, optionally separated by CFWS.
these days "every message SHOULD have a Message-ID:" is outdated
we started many years ago to block *any* message missing the header, because every sane SMTP implementation adds it if the client left it out, so only broken implementations, which are mostly spammers, would be affected
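A check of that kind could look roughly like the Python sketch below. It only shows the test itself; the function name is made up, and how a given mail server hooks such a check into delivery is not covered here.

```python
from email import message_from_string


def has_message_id(raw: str) -> bool:
    """Return True if the raw message carries a non-empty Message-ID header."""
    msg = message_from_string(raw)
    value = msg.get("Message-ID")  # header lookup is case-insensitive
    return value is not None and value.strip() != ""


with_id = "Message-Id: <1@example.org>\nSubject: hi\n\nbody\n"
without_id = "Subject: hi\n\nbody\n"

accept_first = has_message_id(with_id)
accept_second = has_message_id(without_id)
```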
On 25.6.2013, at 16.52, Reindl Harald h.reindl@thelounge.net wrote:
the answer is simply *yes* because there must not be the same Message-ID's for different messages because the words "single unique message identifier" are pretty clear
I'm more concerned about intentional abuse. For example, if you're dropping duplicate messages by Message-ID, I could first send this reply to you privately, and then another message with the same Message-ID: but different content to the mailing list, and you'd never know it without looking into the archives on the web.
Also I wouldn't be surprised if there still were some crappy webforms that always sent the same Message-Id.
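The abuse scenario can be played through with a small Python sketch. The class below is a toy model of a receiver that drops by Message-ID at delivery time; it is not any real server's behaviour, and the names and sample messages are invented.

```python
from email import message_from_string


class DropByMessageId:
    """Toy mailbox that silently drops any message whose Message-ID was seen before."""

    def __init__(self) -> None:
        self.seen = set()
        self.delivered = []

    def deliver(self, raw: str) -> bool:
        msg = message_from_string(raw)
        message_id = (msg.get("Message-ID") or "").strip()
        if message_id and message_id in self.seen:
            return False  # dropped, the content is never compared
        if message_id:
            self.seen.add(message_id)
        self.delivered.append(msg.get_payload())
        return True


private_copy = "Message-ID: <x@example.org>\nSubject: Re: dedup\n\nprivate reply\n"
list_copy = ("Message-ID: <x@example.org>\nSubject: [Dovecot] Re: dedup\n\n"
             "different text for the list\n")

mailbox = DropByMessageId()
first_accepted = mailbox.deliver(private_copy)
second_accepted = mailbox.deliver(list_copy)
```

The second message is dropped even though its body differs, so the recipient only ever sees the private version.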
On 25.06.2013 at 16:02, Timo Sirainen wrote:
I'm more concerned about intentional abuse. For example if you're dropping duplicate messages by Message-ID, I could first send this reply to you privately, and then another message with same Message-ID: but different content to the mailing list, and you'd never know it without looking into the archives from web.
this is very much theory and unlikely, and only possible in this specific example where you send both messages
nobody is able to guess the Message-ID of a regular message and replace it this way, and even if he knows it he needs to be faster with his fake than the original message - very very unlikely
Also I wouldn't be surprised if there still were some crappy webforms that always sent the same Message-Id..
well, if we take care of such crap we can stop reading any RFC and would need to disable any spamfilters, which (especially score-based filters) rely on common standards
hence, these days on a Barracuda spam firewall you even get a FULL score point if you send an HTML message whose subject and HTML title differ
Reindl Harald h.reindl@thelounge.net wrote:
the answer is simply *yes* because there must not be the same Message-ID's for different messages because the words "single unique message identifier" are pretty clear
We had one funny occurrence of that particular corner-case:
- Somebody sent us an email
- the user's account autoreplied in the evening upon receipt (out of office). That autoreply was sent with a message-id A
- next morning, the user read the mail, and composed a personal reply
- that reply was discarded by the recipient's mailserver, since it had the same message-id A (dunno why that happened, but it did!) as the auto-reply the evening before.
That took me a while to discover.
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | http://www.charite.de
participants (4)
- Charles Marcus
- Ralf Hildebrandt
- Reindl Harald
- Timo Sirainen