dovecot sieve duplicates detection
Hello,
I have tested the sieve duplicate script with success so far, but I have a question.
I would like to know if the "duplicate" sieve flag in Dovecot is global to all folders, or specific to one folder only.
For instance, if I copy an email from one folder to another, and I have a discard action on duplicate emails, will this action (in this case, discard) be applied or not?
If duplicate detection is global to all folders, is there a way to restrict the search to one folder only?
Thanks for your help. André
On 11-4-2018 at 23:58, André Rodier wrote:
> Hello,
> I have tested the sieve duplicate script with success so far, but I have a question.
Sieve duplicate script? You mean the Sieve duplicate extension (RFC 7352)?
> I would like to know if the "duplicate" sieve flag in Dovecot is global to all folders, or specific to one folder only.
It uses the lda-dupes file in the user's home directory. So, it is not normally related to folders, although the identifier used for duplicate matching could be composed of the mailbox name if you want.
> For instance, if I copy an email from one folder to another, and I have a discard action on duplicate emails, will this action (in this case, discard) be applied or not?
Are you talking about IMAPSieve now? I am not sure "duplicate" is currently even allowed in that context.
> If duplicate detection is global to all folders, is there a way to restrict the search to one folder only?
You can set the :uniqueid parameter accordingly.
Regards,
Stephan.
On 23/04/18 14:18, Stephan Bosch wrote:
> You can set the :uniqueid parameter accordingly.
Thank you, Stephan.
Yes, I meant the Sieve duplicate extension.
I am using a program to import email (mbsync), which uses the IMAP APPEND command. Sometimes the import fails and I have to restart the program. Unfortunately, the same emails are then imported again.
I found a fix by using a Dovecot IMAPSieve script executed on the APPEND action (https://wiki.dovecot.org/Pigeonhole/Sieve/Plugins/IMAPSieve). I wrote a custom sieve script that discards the messages detected as duplicates. It worked very well and the emails were no longer imported twice.
However, there was a huge side effect: archiving an email with Thunderbird no longer works, and the email is even lost! I understand the failure as follows:
- When archiving an email with Thunderbird, it is first copied (APPEND) into the archive folder, but the original folder is not yet expunged.
- The sieve script detects the copy as a duplicate and discards it.
- When the original folder is expunged, the source email is lost...
My conclusion was that duplicate detection is global to all folders.
If I could restrict the detection of duplicates to the current folder only, this would let me run the import program again without error.
Kind regards, André.
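The script itself is not shown in the thread. A minimal sketch of the kind of script described, assuming the default duplicate key (the Message-ID header) and nothing else, might look like this:

```sieve
require ["duplicate"];

# With no :uniqueid or :header argument, the test keys on the Message-ID
# header alone (RFC 7352), so the mailbox plays no part in the match.
if duplicate {
    discard;
}
```

Because such a key carries no folder information, a copy of a message into another folder would be matched as a duplicate of the original, which is consistent with the Thunderbird archiving problem described above.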
On 23/04/2018 at 22:03, André Rodier wrote:
> If I could restrict the detection of duplicates to the current folder only, this would let me run the import program again without error.
Specify the ID used for duplicate checking explicitly using the :uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1). Using the variables extension, compose the uniqueid from the message-id and the mailbox name.
Regards,
Stephan.
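A minimal, untested sketch of this suggestion, assuming the script runs under IMAPSieve and that the target mailbox is available through the "imap.mailbox" environment item defined in RFC 6785. The variable names "mailbox" and "msgid" are illustrative and do not come from the thread:

```sieve
require ["duplicate", "variables", "environment", "imapsieve"];

# Capture the target mailbox name; "imap.mailbox" is the environment item
# defined for IMAPSieve (RFC 6785). Assumed to be set in this context.
if environment :matches "imap.mailbox" "*" {
    set "mailbox" "${1}";
}

# Capture the Message-ID, then key the duplicate test on mailbox + Message-ID.
# Messages without a Message-ID header are never tested, so never discarded.
if header :matches "Message-ID" "*" {
    set "msgid" "${1}";

    if duplicate :uniqueid "${mailbox}:${msgid}" {
        discard;
    }
}
```

With a key of this shape, the same message appended twice to one folder should match, while a copy into a different folder should produce a different key and be kept.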
On 25/04/18 20:20, Stephan Bosch wrote:
> Specify the ID used for duplicate checking explicitly using the :uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1). Using the variables extension, compose the uniqueid from the message-id and the mailbox name.
Thank you, I will try this.
André
On Wed, Apr 25, 2018, at 3:20 PM, Stephan Bosch wrote:
> Specify the ID used for duplicate checking explicitly using the :uniqueid argument (https://tools.ietf.org/html/rfc7352#section-3.1). Using the variables extension, compose the uniqueid from the message-id and the mailbox name.
In my experience with dovecot's implementation, you can set the ID only once in a script. If you try to filter duplicates based on multiple IDs, only the first (or last, I don't remember) takes effect.
V/r, James Cassell
On 25/04/2018 at 22:49, James Cassell wrote:
> In my experience with dovecot's implementation, you can set the ID only once in a script. If you try to filter duplicates based on multiple IDs, only the first (or last, I don't remember) takes effect.
Do you have a detailed example of the supposed wrong behavior?
Regards,
Stephan.
On Mon, May 14, 2018, at 4:52 PM, Stephan Bosch wrote:
> Do you have a detailed example of the supposed wrong behavior?
I don't have one readily available. Basically, the result of the first duplicate test in a script is taken as the result of every later duplicate test, even if the parameters of a later test in the same script are different and would otherwise produce a different result. The duplicate test is only evaluated once and its result is substituted everywhere.
For example, I might want to flag a message as a new conversation if I have not seen another message with the same subject. In the same script, I might want to discard messages that are exactly identical, including the message ID among other things. The dovecot behavior would be to discard all messages that match the subject of a previously received message.
V/r, James Cassell
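The scenario described above can be sketched as a script with two duplicate tests. This is an illustrative reconstruction, not a script from the thread: the flag name "NewConversation" and the handle names are hypothetical, and the :handle and :header arguments are taken from RFC 7352.

```sieve
require ["duplicate", "imap4flags"];

# First test: keyed on the Subject header, tracked under its own handle.
if not duplicate :handle "subject" :header "Subject" {
    addflag "NewConversation";
}

# Second test: default key (Message-ID), tracked under a separate handle.
if duplicate :handle "msgid" {
    discard;
}
```

Under RFC 7352 the two tests should be evaluated independently. According to the report above, affected versions reused the result of the first test for the second one, so a message with a previously seen subject would be discarded even if its Message-ID was new.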
On 14/05/2018 at 23:03, James Cassell wrote:
> The duplicate test is only evaluated once and its results are substituted everywhere.
I finally managed to review this issue and I can confirm that this is a bug.
Regards,
Stephan.
On 17/08/2018 09:14, Stephan Bosch wrote:
> I finally managed to review this issue and I can confirm that this is a bug.
Fix released in 2.3.9.
Regards,
Stephan.
On Wed, Dec 4, 2019, at 1:14 PM, Stephan Bosch via dovecot wrote:
> Fix released in 2.3.9.
Awesome! Thanks for the followup!
V/r, James Cassell
participants (3)
- André Rodier
- James Cassell
- Stephan Bosch