Unconfigurable 500 ms debounce on IMAP IDLE notifications
Hello all,
I have been investigating the reason for high end-to-end email delivery delay (>1 second to send a message to echo bot and receive a reply back) on a Postfix+Dovecot email server setup used to run the tests for Delta Chat core [1]. It appeared that most of the time was spent between the submission SMTP service accepting the mail for delivery and receiving an IDLE wakeup on the other side.
Looking into the Dovecot source code, I found that all notifications from the storage to callbacks set up by IDLE are delayed by a debounce constant NOTIFY_DELAY_MSECS [2]. It appears to have been added in 2009 [3], but even before that Dovecot was only delivering notifications when a second-resolution timer is changed by at least 1, so IDLE notifications probably were delayed by half a second on average even before that.
Commit message says "If the notification is done immediately, IDLE may not notice the change because it's not finished yet." This sounds like a workaround for some internal Dovecot bug.
I have nevertheless tried to compile Dovecot with this constant reduced from 500 ms to 1 ms [4] and it resulted in exceptionally fast delivery without any problems caused to the Delta Chat core online test suite which is using this Postfix+Dovecot server accounts. We are running all the tests against this server and it appears to work fine. Round trip time for sending a message to an echo bot and receiving a reply back is reduced from 1.1 s to 0.2 s, which is easily visible in a chat app [5].
I wonder if this debounce delay was added as a workaround for an internal Dovecot bug or a client bug and is not necessary anymore. It does not seem to be useful for well-behaving clients because if necessary the client may debounce notifications on its side and postpone interrupting IDLE with "DONE".
I would actually like to disable this delay completely on our server setup [6], as for chat it is common to receive multiple messages in a short period of time, e.g. when sending a message to a group and receiving multiple read receipts from currently active users.
[1] https://github.com/deltachat/deltachat-core-rust/ [2] https://github.com/dovecot/core/blob/93a53a9d590f0220de28600f36a969cc38c3148... [3] https://github.com/dovecot/core/commit/56fb5d09955b6097f77b341717fd9b70e9f13... [4] https://github.com/deltachat/chatmail/issues/72#issuecomment-1806645153 [5] https://delta.chat/assets/blog/nine-echo.mp4 [6] https://github.com/deltachat/chatmail
Best regards, Alex
On 14. Dec 2023, at 16.21, Alex <alex@merlinux.eu> wrote:
Hello all,
I have been investigating the reason for high end-to-end email delivery delay (>1 second to send a message to echo bot and receive a reply back) on a Postfix+Dovecot email server setup used to run the tests for Delta Chat core [1]. It appeared that most of the time was spent between the submission SMTP service accepting the mail for delivery and receiving an IDLE wakeup on the other side.
Looking into the Dovecot source code, I found that all notifications from the storage to callbacks set up by IDLE are delayed by a debounce constant NOTIFY_DELAY_MSECS [2]. It appears to have been added in 2009 [3], but even before that Dovecot was only delivering notifications when a second-resolution timer is changed by at least 1, so IDLE notifications probably were delayed by half a second on average even before that.
Commit message says "If the notification is done immediately, IDLE may not notice the change because it's not finished yet." This sounds like a workaround for some internal Dovecot bug.
...
I wonder if this debounce delay was added as a workaround for an internal Dovecot bug or a client bug and is not necessary anymore. It does not seem to be useful for well-behaving clients because if necessary the client may debounce notifications on its side and postpone interrupting IDLE with "DONE".
I can't really think of any situation where this would be necessary anymore. Not sure why it was necessary back in 2009 - maybe Maildir didn't have quite as many locks back then. Anyway now even if a change is partial at the time the mailbox-watch triggers, it runs full mailbox syncing which will lock the mailbox, so it waits for any previous changes to finish being written.
I would actually like to disable this delay completely on our server setup [6], as for chat it is common to receive multiple messages in a short period of time, e.g. when sending a message to a group and receiving multiple read receipts from currently active users.
We could of course make this a setting, but maybe it's not necessary. Would this behavior also work for you? :
- If the current IDLE hasn't notified anything yet, send the notification immediately without waiting
- Otherwise, send it max once per 0.5 seconds.
So if you're using the same connection to always break out of IDLE after a notification, there would be no dalays.
On Thu, Dec 14, 2023 at 07:17:16PM +0200, Timo Sirainen wrote:
I wonder if this debounce delay was added as a workaround for an internal Dovecot bug or a client bug and is not necessary anymore. It does not seem to be useful for well-behaving clients because if necessary the client may debounce notifications on its side and postpone interrupting IDLE with "DONE".
I can't really think of any situation where this would be necessary anymore. Not sure why it was necessary back in 2009 - maybe Maildir didn't have quite as many locks back then. Anyway now even if a change is partial at the time the mailbox-watch triggers, it runs full mailbox syncing which will lock the mailbox, so it waits for any previous changes to finish being written.
I have only tested with Maildir, maybe it was important for some other storage at the time it was introduced.
I would actually like to disable this delay completely on our server setup [6], as for chat it is common to receive multiple messages in a short period of time, e.g. when sending a message to a group and receiving multiple read receipts from currently active users.
We could of course make this a setting, but maybe it's not necessary. Would this behavior also work for you? :
- If the current IDLE hasn't notified anything yet, send the notification immediately without waiting
- Otherwise, send it max once per 0.5 seconds.
For IDLE this would work and be indistinguishable from no delay as Delta Chat core always interrupts IDLE with DONE as soon as it receives a response (e.g. EXISTS or RECENT, but not keepalive).
But as this delay code is a part of the storage (src/lib-storage/mailbox-watch.c) and knows nothing about IDLE, I am not sure how this could be implemented. There is also a plan to implement NOTIFY (RFC 5465) extension support in Delta Chat [1] where "NOTIFY SET" is executed only once and then there are incoming notifications for the rest of the IMAP session. NOTIFY uses the same callback and has the same 0.5 second delay as a result.
What could work is a change to callback immediately if there have been no callback call in the last 0.5 seconds and otherwise delay it. Then there will be no difference as long as two emails in a row do not come wihin half a second.
But I also wonder why not remove the delay completely then if there is no need for it? Debounce can always be implemented on the client side if necessary, if the client does not want to react to interruptions more frequently than once every half a second it can delay the reaction locally.
So if you're using the same connection to always break out of IDLE after a notification, there would be no dalays.
[1] https://github.com/deltachat/deltachat-core-rust/pull/4990
On 15. Dec 2023, at 1.59, Alex <alex@merlinux.eu> wrote:
I would actually like to disable this delay completely on our server setup [6], as for chat it is common to receive multiple messages in a short period of time, e.g. when sending a message to a group and receiving multiple read receipts from currently active users.
We could of course make this a setting, but maybe it's not necessary. Would this behavior also work for you? :
- If the current IDLE hasn't notified anything yet, send the notification immediately without waiting
- Otherwise, send it max once per 0.5 seconds.
For IDLE this would work and be indistinguishable from no delay as Delta Chat core always interrupts IDLE with DONE as soon as it receives a response (e.g. EXISTS or RECENT, but not keepalive).
But as this delay code is a part of the storage (src/lib-storage/mailbox-watch.c) and knows nothing about IDLE, I am not sure how this could be implemented. There is also a plan to implement NOTIFY (RFC 5465) extension support in Delta Chat [1] where "NOTIFY SET" is executed only once and then there are incoming notifications for the rest of the IMAP session. NOTIFY uses the same callback and has the same 0.5 second delay as a result.
Yeah, it would need some additional API changes to make it work that way.
What could work is a change to callback immediately if there have been no callback call in the last 0.5 seconds and otherwise delay it. Then there will be no difference as long as two emails in a row do not come wihin half a second.
I thought it would be important in those cases also? But if not, that's the simplest fix, yes. And it could be shrunk even further from 0.5 seconds to maybe 0.1 seconds.
But I also wonder why not remove the delay completely then if there is no need for it? Debounce can always be implemented on the client side if necessary, if the client does not want to react to interruptions more frequently than once every half a second it can delay the reaction locally.
There could be some performance loss if there is never any delay. For example some offline client could connect and start issuing a bunch of flag updates, while another IDLEing connection keeps trying to rapidly send notifications about each of those changes separately instead of bundling them all into a single mailbox sync.
On Fri, Dec 15, 2023 at 11:37:10AM +0200, Timo Sirainen wrote:
On 15. Dec 2023, at 1.59, Alex <alex@merlinux.eu> wrote:
I would actually like to disable this delay completely on our server setup [6], as for chat it is common to receive multiple messages in a short period of time, e.g. when sending a message to a group and receiving multiple read receipts from currently active users.
We could of course make this a setting, but maybe it's not necessary. Would this behavior also work for you? :
- If the current IDLE hasn't notified anything yet, send the notification immediately without waiting
- Otherwise, send it max once per 0.5 seconds.
For IDLE this would work and be indistinguishable from no delay as Delta Chat core always interrupts IDLE with DONE as soon as it receives a response (e.g. EXISTS or RECENT, but not keepalive).
But as this delay code is a part of the storage (src/lib-storage/mailbox-watch.c) and knows nothing about IDLE, I am not sure how this could be implemented. There is also a plan to implement NOTIFY (RFC 5465) extension support in Delta Chat [1] where "NOTIFY SET" is executed only once and then there are incoming notifications for the rest of the IMAP session. NOTIFY uses the same callback and has the same 0.5 second delay as a result.
Yeah, it would need some additional API changes to make it work that way.
Maybe move this debounce out of storage and into IDLE implementation (imap/cmd-idle.c) then? Then other callers of mailbox_notify_changes will have no delay. I only see other calls to mailbox_notify_changes() in imap/imap-notify.c and plugins/virtual/virtual-storage.c. Virtual storage does not need debounce then just as the normal storage.
Then if IDLE implementation manages this debounce itself and sends the first notification immediately after each IDLE resetup, that would be perfect without the need for a configuration option.
As for NOTIFY, same logic cannot be implemented, but hopefully NOTIFY users are advanced enough to debounce themselves? My plan for using NOTIFY is to subscribe directly to receive message body, so in my case every notification should be immediate.
What could work is a change to callback immediately if there have been no callback call in the last 0.5 seconds and otherwise delay it. Then there will be no difference as long as two emails in a row do not come wihin half a second.
I thought it would be important in those cases also? But if not, that's the simplest fix, yes. And it could be shrunk even further from 0.5 seconds to maybe 0.1 seconds.
Yes, ideally first notification after IDLE should always be immediate. If debounce logic is moved to IDLE implementation this looks simple too.
But I also wonder why not remove the delay completely then if there is no need for it? Debounce can always be implemented on the client side if necessary, if the client does not want to react to interruptions more frequently than once every half a second it can delay the reaction locally.
There could be some performance loss if there is never any delay. For example some offline client could connect and start issuing a bunch of flag updates, while another IDLEing connection keeps trying to rapidly send notifications about each of those changes separately instead of bundling them all into a single mailbox sync.
So this IDLE connection is some UI client that will be slowed down by trying to refresh the view on every change? Either the client tries to update the state and redraw the view on every incoming update or it updates the state immediately and debounces view redraws on its own. In the first case delaying IDLE/NOTIFY updates would not help as the number of redraws is not reduced. In the second case there is no need to debounce on the server side as state updates are going to be fast anyway.
This commit removes hardcoded 500 ms debounce from storage that delays all storage notification subscribers such as IDLE and NOTIFY commands. 500 ms debounce constant NOTIFY_DELAY_MSECS was added in 2009 [1]. Before that Dovecot was only delivering notifications when a second-resolution timer is changed by at least 1, so IDLE notifications were delayed by half a second on average. [1] https://github.com/dovecot/core/commit/56fb5d09955b6097f77b341717fd9b70e9f13... --- src/lib-storage/mail-storage-private.h | 2 +- src/lib-storage/mailbox-watch.c | 21 ++++----------------- 2 files changed, 5 insertions(+), 18 deletions(-) diff --git a/src/lib-storage/mail-storage-private.h b/src/lib-storage/mail-storage-private.h index 0cbdf4c48a..32b337d242 100644 --- a/src/lib-storage/mail-storage-private.h +++ b/src/lib-storage/mail-storage-private.h @@ -455,7 +455,7 @@ struct mailbox { /* Mailbox notification settings: */ mailbox_notify_callback_t *notify_callback; void *notify_context; - struct timeout *to_notify, *to_notify_delay; + struct timeout *to_notify; struct mailbox_notify_file *notify_files; /* Increased by one for each new struct mailbox. */ diff --git a/src/lib-storage/mailbox-watch.c b/src/lib-storage/mailbox-watch.c index 659cab3810..a56c5093fc 100644 --- a/src/lib-storage/mailbox-watch.c +++ b/src/lib-storage/mailbox-watch.c @@ -9,8 +9,6 @@ #include <fcntl.h> #include <sys/stat.h> -#define NOTIFY_DELAY_MSECS 500 - struct mailbox_notify_file { struct mailbox_notify_file *next; @@ -19,9 +17,10 @@ struct mailbox_notify_file { struct io *io_notify; }; -static void notify_delay_callback(struct mailbox *box) +static void notify_callback(struct mailbox *box) { - timeout_remove(&box->to_notify_delay); + timeout_reset(box->to_notify); + box->notify_callback(box, box->notify_context); } @@ -40,18 +39,7 @@ static void notify_timeout(struct mailbox *box) } if (notify) - notify_delay_callback(box); -} - -static void notify_callback(struct mailbox *box) -{ - timeout_reset(box->to_notify); - - if (box->to_notify_delay == NULL) { - box->to_notify_delay = - timeout_add_short(NOTIFY_DELAY_MSECS, - notify_delay_callback, box); - } + notify_callback(box); } void mailbox_watch_add(struct mailbox *box, const char *path) @@ -97,7 +85,6 @@ void mailbox_watch_remove_all(struct mailbox *box) i_free(file); } - timeout_remove(&box->to_notify_delay); timeout_remove(&box->to_notify); } -- 2.43.0
I have made a patch that actually deletes 500 ms delay from mailbox-watch.c rather than reducing it to 1 ms.
I then looked into adding debounce to src/imap/cmd-idle.c, but there I cannot be sure that mailbox pointer passed to idle_callback() will not be freed by the time debounce timer expires. If we want to have debounce in IDLE, it should be kept in mailbox-watch.c, but reset via special API from cmd-idle.c every time IDLE is started.
Then I thought about the issue again and am pretty sure debounce is not needed for IDLE. If an offline client gets online and synchronizes a lot of flags, it usually can do it in one command. If it does not, because of bad implementation or because flags are different for different messages, it will send multiple commands. In this case IDLE-ing client will receive multiple updates, but this is not bad for performance. If IDLE-ing client does not ignore these updates and actually processes them, worst case it will immediately get busy processing the first incoming update. But by the time it finishes processing the first update it will receive many updates and process them in batch, effectively debouncing on the client even if there is no special code doing this.
So unless there is an IMAP client that is known to benefits from debouncing, I suggest that no debouncing is added for IDLE.
Alex (1): storage: remove 500 ms debounce on IMAP IDLE notifications
src/lib-storage/mail-storage-private.h | 2 +- src/lib-storage/mailbox-watch.c | 21 ++++----------------- 2 files changed, 5 insertions(+), 18 deletions(-)
-- 2.43.0
participants (2)
-
Alex
-
Timo Sirainen