Fwd: Migrating, syncing, maybe load-balancing/failover two dovecot servers?

Aki Tuomi aki.tuomi at open-xchange.com
Thu Jan 5 16:07:16 UTC 2023


Can you verify that the stalling occurs with 2.3.20? We have a potential fix for a TLS stall that also applies to 2.3.20, so if someone is willing to try it and let us know, see:

https://dovecot.org/list/dovecot/2023-January/125929.html

Aki

> On 05/01/2023 17:55 EET Joachim Lindenberg <dovecot at lindenberg.one> wrote:
> 
>  
> In my experiments I also experienced replication being stalled when running with ssl. Is this being looked into?
> Thanks,
> Joachim
> 
> -----Original Message-----
> From: dovecot <dovecot-bounces at dovecot.org> On Behalf Of Paul Kudla
> Sent: Thursday, 5 January 2023 02:46
> To: dovecot at dovecot.org
> Subject: Re: Migrating, syncing, maybe load-balancing/failover two dovecot servers?
> 
> 
> ok, just a few quick things about replication
> 
> 1. you should upgrade both servers to at least dovecot-2.3.19.1.tar.gz
> (2.3.18 had issues with large folder counts - you will probably run into this on smaller servers but just sharing the experience)
> 
> 2. i found replication worked better without using ssl
> 
> 3. i went through the sync failures etc as well and found that NOT using NFS etc is the way to go
> 
> 4. I can provide my config for SCOM (or you can look it up on the mailing list) - it took a month of tweaking but i finally got a good config that worked.
> 
> 5. One thing i just remembered: you really should run a pgsql database for user auth, this way the two systems will stay up to date automatically every time a mailbox is modified. The replicator service selects users from a database to keep the mailboxes in sync automatically
> 
> the above are the basics but i find dovecot runs extremely well vs cyrus, which i was running previously
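
Both the minimum version in point 1 and the request to retest on 2.3.20 depend on knowing which version each server actually runs. The following is a minimal Python sketch of such a check, not an official tool. It assumes that `dovecot --version` prints the version first, followed by a build hash (for example `2.3.19.1 (9b53102964)`); the helper names `parse_version`, `meets_minimum` and `local_dovecot_version` are made up for illustration.

```python
import re
import subprocess

# Minimum version suggested in point 1 of the message above.
MIN_VERSION = (2, 3, 19, 1)


def parse_version(text):
    """Turn a string such as '2.3.19.1 (9b53102964)' into a tuple of ints."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def _pad(version, width):
    """Right-pad a version tuple with zeros so tuples of unequal length compare sanely."""
    return version + (0,) * (width - len(version))


def meets_minimum(version, minimum=MIN_VERSION):
    """Return True if version is at least minimum."""
    width = max(len(version), len(minimum))
    return _pad(version, width) >= _pad(minimum, width)


def local_dovecot_version():
    """Ask the locally installed dovecot binary for its version (assumed output format)."""
    result = subprocess.run(
        ["dovecot", "--version"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout)
```

Calling `meets_minimum(local_dovecot_version())` on each host would show whether that host is at or above 2.3.19.1; passing `minimum=(2, 3, 20)` checks for the version Aki asks about.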
> 
> Good job to the designers !
> 
> 
> 
> Happy Wednesday !!!
> Thanks - paul
> 
> Paul Kudla
> 
> 
> Scom.ca Internet Services <http://www.scom.ca>
> 004-1009 Byron Street South
> Whitby, Ontario - Canada
> L1N 4S3
> 
> Toronto 416.642.7266
> Main 1.866.411.7266
> Fax 1.888.892.7266
> Email paul at scom.ca
> 
> On 1/4/2023 4:24 PM, Gerben Wierda wrote:
> > So, I did set it up.
> > 
> > As I am not using real users (but a cram md5 passwd db file with every 
> > user uid=dovecot, gid=mail) and my dovecots own everything in the 
> > mail store, I had to synchronise the uid/gid of the dovecots on both 
> > ends.
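
The uid/gid alignment described here can be checked before the first sync. Below is a small Python sketch under stated assumptions: the account names `dovecot` and `mail` are taken from the message, while `account_ids` and `id_mismatches` are hypothetical helper names and not part of Dovecot.

```python
import grp
import pwd


def account_ids(user, group):
    """Look up the numeric uid of `user` and gid of `group` on this host."""
    return {
        "uid": pwd.getpwnam(user).pw_uid,
        "gid": grp.getgrnam(group).gr_gid,
    }


def id_mismatches(local, remote):
    """Return {key: (local_value, remote_value)} for every id that differs."""
    keys = sorted(set(local) | set(remote))
    return {
        key: (local.get(key), remote.get(key))
        for key in keys
        if local.get(key) != remote.get(key)
    }
```

One way to use it: run `account_ids("dovecot", "mail")` on each server, copy one result to the other host, and pass both dictionaries to `id_mismatches`. An empty result means the numeric ids already agree.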
> > 
> > After I did that, I tested the sync. And while it has worked (I now 
> > have an equal sized store at both ends), one side (running 2.3.17, the 
> > sending 'old server') was throwing up quite a bit of this:
> > 
> > Jan 04 20:13:15 doveadm(74435): Error: write(<local>) failed: Timed 
> > out after 60 seconds
> > Jan 04 20:13:15 doveadm(74435): Panic: file ioloop.c: line 865
> > (io_loop_destroy): assertion failed: (ioloop == current_ioloop)
> > Jan 04 20:13:15 doveadm(74435): Error: Raw backtrace:
> >  0  libdovecot.0.dylib  0x000000010db6d157 backtrace_append + 58
> >  1  libdovecot.0.dylib  0x000000010db6d255 backtrace_get + 31
> >  2  libdovecot.0.dylib  0x000000010db79ff3 default_fatal_finish + 60
> >  3  libdovecot.0.dylib  0x000000010db78afa default_error_handler + 0
> >  4  libdovecot.0.dylib  0x000000010db7973b i_internal_error_handler + 0
> >  5  libdovecot.0.dylib  0x000000010db78cb8 i_fatal + 0
> >  6  libdovecot.0.dylib  0x000000010db8fa1f io_loop_destroy + 826
> >  7  doveadm-server      0x000000010d3445fc doveadm_print_server_flush + 254
> >  8  doveadm-server      0x000000010d33df1e doveadm_print + 44
> >  9  doveadm-server      0x000000010d32bd5b cmd_dsync_run + 1618
> > 10  doveadm-server      0x000000010d32db67 doveadm_mail_next_user + 479
> > 11  doveadm-server      0x000000010d32e8bb doveadm_cmd_ver2_to_mail_cmd_wrapper + 2439
> > 12  doveadm-server      0x000000010d33dc0c doveadm_cmd_run_ver2 + 1083
> > 13  doveadm-server      0x000000010d34224a client_connection_tcp_input + 1579
> > 14  libdovecot.0.dylib  0x000000010db8efe1 io_loop_call_io + 114
> > 15  libdovecot.0.dylib  0x000000010db910cf io_loop_handler_run_internal + 314
> > 16  libdovecot.0.dylib  0x000000010db8f3fb io_loop_handler_run + 212
> > 17  libdovecot.0.dylib  0x000000010db8f2e6 io_loop_run + 81
> > 18  libdovecot.0.dylib  0x000000010db075e0 master_service_run + 24
> > 19  doveadm-server      0x000000010d344c3f main + 292
> > 20  dyld                0x000000011c73952e start + 462
> > Jan 04 20:13:15 doveadm(74435): Fatal: master: service(doveadm): child
> > 74435 killed with signal 6 (core dumps disabled - 
> > https://dovecot.org/bugreport.html#coredumps)
> > Jan 04 20:16:05 lmtp(pid 74518 user gerben): Warning: 
> > replication(gerben): Sync failure: Timeout in 2 secs
> > Jan 04 20:17:05 doveadm(74522): Error: write(<local>) failed: Timed 
> > out after 60 seconds
> > Jan 04 20:17:05 doveadm(74522): Panic: file ioloop.c: line 865
> > (io_loop_destroy): assertion failed: (ioloop == current_ioloop)
> > Jan 04 20:17:05 doveadm(74522): Error: Raw backtrace:
> >  0  libdovecot.0.dylib  0x00000001050d3157 backtrace_append + 58
> >  1  libdovecot.0.dylib  0x00000001050d3255 backtrace_get + 31
> >  2  libdovecot.0.dylib  0x00000001050dfff3 default_fatal_finish + 60
> >  3  libdovecot.0.dylib  0x00000001050deafa default_error_handler + 0
> >  4  libdovecot.0.dylib  0x00000001050df73b i_internal_error_handler + 0
> >  5  libdovecot.0.dylib  0x00000001050decb8 i_fatal + 0
> >  6  libdovecot.0.dylib  0x00000001050f5a1f io_loop_destroy + 826
> >  7  doveadm-server      0x00000001048aa5fc doveadm_print_server_flush + 254
> >  8  doveadm-server      0x00000001048a3f1e doveadm_print + 44
> >  9  doveadm-server      0x0000000104891d5b cmd_dsync_run + 1618
> > 10  doveadm-server      0x0000000104893b67 doveadm_mail_next_user + 479
> > 11  doveadm-server      0x00000001048948bb doveadm_cmd_ver2_to_mail_cmd_wrapper + 2439
> > 12  doveadm-server      0x00000001048a3c0c doveadm_cmd_run_ver2 + 1083
> > 13  doveadm-server      0x00000001048a824a client_connection_tcp_input + 1579
> > 14  libdovecot.0.dylib  0x00000001050f4fe1 io_loop_call_io + 114
> > 15  libdovecot.0.dylib  0x00000001050f70cf io_loop_handler_run_internal + 314
> > 16  libdovecot.0.dylib  0x00000001050f53fb io_loop_handler_run + 212
> > 17  libdovecot.0.dylib  0x00000001050f52e6 io_loop_run + 81
> > 18  libdovecot.0.dylib  0x000000010506d5e0 master_service_run + 24
> > 19  doveadm-server      0x00000001048aac3f main + 292
> > 20  dyld                0x000000011487652e start + 462
> > Jan 04 20:17:05 doveadm(74522): Fatal: master: service(doveadm): child
> > 74522 killed with signal 6 (core dumps disabled - 
> > https://dovecot.org/bugreport.html#coredumps)
> > 
> > Turns out, this is a known (and pretty old) problem 
> > (https://www.mail-archive.com/dovecot%40dovecot.org/msg85388.html) 
> > and my dovecot on the old server (macOS + MacPorts) is newer than the 
> > dovecot on the new one. I should go back to a 2.3.16 on the old server.
> > 
> > It seems the syncing works (or has worked) nonetheless, but it doesn't 
> > feel good.
> > 
> > Gerben Wierda (LinkedIn <https://www.linkedin.com/in/gerbenwierda>)
> > R&A IT Strategy <https://ea.rna.nl/> (main site)
> > Book: Chess and the Art of Enterprise Architecture 
> > <https://ea.rna.nl/the-book/>
> > Book: Mastering ArchiMate <https://ea.rna.nl/the-book-edition-iii/>
> > 
> >> On 4 Jan 2023, at 13:54, Paul Kudla <paul at scom.ca> wrote:
> >>
> >>
> >> maybe look at replicator / replication
> >>
> >> it's designed to do exactly that
> >>
> >>
> >>
> >>
> >> Happy Wednesday !!!
> >> Thanks - paul
> >>
> >> Paul Kudla
> >>
> >>
> >> Scom.ca Internet Services <http://www.scom.ca>
> >> 004-1009 Byron Street South
> >> Whitby, Ontario - Canada
> >> L1N 4S3
> >>
> >> Toronto 416.642.7266
> >> Main 1.866.411.7266
> >> Fax 1.888.892.7266
> >> Email paul at scom.ca
> >>
> >> On 1/4/2023 7:46 AM, Gerben Wierda wrote:
> >>> I am in the process of migrating from dovecot on one OS
> >>> (macOS/darwin) to a new server running dovecot on another OS 
> >>> (Ubuntu Linux 22.04).
> >>> I have mostly copied/adapted the setup of the old server to the new. 
> >>> I am in the process of finishing that and adding some stuff that 
> >>> still needs to be added/migrated, like rspamd. And the data of 
> >>> course, before the new one takes over from the old.
> >>> I have done a migration before (MacOS X Server dovecot to MacPorts 
> >>> dovecot on macOS), many years ago. I recall that I used dovecot 
> >>> syncing but also rsync, but I don't really recall the details (and 
> >>> anyway, the software has changed since).
> >>> I have been thinking about keeping them both alive, with one as a 
> >>> failover for the other. They will not share their storage (e.g. 
> >>> via NFS). So, I was wondering if I can do something with syncing 
> >>> between instances and dovecot director. I have been looking at the 
> >>> documentation, but a quick scan did not turn up any sort of 
> >>> tutorial and I am uncertain what will work and what not.
> >>> If keeping both alive in parallel is too problematic, it is OK to 
> >>> have regular syncing in one direction (old to new) at first and then 
> >>> switch over and have syncing in the other direction (new to old).
> >>> Can someone enlighten me?
> >>> Gerben Wierda (LinkedIn <https://www.linkedin.com/in/gerbenwierda>)
> >>> R&A IT Strategy <https://ea.rna.nl/> (main site)
> >>> Book: Chess and the Art of Enterprise Architecture 
> >>> <https://ea.rna.nl/the-book/>
> >>> Book: Mastering ArchiMate <https://ea.rna.nl/the-book-edition-iii/>
> >>> --
> >>> This message has been scanned for viruses and dangerous content by 
> >>> *MailScanner* <http://www.mailscanner.info/>, and is believed to be 
> >>> clean.
> > 
> > 

