Replicator Issues on Large Mail Boxes
OK, I am running:
freebsd-12.1 dovecot-2.3.18.current dovecot-2.3-pigeonhole-0.5.18.current
Simply put, replication works fine on smaller mailboxes without issues.
dsync also works when run manually. The longest sync takes 60 seconds or so when using dsync, so does the replicator timeout need to be bumped up?
I get that the file locking issue is relative to mailbox size, since a larger mailbox will take longer to replicate when an email comes in?
mail18 03-05 18:05:51 {dovecot} [15799] (872623573) doveadm(keith@elirpa.com)<32249><GnS3M53sI2L5fQAAz1jc/w>: Error: Couldn't lock /data/dovecot/users/elirpa.com/keith@elirpa.com//tmp/.dovecot-sync.lock: fcntl(/data/dovecot/users/elirpa.com/keith@elirpa.com//tmp/.dovecot-sync.lock, write-lock, F_SETLKW) locking failed: Timed out after 30 seconds (WRITE lock held by pid 31519)
So what I need to find is where fcntl is given the 30-second timeout for replication (I have adjusted all the others in the source code).
Simply put, the replicator fails, retries, and keeps failing, which is understandable, as it probably needs a little more time?
# sync.users
carol@scom.ca          none  00:02:51  08:16:34  -  y
nick@elirpa.com        low   02:45:17  09:28:32  -  y
keith@elirpa.com       none  02:30:13  09:28:32  -  y
paul@scom.ca           high  02:45:17  09:28:32  -  y
ed@scom.ca             none  02:34:34  09:28:32  -  y
ed.hanna@dssmgmt.com   high  02:45:17  09:28:32  -  y
I found the following under /programs/src/mail/dovecot-2.3.18.current/src/lib, in file-lock.c:
	struct flock fl;

	fl.l_type = lock_type;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;

	ret = fcntl(fd, timeout_secs != 0 ? F_SETLKW : F_SETLK, &fl);
	if (timeout_secs != 0) {
		alarm(0);
		file_lock_wait_end(path);
	}
	if (ret == 0)
		break;

	if (timeout_secs == 0 && (errno == EACCES || errno == EAGAIN)) {
		/* locked by another process */
		*error_r = t_strdup_printf(
			"fcntl(%s, %s, F_SETLK) locking failed: %m "
			"(File is already locked)", path, lock_type_str);
		return 0;
	}
	if (err_is_lock_timeout(started, timeout_secs)) {
		errno = EAGAIN;
		*error_r = t_strdup_printf(
			"fcntl(%s, %s, F_SETLKW) locking failed: "
			"Timed out after %u seconds%s",
			path, lock_type_str, timeout_secs,
			file_lock_find(fd, set->lock_method, lock_type));
		return 0;
	}
	*error_r = t_strdup_printf("fcntl(%s, %s, %s) locking failed: %m",
		path, lock_type_str,
		timeout_secs == 0 ? "F_SETLK" : "F_SETLKW");
	if (errno == EDEADLK && !set->allow_deadlock) {
		i_panic("%s%s", *error_r,
			file_lock_find(fd, set->lock_method, lock_type));
	}
	return -1;
#endif
--
Happy Saturday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services <http://www.scom.ca> 004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266
Hi,
> simply put replication works fine on smaller email boxes without issues.
> dsync also works when run manually, longest sync is 60 seconds or so when using dsync so need replicator bumped up ?
You can configure dsync parameters for the replicator to adjust the time limit, e.g.:
replication_dsync_parameters = -d -N -l 120 -U
best regards, Carsten
Participants (2): Carsten Brandt, Paul Kudla (Scom.ca Internet Services Inc.)