Redundant Database, Pgsql ?
Brendan Kearney
bpk678 at gmail.com
Wed Feb 22 12:45:43 UTC 2023
Think about this: you have connect= host=host1 host=host2 ...
When host1 fails, you must kick that host while it is down, each and
every time you want a new connection to the database. There is no
record saying that box is down, and no logic in the app saying "I've
already tried that box, so avoid it for some period of time." Because
it is listed first in your connection parameters, you have to attempt
a connection to it, get an error back or wait for a timeout, and only
then move on to the next box.
Have you ever faced a DNS resolution issue, where DNS was not working
and everything slowed to a crawl because of it? You get the same kind
of delay here: an error or timeout has to occur before you move on to
the next configured box. When you are dealing with high-volume
production environments, that does not scale.
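For concreteness, here is roughly what such a string looks like with the
two knobs that bound the stall. This is a libpq-style sketch (PostgreSQL
10 or newer); hostnames and credentials are placeholders, the comma
syntax is libpq's (dovecot's pgsql driver takes repeated host= entries
instead), and you'd need to check whether your client passes these
parameters through:

    # test the multi-host string directly with psql
    psql "host=host1,host2 dbname=mails connect_timeout=3 target_session_attrs=read-write"
    # connect_timeout applies per host, so a dead host1 still costs up
    # to 3 seconds on every new connection attempt;
    # target_session_attrs=read-write skips hosts that are not writable

Even with the timeout bounded, every fresh connection pays that price
until the dead host comes back or is removed from the string by hand.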
With load balancing, you have an active connection from your app to
the database VIP, and a secondary connection from the load balancer to
a database host in the pool. When that connection dies and goes away,
the app retries the connection to the VIP and is automatically handed
to a different, alive and available host in the pool. There is no
waiting for an error or timeout and then trying the next host in the
config.
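As a rough sketch of that layer, assuming HAProxy in front of the pool
and Patroni's REST API answering health checks on its default port
8008 (addresses and names below are placeholders, not a drop-in
config):

    listen postgres_rw
        bind 10.0.0.100:5432          # the VIP the app connects to
        option httpchk                # ask Patroni which node is primary
        http-check expect status 200  # only the current leader answers 200
        default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
        server pg1 10.0.0.11:5432 check port 8008
        server pg2 10.0.0.12:5432 check port 8008
        server pg3 10.0.0.13:5432 check port 8008

On failover the old primary stops answering 200, its in-flight
sessions are shut down, and new connections land on whichever node the
health check now reports as the leader.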
When you put the infrastructure together properly, no human
intervention is required to mark a box down and stop sending traffic
to it. As I said: it requires a bit of supporting infrastructure to
get it all working, but you wind up with a highly available, fault
tolerant PostgreSQL footprint with automatic failover.
By having the load balancer keep track of which hosts are alive and
available, there is no guesswork about which host to connect to; the
app just has to retry the connection. Once a host is marked down by
the load balancer, no traffic is sent to it until it passes a health
check. The health checking with PostgreSQL is the etcd and Patroni
pieces; those two processes are critical in the chain. With the
health of the boxes and processes handled, knowing which PG host is
the active R/W one is simple. When a failure occurs, there is an
election to promote one of the standby hosts to the active R/W node,
and that is reported up to the load balancer through the
Patroni -> etcd chain. All new connections then go to the newly
elected active R/W node in the pool.
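The Patroni piece is just a small YAML file per node pointing at the
etcd cluster. A minimal sketch, with cluster name, addresses and paths
as placeholders rather than a complete config:

    scope: pgcluster               # cluster name, shared by every node
    name: pg1                      # unique per node
    restapi:
      listen: 0.0.0.0:8008         # the endpoint the load balancer checks
      connect_address: 10.0.0.11:8008
    etcd3:
      hosts: 10.0.0.21:2379,10.0.0.22:2379,10.0.0.23:2379
    postgresql:
      listen: 0.0.0.0:5432
      connect_address: 10.0.0.11:5432
      data_dir: /var/lib/postgresql/15/main

Patroni keeps the leader key in etcd current, and its REST API answers
200 only from the node that currently holds it, which is exactly what
the load balancer's health check keys off.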
On 2/22/23 5:29 AM, David White wrote:
> If I understand open source Postgres correctly, though, this setup basically requires that the application be read-only, or at least be intelligent enough not to attempt to write to a host if it has failed over to it, right? Don't you have to have human intervention to actually fail the master / primary over for write purposes?
>
> Sent with Proton Mail secure email.
>
> ------- Original Message -------
> On Tuesday, February 21st, 2023 at 10:28 PM, Marcus Jodorf <bofh at killfile.de> wrote:
>
>
>> On 2/21/23 18:04, Lars Schotte wrote:
>>
>>> Yes, that looks nice, I am going to try that too.
>>>
>>> Because every other option is based on some other software,
>>> like relayd or nginx, it is all possible, but adds complexity.
>>>
>>> The best would be to have it inside connection string.
>> /etc/dovecot/dovecot-sql.conf.ext:
>>
>> ...
>> connect= host=host1 host=host2 dbname=<db> user=<user> password=<pwd>
>>
>> I've been using that since pretty much 2006 - if I'm not mistaken - on my
>> little servers. It simply works.
>>
>> If one server is not reachable you just get an error log entry in
>> mail.err that connect failed to the database that is down.
>> But that is all - dovecot keeps working as normal.
>>
>> BTW: Same with postfix. Simply list an additional fallback and it just
>> works.
>>
>> Best,
>>
>> Marcus