Redundant Database, Pgsql ?
I am aware that there are provising for redundant database connections
Basically i was working on my main db server (which is also a mail sever)
I current have this in the dovecot-pgsql.conf
driver = pgsql connect = host=localhost port=5433 dbname=scom_billing user=pgsql password=xxxxxxxxx default_pass_scheme = PLAIN
password_query = SELECT username as user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
user_query = SELECT home, uid, gid FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
#iterate_query = SELECT user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
iterate_query = SELECT "username" as user, domain FROM email_users WHERE status = True and alias_flag = False
Basically I have replicaed servers all over the place but mail18 (because dovecot only knows of one db instance) went down as well.
May I please get some guidance on what to add to talk to another postgresql server (i have 6 replicated servers so would probably want a couple worst case issue)
--
Happy Sunday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services <http://www.scom.ca> 004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266 Email paul@scom.ca
It sounds like you want your application (Dovecot) to gracefully failover.
Take a look at https://www.citusdata.com/blog/2021/02/12/reconnecting-your-app-after-a-post.... Perhaps that's what you're looking for, namely, the section on "Multi-Hosts Connection Strings, thanks to libpq"
Someone else on the list could go into the details on this.
Sent with Proton Mail secure email.
------- Original Message ------- On Sunday, February 19th, 2023 at 10:01 AM, Paul Kudla <paul@scom.ca> wrote:
I am aware that there are provising for redundant database connections
Basically i was working on my main db server (which is also a mail sever)
I current have this in the dovecot-pgsql.conf
driver = pgsql connect = host=localhost port=5433 dbname=scom_billing user=pgsql password=xxxxxxxxx default_pass_scheme = PLAIN
password_query = SELECT username as user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and
destination = '%u'
user_query = SELECT home, uid, gid FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
#iterate_query = SELECT user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
iterate_query = SELECT "username" as user, domain FROM email_users WHERE status = True and alias_flag = False
Basically I have replicaed servers all over the place but mail18 (because dovecot only knows of one db instance) went down as well.
May I please get some guidance on what to add to talk to another postgresql server (i have 6 replicated servers so would probably want a couple worst case issue)
--
Happy Sunday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services http://www.scom.ca
004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266 Email paul@scom.ca
Yes, that looks nice, I am going to try that too.
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
The best would be to have it inside connection string.
On Sun, 2023-02-19 at 17:34 +0000, David White wrote:
Take a look at https://www.citusdata.com/blog/2021/02/12/reconnecting-your-app-after-a-post... . Perhaps that's what you're looking for, namely, the section on "Multi-Hosts Connection Strings, thanks to libpq"
Lars Schotte Mudroňova 13 92101 Piešťany
Lars Schotte skrev den 2023-02-21 18:04:
Yes, that looks nice, I am going to try that too.
+1
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
+1
The best would be to have it inside connection string.
real fix or even possible is to add failback host=, eg host=localhost failbackfrom=dbi:/path-to-sqlite3 file, with is just cache from postgresql, then postgres can be down, if cache holds the wather
all this imho needs code changes, the cache must not be write cache, only read cache
that means it cant work for dict unless dict is sqlite aswell
On 2/21/23 18:04, Lars Schotte wrote:
Yes, that looks nice, I am going to try that too.
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
The best would be to have it inside connection string.
/etc/dovecot/dovecot-sql.conf.ext:
... connect= host=host1 host=host2 dbname=<db> user=<user> password=<pwd>
I'm using that since pretty much 2006 - if I'm not mistaken - on my little servers. Simply works.
If one server is not reachable you just get an error log entry in mail.err that connect failed to the database that is down. But that is all - dovecot keeps working as normal.
BTW: Same with postfix. Simply list an additional fallback and it just works.
Best,
Marcus
If I understand open source Postgres correctly, though, this setup basically requires that the application be read-only, or at least be intelligent enough not to attempt to write to a host if it has failed over to it, right? Don't you have to have human intervention to actually fail the master / primary over for write purposes?
Sent with Proton Mail secure email.
------- Original Message ------- On Tuesday, February 21st, 2023 at 10:28 PM, Marcus Jodorf <bofh@killfile.de> wrote:
On 2/21/23 18:04, Lars Schotte wrote:
Yes, that looks nice, I am going to try that too.
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
The best would be to have it inside connection string.
/etc/dovecot/dovecot-sql.conf.ext:
... connect= host=host1 host=host2 dbname=<db> user=<user> password=<pwd>
I'm using that since pretty much 2006 - if I'm not mistaken - on my little servers. Simply works.
If one server is not reachable you just get an error log entry in mail.err that connect failed to the database that is down. But that is all - dovecot keeps working as normal.
BTW: Same with postfix. Simply list an additional fallback and it just works.
Best,
Marcus
On 2/22/23 11:29, David White wrote:
If I understand open source Postgres correctly, though, this setup basically requires that the application be read-only, or at least be intelligent enough not to attempt to write to a host if it has failed over to it, right? Don't you have to have human intervention to actually fail the master / primary over for write purposes?
Yes, that is correct. But in the context of using Dovecot (passdb/userdb/quota read-only is normally sufficient).
For write fail-over you would typically use for example Pgpool-II.
Best,
Marcus
I don't even get what the advatages are of doing this with sql. If you use local replicated ldap and use local credential caching then your master ldap can go down without issues, even the local caching handle some local slapd issues. I guess the local caching is also faster. Afaik were databases not designed for this purpose and a better fit is ldap.
If I understand open source Postgres correctly, though, this setup basically requires that the application be read-only, or at least be intelligent enough not to attempt to write to a host if it has failed over to it, right? Don't you have to have human intervention to actually fail the master / primary over for write purposes?
Sent with Proton Mail secure email.
------- Original Message ------- On Tuesday, February 21st, 2023 at 10:28 PM, Marcus Jodorf <bofh@killfile.de> wrote:
On 2/21/23 18:04, Lars Schotte wrote:
Yes, that looks nice, I am going to try that too.
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
The best would be to have it inside connection string.
/etc/dovecot/dovecot-sql.conf.ext:
... connect= host=host1 host=host2 dbname=<db> user=<user> password=<pwd>
I'm using that since pretty much 2006 - if I'm not mistaken - on my little servers. Simply works.
If one server is not reachable you just get an error log entry in mail.err that connect failed to the database that is down. But that is all - dovecot keeps working as normal.
BTW: Same with postfix. Simply list an additional fallback and it just works.
Best,
Marcus
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
On Wed, 2023-02-22 at 11:08 +0000, Marc wrote:
I don't even get what the advatages are of doing this with sql. If you use local replicated ldap and use local credential caching then your master ldap can go down without issues, even the local caching handle some local slapd issues.
Going to have to +1 this. LDAP also does multi-master replication, which can make failover easier via DNS (like with a round robin for ldap.mydomain), or multiple LDAP dictionaries for dovecot. The [big] problem with OSS Postgres is that it only does master/slave replication, with no plans to add multi-master replication to the code base (there is Percona and 2ndQuadrant, but for small outfits, and individual there is a price barrier there). Personally I love PGSQL as a DB, but for SSO I use LDAP - because that's what it's designed for (i.e. read more than written).
I guess the local caching is also faster. Afaik were databases not designed for this purpose and a better fit is ldap.
This is totally true. RDBMS were not designed with this kind of use in mind, LDAP was - it is, after all, a directory service. So unless your auth stuff is part of some larger DB "thing" the directory type solutions are not suitable for (how many table joins, or other extensive SQL actions are taking place on that DB) then LDAP is the better way to go, and extending LDAP with custom schemas is simple - just grab an IANA number for you, or your organisation, so that you don't trample on any other schema out there. I have a custom schema that I use for postfix/dovecot - it's simple, quick, and efficient without the DB overhead ... and I get the multi-master replication in OpenLDAP.
Nikolai Lusan Email: nikolai@lusan.id.au -----BEGIN PGP SIGNATURE-----
iQIzBAEBCgAdFiEEVfd4GW6z4nsBxdLo4ZaDRV2VL6QFAmP2/+4ACgkQ4ZaDRV2V L6RRHxAAnSpKmYdg2+51CuSgxdQ8fHY65CRwflqQQzIEGIpApmgzh4fl5YkB8+qw utLe5ao6Qd8dIbWmENsLVEZIHeHr9fsmENXXhgF1DIcHwSH12rZHC5pt0zjKWh3j +dmmQ8/tlg/A3HbeEpMuFiVX02Zk3pddMW3D/ja7aAWXjecNmDHqxbghdhJrEzNy 3LiZQsGLQ0RvPiqULtk4ObSmjvuqZ9yJmnmV0IZTY1nRqJ74rHghwQOVK9HTTiMG MizBLqCzbA3suV+TyYxEGE02EtTOGAcWyTcJWY5EGDs1K9GHNML6ZhimZepMmBxL Bb4wr3rohNchdBiXyrACF0eNPJKMdMMWGAzySbWYiiXLRyyyM+KWjzvz6HDLmpXK gCAwY4l8oPRf7Uw3XD1yVyZY6yPrkWV3AZoLL+UYc9N48sApQMG1xh1PcgsfU3+0 Y3FkFy8iltxENh2diLZo5ikfVmQ+YxAZEr5Epl/xmuj30u8tHqpbANKEcvJkhXNG 4bUZ2d+6zxj25vEQq+RkVVRnsfyXNlCaD+E8yyegqRNFnTJAlUlcFWCQ60TcnWqX JKDDOTQj3rbfDZQMeTjbGfW5Fa8PJg5uRbnulXrK7RCizInoDIF8OpmXW4Hg4Ob9 1DiOJeQoefxIIkqs8Mlfj8t4mNy1/iZqs52DcTXRQVqogDesBWw= =ZyC3 -----END PGP SIGNATURE-----
i didnt pick up on the fact that this was auth stuff, and not indexes (indices?) or other data. LDAP is a hierarchical database, where the relationship between data is forced into a superior/subordinate structure. if you ask enough people, they will say that LDAP is not an authentication platform, but a database. i tend to agree, and have put Kerberos and LDAP together as my AuthN/Z suite. Kerberos is truly an AuthN protocol, and when i can point something at it i do. i store my Kerberos data in LDAP, and run n-way multi-primary replication. MIT Kerberos does not have an event based replication means, only time/schedule based. LDAP has an event based replication mechanism, so when data changes all the nodes in the cluster get an immediate replicated update. OpenLDAP can also proxy AuthN events to Kerberos with a few specific configurations set, and the password field being set to a specific string, {SASL}user@domain. this way something that talks LDAP, can point to LDAP for AuthN, and be proxied to back Kerberos. to me, this maintains the ever important "single source of the truth" for credentials.
I run LDAP behind a HAProxy VIP too, for load balancing purposes and transparent failover, so apps see less impact when failures occur. load balancing allows me to scale out (handle more requests in a given unit of time), and be fault tolerant. take a box out of the mix for updates, reboots, maintenance, whatever and not interrupt processing. i load balance nearly every stateful protocol, whenever possible. i anycast most stateless protocols, too, as a means of load sharing. by having more than one instance available to do the same work, you greatly reduce the "hair on fire" calls in the middle of the night, or at least shorten the Mean Time to Recovery.
On 2/23/23 12:55 AM, Nikolai Lusan wrote:
On Wed, 2023-02-22 at 11:08 +0000, Marc wrote:
I don't even get what the advatages are of doing this with sql. If you use local replicated ldap and use local credential caching then your master ldap can go down without issues, even the local caching handle some local slapd issues.
Going to have to +1 this. LDAP also does multi-master replication, which can make failover easier via DNS (like with a round robin for ldap.mydomain), or multiple LDAP dictionaries for dovecot. The [big] problem with OSS Postgres is that it only does master/slave replication, with no plans to add multi-master replication to the code base (there is Percona and 2ndQuadrant, but for small outfits, and individual there is a price barrier there). Personally I love PGSQL as a DB, but for SSO I use LDAP - because that's what it's designed for (i.e. read more than written).
I guess the local caching is also faster. Afaik were databases not designed for this purpose and a better fit is ldap.
This is totally true. RDBMS were not designed with this kind of use in mind, LDAP was - it is, after all, a directory service. So unless your auth stuff is part of some larger DB "thing" the directory type solutions are not suitable for (how many table joins, or other extensive SQL actions are taking place on that DB) then LDAP is the better way to go, and extending LDAP with custom schemas is simple - just grab an IANA number for you, or your organisation, so that you don't trample on any other schema out there. I have a custom schema that I use for postfix/dovecot - it's simple, quick, and efficient without the DB overhead ... and I get the multi-master replication in OpenLDAP.
think about this, you have connect= host=host1 host=host2 ...
when host1 fails, you must kick that host while it is down, each and every single time you want a new connection to the database. there is no record saying that box is down, no logic in the app saying i've already tried that box, so avoid it for "some period of time". you have to attempt a connection to it, since it is listed first in your connection parameters. you have to get an error back or wait for a timeout period to occur, and then move onto the next box.
have you ever faced a DNS resolution issue, where DNS was not working and everything slowed to a crawl because of it? similar kind of delays in processing here. you have to have an error or timeout occur before you move onto the next configured box. when you are dealing with high volume production environments, this is not a scaled solution.
with load balancing, you have an active connection from your app to the database VIP. then there is a secondary connection from the load balancer to the database host in the pool. when that connection dies and goes away, the app retries the connection to VIP and is automatically assigned to a different, alive and available host in the pool. no waiting for an error or timeout to occur, and then trying the next host in the config.
when you put the infrastructure together properly, no human intervention is required to mark a box down and not send traffic to it. as i said:
it requires a bit of supporting infrastructure to get it all working, but you wind up with a */highly-available, fault tolerant PostgreSQL footprint with automatic failover./*
by having the load balancer keep track of which hosts are alive and available, you dont have any guess work as to which host to connect to. the app just has to retry the connection. once a host is marked down by the load balancer, no traffic will be sent to it until it passes a health check. the health checking with PostgreSQL is the ETCd and Patroni pieces. those two processes are critical in the chain. with the health of the boxes and processes handled, knowing which PG host is the active R/W one is simple. when failure occurs, there is an election process to promote one of the standby hosts to the active R/W node, and that is reported up to the load balancer through the Patroni -> ETCd chain. all new connections wind up going to the newly elected active R/W node in the pool. *//*
On 2/22/23 5:29 AM, David White wrote:
If I understand open source Postgres correctly, though, this setup basically requires that the application be read-only, or at least be intelligent enough not to attempt to write to a host if it has failed over to it, right? Don't you have to have human intervention to actually fail the master / primary over for write purposes?
Sent with Proton Mail secure email.
------- Original Message ------- On Tuesday, February 21st, 2023 at 10:28 PM, Marcus Jodorf<bofh@killfile.de> wrote:
On 2/21/23 18:04, Lars Schotte wrote:
Yes, that looks nice, I am going to try that too.
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
The best would be to have it inside connection string. /etc/dovecot/dovecot-sql.conf.ext:
... connect= host=host1 host=host2 dbname=<db> user=<user> password=<pwd>
I'm using that since pretty much 2006 - if I'm not mistaken - on my little servers. Simply works.
If one server is not reachable you just get an error log entry in mail.err that connect failed to the database that is down. But that is all - dovecot keeps working as normal.
BTW: Same with postfix. Simply list an additional fallback and it just works.
Best,
Marcus
Paul Kudla skrev den 2023-02-19 16:01:
May I please get some guidance on what to add to talk to another postgresql server (i have 6 replicated servers so would probably want a couple worst case issue)
change host=localhost to host=some-other-hostname-with-multiple-ips :)
then dovecot with timeout and test next server ip
there might be more to it, but i think this is how to do it
yes that seems to be the approach
i setup a dns entry and pointed to 3 servers
it does work round robin (ie from main, secondary etc) but that is ok
at least it is working when i take the main server offline for maintenance !
Happy Tuesday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services <http://www.scom.ca> 004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266 Email paul@scom.ca
On 2023-02-19 12:56 p.m., Benny Pedersen wrote:
Paul Kudla skrev den 2023-02-19 16:01:
May I please get some guidance on what to add to talk to another postgresql server (i have 6 replicated servers so would probably want a couple worst case issue)
change host=localhost to host=some-other-hostname-with-multiple-ips :)
then dovecot with timeout and test next server ip
there might be more to it, but i think this is how to do it
let me preface all of this by saying i dont have PostgreSQL running in any fashion, but have come across footprints that are standing in a Production environment.
are you running a single primary R/W node, with multiple secondary R/O nodes? from what i have seen/heard, PG does not really have a well documented and currently support n-way multi-primary R/W load balancing mechanism. that said, some effort did exist but may no longer be supported.
i have seen a HA footprint of PG setup with HAProxy, PostgreSQL, ETCd and Patroni. HAProxy handles the Virtual IP (VIP) and can be setup as HA with VRRP running between the HAProxy nodes. ETCd is setup with 3 nodes, all monitoring the "active" status of the PostgreSQL nodes via Patroni. Patroni runs on each PG node, watching the status of the PG instances for failures. PG runs with one node in the "active" state, and replicates to the secondary nodes running in the "standby" state.
when the primary node encounters an issue, Patroni idenitifies that the node is no longer able to process, and ETCd records the updated status. HAProxy polls ETCd and is notified of the event, and marks the previously "active" member as down or in standby state. by election process, the standby nodes promote a node to the active state, and the Patroni -> ETCd -> HAProxy chain picks up the new active node.
it requires a bit of supporting infrastructure to get it all working, but you wind up with a highly-available, fault tolerant PostgreSQL footprint with automatic failover. the caveat is that you only have a single R/W instance at any one point. this could be a performance bottleneck in high volume environments.
some links that may shed light on what and how:
https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Poo... https://patroni.readthedocs.io/en/latest/ https://www.percona.com/blog/postgresql-ha-with-patroni-your-turn-to-test-fa... https://arctype.com/blog/postgres-patroni/
i do have a n-way multi-primary MariaDB footprint running with HAProxy, MariaDB and Galera running. Each MariaDB instance is R/W and can take a write event, and then replicate the event to the other cluster members. the VIP on HAProxy for port 3306 has all three cluster members load balanced using least connections. on the same VIP, but using a different port (3316, 3326, 3336) i have a backend for each of the individual cluster members, so i can isolate and troubleshoot each node separately.
in the PostgreSQL footprints i have come across, a similar setup using other ports has been used for access to the R/O nodes in the cluster. this can allow for queries, instead of writes, and reporting functionality.
best of luck,
brendan kearney
On 2/21/23 4:02 AM, Paul Kudla (SCOM.CA Internet Services Inc.) wrote:
yes that seems to be the approach
i setup a dns entry and pointed to 3 servers
it does work round robin (ie from main, secondary etc) but that is ok
at least it is working when i take the main server offline for maintenance !
Happy Tuesday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services <http://www.scom.ca> 004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266 Email paul@scom.ca
On 2023-02-19 12:56 p.m., Benny Pedersen wrote:
Paul Kudla skrev den 2023-02-19 16:01:
May I please get some guidance on what to add to talk to another postgresql server (i have 6 replicated servers so would probably want a couple worst case issue)
change host=localhost to host=some-other-hostname-with-multiple-ips :)
then dovecot with timeout and test next server ip
there might be more to it, but i think this is how to do it
participants (10)
-
Benny Pedersen
-
Brendan Kearney
-
David White
-
Lars Schotte
-
Marc
-
Marcus Jodorf
-
Marcus Jodorf
-
Nikolai Lusan
-
Paul Kudla
-
Paul Kudla (SCOM.CA Internet Services Inc.)