Redundant Database, Pgsql ?
I am aware that there are provisions for redundant database connections.
Basically I was working on my main db server (which is also a mail server).
I currently have this in dovecot-pgsql.conf:
driver = pgsql
connect = host=localhost port=5433 dbname=scom_billing user=pgsql password=xxxxxxxxx
default_pass_scheme = PLAIN
password_query = SELECT username as user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
user_query = SELECT home, uid, gid FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
#iterate_query = SELECT user, password FROM email_users WHERE username = '%u' and password <> 'alias' and status = True and destination = '%u'
iterate_query = SELECT "username" as user, domain FROM email_users WHERE status = True and alias_flag = False
Basically I have replicated servers all over the place, but mail18 went down as well (because dovecot only knows of one db instance).
May I please get some guidance on what to add so dovecot can talk to another PostgreSQL server? (I have 6 replicated servers, so would probably want a couple configured for a worst-case scenario.)
--
Happy Sunday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services http://www.scom.ca 004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266 Email paul@scom.ca
It sounds like you want your application (Dovecot) to gracefully failover.
Take a look at https://www.citusdata.com/blog/2021/02/12/reconnecting-your-app-after-a-post.... Perhaps that's what you're looking for, namely, the section on "Multi-Hosts Connection Strings, thanks to libpq"
Someone else on the list could go into the details on this.
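For reference, a plain libpq multi-host connection string of the kind that article describes looks roughly like this - the host names are placeholders, and the database name and port are simply taken from the config quoted above:

psql "postgresql://pgsql@db1.example.com:5433,db2.example.com:5433,db3.example.com:5433/scom_billing?target_session_attrs=read-write&connect_timeout=2"

libpq tries the hosts in order, skips any it cannot reach within connect_timeout, and with target_session_attrs=read-write it keeps going until it finds a server that accepts writes.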
Yes, that looks nice, I am going to try that too.
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
The best would be to have it inside connection string.
On Sun, 2023-02-19 at 17:34 +0000, David White wrote:
Take a look at https://www.citusdata.com/blog/2021/02/12/reconnecting-your-app-after-a-post... . Perhaps that's what you're looking for, namely, the section on "Multi-Hosts Connection Strings, thanks to libpq"
Lars Schotte Mudroňova 13 92101 Piešťany
Lars Schotte skrev den 2023-02-21 18:04:
Yes, that looks nice, I am going to try that too.
+1
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
+1
The best would be to have it inside connection string.
the real fix, if even possible, would be to add a failback host=, eg host=localhost failbackfrom=dbi:/path-to-sqlite3-file, which would just be a cache of postgresql; then postgres can be down, as long as the cache holds up
all this imho needs code changes; the cache must not be a write cache, only a read cache
that means it can't work for dict unless dict is sqlite as well
On 2/21/23 18:04, Lars Schotte wrote:
Yes, that looks nice, I am going to try that too.
Because every other option is based on some other software, like relayd or nginx, it is all possible, but adds complexity.
The best would be to have it inside connection string.
/etc/dovecot/dovecot-sql.conf.ext:
... connect= host=host1 host=host2 dbname=<db> user=<user> password=<pwd>
I've been using that since pretty much 2006 - if I'm not mistaken - on my little servers. Simply works.
If one server is not reachable you just get an error log entry in mail.err saying the connect to the database that is down failed. But that is all - dovecot keeps working as normal.
BTW: Same with postfix. Simply list an additional fallback and it just works.
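As an illustration of the postfix side, a pgsql lookup table can list more than one server on its hosts line. This is only a sketch - the file name, hosts and query are placeholders, not an actual map:

# /etc/postfix/pgsql-virtual.cf (hypothetical example)
hosts = db1.example.com db2.example.com
user = pgsql
password = xxxxxxxxx
dbname = scom_billing
# placeholder query; if one listed host is down postfix falls back to another
query = SELECT destination FROM email_users WHERE username = '%s'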
Best,
Marcus
If I understand open source Postgres correctly, though, this setup basically requires that the application be read-only, or at least be intelligent enough not to attempt to write to a host if it has failed over to it, right? Don't you have to have human intervention to actually fail the master / primary over for write purposes?
On 2/22/23 11:29, David White wrote:
If I understand open source Postgres correctly, though, this setup basically requires that the application be read-only, or at least be intelligent enough not to attempt to write to a host if it has failed over to it, right? Don't you have to have human intervention to actually fail the master / primary over for write purposes?
Yes, that is correct. But in the context of using Dovecot, read-only access (passdb/userdb/quota) is normally sufficient.
For write fail-over you would typically use, for example, Pgpool-II.
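For illustration only, the backend definitions in pgpool.conf look roughly like this (host names are placeholders; a real setup also needs the health-check and failover settings):

backend_hostname0 = 'db1.example.com'
backend_port0 = 5432
backend_weight0 = 1
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'db2.example.com'
backend_port1 = 5432
backend_weight1 = 1
backend_flag1 = 'ALLOW_TO_FAILOVER'

Dovecot's connect= would then point at the pgpool port instead of at postgres directly.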
Best,
Marcus
I don't even get what the advantages are of doing this with sql. If you use local replicated ldap and use local credential caching then your master ldap can go down without issues; the local caching even handles some local slapd issues. I guess the local caching is also faster. Afaik databases were not designed for this purpose and a better fit is ldap.
On Wed, 2023-02-22 at 11:08 +0000, Marc wrote:
I don't even get what the advantages are of doing this with sql. If you use local replicated ldap and use local credential caching then your master ldap can go down without issues; the local caching even handles some local slapd issues.
Going to have to +1 this. LDAP also does multi-master replication, which can make failover easier via DNS (like with a round robin for ldap.mydomain), or multiple LDAP dictionaries for dovecot. The [big] problem with OSS Postgres is that it only does master/slave replication, with no plans to add multi-master replication to the code base (there are Percona and 2ndQuadrant, but for small outfits and individuals there is a price barrier there). Personally I love PGSQL as a DB, but for SSO I use LDAP - because that's what it's designed for (i.e. read more than written).
I guess the local caching is also faster. Afaik databases were not designed for this purpose and a better fit is ldap.
This is totally true. RDBMS were not designed with this kind of use in mind, LDAP was - it is, after all, a directory service. So unless your auth stuff is part of some larger DB "thing" that directory-type solutions are not suitable for (lots of table joins, or other extensive SQL actions taking place on that DB), LDAP is the better way to go, and extending LDAP with custom schemas is simple - just grab an IANA number for yourself or your organisation so that you don't trample on any other schema out there. I have a custom schema that I use for postfix/dovecot - it's simple, quick, and efficient without the DB overhead ... and I get the multi-master replication in OpenLDAP.
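For comparison with the SQL examples above, a minimal dovecot-ldap.conf.ext can already list several directory servers. Everything below (hosts, base DN, attributes) is a placeholder sketch, not anyone's real config:

uris = ldap://ldap1.example.com ldap://ldap2.example.com
dn = cn=dovecot,ou=services,dc=example,dc=com
dnpass = secret
base = ou=people,dc=example,dc=com
pass_attrs = uid=user,userPassword=password
pass_filter = (&(objectClass=inetOrgPerson)(uid=%u))
user_attrs = homeDirectory=home,uidNumber=uid,gidNumber=gid
user_filter = (&(objectClass=inetOrgPerson)(uid=%u))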
Nikolai Lusan Email: nikolai@lusan.id.au
i didn't pick up on the fact that this was auth stuff, and not indexes (indices?) or other data. LDAP is a hierarchical database, where the relationship between data is forced into a superior/subordinate structure. if you ask enough people, they will say that LDAP is not an authentication platform, but a database. i tend to agree, and have put Kerberos and LDAP together as my AuthN/Z suite. Kerberos is truly an AuthN protocol, and when i can point something at it i do. i store my Kerberos data in LDAP, and run n-way multi-primary replication. MIT Kerberos does not have an event based replication means, only time/schedule based. LDAP has an event based replication mechanism, so when data changes all the nodes in the cluster get an immediate replicated update. OpenLDAP can also proxy AuthN events to Kerberos with a few specific configurations set, and the password field being set to a specific string, {SASL}user@domain. this way something that talks LDAP can point to LDAP for AuthN, and be proxied back to Kerberos. to me, this maintains the ever important "single source of the truth" for credentials.
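a tiny LDIF sketch of that pass-through idea, with a made-up user and realm; the {SASL} value is what makes slapd hand the bind off to saslauthd/Kerberos instead of comparing a stored hash (assuming slapd was built with SASL support and saslauthd is configured for kerberos5):

dn: uid=jdoe,ou=people,dc=example,dc=com
objectClass: inetOrgPerson
uid: jdoe
cn: Jane Doe
sn: Doe
userPassword: {SASL}jdoe@EXAMPLE.COM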
I run LDAP behind a HAProxy VIP too, for load balancing purposes and transparent failover, so apps see less impact when failures occur. load balancing allows me to scale out (handle more requests in a given unit of time), and be fault tolerant. take a box out of the mix for updates, reboots, maintenance, whatever and not interrupt processing. i load balance nearly every stateful protocol, whenever possible. i anycast most stateless protocols, too, as a means of load sharing. by having more than one instance available to do the same work, you greatly reduce the "hair on fire" calls in the middle of the night, or at least shorten the Mean Time to Recovery.
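a bare-bones haproxy stanza for such an LDAP VIP might look like this (addresses are placeholders); option ldap-check makes haproxy run a simple LDAPv3 probe instead of a plain TCP check:

listen ldap
    bind 192.0.2.10:389
    mode tcp
    balance leastconn
    option ldap-check
    server ldap1 10.0.0.11:389 check
    server ldap2 10.0.0.12:389 check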
think about this, you have connect= host=host1 host=host2 ...
when host1 fails, you still knock on that host while it is down, each and every time you want a new connection to the database. there is no record saying that box is down, no logic in the app saying "i've already tried that box, so avoid it for some period of time". you have to attempt a connection to it, since it is listed first in your connection parameters. you have to get an error back or wait for a timeout period to occur, and then move on to the next box.
have you ever faced a DNS resolution issue, where DNS was not working and everything slowed to a crawl because of it? similar kind of delays in processing here. you have to have an error or timeout occur before you move onto the next configured box. when you are dealing with high volume production environments, this is not a scaled solution.
with load balancing, you have an active connection from your app to the database VIP. then there is a secondary connection from the load balancer to the database host in the pool. when that connection dies and goes away, the app retries the connection to VIP and is automatically assigned to a different, alive and available host in the pool. no waiting for an error or timeout to occur, and then trying the next host in the config.
when you put the infrastructure together properly, no human intervention is required to mark a box down and not send traffic to it. as i said:
it requires a bit of supporting infrastructure to get it all working, but you wind up with a highly-available, fault tolerant PostgreSQL footprint with automatic failover.
by having the load balancer keep track of which hosts are alive and available, you don't have any guess work as to which host to connect to. the app just has to retry the connection. once a host is marked down by the load balancer, no traffic will be sent to it until it passes a health check. the health checking with PostgreSQL is the ETCd and Patroni pieces. those two processes are critical in the chain. with the health of the boxes and processes handled, knowing which PG host is the active R/W one is simple. when failure occurs, there is an election process to promote one of the standby hosts to the active R/W node, and that is reported up to the load balancer through the Patroni -> ETCd chain. all new connections wind up going to the newly elected active R/W node in the pool.
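a sketch of how that chain commonly shows up in haproxy: the postgres backends are health-checked against Patroni's REST port (8008 by default), and only the node Patroni reports as primary passes the check. names and addresses here are placeholders, not a tested config:

listen postgres-rw
    bind 192.0.2.10:5432
    mode tcp
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2
    server db1 10.0.0.11:5432 check port 8008
    server db2 10.0.0.12:5432 check port 8008
    server db3 10.0.0.13:5432 check port 8008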
Paul Kudla skrev den 2023-02-19 16:01:
May I please get some guidance on what to add to talk to another postgresql server (i have 6 replicated servers so would probably want a couple worst case issue)
change host=localhost to host=some-other-hostname-with-multiple-ips :)
then dovecot will, on timeout, try the next server ip
there might be more to it, but i think this is how to do it
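in other words, something along these lines - the zone name and addresses are made up, and the db name/port come from the config quoted earlier:

; three A records behind one name, so clients have more than one ip to try
pgdb    IN  A   10.0.0.11
pgdb    IN  A   10.0.0.12
pgdb    IN  A   10.0.0.13

connect = host=pgdb.example.com port=5433 dbname=scom_billing user=pgsql password=xxxxxxxxx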
yes that seems to be the approach
i set up a dns entry and pointed it to 3 servers
it does work round robin (ie from main, secondary etc) but that is ok
at least it is working when i take the main server offline for maintenance !
Happy Tuesday !!! Thanks - paul
Paul Kudla
Scom.ca Internet Services http://www.scom.ca 004-1009 Byron Street South Whitby, Ontario - Canada L1N 4S3
Toronto 416.642.7266 Main 1.866.411.7266 Fax 1.888.892.7266 Email paul@scom.ca
let me preface all of this by saying i don't have PostgreSQL running in any fashion, but i have come across footprints that are standing in a production environment.
are you running a single primary R/W node, with multiple secondary R/O nodes? from what i have seen/heard, PG does not really have a well documented and currently supported n-way multi-primary R/W load balancing mechanism. that said, some efforts did exist but may no longer be supported.
i have seen a HA footprint of PG setup with HAProxy, PostgreSQL, ETCd and Patroni. HAProxy handles the Virtual IP (VIP) and can be setup as HA with VRRP running between the HAProxy nodes. ETCd is setup with 3 nodes, all monitoring the "active" status of the PostgreSQL nodes via Patroni. Patroni runs on each PG node, watching the status of the PG instances for failures. PG runs with one node in the "active" state, and replicates to the secondary nodes running in the "standby" state.
when the primary node encounters an issue, Patroni identifies that the node is no longer able to process, and ETCd records the updated status. HAProxy polls ETCd and is notified of the event, and marks the previously "active" member as down or in standby state. by election process, the standby nodes promote a node to the active state, and the Patroni -> ETCd -> HAProxy chain picks up the new active node.
it requires a bit of supporting infrastructure to get it all working, but you wind up with a highly-available, fault tolerant PostgreSQL footprint with automatic failover. the caveat is that you only have a single R/W instance at any one point. this could be a performance bottleneck in high volume environments.
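for a feel of the Patroni piece, a heavily trimmed patroni.yml might look roughly like this; every name, address and path is a placeholder, and a real file also needs the bootstrap/DCS and replication sections:

scope: pgcluster
name: db1
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.11:8008
etcd:
  host: 10.0.0.21:2379
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.11:5432
  data_dir: /var/lib/postgresql/15/main
  authentication:
    superuser:
      username: postgres
      password: secret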
some links that may shed light on what and how:
https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Poo... https://patroni.readthedocs.io/en/latest/ https://www.percona.com/blog/postgresql-ha-with-patroni-your-turn-to-test-fa... https://arctype.com/blog/postgres-patroni/
i do have an n-way multi-primary MariaDB footprint running with HAProxy, MariaDB and Galera. each MariaDB instance is R/W and can take a write event, and then replicate the event to the other cluster members. the VIP on HAProxy for port 3306 has all three cluster members load balanced using least connections. on the same VIP, but using a different port (3316, 3326, 3336) i have a backend for each of the individual cluster members, so i can isolate and troubleshoot each node separately.
in the PostgreSQL footprints i have come across, a similar setup using other ports has been used for access to the R/O nodes in the cluster. this can allow for queries, instead of writes, and reporting functionality.
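those per-node side ports are just extra haproxy front-ends pointing at a single member each, roughly like this (placeholder addresses):

listen galera-node1
    bind 192.0.2.20:3316
    mode tcp
    server db1 10.0.0.31:3306 check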
best of luck,
brendan kearney
participants (10)
- Benny Pedersen
- Brendan Kearney
- David White
- Lars Schotte
- Marc
- Marcus Jodorf
- Marcus Jodorf
- Nikolai Lusan
- Paul Kudla
- Paul Kudla (SCOM.CA Internet Services Inc.)