Re: [Dovecot] Automatically Cleaning Kerberos Credential Cache Files
On 12/10/2012 05:31 PM, Ben Morrow wrote:
At 3PM -0500 on 10/12/12 you (Maura Dailey) wrote:
I'm in a situation here at work where I'm trying to support a mixed network of OS X and RHEL desktop machines with a Postfix/Dovecot combination.
- user account information is stored in LDAP
- user credentials are in MIT Kerberos
- server is running RHEL 6/Dovecot 2.0.9/Postfix 2.6.6
I am currently using the PAM passdb module to authenticate my users (I began to have trouble with using GSSAPI directly). After I implemented it, a few weeks later, I noticed that some users were no longer getting their mail if they hadn't logged in during the past day. Postfix's mailq showed that hundreds of messages were backing up in the queue. I eventually tracked it down to leftover Kerberos credential cache files (/tmp/krb5cc_????) sitting in /tmp on the mail server. The presence of expired credential files was preventing Postfix from delivering mail to those users' mail spools. If I delete the credential files manually, Postfix immediately delivers the queued emails.
This is rather odd. Is krb5-authenticated NFS involved here, or does Postfix's delivery make any other use of Kerberos? The only other thing I can think of is that so many expired ccaches are accumulating that the user goes over their inode quota.
Each user has one credential cache file in /tmp on the mail server after logging into Dovecot. We aren't using randomized names, so everything is in the standard format /tmp/krb5cc_uid. We do use KRB5 authenticated (and encrypted) NFS, but we don't deliver mail to home directories. Since all users are "real" users, and our office size is small, everyone has a mail spool directory on the mail server.
Postfix is configured to use Dovecot for Kerberos. The relevant lines in its /etc/postfix/main.cf are as follows:
smtpd_sasl_type = dovecot
smtpd_sasl_path = private/auth
smtpd_sasl_auth_enable = yes
smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated, reject_unauth_destination
smtpd_client_restrictions = permit_mynetworks, permit_sasl_authenticated, reject
smtpd_sasl_security_options = noanonymous
Currently, I have a cron job deleting the files manually every night. Obviously, this is a cruddy solution.
Well, I don't know about that: obviously, it would be a good idea to work out what's going on here, in case it causes anything else to go wrong, but a cronjob clearing out /tmp is a good idea in any case.
Fair enough. And of course, I get a cache file every time I use sudo or log in through a GUI. Those aren't cleaned up either! Maybe I should be deleting credential cache files on all the machines every night. Those cache files have never given me a problem, though.
I have Dovecot configured on a RHEL 6 box. The PAM stack on a RHEL 6 machine uses sssd (pam_sss.so) for authentication with Kerberos, not pam_krb5.so. I'm trying to track down which piece of the puzzle is responsible for cleaning up leftover credential caches. Is there a configuration option I can pass to Dovecot's passdb directly to clean up these cache files?
There are two relevant Dovecot settings: passing setcred=yes to the pam passdb will make Dovecot call pam_setcred, and passing session=yes will make it open and immediately close a PAM session. (Probably it ought to wait until the IMAP user logs out to close the session, but currently it can't do that because of the way the passdb lookups are done.) Changing either of these may have an effect, depending on when exactly your PAM module creates and destroys ccaches.
I've tried session=yes by itself, but I don't think I've tried it with setcred=yes. I'll throw it in there and give it a try after I send this email.
Do others generally have more success using a custom PAM stack with pam_krb5.so instead of pam_sss.so?
Well, I don't use RH (I use FreeBSD), but I use and would recommend Russ Allbery's pam_krb5.so, which may or may not be the same as the normal pam_krb5.so provided by your system. It has options to control whether and where ccaches are created; assuming Dovecot doesn't need krb5 creds (say, for NFS), you would probably be better off telling it not to create a permanent ccache at all.
http://www.eyrie.org/~eagle/software/pam-krb5/
I haven't had to configure pam_krb5.so directly before (we use the Red Hat/Fedora configured default, pam_sss.so, which claims to be a one stop shop for LDAP/Kerberos/NIS, etc.), but it does seem to have more options. We certainly don't need credential caches to stick around for email users. All the mail spools are stored on locally mounted storage on the mail server.
Poring over sssd's configuration options didn't reveal anything useful. I'm still not sure why Postfix even cares if there are expired credential cache files in /tmp at all.
I'm back to trying GSSAPI directly again as well in the meantime on a few test clients. When I used that in the past, users were getting issued duplicate Kerberos tickets and users were forced to log into the mail server directly using SSH after a day in order to get their mail to work (seemingly a related issue). If I get those same errors again, I'll start another thread.
This certainly does all sound related. What are the ccache files called: are they just /tmp/krb5cc_UID or is there a random portion as well? Are they being created with the correct permissions, and are there any security policies (SELinux or ACLs of some kind) set up which might interfere with their creation or destruction?
No random portion (the only application we have configured with random ccache names is SSH), just the usual /tmp/krb5cc_uid. Permissions appear correct, they belong to the appropriate user and group accounts. The SELinux permissions are set to: system_u:object_r:user_tmp_t:s0. This is different from the machine's credential cache (system_u:object_r:gssd_tmp_t:s0), but I've disabled and re-enabled SELinux during different parts of my testing and didn't notice any errors.
Using GSSAPI directly doesn't create any cached credential files on the server. However, I just verified that I'm still sporadically getting the duplicate tickets (two lines for imap/hostname.server.com@SERVER.COM, with identical expiration dates). It comes and goes. I have a few users testing it for me, and they'll let me know if their logins break tomorrow like they have in the past once the normal log in period elapses. Since the "fix" for that problem is to log into the mail server, I have a hard time testing for the problem myself (being logged into the mail server nearly continuously just now).
What happens if you log in as an ordinary user (preferably using the same PAM stack as Dovecot uses), use klist to find the ccache name and 'ls -i' to find its inode number, then manually kinit again? Does the kinit succeed, and does the new ccache have a different inode number from before? Does the ccache file get removed when you log out?
I've noticed that the ccache files for regular users do not get deleted when they log out. On another user's workstation, I found a ccache file for my account from a week ago (my current session was using a randomly generated SSH ccache file). I ran a command with sudo to force it to reauthenticate with pam_sss, and the timestamp on the ccache file was updated and the inode changed. My best guess is that pam_sss.so just doesn't do ccache cleanup.
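The inode comparison described above can be scripted. The following is a minimal sketch rather than a tested tool: the helper names (`ccache_path`, `ccache_identity`, `was_replaced`) are made up for illustration, and it assumes a file-based cache named either as a bare path or in the `FILE:/path` form.

```python
import os


def ccache_path(ccname):
    """Turn a cache name such as 'FILE:/tmp/krb5cc_1000' into a path.

    Bare paths are returned unchanged. Other cache types (anything else
    containing a ':' prefix) are not files, so None is returned.
    """
    if ccname.startswith("FILE:"):
        return ccname[len("FILE:"):]
    if ":" in ccname:
        return None
    return ccname


def ccache_identity(path):
    """Return (inode, mtime) for the cache file, like 'ls -i' plus a timestamp."""
    st = os.stat(path)
    return (st.st_ino, st.st_mtime)


def was_replaced(before, after):
    """True if the inode differs, i.e. the file was recreated, not rewritten in place."""
    return before[0] != after[0]
```

Record `ccache_identity()` before forcing a reauthentication (kinit, sudo, a fresh IMAP login) and again afterwards; a changed inode suggests the PAM module writes a new file and moves it into place rather than updating the old one.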
Ben
Thanks for taking a look, I appreciate all the suggestions. This problem's been driving me up a wall for weeks now. I only just managed to narrow down the cause in the last few days. If I can't get a fix working in Dovecot, I'll try tackling it from a different perspective. I understand why Dovecot might not be cleaning up credential files (especially if it's just calling PAM normally), but I definitely do not understand why Postfix is behaving the way it does. If I can just get GSSAPI to work, I could probably bypass the problem as well.
- Maura Dailey maura@eclipse.ncsc.mil
At 7PM -0500 on 10/12/12 you (Maura Dailey) wrote:
On 12/10/2012 05:31 PM, Ben Morrow wrote:
At 3PM -0500 on 10/12/12 you (Maura Dailey) wrote:
I'm in a situation here at work where I'm trying to support a mixed network of OS X and RHEL desktop machines with a Postfix/Dovecot combination.
- user account information is stored in LDAP
- user credentials are in MIT Kerberos
- server is running RHEL 6/Dovecot 2.0.9/Postfix 2.6.6
I am currently using the PAM passdb module to authenticate my users (I began to have trouble with using GSSAPI directly). After I implemented it, a few weeks later, I noticed that some users were no longer getting their mail if they hadn't logged in during the past day. Postfix's mailq showed that hundreds of messages were backing up in the queue. I eventually tracked it down to leftover Kerberos credential cache files (/tmp/krb5cc_????) sitting in /tmp on the mail server. The presence of expired credential files was preventing Postfix from delivering mail to those users' mail spools. If I delete the credential files manually, Postfix immediately delivers the queued emails.
This is rather odd. Is krb5-authenticated NFS involved here, or does Postfix's delivery make any other use of Kerberos? The only other thing I can think of is that so many expired ccaches are accumulating that the user goes over their inode quota.
Each user has one credential cache file in /tmp on the mail server after logging into Dovecot. We aren't using randomized names, so everything is in the standard format /tmp/krb5cc_uid. We do use KRB5 authenticated (and encrypted) NFS, but we don't deliver mail to home directories. Since all users are "real" users, and our office size is small, everyone has a mail spool directory on the mail server.
Hmm. I don't have much experience with KrbNFS, and none at all on Linux, but the implementations I've seen seem to be terribly flaky about passing krb5 creds to the kernel. (What they ought to do is implement AFS' aklog and setpag; they're irritating, but at least they're well-understood...)
In any case, it's likely that the delivery process looks in the user's home directory even if delivery is to a separate mail spool, unless you've taken steps to prevent this. For instance, Postfix's local(8) checks for ~/.forward by default, LDAs like procmail or maildrop look for similar per-user RC files, and Dovecot's LDA looks for (at least) ~/.dovecot.sieve. Is it possible that the NFS code returns a different error for 'no ccache present' vs 'ccache present but the creds have expired', such that Postfix will carry on delivering if it gets the first error but not the second?
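One way to confirm whether a path the delivery agent touches (a home directory, a ~/.forward) actually sits on NFS is to look it up in the mount table. This is a minimal sketch assuming the Linux /proc/mounts layout (device, mount point, filesystem type, ...); the function names are hypothetical, and mount points containing escaped characters such as \040 are not handled.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the fstype of the longest mount point containing `path`.

    `mounts_text` is text in /proc/mounts format. Returns None if no
    mount point matches.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        # Compare with trailing slashes so /home does not match /homework.
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type


def is_on_nfs(path, mounts_text):
    """True if `path` lives on a filesystem whose type starts with 'nfs'."""
    fstype = filesystem_type(path, mounts_text)
    return fstype is not None and fstype.startswith("nfs")
```

On the mail server this would be called as `is_on_nfs("/home/someuser", open("/proc/mounts").read())`, where the path is whatever home directory the userdb returns.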
<snip>
Well, I don't use RH (I use FreeBSD), but I use and would recommend Russ Allbery's pam_krb5.so, which may or may not be the same as the normal pam_krb5.so provided by your system. It has options to control whether and where ccaches are created; assuming Dovecot doesn't need krb5 creds (say, for NFS), you would probably be better off telling it not to create a permanent ccache at all.
http://www.eyrie.org/~eagle/software/pam-krb5/
I haven't had to configure pam_krb5.so directly before (we use the Red Hat/Fedora configured default, pam_sss.so, which claims to be a one stop shop for LDAP/Kerberos/NIS, etc.), but it does seem to have more options. We certainly don't need credential caches to stick around for email users. All the mail spools are stored on locally mounted storage on the mail server.
Where do users' private IMAP folders live? Are they in the mail spool as well, or are they in the user's home directory? Once a user has logged in Dovecot will change directory to their home directory (as returned by the userdb), so you may find you *do* need ccaches if they are on KrbNFS.
This certainly does all sound related. What are the ccache files called: are they just /tmp/krb5cc_UID or is there a random portion as well? Are they being created with the correct permissions, and are there any security policies (SELinux or ACLs of some kind) set up which might interfere with their creation or destruction?
No random portion (the only application we have configured with random ccache names is SSH), just the usual /tmp/krb5cc_uid.
(I assume you're aware of the potential DoS here, given that /tmp is world-writable and sticky? I'm not sure if there's anything you can do about it if you're using KrbNFS, though.)
Permissions appear correct, they belong to the appropriate user and group accounts. The SELinux permissions are set to: system_u:object_r:user_tmp_t:s0. This is different from the machine's credential cache (system_u:object_r:gssd_tmp_t:s0), but I've disabled and re-enabled SELinux during different parts of my testing and didn't notice any errors.
I know nothing whatever about SELinux, but this might be relevant. gssd handles client-side credentials for NFS, so if it ends up unable to snoop on a user's ccache you will have problems. (This is why you can't rename them to something secure, and is the problem aklog solves.)
Using GSSAPI directly doesn't create any cached credential files on the server. However, I just verified that I'm still sporadically getting the duplicate tickets (two lines for imap/hostname.server.com@SERVER.COM, with identical expiration dates).
That's odd, but it shouldn't be a problem. A valid ticket is a valid ticket, regardless of whatever other tickets might exist.
It comes and goes. I have a few users testing it for me, and they'll let me know if their logins break tomorrow like they have in the past once the normal log in period elapses. Since the "fix" for that problem is to log into the mail server, I have a hard time testing for the problem myself (being logged into the mail server nearly continuously just now).
It sounds to me as though clearing out dead ccaches, maybe even with an hourly-or-more cronjob that only deletes them if they've expired, will fix the problem for the moment. A more fundamental fix will require understanding when your PAM modules delete ccaches, and possibly turning off Postfix features like ~/.forward if you're not using them.
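A minimal sketch of such a cleanup job, written to be run from cron. It uses file modification time as a stand-in for ticket expiry, which is an assumption: the 24-hour default is illustrative and should be at least your realm's ticket lifetime. The function names are hypothetical. MIT's `klist -s` exits non-zero for an unreadable or expired cache and would be a more direct test, at the cost of running it once per file.

```python
import glob
import os
import stat
import time


def find_stale_ccaches(directory="/tmp", max_age_seconds=24 * 3600, now=None):
    """Return krb5cc_* regular files not modified within max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for path in sorted(glob.glob(os.path.join(directory, "krb5cc_*"))):
        try:
            st = os.lstat(path)  # lstat: never follow a planted symlink
        except FileNotFoundError:
            continue  # removed between the glob and the stat
        if stat.S_ISREG(st.st_mode) and now - st.st_mtime > max_age_seconds:
            stale.append(path)
    return stale


def remove_stale_ccaches(directory="/tmp", max_age_seconds=24 * 3600, dry_run=True):
    """Delete stale caches (or only list them when dry_run is True)."""
    removed = []
    for path in find_stale_ccaches(directory, max_age_seconds):
        if not dry_run:
            try:
                os.remove(path)
            except FileNotFoundError:
                continue
        removed.append(path)
    return removed
```

Running `remove_stale_ccaches(dry_run=True)` first and reading the list is a cheap way to check the age threshold before letting it delete anything.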
AFAICS the only piece of Dovecot configuration that might be relevant at this point is that a user's Dovecot home directory does not have to be the same as their 'real' home directory, so you can (if necessary) move all Dovecot-related files onto a local disk. See http://wiki2.dovecot.org/AuthDatabase/Passwd .
Ben
On 12/11/2012 08:52 AM, Ben Morrow wrote:
At 7PM -0500 on 10/12/12 you (Maura Dailey) wrote:
On 12/10/2012 05:31 PM, Ben Morrow wrote:
At 3PM -0500 on 10/12/12 you (Maura Dailey) wrote:
I'm in a situation here at work where I'm trying to support a mixed network of OS X and RHEL desktop machines with a Postfix/Dovecot combination.
- user account information is stored in LDAP
- user credentials are in MIT Kerberos
- server is running RHEL 6/Dovecot 2.0.9/Postfix 2.6.6
I am currently using the PAM passdb module to authenticate my users (I began to have trouble with using GSSAPI directly). After I implemented it, a few weeks later, I noticed that some users were no longer getting their mail if they hadn't logged in during the past day. Postfix's mailq showed that hundreds of messages were backing up in the queue. I eventually tracked it down to leftover Kerberos credential cache files (/tmp/krb5cc_????) sitting in /tmp on the mail server. The presence of expired credential files was preventing Postfix from delivering mail to those users' mail spools. If I delete the credential files manually, Postfix immediately delivers the queued emails.
This is rather odd. Is krb5-authenticated NFS involved here, or does Postfix's delivery make any other use of Kerberos? The only other thing I can think of is that so many expired ccaches are accumulating that the user goes over their inode quota.
Each user has one credential cache file in /tmp on the mail server after logging into Dovecot. We aren't using randomized names, so everything is in the standard format /tmp/krb5cc_uid. We do use KRB5 authenticated (and encrypted) NFS, but we don't deliver mail to home directories. Since all users are "real" users, and our office size is small, everyone has a mail spool directory on the mail server.
Hmm. I don't have much experience with KrbNFS, and none at all on Linux, but the implementations I've seen seem to be terribly flaky about passing krb5 creds to the kernel. (What they ought to do is implement AFS' aklog and setpag; they're irritating, but at least they're well-understood...)
Flaky is an understatement. Especially when you have to support Mac OS X users. After months of intermittent RPC errors, I had to revert them to NFS3. At least the RHEL users can use NFS4 and get the speed bump.
In any case, it's likely that the delivery process looks in the user's home directory even if delivery is to a separate mail spool, unless you've taken steps to prevent this. For instance, Postfix's local(8) checks for ~/.forward by default, LDAs like procmail or maildrop look for similar per-user RC files, and Dovecot's LDA looks for (at least) ~/.dovecot.sieve. Is it possible that the NFS code returns a different error for 'no ccache present' vs 'ccache present but the creds have expired', such that Postfix will carry on delivering if it gets the first error but not the second?
That is a very good point. We're using Postfix's local, which probably doesn't even know it needs credentials. It looks like I can change the forward_path, or set allow_mail_to_commands and allow_mail_to_files to disallow forwarding. As I said, we're a small office, so I doubt anyone will complain. I've set up forwarding for users on travel before in /etc/aliases.
<snip>
Well, I don't use RH (I use FreeBSD), but I use and would recommend Russ Allbery's pam_krb5.so, which may or may not be the same as the normal pam_krb5.so provided by your system. It has options to control whether and where ccaches are created; assuming Dovecot doesn't need krb5 creds (say, for NFS), you would probably be better off telling it not to create a permanent ccache at all.
http://www.eyrie.org/~eagle/software/pam-krb5/
I haven't had to configure pam_krb5.so directly before (we use the Red Hat/Fedora configured default, pam_sss.so, which claims to be a one stop shop for LDAP/Kerberos/NIS, etc.), but it does seem to have more options. We certainly don't need credential caches to stick around for email users. All the mail spools are stored on locally mounted storage on the mail server.
Where do users' private IMAP folders live? Are they in the mail spool as well, or are they in the user's home directory? Once a user has logged in Dovecot will change directory to their home directory (as returned by the userdb), so you may find you *do* need ccaches if they are on KrbNFS.
Users' IMAP folders live in their mail spools. I've been watching the maillogs all morning, and I've noticed that my test users (who are using GSSAPI without leaving credential files behind) are getting the error message "Error: chdir(/home/user/) failed: Permission denied (euid=1000(user) egid=2002(group) missing +x perm: /home, euid is not dir owner)," so your premise that the credential files might have been reused for Kerberos seems to be correct. However, I'd much rather prevent Dovecot AND Postfix from looking in /home. I will try the tip you gave at the end of your email, overriding the user's home directory.
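To see which users are hitting this error and how often, the maillog can be scanned for it. This is a minimal sketch: the regular expression is built only from the single error message quoted above, so other Dovecot versions may word it differently, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "chdir(/home/user/) failed: Permission denied (euid=1000(user) ..."
CHDIR_FAILED = re.compile(
    r"chdir\((?P<path>[^)]*)\) failed: (?P<reason>[^(]+?)\s*"
    r"\(euid=(?P<euid>\d+)\((?P<user>[^)]*)\)"
)


def chdir_failures(lines):
    """Count chdir failures per (user, path) across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = CHDIR_FAILED.search(line)
        if match:
            counts[(match.group("user"), match.group("path"))] += 1
    return counts
```

Feeding it an open log file (for example `chdir_failures(open("/var/log/maillog"))`, the usual RHEL location) gives a per-user tally that can be compared before and after overriding the Dovecot home directory.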
This certainly does all sound related. What are the ccache files called: are they just /tmp/krb5cc_UID or is there a random portion as well? Are they being created with the correct permissions, and are there any security policies (SELinux or ACLs of some kind) set up which might interfere with their creation or destruction?
No random portion (the only application we have configured with random ccache names is SSH), just the usual /tmp/krb5cc_uid.
(I assume you're aware of the potential DoS here, given that /tmp is world-writable and sticky? I'm not sure if there's anything you can do about it if you're using KrbNFS, though.)
Oh yeah, I'm painfully aware of how vulnerable we are to this now. Even if I find a fix, I'm pretty sure the cron job stays. If I can get direct GSSAPI logins (which don't create credential cache files) to work on the Macs, then I can proceed to lock down /tmp some more. Right now, if I create an empty file /tmp/krb5cc_myuid and chown it to root, sudo breaks. I assume the same is true even if I just create it as a different user.
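The squatting case described above (a /tmp/krb5cc_UID file owned by someone other than that UID) is easy to detect. A minimal sketch, assuming the naming scheme from this thread, optionally followed by an underscore and a random suffix as with the SSH caches; the function name is hypothetical.

```python
import os
import re

# krb5cc_1000, or krb5cc_1000_AbCdEf for randomized names
CCACHE_NAME = re.compile(r"^krb5cc_(\d+)(?:_.*)?$")


def mismatched_ccaches(directory="/tmp"):
    """Return (name, uid_in_name, actual_owner_uid) for caches owned by the wrong user."""
    problems = []
    for name in sorted(os.listdir(directory)):
        match = CCACHE_NAME.match(name)
        if not match:
            continue
        expected_uid = int(match.group(1))
        try:
            st = os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            continue  # removed while we were scanning
        if st.st_uid != expected_uid:
            problems.append((name, expected_uid, st.st_uid))
    return problems
```

Anything this reports is either a misbehaving service or someone pre-creating another user's cache file, so it may be worth running alongside the nightly cleanup.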
Permissions appear correct, they belong to the appropriate user and group accounts. The SELinux permissions are set to: system_u:object_r:user_tmp_t:s0. This is different from the machine's credential cache (system_u:object_r:gssd_tmp_t:s0), but I've disabled and re-enabled SELinux during different parts of my testing and didn't notice any errors.
I know nothing whatever about SELinux, but this might be relevant. gssd handles client-side credentials for NFS, so if it ends up unable to snoop on a user's ccache you will have problems. (This is why you can't rename them to something secure, and is the problem aklog solves.)
The targeted policy in RHEL really simplifies SELinux deployment. I've only ever had a few problems with it (usually whenever they start covering a service for the first time). In this case, PAM can clobber the files without a problem, so it's probably the case that user_tmp_t is generous enough.
Using GSSAPI directly doesn't create any cached credential files on the server. However, I just verified that I'm still sporadically getting the duplicate tickets (two lines for imap/hostname.server.com@SERVER.COM, with identical expiration dates).
That's odd, but it shouldn't be a problem. A valid ticket is a valid ticket, regardless of whatever other tickets might exist.
I suspect it's the kind of thing you notice when you're debugging things and not otherwise. It did make me wonder if I'd somehow screwed up my keytab file, but it looks clean and all the key numbers match.
It comes and goes. I have a few users testing it for me, and they'll let me know if their logins break tomorrow like they have in the past once the normal log in period elapses. Since the "fix" for that problem is to log into the mail server, I have a hard time testing for the problem myself (being logged into the mail server nearly continuously just now).
It sounds to me as though clearing out dead ccaches, maybe even with an hourly-or-more cronjob that only deletes them if they've expired, will fix the problem for the moment. A more fundamental fix will require understanding when your PAM modules delete ccaches, and possibly turning off Postfix features like ~/.forward if you're not using them.
AFAICS the only piece of Dovecot configuration that might be relevant at this point is that a user's Dovecot home directory does not have to be the same as their 'real' home directory, so you can (if necessary) move all Dovecot-related files onto a local disk. See http://wiki2.dovecot.org/AuthDatabase/Passwd .
That does sound like a useful feature. In the past, I've seen Dovecot spit out spurious errors about accessing user home directories. I'll have to check if I'm still getting those, because this sounds like a fix for that. I should be getting some good data from my GSSAPI test users later today or tomorrow. One woman claimed she has duplicate emails, but I suspect that's because I migrated her to IMAP.
Ben
I really do appreciate all the help.
- Maura Dailey maura@eclipse.ncsc.mil