What is the best way to do a (server-side) backup of all mail in a
user's mail?
I don't think I'm doing anything weird as far as configs go; here's
dovecot -n if it helps:
# 1.1.4: /etc/dovecot/dovecot.conf
protocols: imaps
listen: *, [::]
ssl_cert_file: /etc/ssl/dovecot/cert.pem
ssl_key_file: /etc/ssl/dovecot/key.pem
login_dir: /var/run/dovecot/login
login_executable: /usr/libexec/dovecot/imap-login
mail_location: maildir:~/.maildir
auth default:
  mechanisms: plain login
  passdb:
    driver: pam
    args: *
  userdb:
    driver: passwd
  socket:
    type: listen
    client:
      path: /var/spool/postfix/private/auth
      mode: 432
      user: postfix
      group: postfix
on 10-29-2008 12:25 PM Neil spake the following:
What is the best way to do a (server-side) backup of all mail in a user's mail?
I usually just rsync the /home directories to another server. The initial sync can take a while, but it gets faster after there is a base to work from.
-- MailScanner is like deodorant... You hope everybody uses it, and you notice quickly if they don't!!!!
On Oct 29, 2008, at 3:42 PM, Scott Silva wrote:
I usually just rsync the /home directories to another server. The
initial sync can take a while, but it gets faster after there is a base to work
from.
...and it's much less painful if you're using maildir instead of
mbox!
-Dave
-- Dave McGuire Port Charlotte, FL
on 10-29-2008 12:47 PM Dave McGuire spake the following:
...and it's much less painful if you're using maildir instead of mbox!
-Dave
Mbox syncs fairly quickly also. Rsync is very good at working with large text files like mbox, even if users purge stuff from the middle.
But since he did show Maildir was in use, I left out also backing up /var/spool/mail/* for the inboxes on a default mbox installation.
-- MailScanner is like deodorant... You hope everybody uses it, and you notice quickly if they don't!!!!
On 29 Oct 2008, at 16:02, Scott Silva wrote:
Mbox syncs fairly quickly also. Rsync is very good at working with large text files like mbox, even if users purge stuff from the middle.
But since he did show Maildir was in use, I left out also backing up /var/spool/mail/* for the inboxes on a default mbox installation.
Yeah, the maildir line was mostly why I put the dovecot -n there.
Do you think rsync will be easier on my servers than tarball/bzip2/scp ?
Thanks for the help, -Neil.
on 10-29-2008 2:46 PM Neil spake the following:
Yeah, the maildir line was mostly why I put the dovecot -n there.
Do you think rsync will be easier on my servers than tarball/bzip2/scp ?
Thanks for the help, -Neil.
Rsync will use more memory on large filesystems, but it is usually lighter in CPU, network, and IO time. But tar gives you multiple backups. To achieve that with rsync you need the rbackup script or rsnapshot.
-- MailScanner is like deodorant... You hope everybody uses it, and you notice quickly if they don't!!!!
Scott Silva wrote:
Rsync will use more memory on large filesystems, but it is usually lighter in CPU, network, and IO time. But tar gives you multiple backups. To achieve that with rsync you need the rbackup script or rsnapshot.
Also check snapback2 (similar to tools you mentioned above)
And brackup looks quite interesting for backing up maildir... (same chap who wrote memcached)
Ed W
On Wednesday 29 of October 2008, Dave McGuire wrote:
...and it's much less painful if you're using maildir instead of mbox!
Not for rsyncing. Tons of small files means much slower rsync.
-Dave
-- Arkadiusz Miśkiewicz PLD/Linux Team arekm / maven.pl http://ftp.pld-linux.org/
On Oct 29, 2008, at 5:32 PM, Arkadiusz Miskiewicz wrote:
Not for rsyncing. Tons of small files means much slower rsync.
Due to connection turnaround latency, I assume? (I've never
looked at the rsync protocol) If that's the case, then I stand very
much corrected, thank you. I was going from the same logic regarding
mbox vs. maildir in the context of backups. One new message
delivered and a 400MB mail spool gets backed up again..
-Dave
-- Dave McGuire Port Charlotte, FL
on 10-29-2008 3:18 PM Dave McGuire spake the following:
On Oct 29, 2008, at 5:32 PM, Arkadiusz Miskiewicz wrote:
Not for rsyncing. Tons of small files means much slower rsync.
Due to connection turnaround latency, I assume? (I've never looked at the rsync protocol) If that's the case, then I stand very much corrected, thank you. I was going from the same logic regarding mbox vs. maildir in the context of backups. One new message delivered and a 400MB mail spool gets backed up again..
-Dave
Rsync adds some latency as it indexes and compares files on both ends. Obviously it would take more time to compare 40,000 1K files than 1000 40K files even though the data size is similar. It would still be better than tar/bzip/scp, which has to compress everything and transfer the lot every time.
-- MailScanner is like deodorant... You hope everybody uses it, and you notice quickly if they don't!!!!
Scott Silva wrote, On 10/30/2008 12:34 AM:
on 10-29-2008 3:18 PM Dave McGuire spake the following:
On Oct 29, 2008, at 5:32 PM, Arkadiusz Miskiewicz wrote:
What is the best way to do a (server-side) backup of all mail in a user's mail?
I usually just rsync the /home directories to another server. The initial sync can take a while, but it gets faster after there is a base to work from.
...and it's much less painful if you're using maildir instead of mbox!
Not for rsyncing. Tons of small files means much slower rsync.
Due to connection turnaround latency, I assume? (I've never looked at the rsync protocol) If that's the case, then I stand very much corrected, thank you. I was going from the same logic regarding mbox vs. maildir in the context of backups. One new message delivered and a 400MB mail spool gets backed up again..
-Dave
Rsync adds some latency as it indexes and compares files on both ends. Obviously it would take more time to compare 40,000 1K files than 1000 40K files even though the data size is similar. It would still be better than tar/bzip/scp, which has to compress everything and transfer the lot every time.
Maildirsync is an "Online synchronizer for Maildir-format mailboxes". See http://hacks.dlux.hu/maildirsync/
Sot.
I use the tar/bzip method, and have been wondering about rsync. All my users have system accounts on the dovecot server, and use Maildir format. If I rsync the mail to another box where the users do not have system accounts, will the ownerships, permissions, etc. be goofed up?
Correctly, or incorrectly, I've been using tar to preserve all that information.
Cal Gordon
Calvin Gordon wrote:
I use the tar/bzip method, and have been wondering about rsync. All my users have system accounts on the dovecot server, and use Maildir format. If I rsync the mail to another box where the users do not have system accounts, will the ownerships, permissions, etc. be goofed up?
Correctly, or incorrectly, I've been using tar to preserve all that information.
rsync preserves all that too, but you should preserve uid->username and gid->groupname mappings too, otherwise all that information is not as useful. Saving the password files is usually sufficient, assuming you are doing backups for disaster recovery, and not just for the occasional restore after an "oops, I deleted all my mail!" phonecall.
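One way to keep those mappings next to the backup, as a rough Python sketch. The function names and the JSON layout are my own assumptions for illustration; copying /etc/passwd and /etc/group as described above does the same job.

```python
import grp
import json
import pwd


def id_maps() -> dict:
    """Collect uid->username and gid->groupname mappings from this host."""
    return {
        "users": {str(p.pw_uid): p.pw_name for p in pwd.getpwall()},
        "groups": {str(g.gr_gid): g.gr_name for g in grp.getgrall()},
    }


def save_id_maps(path: str) -> None:
    """Write the mappings as JSON so they can travel with the backup."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(id_maps(), handle, indent=2, sort_keys=True)
```

Calling save_id_maps() with a path inside the backed-up tree before each run means the numeric owners in the backup can still be matched to names on a box that never had those accounts.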
rsnapshot is nice too. It uses rsync and hard links to make as many snapshots of the filesystem as you like. This creates many 'restore points' with total disk usage being just over what a single full backup would take.
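To show why the extra snapshots cost so little disk, here is a minimal Python sketch of the hard-link idea. It is not rsnapshot's code: the function name is made up, it compares file contents where rsync would normally look at size and mtime, and it assumes the destination directory does not exist yet and sits on the same filesystem as the previous snapshot.

```python
import filecmp
import os
import shutil
from typing import Optional


def snapshot(source: str, dest: str, previous: Optional[str] = None) -> None:
    """Copy source into dest, hard-linking files unchanged since previous."""
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # Unchanged since the last run: share the existing data blocks.
                os.link(old, dst)
            else:
                # New or modified file: store a fresh copy.
                shutil.copy2(src, dst)
```

Each snapshot directory looks like a full copy, but a message that has not changed between runs is stored only once.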
Ken
-- Ken Anderson Pacific.Net
Dave McGuire wrote:
...and it's much less painful if you're using maildir instead of mbox!
-Dave
I have to wonder. I have a mailserver that I do a bootable complete image copy of with all files and O/S in two hours to an Ultrium-2 tape, 95 GB. When I switch to maildir, I will go from some 25,000 mbox files to 2.5 to 3 million files...I can't believe that isn't going to hurt and will force me into incrementals.....
On Thu, 2008-10-30 at 11:00 -0400, Stewart Dean wrote:
I have to wonder. I have a mailserver that I do a bootable complete image copy of with all files and O/S in two hours to an Ultrium-2 tape, 95 GB. When I switch to maildir, I will go from some 25,000 mbox files to 2.5 to 3 million files...I can't believe that isn't going to hurt and will force me into incrementals.....
One possibility is to just wait for dbox with multiple-messages-per-file feature. I can't really say when it'll be ready (or when I'll even start implementing it), but I know I want to use it myself and some companies have also recently been asking about it.
Timo Sirainen:
One possibility is to just wait for dbox with multiple-messages-per-file feature. I can't really say when it'll be ready (or when I'll even start implementing it), but I know I want to use it myself and some companies have also recently been asking about it.
Have you considered making dbox a major priority for v. 1.2?
I have been holding back on v.1.2 because I don't really see the big improvements in it that I saw in v.1.0 and v.1.1. With 1.0 and 1.1 I hurried off using them in production environments even while they were still in beta (of course only after proper testing) because they offered so many advantages (primarily speed and stability) over other solutions.
Since I'm focused almost entirely on stability and speed, and very little on fancy functionality, what v.1.0 offers in terms of functionality is just fine. What drove me towards 1.1 were speed improvements (and stability on NFS). I remember you made a post about not many people testing v.1.2. I think the reason may be that most users feel the same as me. They'd like to see a major feature that benefits their primary needs, which aren't in terms of functionality but more in terms of speed improvements. Dbox could be that feature as I think there isn't much room for further developing the Maildir format (and as far as I can see you have gone as far as possible with regards to optimizing speed while working within the boundaries of the Maildir standard).
Maildir is nice compared to mbox but it really isn't optimal. In days where IOPS is the most difficult resource to get into your server (and dovecot already using close to nothing in terms of CPU time and memory) having one file per e-mail is far from optimal, especially when a large number of users just download the whole mailbox using POP3 (not to mention backing up Maildirs).
Now don't take this as criticism, I love your software. I just would really like to see dbox evolve and think it would be a major driving force for v.1.2 :)
Develop dbox, Do it. Do it naoughw! (preferably pronounced with a schwarzeneggerish accent like in the last three seconds of this splendid video http://www.youtube.com/watch?v=adc3MSS5Ydc).
Best regards, Mikkel
I'd like to add my vote here as well; dbox would be *the* feature that would make me happy. I'm the guy who asked a few weeks ago about ways to speed access on our GFS clustered mail environment.
Meanwhile, I've done some preliminary testing with mbox. As expected, it's vastly faster than the Maildirs that we're using now. Of course it pains me to go "backwards" but that may be the interim solution. I got stopped temporarily when it seemed that I couldn't nest folders using mbox, but hopefully that's untrue.
Allen
-- Allen Belletti allen@isye.gatech.edu 404-894-6221 Phone Industrial and Systems Engineering 404-385-2988 Fax Georgia Institute of Technology
on 10-30-2008 11:42 AM Allen Belletti spake the following:
I'd like to add my vote here as well; dbox would be *the* feature that would make me happy. I'm the guy who asked a few weeks ago about ways to speed access on our GFS clustered mail environment.
Meanwhile, I've done some preliminary testing with mbox. As expected, it's vastly faster than the Maildirs that we're using now. Of course it pains me to go "backwards" but that may be the interim solution. I got stopped temporarily when it seemed that I couldn't nest folders using mbox, but hopefully that's untrue.
You can nest folders with mbox, but you can't have "folders" that contain both messages and other folders. A "folder" in mbox can either hold messages or other folders, but not both.
-- MailScanner is like deodorant... You hope everybody uses it, and you notice quickly if they don't!!!!
On Thu, 2008-10-30 at 14:42 -0400, Allen Belletti wrote:
I'd like to add my vote here as well; dbox would be *the* feature that would make me happy. I'm the guy who asked a few weeks ago about ways to speed access on our GFS clustered mail environment.
Meanwhile, I've done some preliminary testing with mbox. As expected, it's vastly faster than the Maildirs that we're using now. Of course it pains me to go "backwards" but that may be the interim solution. I got stopped temporarily when it seemed that I couldn't nest folders using mbox, but hopefully that's untrue.
You could use Maildir++ layout with mboxes too:
mail_location = mbox:~/mail:LAYOUT=maildir++
mbox handling code is less stable than maildir code though.
On Oct 30, 2008, at 2:35 PM, mikkel@euro123.dk wrote:
Maildir is nice compared to mbox but it really isn’t optimal. In days where IOPS is the most difficult resource to get into your server (and dovecot already using close to nothing in terms of CPU time and
memory) having one file per e-mail is less than sub-optimal especially when a large amount of users just downloads the whole mailbox using POP3
(not to mention backing up Maildirs).
It seems to me that a database like Postgres or MySQL would be the
best bet.
-Dave
-- Dave McGuire Port Charlotte, FL
On Oct 30, 2008, at 2:35 PM, mikkel@euro123.dk wrote:
Maildir is nice compared to mbox but it really isn't optimal. In days where IOPS is the most difficult resource to get into your server (and dovecot already using close to nothing in terms of CPU time and memory) having one file per e-mail is less than sub-optimal especially when a large amount of users just downloads the whole mailbox using POP3 (not to mention backing up Maildirs).
It seems to me that a database like Postgres or MySQL would be the best bet.
That's a matter of opinion. Moving mail storage to a database would probably be the last thing I would ever do (I'm not saying it's not the right thing for some people. I'm just not one of them). I'm using mysql for storing the users database but that's another story.
Adding a database is one additional level of complexity. One more program to govern. In my opinion it's nice to know that as long as the disk is readable nothing can go completely wrong.
The database in my case would be roughly 400 GB holding some 60 million records. Just imagine if one single byte got written to the wrong place. Power outage, OS crash, software bug or whatever could easily result in this (I regularly experience mysql tables that crash on their own from heavy use). Having to run a repair on a table of that size whilst all users are eager to get to their data must be a nightmare of proportions.
Just imagine backing the thing up, exporting 60.000.000 SQL queries. Not to mention importing them again if something should go really wrong. Actually I'm not even sure it would be faster. When the index files grow to several gigabytes they kind of lose their purpose.
Maildir is very resilient to various errors. It is virtually impossible to corrupt a maildir (at least I've never experienced anything). Also you can back the thing up without worrying about anything accessing it at the same time. Mbox less so, but still a lot better than having one huge database.
Dbox would be the ultimate compromise between crash resilience and a low number of files (not to mention the enormous potential for speed gains).
Regards, Mikkel
mikkel@euro123.dk wrote:
On Oct 30, 2008, at 2:35 PM, mikkel@euro123.dk wrote:
Maildir is nice compared to mbox but it really isn’t optimal. In days where IOPS is the most difficult resource to get into your server (and dovecot already using close to nothing in terms of CPU time and memory) having one file per e-mail is less than sub-optimal especially when a large amount of users just downloads the whole mailbox using POP3 (not to mention backing up Maildirs).
It seems to me that a database like Postgres or MySQL would be the best bet.
That's a matter of opinion. Moving mail storage to a database would probably be the last thing I would ever do (I'm not saying it's not the right thing for some people. I'm just not one of them). I'm using mysql for storing the users database but that’s another story.
Adding a database is one additional level of complexity. One more program to govern. In my opinion it's nice to know that as long as the disk is readable nothing can go completely wrong.
I have to jump in here and go a bit tangential by saying there are databases and want-to-be's.
The database in my case would be roughly 400 GB holding some 60 million records.
Fair sized but not really big.
Just imagine if one single byte got written to the wrong place. Power outage, OS crash, software bug or whatever could easily result in this (I regularly experience mysql tables that crash on their own from heavy use). Having to run a repair on a table of that size whilst all users are eager to get to their data must be a nightmare of proportions.
There is the difference between an enterprise database and MySQL. Yes, yes, yes lots of /enterprises/ run applications that use MySQL but most of those apps have throw away data or they are not using the free version of MySQL.
Just imagine backing the thing up, exporting 60.000.000 SQL queries. Not to mention importing them again if something should go really wrong. Actually I'm not even sure it would be faster. When the index files grow to several gigabytes they kind of lose their purpose.
There are many businesses backing up way more data than that, and it isn't 60,000,000 queries -- it is one command. But if you use serious hardware "backing up" isn't really needed. RAID, redundant/hot-swap servers, etc. make backing up /extra redundancy/. :-)
And I bring this up because "Archiveopteryx" <http://www.archiveopteryx.org/> uses a database - PostgreSQL.
Rod
mikkel@euro123.dk wrote:
Just imagine backing the thing up, exporting 60.000.000 SQL queries. Not to mention importing them again if something should go really wrong. Actually I'm not even sure it would be faster. When the index files grow to several gigabytes they kind of lose their purpose.
There are many businesses backing up way more data than that, and it isn't 60,000,000 queries -- it is one command. But if you use serious hardware "backing up" isn't really needed. RAID, redundant/hot-swap servers, etc. make backing up /extra redundancy/. :-)
Why make things complicated and expensive when you can make them cheap and simple? Anything is possible if you wanna pay for it (in terms of hardware, administration and licenses). I have focused primarily on making it as simple as possible.
And while running a 400 GB database with 60.000.000 records isn't impossible, it would be if it had to run on the same hardware that now makes up the system. Roughly 1000 IOPS is plenty to handle all mail operations.
I seriously doubt that it would be enough to supply even one lookup a second on that huge db (and even less over NFS, as is now being used). And I assume that hundreds of lookups a second would be required to handle the load.
So it would require a lot more resources and still give nothing but trouble (risk of a crashed database, and backup issues that aren't there now).
By the way, the data is stored in a SAN and it still needs to be backed up. 500 GB SATA disks take a day to synchronize if one breaks down and we can't really take that chance (yes, I will eventually move the data to smaller 15.000 RPM disks but there is no need to pay for them before it's necessary). Also there is the risk of data being deleted by mistake, hacker attacks or software malfunctioning.
But we really are moving off-topic here.
Regards, Mikkel
Stewart Dean wrote:
I have to wonder. I have a mailserver that I do a bootable complete image copy of with all files and O/S in two hours to an Ultrium-2 tape, 95 GB. When I switch to maildir, I will go from some 25,000 mbox files to 2.5 to 3 million files...I can't believe that isn't going to hurt and will force me into incrementals.....
Well, I can't talk about an installation of that size - but I CAN say I've been using an rsync based backup and been thrilled with the results. I'm supporting a measly dozen users, but I'm incrementally backing up our complete server on a nightly basis. I think the script now takes less than an hour to run - for a complete backup.
Since I'm doing this via a VPN to a server at my house, it makes it rather convenient. I love having an immediately available backup I can ssh to - that now represents a daily snapshot for over a year - so when one of my users says, "Um...if I deleted a file that I've been working on for the past two weeks...can you get it back?", I can.
-- Daniel
Dave McGuire wrote:
...and it's much less painful if you're using maildir instead of mbox!
-Dave
I have to wonder. I have a mailserver that I do a bootable complete image copy of with all files and O/S in two hours to an Ultrium-2 tape, 95 GB. When I switch to maildir, I will go from some 25,000 mbox files to 2.5 to 3 million files...I can't believe that isn't going to hurt and will force me into incrementals.....
My thoughts on rsync.
You may want to consider that incremental backups won't help you much if you use Maildir. Incremental or full, rsync still has to generate a list of all the files.
Whether it'll work for you is impossible to say. I guess you'll just have to run a test. But you're right that the large number of files will be an issue.
Rsync seems to be loading information about each file into memory before comparing the lists of files and doing the actual transfer. That may be a lot of memory if you have a lot of files.
I sometimes overcome this by rsyncing each user or domain one at a time. That way you will also limit issues of files no longer existing once the transfer begins (makes rsync generate errors).
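A rough Python sketch of the one-user-at-a-time approach, in case it helps. The function name, the layout (one directory per user directly under a mail root) and the destination string are assumptions for illustration; only the -a and --numeric-ids rsync options are real.

```python
import os
from typing import List


def per_user_rsync_commands(mail_root: str, dest: str) -> List[List[str]]:
    """Build one rsync argv per user directory directly under mail_root."""
    commands = []
    for name in sorted(os.listdir(mail_root)):
        src = os.path.join(mail_root, name)
        if not os.path.isdir(src):
            continue  # skip stray files; only user directories are synced
        # -a keeps permissions, times and ownership; --numeric-ids avoids
        # remapping uids by name on a box that lacks the same accounts.
        commands.append(
            ["rsync", "-a", "--numeric-ids",
             src + "/", f"{dest.rstrip('/')}/{name}/"]
        )
    return commands
```

Each list can be handed to subprocess.run() in turn, so every rsync run only has to build the file list for a single user. rsync exits with code 24 when source files vanished during the transfer, which a wrapper may want to treat as a warning rather than a failure.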
You can estimate the time needed to list all the files. Try using iostat to get a rough idea of how many IOPS your system handles under maximum stress load and how many it handles under normal operation. The difference is the amount available to you during the backup. Divide the total number of files by the number of available IOPS.
Say you have 100 IOPS available; then it will take roughly 8 hours (3,000,000/100/3600=8.3 hours) to generate the list of 3,000,000 files. The transfer afterwards will probably be a lot faster. I'm not sure whether reading information about one file takes up exactly one IO operation, but that way of calculating the time to generate the lists wasn't far off last time I tried.
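The same back-of-the-envelope estimate as a small Python function. The name is made up, and it bakes in the assumption of one IO operation per file, which as said above is only a guess.

```python
def estimate_listing_hours(file_count: int, available_iops: float) -> float:
    """Rough hours needed to list every file, assuming one I/O per file."""
    if available_iops <= 0:
        raise ValueError("available_iops must be positive")
    seconds = file_count / available_iops
    return seconds / 3600.0


if __name__ == "__main__":
    # Figures from the example: 3,000,000 files and 100 spare IOPS.
    print(f"{estimate_listing_hours(3_000_000, 100):.1f} hours")
```

Plugging in your own file count and the spare IOPS measured with iostat gives a first idea of whether a nightly run can finish in time.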
One option that I would prefer if I were to back up the entire store with one command would be to generate a snapshot of the file system and then rsync or cp that snapshot. That way you'll always get a consistent backup and you won't have to worry about how long the backup takes to finish.
Regards, Mikkel
mikkel@euro123.dk wrote:
Rsync seems to be loading information about each file into memory before comparing the lists of files and doing the actual transfer. That may be a lot of memory if you have a lot of files.
I sometimes overcome this by rsync’ing each user or domain one at a time. That way you will also limit issues of files no longer existing once the transfer begins (makes rsync generate errors).
If you are using rsync 2 then this is definitely good advice. Backing up large mailboxes takes massive amounts of memory (and a long time before anything starts happening, i.e. a snapshot is useful).
However, with rsync 3 you should look at the options required to use the incremental protocol. This trades a bit of efficiency on hardlinked files for lower memory use and perhaps faster syncs. I haven't personally tried this, but reports on the web seem promising. You need rsync 3 at both ends of the link, and you need to examine your sync options a little.
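As a rough illustration of "examining your sync options", here is a small Python check. The list of options is my reading of the rsync 3 man page, which says that options needing the full file list switch incremental recursion off; treat it as an assumption and check the man page for your version. The helper name is made up, and it only looks at long option names, not short forms such as `-m`.

```python
# Long options that, per my reading of the rsync 3 man page, disable
# incremental recursion. This set is an assumption to verify locally.
INC_RECURSION_BLOCKERS = {
    "--delete-before",
    "--delete-after",
    "--prune-empty-dirs",
    "--delay-updates",
    "--no-inc-recursive",
    "--no-i-r",
}


def inc_recursion_blockers(rsync_args):
    """Return the arguments that would force rsync to build the full list."""
    return [arg for arg in rsync_args if arg in INC_RECURSION_BLOCKERS]


print(inc_recursion_blockers(["-a", "--delete-after", "src/", "dst/"]))
```

An empty result does not prove incremental recursion is in use; it still needs rsync 3 at both ends.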
However, one thing which is sadly missing on rsync is a fuzzy option which can spot files moving from /new to /cur... This may well cause additional load for imap backups which is potentially avoidable with a simple copy. I suspect it would be easy to patch a custom bit of code to handle this though..?
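To make the idea concrete, here is a sketch in Python of how such a helper might spot moved messages. It is not a patch to rsync, and the function names are made up. It assumes the usual maildir naming, where a message keeps its unique name and gains a `:2,<flags>` suffix when it moves from new/ to cur/.

```python
from pathlib import PurePosixPath


def maildir_key(relpath):
    """Return (mailbox, unique_name) for a message path, else None."""
    parts = PurePosixPath(relpath).parts
    if len(parts) < 2 or parts[-2] not in ("new", "cur"):
        return None  # not a message file, e.g. an index or uidlist
    mailbox = "/".join(parts[:-2])
    unique = parts[-1].split(":", 1)[0]  # drop the ":2,<flags>" suffix
    return (mailbox, unique)


def find_moves(backup_paths, live_paths):
    """Map old backup path -> current live path for renamed messages."""
    backup_by_key = {}
    for path in backup_paths:
        key = maildir_key(path)
        if key is not None:
            backup_by_key[key] = path
    moves = {}
    for path in live_paths:
        key = maildir_key(path)
        old = backup_by_key.get(key) if key is not None else None
        if old is not None and old != path:
            moves[old] = path
    return moves


backup = ["INBOX/new/1225300000.M1P2.mail", "INBOX/cur/1225200000.M3P4.mail:2,S"]
live = ["INBOX/cur/1225300000.M1P2.mail:2,S", "INBOX/cur/1225200000.M3P4.mail:2,S"]
print(find_moves(backup, live))
```

Renaming the matching files on the backup side before running rsync would, in principle, let rsync see them as unchanged instead of transferring them again.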
One option that I would prefer if I were to backup the entire store with one command would be generating a snapshot of the file system. And then rsync or cp that snapshot. That way you’ll always get a consistent backup and you won’t have to worry about how long the backup takes to finish.
A snapshot seems like an excellent idea to avoid missing files that move between /cur and /new. However, it should be pointed out that this means extra I/O for the server (with LVM at least) while the backup is running.
I should think rsync3 incremental, plus some custom patching to look for files moving between /cur and /new would be very efficient for backing up maildir filestores (at least to the extent your filesystem allows efficient iterating over lots of files)
Ed W
mikkel@euro123.dk wrote:
One option that I would prefer if I were to back up the entire store with one command would be to generate a snapshot of the file system and then rsync or cp that snapshot. That way you'll always get a consistent backup and you won't have to worry about how long the backup takes to finish.
A snapshot seems like an excellent idea to avoid missing files that move between /cur and /new. However, it should be pointed out that this means extra I/O for the server (with LVM at least) while the backup is running.
I only have experience with UFS (FreeBSD) and ZFS (Solaris). Snapshots on UFS are a horrible thing for large file systems.
Snapshots on ZFS (which I use) are marvellous. They do not result in any extra I/O whatsoever, thanks to some clever design. If you have the option of using ZFS, it's definitely the best way to do it.
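For anyone who wants to try the snapshot approach on ZFS, here is a sketch in Python of the command sequence. The dataset name, mountpoint, destination and snapshot name are all hypothetical, and the function only builds the commands; it assumes the snapshot is readable under the dataset's `.zfs/snapshot/` directory.

```python
def snapshot_backup_commands(dataset, mountpoint, dest, snap_name):
    """Build the snapshot, copy and cleanup commands for one backup run.

    All four arguments are site-specific; the values used below are made up.
    """
    snap = f"{dataset}@{snap_name}"
    # A ZFS snapshot is exposed read-only under <mountpoint>/.zfs/snapshot/.
    source = f"{mountpoint}/.zfs/snapshot/{snap_name}/"
    return [
        ["zfs", "snapshot", snap],      # take a point-in-time snapshot
        ["rsync", "-a", source, dest],  # copy from the frozen view
        ["zfs", "destroy", snap],       # drop the snapshot afterwards
    ]


for cmd in snapshot_backup_commands(
    "tank/mail", "/tank/mail", "backup:/srv/mail/", "nightly"
):
    print(" ".join(cmd))
```

Because rsync reads from the frozen view, it no longer matters how long the copy takes or whether messages move between new/ and cur/ in the meantime.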
Regards, Mikkel
Hello all.
I have read this whole thread and am still wondering: which FS is best for managing up to 1.5 TB of maildir spool and making a near-real-time backup of it?
At first I planned to do it all on FreeBSD with UFS2 and use rsync, but I have never rsynced that much space or that many files.
If using Solaris, ZFS is best, but I don't have any experience with this OS.
What can Linux provide for this task?
-- Best regards, Proskurin Kirill
Proskurin Kirill wrote:
Hello all.
I have read this whole thread and am still wondering: which FS is best for managing up to 1.5 TB of maildir spool and making a near-real-time backup of it?
At first I planned to do it all on FreeBSD with UFS2 and use rsync, but I have never rsynced that much space or that many files.
If using Solaris, ZFS is best, but I don't have any experience with this OS.
What can Linux provide for this task?
Hi, I use the maildir format, and it's no problem to back up big maildirs with rsync. For backup I see no real dependence on the filesystem as long as you use a modern one (e.g. ext3). Perhaps you might use NFS or GFS for redundancy. After all, this type of backup (tar, rsync, e.g. run from cron) is only a snapshot, which may be OK for you, but you could also use always_bcc with Postfix, or imapsync, etc. Choosing ZFS may be the right choice anyway; I have read much good about it, but I don't see how it would help with backup.
Best Regards
MfG Robert Schetterer
Germany/Munich/Bavaria
participants (16)
- Allen Belletti
- Arkadiusz Miskiewicz
- Calvin Gordon
- Daniel L. Miller
- Dave McGuire
- Ed W
- Ken A
- mikkel@euro123.dk
- Neil
- Proskurin Kirill
- Robert Schetterer
- Roderick A. Anderson
- Scott Silva
- Sotiris Tsimbonis
- Stewart Dean
- Timo Sirainen