convert mdbox to maildir
Hi!
We need to move all users from one (pretty old) installation of dovecot to a new one. The old one uses mdbox for users' mailboxes and maildir for shared/public mailboxes. The new one must be maildir only.
I believe that I can just copy the shared/public maildir structure to the new installation without problems. Am I right? Do I have to expect any trouble? (Set owner, permissions etc...)
The real problem is that we must not use the running, old dovecot installation. So we are not able to connect to the old server, pull all folders and mails and create a new maildir structure. Currently, we can't do anything against it. What we get are the users' mdbox files.
Is there any way to convert mdbox files and structures to maildir directly from filesystem? Or do we have to build a copy of the old machine (dovecot only, or -maybe better- a vm) and then use doveadm backup? Or is it ok to just set up the completely new installation, set mail_location to where the new Maildirs will be, like maildir:~/Maildir and then run something like doveadm backup mdbox:/tmp/$user/mdbox -u $user? Will this transfer all mails and folders or do we have to keep an eye on some specific things?
I thank you very much and
cheers! -lutzn
*My* inbox gets filled with thousands of emails, more or less commercial content and trivial notifications from shopping online, and postfix crashes and will not accept new messages if the file "/var/mail/justina" becomes too large.
Configuring postfix to deliver the mail to "~/Maildir" solved that problem.
I still need to configure a sieve or a filter or some nicer mechanism to clear out messages that are either outright spam or too old or no longer of interest to me.
On August 13, 2022 10:00:36 AM AKDT, Marc <marc@f1-outsourcing.eu> wrote:
We need to move all users from one (pretty old) installation of dovecot to a new one. The old one uses mdbox for users' mailboxes and maildir for shared/public mailboxes. The new one must be maildir only.
why did you decide to move to maildir?
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.
I was just asking since I saw your gmx.net address, I thought this could be some large scale stuff and then iops and the used storage have requirements on your mail format. If you only have a few thousand accounts it does not matter that much (I guess)
It was a decision we have not been involved in. That's it. We have to do what we are paid for. But I did not want to start a meta-discussion. I was looking for ideas and technical experience. (And I agree with Justina that maildir seems to be much more manageable. It is a well-known format.)
Not sure what this had to do with the question asked. You're using mbox, but he is using mdbox. Postfix is unable to write to mdbox itself.
So? That is why you have LMTP, isn't it? Afaik mdbox was created to solve the (performance) issues with mbox and maildir etc. So I just wonder what the logic is behind choosing maildir nowadays.
mdbox is a proprietary format, only dovecot can handle it (afaik). Postfix can write maildir natively, and so can other MTAs. There are many tools to help in case something strange happens. I believe that using maildir is not that bad. And I don't see the real benefits of mdbox. Maybe with "old" spindle-driven devices you may have a performance benefit, but with SSDs... If you use the right filesystem, use RAID 10... Yes, it is said that mdbox performs better. The largest installations we had with dovecot were about 2k users (maildir was used) and another one had about 3k users (with mdbox); they all have branch offices that are connected to their "home" via some sort of VPN or direct connect by the ISP. We did not see a real difference between maildir and mdbox, maybe because of the surrounding conditions. And I believe mdbox may be faster, but we did not have the critical mass to see that.
On 14/08/22 21:47, Marc wrote:
So? That is why you have LMTP, isn't it? Afaik mdbox was created to solve the (performance) issues with mbox and maildir etc. So I just wonder what the logic is behind choosing maildir nowadays.
maildir is probably what most people use and should continue to use. There are cases where mbox is still viable but nowadays they are rare edge cases. Basically put mbox was one file for all mail in the mailbox, it served us well in the days of POP when clients would download all the mail from the server and it would be deleted right away on the server-side with no folders and very rarely leaving mail on the server at all. This worked because when a client downloaded the messages they could for the most part just basically stream the entire mbox file straight through the TCP POP connection and then simply delete or truncate the file.
Nowadays IMAP is prevalent and so we have multiple folders stored server-side and mail is largely left on the server, so messages need to be accessed in a sort of random-access style instead of just streaming the whole lot of them down at once as used to be done in the POP days. This makes Maildir (where messages are stored one-per-file) much more efficient for storage and access. Most people should probably be using Maildir nowadays, it's a good format and is extremely portable so that other tools can easily recognize and work directly with the Maildir files.
There is, however, one major issue with Maildir. Filesystems store files in clusters on disk (and even, I believe, on SSD drives), and these cluster sizes have been growing over the years in order to accommodate increasingly bigger file and filesystem sizes. The problem is that when you have 10,000 messages all approximately 500 bytes in size and a 4k cluster size, those messages don't take up 5MB on disk, but rather they take up approximately 40MB on disk because each file (which corresponds to each message) takes up at least a full cluster on disk.
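The arithmetic behind those two figures can be checked with a few lines of shell. This is only a sketch using the numbers from the paragraph above (10,000 messages, about 500 bytes each, 4k clusters); the variable names are made up for illustration, and it ignores inode overhead and filesystems that pack small files more tightly.

```shell
#!/bin/sh
# Compare the logical size of many small messages with the space they
# occupy when every message is its own file (Maildir-style).
# All figures are the example numbers from the text, not measurements.

messages=10000      # number of messages
msg_bytes=500       # approximate size of each message
cluster_bytes=4096  # filesystem allocation unit (4k)

# Logical size: what the message contents add up to.
logical=$((messages * msg_bytes))

# Clusters needed per message, rounded up to a whole cluster.
clusters_per_msg=$(( (msg_bytes + cluster_bytes - 1) / cluster_bytes ))

# Allocated size: every file occupies at least one full cluster.
allocated=$((messages * clusters_per_msg * cluster_bytes))

echo "logical bytes:   $logical"
echo "allocated bytes: $allocated"
```

With these inputs the logical size is 5,000,000 bytes and the allocated size is 40,960,000 bytes, which is where the "5MB versus roughly 40MB" comparison comes from.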
To solve this we now have mdbox which stores many messages (by default 10M worth) in one file, but not so many as to make the file unwieldy for random access of messages. The idea is that we can store way more messages that way in the same given space because we're not wasting most of the disk space on the filesystem having to use a full cluster per message. 10M is not, however, a huge amount of memory to allocate in RAM to manipulate one file with, so storing 10M worth of messages to each file tends to become a good compromise between storing all of the messages in one file vs storing one message per file.
At the end of the day, though, the storage benefits of mdbox should be weighed against the sheer simplicity and widespread use of Maildir. If you have really huge mailboxes (like ones that contain 50,000 or more messages) then mdbox may be the right solution for you, but most people will be fine with Maildir.
Peter
Am 14.08.22 um 13:05 schrieb Peter:
but most people will be fine with Maildir.
I had to switch from Maildir to something else and chose sdbox (one file per mail) and don't remember exactly what issues I had with Maildir but I think it was somehow IMAP keywords related like moving mails from one mailbox to another reset their keywords.
-- Cheers spi
On Sat, 2022-08-13 at 18:36 +0200, lutz.niederer@gmx.net wrote:
The real problem is that we must not use the running, old dovecot installation. So we are not able to connect to the old server, pull all folders and mails and create a new maildir structure. Currently, we can't do anything against it. What we get are the users' mdbox files.
Why not? Is the old server broken beyond repair? If not, is there an actual reason behind it, or is it just an arbitrary decision capable of being swayed by facts? Is the customer willing to pay for the large increase in time to rebuild the whole thing?
Will they at least let you rsync the old server's entire mdbox structure to a machine where you can do your conversion? I don't know, to me their act of giving you some files and saying "it's your problem now" seems arbitrary, and you should charge them a lot of money.
Is there any way to convert mdbox files and structures to maildir directly from filesystem? Or do we have to build a copy of the old machine (dovecot only, or -maybe better- a vm) and then use doveadm backup? Or is it ok to just set up the completely new installation, set mail_location to where the new Maildirs will be, like maildir:~/Maildir and then run something like doveadm backup mdbox:/tmp/$user/mdbox -u $user? Will this transfer all mails and folders or do we have to keep an eye on some specific things?
All I know about mdbox comes from this document: https://doc.dovecot.org/admin_manual/mailbox_formats/dbox/
Quoting a specific sentence: "One of the main reasons for dbox’s high performance is that it uses Dovecot’s index files as the only storage for message flags and keywords, so the indexes don’t have to be “synchronized”. Dovecot trusts that they’re always up-to-date (unless it sees that something is clearly broken). This also means that you must not lose the dbox index files, as they can’t be regenerated without data loss."
The quote says *dbox*, but it's in a section devoted to both dbox and mdbox, so I'm thinking it might be true of both. Have they given you the index files? If not, it sounds to me like any regeneration would be an approximation at best.
Do you have a way of accurately putting together the directory structure of the former mdbox system?
My experience 10 years ago converting about a quarter million kmail emails to Dovecot Maildir is it takes about an hour to transfer between 25,000 and 50,000 emails, but of course that was on a much more anemic machine than I have today. I'd guess that if you have both databases on the same machine, the way I did ten years ago, the process will go pretty fast. Here's a count of my Dovecot Maildir today:
[root@mydesk Maildir]# du -hs
16G     .
[root@mydesk Maildir]# find . | wc -l
734906
[root@mydesk Maildir]#
I don't know much about your particular situation, but it seems to me like the majority of your problem isn't technical.
SteveT
Yes, you are right. The problems are not of a technical nature. The reason seems to be some sort of fear (and "admins"). We have all we need. We have the old dovecot config, we have the mdbox files and the complete directory structure. We are simply not allowed to do any of this on the live system, even though dsync backup exists and does not modify anything. Currently it is planned that the real conversion will take place on a weekend. We are setting up an environment that lets us test that (how well it works, how long it will take). There are only around 500 users and ca. 4 TB (incl. public folders). I don't see a problem with that. And if they feel some sort of fear, then I can understand it (except the "admins" who should know better). We do what the customer wants us to do. And yes, they pay pretty well for working on weekends.
What I would like to do is set up the real new machine with everything - the new live system. Put all the mdbox files/structures into a directory and run the conversion for each user. So the first question is whether the mdbox files can really be "simple files in folders" or if they have to be "hosted" by an imap server, too. (As I understood, the destination can be our final imap server.) And I believe we can have the source in the filesystem (which would be much faster). The other question is whether we can run some of these conversions in parallel. I would say "why not".
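A minimal sketch of that per-user idea, written as a dry run that only prints the commands instead of executing them. Several things here are assumptions and not from the thread: the staging directory /srv/import, the layout <user>/mdbox below it, the user names alice and bob, and that mail_location on the new server already points at a Maildir. The -R (reverse) option is described in the doveadm-sync man page; it is used here because doveadm backup otherwise writes from mail_location to the location given on the command line and makes that destination match the source, so the direction is worth double-checking against the man page of the installed version and testing on a copy first.

```shell
#!/bin/sh
# Dry-run sketch: print one "doveadm backup" command per user.
# ASSUMPTIONS (hypothetical, not from the thread):
#   - the copied mdbox trees live under $IMPORT_ROOT/<user>/mdbox
#   - mail_location on the new server is already a Maildir location,
#     so -R pulls FROM the mdbox copy INTO the user's Maildir
#   - user names contain no spaces or shell metacharacters

IMPORT_ROOT=${IMPORT_ROOT:-/srv/import}   # hypothetical staging directory

# Print the conversion command for a single user.
conversion_cmd() {
    printf 'doveadm backup -R -u %s mdbox:%s/%s/mdbox\n' \
        "$1" "$IMPORT_ROOT" "$1"
}

# Print the commands for every user name read from standard input,
# skipping empty lines.
plan_conversion() {
    while IFS= read -r user; do
        if [ -n "$user" ]; then
            conversion_cmd "$user"
        fi
    done
}

# Example with two made-up users; nothing is executed, only printed.
printf '%s\n' alice bob | plan_conversion
```

Printing first makes it easy to review the plan and to time a single user before the weekend. Since each printed line touches only one user's mail, running several of them side by side should be possible in principle, but how many at once is something to measure on the test environment rather than assume.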
On August 14, 2022 9:46:54 AM AKDT, lutz.niederer@gmx.net wrote:
Yes, you are right. The problems are not of technical nature. ... We do what the customer wants us to do. And yes, they pay pretty well for working on weekends. ...
I'm sure there are more than enough professional mental health services available in any given district or locality, but I'm not sure why they are being discussed on a technical mailing list.
If your job is technical in nature, and that's what your customers are paying you for, then those problems of a technical nature are precisely what you'd better be focusing on.
Mostly I am a semi-technical do-it-yourselfer on the principle that I just can't tolerate the p*rn-surfing techie crowd from Silicon Valley, CA, and I find that most of the time if you want the job done right, you'd better do it yourself, especially if it's something very specific or technical.
Which is what many people have done and consequently why so much free and open source software exists in the first place.
lutz.niederer@gmx.net wrote:
Yes, you are right. The problems are not of technical nature. The reason seems to be some sort of fear (and "admins"). We have all we need. We have the old dovecot config, we have the mdbox files and the complete directory structure. We are simply not allowed to do all the stuff on the live system. Even if dsync backup exists that does not modify anything.
Would they allow a backup to your remote system aka:
doveadm backup -u XYZ -f -d destination
Have a look at man doveadm-backup for the specifics. Haven't done that before.
You have to run dovecot on the old server and the new server. The new server has to have maildir set as mail storage. Start with an initial remote backup while incoming mail is still running on the old server, because that will take some time. At the weekend, cut off incoming mail and repeat the backup. Finally you have to redirect mail to the new server. I haven't done that to a remote server before, only between different filesystems on one host.
If I might have misunderstood what you need to achieve, forget about this mail ;-)
Regards, Michael
2nd idea (see below)
Or set up replication between the old and the new server. Once all mail is replicated, redirect mail to the new server. With this setup they can even continue to use their old server for a while ;-)