[Dovecot] mbox: extra linefeed after Content-Length header in 1.1.rc8
mbox messages get header corruption caused by an extra linefeed after the Content-Length header.
Users see their mails in the Sent mbox folder without the From and To fields, without attachments, and with a date of 1/1/1970.
Diego.
Here is an anonymized header:
From xxxxxxxx@xxxxxx.xxxxxx.xxxxx.xx.xx Tue Jun 03 09:14:33 2008
Message-ID: <xxxxxxxx.xxxxxxx@xxxxxx.xxxxx.xx.xx>
X-UID: 3913
Status: RO
X-Keywords:
Content-Length: 6817

xxxx: xxx, xx xxx xxxx xx:xx:xx +xxxx
xxxx: xxxxxxx xxxxxxxx <xxxxxxx.xxxxxxxx@xxxxxx.xxxxx.xx.xx>
xxxx-xxxxx: xxxxxxxxxxx x.x.x.x (xxxxxxx/xxxxxxxx)
xxxx-xxxxxxx: x.x
xx: "xxxxxxxx@xxxxxxxx.xx" <xxxxxxxx@xxxxxxxx.xx>
xx: xxxxx.xxxxxxxxxx@xxxxxxxxxxxx.xx, xxxxxx xxxxx <xxxxxx.xxxxx@xxxxxxxx.xxxxxx.xx>, xxxxxxx xxxxxxxxxxx <xxxxxxxxxx@xxxxxxxxxxx.xxx>
xxxxxxx: xx: x: xx: xxxxxxxxx
xxxxxxxxxx: <xxxxxxxxxxx.xxxxxxxx@xxxxxxxx.xx>
xx-xxxxx-xx: <xxxxxxxxxxx.xxxxxxxx@xxxxxxxx.xx>
xxxxxxx-xxxx: xxxx/xxxxx; xxxxxxx=xxx-x; xxxxxx=xxxxxx
xxxxxxx-xxxxxxxx-xxxxxxxx: xxxx
On Tue, 2008-06-03 at 10:34 +0200, Diego Liziero wrote:
> mbox messages get header corruption caused by an extra linefeed after Content-Length
Fixed: http://hg.dovecot.org/dovecot-1.1/rev/e043135e971d
I guess 1.1.rc9 will still come, but I'll wait a couple of days before releasing it to see if there are more bugs.
On Tue, Jun 3, 2008 at 3:05 PM, Timo Sirainen <tss@iki.fi> wrote:
> On Tue, 2008-06-03 at 10:34 +0200, Diego Liziero wrote:
>> mbox messages get header corruption caused by an extra linefeed after Content-Length
Works, thank you.
Now I have to fix users' mbox files.
As the extra linefeed is between the Content-Length and Subject headers, I'm thinking about using a regexp-based replace such as s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s, but I can't figure out how to make multi-line matching work.
Any suggestion?
Regards, Diego.
On Wed, 4 Jun 2008, Diego Liziero wrote:
> On Tue, Jun 3, 2008 at 3:05 PM, Timo Sirainen <tss@iki.fi> wrote:
>> On Tue, 2008-06-03 at 10:34 +0200, Diego Liziero wrote:
>>> mbox messages get header corruption caused by an extra linefeed after Content-Length
> Works, thank you.
> Now I have to fix users' mbox files.
> As the extra linefeed is between the Content-Length and Subject headers, I'm thinking about using a regexp-based replace such as s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s, but I can't figure out how to make multi-line matching work.
Python has an re.MULTILINE option you can pass to the regular expression so that it can cross lines. Perhaps Perl or your favorite regular expression toolkit has something similar?
If not, Python it is! (-;
-- Asheesh.
-- Do not drink coffee in early A.M. It will keep you awake until noon.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Wed, Jun 04, 2008 at 03:03:34PM -0700, Asheesh Laroia wrote:
[...]
> Python has an re.MULTILINE option you can pass to the regular expression so that it can cross lines. Perhaps Perl or your favorite regular expression toolkit has something similar?
That would be the s modifier for a Perl regexp (treat string as a single line):
$x =~ /.../s
(This basically changes the meaning of . to also match end-of-line chars. To control whether ^ and $ match beginning/end of string or beginning/end of line within the string, see the m modifier.)
> If not, Python it is! (-;
Nah ;-)
Regards
- -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFIR4SmBcgs9XrR2kYRAiiXAJ43v4e7kJcztLeET+6DUfKYxgZGHgCeJ1zi YGYHYtPMsd8W2wy6M2tQOPA= =lbOV -----END PGP SIGNATURE-----
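The distinction tomás draws between the s and m modifiers has a close analogue in Python's re flags, which may help clarify the earlier re.MULTILINE suggestion. The following is a minimal sketch, not anything posted in the thread: the sample string and variable names are made up for illustration. It suggests that a literal \n in the pattern needs no flag at all once the text is held in a single string, that re.DOTALL (like /s) only affects ".", and that re.MULTILINE (like /m) only affects "^" and "$".

```python
import re

# Hypothetical damaged fragment: a stray blank line after Content-Length.
damaged = "Content-Length: 6817\n\nSubject: hello\n\nbody line\n"

# A literal \n in the pattern matches a newline without any flag,
# as long as the text being searched is held in one string.
pattern = re.compile(r"(Content-Length: [0-9]+)\n\n(Subject: )")
repaired = pattern.sub(r"\1\n\2", damaged)

# re.DOTALL (Perl /s): "." may also match a newline.
dot_default = re.search(r"6817.+Subject", damaged)
dot_dotall = re.search(r"6817.+Subject", damaged, re.DOTALL)

# re.MULTILINE (Perl /m): "^" may also match right after a newline.
caret_default = re.search(r"^Subject", damaged)
caret_multiline = re.search(r"^Subject", damaged, re.MULTILINE)

print(repr(repaired))
print(dot_default is None, dot_dotall is not None)
print(caret_default is None, caret_multiline is not None)
```

The catch, in either language, is that the whole text has to be in one string first, which is exactly the memory concern raised later in the thread.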
On Wed, 2008-06-04 at 23:59 +0200, Diego Liziero wrote:
> On Tue, Jun 3, 2008 at 3:05 PM, Timo Sirainen <tss@iki.fi> wrote:
>> On Tue, 2008-06-03 at 10:34 +0200, Diego Liziero wrote:
>>> mbox messages get header corruption caused by an extra linefeed after Content-Length
> Works, thank you.
> Now I have to fix users' mbox files.
> As the extra linefeed is between the Content-Length and Subject headers, I'm thinking about using a regexp-based replace such as s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s, but I can't figure out how to make multi-line matching work.
> Any suggestion?
Perl maybe? Something like (not tested):
perl -pe 'BEGIN { $/ = ""; } s/^(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/g' < mbox > mbox2
$/ changes the line separator.
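To make the effect of that record separator concrete, here is a rough Python emulation of paragraph-style reading. It rests on an assumption about Perl's behaviour with $/ = "" (records are split on runs of one or more blank lines, and each run is reduced to a single blank line); the function name `paragraphs` and the sample text are hypothetical. If that assumption holds, the Content-Length line and the Subject line land in different records, so a pattern that spans both never sees them together, and longer runs of blank lines get shortened on output.

```python
import re
from typing import Iterable, Iterator, List


def paragraphs(lines: Iterable[str]) -> Iterator[str]:
    """Yield blank-line-separated records (rough paragraph-mode emulation)."""
    record: List[str] = []
    for line in lines:
        if line.strip("\r\n") == "":
            if record:
                # The first blank line closes the record; keep one blank line.
                yield "".join(record) + "\n"
                record = []
            # Any further blank lines in the same run are swallowed.
        else:
            record.append(line)
    if record:
        yield "".join(record)


# Hypothetical sample: stray blank after Content-Length, then three blank lines.
sample = "Content-Length: 6817\n\nSubject: hi\n\n\n\nbody\n"
records = list(paragraphs(sample.splitlines(keepends=True)))

pattern = re.compile(r"^(Content-Length: [0-9]+)\n\n(Subject: )")
spans_records = any(pattern.search(r) for r in records)
rejoined = "".join(records)
```

Under this emulation `spans_records` stays false, and `rejoined` has lost two of the three consecutive blank lines, which is at least consistent with the report further down that empty lines outside the header were affected.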
On Wed, 2008-06-04 at 23:59 +0200, Diego Liziero wrote:
> As the extra linefeed is between the Content-Length and Subject headers, I'm thinking about using a regexp-based replace such as s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s, but I can't figure out how to make multi-line matching work.
> Any suggestion?
Thank you everyone for your help. After some quick tries, and following your suggestions, I ended up writing a silly Perl script that matches each of the three lines one by one and prints only the first and the third.
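The script itself was not posted, so the following is only a guess at the approach described, written in Python rather than Perl. It is a minimal sketch: the function name `drop_stray_blank` is hypothetical. It reads one line at a time, holds back a Content-Length line and the blank line after it, and drops the blank line only when the next line starts with "Subject: ". At most two lines are buffered, so memory use does not grow with file size.

```python
import re
from typing import Iterable, Iterator, List

CONTENT_LENGTH = re.compile(r"Content-Length: [0-9]+")


def drop_stray_blank(lines: Iterable[str]) -> Iterator[str]:
    """Remove a blank line sitting between Content-Length and Subject."""
    held: List[str] = []  # at most: [Content-Length line, blank line]
    for line in lines:
        if len(held) == 2:
            if line.startswith("Subject: "):
                yield held[0]      # keep Content-Length, drop the blank line
            else:
                yield from held    # normal header/body separator: keep both
            held = []
        elif len(held) == 1:
            if line.strip("\r\n") == "":
                held.append(line)  # candidate stray blank; decide on next line
                continue
            yield held[0]
            held = []
        if CONTENT_LENGTH.fullmatch(line.rstrip("\r\n")):
            held = [line]          # wait to see what follows
        else:
            yield line
    yield from held                # flush anything pending at end of input


# Hypothetical damaged message.
damaged = [
    "From x\n", "Content-Length: 6817\n", "\n",
    "Subject: hi\n", "\n", "body\n", "\n",
]
fixed = "".join(drop_stray_blank(damaged))
```

Since a file object iterates line by line, the same generator could be fed an open mbox and its output written to a new file. One caveat: a healthy message whose last header is Content-Length and whose body begins with "Subject: " would be altered too, so checking a copy first seems wise.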
On Thu, Jun 5, 2008 at 12:07 AM, Timo Sirainen <tss@iki.fi> wrote:
> Perl maybe? Something like (not tested):
> perl -pe 'BEGIN { $/ = ""; } s/^(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/g' < mbox > mbox2
> $/ changes the line separator.
Almost right, but this deletes all empty lines, not just the ones in the header. I didn't take a deeper look.
On Thu, Jun 5, 2008 at 8:16 AM, <tomas@tuxteam.de> wrote:
> That would be the s modifier for a Perl regexp (treat string as a single line):
> $x =~ /.../s
This should be the right way; see below.
On Thu, Jun 5, 2008 at 12:03 AM, Asheesh Laroia <asheesh@asheesh.org> wrote:
> Python has an re.MULTILINE option you can pass to the regular expression so that it can cross lines. Perhaps Perl or your favorite regular expression toolkit has something similar?
Yes, but with Perl I couldn't quickly find a way to read multiple lines from a file without filling all system memory when the files are several gigabytes in size.
Regards, Diego.
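One possible way around the memory concern, sketched here in Python as an assumption rather than something tried in the thread, is to memory-map the file and let the regex engine scan the mapping. The operating system pages the data in on demand, and the output is written in bounded chunks instead of building one huge string. The names `repair_file` and `copy_range` are hypothetical, and a 64-bit system is assumed for multi-gigabyte files.

```python
import mmap
import os
import re

# Bytes pattern, since mbox files are not guaranteed to be valid text.
PATTERN = re.compile(rb"(Content-Length: [0-9]+)\n\n(Subject: )")


def copy_range(mm, dst, start: int, end: int, chunk: int = 1 << 20) -> None:
    """Write mm[start:end] to dst in chunks to keep memory use bounded."""
    for off in range(start, end, chunk):
        dst.write(mm[off:min(off + chunk, end)])


def repair_file(src_path: str, dst_path: str) -> int:
    """Copy src to dst without the stray blank lines; return the fix count."""
    fixes = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        if os.path.getsize(src_path) == 0:
            return 0  # mmap cannot map an empty file
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = 0
            for m in PATTERN.finditer(mm):
                copy_range(mm, dst, pos, m.end(1))  # through Content-Length
                dst.write(b"\n")                    # one newline, not two
                pos = m.start(2)                    # resume at "Subject: "
                fixes += 1
            copy_range(mm, dst, pos, len(mm))       # remainder of the file
    return fixes
```

Writing to a second file, as in the mbox to mbox2 one-liner quoted above, leaves the original untouched if anything goes wrong.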
participants (4)
- Asheesh Laroia
- Diego Liziero
- Timo Sirainen
- tomas@tuxteam.de