On Wed, 2008-06-04 at 23:59 +0200, Diego Liziero wrote: As the extra linefeed is between the Content-Length and Subject headers, I'm thinking about using a regexp-based replace such as s/(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/s but I can't find out how to make multi-line matching work.
Any suggestion?
Thank you everyone for your help. After some quick tries, and following your suggestions, I ended up writing a simple Perl script that matches each of the three lines one by one and prints only the first and the third.
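The script itself is not shown in this message. As a rough illustration of the line-by-line idea only, here is a minimal Python sketch (not the Perl script mentioned above). The function name `drop_stray_blank_lines` is hypothetical, and the sketch assumes the stray blank line always sits directly between a `Content-Length:` line and a `Subject:` line. It works on bytes and holds at most two lines at a time, so memory use stays constant however large the mbox is.

```python
import re

# Matches a complete Content-Length header line (line ending already removed).
CONTENT_LENGTH = re.compile(rb"^Content-Length: [0-9]+$")


def drop_stray_blank_lines(lines):
    """Yield the input lines, skipping a blank line that sits between a
    Content-Length header and a Subject header (hypothetical helper)."""
    pending = []  # at most [content_length_line, blank_line]
    for line in lines:
        stripped = line.rstrip(b"\r\n")
        if len(pending) == 2:
            if stripped.startswith(b"Subject: "):
                # Three-line pattern found: keep the first line, drop the blank.
                yield pending[0]
            else:
                # Not the pattern: emit both held lines unchanged.
                yield from pending
            pending = []
        elif len(pending) == 1:
            if stripped == b"":
                # Possible stray blank line: hold it until the next line is seen.
                pending.append(line)
                continue
            yield from pending
            pending = []
        if CONTENT_LENGTH.match(stripped):
            pending = [line]  # hold the header until the following lines are known
        else:
            yield line
    # Flush anything still held at end of input.
    yield from pending
```

One way to use it as a filter would be `sys.stdout.buffer.writelines(drop_stray_blank_lines(sys.stdin.buffer))` after `import sys`, run as `python fix.py < mbox > mbox2` (the script name is an assumption).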
On Thu, Jun 5, 2008 at 12:07 AM, Timo Sirainen <tss@iki.fi> wrote:
Perl maybe? Something like (not tested):
perl -pe 'BEGIN { $/ = ""; } s/^(Content-Length: [0-9]+)\n\n(Subject: )/$1\n$2/g' < mbox > mbox2
$/ changes the input record separator (an empty string selects paragraph mode).
Almost right, but this deletes all empty lines, not just the ones in the header. I didn't look into it any further.
On Thu, Jun 5, 2008 at 8:16 AM, <tomas@tuxteam.de> wrote:
That would be the s modifier for a Perl regexp (treat string as a single line):
$x =~ /.../s
This should be the right way; see below.
On Thu, Jun 5, 2008 at 12:03 AM, Asheesh Laroia <asheesh@asheesh.org> wrote:
Python has an re.MULTILINE option you can pass to the regular expression so that it can cross lines. Perhaps Perl or your favorite regular expression toolkit has something similar?
Yes, but with Perl I couldn't quickly find a way to read multiple lines from a file without filling all system memory when the files are several gigabytes in size.
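For comparison, when the whole file does fit in memory, the regexp approach discussed in this thread can be sketched in Python as follows. This is only an illustration under that assumption, and `fix_in_memory` is a hypothetical name. Note that in Python's `re` module it is the literal `\n\n` in the pattern that spans the lines; `re.MULTILINE` only makes `^` match at the start of every line rather than just at the start of the string.

```python
import re

# ^ anchors at each line start because of re.MULTILINE;
# the literal \n\n is what matches across the blank line.
PATTERN = re.compile(rb"^(Content-Length: [0-9]+)\n\n(Subject: )", re.MULTILINE)


def fix_in_memory(data):
    """Return data with the blank line between Content-Length and Subject
    removed. The whole input must already be loaded as one bytes object."""
    return PATTERN.sub(rb"\1\n\2", data)
```

Because `data` has to hold the entire mbox, this runs into exactly the memory limit described above for multi-gigabyte files.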
Regards, Diego.