Hi,
Sorry for the late answer...
On 06/13/11 15:40, Timo Sirainen wrote:
> On Thu, 2011-06-09 at 20:56 +0200, Attila Nagy wrote:
>> Hi,
>>
>> Currently Dovecot's LMTPd writes incoming emails to mail_temp_dir if they are bigger than 128k. But I would like to spare those unnecessary operations (creating a file, deleting it, writing into it, reading from it, checking whether there is free space and if not, rejecting (temporarily) the message). Memory is cheap, disk IO is not. :)
>> And BTW, on a lot of systems, /tmp is a memory file system already, so there is absolutely no need for this.
>
> If there's not enough disk space, nowadays the message is read fully into memory instead of tempfailing.

Well, that doesn't seem to be the case (or maybe it's caused by other stuff, like pigeonhole?).
Dovecot 2.0.13, with a temp dir capable of holding <64k:

Filesystem  Size  Used  Avail  Capacity  Mounted on
tmpfs        64k  4.0k    60k        6%  /data/tmp

Sending a message of 60k succeeds:

smtp-source -d -f from@from -l 60000 -m 1 -s 1 -S test -t to@to -L -v dovecot:24
/var/tmp/smtp-source: name_mask: all
/var/tmp/smtp-source: smtp_stream_setup: maxtime=300 enable_deadline=0
/var/tmp/smtp-source: vstream_tweak_tcp: TCP_MAXSEG 1448
/var/tmp/smtp-source: <<< 220 dovecot Dovecot LMTP ready
/var/tmp/smtp-source: LHLO me
/var/tmp/smtp-source: <<< 250-dovecot
/var/tmp/smtp-source: <<< 250-8BITMIME
/var/tmp/smtp-source: <<< 250-ENHANCEDSTATUSCODES
/var/tmp/smtp-source: <<< 250 PIPELINING
/var/tmp/smtp-source: MAIL FROM:from@from
/var/tmp/smtp-source: <<< 250 2.1.0 OK
/var/tmp/smtp-source: RCPT TO:to@to
/var/tmp/smtp-source: <<< 250 2.1.5 OK
/var/tmp/smtp-source: DATA
/var/tmp/smtp-source: <<< 354 OK
/var/tmp/smtp-source: .
/var/tmp/smtp-source: <<< 250 2.0.0 to@to id Saved
/var/tmp/smtp-source: QUIT
/var/tmp/smtp-source: <<< 221 2.0.0 Client quit

While with a bigger message:

smtp-source -d -f from@from -l 200000 -m 1 -s 1 -S test -t to@to -L -v dovecot:24
/var/tmp/smtp-source: name_mask: all
/var/tmp/smtp-source: smtp_stream_setup: maxtime=300 enable_deadline=0
/var/tmp/smtp-source: vstream_tweak_tcp: TCP_MAXSEG 1448
/var/tmp/smtp-source: <<< 220 dovecot Dovecot LMTP ready
/var/tmp/smtp-source: LHLO me
/var/tmp/smtp-source: <<< 250-dovecot
/var/tmp/smtp-source: <<< 250-8BITMIME
/var/tmp/smtp-source: <<< 250-ENHANCEDSTATUSCODES
/var/tmp/smtp-source: <<< 250 PIPELINING
/var/tmp/smtp-source: MAIL FROM:from@from
/var/tmp/smtp-source: <<< 250 2.1.0 OK
/var/tmp/smtp-source: RCPT TO:to@to
/var/tmp/smtp-source: <<< 250 2.1.5 OK
/var/tmp/smtp-source: DATA
/var/tmp/smtp-source: <<< 354 OK
/var/tmp/smtp-source: .
/var/tmp/smtp-source: <<< 451 4.3.0 Temporary internal failure
/var/tmp/smtp-source: fatal: end of data rejected: 451 4.3.0 Temporary internal failure

When I give a bigger tmp filesystem to it, it accepts the message.
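
To make the expected behaviour concrete, here is a rough sketch of "spill to a temp file above a threshold, but fall back to memory when the file system is full". It is Python, not Dovecot's C code, and it says nothing about how Dovecot 2.0.13 actually does it. MailSpool, open_temp and the other names are hypothetical, made up for this illustration; only the 128k threshold comes from the discussion above.

```python
import errno
import io
import tempfile

THRESHOLD = 128 * 1024  # the 128k limit discussed above


class MailSpool:
    """Hold a message in memory; past `threshold` bytes, try a temp file.

    If the temp file cannot be created or written because the file system
    is full (ENOSPC), keep the whole message in memory instead of failing.
    Hypothetical illustration only, not Dovecot's implementation.
    """

    def __init__(self, temp_dir=None, threshold=THRESHOLD, open_temp=None):
        self.threshold = threshold
        self.size = 0              # bytes accepted so far
        self._mem = io.BytesIO()
        self._file = None          # set once the message has spilled to disk
        self._disk_failed = False  # after ENOSPC, do not try the disk again
        # Injectable so the disk-full path can be tried without a full disk.
        self._open_temp = open_temp or (
            lambda: tempfile.TemporaryFile(dir=temp_dir, buffering=0))

    @property
    def on_disk(self):
        return self._file is not None

    def write(self, data):
        if (self._file is None and not self._disk_failed
                and self.size + len(data) > self.threshold):
            self._spill()
        if self._file is not None:
            try:
                self._write_all(self._file, data)
            except OSError as exc:
                if exc.errno != errno.ENOSPC:
                    raise
                self._back_to_memory()
                self._mem.write(data)
        else:
            self._mem.write(data)
        self.size += len(data)

    def getvalue(self):
        if self._file is None:
            return self._mem.getvalue()
        self._file.seek(0)
        data = self._file.readall()
        self._file.seek(0, io.SEEK_END)
        return data

    def _spill(self):
        try:
            self._file = self._open_temp()
            self._write_all(self._file, self._mem.getvalue())
        except OSError as exc:
            if exc.errno != errno.ENOSPC:
                raise
            if self._file is not None:
                self._back_to_memory()
            self._disk_failed = True
            return
        self._mem = io.BytesIO()  # the temp file now holds the data

    def _back_to_memory(self):
        f, self._file = self._file, None
        self._disk_failed = True
        if self._mem.tell() < self.size:
            # The memory copy was dropped after spilling: read back only the
            # bytes accepted before the failed write.
            f.seek(0)
            self._mem = io.BytesIO(f.readall()[:self.size])
            self._mem.seek(0, io.SEEK_END)
        f.close()

    @staticmethod
    def _write_all(f, data):
        view = memoryview(data)
        while len(view):
            written = f.write(view)  # unbuffered writes may be partial
            view = view[written:]
```

With something like this, the 200000 byte message from the second test would end up in memory when the 64k tmpfs fills up, and the client would get a 250 instead of the 451.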

> Also are you sure that writing to the file actually produces disk I/O?

It depends. On a tmpfs file system it is possible, if there is not enough memory and the system must page. A pretty bad condition. Of course this is mostly the same with no temporary files (holding the emails in memory). Well, mostly, because you don't duplicate all e-mails in memory. And if emails come and go at a rate of some hundred Mbps, this can count. Also, a file in tmpfs possibly requires more memory than the same message in an efficient memory structure (a C string for example, which carries only a little metadata compared to tmpfs).

If the tmp directory is not a tmpfs, it depends on whether you commit the written bits (I guess you don't fsync it, why would you :) and whether the file system wants to write them. There are file systems that can't handle blocks belonging to different files independently with fsync. So if you fsync a small file, and you have written 3 GB to the temporary dir (let's assume they are on the same FS), which you will delete in the next second and haven't fsynced, 3 GB plus the small file will be written (to the log).
Of course you can (and will) separate the temporary file system, which alleviates this problem. But even then it is possible that the bits will be written, for example because the file system's "commit time" has come and, as described above, it may write out a lot of stuff.

> Even if /tmp isn't a memory filesystem, I think there's a good chance that the file will be gone before any disk writes have a chance to start. Can you see some measurable disk I/O change by changing this value?

I can't really measure it now, because I don't have a separate disk pool for temporary files (nothing uses /tmp, so it would be useless; all resources are delegated to the main pool) and I use tmpfs. But even if it's just a few IOPS and some wasted CPU cycles, why wouldn't I set that? :)
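
For anyone who wants a first rough number, a sketch like the following compares the round trip through an in-memory buffer with the round trip through a temporary file. It only measures wall-clock time per message in Python. It does not show whether any block actually reached a disk (a tool such as iostat would be needed for that), and it is not how Dovecot handles the data. The function names and the 200000 byte payload (the size used in the failing test above) are choices made for this example.

```python
import io
import tempfile
import time


def via_memory(payload):
    """Write the message into a memory buffer and read it back."""
    buf = io.BytesIO()
    buf.write(payload)
    return buf.getvalue()


def via_temp_file(payload, temp_dir=None):
    """Create a temp file, write the message, read it back, delete the file."""
    with tempfile.TemporaryFile(dir=temp_dir) as f:
        f.write(payload)
        f.seek(0)
        return f.read()  # the file is removed when the block exits


def seconds_per_message(func, payload, rounds=200):
    """Average wall-clock seconds per call of func(payload)."""
    start = time.perf_counter()
    for _ in range(rounds):
        func(payload)
    return (time.perf_counter() - start) / rounds


if __name__ == "__main__":
    payload = b"x" * 200_000
    print("memory:   ", seconds_per_message(via_memory, payload))
    print("temp file:", seconds_per_message(via_temp_file, payload))
```

Passing temp_dir to via_temp_file (for example through functools.partial) lets the same test run against a tmpfs mount and against a disk-backed directory.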
I think it would be nice to have this as a configurable option, so there would be no need to rebuild every time.