                                   Maildir++

In this document:
  • HOWTO.maildirquota
  • Mission statement
  • Definitions and goals
  • Contents of maildirsize
  • Calculating maildirsize
  • Calculating the quota for a Maildir++
  • Delivering to a Maildir++
  • Reading from a Maildir++
  • Bugs

HOWTO.maildirquota

The remaining portion of this document is a technical description of the maildir
quota extension. This section is a brief overview of this extension.

  What is a maildirquota?

If you would like to have a quota on your maildir mailboxes, the best solution
is to always use filesystem-based quotas: per-user usage quotas that are
enforced by the operating system.

This is the best solution when the default Maildir is located in each account's
home directory. This solution will NOT work if Maildirs are stored elsewhere, or
if you have a large virtual domain setup where a single userid is used to hold
many individual Maildirs, one for each virtual user.

This extension to the maildir format allows a "voluntary" maildir quota
implementation that does not rely on filesystem-based quotas.

  When maildirquota will not work.

For this quota mechanism to work, all software that accesses a maildir must
observe this quota protocol. It follows that this quota mechanism can be easily
circumvented if users have direct (shell) access to the filesystem containing
the users' maildirs.

Furthermore, this quota mechanism is not 100% effective. It is possible to have
a situation where someone may go over quota. This quota implementation makes a
deliberate trade-off. Completely bulletproof quota enforcement requires some
form of locking, but maildir mail stores were explicitly designed to avoid any
kind of locking. This quota approach does not use locking, and the trade-off is
that a few extra messages can sometimes be delivered to the maildir before the
door is permanently shut.

For best performance, all maildir clients should support this quota extension,
but there's a wide degree of tolerance here. As long as the mail delivery
agent that puts new messages into a Maildir uses this extension, the quota will
be enforced without excessive degradation.

In the worst case scenario, quotas are automatically recalculated every fifteen
minutes. If a maildir goes over quota, and a mail client that does not support
this quota extension removes enough mail from the maildir, the mail delivery
agent will not be immediately informed that the maildir is now under quota.
However, eventually the correct quota will be recalculated and mail delivery
will resume.

Mail user agents sometimes put messages into the maildir themselves. Messages
added to a maildir by a mail user agent that does not understand the quota
extension will not be immediately counted towards the overall quota, and may not
be counted for an extended period of time. Additionally, if there are a lot of
messages that have been added to a maildir from these mail user agents, quota
recalculation may impose non-trivial load on the system, as the quota
recalculator will have to issue the stat system call for each message.

  How to implement the quota

The best way to do that is to modify your mail server to implement the protocol
defined by this document. Not everyone, of course, has this ability. Therefore,
an alternate approach is available.

This package builds two small utility programs: "maildirmake" and
"deliverquota". maildirmake is an extended version of the Maildir creation
utility, with some additional options, including quota support.

The -q option to maildirmake installs the maildirsize file in an existing
Maildir, which enables quota support:

> maildirmake -q 10000000S ./Maildir

./Maildir is an existing maildir, and this -q option sets a quota of about 10
megabytes.

deliverquota reads the message from standard input, then delivers it to the
maildir specified by the first argument to deliverquota, observing any quota
that's set for the maildir. If the maildir is over quota, deliverquota
terminates with exit code 77. Otherwise, it delivers the message, updates the
quota, and terminates with exit code 0.
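
The exit codes described above are all a caller needs in order to drive
deliverquota from another program. The following is a minimal sketch, not part
of this package: the function name deliver and its command parameter are
hypothetical, and exit codes other than 0 and 77 are treated as errors only
because this document does not describe them.

```python
import subprocess
import sys

EX_OVER_QUOTA = 77  # exit code documented above for an over-quota maildir


def deliver(message: bytes, maildir: str, command=("deliverquota",)) -> bool:
    """Pipe one message to deliverquota; return True if it was delivered.

    `command` is a hypothetical hook so that a full path (or a stand-in
    program, for testing) can be substituted for the default name.
    """
    # The maildir is the first argument; the message goes to standard input.
    result = subprocess.run([*command, maildir], input=message)
    if result.returncode == 0:
        return True   # delivered, quota updated
    if result.returncode == EX_OVER_QUOTA:
        return False  # maildir is over quota, nothing was delivered
    # Any other exit code is not described in this document.
    raise RuntimeError(f"deliverquota exited with {result.returncode}")
```

A caller would typically turn a False result into a bounce or a deferral,
depending on the mail server's policy.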

You will need to configure your mail server to use deliverquota instead of
delivering directly to maildirs. The instructions for doing so depend on which
mail server you use. For example, if you use Qmail and your maildirs are all
located in $HOME/Maildir, replace the './Maildir/' argument to qmail-start with
the following:

> '| /usr/local/bin/deliverquota ./Maildir'

Then, run maildirmake with the -q option to set up quotas on all the maildirs.

That's pretty much it. If you handle a moderate amount of mail, I have one more
suggestion. If possible, use deliverquota to deliver mail for a few weeks
before setting up any quotas. Even if quotas are not used, deliverquota uses
certain optimizations that permit very fast quota recalculation. Messages
delivered by deliverquota have their message size encoded in their filename;
this makes it possible to avoid stat-ing all files in the Maildir, when
recalculating the quota. Then, after most messages in your maildirs have been
delivered by deliverquota, activate the quotas.

  maildirquota-enhanced applications

This is a list of applications that have been enhanced to support the
maildirquota extension:

  • maildrop - mail delivery agent/mail filter.
  • SqWebMail - webmail CGI binary.
  • Courier-IMAP - an IMAP server.
  • Courier - all of the above

  Quotas and deleted messages

The default application configuration that uses this maildirquota library does
not count deleted messages, and any contents of the Trash folder, against the
quota. Messages that are marked as deleted (but not yet actually removed), or
messages that are moved to the Trash folder (which is subject to automatic
purging) do not count towards the set quota.

It is possible to recompile the library to count all messages in the Maildir
against the quota. This is done by using the --with-trashquota option to the
configure script. Note that this option MUST be used to compile EVERY
application that uses this maildirquota library. So, for example, if you have
both maildrop and SqWebMail installed, you must use this option to recompile
both applications.

════════════════════════════════════════════════════════════════════════════════

Mission statement

Maildir++ is a mail storage structure that's based on the Maildir structure,
first used in the Qmail mail server. Actually, Maildir++ is just a minor
extension to the standard Maildir structure.

For more information, see http://www.courier-mta.org/maildir.html. I am not
going to include the definition of a Maildir in this document. Consider it
included right here. This document only describes the differences.

Maildir++ adds a couple of things to a standard Maildir: folders and quotas.

Quotas enforce a maximum allowable size of a Maildir. In many situations, using
the quota mechanism of the underlying filesystem won't work very well. If a
filesystem quota mechanism is used, then when a Maildir goes over quota, Qmail
does not bounce additional mail, but keeps it queued, changing one bad situation
into another bad situation. Not only do you have an account that's backed up,
but now your queue starts to back up too.

Definitions and goals

Maildir++ and Maildir shall be completely interchangeable. A Maildir++ client
will be able to use a standard Maildir, automatically "upgrading" it in the
process. A Maildir client will be able to use a Maildir++ just like a regular
Maildir. Of course, a plain Maildir client won't be able to enforce a quota, and
won't be able to access messages stored in folders.

Folders are created as subdirectories under the main Maildir. The name of the
subdirectory always starts with a period. For example, a folder named
"Important" will be a subdirectory called ".Important". You can't have
subdirectories that start with two periods.

A Maildir++ client ignores anything in the main Maildir that starts with a
period, but is not a subdirectory.

Each subdirectory is a fully-fledged Maildir of its own; that is, you have
.Important/tmp, .Important/new, and .Important/cur. Everything that applies to
the main Maildir applies equally well to the subdirectory, including
automatically cleaning up old files in tmp. A Maildir++ enhancement is that a
message can be moved between folders and/or the main Maildir simply by
moving/renaming the file (into the cur subdirectory of the destination folder).
Therefore, the entire Maildir++ must reside on the same filesystem.

Within each subdirectory there's an empty file, maildirfolder. Its existence
tells the mail delivery agent that this Maildir is really a folder underneath
a parent Maildir++.

Only one special folder is reserved: Trash (subdirectory .Trash). Instead of
marking deleted messages with the D flag, Maildir++ clients move the message
into the Trash folder. Maildir++ readers are responsible for expunging messages
from Trash after a system-defined retention interval.

When a Maildir++ reader sees a message marked with a D flag it may at its
option: remove the message immediately, move it into Trash, or ignore it.

Can folders have subfolders, defined in a recursive fashion? The answer is no.
If you want to have a client with a hierarchy of folders, emulate it. Pick a
hierarchy separator character, say ":". Then, folder foo/bar is subdirectory
.foo:bar.
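
The folder conventions above can be sketched in a few lines. This is only an
illustration under stated assumptions: folder_dirname and create_folder are
hypothetical names, ":" is simply the example separator from the text, and
"/" is assumed to be the separator the client shows to the user.

```python
import os

HIERARCHY_SEPARATOR = ":"  # the example separator from the text


def folder_dirname(folder_path: str) -> str:
    """Map a client-side folder path such as 'foo/bar' to '.foo:bar'."""
    parts = folder_path.split("/")
    if any(part == "" for part in parts):
        raise ValueError("empty folder name component")
    if parts[0].startswith("."):
        # Would produce a subdirectory starting with two periods.
        raise ValueError("folder names may not start with a period")
    return "." + HIERARCHY_SEPARATOR.join(parts)


def create_folder(maildir: str, folder_path: str) -> str:
    """Create a folder as a full Maildir plus the maildirfolder marker."""
    path = os.path.join(maildir, folder_dirname(folder_path))
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(path, sub), exist_ok=True)
    # The empty maildirfolder file marks this Maildir as a folder.
    open(os.path.join(path, "maildirfolder"), "a").close()
    return path
```

Because the hierarchy is flattened into one directory name, every folder stays
a direct child of the main Maildir, which keeps moves between folders a plain
rename.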

That is all there is to say about folders. The rest of this document deals
with quotas.

The purpose of quotas is to temporarily disable a Maildir if it goes over the
quota. There is one, and only one, major goal that this quota implementation
tries to achieve:

  • Place as little overhead as possible on the mail system that's delivering to
    the Maildir++

That's it. To achieve that goal, certain compromises are made:

  • Mail delivery will stop as soon as possible after Maildir++'s size goes over
    quota. Certain race conditions may happen with Maildir++ going a lot over
    quota, in rare circumstances. That is taken into account, and the situation
    will eventually resolve itself, but you should not simply take your
    systemwide quota, multiply it by the number of mail accounts, and allocate
    that much disk space. Always leave room to spare.
  • How well the quota mechanism will work will depend on whether or not
    everything that accesses the Maildir++ is a Maildir++ client. You can have a
    transition period where some of your mail clients are just Maildir clients,
    and things should run more or less well. There will be some additional load
    because the size of the Maildir will be recalculated more often, but the
    additional load shouldn't be noticeable.

This won't be a perfect solution, but it will hopefully be good enough. Maildirs
are simply designed to rely on the filesystem to enforce individual quotas. If a
filesystem-based quota works for you, use it.

A Maildir++ may contain the following additional file: maildirsize.

Contents of maildirsize

maildirsize contains two or more lines terminated by newline characters.

The first line contains a copy of the quota definition as used by the system's
mail server. Each application that uses the maildir must know what its quota
is. Instead of configuring each application with the quota logic, and making
sure that every application's quota definition for the same maildir is exactly
the same, the quota specification used by the system mail server is saved as the
first line of the maildirsize file. All other applications that enforce the
maildir quota simply read the first line of maildirsize.

The quota definition is a list, separated by commas. Each member of the list
consists of an integer followed by a letter, specifying the nature of the quota.
Currently defined quota types are 'S' - total size of all messages, and 'C' -
the maximum count of messages in the maildir. For example, 10000000S,1000C
specifies a quota of 10,000,000 bytes or 1,000 messages, whichever comes first.
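
A quota definition in this form is easy to parse. The sketch below is one
possible reading, with hypothetical function names: it assumes unknown quota
types are skipped, and that "over quota" means strictly greater than the
limit, neither of which is stated in this document.

```python
def parse_quota(definition: str) -> dict:
    """Parse a definition such as '10000000S,1000C' into a dict of limits."""
    limits = {}
    for member in definition.strip().split(","):
        member = member.strip()
        if len(member) < 2:
            continue
        number, kind = member[:-1], member[-1]
        # Only the two quota types defined above are recognized;
        # anything else is skipped (an assumption of this sketch).
        if kind in ("S", "C") and number.isdigit():
            limits[kind] = int(number)
    return limits


def over_quota(limits: dict, total_bytes: int, total_count: int) -> bool:
    """True if either the size or the count limit is exceeded."""
    if "S" in limits and total_bytes > limits["S"]:
        return True
    if "C" in limits and total_count > limits["C"]:
        return True
    return False
```

With the example from the text, parse_quota("10000000S,1000C") yields a size
limit of 10000000 and a count limit of 1000, and either one alone is enough to
put the maildir over quota.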

All remaining lines contain two whitespace-delimited integers. The first
integer is interpreted as a byte count. The second integer is interpreted as a
file count. A Maildir++ writer can add up all byte counts and file counts from
maildirsize and enforce a quota based either on number of messages or the total
size of all the messages.

The current implementation of Maildir++ in Courier inserts whitespace padding on
each line so that each line (including the terminating \n) is 14 bytes in size.
This minimizes the impact of appending-related bugs in some NFS implementations.
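
Reading the totals back is a matter of summing two columns. The following
sketch uses hypothetical names. The document does not say where Courier puts
the padding, so format_record assumes trailing spaces before the newline;
lines that do not contain exactly two fields are skipped, which is also an
assumption.

```python
def read_totals(text: str):
    """Return (quota_definition, total_bytes, total_count) for maildirsize."""
    lines = text.splitlines()
    definition = lines[0].strip() if lines else ""
    total_bytes = total_count = 0
    for line in lines[1:]:
        fields = line.split()  # whitespace-delimited, so padding is harmless
        if len(fields) != 2:
            continue           # skip lines without exactly two fields
        total_bytes += int(fields[0])
        total_count += int(fields[1])
    return definition, total_bytes, total_count


def format_record(size: int, count: int, width: int = 14) -> str:
    """Format one record, padded so the line (with its newline) is `width` bytes.

    Records too long to fit are written unpadded rather than truncated.
    """
    return f"{size} {count}".ljust(width - 1) + "\n"
```

Negative records cancel earlier ones, so a delivery followed by a removal of
the same message leaves the totals unchanged.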

Calculating maildirsize

In most cases, changes to maildirsize are recorded by appending an additional
line. Under some conditions maildirsize has to be recalculated from scratch.
These conditions are defined later. This is the procedure that's used to
recalculate maildirsize:

 1. If we find a maildirfolder within the directory, we're delivering to a
    folder, so back up to the parent directory, and start again.
 2. Read the contents of the new and cur subdirectories. Also, read the contents
    of the new and cur subdirectories in each Maildir++ folder, except Trash.
    Before reading each subdirectory, stat() the subdirectory itself, and keep
    track of the latest timestamp you get.
 3. For each message, if the filename is of the form xxxxx,S=nnnnn or
    xxxxx,S=nnnnn:xxxxx where "xxxxx" represents arbitrary text, then use nnnnn
    as the size of the file (which will be conveniently recorded in the filename
    by a Maildir++ writer, within the conventions of filename naming in a
    Maildir). If the message was not written by a Maildir++ writer, stat() it to
    obtain the message size. If stat() fails, a race condition removed the file,
    so just ignore it and move on to the next one.
 4. When done, you have the grand total of the number of messages and their
    total size. Create a new maildirsize by: creating the file in the tmp
    subdirectory, observing the conventions for writing to a Maildir. Then
    rename the file as maildirsize. Afterwards, stat all new and cur
    subdirectories again. If you find a timestamp later than the saved
    timestamp, either remove maildirsize and proceed, or repeat the
    recalculation.
 5. Before running this calculation procedure, the Maildir++ user wanted to know
    the size of the Maildir++, so return the calculated values. This is done
    even if maildirsize was removed.
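
The five steps above can be sketched as follows. This is an illustration, not
Courier's implementation: the function names are hypothetical, the quota
definition is assumed to be supplied by the caller, tempfile.mkstemp stands in
for the usual Maildir naming convention in tmp, and of the two choices in
step 4 the sketch removes maildirsize rather than repeating the recalculation.

```python
import os
import re
import tempfile

# Matches xxxxx,S=nnnnn and xxxxx,S=nnnnn:xxxxx
SIZE_IN_NAME = re.compile(r",S=(\d+)(?::|$)")


def find_root(path):
    # Step 1: a maildirfolder file means this is a folder; back up a level.
    while os.path.exists(os.path.join(path, "maildirfolder")):
        path = os.path.dirname(os.path.abspath(path))
    return path


def message_dirs(root):
    # Step 2: new and cur of the main Maildir and of every folder but Trash.
    dirs = [os.path.join(root, sub) for sub in ("new", "cur")]
    for entry in os.listdir(root):
        folder = os.path.join(root, entry)
        if entry.startswith(".") and entry != ".Trash" and os.path.isdir(folder):
            dirs += [os.path.join(folder, sub) for sub in ("new", "cur")]
    return dirs


def latest_mtime(dirs):
    stamps = [os.stat(d).st_mtime for d in dirs if os.path.isdir(d)]
    return max(stamps, default=0.0)


def recalculate(maildir, quota_definition):
    """Rebuild maildirsize; return (total_bytes, total_count)."""
    root = find_root(maildir)
    dirs = message_dirs(root)
    saved_stamp = 0.0
    total_bytes = total_count = 0
    for d in dirs:
        try:
            # stat() the subdirectory before reading it, keep the latest stamp.
            saved_stamp = max(saved_stamp, os.stat(d).st_mtime)
            names = os.listdir(d)
        except FileNotFoundError:
            continue
        for name in names:
            match = SIZE_IN_NAME.search(name)  # step 3
            if match:
                size = int(match.group(1))
            else:
                try:
                    size = os.stat(os.path.join(d, name)).st_size
                except FileNotFoundError:
                    continue  # removed by a race; ignore it
            total_bytes += size
            total_count += 1
    # Step 4: write in tmp, then rename into place.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.join(root, "tmp"))
    with os.fdopen(fd, "w") as f:
        f.write(f"{quota_definition}\n{total_bytes} {total_count}\n")
    target = os.path.join(root, "maildirsize")
    os.replace(tmp_path, target)
    if latest_mtime(dirs) > saved_stamp:
        os.remove(target)  # something changed meanwhile; do not trust the file
    # Step 5: the caller gets the totals even if maildirsize was removed.
    return total_bytes, total_count
```

Only messages without a size in their filename cost a stat() call, which is
why delivering through a Maildir++ writer makes recalculation cheap.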

Calculating the quota for a Maildir++

This is the procedure for reading the contents of maildirsize for the purpose of
determining whether the Maildir++ is over quota.

 1. If maildirsize does not exist, or if its size is at least 5120 bytes,
    recalculate it using the procedure defined above, and use the recalculated
    numbers. Otherwise, read the contents of maildirsize, and add up the totals.
 2. The most efficient way of doing this is to: open maildirsize, then start
    reading it into a 5120 byte buffer (some broken NFS implementations may
    return less than 5120 bytes read even before reaching the end of the file).
    If the buffer fills, which in most cases takes a single read, close the
    file and run the recalculation procedure.
 3. In many cases the quota calculation is for the purpose of adding or removing
    messages from a Maildir++, so keep the file descriptor to maildirsize open.
    A file descriptor will not be available if quota recalculation ended up
    removing maildirsize due to a race condition, so the caller may or may not
    get a file descriptor together with the Maildir++ size.
 4. If the numbers we got indicate that the Maildir++ is over quota, some
    additional logic is in order. If the totals came from reading maildirsize
    rather than from a recalculation, and maildirsize is more than one line
    long or its timestamp shows that it is at least 15 minutes old, throw out
    the totals and recalculate maildirsize from scratch.
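
The two decisions in steps 1 and 4 reduce to a pair of small predicates. The
names below are hypothetical. Since maildirsize always starts with the quota
definition line, this sketch reads "more than one line long" as "more than one
size/count line after the definition"; that reading is an assumption.

```python
MAX_MAILDIRSIZE = 5120   # bytes; at or above this, recalculate (step 1)
STALE_AFTER = 15 * 60    # seconds; fifteen minutes (step 4)


def must_recalculate_first(exists: bool, size_bytes: int) -> bool:
    """Step 1: recalculate if maildirsize is missing or has grown too big."""
    return (not exists) or size_bytes >= MAX_MAILDIRSIZE


def must_recalculate_when_over(recalculated: bool, over_quota: bool,
                               record_lines: int, age_seconds: float) -> bool:
    """Step 4: distrust an over-quota verdict that came from a stale file."""
    if recalculated or not over_quota:
        return False
    return record_lines > 1 or age_seconds >= STALE_AFTER
```

The second predicate is what bounds the damage from clients that do not
maintain maildirsize: an over-quota verdict is never trusted for longer than
fifteen minutes.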

Eventually the 5120 byte limitation will always cause maildirsize to be
recalculated, which will compensate for any race conditions which previously
threw off the totals. Each time a message is delivered or removed from a
Maildir++, one line is added to maildirsize (this is described below in greater
detail). Most messages are less than 10K long, so each line appended to
maildirsize will be between seven and nine bytes long (four bytes for the
message size, space, digit 1, newline, optional minus sign in front of both
counts if the message was removed). This results in about 640 Maildir++
operations before a recalculation is forced. Since most messages are added once
and removed once from a Maildir, expect recalculation to happen approximately
every 320 messages, keeping the overhead of a recalculation to a minimum. Even
if most messages include large attachments, most attachments are less than 100K
long, which brings down the average recalculation frequency to about 150
messages.
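
The estimate for messages under 10K can be reproduced with a few lines of
arithmetic. The sketch assumes unpadded lines and takes the midpoint of the
seven-to-nine byte range as the typical line length, which appears to be how
the figure of 640 is reached.

```python
LIMIT = 5120                       # size at which recalculation is forced

shortest = len("9999 1\n")         # delivery of a four-digit-size message
longest = len("-9999 -1\n")        # removal of the same message
typical = (shortest + longest) // 2

operations = LIMIT // typical      # appended lines before the limit is hit
messages = operations // 2         # each message is added once, removed once

print(shortest, longest, typical, operations, messages)
```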

Also, the effect of having non-Maildir++ clients accessing the Maildir++ is
reduced by forcing a recalculation when we're potentially over quota. Even if
non-Maildir++ clients are used to remove messages from the Maildir, the fact
that the Maildir++ is still over quota will be verified every 15 minutes.

Delivering to a Maildir++

Delivering to a Maildir++ is like delivering to a Maildir, with the following
exceptions:

 1. Follow the usual Maildir conventions for naming the filename used to store
    the message, but append ,S=nnnnn to the name of the file, where
    nnnnn is the size of the file. This eliminates the need to stat() most
    messages when calculating the quota. If the size of the message is not known
    at the beginning, append ,S=nnnnn when renaming the message from tmp to new.
 2. As soon as the size of the message is known (hopefully before it is written
    into tmp), calculate Maildir++'s quota, using the procedure defined
    previously. If the Maildir++ is over quota, back out, cleaning up anything
    that was created in tmp.
 3. If a file descriptor to maildirsize was opened for us, after moving the file
    from tmp to new append a line to the file containing the message size, and
    "1".

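The delivery steps above can be sketched as follows. The names are
hypothetical, the maildir argument is assumed to be the main Maildir (not a
folder), and the over_quota callback is a stand-in for the quota calculation
described earlier. Instead of reusing a file descriptor handed over by that
calculation, the sketch simply reopens maildirsize if it exists.

```python
import itertools
import os
import socket
import time

_counter = itertools.count()  # keeps names unique within one process


def deliver(maildir, message: bytes, over_quota=lambda size: False):
    """Deliver one message; return its final path, or None if over quota."""
    size = len(message)
    # Step 2: the size is known up front, so check before writing to tmp.
    if over_quota(size):
        return None
    host = socket.gethostname().replace("/", "\\057").replace(":", "\\072")
    unique = f"{int(time.time())}.{os.getpid()}_{next(_counter)}.{host}"
    tmp_path = os.path.join(maildir, "tmp", unique)
    with open(tmp_path, "xb") as f:
        f.write(message)
        f.flush()
        os.fsync(f.fileno())
    # Step 1: record the size in the filename so readers need not stat() it.
    final_path = os.path.join(maildir, "new", f"{unique},S={size}")
    os.rename(tmp_path, final_path)
    # Step 3: append the message size and "1" to maildirsize, if present.
    size_file = os.path.join(maildir, "maildirsize")
    if os.path.exists(size_file):
        with open(size_file, "a") as f:
            f.write(f"{size} 1\n")
    return final_path
```

Appending after the rename means a crash between the two steps leaves the
message uncounted, which the next recalculation corrects.
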
Reading from a Maildir++

Maildir++ readers should mind the following additional tasks:

 1. Make sure to create the maildirfolder file in any new folders created within
    the Maildir++.
 2. When moving a message to the Trash folder, append a line to maildirsize,
    containing a negative message size and a '-1'.
 3. When moving a message from the Trash folder, follow the steps described in
    "Delivering to Maildir++", as far as quota logic goes. That is, refuse to
    move messages out of Trash if the Maildir++ is over quota.
 4. Moving a message between other folders carries no additional requirements.
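
Moving a message to Trash, as in item 2 above, can be sketched like this. The
function name is hypothetical, the Trash folder is assumed to exist already,
and the size is taken from stat() for brevity; a fuller version would prefer
the ,S=nnnnn value in the filename when one is present.

```python
import os


def move_to_trash(maildir: str, message_path: str) -> str:
    """Move a message into .Trash/cur and record its removal in maildirsize."""
    size = os.stat(message_path).st_size
    dest = os.path.join(maildir, ".Trash", "cur",
                        os.path.basename(message_path))
    # A plain rename is enough: the whole Maildir++ is on one filesystem.
    os.rename(message_path, dest)
    size_file = os.path.join(maildir, "maildirsize")
    if os.path.exists(size_file):
        # Negative size and -1: the message no longer counts toward the quota.
        with open(size_file, "a") as f:
            f.write(f"-{size} -1\n")
    return dest
```

Moving a message back out of Trash is the reverse operation, but it has to
pass the same quota check as a new delivery.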

References

   . http://www.courier-mta.org/maildrop/
   . http://www.courier-mta.org/sqwebmail/
   . http://www.courier-mta.org/imap/
   . http://www.courier-mta.org/
   . http://www.courier-mta.org/maildir.html
