The following text is from Patrik Rak and should be read together
with the
postconf(5) manual that describes each configuration
parameter in detail.
From the user's point of view,
oqmgr(8) and
qmgr(8) behave the same,
except for how the next message is chosen when a delivery agent
becomes available. You already know that
oqmgr(8) uses round-robin selection by destination
while
qmgr(8) uses a simple FIFO order, except for some preemptive magic.
The
postconf(5) manual documents all the knobs the user
can use to control this preemptive magic - there is nothing more
to the preemption than the quite simple conditions described there.
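To make the difference concrete, here is a minimal, self-contained
C sketch, not actual Postfix code: the dest_queue structure and both
functions are invented for illustration, and the qmgr(8) preemption
logic is left out entirely. It contrasts round-robin selection by
destination with plain FIFO order over the same three destination
queues:

    #include <stdio.h>
    #include <string.h>

    #define NDEST 3

    struct dest_queue {
        const char *dest;           /* destination domain */
        int         pending;        /* queue entries still to deliver */
    };

    static const struct dest_queue initial[NDEST] = {
        { "example.com", 2 }, { "example.net", 1 }, { "example.org", 3 },
    };

    /* oqmgr-style: rotate among the destinations that still have work. */
    static int round_robin_next(const struct dest_queue *q, int last)
    {
        int i, d;

        for (i = 1; i <= NDEST; i++) {
            d = (last + i) % NDEST;
            if (q[d].pending > 0)
                return (d);
        }
        return (-1);                /* nothing left to deliver */
    }

    /* qmgr-style FIFO (preemption ignored): drain in arrival order. */
    static int fifo_next(const struct dest_queue *q)
    {
        int d;

        for (d = 0; d < NDEST; d++)
            if (q[d].pending > 0)
                return (d);
        return (-1);
    }

    int main(void)
    {
        struct dest_queue q[NDEST];
        int d, last;

        memcpy(q, initial, sizeof(q));
        for (last = -1; (d = round_robin_next(q, last)) >= 0; last = d) {
            printf("round-robin: %s\n", q[d].dest);
            q[d].pending--;
        }

        memcpy(q, initial, sizeof(q));
        while ((d = fifo_next(q)) >= 0) {
            printf("fifo: %s\n", q[d].dest);
            q[d].pending--;
        }
        return (0);
    }

With the sample counts above, the round-robin pass interleaves
deliveries across the three destinations, while the FIFO pass drains
them strictly in order.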
As for programmer-level documentation, this will have to be
extracted from all those emails I have exchanged with Wietse [rats!
I hoped that Patrik would do the work for me -- Wietse]. But I think
there are no missing bits that we have not mentioned in our
conversations.
However, even from the programmer's point of view, there is nothing
more to add to the message scheduling idea itself. There are a few
things that make it look more complicated than it is, but the
algorithm is the same as the user perceives it. The differences
between the programmer's view and the user's view can be summarized
as follows:
-
Simplification of terms for users: The user knows
about messages and recipients. The program itself works with
jobs (one message is split into several jobs, one per
transport needed to deliver the message) and queue entries
(each entry may group several recipients for the same destination).
Then there is the peer structure introduced by
qmgr(8), which is
simply a per-job analog of the queue structure (see the first
sketch after this list).
-
Dealing with concurrency limits: The actual implementation
is complicated by the fact that the messages (or rather, jobs) may
not be delivered in exactly the scheduled order because of the
concurrency limits. It is necessary to skip some "blocker" jobs
when the concurrency limit is reached and to get back to them
again when the limit permits (see the second sketch after this list).
-
Dealing with resource limits: The actual implementation is
complicated by the fact that not all recipients can be held in core
at once. Therefore each message has some recipients in-core while
others may remain on-file. This means that a) the preemptive
algorithm needs to work with recipient count estimates instead of
exact counts, b) there is extra code that manipulates the
per-transport pool of recipients that may be read in-core at the
same time, and c) there is extra code that reads recipients into
core in batches and is triggered at the appropriate moments (see
the third sketch after this list).
-
Doing things efficiently: All important things I am
aware of are done in the minimum time possible (either directly
or at least in terms of amortized complexity), but choosing the
best candidate job for preempting the current job requires a
linear search of up to all of the transport's jobs (the worst
theoretical case - the reality is much better). As this is done
every time the next queue entry to be delivered is about to be
chosen, it seemed reasonable to add a cache which minimizes the
overhead. Maintenance of this candidate cache slightly obfuscates
things (see the fourth sketch after this list).
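Here are the sketches referred to above. Each is a self-contained,
hypothetical C illustration, not actual Postfix code. First, the
terms: the declarations below show one plausible way the structures
could relate to each other. All type and field names are invented,
and the real qmgr structures carry many more fields.

    struct recipient;               /* an address from the queue file */
    struct transport;               /* a delivery agent type, e.g. "smtp" */

    struct entry {                  /* one delivery request */
        struct recipient *rcpt_list;/* recipients grouped for one destination */
        struct entry     *next;
    };

    struct peer {                   /* per-job analog of the queue structure */
        const char   *destination;
        struct entry *entry_list;   /* this job's entries for the destination */
        struct peer  *next;
    };

    struct job {                    /* the per-transport slice of one message */
        struct transport *transport;
        struct message   *message;  /* the message this job belongs to */
        struct peer      *peer_list;/* one peer per destination in this job */
        struct job       *next;
    };

    struct message {                /* one queue file */
        struct job *job_list;       /* one job per transport needed */
    };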
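Second, the blocker-job skipping. The real code tracks blocking per
destination rather than with a single flag; this only shows the shape
of the idea, with the job structure stripped down to the one field
needed here.

    #include <stddef.h>

    struct job {
        int         blocked;        /* destinations at their concurrency limit */
        struct job *next;           /* jobs in scheduled order */
    };

    /*
     * Take the first job in scheduled order whose deliveries can proceed.
     * Blocked jobs stay on the list; when a delivery slot frees up, the
     * caller clears their blocked flag and they become eligible again.
     */
    struct job *next_runnable_job(struct job *job_list)
    {
        struct job *job;

        for (job = job_list; job != NULL; job = job->next)
            if (!job->blocked)
                return (job);
        return (NULL);              /* everything waits on concurrency limits */
    }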
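Third, the recipient bookkeeping of points a) through c). For
simplicity this assumes the number of unread recipients is known
exactly, which the real code can only estimate.

    struct transport {
        int rcpt_pool;              /* in-core recipient slots still available */
    };

    struct message {
        int rcpt_count;             /* recipients read into core so far */
        int rcpt_unread;            /* recipients still in the queue file */
    };

    /* a) preemption decisions work with an estimate, not an exact count */
    static int rcpt_estimate(const struct message *msg)
    {
        return (msg->rcpt_count + msg->rcpt_unread);
    }

    /*
     * b) + c) move up to "batch" recipients into core, charging the
     * per-transport pool; the pool is credited back as entries are
     * delivered, and the caller would parse the queue file here.
     */
    static int read_rcpt_batch(struct transport *tp, struct message *msg,
                               int batch)
    {
        if (batch > msg->rcpt_unread)
            batch = msg->rcpt_unread;
        if (batch > tp->rcpt_pool)
            batch = tp->rcpt_pool;
        tp->rcpt_pool -= batch;
        msg->rcpt_unread -= batch;
        msg->rcpt_count += batch;
        return (batch);
    }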
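Finally, the candidate cache. The ranking rule is simplified here to
"fewest recipients wins", which is not the actual qmgr criterion,
and the invalidation rules of the real cache are more subtle.

    #include <stddef.h>

    struct job {
        int         rcpt_estimate;  /* used to rank preemption candidates */
        struct job *next;
    };

    struct transport {
        struct job *job_list;       /* all jobs scheduled on this transport */
        struct job *candidate;      /* cached best candidate, NULL = invalid */
    };

    /* Linear scan: the worst case visits every job on the transport. */
    static struct job *find_candidate(struct transport *tp)
    {
        struct job *job, *best = NULL;

        for (job = tp->job_list; job != NULL; job = job->next)
            if (best == NULL || job->rcpt_estimate < best->rcpt_estimate)
                best = job;
        return (best);
    }

    /* Reuse the previous answer until the caller invalidates it. */
    struct job *best_candidate(struct transport *tp)
    {
        if (tp->candidate == NULL)
            tp->candidate = find_candidate(tp);
        return (tp->candidate);
    }

    /* Must be called whenever the job list or the estimates change. */
    void candidate_cache_reset(struct transport *tp)
    {
        tp->candidate = NULL;
    }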
The second and third points are the ones that made the implementation
(look) complicated and that were the real coding work, but I believe
that understanding the scheduling algorithm itself (which was the
real thinking work) is fairly easy.