Postfix Performance Tuning - Tuning the number of simultaneous deliveries

Postfix Documentation

Tuning the number of simultaneous deliveries

Although Postfix can be configured to run 1000 SMTP client processes at the same time, it is rarely desirable that it makes 1000 simultaneous connections to the same remote system. For this reason, Postfix has safety mechanisms in place to avoid this so-called "thundering herd" problem.

The Postfix queue manager implements the analog of the TCP slow start flow control strategy: when delivering to a site, send a small number of messages first, then increase the concurrency as long as all goes well; reduce concurrency in the face of congestion.

The initial_destination_concurrency parameter (default: 5) controls how many messages are initially sent to the same destination before adapting delivery concurrency. Of course, this setting is effective only as long as it does not exceed the process limit and the destination concurrency limit for the specific mail transport channel.
The default_destination_concurrency_limit parameter (default: 20) controls how many messages may be sent to the same destination simultaneously. You can override this setting for specific message delivery transports by taking the name of the master.cf entry and appending "_destination_concurrency_limit".

Examples of transport specific concurrency limits are:

The local_destination_concurrency_limit parameter (default: 2) controls how many messages are delivered simultaneously to the same local recipient. The recommended limit is low because delivery to the same mailbox must happen sequentially, so massive parallelism is not useful. Another good reason to limit delivery concurrency to the same recipient: if the recipient has an expensive shell command in her .forward file, or if the recipient is a mailing list manager, you don't want to run too many instances of those processes at the same time.
The default smtp_destination_concurrency_limit of 20 seems enough to noticeably load a system without bringing it to its knees. Be careful when changing this to a much larger number.
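
As an illustration, the parameters discussed so far could appear in main.cf as follows. This is only a sketch that restates the default values quoted above; the file location (commonly /etc/postfix/main.cf) is an assumption, and none of these lines needs to be present unless you want to change a default.

```
# main.cf (illustrative fragment; the values are the defaults quoted in the text)

# Number of messages sent to a destination before concurrency is adapted.
initial_destination_concurrency = 5

# Upper bound for simultaneous deliveries to the same destination.
default_destination_concurrency_limit = 20

# Per-transport overrides: master.cf entry name + "_destination_concurrency_limit".
local_destination_concurrency_limit = 2
smtp_destination_concurrency_limit = $default_destination_concurrency_limit
```

The value currently in effect for any of these parameters can be checked with the postconf command, for example "postconf local_destination_concurrency_limit".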

The above default values of the concurrency limits work well in a broad range of situations. Knee-jerk changes to these parameters in the face of congestion can actually make problems worse. Specifically, large destination concurrencies should never be the default. They should be used only for transports that deliver mail to a small number of high volume domains.

A common situation where high concurrency is called for is on gateways relaying a high volume of mail between the Internet and an intranet mail environment. Approximately half the mail (assuming equal volumes inbound and outbound) will be destined for the internal mail hubs. Since the internal mail hubs will be receiving all external mail exclusively from the gateway, it is reasonable to configure the gateway to make greater demands on the capacity of the internal SMTP servers.

The tuning of the inbound concurrency limits need not be trial and error. A high volume capable mailhub should be able to easily handle 50 or 100 (rather than the default 20) simultaneous connections, especially if the gateway forwards to multiple MX hosts. When all MX hosts are up and accepting connections in a timely fashion, throughput will be high. If any MX host is down and completely unresponsive, the average connection latency rises to at least 1/N * $smtp_connect_timeout, if there are N MX hosts. This limits throughput to at most the destination concurrency * N / $smtp_connect_timeout.

For example, with a destination concurrency of 100 and 2 MX hosts, each host will handle up to 50 simultaneous connections. If one MX host is down and the default SMTP connection timeout is 30s, the throughput limit is 100 * 2 / 30 ~= 6.7 messages per second. With a 5s timeout, the same formula gives a limit of 100 * 2 / 5 = 40 messages per second. This suggests that high volume destinations with good connectivity and multiple MX hosts need a lower connection timeout; values as low as 5s or even 1s can be used to prevent congestion when one or more, but not all, MX hosts are down.

If necessary, set a higher transport_destination_concurrency_limit, where "transport" is the name of the master.cf entry (in main.cf since this is a queue manager parameter), and a lower smtp_connect_timeout (with a "-o" override in master.cf since this parameter has no per-transport name) for the relay transport and any transports dedicated to specific high volume destinations.
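
A minimal sketch of such a configuration for the relay transport is shown below. The values 50 and 5s are illustrative assumptions, not recommendations, and the master.cf service line is assumed to match a stock "relay" entry; only the "-o" line is added. The SMTP client connect timeout parameter is named smtp_connect_timeout.

```
# main.cf: queue manager parameter, named after the master.cf entry "relay".
# The value 50 is illustrative only.
relay_destination_concurrency_limit = 50

# master.cf: override the SMTP client connect timeout for this service only.
# The value 5s is illustrative only.
relay     unix  -       -       n       -       -       smtp
        -o smtp_connect_timeout=5s
```

Because the timeout is set with "-o" on the relay service, other SMTP client services defined in master.cf keep the main.cf value of smtp_connect_timeout. After editing, run "postfix reload" so that the changes take effect.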
