Now let's run some benchmarks and compare.
Example 13-2. Benchmark/Handler.pm
package Benchmark::Handler;
use strict;
use Apache::Constants qw(:common);

sub handler {
    my $r = shift;    # the Apache request object
    $r->send_http_header('text/plain');
    $r->print("Hello");
    return OK;
}
1;
We will add these settings to httpd.conf:
PerlModule Benchmark::Handler
<Location /benchmark_handler>
    SetHandler perl-script
    PerlHandler Benchmark::Handler
</Location>
The first directive preloads and compiles the
Benchmark::Handler module. The remaining lines
tell Apache to execute the subroutine
Benchmark::Handler::handler when a request with
the relative URI /benchmark_handler is made.
We will use the usual configuration for
Apache::Registry scripts, where all the URIs
starting with /perl are mapped to the files
residing under the /home/httpd/perl directory:
Alias /perl /home/httpd/perl
<Location /perl>
    SetHandler perl-script
    PerlHandler +Apache::Registry
    Options ExecCGI
    PerlSendHeader On
</Location>
We will use Apache::RegistryLoader to preload and
compile the script at server startup as well, so the benchmark is
fair and only processing time is measured. To accomplish the
preloading we add the following code to the
startup.pl file:
use Apache::RegistryLoader ();
Apache::RegistryLoader->new->handler(
    "/perl/benchmarks/registry.pl",
    "/home/httpd/perl/benchmarks/registry.pl");
To create the heavy benchmark set, let's leave the
preceding code examples unmodified but add a time-consuming
operation. In a real application this might be an I/O operation or a
database query; here we use a CPU-intensive calculation:
my $x = 100;
my $y;
$y = log($x ** 100) for (0..10000);
This code does lots of mathematical processing and is therefore very
CPU-intensive.
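To get a rough feel for how much CPU time this calculation takes on its own, outside the server, it can be timed with the core Benchmark module. The following is only a sketch: the heavy_calc name and the count of 100 runs are assumptions made for illustration, and the figures it prints depend entirely on the machine it runs on.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# The heavy operation from the text, wrapped in a sub so it can be timed.
# heavy_calc is a hypothetical name used only in this sketch.
sub heavy_calc {
    my $x = 100;
    my $y;
    $y = log($x ** 100) for (0..10000);
    return $y;
}

# Run the calculation 100 times (an arbitrary count chosen for this sketch)
# and print the elapsed and CPU times reported by Benchmark.
my $t = timeit(100, \&heavy_calc);
print "100 runs: ", timestr($t), "\n";
```

A standalone timing like this measures the calculation in a single, otherwise idle process, so it is not directly comparable to the per-request times measured under a loaded server.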
------------------------------------
name            | avtime (ms)   rps
------------------------------------
light handler   |      15       911
light registry  |      21       680
------------------------------------
heavy handler   |     183        81
heavy registry  |     191        77
------------------------------------
First let's compare the results from the light set.
We can see that the average overhead added by
Apache::Registry (compared to the custom handler)
is about:
21 - 15 = 6 milliseconds
per request.
The difference in speed is about 40% (15 ms versus 21 ms). Note that
this doesn't mean that the difference in real-world
applications would be so big. The results of the heavy set confirm
this.
In the heavy set the average processing time is almost the same for
Apache::Registry and the custom handler. The absolute difference
between the two is close to the one in the light set's results: it
has grown only from 6 ms to 8 ms (191 ms - 183 ms). The identical
heavy code that has been added increased the average processing time
by about 168 ms (183 ms - 15 ms). This doesn't mean that the added
code itself used 168 ms of CPU time; it means that it took 168 ms for
this code to be completed in a multiprocess environment where each
process gets a time slice to use the CPU. The more processes that are
running, the longer each process has to wait for its next time slice.
We have answered the second question as well (whether the overhead of
Apache::Registry is significant when used for
heavy code). You can see that when the code is not just the
hello script, the overhead added by
Apache::Registry is almost insignificant.
It's not zero, though. Depending on your
requirements, this 5-10 ms overhead may be tolerable. If
that's the case, you may choose to use
Apache::Registry.
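The arithmetic behind this comparison can be reproduced with a few lines of Perl. This is a minimal sketch: the figures are copied from the fast-machine table above, and the %avtime hash and the overhead subroutine are names introduced here for illustration only.

```perl
use strict;
use warnings;

# Average processing times in milliseconds, copied from the
# fast-machine results table.
my %avtime = (
    light => { handler => 15,  registry => 21  },
    heavy => { handler => 183, registry => 191 },
);

# Return the absolute overhead of Apache::Registry in ms and the
# same overhead as a percentage of the handler's time.
sub overhead {
    my ($set) = @_;
    my $abs = $set->{registry} - $set->{handler};
    my $rel = 100 * $abs / $set->{handler};
    return ($abs, $rel);
}

for my $name (qw(light heavy)) {
    my ($abs, $rel) = overhead($avtime{$name});
    printf "%-5s set: %d ms overhead, %.1f%% of handler time\n",
        $name, $abs, $rel;
}
```

The absolute overhead stays within a couple of milliseconds (6 ms versus 8 ms), while the relative overhead falls from 40% in the light set to a little over 4% in the heavy set.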
An interesting observation is that when the server being tested runs
on a very slow machine the results are completely different:
------------------------------------
name            | avtime (ms)   rps
------------------------------------
light handler   |      50       196
light registry  |     160        61
------------------------------------
heavy handler   |     149        67
heavy registry  |     822        12
------------------------------------
First of all, the 6-ms difference in average processing time we saw
on the fast machine when running the light set has now grown to 110
ms. This means that the few extra operations that
Apache::Registry performs turn out to be very
expensive on a slow machine.
Secondly, you can see that when the heavy set is used, the time
difference is no longer close to that found in the light set, as we
saw on the fast machine. We expected that the added code would take
about the same time to execute in the handler and the script.
Instead, we see a difference of 673 ms (822 ms - 149 ms).
The explanation lies in the fact that the difference between the
machines isn't merely in the CPU speed.
It's possible that there are many other things that
are different—for example, the size of the processor cache. If
one machine has a processor cache large enough to hold the whole
handler and the other doesn't, this can be very
significant, given that in our heavy benchmark set, 99.9% of the CPU
activity was dedicated to running the calculation code.