Of course, there are drawbacks. Luckily, these are not functional
drawbacks; they are administrative hassles. The
disadvantages are:
-
You have another daemon to worry about, and while proxies are
generally stable, you have to make sure to prepare proper startup and
shutdown scripts, which are run at boot and reboot as appropriate.
This is something that you do once and never come back to again.
Also, you might want to set up the crontab to
run a watchdog script that will make sure that the proxy server is
running and restart it if it detects a problem, reporting the problem
to the administrator on the way. Chapter 5
explains how to develop and run such watchdogs.
-
Proxy servers can be configured to be light or heavy. The
administrator must decide what gives the highest performance for the
application. A proxy server such as Squid is light in the sense of
having only one process serving all requests, but it can consume a
lot of memory when it loads objects into memory for faster service.
-
If you use the default logging mechanism for all requests on the
front- and backend servers, the requests that will be proxied to the
backend server will be logged twice, which makes it tricky to merge
the two log files, should you want to. Therefore, if all accesses to
the backend server are done via the frontend server,
it's best to turn off logging on the backend
server.
If the backend server is also accessed directly, bypassing the
frontend server, you want to log only the requests that
don't go through the frontend server. One way to
tell whether a request was proxied or not is to use
mod_proxy_add_forward, presented later in this chapter, which sets
the HTTP header X-Forwarded-For for all proxied
requests. So if the default logging is turned off, you can add a
custom PerlLogHandler that logs only requests made
directly to the backend server.
If you still decide to log proxied requests at the backend server,
they might not contain all the information you need, since instead of
the real remote IP of the user, you will always get the IP of the
frontend server. Again, mod_proxy_add_forward, presented later,
provides a solution to this problem.
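The logging decision described in the last item can be sketched in a few lines. This is only an illustration of the logic in Python, not a PerlLogHandler; the function names and the trusted_frontends parameter are hypothetical, and it assumes that the frontend appends the address of the connecting client to X-Forwarded-For, as proxies commonly do.

```python
def is_proxied(headers):
    """Return True if the request carries an X-Forwarded-For header."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    return any(name.lower() == "x-forwarded-for" for name in headers)


def should_log_on_backend(headers):
    """Log on the backend only the requests that bypassed the frontend."""
    return not is_proxied(headers)


def client_ip(headers, peer_ip, trusted_frontends=("127.0.0.1",)):
    """Pick the address to log for a request.

    peer_ip is the address of the machine that connected to the backend.
    trusted_frontends is a hypothetical list of frontend addresses whose
    X-Forwarded-For header we are willing to believe.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-for")
    if forwarded and peer_ip in trusted_frontends:
        # Assumption: the header may hold a comma-separated chain and the
        # last entry is the one appended by our own frontend.
        return forwarded.split(",")[-1].strip()
    # Direct request, or a header sent by an untrusted peer.
    return peer_ip
```

The header is trusted only when the connection comes from a known frontend, because a client that reaches the backend directly can send any X-Forwarded-For value it likes.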
First let's explain an abbreviation used in the
networking world. If someone claims to have a 56-kbps connection, it
means that the connection is made at 56 kilobits per second (~56,000
bits/sec). It's not 56 kilobytes per second, but 7
kilobytes per second, because 1 byte equals 8 bits. So
don't let the merchants fool you: your modem
gives you at most 7 kilobytes per second, not 56
kilobytes per second, as one might think.
Another convention used in computer literature is that 10 Kb usually
means 10 kilobits and 10 KB means 10 kilobytes. An uppercase B
generally refers to bytes, and a lowercase b refers to bits (K of
course means kilo and equals 1,024 or 1,000, depending on the field
in which it's used). Remember that this
convention is not followed everywhere, so use this knowledge with
care.
In the typical scenario (as of this writing), users connect to your
site with 56-kbps modems. This means that the speed of the
user's network link is 56/8 = 7 KB per second.
Let's assume that an average generated HTML page is
42 KB and that an average mod_perl script generates this response in 0.5
seconds. How many responses could this script produce during the time
it took for the output to be delivered to the user? A simple
calculation reveals pretty scary numbers:
(42 KB) / (0.5 s × 7 KB/s) = 12
Twelve other dynamic requests could be served at the same time, if we
could let mod_perl do only what it's best at:
generating responses.
This very simple example shows us that we need only one-twelfth the
number of children running, which means that we will need only
one-twelfth of the memory.
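The arithmetic above can be checked with a short sketch. It is written in Python purely for illustration; the function names are hypothetical, and the figures (56 kbps, 42 KB, 0.5 seconds) are the ones assumed in the text.

```python
def link_speed_kb_per_sec(kbps):
    """Convert kilobits per second to kilobytes per second (8 bits per byte)."""
    return kbps / 8


def delivery_time_sec(page_kb, speed_kb_per_sec):
    """Time needed to push a page of page_kb kilobytes down the user's link."""
    return page_kb / speed_kb_per_sec


def responses_per_delivery(page_kb, generation_sec, kbps):
    """How many responses could be generated while one is being delivered."""
    delivery = delivery_time_sec(page_kb, link_speed_kb_per_sec(kbps))
    return delivery / generation_sec


speed = link_speed_kb_per_sec(56)              # 7.0 KB/s
delivery = delivery_time_sec(42, speed)        # 6.0 seconds
ratio = responses_per_delivery(42, 0.5, 56)    # 12.0

print(f"link speed: {speed} KB/s")
print(f"delivery time: {delivery} s")
print(f"responses generated per delivery: {ratio}")
```

With these inputs the delivery takes 6 seconds while generation takes 0.5 seconds, which is where the factor of 12 comes from.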
But you know that nowadays scripts often return pages that are blown
up with JavaScript and other code, which can easily make them 100 KB
in size. Can you calculate what the download time for a file that
size would be?
Furthermore, many users like to open multiple browser windows and do
several things at once (e.g., download files and browse graphically
heavy sites). So the effective speed may in reality be 5-10 times
lower than the 7 KB/sec we assumed before. This is not good for your
server, because each slow transfer ties up a mod_perl process for that
much longer.
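For the 100 KB page mentioned above, a rough sketch of the download time follows, again in Python for illustration only. The download_time_sec function and its slowdown parameter are hypothetical; the slowdown factors of 5 and 10 come from the estimate in the text.

```python
def download_time_sec(page_kb, speed_kb_per_sec, slowdown=1):
    """Seconds needed to deliver page_kb kilobytes over a link whose
    nominal speed is divided by the given slowdown factor."""
    return page_kb * slowdown / speed_kb_per_sec


# Nominal 7 KB/s link, then the same link shared 5 and 10 ways.
for slowdown in (1, 5, 10):
    seconds = download_time_sec(100, 7, slowdown)
    print(f"slowdown x{slowdown}: {seconds:.1f} seconds")
```

At the nominal 7 KB/sec the page takes a little over 14 seconds to deliver, and at one-tenth of that speed it takes well over two minutes, during which an unproxied mod_perl process would be tied up.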
Considering the last example and taking into account all the other
advantages that the proxy server provides, we hope that you are
convinced that despite a small administration overhead, using a proxy
is a good thing.
Of course, if you are on a very fast local area network (LAN), meaning
that all your users connect from this network and not
from the outside, the big benefit of the proxy buffering the output
and feeding a slow client is gone. You are probably better off
sticking with a straight mod_perl server in this case.