The backup router's job is to monitor the active router and
assume its role in the event of failure.
Figure 7-1 shows
a simple LVS cluster consisting of two layers. On the first layer
are two LVS routers — one active and one backup. Each of the
LVS routers has two network interfaces, one interface on the
Internet and one on the private network, enabling them to regulate
traffic between the two networks. For this example, the active
router uses Network Address Translation or NAT to direct traffic
from the Internet to a variable number of real servers on the
second layer, which in turn provide the necessary services.
Therefore, the real servers in this example are connected to a
dedicated private network segment and pass all public traffic back
and forth through the active LVS router. To the outside world, the
server cluster appears as one entity.
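As a minimal sketch, assuming a typical Linux router with eth0 on the Internet and eth1 on the private network (the interface names are illustrative, not from the text), the active router must have packet forwarding enabled before it can pass traffic between the two networks:

    # Enable IPv4 forwarding so the router can pass packets between
    # the public interface (eth0) and the private segment (eth1)
    sysctl -w net.ipv4.ip_forward=1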
Service requests arriving at the LVS cluster are addressed to a
virtual IP address or VIP. This is a
publicly-routable address the administrator of the site associates
with a fully-qualified domain name, such as www.example.com, and
which is assigned to one or more virtual
servers. Note that a VIP address migrates from
one LVS router to the other during a failover, thus maintaining a
presence at that IP address; for this reason, VIPs are also known as
floating IP addresses.
VIP addresses may be aliased to the same device which connects
the LVS router to the Internet. For instance, if eth0 is connected
to the Internet, then multiple virtual servers can be aliased to
eth0:1. Alternatively, each virtual
server can be associated with a separate device per service. For
example, HTTP traffic can be handled on eth0:1, and FTP traffic can be handled on
eth0:2.
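As a rough illustration, assuming the iproute2 tools and the hypothetical VIPs 192.0.2.10 and 192.0.2.11, the aliases described above might be created like this:

    # Alias the HTTP virtual server's VIP to eth0:1
    ip addr add 192.0.2.10/24 dev eth0 label eth0:1

    # Alias the FTP virtual server's VIP to eth0:2
    ip addr add 192.0.2.11/24 dev eth0 label eth0:2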
Only one LVS router is active at a time. The role of the active
router is to redirect service requests from virtual IP addresses to
the real servers. The redirection is based on one of eight
supported load-balancing algorithms described further in Section 7.3 LVS Scheduling
Overview.
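To make the idea concrete, here is a hedged sketch using the ipvsadm utility; the VIP, the real-server addresses, and the choice of the weighted least-connections (wlc) scheduler are all assumptions for illustration:

    # Define a virtual HTTP service on the VIP, scheduled with
    # weighted least-connections (wlc), one of the supported algorithms
    ipvsadm -A -t 192.0.2.10:80 -s wlc

    # Register two real servers behind the VIP, forwarding via NAT
    # (masquerading), each with equal weight
    ipvsadm -a -t 192.0.2.10:80 -r 10.0.0.2:80 -m -w 1
    ipvsadm -a -t 192.0.2.10:80 -r 10.0.0.3:80 -m -w 1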
The active router also dynamically monitors the overall health
of the specific services on the real servers through simple
send/expect scripts. To aid in detecting
the health of services that require dynamic data, such as HTTPS or
SSL, the administrator can also call external executables. If a
service on a real server malfunctions, the active router stops
sending jobs to that server until it returns to normal
operation.
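The mechanics of a send/expect test can be pictured with a small shell sketch; the address, port, request, and expected string are hypothetical, and a real deployment would configure equivalent checks in the monitoring daemon rather than run them by hand:

    # Send a simple request to a real server and look for an expected
    # string in the reply; a failed check means the router should stop
    # sending jobs to this server until it recovers
    if printf 'GET / HTTP/1.0\r\n\r\n' | nc -w 5 10.0.0.2 80 | grep -q '200 OK'
    then
        echo "real server 10.0.0.2 passed the health check"
    else
        echo "real server 10.0.0.2 failed the health check"
    fi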
The backup router performs the role of a standby system.
Periodically, the LVS routers exchange heartbeat messages through
the primary external public interface and, in a failover situation,
the private interface. Should the backup node fail to receive a
heartbeat message within an expected interval, it initiates a
failover and assumes the role of the active router. During
failover, the backup router takes over the VIP addresses serviced
by the failed router using a technique known as ARP spoofing — where the backup LVS router
announces itself as the destination for IP packets addressed to the
failed node. When the failed node returns to active service, the
backup node assumes its hot-backup role again.
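The takeover announcement can be approximated with the arping utility from iputils; the VIP and interface below are illustrative, and in practice the failover software performs this step automatically:

    # Bring the floating VIP up on the backup router, then broadcast
    # unsolicited (gratuitous) ARP replies so neighbors and switches
    # redirect traffic for the VIP to this node
    ip addr add 192.0.2.10/24 dev eth0 label eth0:1
    arping -U -I eth0 -c 3 192.0.2.10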
The simple, two-layered configuration used in Figure 7-1 is best for
clusters serving data which does not change very frequently —
such as static webpages — because the individual real servers
do not automatically synchronize data with one another.
Since there is no built-in component in LVS clustering to share
the same data between the real servers, the administrator has two
basic options: synchronize the data across the pool of real servers,
or add a third layer to the topology for shared data access.
The first option is preferred for servers that do not allow
large numbers of users to upload or change data on the real
servers. If the cluster allows large numbers of users to modify
data, such as an e-commerce website, adding a third layer is
preferable.
There are many ways an administrator can choose to synchronize
data across the pool of real servers. For instance, shell scripts
can be employed so that if a Web engineer updates a page, the page
is posted to all of the servers simultaneously. Also, the cluster
administrator can use programs such as rsync to replicate changed data across all nodes at
a set interval.
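A hedged sketch of such an rsync job follows; the hostnames and paths are placeholders:

    # Replicate the document root to every real server; -a preserves
    # permissions and timestamps, --delete keeps the copies exact
    for host in rs1.example.com rs2.example.com; do
        rsync -az --delete /var/www/html/ "${host}:/var/www/html/"
    done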
However, this type of data synchronization does not function
optimally if the cluster is overloaded with users constantly
uploading files or issuing database transactions. For a cluster
with a high load, a three-tiered topology
is the ideal solution.