IPQoS Architecture and the Diffserv Model
This section describes the IPQoS architecture and how IPQoS implements the differentiated services
(Diffserv) model that is defined in RFC 2475, An Architecture for Differentiated Services. The following elements of the Diffserv model
are included in IPQoS:
Classifier
Meter
Marker
In addition, IPQoS includes the flow-accounting module and the dlcosmk marker for
use with virtual local area network (VLAN) devices.
Classifier Module
In the Diffserv model, the classifier is responsible for organizing selected traffic flows
into groups on which to apply different service levels. The classifiers that are
defined in RFC 2475 were originally designed for boundary routers. In contrast, the
IPQoS classifier ipgpc is designed to handle traffic flows on hosts that are
internal to the local network. Therefore, a network with both IPQoS systems and
a Diffserv router can provide a greater degree of differentiated services. For a
technical description of ipgpc, refer to the ipgpc(7ipp) man page.
The ipgpc classifier does the following:
Selects traffic flows that meet the criteria specified in the IPQoS configuration file on the IPQoS-enabled system
The QoS policy defines various criteria that must be present in packet headers. These criteria are called selectors. The ipgpc classifier compares these selectors against the headers of packets that are received by the IPQoS system. ipgpc then selects all matching packets.
Separates the packet flows into classes, that is, groups of network traffic with the same characteristics, as defined in the IPQoS configuration file
Examines the value in the packet's differentiated services (DS) field for the presence of a differentiated services codepoint (DSCP)
The presence of the DSCP indicates whether the incoming traffic has been marked by the sender with a forwarding behavior.
Determines what further action is specified in the IPQoS configuration file for packets of a particular class
Passes the packets to the next IPQoS module specified in the IPQoS configuration file, or returns the packets to the network stream
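The selection step described above can be sketched in Python. This is an illustration only, not the ipgpc implementation: the dictionary layout, the filter names webout and ftpout, and the fallback class name default are hypothetical, and the sketch assumes that a packet matches a filter when every selector in the filter equals the corresponding packet value.

```python
# Illustrative sketch only; all names here are hypothetical, not ipgpc internals.

def matches(filter_selectors, packet):
    """Return True when every selector in the filter equals the packet's value."""
    return all(packet.get(name) == value
               for name, value in filter_selectors.items())


def classify(filters, packet, default_class="default"):
    """Return the class of the first matching filter, or a fallback class."""
    for flt in filters:
        if matches(flt["selectors"], packet):
            return flt["class"]
    return default_class


filters = [
    {"name": "webout", "selectors": {"sport": 80, "direction": "LOCAL_OUT"}, "class": "web"},
    {"name": "ftpout", "selectors": {"sport": 21}, "class": "ftp"},
]
packet = {"saddr": "10.10.8.1", "sport": 80, "direction": "LOCAL_OUT"}
print(classify(filters, packet))  # web
```

A filter with fewer selectors needs fewer comparisons per packet and matches more traffic, so each filter should carry only the selectors that its class requires.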
For an overview of the classifier, refer to Classifier (ipgpc) Overview. For information on invoking
the classifier in the IPQoS configuration file, refer to IPQoS Configuration File.
IPQoS Selectors
The ipgpc classifier supports a variety of selectors that you can use in
the filter clause of the IPQoS configuration file. When you define a filter,
always use the minimum number of selectors that are needed to successfully retrieve
traffic of a particular class. The number of filters you define can impact
IPQoS performance.
The next table lists the selectors that are available for ipgpc.
Table 37-1 Filter Selectors for the IPQoS Classifier
Selector | Argument | Information Selected
saddr | IP address number. | Source address.
daddr | IP address number. | Destination address.
sport | Either a port number or service name, as defined in /etc/services. | Source port from which a traffic class originated.
dport | Either a port number or service name, as defined in /etc/services. | Destination port to which a traffic class is bound.
protocol | Either a protocol number or protocol name, as defined in /etc/protocols. | Protocol to be used by this traffic class.
dsfield | DS codepoint (DSCP) with a value of 0–63. | DSCP, which defines any forwarding behavior to be applied to the packet. If this parameter is specified, the dsfield_mask parameter must also be specified.
dsfield_mask | Bit mask with a value of 0–255. | Used in tandem with the dsfield selector. dsfield_mask is applied to the dsfield selector to determine which of its bits to match against.
if_name | Interface name. | Interface to be used for either incoming or outgoing traffic of a particular class.
if_groupname | Interface group name. | Interface group to be used for either incoming or outgoing traffic of a particular class.
user | Number of the UNIX user ID or user name to be selected. If no user ID or user name is on the packet, the default –1 is used. | User ID that is supplied to an application.
projid | Number of the project ID to be selected. | Project ID that is supplied to an application.
priority | Priority number. Lowest priority is 0. | Priority that is given to packets of this class. Priority is used to order the importance of filters for the same class.
direction | Argument can be one of the following: | Direction of packet flow on the IPQoS machine.
 | LOCAL_IN | Input traffic local to the IPQoS system.
 | LOCAL_OUT | Output traffic local to the IPQoS system.
 | FWD_IN | Input traffic to be forwarded.
 | FWD_OUT | Output traffic to be forwarded.
precedence | Precedence value. Highest precedence is 0. | Precedence is used to order filters with the same priority.
ip_version | V4 or V6 | Addressing scheme that is used by the packets, either IPv4 or IPv6.
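The priority and precedence selectors run in opposite directions: 0 is the lowest priority but the highest precedence. The following Python sketch shows one way to read that ordering. It is an assumption drawn from the table, not a description of how ipgpc resolves overlapping filters, and the filter names are hypothetical.

```python
# Illustrative sketch only; the ordering rule is inferred from the selector table.

def filter_order(filters):
    """Order filters from most to least important.

    Higher priority numbers come first (0 is the lowest priority). Filters
    with equal priority are ordered by precedence, lower numbers first
    (0 is the highest precedence).
    """
    return sorted(filters, key=lambda f: (-f["priority"], f["precedence"]))


candidates = [
    {"name": "f1", "priority": 0, "precedence": 0},
    {"name": "f2", "priority": 5, "precedence": 3},
    {"name": "f3", "priority": 5, "precedence": 1},
]
ordered = [f["name"] for f in filter_order(candidates)]
print(ordered)  # ['f3', 'f2', 'f1']
```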
Meter Module
The meter tracks the transmission rate of flows on a per-packet basis.
The meter then determines whether the packet conforms to the configured parameters. The
meter module determines the next action for a packet from a set
of actions that depend on packet size, configured parameters, and flow rate.
The meter consists of two metering modules, tokenmt and tswtclmt, which you
configure in the IPQoS configuration file. You can configure either module or both
modules for a class.
When you configure a metering module, you can define two parameters for rate:
committed-rate – Defines the acceptable transmission rate in bits per second for packets of a particular class
peak-rate – Defines the maximum transmission rate in bits per second that is allowable for packets of a particular class
A metering action on a packet can result in one of three
outcomes:
green – The packet causes the flow to remain within its committed rate.
yellow – The packet causes the flow to exceed its committed rate but not its peak rate.
red – The packet causes the flow to exceed its peak rate.
You can configure each outcome with different actions in the IPQoS configuration file.
Committed rate and peak rate are explained in the next section.
tokenmt Metering Module
The tokenmt module uses token buckets to measure the transmission rate of a flow.
You can configure tokenmt to operate as a single-rate or two-rate meter. A
tokenmt action instance maintains two token buckets that determine whether the traffic flow conforms
to configured parameters.
The tokenmt(7ipp) man page explains how IPQoS implements the token meter paradigm. You
can find more general information about token buckets in Kalevi Kilkki's Differentiated Services for the Internet and
on a number of web sites.
Configuration parameters for tokenmt are as follows:
committed_rate – Specifies the committed rate of the flow in bits per second.
committed_burst – Specifies the committed burst size in bits. The committed_burst parameter defines how many outgoing packets of a particular class can pass onto the network at the committed rate.
peak_rate – Specifies the peak rate in bits per second.
peak_burst – Specifies the peak or excess burst size in bits. The peak_burst parameter grants to a traffic class a peak-burst size that exceeds the committed rate.
color_aware – Turns on color-awareness mode for tokenmt.
color_map – Defines an integer array that maps DSCP values to green, yellow, or red.
Configuring tokenmt as a Single-Rate Meter
To configure tokenmt as a single-rate meter, do not specify a peak_rate parameter
for tokenmt in the IPQoS configuration file. To configure a single-rate tokenmt instance to
have a red, green, or yellow outcome, you must specify the
peak_burst parameter. If you do not use the peak_burst parameter, you can configure
tokenmt to have only a red outcome or green outcome. For an example
of a single-rate tokenmt with two outcomes, see Example 34-3.
When tokenmt operates as a single-rate meter, the peak_burst parameter is actually
the excess burst size. committed_rate, and either committed_burst or peak_burst, must be nonzero
positive integers.
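The following Python sketch illustrates single-rate metering with two token buckets, in the style of the single-rate three-color marker of RFC 2697. It is a simplified, color-blind illustration under stated assumptions, not the tokenmt implementation: the class name, the refill strategy, and the choice to start with full buckets are all assumptions. Rates are in bits per second and burst sizes are in bits, as in the tokenmt parameters.

```python
# Illustrative sketch only; not the tokenmt implementation.

class SingleRateMeter:
    def __init__(self, committed_rate, committed_burst, peak_burst=0):
        self.rate = committed_rate      # bits per second
        self.cbs = committed_burst      # committed burst size, in bits
        self.ebs = peak_burst           # excess burst size, in bits
        self.tc = committed_burst       # committed bucket starts full (assumption)
        self.te = peak_burst            # excess bucket starts full (assumption)
        self.last = 0.0                 # time of the previous packet, in seconds

    def _refill(self, now):
        """Add tokens for the elapsed time; overflow spills into the excess bucket."""
        tokens = (now - self.last) * self.rate
        self.last = now
        to_committed = min(tokens, self.cbs - self.tc)
        self.tc += to_committed
        self.te = min(self.ebs, self.te + (tokens - to_committed))

    def meter(self, size_bits, now):
        self._refill(now)
        if self.tc >= size_bits:
            self.tc -= size_bits
            return "green"
        if self.te >= size_bits:
            self.te -= size_bits
            return "yellow"
        return "red"


with_excess = SingleRateMeter(committed_rate=1000, committed_burst=1000, peak_burst=500)
print([with_excess.meter(size, 0.0) for size in (800, 400, 400)])
```

With peak_burst left at zero, the excess bucket never holds tokens, so a packet that does not fit in the committed bucket is red. This mirrors the two-outcome behavior described above.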
Configuring tokenmt as a Two-Rate Meter
To configure tokenmt as a two-rate meter, specify a peak_rate parameter for the
tokenmt action in the IPQoS configuration file. A two-rate tokenmt always has the three
outcomes, red, yellow, and green. The committed_rate, committed_burst, and peak_burst parameters must be
nonzero positive integers.
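A two-rate meter can be sketched in the same way. The following Python example follows the color-blind logic of the two-rate three-color marker in RFC 2698: one bucket fills at the committed rate, the other at the peak rate. It is an illustration under those assumptions, not the tokenmt implementation. The class name is hypothetical, and the traffic sizes are chosen only to make the arithmetic easy to follow.

```python
# Illustrative sketch only; not the tokenmt implementation.

class TwoRateMeter:
    def __init__(self, committed_rate, committed_burst, peak_rate, peak_burst):
        self.cir = committed_rate   # bits per second
        self.cbs = committed_burst  # bits
        self.pir = peak_rate        # bits per second
        self.pbs = peak_burst       # bits
        self.tc = committed_burst   # both buckets start full (assumption)
        self.tp = peak_burst
        self.last = 0.0             # time of the previous packet, in seconds

    def meter(self, size_bits, now):
        elapsed = now - self.last
        self.last = now
        # Each bucket refills at its own rate, up to its burst size.
        self.tc = min(self.cbs, self.tc + elapsed * self.cir)
        self.tp = min(self.pbs, self.tp + elapsed * self.pir)
        if self.tp < size_bits:
            return "red"            # exceeds the peak rate
        if self.tc < size_bits:
            self.tp -= size_bits
            return "yellow"         # exceeds the committed rate only
        self.tc -= size_bits
        self.tp -= size_bits
        return "green"


meter = TwoRateMeter(committed_rate=4000000, committed_burst=4000000,
                     peak_rate=8000000, peak_burst=8000000)
print([meter.meter(3000000, 0.0) for _ in range(3)])
```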
Configuring tokenmt to Be Color Aware
To configure a two-rate tokenmt to be color aware, you must add parameters
that explicitly enable color awareness. The following example action statement
configures tokenmt to be color aware.
Example 37-1 Color-Aware tokenmt Action for the IPQoS Configuration File
action {
    module tokenmt
    name meter1
    params {
        committed_rate 4000000
        peak_rate 8000000
        committed_burst 4000000
        peak_burst 8000000
        global_stats true
        red_action_name continue
        yellow_action_name continue
        green_action_name continue
        color_aware true
        color_map {0-20,22:GREEN;21,23-42:RED;43-63:YELLOW}
    }
}
You turn on color awareness by setting the color_aware parameter to true. As
a color-aware meter, tokenmt assumes that the packet has already been marked as
red, yellow, or green by a previous tokenmt action. Color-aware tokenmt evaluates a
packet by using the DSCP in the packet header in addition to
the parameters for a two-rate meter.
The color_map parameter contains an array into which the DSCP in the packet
header is mapped. Consider the following color_map array:
color_map {0-20,22:GREEN;21,23-42:RED;43-63:YELLOW}
Packets with a DSCP of 0–20 and 22 are mapped to green.
Packets with a DSCP of 21 and 23–42 are mapped to red. Packets
with a DSCP of 43–63 are mapped to yellow. tokenmt maintains a default color
map. However, you can change the default as needed by using the
color_map parameter.
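To make the mapping concrete, the following Python sketch expands a color_map string into a 64-entry array that is indexed by DSCP. The parser is a hypothetical helper written for this illustration. It assumes the syntax that appears in the example above and is not part of IPQoS.

```python
# Illustrative sketch only; parse_color_map is a hypothetical helper.

def parse_color_map(spec):
    """Expand '{0-20,22:GREEN;21,23-42:RED;43-63:YELLOW}' into a 64-entry list."""
    colors = [None] * 64
    for entry in spec.strip("{} ").split(";"):
        ranges, color = entry.split(":")
        for part in ranges.split(","):
            low, _, high = part.partition("-")
            # A single value such as "22" is treated as the range 22-22.
            for dscp in range(int(low), int(high or low) + 1):
                colors[dscp] = color.strip().lower()
    return colors


color_map = parse_color_map("{0-20,22:GREEN;21,23-42:RED;43-63:YELLOW}")
print(color_map[22], color_map[21], color_map[50])  # green red yellow
```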
In the color_action_name parameters (red_action_name, yellow_action_name, and green_action_name), you can specify continue to complete processing of the packet.
Or, you can add an argument to send the packet to a
marker action, for example, yellow_action_name mark22.
tswtclmt Metering Module
The tswtclmt metering module estimates average bandwidth for a traffic class by using
a time-based rate estimator. tswtclmt always operates as a three-outcome meter. The rate
estimator provides an estimate of the flow's arrival rate. This rate should approximate the
running average bandwidth of the traffic stream over a specific period of time,
its time window. The rate estimation algorithm is taken from RFC 2859, A Time Sliding Window Three Colour Marker.
You use the following parameters to configure tswtclmt:
committed_rate – Specifies the committed rate in bits per second
peak_rate – Specifies the peak rate in bits per second
window – Defines the time window, in milliseconds, over which history of average bandwidth is kept
For technical details on tswtclmt, refer to the tswtclmt(7ipp) man page. For general information
on rate shapers that are similar to tswtclmt, see RFC 2963, A Rate Adaptive Shaper for Differentiated Services.
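The rate estimator in RFC 2859 keeps a decaying average that is updated on each packet arrival. The following Python sketch follows the estimator pseudocode in that RFC, adapted to bits and seconds. It is an illustration of the algorithm, not the tswtclmt source: the class name is hypothetical, and the sketch stops at the estimate without modeling how the three outcomes are assigned.

```python
# Illustrative sketch of a time sliding window rate estimator; names are hypothetical.

class RateEstimator:
    def __init__(self, window_ms, initial_rate=0.0):
        self.window = window_ms / 1000.0  # window length, in seconds
        self.avg_rate = initial_rate      # estimated rate, in bits per second
        self.t_front = 0.0                # arrival time of the previous packet

    def update(self, size_bits, now):
        """Fold one packet into the running average and return the new estimate."""
        bits_in_window = self.avg_rate * self.window
        new_bits = bits_in_window + size_bits
        self.avg_rate = new_bits / (now - self.t_front + self.window)
        self.t_front = now
        return self.avg_rate


estimator = RateEstimator(window_ms=1000)
print(estimator.update(8000, 1.0))  # 4000.0
print(estimator.update(8000, 2.0))  # 6000.0
```

A longer window smooths the estimate across bursts, and a shorter window tracks recent traffic more closely.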
Marker Module
IPQoS includes two marker modules, dscpmk and dlcosmk. This section contains information for using
both markers. Normally, you should use dscpmk because dlcosmk is only available for IPQoS
systems with VLAN devices.
For technical information about dscpmk, refer to the dscpmk(7ipp) man page. For technical
information about dlcosmk, refer to the dlcosmk(7ipp) man page.
Using the dscpmk Marker for Forwarding Packets
The marker receives traffic flows after the flows are processed by the classifier
or by the metering modules. The marker marks the traffic with a forwarding
behavior. This forwarding behavior is the action to be taken on the flows
after the flows leave the IPQoS system. Forwarding behavior to be taken on
a traffic class is defined in the per-hop behavior (PHB). The PHB assigns a
priority to a traffic class, which indicates the precedence of flows of that class
in relation to other traffic classes. PHBs only govern forwarding behaviors on the
IPQoS system's contiguous network. For more information on PHBs, refer to Per-Hop Behaviors.
Packet forwarding is the process of sending traffic of a particular class to its
next destination on a network. For a host such as an IPQoS system,
a packet is forwarded from the host to the local network stream. For
a Diffserv router, a packet is forwarded from the local network to the
router's next hop.
The marker marks the DS field in the packet header with a
well-known forwarding behavior that is defined in the IPQoS configuration file. Thereafter, the IPQoS
system and subsequent Diffserv-aware systems forward the traffic as indicated in the DS
field until the mark changes. To assign a PHB, the IPQoS system marks
a value in the DS field of the packet header. This value is
called the differentiated services codepoint (DSCP). The Diffserv architecture defines two types of
forwarding behaviors, EF and AF, which use different DSCPs. For overview information about
DSCPs, refer to DS Codepoint.
The IPQoS system reads the DSCP for the traffic flow and evaluates the
flow's precedence in relation to other outgoing traffic flows. The IPQoS system then
prioritizes all concurrent traffic flows and releases each flow onto the network
by its priority.
The Diffserv router receives the outgoing traffic flows and reads the DS field
in the packet headers. The DSCP enables the router to prioritize and schedule
the concurrent traffic flows. The router forwards each flow by the priority
that is indicated by the PHB. Note that the PHB cannot apply beyond
the boundary router of the network unless Diffserv-aware systems on subsequent hops also
recognize the same PHB.
Expedited Forwarding (EF) PHB
Expedited forwarding (EF) guarantees that packets with the recommended EF codepoint 46 (101110) receive
the best treatment that is available on release to the network. Expedited forwarding
is often compared to a leased line. Packets with the 46 (101110) codepoint
are guaranteed preferential treatment by all Diffserv routers en route to the packets'
destination. For technical information about EF, refer to RFC 2598, An Expedited Forwarding PHB.
Assured Forwarding (AF) PHB
Assured forwarding (AF) provides four different classes of forwarding behaviors that you can specify
to the marker. The next table shows the classes, the three drop precedences
that are provided with each class, and the recommended DSCPs that are associated
with each precedence. Each DSCP is represented by its AF value, its value
in decimal, and its value in binary.
Table 37-2 Assured Forwarding Codepoints
Drop Precedence | Class 1 | Class 2 | Class 3 | Class 4
Low-Drop Precedence | AF11 = 10 (001010) | AF21 = 18 (010010) | AF31 = 26 (011010) | AF41 = 34 (100010)
Medium-Drop Precedence | AF12 = 12 (001100) | AF22 = 20 (010100) | AF32 = 28 (011100) | AF42 = 36 (100100)
High-Drop Precedence | AF13 = 14 (001110) | AF23 = 22 (010110) | AF33 = 30 (011110) | AF43 = 38 (100110)
Any Diffserv-aware system can use the AF codepoint as a guide for
providing differentiated forwarding behaviors to different classes of traffic.
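The values in Table 37-2 follow a regular pattern: the three high-order bits of the codepoint carry the class and the next two bits carry the drop precedence, so the decimal value of AFxy is 8x + 2y. The following Python sketch reproduces the table from that rule. The function name is hypothetical.

```python
# Illustrative sketch only; af_dscp is a hypothetical helper.

def af_dscp(af_class, drop_precedence):
    """Return the recommended DSCP for AF<class><drop precedence>."""
    if af_class not in range(1, 5) or drop_precedence not in range(1, 4):
        raise ValueError("AF class must be 1-4 and drop precedence 1-3")
    return 8 * af_class + 2 * drop_precedence


for drop in (1, 2, 3):
    row = [f"AF{cls}{drop} = {af_dscp(cls, drop)} ({af_dscp(cls, drop):06b})"
           for cls in (1, 2, 3, 4)]
    print(" | ".join(row))
```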
When these packets reach a Diffserv router, the router evaluates the packets' codepoints
along with DSCPs of other traffic in the queue. The router then forwards
or drops packets, depending on the available bandwidth and the priorities that are
assigned by the packets' DSCPs. Note that packets that are marked with the
EF PHB are guaranteed bandwidth over packets that are marked with the various
AF PHBs.
Coordinate packet marking between any IPQoS systems on your network and the Diffserv
router to ensure that packets are forwarded as expected. For example, suppose IPQoS
systems on your network mark packets with AF21 (010010), AF13 (001110), AF43 (100110),
and EF (101110) codepoints. You then need to add the AF21, AF13, AF43,
and EF DSCPs to the appropriate file on the Diffserv router.
For a technical explanation of the AF codepoint table, refer to RFC 2597.
Router manufacturers Cisco Systems and Juniper Networks have detailed information about setting the
AF PHB on their web sites. You can use this information to define
AF PHBs for IPQoS systems as well as routers. Additionally, router manufacturers' documentation
contains instructions for setting DS codepoints on their equipment.
Supplying a DSCP to the Marker
The DSCP is 6 bits in length. The DS field is 1
byte long. When you define a DSCP, the marker writes the DS codepoint into the
6 most-significant bits of the DS field in the packet header. The remaining 2
least-significant bits are unused.
To define a DSCP, you use the following parameter within a marker action
statement:
dscp_map{0-63:DS_codepoint}
The dscp_map parameter is a 64-element array, which you populate with the DSCP
value. dscp_map is used to map incoming DSCPs to outgoing DSCPs that are
applied by the dscpmk marker.
You must specify the DSCP value to dscp_map in decimal notation. For example,
you must translate the EF codepoint of 101110 into the decimal value 46,
which results in dscp_map{0-63:46}. For AF codepoints, you must translate the various codepoints
that are shown in Table 37-2 to decimal notation for use with dscp_map.
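The following Python sketch shows how a 64-element map of this kind can translate an incoming DSCP to an outgoing DSCP and place the result in the DS field. It is an illustration, not the dscpmk implementation. The helper names are hypothetical, and two behaviors are assumptions made for the sketch: entries outside the given range keep their incoming DSCP, and the two low-order bits of the DS field are left unchanged.

```python
# Illustrative sketch only; helper names and default behaviors are assumptions.

def build_dscp_map(low, high, codepoint):
    """Return a 64-entry list in which incoming DSCPs low..high map to codepoint."""
    dscp_map = list(range(64))  # assumption: other entries keep their DSCP
    for dscp in range(low, high + 1):
        dscp_map[dscp] = codepoint
    return dscp_map


def mark_ds_field(ds_byte, dscp_map):
    """Rewrite the upper 6 bits of the DS byte and keep the lower 2 bits."""
    incoming = ds_byte >> 2
    return (dscp_map[incoming] << 2) | (ds_byte & 0b11)


ef_map = build_dscp_map(0, 63, 46)                 # corresponds to dscp_map{0-63:46}
print(format(mark_ds_field(0x00, ef_map), "08b"))  # 10111000
```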
Using the dlcosmk Marker With VLAN Devices
The dlcosmk marker module marks a forwarding behavior in the MAC header of
a datagram. You can use dlcosmk only on an IPQoS system with a
VLAN interface.
dlcosmk adds four bytes, which are known as the VLAN tag, to the MAC
header. The VLAN tag includes a 3-bit user-priority value, which is defined by
the IEEE 802.1D standard. Diffserv-aware switches that understand VLAN can read the user-priority
field in a datagram. The 802.1D user-priority values implement the class-of-service (CoS) marks,
which are well known and understood by commercial switches.
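The layout of the four-byte tag can be sketched in Python. In the IEEE 802.1Q tag format, a 16-bit tag protocol identifier of 0x8100 is followed by 16 bits of tag control information: the 3-bit user priority, a 1-bit flag, and a 12-bit VLAN ID. The function name and the VLAN ID of 10 are hypothetical, and the sketch shows only the byte layout, not how dlcosmk builds the tag.

```python
# Illustrative sketch only; vlan_tag is a hypothetical helper.
import struct

TPID_8021Q = 0x8100  # tag protocol identifier for an IEEE 802.1Q tag


def vlan_tag(user_priority, vlan_id, flag=0):
    """Pack the 4-byte VLAN tag: TPID, then priority (3 bits), flag (1), VLAN ID (12)."""
    if not 0 <= user_priority <= 7:
        raise ValueError("user priority must be 0-7")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (user_priority << 13) | ((flag & 1) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


tag = vlan_tag(user_priority=4, vlan_id=10)
print(tag.hex())  # 8100800a
```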
You can use the user-priority values in the dlcosmk marker action by defining
the class of service marks that are listed in the next table.
Table 37-3 802.1D User-Priority Values
Class of Service | Definition
0 | Best effort
1 | Background
2 | Spare
3 | Excellent effort
4 | Controlled load
5 | Video less than 100ms latency
6 | Voice less than 10ms latency
7 | Network control
For more information on dlcosmk, refer to the dlcosmk(7ipp) man page.
IPQoS Configuration for Systems With VLAN Devices
This section introduces a simple network scenario that shows how to implement IPQoS
on systems with VLAN devices. The scenario includes two IPQoS systems, machine1 and
machine2, that are connected by a switch. The VLAN device on machine1 has the
IP address 10.10.8.1. The VLAN device on machine2 has the IP address
10.10.8.3.
The following IPQoS configuration file for machine1 shows a simple solution for marking
traffic through the switch to machine2.
Example 37-2 IPQoS Configuration File for a System With a VLAN Device
fmt_version 1.0
action {
    module ipgpc
    name ipgpc.classify
    filter {
        name myfilter2
        daddr 10.10.8.3
        class myclass
    }
    class {
        name myclass
        next_action mark4
    }
}
action {
    name mark4
    module dlcosmk
    params {
        cos 4
        next_action continue
        global_stats true
    }
}
In this configuration, all traffic from machine1 that is destined for the VLAN
device on machine2 is passed to the dlcosmk marker. The mark4 marker action instructs
dlcosmk to add a VLAN mark to datagrams of class myclass with a
CoS of 4. The user-priority value of 4 indicates that the switch between
the two machines should give controlled load forwarding to myclass traffic flows from
machine1.
flowacct Module
The IPQoS flowacct module records information about traffic flows, a process that is
referred to as flow accounting. Flow accounting produces data that can be used for
billing customers or for evaluating the amount of traffic to a particular class.
Flow accounting is optional. flowacct is typically the final module that metered or
marked traffic flows might encounter before release onto the network stream. For an
illustration of flowacct's position in the Diffserv model, see Figure 32-1. For detailed technical
information about flowacct, refer to the flowacct(7ipp) man page.
To enable flow accounting, you need to use the Solaris exacct accounting facility
and the acctadm command, as well as flowacct. For the overall steps
in setting up flow accounting, refer to Setting Up Flow Accounting (Task Map).
flowacct Parameters
The flowacct module gathers information about flows in a flow table that is composed
of flow records. Each entry in the table contains one flow record. You cannot
display a flow table.
In the IPQoS configuration file, you define the following flowacct parameters to
measure flow records and to write the records to the flow table:
timer – Defines an interval, in milliseconds, when timed-out flows are removed from the flow table and written to the file that is created by acctadm
timeout – Defines an interval, in milliseconds, which specifies how long a packet flow must be inactive before the flow times out
Note - You can configure timer and timeout to have different values.
max_limit – Places an upper limit on the number of flow records that can be stored in the flow table
For an example of how flowacct parameters are used in the IPQoS configuration
file, refer to How to Configure Flow Control in the IPQoS Configuration File.
Flow Table
The flowacct module maintains a flow table that records all packet flows that
are seen by a flowacct instance. A flow is identified by the following
parameters, which make up the flowacct 8-tuple:
Source address
Destination address
Source port
Destination port
DSCP
User ID
Project ID
Protocol Number
If all the parameters of the 8-tuple for a flow remain the
same, the flow table contains only one entry. The max_limit parameter determines the number
of entries that a flow table can contain.
The flow table is scanned at the interval that is specified in
the IPQoS configuration file for the timer parameter. The default is 15 seconds. A
flow “times out” when its packets are not seen by the IPQoS system
for at least the timeout interval in the IPQoS configuration file. The default
timeout interval is 60 seconds. Entries that have timed out are then
written to the accounting file that is created with the acctadm command.
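The interaction of timer, timeout, and max_limit can be sketched in Python. This is an illustration of the behavior described above, not the flowacct implementation. The class and method names are hypothetical, the max_limit value of 2048 is an arbitrary placeholder, and the choice to leave a new flow unrecorded when the table is full is an assumption.

```python
# Illustrative sketch only; not the flowacct implementation.

class FlowTable:
    def __init__(self, timeout_ms=60000, max_limit=2048):
        self.timeout = timeout_ms / 1000.0  # seconds of inactivity before timeout
        self.max_limit = max_limit          # hypothetical value, not a documented default
        self.flows = {}                     # 8-tuple -> flow record

    def record(self, key, size_bytes, now):
        """Account one packet to the flow identified by the 8-tuple key."""
        entry = self.flows.get(key)
        if entry is None:
            if len(self.flows) >= self.max_limit:
                return False                # assumption: a full table records no new flow
            entry = self.flows[key] = {"packets": 0, "bytes": 0, "created": now}
        entry["packets"] += 1
        entry["bytes"] += size_bytes
        entry["last_seen"] = now
        return True

    def scan(self, now):
        """Run once per timer interval: remove and return the timed-out flows."""
        expired = {key: rec for key, rec in self.flows.items()
                   if now - rec["last_seen"] >= self.timeout}
        for key in expired:
            del self.flows[key]
        return expired


# Source address, destination address, source port, destination port,
# DSCP, user ID, project ID, protocol number
flow = ("10.10.8.1", "10.10.8.3", 1024, 80, 46, 100, 0, 6)
table = FlowTable(timeout_ms=60000)
table.record(flow, 1500, now=0.0)
table.record(flow, 1500, now=5.0)
print(len(table.flows), table.scan(now=30.0), list(table.scan(now=75.0)))
```

Both packets share one 8-tuple, so the table holds a single entry. The scan at 30 seconds finds nothing to expire, and the scan at 75 seconds removes the flow because 70 seconds have passed since the flow was last seen.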
flowacct Records
A flowacct record contains the attributes described in the following table.
Table 37-4 Attributes of a flowacct Record
Attribute Name | Attribute Contents | Type
src-addr-address-type | Source address of the originator. address-type is either v4 for IPv4 or v6 for IPv6, as specified in the IPQoS configuration file. | Basic
dest-addr-address-type | Destination address for the packets. address-type is either v4 for IPv4 or v6 for IPv6, as specified in the IPQoS configuration file. | Basic
src-port | Source port from which the flow originated. | Basic
dest-port | Destination port number to which this flow is bound. | Basic
protocol | Protocol number for the flow. | Basic
total-packets | Number of packets in the flow. | Basic
total-bytes | Number of bytes in the flow. | Basic
action-name | Name of the flowacct action that recorded this flow. | Basic
creation-time | First time that a packet is seen for the flow by flowacct. | Extended only
last-seen | Last time that a packet of the flow was seen. | Extended only
diffserv-field | DSCP in the outgoing packet headers of the flow. | Extended only
user | Either a UNIX User ID or user name, which is obtained from the application. | Extended only
projid | Project ID, which is obtained from the application. | Extended only
Using acctadm with the flowacct Module
You use the acctadm command to create a file in which to store
the various flow records that are generated by flowacct. acctadm works in
conjunction with the extended accounting facility. For technical information about acctadm, refer to the
acctadm(1M) man page.
The flowacct module observes flows and fills the flow table with flow
records. flowacct then evaluates its parameters and attributes in the interval that is
specified by timer. When no packet of a flow is seen for at least the timeout
interval after the flow's last_seen time, the flow times out. All timed-out entries
are deleted from the flow table. These entries are then written to the
accounting file each time the interval that is specified in the timer parameter elapses.
To invoke acctadm for use with the flowacct module, use the following syntax:
acctadm -e file-type -f filename flow
- acctadm -e
Invokes acctadm with the -e option. The -e indicates that a resource list follows.
- file-type
Specifies the attributes to be gathered. file-type must be replaced by either basic or extended. For a list of attributes in each file type, refer to Table 37-4.
- -f filename
Creates the file filename to hold the flow records.
- flow
Indicates that acctadm is to be run with IPQoS.