This chapter will discuss the theoretical details of IP filtering: what an IP
filter is, how it works, and basic considerations such as where to place
firewalls, what the policies should be, etcetera.
Questions for this chapter include: where do we actually put the firewall? In
most cases this is a simple question, but in large corporate environments it can
get trickier. What should the policies be? Who should have access to what? What
actually is an IP filter? All of these questions should be fairly well answered
later on in this chapter.
It is important to fully understand what an IP filter is. Iptables is an IP
filter, and if you don't fully understand this, you will run into serious
problems when designing your firewalls in the future.
An IP filter operates mainly at layer 2 (the internet layer) of the TCP/IP
reference stack. Iptables, however, also has the ability to work at layer 3
(the transport layer), as most IP filters of today do. But by definition, an IP
filter works at the second layer.
If an IP filter implementation strictly followed this definition, it would in
other words only be able to filter packets based on their IP headers (source
and destination address, TOS/DSCP/ECN, TTL, protocol, etcetera; in other
words, things that are actually in the IP header). However, since the iptables
implementation is not perfectly strict about this definition, it is also able
to filter packets based on other headers that lie deeper within the packet
(TCP, UDP, etc.), and shallower (the MAC source address).
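To make this concrete, here is a minimal sketch of what matching at these
different depths can look like. These rules are illustrative only; the TTL
value, port number, and MAC address below are placeholder assumptions:

# Match on a field in the IP header itself (the TTL), via the ttl match extension
iptables -A INPUT -m ttl --ttl-eq 64 -j LOG

# Match on a transport layer header field (the TCP destination port)
iptables -A INPUT -p tcp --dport 22 -j ACCEPT

# Match on something shallower than IP: the MAC source address
iptables -A INPUT -m mac --mac-source 00:11:22:33:44:55 -j ACCEPT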
There is one thing, however, that iptables is very strict about these days.
It does not "follow" streams or reassemble data. Doing so would simply be too
time-consuming. The implications of this are discussed a little further on.
Iptables does keep track of packets and see whether they belong to the same
stream (via sequence numbers, port numbers, etc.) in almost exactly the same
way as the real TCP/IP stack. This is called connection tracking, and thanks
to it we can do things such as Destination and Source Network Address
Translation (generally called DNAT and SNAT), as well as state matching of
packets.
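As a brief sketch of what connection tracking gives us, the first rule below
uses the state match to accept packets that belong to an already tracked
connection, and the second is a typical SNAT rule; the interface name eth0 and
the address 198.51.100.1 are placeholder assumptions:

# Accept packets that connection tracking recognizes as part of, or related
# to, an already established connection
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT

# Rewrite the source address of outgoing packets (SNAT); connection tracking
# ensures that reply packets are translated back automatically
iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source 198.51.100.1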
As I implied above, iptables cannot connect data from different packets to
each other, and hence you can never be fully certain that you will see the
complete data at all times. I mention this specifically because questions
about it come up constantly on the various mailing lists pertaining to
netfilter and iptables, usually about how to do things that are generally
considered a really bad idea. For example, every time there is a new
Windows-based virus, several people ask how to drop all streams containing a
specific string. The problem with this approach is that it is so easily
circumvented. For example, if we match for something like this:
cmd.exe
Now, what happens if the virus or exploit writer is smart enough to make the
packet size so small that cmd winds up in one packet, and
.exe winds up in the next packet? Or what if the packets have
to travel through a network whose maximum packet size is that small anyway?
Since these string matching functions are unable to work across packet
boundaries, the packets will get through regardless.
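For completeness, a rule of the kind discussed above might look roughly like
the following, assuming the string match extension is available in your kernel
(the --algo option is required in recent kernels, but may not exist in older
ones). As just explained, it inspects each packet in isolation and is
trivially circumvented:

# Drop packets containing the string "cmd.exe" -- generally a bad idea,
# since the string may be split across packet boundaries
iptables -A FORWARD -p tcp -m string --algo bm --string "cmd.exe" -j DROP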
Some of you may now be asking yourselves, why don't we simply make it possible
for the string matches, etcetera, to read across packet boundaries? The answer
is actually fairly simple: it would be too costly in processor time. Connection
tracking already consumes more processor time than is entirely comforting. To
add another layer of complexity to connection tracking, such as this, would
probably kill more firewalls than any of us could expect, not to mention how
much memory would be used on each machine for this seemingly simple task.
There is also a second reason why this functionality has not been developed:
there is already a technology called proxies. Proxies were developed to handle
traffic in the higher layers, and are hence much better at fulfilling these
requirements. Proxies were originally developed to handle downloads and
frequently used pages, and to help you get the most out of slow Internet
connections. For example, Squid is a web proxy. A
person who wants to download a page sends the request; the proxy either
intercepts the request or receives it directly, opens a connection to the
web server, downloads the file or page, and then sends it on to the client.
Now, if a second browser asks for the same page, the file or page is already
cached on the proxy and can be sent directly, which saves bandwidth for us.
As you may understand, proxies also have quite a lot of functionality for
looking into the actual content of the files they download. Because of this,
they are much better suited for inspecting whole streams, files, pages,
etc.
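As a small, hedged example of combining iptables with a proxy, a common setup
is to transparently redirect outgoing web traffic to a local Squid instance.
Here eth1 is assumed to be the LAN interface and 3128 Squid's default port;
Squid itself must also be configured for transparent operation:

# Redirect web traffic from the LAN to a Squid proxy running on this machine
iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-ports 3128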