We wanted to determine how well the firewalls protected application-layer traffic and how well they performed. We were surprised to see little support for POP3 or IMAP and support for H.323 limited to opening the necessary UDP ports.
For the protocols we were most interested in--DNS, FTP, H.323, HTTP, IMAP, POP3, SMTP--we tried to find out whether the firewall performed any syntax checking at the protocol layer. We used Netcat, a tool developed by @Stake, to bind a command shell to specific TCP and UDP ports. We then telnetted to the port. If we received the command shell, we knew the protocols weren't being enforced and we could move on. If we couldn't get a shell, we knew the firewall was enforcing the protocol. We then tried two paths to determine how well the enforcement worked.
First, we took well-known exploits that violated the protocol and determined whether we could get the traffic through the firewall. We used Network Associates' Sniffer Distributed s4000 Model EG2S to monitor the packets coming out of the firewall. If the attack didn't get through, we knew the firewall was blocking it. Then we used Cenzic's Hailstorm 2.0 to generate malformed packets, primarily against the HTTP, IMAP, POP3 and SMTP protocols. With Hailstorm, we could try to send headers containing long strings and non-ASCII characters, which typically are not allowed. You can find more detailed descriptions of our protection testing below.
We used Spirent Technologies' WebAvalanche and WebReflector to create a large user base and a Web farm, respectively. Combined, WebAvalanche and WebReflector produced a closed-loop test bed that simulated large numbers of users performing HTTP GET requests. We created three scenarios designed to test each firewall's ability to handle a high connection and tear-down rate, a high number of concurrent connections and a high traffic load over a moderate number of concurrent connections. We made every attempt to configure each firewall to provide a similar level of protection.
More Details on Our Performance Tests
Throughput testing typically consists of bit-blasting the device under test
(DUT) to find when the DUT begins to fail. And though bit-blasting is fine for
testing Layer 2/3 infrastructure devices, it doesn't really test the performance
of DUTs that have to maintain session state while passing traffic. With
firewalls--both stateful packet filtering and application proxies--new TCP/UDP
sessions start and end while traffic is flowing through the DUT. Because the
firewall is kept busy tracking a dynamic traffic load, session management
becomes a serious performance factor.
We created three tests to measure specific aspects of firewall performance. Our
connection-rate test determined how many connections per second the firewall
could handle. Although you can predict and scale to typical traffic patterns,
you'll always need to accommodate spikes in traffic. Our maximum
sustained connections test determined how many connections the DUT could hold
open. This finds the upper limit of sustained connections. Finally, our
throughput test determined how much data could be passed through the firewall
over HTTP while opening and closing sessions. This test is a better determiner
of firewall performance than a bit-blast test over a few open sessions because
the firewall is doing real work similar to what you would see on a real network.
Connection Rate
In WebReflector, we created four Web servers for the clients to connect to and set
up 500 connections per second for a 60-second time period. The connections were
HTTP/1.0 with KeepAlive turned off, and the file was an 8-byte file (we didn't
want to saturate the link). Next, we increased the connections per second by 500
over a 60-second time period. That gave us a gradual increase in connections per
second without triggering SYN flood detection. We continued to increase the
connections per second until we saw failures. We then stepped the test back to
the failure point and ran each test for 10 minutes until we found the sustained
connection rate. Because WebAvalanche simulated Web users, as one connection
closed, another was opened, maintaining the desired connection-per-second rate.
This exercised the firewall's state and TCP tables.
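The ramp described above can be written out as a simple schedule. This is only an illustrative sketch, not the WebAvalanche configuration we used: the function names and the number of steps are assumptions, while the 500-connections-per-second start, the 500 increment and the 60-second step come from the test description.

```python
def ramp_schedule(start_cps=500, step_cps=500, step_seconds=60, steps=5):
    """Return (start_time_in_seconds, target_connections_per_second) per step.

    The number of steps is an arbitrary illustration; in practice the ramp
    continues until the firewall starts failing connections.
    """
    return [(i * step_seconds, start_cps + i * step_cps) for i in range(steps)]


def total_connections(schedule, step_seconds=60):
    """Total connections attempted if every step runs for step_seconds."""
    return sum(cps * step_seconds for _, cps in schedule)


if __name__ == "__main__":
    schedule = ramp_schedule()
    for start, cps in schedule:
        print(f"t={start:>4}s  target={cps} connections/s")
    print("connections attempted:", total_connections(schedule))
```

Stepping the rate in fixed increments, rather than jumping straight to a high rate, is what keeps the load from looking like a SYN flood to the firewall.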
Maximum Sustained Connections
We used a test bed similar to that of our connection-rate test, but we set a
delay on WebReflector of 60 seconds, which kept the connection active. We then
ramped up the connections until we could make no more. We set the maximum
connections to varying levels until we found the maximum number of connections.
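One way to narrow in on the ceiling when trying varying levels is a bisection search. The sketch below is an assumption about how such a search could be organized, not a record of the exact levels we tried. The `holds` callback is hypothetical and stands in for a full test run at a given connection count, and the limit inside `fake_firewall` is a made-up number used only to exercise the search.

```python
def find_max_connections(holds, low, high):
    """Return the largest n in [low, high] for which holds(n) is True.

    Assumes holds(n) is True for every n up to some limit and False above it.
    Returns None if even the lowest level fails.
    """
    if not holds(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if holds(mid):
            low = mid
        else:
            high = mid - 1
    return low


def fake_firewall(n, limit=37_500):
    """Hypothetical stand-in for a test run: succeeds up to an invented limit."""
    return n <= limit


if __name__ == "__main__":
    print(find_max_connections(fake_firewall, 1, 100_000))  # prints 37500
```

Each call to `holds` would correspond to one complete run with the 60-second delay in place, so a search that halves the range each time keeps the number of runs small.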
Throughput
We used HTTP/1.1 with persistent connections and created a transaction where a
user asked for a 25-KB page with two subobjects of 25 KB each. Each transaction
then consisted of a 75-KB download. We varied the number of connections per
second to reach a desired bandwidth. We disabled caching on the firewalls that
supported it and configured each page to be expired.
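The arithmetic behind picking a rate for a desired bandwidth is straightforward. The sketch below assumes 1 KB = 1,024 bytes, 1 Mbps = 1,000,000 bits per second, and ignores HTTP and TCP overhead, so the real rate needed would be somewhat lower. The function names and the 100-Mbps target are illustrative.

```python
PAGE_KB = 25      # size of the page, from the test description
SUBOBJECTS = 2    # subobjects per page, each the same size as the page
BYTES_PER_KB = 1024  # assumption: binary kilobytes


def transaction_bytes():
    """Payload bytes downloaded in one transaction (page plus subobjects)."""
    return (1 + SUBOBJECTS) * PAGE_KB * BYTES_PER_KB


def transactions_per_second(target_mbps):
    """Transactions per second needed to fill target_mbps with payload alone."""
    bits_per_transaction = transaction_bytes() * 8
    return target_mbps * 1_000_000 / bits_per_transaction


if __name__ == "__main__":
    print(transaction_bytes())                      # 76800 bytes, i.e. 75 KB
    print(round(transactions_per_second(100), 2))   # about 162.76 for 100 Mbps
```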
Protection Testing
We used a mixture of exploits from online.securityfocus.com and Cenzic's
Hailstorm to exercise the firewall application proxies. We set up DNS, FTP,
HTTP, IMAP, POP3, SMTP and NetMeeting for H.323.
Protocol Conformance
We wanted to determine whether we could get nonconforming traffic past the
proxies--as you would see with server-based buffer overflows, where non-ASCII
traffic is inserted into the protocol headers--or make a telnet connection to a
Netcat listener on a well-known port. If the firewall is a real application
proxy, nonconforming traffic should be blocked. We tested DNS, HTTP, IMAP, POP3
and SMTP for conformance. We used scripts from Cenzic Hailstorm to fill
headers with long strings in an attempt to overflow buffers, and we used Bit
Walker, which essentially flips bits in a string to generate ASCII and non-ASCII
characters. To determine whether the malformed traffic got through the firewall,
we used Network Associates' Sniffer Distributed to capture packets on the
protected side of the firewall. Any malformed packets that got through were
marked as a failure.
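To give a feel for the two kinds of malformed input described above, here is a minimal sketch of an overlong header and a bit-flipping generator. It is not Hailstorm's or Bit Walker's actual interface. The function names and the `X-Test` header are hypothetical, and the code only builds byte strings without sending anything.

```python
def long_header(name, length, fill="A"):
    """Build a single header line whose value is `length` fill characters."""
    return f"{name}: {fill * length}\r\n".encode("ascii")


def bit_flips(data):
    """Yield copies of `data` with exactly one bit flipped, walking every bit."""
    for index in range(len(data)):
        for bit in range(8):
            mutated = bytearray(data)
            mutated[index] ^= 1 << bit
            yield bytes(mutated)


def is_ascii(data):
    """True if every byte is in the 7-bit ASCII range."""
    return all(byte < 0x80 for byte in data)


if __name__ == "__main__":
    header = long_header("X-Test", 5000)  # hypothetical header name
    print(len(header))
    variants = list(bit_flips(b"HELO"))
    non_ascii = [v for v in variants if not is_ascii(v)]
    print(len(variants), len(non_ascii))
```

Flipping the high bit of any byte pushes it outside the ASCII range, which is why a bit walk over a short command produces both ASCII and non-ASCII variants.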
For H.323, which opens dynamic protocol ports during a connection, we started
a session and then used WUPS, a Windows UDP port scanner from Arne Vidstrom, on
the host making the connection. If we saw the scanner's UDP packets on the
protected side of the firewall, we knew the firewall was opening UDP ports
wholesale rather than only as the session needed them.
HTTP Testing
We tested the ability of the proxies to block HTTP methods and to filter URLs
based on length or text matching. Application firewalls aren't meant to provide
full-blown Web application protection; however, simple URL matching shouldn't be
difficult. We tested whether the HTTP proxy could block a common
directory-traversal attack. We used regular expressions or simple strings,
depending on what the firewall supported. We tested the blocking by sending
URLs containing the offending strings, and we also Unicode-encoded the strings
to try to bypass the proxy filters.
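The difference between matching the raw URL and matching it after decoding can be sketched as follows. This is an illustration under assumptions, not any tested firewall's filter: the pattern, the function names and the sample paths are hypothetical, and only percent-encoding is handled (other encodings, such as overlong UTF-8 sequences, are not).

```python
import re
from urllib.parse import unquote

# Matches "../" or "..\" anywhere in the URL.
TRAVERSAL = re.compile(r"\.\.[/\\]")


def naive_filter(url):
    """Block only if the traversal pattern appears literally in the URL."""
    return bool(TRAVERSAL.search(url))


def decoding_filter(url, max_rounds=3):
    """Block if the pattern appears after up to max_rounds of percent-decoding."""
    current = url
    for _ in range(max_rounds):
        if TRAVERSAL.search(current):
            return True
        decoded = unquote(current)
        if decoded == current:  # nothing left to decode
            break
        current = decoded
    return bool(TRAVERSAL.search(current))


if __name__ == "__main__":
    for url in ("/docs/../../secret.txt",
                "/docs/%2e%2e%2f%2e%2e%2fsecret.txt",
                "/docs/%252e%252e%252fsecret.txt",
                "/index.html"):
        print(url, naive_filter(url), decoding_filter(url))
```

A filter that matches only the literal string passes the encoded variants, which is exactly the gap the encoded test URLs were meant to expose.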
DNS Cache Poisoning
We set up a BIND DNS server on the external network and created a new zone under
our domain name called zone-1-1. We also created a zone called zone-1-2. We
created an alias, "spoof," in zone-1-1 that points to "real" in zone-1-2. In
zone-1-2, we created an A record that resolved "real" to an IP address. When we
queried zone-1-1 for "spoof," the DNS server returned the answer as an alias
for real.zone-1-2 along with the address for real.zone-1-2. Our misconfigured
DNS server promptly cached both entries.
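The alias-plus-address answer can be modeled with a toy record table. This sketch only mimics the shape of the response and is not how BIND works internally. The `example.test` parent domain and the 192.0.2.10 address are placeholders, since the real domain name and IP address are not given here.

```python
# Toy record table: name -> (record type, value). Names and address are placeholders.
RECORDS = {
    "spoof.zone-1-1.example.test.": ("CNAME", "real.zone-1-2.example.test."),
    "real.zone-1-2.example.test.": ("A", "192.0.2.10"),
}


def resolve(name, records=RECORDS, max_depth=8):
    """Follow CNAME records from `name` and return every record in the chain."""
    answer = []
    for _ in range(max_depth):  # bound the chain so a CNAME loop cannot hang
        if name not in records:
            break
        rtype, value = records[name]
        answer.append((name, rtype, value))
        if rtype != "CNAME":
            break
        name = value
    return answer


if __name__ == "__main__":
    for record in resolve("spoof.zone-1-1.example.test."):
        print(record)
```

The point of the test is the second record in the answer: a server that caches it has accepted data about a name in a different zone from the one it asked about.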
SMTP Testing
Because none of the firewalls had true POP3 and IMAP application proxies, we
focused on SMTP. We tested open relaying of e-mail by configuring the SMTP proxy
to allow e-mail only to our domain. Then, using an SMTP client on the outside,
we tried to relay mail directly and with source-routed addresses such as
victim!blah.com@our.domain.com (bang path) or victim%blah.com@our.domain.com
(percent sign).
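The check a relay-aware SMTP proxy needs to make on each recipient can be sketched in a few lines. This is an illustration under assumptions, not the logic of any firewall we tested: the function name is hypothetical, and rejecting every local part that contains `!`, `%` or `@` is a deliberately strict policy.

```python
OUR_DOMAIN = "our.domain.com"


def should_reject(recipient, our_domain=OUR_DOMAIN):
    """Return True if accepting this recipient could relay mail elsewhere."""
    local, separator, domain = recipient.rpartition("@")
    if not separator or domain.lower() != our_domain:
        return True  # no domain at all, or not addressed to our domain
    # Routing characters in the local part can hide a second destination.
    return any(char in local for char in "!%@")


if __name__ == "__main__":
    for address in ("user@our.domain.com",
                    "victim@blah.com",
                    "victim!blah.com@our.domain.com",
                    "victim%blah.com@our.domain.com"):
        print(address, should_reject(address))
```

Checking only the domain after the last `@` is not enough, because both relay strings above end in the protected domain while naming another destination in the local part.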