On Location: Implementing Icarus P2P-Blocking Software

Icarus put the kibosh on P2P play with a collection of PERL scripts paired with a MySQL RDBMS.

February 13, 2004


All this makes for an intriguing debate: limited resources, university policies and copyright infringement versus privacy and freedom. For now, however, we'll leave that cerebral discussion to philosophers like David Joachim (see "The Enforcers," page 40) and focus on the technology behind Icarus. To paraphrase Descartes, we network, therefore we are.

Whose P2P?

First, let's define our terms. P2P connections are prohibited under UF's residence hall "no server" network policy. Some people mistakenly think that P2P sessions involve a community of peers working together with equal responsibilities--a serverless environment that would seem to exclude this class of application from UF's policy. Although it's true that P2P machines are peers, whenever one of those peers makes resources available to another, as it does when sharing an MP3 file, for example, that client machine becomes a server--a UF policy no-no.

The P2P category is broad, encompassing applications ranging from real-time Web conferencing to Kazaa and its ilk, which let a dynamic community of users share music and other files.

With P2P network nodes freely alternating as clients and servers, how--or even whether--a particular node will participate at any given moment is unpredictable. As P2P expert Clay Shirky explains, "P2P is a class of application that takes advantage of resources--storage, cycles, content, human presence--available at the edges of the Internet. Because accessing these decentralized resources means operating in an environment of unstable connectivity and unpredictable IP addresses, P2P nodes must operate outside the DNS system and have significant or total autonomy from central servers."

Right Place at the Right Time

UF has 9,000 nodes connected at 100 Mbps on its housing network, including students and staff. The housing network is connected to the campus backbone via a gigabit connection, and all UF's networks share an OC-12 link to the Internet. It doesn't take much for 9,000 100-Mbps pipes to fill a 1-Gbps pipe--and, predictably, it didn't take long for thousands of P2P machines to bring the housing network's gigabit connection to its knees.
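The mismatch is easy to put in numbers. The short Python sketch below uses only the figures quoted above (9,000 nodes, 100-Mbps access ports, a 1-Gbps uplink); the variable names are ours, not part of Icarus.

```python
# Figures from the article: 9,000 nodes at 100 Mbps behind a 1-Gbps uplink.
NODES = 9_000
NODE_LINK_MBPS = 100
UPLINK_MBPS = 1_000

# Total access capacity if every port ran at line rate.
aggregate_mbps = NODES * NODE_LINK_MBPS

# How many times the access capacity exceeds the uplink.
oversubscription = aggregate_mbps / UPLINK_MBPS

# Number of nodes at full line rate needed to saturate the uplink.
nodes_to_saturate = UPLINK_MBPS // NODE_LINK_MBPS

# Average per-node rate that fills the uplink when all nodes are active.
fair_share_mbps = UPLINK_MBPS / NODES

print(f"Oversubscription ratio: {oversubscription:.0f}:1")
print(f"Nodes at line rate to fill the uplink: {nodes_to_saturate}")
print(f"Per-node share with all nodes active: {fair_share_mbps:.3f} Mbps")
```

In other words, ten machines serving files at full speed are enough to fill the gigabit link, and an even split across all nodes leaves each with roughly a tenth of a megabit per second.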

Getting additional bandwidth between the housing network and the campus backbone was out of the question: The folks in central computing knew P2P was driving most of the traffic, and they weren't about to redirect bandwidth from the shared Internet connection. Robert Bird, Tony Hernandez and Will Saxon, the university housing department's network administrators, would have to come up with their own solution.

With just three bodies to oversee a 9,000-node network, they knew that any bandwidth-recovery system they implemented would have to be highly automated. On the plus side, the men could give the network their undivided attention: UF's housing department, unlike most other departments of its kind, has a network administration staff unto itself. What's more, Bird, Hernandez and Saxon all have development backgrounds. But their biggest asset is that they had mastered the art of using security tools--firewall and intrusion-detection logs, port-scanner data, SNMP data, and system logs--to identify P2P traffic against a backdrop of normative data.

"The distribution of traffic and the way people behave in the residential environment is very predictable," says Bird, coordinator of network services in UF's housing division and the lead developer of Icarus. That made it easy to spot new patterns and behaviors--including a new P2P application. All Bird's team had to do was determine how to automate its manual P2P ID procedures and specify the actions to take based on policies.

From the start, the trio knew they needed a fast-tracked solution that played to their strengths. First, they looked at open-source projects to see if there was anything they could use or adapt. They found a number of handy tools but quickly realized they needed to develop a flexible, modular framework to tie all the pieces together.

The developers all had experience using PERL to automate administrative processes, and its flexibility made that language a good choice to both parse the data collected for the Icarus system and script the actions to take when a policy was violated.

Icarus was pilot-tested during the university's spring and summer 2003 sessions. Generally speaking, it passed with flying colors, though it did encounter some minor turbulence. One example: Icarus was sure that a beta version of MSN Messenger was actually a P2P application because the traffic patterns for the videoconferencing piece of MSN Messenger closely resemble other P2P profiles for which Icarus was looking. Some fine-tuning of the profiles dissuaded Icarus from this mistaken belief.

At the heart of the Icarus framework is a MySQL RDBMS, where about 25 GB of data--five days' worth--is available for analysis at any given time. Data sources include IDS and system logs, port scanners and SNMP. The data is normalized and sequenced, then parsed by PERL scripts designed to look for factors that identify targeted behavior, like P2P traffic or virus and worm activity. Eighty percent of the traffic analysis takes place in real time, and the rest relies on small-interval batch processing.
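The article does not publish Icarus' actual profiles, so the following is only a rough Python sketch of the general idea: normalized traffic records are aggregated per host and compared against a behavioral profile. The record fields, the `Profile` thresholds and the `match_profile` function are all hypothetical, and Icarus itself is written in PERL against MySQL rather than in Python.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One normalized traffic record (hypothetical fields)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    bytes_out: int


@dataclass(frozen=True)
class Profile:
    """A behavioral profile: many distinct peers plus heavy upload volume."""
    name: str
    min_distinct_peers: int
    min_bytes_out: int


def match_profile(records, profile):
    """Return the sorted source IPs whose aggregate behavior fits the profile."""
    peers = defaultdict(set)
    sent = defaultdict(int)
    for record in records:
        peers[record.src_ip].add(record.dst_ip)
        sent[record.src_ip] += record.bytes_out
    return sorted(
        ip for ip in peers
        if len(peers[ip]) >= profile.min_distinct_peers
        and sent[ip] >= profile.min_bytes_out
    )


# Illustrative data only: one host uploads to three peers, another to one.
sample = [
    FlowRecord("10.0.0.5", "198.51.100.1", 1214, 500),
    FlowRecord("10.0.0.5", "198.51.100.2", 1214, 500),
    FlowRecord("10.0.0.5", "198.51.100.3", 1214, 500),
    FlowRecord("10.0.0.6", "198.51.100.9", 80, 5000),
]
suspects = match_profile(sample, Profile("p2p-like", 3, 1000))
print(suspects)
```

A profile this crude would produce false positives of the kind the MSN Messenger beta triggered during the pilot, which is why real profiles need tuning against a baseline of normal traffic.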

Once a targeted behavior is identified, another set of PERL scripts takes the actions necessary to enforce the university's policies. These may include setting new router entries or modifying existing ones, as well as reconfiguring the VLANs (virtual LANs) associated with a network switch port.

Say a student activates Kazaa on his or her system (see "Icarus Setup," page 55). All data reflecting the student's activity is collected in the Icarus database and simultaneously parsed to determine whether it fits the pattern for Kazaa traffic. When the analysis determines that the traffic is indeed Kazaa's, additional databases are consulted to identify the port and the registered machine from which the traffic originated, as well as the student to whom the machine belongs. The Icarus framework then kicks off a PERL module that initiates a sequence of actions pursuant to the university's network administration policy.

At this point, the student's network port is switched to an on-campus-only connection, and the student is notified by a pop-up screen and e-mail that a policy violation has been recorded and that the machine will be restricted to on-campus traffic only until Kazaa is shut down. If the student has committed violations in the past, additional judicial proceedings may be initiated.
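The enforcement sequence just described can be sketched as a small decision routine. This is an illustration in Python, not Icarus' PERL module: the `Registration` record, the VLAN number, the action names and the handling of unregistered hosts are all assumptions, and the function only returns the actions it would take rather than touching any switch.

```python
from dataclasses import dataclass

CAMPUS_ONLY_VLAN = 900  # hypothetical VLAN ID for on-campus-only access


@dataclass
class Registration:
    """Who owns a machine and where it plugs in (hypothetical schema)."""
    owner: str
    switch_port: str
    prior_violations: int = 0


def enforce(ip, app, registry):
    """Return the ordered list of actions for a detected policy violation."""
    reg = registry.get(ip)
    if reg is None:
        # Assumption: unknown machines are handed to a human administrator.
        return [("alert_admin", ip, f"unregistered host running {app}")]

    actions = [
        # Restrict the switch port to on-campus traffic.
        ("set_vlan", reg.switch_port, CAMPUS_ONLY_VLAN),
        # Tell the student what happened and how to get access back.
        ("notify", reg.owner,
         f"{app} detected; restricted to on-campus traffic until it is shut down"),
    ]
    if reg.prior_violations > 0:
        # Repeat offenders may face additional judicial proceedings.
        actions.append(("refer_judicial", reg.owner, app))
    reg.prior_violations += 1
    return actions


registry = {"10.0.0.5": Registration(owner="student42", switch_port="hall3-sw2/17")}
first = enforce("10.0.0.5", "Kazaa", registry)
second = enforce("10.0.0.5", "Kazaa", registry)
print(first)
print(second)
```

Returning a list of actions instead of executing them directly keeps the policy logic separate from the device-specific code that would reconfigure a router or switch port.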

What Else You Got?

Because all applications leave distinct fingerprints behind, it's not a big leap to recognize that a framework that can identify P2P traffic and quarantine the nodes involved can recognize the traffic profiles for any application and take appropriate action.

Enter the Welchia worm. At the start of the fall 2003 semester, its outbreak paralyzed university networks throughout the country. Some large schools suffered weeks of downtime and spent many thousands of dollars cleaning up the mess.

Not UF. Bird and his crew identified the Welchia pattern on the housing network and used Icarus to dynamically quarantine infected machines and tell the machines' owners what remedial actions to take. Whereas other universities could do little more than manually remove tainted computers from the network, UF automated the process so that a student's infected machine could be quarantined at 4:30 a.m., linked to a Web site equipped with the appropriate inoculation software, rescanned for the infection and, when verified as clean, reconnected to the production network by 4:40 a.m. Students didn't need to wait for the helpdesk to open at 8:30.

But Bird and his team had another hurdle to clear: third-generation P2P applications like Earthstation 5, which are written to avoid detection. Because ES5 is encrypted, conventional detection tools that look for particular strings inside the application layer can't do their job. And because ES5 uses the same ports as legitimate traffic, it thwarts port-blocking as well. This represents an escalation of the P2P arms race, as far as university systems administrators, ISPs, the Motion Picture Association of America, the National Music Publishers Association and the Recording Industry Association of America are concerned.

ES5 traffic began to ramp up on the Internet at the same time as Icarus' debut. So Icarus was doomed from the start, right? Nope. The Icarus developers discovered an ES5 signature that occurs over time. This let them identify ES5 traffic and quarantine machines running the software.

The point is, the Icarus developers think their framework is flexible enough to stay ahead of the curve as the P2P battle continues to rage. Time will tell.

The Biggest Hurdle of All

Icarus' developers are cautiously optimistic about selling their system to other learning institutions, if not the broader world of ISPs, broadband providers and metropolitan area networks (see "An Intelligent Convergence?," below).

UF has a number of large universities waiting in the wings to test Icarus on their networks. If things go according to plan, the University of Arizona should be putting Icarus through its paces in Tucson this spring. This is a critical first step that will enable the UF development team to identify the steps necessary to generalize this homegrown application to other environments. Bird is the first to admit that Icarus is far from a turnkey solution. But given the application's exceptional track record and demonstrated ROI, we wouldn't bet against its creators.

RON ANDERSON is NETWORK COMPUTING's lab director. Before joining the staff, he managed IT in various capacities at Syracuse University and the Veterans Administration. Write to him at [email protected].

Kathy Bergsma: Information Security Manager

At Work: Responsible for incident response, vulnerability assessment, policy enforcement, training and awareness

At Home: 45 years old. Single, no children. Hobbies include biking, hiking, skiing, canoeing, dancing, horticulture, sewing, knitting, crocheting and painting

Alma Mater: University of Florida, M.S. and B.S. in horticulture

HOW SHE GOT HERE:

1999 to 2003: IT security coordinator, UF

1995 to 1999: Novell server administrator, animal science department, UF

1993 to 1995: Unix server administrator, statistics department, UF

1986 to 1993: Biological scientist, horticultural sciences department, UF

1981 to 1986: Manager of greenhouse and interiorscape businesses, West Farms


What I say to critics of P2P-blocking: "Copyright violations are both a federal and state crime in Florida."

I work at UF because: "It offers great job opportunities."

The most misunderstood aspect of my job: "Spam is not necessarily a security problem."

I love technology when: "It facilitates my access to information."

I hate technology when: "Security is not considered."

My next career: "Who knows? I never would have guessed five years ago that I would be the university's information security manager."

When I retire, I will: "Hopefully still be alive."

An Intelligent Convergence?

Using multiple data sources, intelligent switches and some homegrown applications, Icarus' developers have done what many IT organizations only dream about: Enforce a predefined policy in an automated manner by leveraging existing technology.

When we asked Icarus' developers why they went the homegrown route, they dismissed the notion that cost was a factor. Although UF considered (and uses) a number of commercial offerings, none provided the completeness required. Icarus uses commercial as well as custom-designed components to supply a broader solution than any individual product could. With an IT staff of three, the broader the better.

Still, though Icarus is the first deployment of what appears to be a fully automated and integrated peer-to-peer blocking mechanism that we've witnessed, we're watching a few product spaces begin to converge on this goal. It's a tricky task, involving gathering data from dissimilar devices and ultimately making configuration changes that affect production environments, but first steps are being taken. For example, companies like Cisco Systems, Extreme Networks and Enterasys Networks are shipping 802.1x-enabled switches capable of authenticating users on a per-port basis. And endpoint protection suites from companies like Sygate and ZoneLabs (now Check Point) are being designed to work with these 802.1x-enabled switches and agents, providing functionality beyond basic authentication.

Further, firewall vendors like NetScreen Technologies and Check Point are building more "Layer 7 smarts" into their products, enabling these devices to better detect rogue streams through more intelligent and deeper protocol inspection. Intrusion detection and prevention vendors, such as Network Associates, SourceFire Network Security and TippingPoint Technologies, are implementing similar inspection mechanisms and continue to design signatures that help IT organizations home in on unauthorized network applications.

Finally, traffic anomaly-detection products from companies like Mazu Networks, Arbor Networks and Lancope avoid some of the failures of signature-based technology by using behavioral profiling to detect unwanted network services.

Of course the real challenge is taking the intelligence gathered from these devices and doing something with it in an automated fashion. Event-correlation vendors such as GuardedNet and ArcSight are getting closer, but we wouldn't trust them to reprogram our switches in a fully automated manner; the technology is still too new, and current products are designed more for monitoring and less for control. We're curious to see whether security-enabled infrastructure companies like Cisco and Enterasys will bridge the gap between intelligent correlation and control, or whether it will take third-party vendors to get us there.

Bottom line: if Icarus works as claimed, the UF staff should be very proud. From our vantage point, we expect similar functionality to be commercially available this year. Whether organizations choose to trust vendor implementations with their production traffic, however, remains to be seen. -- Greg Shipley
