Solve the complexity crisis in network system design

SBC system providers are rising to the challenge of network system complexity by providing equipment with the necessary performance.

August 4, 2005


While the telecom bubble of 1999 seems light years away, those of us in the business of designing networking equipment have seen the number and complexity of challenges in this area steadily rise. Predictably, this increased complexity has resulted in compromises in product functionality or performance, longer development cycles, or both, with increased risk fueled by rapidly evolving standards. Although the increasing complexity of network applications has been documented many times in the past, let’s look at the macro factors driving it. Staying at the macro level, some of the key networking drivers are:

  • A general increase in traffic, resulting in higher line speeds, higher throughput, more network elements, etc.

  • Convergence and standardization on IP and eventual migration to IPv6, resulting in more gateways (e.g., media gateways, Packet Cable, etc.).

  • Mobility, resulting in a significant increase in network endpoints (e.g., handsets) as well as contributing to IP conversion and an increase in traffic.

  • Security concerns, resulting in an increased number of network elements with security functionality as well as a trend to combine what have traditionally been individual security devices into multi-function devices, such as Unified Threat Management devices.

  • Despite the convergence and standardization on IP, there’s still a lot of change, with new protocols, additional standards, new uses, and new threats emerging.

While each of these factors drives increased function, capacity, and/or performance in networking equipment, let’s start with security. Maintaining information security is obviously one of the biggest challenges affecting broader use of the Internet for communications. Looking deeper, a technology such as Voice over Internet Protocol (VoIP) will continue to grow in popularity as a low-cost alternative to traditional phone service, but implementing it securely poses major challenges for communications systems designers.

A key component of VoIP systems is the session border controller (SBC), which enables VoIP media to traverse enterprise firewalls and network address translation (NAT) and supports other advanced features such as encrypted media. Other examples of complex SBC features include network hardening to resist denial-of-service attacks and deep packet processing, such as signature detection for intrusion detection.

SBC system providers, like other communication system vendors, are rising to the challenge by providing equipment with the performance to analyze data packets on the fly at wire-speed. In general, the industry approach has been to use programmable processor-based systems (as opposed to ASIC- or FPGA-based designs) to maximize the flexibility needed to keep pace with the rate of change. Designs typically incorporate the following:

  • high degrees of parallelism, either through multi-core general purpose processors or network processors;

  • fixed-function accelerators, such as encryption/decryption, hash units, and TCAMs;

  • various memory types, each with potentially different access models and timings;

  • various buses/interconnects to other silicon, again, each with different access models and timings;

  • interaction with other system elements or planes.

SBC vendors are a perfect example of this. Despite pressure from carriers to minimize the number of unique boxes in their networks, SBCs have established a foothold as standalone network elements. The reason is that SBCs have stayed ahead of the innovation curve; because they have visibility into both the signaling and the media content, they are well positioned to implement new features.

But all of this comes at the cost of increased software development complexity, simply because meeting the performance goals typically requires a new and complex architecture. One example is the proliferation of the network processor (NP). NPs are optimized for packet processing because they incorporate a high degree of parallelism through pipelined or superscalar processor arrangements, multi-threading, multiple memory types, and dedicated-function accelerators (e.g., hash units, CRC generators, etc.).

The most obvious impact of this complex architectural environment is increased development and lifecycle software costs due to the steep learning curve and lengthened development, debug, and test phases. But there are other impacts, too. For example, the lengthened development cycle often delays a functional prototype until late in the project, which in turn delays integration with other system components and overall system performance modeling. Another artifact of the high complexity is that designers are often hesitant to modify or enhance working designs due to the risk of change and the lengthy debug cycle, negating one of the main benefits of NP-based designs: flexibility.

Our SBC example shows the importance of flexibility and extensibility. The original reason for SBCs was that the VoIP signaling protocols, including Session Initiation Protocol (SIP), H.323, and Media Gateway Control Protocol (MGCP), are poorly behaved, meaning they transfer key information in layers above OSI layer 4. This prevented VoIP media traffic from traversing the various NAT devices and firewalls between endpoints. To solve this, SBCs monitor both the signaling traffic (to sniff out the parameters of new VoIP calls) and the media traffic (to open pinholes that let the media pass through the various security devices).

But SBC vendors didn’t stop innovating. Once it was clear that having visibility to both the signaling and media traffic provided an architectural advantage over other boxes in the network, additional features were added, such as quality of service, route optimization, sophisticated call detail records, and encryption/decryption. None of this innovation would have been possible without a flexible underlying implementation that allowed for scalability and extensibility.

As more communications and networking system vendors seek to exploit the new NP silicon, however, an application software development complexity crisis is brewing that requires some innovative thinking. After decades of communications system design where system optimization relied on low-level code optimization to maximize performance, the task of optimizing the newest NP applications has become too complex for this technique to be accomplished without missing fleeting market windows.

As an example of SBC application complexity, consider the relatively simple task of programming an NP to implement a pinhole firewall, NAT, and encryption. In somewhat simplified terms, the software engineer must program the processor to first determine whether the incoming packet is part of the VoIP media (as opposed to some other control or management packet), and if so, perform a flow/session lookup to determine whether the packet belongs to an established and accepted session. This lookup is often based on the five- or six-tuple of IP and layer 4 header fields, with an optional VLAN identifier, and must be performed on a table capable of tracking the maximum number of supported concurrent connections (often tens or hundreds of thousands).
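The flow/session lookup described above can be sketched in C. This is a minimal, hypothetical illustration (the struct, table size, and function names are invented, not from any real SBC): a 5-tuple key hashed into a fixed-size table with linear probing, with a miss routed to the exception path.

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* 5-tuple flow key; zero the struct before filling so any padding
 * bytes are deterministic for hashing and comparison. */
typedef struct {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
} flow_key;

#define TABLE_BITS 16
#define TABLE_SIZE (1u << TABLE_BITS)

typedef struct {
    flow_key key;
    uint32_t flow_index;   /* index into per-flow NAT/crypto state */
    int      in_use;
} flow_entry;

static flow_entry table[TABLE_SIZE];

/* FNV-1a over the key bytes; a real NP would often use a hardware hash unit. */
static uint32_t hash_key(const flow_key *k)
{
    const uint8_t *p = (const uint8_t *)k;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < sizeof *k; i++) { h ^= p[i]; h *= 16777619u; }
    return h & (TABLE_SIZE - 1);
}

static int flow_insert(const flow_key *k, uint32_t flow_index)
{
    for (uint32_t i = 0, h = hash_key(k); i < TABLE_SIZE; i++) {
        flow_entry *e = &table[(h + i) & (TABLE_SIZE - 1)];
        if (!e->in_use) {
            e->key = *k; e->flow_index = flow_index; e->in_use = 1;
            return 0;
        }
    }
    return -1;                        /* table full */
}

/* Returns 0 and sets *flow_index on a hit; -1 on a miss (exception path). */
static int flow_lookup(const flow_key *k, uint32_t *flow_index)
{
    for (uint32_t i = 0, h = hash_key(k); i < TABLE_SIZE; i++) {
        flow_entry *e = &table[(h + i) & (TABLE_SIZE - 1)];
        if (!e->in_use) return -1;    /* empty slot: key cannot be present */
        if (memcmp(&e->key, k, sizeof *k) == 0) {
            *flow_index = e->flow_index;
            return 0;
        }
    }
    return -1;
}
```

Even this toy version hints at the real difficulty: on an NP, the table would be shared across parallel processing elements, so every probe and insert would also need synchronization that this single-threaded sketch omits.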

Due to the table's size, hashing or other search algorithms are usually used, as opposed to linearly searching the entire table. If the packet isn’t part of an established session, it's then processed as an exception and some combination of logging, dropping, and alerting is performed on the packet. If it is part of an established session, NAT is performed by updating the packet header fields using data structures indexed by the unique flow index returned from the lookup. Other state variables are inspected to determine if this is a secure flow, and if so, encryption keys and cipher state (e.g., which cipher and/or hash is to be used) are accessed through similar data structures and the packet is encrypted or decrypted. Lastly, packet values are updated to account for any padding and for changes to the UDP checksum and the IP header checksum, and the packet is forwarded to the appropriate output port.

Expressing this logic in assembly language typically requires thousands of lines and involves not only coding the packet logic described, but also dealing with the intricacies of the underlying hardware, such as partitioning and synchronizing the function across parallel processors.
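As a concrete sketch of the final NAT step described above (in C rather than assembly, with invented function names), the code below rewrites the source address in an IPv4 header and recomputes the header checksum using the standard ones-complement sum of RFC 1071. Production code often uses incremental checksum update (RFC 1624) instead of a full recompute.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* RFC 1071 Internet checksum over an IPv4 header. */
static uint16_t ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)hdr[i] << 8 | hdr[i + 1];
    if (len & 1)
        sum += (uint32_t)hdr[len - 1] << 8;
    while (sum >> 16)                  /* fold carries back into low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* NAT: replace the source IP (header bytes 12..15) and refresh the
 * header checksum (bytes 10..11). ihl_bytes is the header length. */
static void nat_rewrite_src(uint8_t *hdr, size_t ihl_bytes, uint32_t new_src)
{
    hdr[12] = (uint8_t)(new_src >> 24);
    hdr[13] = (uint8_t)(new_src >> 16);
    hdr[14] = (uint8_t)(new_src >> 8);
    hdr[15] = (uint8_t)(new_src);
    hdr[10] = hdr[11] = 0;             /* checksum field is zeroed first */
    uint16_t c = ip_checksum(hdr, ihl_bytes);
    hdr[10] = (uint8_t)(c >> 8);
    hdr[11] = (uint8_t)(c);
}
```

A handy property of the ones-complement scheme: summing a header that contains a correct checksum folds to 0xFFFF, so the complement is zero, which makes validation a one-line check.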

Implementing this in C language wouldn't be much easier, because of the lack of any type of parallel operating system or library support and the need to interact with the underlying NP at the hardware level. It could easily take months to develop and debug the code from scratch. Even if reference code is available, the time to integrate and test the resultant application would take weeks. If future changes or upgrades are required, not only must the changed packet logic be debugged and tested, but also the re-partitioning and re-mapping to the underlying hardware.

A new approach is needed
A better approach is to employ powerful software abstraction techniques that let application programmers focus on the information being transferred rather than on the bits. One way to do this is with a high-level, application-specific programming model. Taking this approach one step further, the programmer could work in a functional language naturally implemented as a virtual machine (VM) atop the NP. Implementing a VM in software atop the NP gives the programmer an architecture-independent environment with the potential to be completely portable and scalable. There are other significant benefits as well, such as superior robustness from built-in logic for bounds checking, null pointer/handle checking, and other exception handling. Lastly, a VM approach allows for advanced concepts such as dynamic compilation.
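The robustness argument can be made concrete with a toy sketch (the `vm_packet` handle and accessor below are hypothetical, not from any shipping VM): if every field read goes through a bounds-checked accessor, a malformed or truncated packet raises a clean fault to the VM instead of reading past the buffer.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical VM-style packet handle: data pointer plus length. */
typedef struct {
    const uint8_t *data;
    size_t         len;
} vm_packet;

/* Read a big-endian 16-bit field at a given offset.
 * Returns 0 on success, -1 on a bounds fault (no out-of-range access). */
static int vm_read_u16(const vm_packet *p, size_t off, uint16_t *out)
{
    if (off > p->len || p->len - off < 2)
        return -1;                     /* fault raised to the VM, not a crash */
    *out = (uint16_t)(p->data[off] << 8) | p->data[off + 1];
    return 0;
}
```

A hand-coded NP data path would typically skip such checks for speed; a domain-specific VM can build them in once, correctly, for every program that runs on it.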

Referring back to the SBC example, a programming model that provides a robust set of built-in algorithms, such as a high-performance n-tuple table lookup (typically required for NAT) or a standard way to encrypt/decrypt or authenticate packets (required for secure media), decreases the effort of programming and debugging the application by at least an order of magnitude. Because the programming model could be tailored to a specific application class (like packet processing), it could include a focused set of data types, control mechanisms, and built-in values useful in dealing with VoIP RTP packets/streams, further abstracting the programming task.

In a Packet Processing Language (PPL) programming model, the fundamental object is the packet, for which the PPL VM provides a rich set of packet-handling functions, such as header insertion/stripping, connection/session tracking, content inspection, and rate monitoring. In addition, the PPL software has built-in lower-level functions, such as automatically calculating key packet state values (for example, the offset and length of the various headers and whether or not the packet is part of a fragment), and it allows common packet fields to be accessed symbolically (see the figure). But the most important aspect of this approach is that it lets users express their logic in this high-level manner and then automatically maps that logic onto a parallel, complex NP, yielding a high-performance implementation.
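Symbolic field access of the kind described can be sketched as follows. This is an invented illustration, not PPL itself: the programmer names a field, and the runtime derives its offset from the packet (here, the IPv4 IHL field supplies the layer-4 offset), with truncated packets rejected cleanly.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical symbolic field names and a resolver that maps them to
 * byte offsets computed from the packet itself. */
enum pkt_field { IP_SRC, IP_DST, UDP_SRC_PORT, UDP_DST_PORT };

typedef struct { size_t off, len; } field_loc;

static int locate_field(const uint8_t *pkt, size_t pkt_len,
                        enum pkt_field f, field_loc *loc)
{
    if (pkt_len < 20 || (pkt[0] >> 4) != 4)
        return -1;                             /* not a plausible IPv4 packet */
    size_t l4 = (size_t)(pkt[0] & 0x0F) * 4;   /* IHL gives the layer-4 offset */
    switch (f) {
    case IP_SRC:       loc->off = 12;     loc->len = 4; break;
    case IP_DST:       loc->off = 16;     loc->len = 4; break;
    case UDP_SRC_PORT: loc->off = l4;     loc->len = 2; break;
    case UDP_DST_PORT: loc->off = l4 + 2; loc->len = 2; break;
    default: return -1;
    }
    return (loc->off + loc->len <= pkt_len) ? 0 : -1;  /* bounds-checked */
}
```

The point is that "read the UDP destination port" becomes a single symbolic request, with offset arithmetic, options handling, and bounds checks done once in the runtime rather than repeated in every application.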

The primary features and benefits of PPL software center around primitives and complex algorithms that process IP packets.

Note that abstraction can come at a price, usually a real or perceived performance penalty compared to ideal performance. This is where an application-specific domain like packet processing is beneficial. By focusing on a particular application area, the VM can be built with highly optimized, best-of-breed algorithms and state machines for that domain. These hand-tuned implementations will often outperform those written by NP application designers, and the overall VM architecture can be optimized to the domain-specific processing and data flow characteristics. Lastly, one could argue that final system performance will be greater when all system components are available as early in the product cycle as possible, allowing more time to analyze and optimize under real-world conditions.

Another way to look at the performance tradeoff of using abstraction is to consider that increased design complexity will soon get to the point where it constricts the ability to tune and optimize systems at the lowest levels. In simpler terms, the more complex the application gets, the harder it is to optimize on a sufficiently complex piece of hardware. This is where the VM approach comes into play. The VM can be implemented by NP experts, who have the expertise and can take the time to optimize the VM implementation. These optimizations can then be leveraged by using the high-level programming language.

One final point: abstraction isn’t the only solution to today's networking problems. There are clear-cut cases for highly tuned and optimized designs for some network elements, such as core routers and switches. Nevertheless, the complexity crisis is very real and needs to be addressed.

About the author
Kevin Graves is the CTO of IP Fabrics. He holds a BS in computer science from Pennsylvania State University. Graves can be reached at [email protected].
