This is a problem for network-management systems that rely on unique addresses throughout the domains they manage. Unique addresses link fault and performance statistics to database entries. Network topology is in part derived from addressing and mask combinations found in routing tables and routing caches. When an IP address is duplicated, the results are ambiguous at best. But more commonly, the duplication results in erratic network data that causes the network-management system to render false and misleading topology and availability assumptions.
The most complete solution would be to readdress the managed networks and lose NAT altogether. But that's easier said than done. NAT solves the problem of nonunique IP addressing by translating the source and destination header address from one routed interface to different addresses on another routed interface. On its own this does not create duplicate addresses, but networks that have deployed the private addresses in RFC 1918 will likely have duplicate addresses, hence the need for NAT.
Readdressing isn't practical or financially viable in many cases. The amount of work necessary to readdress a network is immense. If network management is outsourced, the situation gets even stickier. The customer would have to make an unacceptably large commitment to the management service provider to endure the pain (and cost) of readdressing.
Another approach is to manage the network from the inside, or to install an appliance on the inside that reports to the outside. SilverBack Technologies offers a service that uses such an appliance, and Peregrine Systems' InfraTools uses this approach. This method requires an appliance at every site; the trade-off is scalability. It obviates readdressing, but in situations where a single network uses duplicate addresses, the appliance approach overpartitions network-management domains.
Two other approaches are more practical. Neither is perfect, but both work. One is a free configuration workaround from Hewlett-Packard for NNM (Network Node Manager), and the other is IBM's translation product, CNAT (Comprehensive Network Address Translation). These solutions eliminate the silo problem that management systems face every time a new network is connected via NAT.
Static NAT Works Best
NAT can describe three different techniques: static, dynamic and port (or overload, in Cisco Systems terms). To manage through a NAT, addresses must be mapped one-to-one from one side of the NAT to the other, which means you must have a static mapping.
Let's clarify what we mean by NAT. A NAT router identifies which interface represents which address domain. These interfaces are referred to as the inside and the outside interfaces, or the true and the normalized interfaces, or private and public interfaces. The most common relationship is for the private side to represent RFC 1918 addresses and public side to represent the Internet routable addresses, but the relationship for our purposes is about addresses that are duplicated (private) and those that are unique (public).
Commonly, NAT is used to describe the process of mapping a single public address to many private addresses. This process is known as PAT (Port Address Translation), also referred to as NAPT (Network Address Port Translation), and is often used for SOHO (small office/home office) applications. A second commonly deployed NAT method is dynamic NAT, in which a pool of addresses on either the public or the private side of a NAT router is used to provide a temporary mapping for address translation of a session that crosses the NAT device.
Unfortunately, neither is usable for network-management systems, which depend on unique addresses to index the devices in their databases. Once a device is indexed, performance, fault, inventory statistics and state are attached to and referenced by that address.
Neither dynamic NAT nor PAT can ensure a unique address. In the case of PAT, the address for every device would be a single address, making the SNMP statistics unreliable. The traffic can be monitored by type, but if the SNMP statistics on the target machine are going to be monitored, a specific address must be mapped in the network-management database. In the case of dynamic NAT, the actual device the address represents could change from SNMP get to get, misaligning the statistics and availability. For this reason, a static mapping of a public address to a private address is mandatory to deal with the collection of SNMP data in a NAT-connected network.
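The one-to-one requirement can be checked mechanically. The sketch below is only an illustration: the function name, the table layout and the addresses are assumptions, not part of any NAT or management product. It reports any public address that more than one private host shares, which is exactly the condition that breaks device indexing.

```python
from collections import Counter

def duplicate_public_addresses(nat_table):
    """Return public addresses that more than one private host maps to."""
    counts = Counter(nat_table.values())
    return sorted(addr for addr, n in counts.items() if n > 1)

# Hypothetical tables: keys are (site, private address), values are public addresses.
static_nat = {
    ("site1", "176.230.92.92"): "20.20.92.92",
    ("site2", "176.230.92.92"): "30.30.92.92",
}
pat = {
    ("site1", "176.230.92.92"): "20.20.92.1",
    ("site1", "176.230.92.93"): "20.20.92.1",  # two hosts behind one address
}

print(duplicate_public_addresses(static_nat))  # no shared public addresses
print(duplicate_public_addresses(pat))         # the PAT address is shared
```

A static table passes this check; a PAT-style table does not, because every private host collapses onto the same public address.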
Getting Around NAT With NNM
The white paper "Managing Duplicate IP Address Domains with NNM", written by Hewlett-Packard's Pete Zwetkof, describes a way to configure HP's OpenView NNM. The approach modifies the default database population method, forcing NNM to carry devices with duplicate addresses.
A noDiscovery NETMON file is created with entries for the duplicate private addresses. This prevents those private addresses from being discovered as the network-management station walks MIBs that hold routing information about the private address space. To test NNM, we created an entry for 176.230.0.0/16, as both of our private networks used this address range. This kept NNM from automatically discovering the devices on the private side of the NAT.
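The effect of such an exclusion can be sketched in a few lines of Python. This is not NNM's file format or its discovery code; it only illustrates the filtering idea, using the standard `ipaddress` module and a hypothetical list of addresses learned from device MIBs.

```python
import ipaddress

# The duplicated private range from our test; treated here as an exclusion list.
EXCLUDED = [ipaddress.ip_network("176.230.0.0/16")]

def discoverable(addresses, excluded=EXCLUDED):
    """Keep only the addresses that fall outside every excluded network."""
    keep = []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if not any(addr in net for net in excluded):
            keep.append(text)
    return keep

# Hypothetical addresses a management station might learn from routing tables.
hints = ["176.230.92.92", "20.20.92.92", "176.230.1.1", "30.30.92.92"]
print(discoverable(hints))
```

Only the unique public-side addresses survive the filter, which is why the private-side devices then have to be added by hand.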
We added the devices we wanted to manage on the private side of the NAT to our host file, so naming could resolve device names. We could have used DNS or NIS as well. We used address entries that pointed to the public addresses. Remember that on the public side all addresses are unique for routing and management purposes.
Finally, we added the devices to the database via the loadhosts command using the public unique address. The devices can't be discovered automatically because the addresses in MIB II provide hints to the management station regarding the duplicated private addresses.
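As a rough illustration of the host-file step, the sketch below formats device names and their public addresses as conventional hosts-file lines (address first, then name). The device names are made up for the example, and the way the result is handed to loadhosts is deliberately left out, because the command's options are not covered here.

```python
def hosts_lines(devices):
    """Format name -> public address pairs as hosts-file lines, address first."""
    return [f"{addr}\t{name}" for name, addr in sorted(devices.items())]

# Hypothetical device names; the addresses are the public (unique) side of the NAT.
devices = {
    "cust1-server1": "20.20.92.92",
    "cust2-server2": "30.30.92.92",
}

for line in hosts_lines(devices):
    print(line)
```

The point is simply that every entry refers to the public address, never the duplicated private one.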
Filtering the topology data via the noDiscovery NETMON file prevents NNM from viewing devices as duplicates. However, the SNMP MIB data retrieved still includes duplicate addresses, and the workaround applies only to HP OpenView.
Using CNAT to Get Around NAT
CNAT, from the NetView group of IBM's Tivoli Systems, translates SNMP payload addresses. The idea is straightforward: translate the private address into the public address and vice versa. The beauty of this solution lies in its simplicity, because any network-management system can leverage its translation. But in testing CNAT, we found a big-time bugaboo.
The basic application of CNAT is as a default router on the subnet to which the network-management system is attached. The CNAT passively monitors IP source and destination addresses in IP headers. Then, based on rules defined by the CNAT user, it translates IP addresses in the SNMP payload. It sounds simple, but during testing our heads hurt just about every day. Here's why.
The translation rules are bidirectional, which means they apply regardless of the traffic direction. This simplicity is elegant in application and requires little work from the host OS and hardware. We used an IBM 43p-140 with a 200-MHz PowerPC 604e and 128 MB of RAM. We weren't testing performance and didn't create much traffic, but the platform never got above 1 percent CPU utilization. IBM says its performance tests show less than 2 percent utilization when translating 1,000 events per minute.
The CNAT device partitions NAT-translated addresses into rule sets designed to coincide with the duplicate private address sites or customer sites. Every time a header IP address matches a translation rule within a rule set, a translation occurs. Source and destination don't matter.
For example, suppose NAT translates the private address 10.0.0.1 to a unique public address. As the packet traverses the CNAT, that public header address is looked up and a match found. Then, within the SNMP payload, every instance of 10.0.0.1 is translated to the same public address.
The secret ingredient that allows for address translation with little overhead or delay lies in the CNAT MIB scanner. The scanner acts as a MIB compiler in that it defines MIB II and enterprise MIBs to the CNAT. Then, when a CNAT translation rule matches a header address, the possible locations within scanned MIBs are known, so the entire payload does not need to be scanned for octets that could be IP addresses.
CNAT has rule sets. So while, from a service provider's point of view, many customer networks might have duplicate private addresses, like 10.0.0.1, these rule sets provide the mechanism to keep customer performance and fault data unique. This is done by having different translation rules in each rule set or mapping the nonunique private addresses to the unique public addresses.
For example, in "How CNAT Gets Around NAT," Customer 1 and Customer 2 both use private subnet address 176.230.92.0. Within that subnet, address 176.230.92.92 belongs to two different Microsoft Windows NT servers: Customer 1's Server 1 is attached to Router B and Customer 2's Server 2 is attached to Router C.
When a packet from Server 1 passes through NAT, its private 176.230.92.92 source address is translated to public IP address 20.20.92.92. Similarly, when a packet from Server 2 passes through NAT, its private 176.230.92.92 source address is translated to public IP address 30.30.92.92. When the packets arrive at the CNAT, even though the header addresses are translated, the SNMP MIB data carried in the payload is decoded bearing the private addresses in the 176.230.92.* range for both Server 1 and Server 2. Server 1's 20.20.92.92 header IP address triggers rule set one in the CNAT to translate all instances of 176.230.92.* into 20.20.92.*. Server 2's 30.30.92.92 header IP address triggers rule set two, which is defined to translate all 176.230.92.* addresses in the MIB into 30.30.92.*. The rule sets can hold multiple translation rules and support aggregation through the use of subnet masks.
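The rule-set mechanism can be modeled roughly as follows. This is a simplified sketch, not CNAT's implementation: the names `RULE_SETS`, `remap` and `translate_payload` are invented for illustration, the payload is reduced to a plain list of address strings, and only the private-to-public direction is modeled.

```python
import ipaddress

def remap(addr, src_net, dst_net):
    """Move addr from src_net to the same host offset in dst_net."""
    offset = int(addr) - int(src_net.network_address)
    return ipaddress.ip_address(int(dst_net.network_address) + offset)

# One (private network, public network) rule per rule set, expressed with masks.
RULE_SETS = {
    "one": (ipaddress.ip_network("176.230.92.0/24"), ipaddress.ip_network("20.20.92.0/24")),
    "two": (ipaddress.ip_network("176.230.92.0/24"), ipaddress.ip_network("30.30.92.0/24")),
}

def translate_payload(header_addr, payload_addrs):
    """Pick the rule set by header address, then rewrite private payload addresses."""
    header = ipaddress.ip_address(header_addr)
    for private_net, public_net in RULE_SETS.values():
        if header in public_net:
            out = []
            for text in payload_addrs:
                addr = ipaddress.ip_address(text)
                if addr in private_net:
                    out.append(str(remap(addr, private_net, public_net)))
                else:
                    out.append(text)
            return out
    return list(payload_addrs)  # no rule set matched the header; leave payload alone

print(translate_payload("20.20.92.92", ["176.230.92.92"]))  # Server 1 via rule set one
print(translate_payload("30.30.92.92", ["176.230.92.92"]))  # Server 2 via rule set two
```

The same private address in the payload comes out as a different public address depending on which header address selected the rule set, which is what keeps the two customers' data apart.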
This worked flawlessly, letting us manage both NT Server 1 and Server 2 devices as if they had unique addresses. However, we did find that on the NAT routers Router B and Router C, the 176.230.*.* addresses were leaking out. This caused the network-management stations no end of confusion, because to them it appeared that the addresses 176.230.*.* were moving back and forth from Router B to Router C.
The leakage came from the appearance of the public addresses in the MIB, mainly the IP and AT tables. The public addresses were fine, of course, but when CNAT processed them, the header translation rule translated them into private addresses. This drove us crazy until we realized, with IBM's help, that the translation rules are bidirectional. So if 176.230.92.* is defined to be translated in rule set one to 20.20.92.*, the reverse also applies. While this simplifies rule definition and reduces the processing needed by CNAT, which doesn't have to consider direction, it means that 20.20.92.* gets translated to 176.230.92.*. And because rule set two likewise translates 30.30.92.* to 176.230.92.*, our management station was left in a very confused state.
The basic workaround is to add a negative discovery entry into the seed file for the duplicate private address space 176.230.*.*, as we had done with HP NNM. This let us discover the 20.20.92.* and 30.30.92.* networks and populate the database.
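A few lines of Python make the side effect of a bidirectional rule concrete. Again this is only a sketch under assumptions: the `swap` function is invented for illustration, and the two router interface addresses are hypothetical stand-ins for what a NAT router's IP table might report.

```python
import ipaddress

PRIVATE = ipaddress.ip_network("176.230.92.0/24")
PUBLIC_ONE = ipaddress.ip_network("20.20.92.0/24")

def swap(text, net_a, net_b):
    """Bidirectional rule: an address in either network is moved to the other."""
    addr = ipaddress.ip_address(text)
    for src, dst in ((net_a, net_b), (net_b, net_a)):
        if addr in src:
            offset = int(addr) - int(src.network_address)
            return str(ipaddress.ip_address(int(dst.network_address) + offset))
    return text

# A NAT router reports both its private-side and public-side interfaces
# (hypothetical addresses).
router_b_payload = ["176.230.92.1", "20.20.92.1"]
print([swap(a, PRIVATE, PUBLIC_ONE) for a in router_b_payload])
```

The private-side address is correctly made public, but the already-public address is flipped back into the private range, so a 176.230.92.* address still reaches the management station.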
As we came to learn, CNAT was designed to replace NAT routers and to provide SNMP and header translation as well. IBM soon found that most organizations did not want to replace their NAT devices with general-purpose AIX boxes. If CNAT is located at the translation border, however, a translation rule can be defined to avoid the leakage.
Bruce Boardman is executive editor of Network Computing, testing and writing on network systems and management. He has 12 years' IT experience managing networks and distributed computing for a financial service provider. Send your comments on this article to Bruce Boardman at firstname.lastname@example.org.