We all make mistakes, but some networking mistakes wreak more havoc than others. Avoiding a few common errors will drastically increase your LAN uptime and decrease your troubleshooting time and frustration level. Here is a top 10 list of errors, omissions, misconfigurations and points of confusion that lead to network mayhem.
10. Mismatched Masks
We start our top 10 countdown with this little goof, which is either an honest mistake or the result of someone not paying attention. Maybe you’re used to typing /24, or maybe you didn’t catch that the documented netmask was actually /20. Mismatched masks also crop up when a network is expanded and its mask is shortened. The new mask may get pushed to most devices and DHCP clients, but a few pesky manually configured or forgotten endpoints get left out.
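To see why a mistyped mask causes trouble, here is a minimal sketch using Python's standard `ipaddress` module. The addresses and the `on_link` helper are made up for illustration; the point is only that the same host address with a /24 instead of the documented /20 reaches a different conclusion about which neighbors are local.

```python
import ipaddress

def on_link(host_cidr: str, peer_ip: str) -> bool:
    """Return True if the host, using its own mask, treats peer_ip as local."""
    iface = ipaddress.ip_interface(host_cidr)
    return ipaddress.ip_address(peer_ip) in iface.network

# Illustrative addressing: the documented network is 10.1.0.0/20
# (10.1.0.0 through 10.1.15.255).
correct_host = "10.1.2.10/20"
mistyped_host = "10.1.2.10/24"   # same address, wrong mask
peer = "10.1.5.20"               # a neighbor inside the same /20

print(on_link(correct_host, peer))   # True: the host ARPs for the peer directly
print(on_link(mistyped_host, peer))  # False: the host hands the packet to its gateway
```

The mistyped host may still reach the peer through the gateway, which is why this mistake can go unnoticed for a long time.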
9. Lack Of Fiber Standard Familiarity
There are a lot of fiber standards out there, and most of the confusion surrounds multimode fiber since there are so many flavors with different core sizes and bandwidths. These ratings have everything to do with the quality and transmission rates of data, and they’ll determine the maximum distance for any given optic. If you’re moving from 1 GbE to 10 GbE, you must know your fiber types, including core and bandwidth, as well as distance before you can proceed.
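As a rough illustration of why fiber type matters when moving from 1 GbE to 10 GbE, the sketch below checks a run length against commonly cited nominal reach figures for short-wavelength multimode optics. The table and the `link_supported` helper are assumptions for this example, not a substitute for your optic's datasheet, which should always have the last word.

```python
# Commonly cited nominal reach in meters for 850 nm multimode optics.
# Treat these as illustrative; confirm against the vendor datasheet.
REACH_M = {
    "1000BASE-SX": {"OM1": 275, "OM2": 550, "OM3": 550, "OM4": 550},
    "10GBASE-SR": {"OM1": 33, "OM2": 82, "OM3": 300, "OM4": 400},
}

def link_supported(standard: str, fiber: str, run_length_m: float) -> bool:
    """Return True if the run length is within the nominal reach."""
    return run_length_m <= REACH_M[standard][fiber]

# A 200 m run of older OM1 fiber (hypothetical building riser).
print(link_supported("1000BASE-SX", "OM1", 200))  # True
print(link_supported("10GBASE-SR", "OM1", 200))   # False
print(link_supported("10GBASE-SR", "OM3", 200))   # True
```

The same cable plant that carried 1 GbE comfortably can fall well short at 10 GbE, so check every run before ordering optics.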
8. Link Aggregation Dilemmas
As if it weren’t messy enough due to verbiage discrepancies, link aggregation got a little hairier with the advent of advanced server and storage systems. These newer devices typically utilize multiple links without the use of link aggregation on the switch side. In cases where link aggregation is required, be sure to check your vocabulary and settings. Some devices do better with LACP, but dynamic LACP requires configuration. Aside from the LACP standard, Cisco has EtherChannel and HP has trunking. Symptoms of misconfiguration may look like spanning tree issues, ports being shut down and dropped packets across the link.
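The vocabulary problem is easier to see with a small compatibility check. This sketch uses Cisco's EtherChannel mode keywords (`on`, `active`, `passive`, `desirable`, `auto`) as an example; other vendors use different terms for the same ideas, and the `bundle_forms` function is a hypothetical helper, not a vendor tool.

```python
LACP_MODES = {"active", "passive"}    # LACP: negotiated
PAGP_MODES = {"desirable", "auto"}    # PAgP: Cisco's own negotiation protocol

def bundle_forms(mode_a: str, mode_b: str) -> bool:
    """Return True if two ends with these modes are expected to form a bundle."""
    modes = {mode_a, mode_b}
    if modes == {"on"}:
        return True                   # static on both ends, no negotiation
    if modes <= LACP_MODES:
        return "active" in modes      # at least one end must initiate
    if modes <= PAGP_MODES:
        return "desirable" in modes   # at least one end must initiate
    return False                      # mixed static/dynamic or mixed protocols

print(bundle_forms("active", "passive"))   # True
print(bundle_forms("passive", "passive"))  # False: nobody starts the negotiation
print(bundle_forms("on", "active"))        # False: static on one end, LACP on the other
```

The mixed static/dynamic case is the one most likely to produce the spanning tree complaints and shut-down ports described above.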
7. Relying On Auto Negotiation Settings
Speed and duplex auto-negotiation is standardized and available on virtually every switch. However, the sad truth is that many devices still don’t do a good job with this menial task. When working with critical connections or endpoints, always manually match the speed and duplex settings on both ends when possible. Most endpoints will still work with incorrect speed and duplex settings, though often with poor throughput, but inter-switch links and connections to media converters usually won’t pass packets until the settings match.
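The reason both ends must match is that an auto-negotiating port facing a hard-set port has nothing to negotiate with and typically falls back to half duplex. The sketch below is a simplified model of that behavior, assuming both ports support full duplex and already agree on speed; `resolve_duplex` and `has_mismatch` are illustrative names.

```python
from typing import Tuple

def resolve_duplex(side_a: str, side_b: str) -> Tuple[str, str]:
    """Each argument is 'auto', 'full', or 'half'.

    Returns the duplex each side ends up using in this simplified model.
    """
    if side_a == "auto" and side_b == "auto":
        return ("full", "full")   # both advertise and agree on the best mode
    if side_a == "auto":
        return ("half", side_b)   # no negotiation partner: fall back to half
    if side_b == "auto":
        return (side_a, "half")
    return (side_a, side_b)       # both hard-set: each uses what it was told

def has_mismatch(side_a: str, side_b: str) -> bool:
    a, b = resolve_duplex(side_a, side_b)
    return a != b

print(has_mismatch("auto", "auto"))  # False
print(has_mismatch("full", "full"))  # False
print(has_mismatch("auto", "full"))  # True: auto side lands on half duplex
```

In other words, hard-setting only one end is worse than leaving both on auto.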
6. VLAN Delusions
If there’s one basic networking concept that lingers as a dark unknown in most people’s minds, it’s VLANs. If you can’t wrap your head around them, you’re not alone. The three most confusing concepts seem to be: understanding when to tag and untag (or, in Cisco terms, trunk and access port); when and where to extend VLANs in the network; and when something is being switched at Layer 2 on a VLAN and routed at Layer 3 from a VLAN with an IP. If you can master these three issues, then you’ll have very few VLAN woes when designing and troubleshooting.
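For the first of those three concepts, it can help to model a port as one optional untagged VLAN plus a set of tagged VLANs. The following is a toy model, not any vendor's implementation; the `Port` class, the `egress` function, and the VLAN numbers are all made up for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class Port:
    untagged_vlan: Optional[int] = None       # access/native VLAN, if any
    tagged_vlans: FrozenSet[int] = frozenset()  # VLANs carried with a tag

def egress(port: Port, vlan_id: int) -> str:
    """Describe how a frame belonging to vlan_id leaves this port."""
    if vlan_id == port.untagged_vlan:
        return "untagged"
    if vlan_id in port.tagged_vlans:
        return "tagged"
    return "dropped"   # the port is not a member of that VLAN

# An edge port for a PC in VLAN 10 (an "access port" in Cisco terms).
access = Port(untagged_vlan=10)
# An uplink carrying VLANs 10 and 20 with tags (a "trunk" in Cisco terms).
uplink = Port(untagged_vlan=1, tagged_vlans=frozenset({10, 20}))

print(egress(access, 10))  # untagged
print(egress(access, 20))  # dropped
print(egress(uplink, 20))  # tagged
```

If a VLAN needs to reach another switch, it has to be a member of every uplink port along the path, which is where most "extend the VLAN" mistakes happen.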
5. Recruiting The Wrong Resources
Although not technical, having the right personnel is a major determining factor of how smoothly a network will run. Technologists and managers of various sorts routinely find themselves in a position of having to handle the daily tasks or oversight of network management. Everyone has strengths and weaknesses; it’s in everyone’s best interest not to put a bench technician, a security manager or anyone without enterprise networking experience in a role outside his or her competency.
4. Spanning Tree
If I have to pick the one standard I hate most, it’s spanning tree. Used properly, STP wouldn’t be the horrendous mess it has become. But what was meant to be a protocol for link redundancy quickly turned into the go-to feature for preventing loops at the network edge. If you truly need STP for link path redundancy, then make sure you not only enable it but also configure it completely. STP configuration is holistic and must be done throughout the entire infrastructure. If you just need loop protection, check your switches for other options, like Cisco loop protection and HP loop guard.
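One concrete reason to configure STP completely rather than just enabling it is root bridge election: the bridge with the lowest bridge ID (priority first, then MAC address) becomes root. The sketch below is a simplified model of that comparison with made-up switch names and MAC addresses; it ignores per-VLAN details and only shows what happens when every switch is left at the common default priority of 32768.

```python
from typing import List, Tuple

Bridge = Tuple[str, int, str]   # (name, priority, MAC address)

def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer for comparison."""
    return int(mac.replace(":", ""), 16)

def elect_root(bridges: List[Bridge]) -> str:
    """Return the name of the bridge with the lowest (priority, MAC) pair."""
    return min(bridges, key=lambda b: (b[1], mac_to_int(b[2])))[0]

# Everything left at default priority: the lowest MAC wins, which in this
# hypothetical network is an old closet switch rather than the core.
defaults = [
    ("core-switch", 32768, "00:25:90:aa:bb:cc"),
    ("old-closet-switch", 32768, "00:0a:41:11:22:33"),
]
print(elect_root(defaults))   # old-closet-switch

# Lowering the core's priority makes the outcome deliberate.
configured = [
    ("core-switch", 4096, "00:25:90:aa:bb:cc"),
    ("old-closet-switch", 32768, "00:0a:41:11:22:33"),
]
print(elect_root(configured))  # core-switch
```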
3. Default Gateways
Misconfigured default gateways are sneaky little things, and they’ll fly under the radar until something changes in the network. When making changes, always be sure to document and update default gateways on endpoints, servers, appliances and your edge switches. Your device will use its configured default gateway as the next hop to look for anything outside the network. If you’re getting lost pings, asymmetrical pings or a partial traceroute, you may have a misconfigured gateway on a device.
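A quick sanity check that catches many stale gateways is confirming that the configured gateway actually sits inside the device's own subnet. Here is a minimal sketch using Python's standard `ipaddress` module; the `gateway_plausible` helper and the addresses are hypothetical, and passing the check does not prove the gateway is the right one, only that it is reachable on-link.

```python
import ipaddress

def gateway_plausible(host_cidr: str, gateway: str) -> bool:
    """Return True if the gateway is inside the host's subnet and is not the host itself."""
    iface = ipaddress.ip_interface(host_cidr)
    gw = ipaddress.ip_address(gateway)
    return gw in iface.network and gw != iface.ip

# A device renumbered into 192.168.10.0/24 (illustrative addressing).
print(gateway_plausible("192.168.10.25/24", "192.168.10.1"))  # True
# The same device still pointing at the gateway of its old network.
print(gateway_plausible("192.168.10.25/24", "192.168.1.1"))   # False
```

Running a check like this across exported device configs after a network change is a cheap way to find the endpoints that were missed.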
2. Duplicate IPs
Duplicate IPs earn the No. 2 spot for the sheer chaos that frequently ensues, especially when a device takes on the IP of a default gateway or routing device. If a routing device’s IP is hijacked (maliciously or inadvertently), the Layer 2 traffic within networks will keep flowing as normal, but packets seeking a path out get misdirected. Just this month, I saw a de-provisioned ISP router at a location suddenly spring back to life with an IP that was in use elsewhere. In another case, a printing system was erroneously installed with the wrong IP. In both cases, the misfit device hijacked the IP of a main gateway and wreaked havoc across the network. Duplicate IPs are best traced with good management and monitoring tools, and by comparing the MAC address recorded for the IP in device routing and ARP tables against the MAC address you expect.
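The MAC-to-IP comparison is easy to automate once ARP entries have been collected from several devices. The sketch below assumes the entries are already available as `(ip, mac)` pairs; how you gather them (SNMP, CLI scraping, a monitoring tool) is outside this example, and the sample data is invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

def find_duplicate_ips(arp_entries: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return each IP that was seen with more than one MAC address."""
    macs_by_ip = defaultdict(set)
    for ip, mac in arp_entries:
        macs_by_ip[ip].add(mac.lower())   # normalize case before comparing
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}

# Invented sample: the gateway address shows up with two different MACs.
entries = [
    ("10.0.0.1", "00:11:22:33:44:55"),   # the real gateway
    ("10.0.0.50", "00:11:22:33:44:66"),
    ("10.0.0.1", "AA:BB:CC:DD:EE:FF"),   # a misfit device claiming the same IP
]
print(find_duplicate_ips(entries))
# {'10.0.0.1': ['00:11:22:33:44:55', 'aa:bb:cc:dd:ee:ff']}
```

Keep in mind that some redundancy setups legitimately move an IP between MAC addresses, so treat a hit as a lead to investigate rather than proof of a conflict.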
1. “While I'm In There” Syndrome
Topping our list is this malady, to which, sadly, many a colleague has fallen victim and learned a tough lesson as a result. For many people, networks are set-it-and-forget-it. But if you don’t manage your devices regularly, then diving in for configuration can lead to trouble. It’s easy to think, “While I'm in there, I'll just go ahead and do [xyz].” If you’re not managing your network on a daily basis, then I especially encourage you to avoid this tactic. Frequently, something doesn’t function properly after a change, and it’s much more work to pick apart which haphazard changes were made and reverse them than it is to start with an orderly change plan.
This wraps up our top 10 list. The next time you’re scratching your head over a networking issue, refer back to this list and see if one of these common mistakes may be the source of the problem.