I’ve decided I have some real reservations about OSPF routing. I was going to title this blog “Is OSPF Evil?” but decided that is not appropriate. At worst, OSPF is a pain.
Why? I keep seeing people with network complexity deriving from use of OSPF and/or — as an indirect cause — the difficulty or effort of restructuring OSPF to accommodate changed design needs.
The key problems I see people having:
- Inflexible OSPF area structure
- OSPF’s limitations on filtering prefixes
- Consequent pockets of BGP or separate OSPF processes to filter when redistributing — sometimes to the point where it seems BGP is becoming the IGP for the organization
- Using VLANs to extend many WAN areas and area 0 between two datacenters
What usually compounds such problems is that people deploy their routing design without first documenting it and considering alternatives. After many years, I’ve discovered that trying to write up and diagram a routing scheme usually rubs my nose in things I hadn’t considered.
Recommendation: Shake out the problems on paper. Discover the complexity by trying to describe how the routing is supposed to work, rather than by discovering it live! This also helps later when you’re trying to remember how your routing was supposed to work, especially if it is complex. Pro tip: Don’t do complex; you’ll hate yourself later!
One example where filtering would help: customers with a MAN and MPLS WAN mix. They wanted to use a 10 Gbps link between two datacenters for replication. I’d summarize sites over the MAN (and in MPLS BGP) and advertise more specifics for the point-to-point link and for the replication endpoint prefixes. But the replication prefix addressing was not amenable to filtering by summarization in OSPF. With EIGRP, fine, you can filter wherever you need to. With OSPF, you can suppress more specifics by summarizing at an area or redistribution boundary, but with that addressing it wasn’t an option. Result: complexity.
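To make the addressing problem concrete, here is a minimal Python sketch using the standard `ipaddress` module. The prefixes are hypothetical, not the customers' real addressing, and the `smallest_summary` helper is mine for illustration. It assumes the replication subnet sits in the middle of the site's address block, so any summary that covers the site also swallows the replication prefix.

```python
import ipaddress

# Hypothetical addressing, for illustration only.
site_prefixes = [
    ipaddress.ip_network("10.20.0.0/24"),
    ipaddress.ip_network("10.20.1.0/24"),
    ipaddress.ip_network("10.20.3.0/24"),
]
# The replication subnet is wedged between the site subnets.
replication_prefix = ipaddress.ip_network("10.20.2.0/24")


def smallest_summary(networks):
    """Return the smallest single prefix that covers every network in the list."""
    summary = networks[0]
    while not all(net.subnet_of(summary) for net in networks):
        summary = summary.supernet()  # widen by one bit and retry
    return summary


site_summary = smallest_summary(site_prefixes)
print(site_summary)                                  # 10.20.0.0/22
print(replication_prefix.subnet_of(site_summary))    # True
```

Because the replication prefix falls inside the site summary, suppressing more specifics in favor of that summary hides the replication prefix as well. With addressing carved from a separate block, the second check would come back `False` and summarization alone would do the job.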
Tentative conclusion: EIGRP flexibility gives you tools for traffic engineering as a network evolves.
If you’ve been reading Twitter and blogs, a number of people who should know have been noting that, up to a point, putting everything in area 0 is possible with more RAM and faster CPUs in routers. Having said that, it’s not necessarily a great idea as you scale up, particularly on a WAN.
Personally, I’ve been liking EIGRP for WAN, campus, and datacenter use. It’s more flexible, it behaves well (especially if you summarize properly), and best of all, I can filter where and when I need to. Not that I espouse having routing filters/distribute lists all over the place in uncontrolled fashion; that’s a recipe for 3 a.m. CCIE head scratching and tedious checking of filters.
The one snag is that almost nobody’s firewalls speak EIGRP. And I want dynamic routing to firewalls, as I consider static route/IP SLA games to be more complex. Just do BGP to receive 0/0, advertise a public prefix (with filters), redistribute into OSPF at the edge router (or L3 switch). I don’t trust firewall code to do “fancy routing” well, so I draw the line at informing them — no redistribution, area boundaries, etc. on the firewalls.
WHERE TO USE OSPF?
My current design conclusions in regard to all this are:
- If the site(s) are small enough, OSPF everywhere works. That keeps it simpler for staff. If sites connect via MPLS and BGP, then area 0 at each site might work.
- If there is a need to talk OSPF to other devices that are scattered around, that might suggest using OSPF everywhere.
- If the site is larger and staff has the necessary skills, and if there is clean modularity separating the Internet edge from WAN, campus, datacenter, and core, then I might have the Internet inner-edge devices redistribute OSPF into EIGRP (bi-directionally with filters) and use EIGRP everywhere you can. With reasonable addressing, the amount redistributed (default that way, network 10 this way) is usually small.
Here’s what that latter design might look like:
Another setting that is occurring more often lately: Multisite MAN networks. They are a special case, as they represent a full mesh of routing peers (assuming you’re not adventurous enough to share L2 between several sites).
CCDE wisdom is that EIGRP is better at hub and spoke, while OSPF is better (with tweaks) for full mesh situations.
With EIGRP in hub and spoke, filtering to only advertise 0/0 (and/or “corporate default 10/8”) out of the hub may well be possible to really reduce advertisements and enhance scalability. I’ll note Cisco IWAN (which tends to be hubs and spokes) uses EIGRP or BGP route reflector to scale.
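As a rough illustration of that hub filtering idea, here is a small Python sketch. It is a model of the policy, not router configuration. The route table and the `filter_hub_advertisements` helper are hypothetical, and the allowed set assumes the "0/0 plus corporate default 10/8" policy described above.

```python
import ipaddress

# Assumed policy: the hub advertises only a default route and the corporate summary.
ALLOWED_TO_SPOKES = {
    ipaddress.ip_network("0.0.0.0/0"),
    ipaddress.ip_network("10.0.0.0/8"),
}


def filter_hub_advertisements(routes):
    """Keep only routes that exactly match an allowed prefix."""
    return [r for r in routes if ipaddress.ip_network(r) in ALLOWED_TO_SPOKES]


# Hypothetical hub routing table.
hub_table = [
    "0.0.0.0/0",
    "10.0.0.0/8",
    "10.1.0.0/16",
    "10.1.4.0/24",
    "10.2.0.0/16",
    "192.168.50.0/24",
]

advertised = filter_hub_advertisements(hub_table)
print(advertised)  # ['0.0.0.0/0', '10.0.0.0/8']
```

However large the hub's table grows, each spoke in this model hears the same two prefixes, which is where the scalability gain comes from.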
OSPF with a multipoint MAN is a classic DR/BDR LAN situation, reducing the amount of peer-to-peer flooding. I haven’t run into this at large scale in a design setting yet. Would having such a MAN provide a pretty good reason to run OSPF overall? How would one damp instability in such a network? Large failure domain? What number of peers is “too big” for a full mesh MAN?
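For a sense of scale on the flooding point, the sketch below compares adjacency counts for a true full mesh against a single OSPF broadcast segment with a DR and BDR. The function names are mine. It assumes every site router sits on one shared segment, and it counts adjacencies only, so it does not answer the stability or failure-domain questions above.

```python
def full_mesh_adjacencies(n: int) -> int:
    """Adjacencies if every router peers fully with every other router."""
    return n * (n - 1) // 2


def dr_bdr_adjacencies(n: int) -> int:
    """Full adjacencies on one OSPF broadcast segment with a DR and a BDR.

    Each of the n - 2 other routers is fully adjacent to both the DR and
    the BDR, and the DR and BDR are adjacent to each other.
    """
    if n < 2:
        return 0
    return 2 * (n - 2) + 1  # for n == 2 this is the single DR-BDR adjacency


for n in (5, 10, 20, 50):
    print(n, full_mesh_adjacencies(n), dr_bdr_adjacencies(n))
# 5 10 7
# 10 45 17
# 20 190 37
# 50 1225 97
```

The full mesh grows quadratically while the DR/BDR case grows linearly, though the DR and BDR themselves still carry an adjacency to every other router on the segment.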
The other problem I’m still mulling over is the OSPF WAN to dual datacenters design. In one case, a customer was running more than 250 VLANs (one per area) over DWDM, and more recently over OTV between datacenters, with more than 4000 GRE over IPsec tunnels.
Dual hub DMVPN with BGP route reflectors looks very attractive compared to that. “Totally stubby EIGRP” (hubs that advertise only 0/0 or a corporate default to remote sites) could also work well.
By the way, if you are using EIGRP, note Cisco’s clever recent stub-site feature, which was probably built to simplify IWAN. It lets you make EIGRP sites with two WAN routers effectively stubby by doing automatic filtering for split-horizoning and also having the routers advertise themselves as stubs to minimize EIGRP queries to the dual-router site. Neat!
This article first appeared on the NetCraftsmen blog.