Facebook today unveiled Open/R, a network application platform originally designed as a shortest-path routing system for Terragraph, its multi-node wireless network that aims to bring high-speed Internet connectivity to dense urban areas.
Since it was created, Open/R has evolved into a modular platform that allows Facebook to quickly prototype and deploy new applications in the network, the company said. Facebook has also adapted the platform for use in other parts of its networking infrastructure and plans to find a way to contribute the software to the open source community.
Facebook engineers introduced Open/R at the invitation-only Networking@Scale conference. Open/R is another DIY-networking effort for Facebook, which two years ago introduced its own top-of-rack switch, Wedge, and earlier this year unveiled its network troubleshooting platform, NetNORAD.
In a blog post, Facebook engineer Petr Lapukhov said the social media company was inspired to build its own routing system to solve fast-recovery challenges for Terragraph, a 60 GHz wireless network that uses small nodes. Existing open source projects were difficult to extend quickly and in a supportable way. "Given this, we decided to iterate fast with our own implementation," he wrote. "We kept it simple by re-using as much open-source code as possible."
Describing the design of Open/R, Lapukhov wrote that the platform essentially "generalizes the concept of a replicated state database found in well-known, link-state routing protocols such as OSPF and ISIS." Open/R uses this database as an underlying messaging system on which to build multiple applications, and distributed routing is just one of them.
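To make the idea concrete, here is a minimal Python sketch of a replicated state store with a routing application layered on top. It is an illustration only, not Open/R's code: the class and function names, the "adj:" key prefix, and the simple "higher version wins" merge rule are all assumptions chosen for the example.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Entry:
    version: int  # bumped by the originating node on every update
    value: dict   # here: neighbor name -> link cost


class ReplicatedStore:
    """Toy per-node key-value store (hypothetical, not Open/R's API)."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        # Local update: bump the version so that peers accept it.
        old = self.data.get(key)
        version = old.version + 1 if old else 1
        self.data[key] = Entry(version, value)

    def merge(self, other):
        # Take every entry from a peer that is newer than our copy.
        changed = []
        for key, entry in other.data.items():
            mine = self.data.get(key)
            if mine is None or entry.version > mine.version:
                self.data[key] = entry
                changed.append(key)
        return changed


def shortest_paths(store, source):
    """One application on top of the store: Dijkstra over 'adj:' keys."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        entry = store.data.get("adj:" + node)
        neighbors = entry.value if entry else {}
        for nbr, cost in neighbors.items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


# Each node advertises only its own adjacencies.
a, b, c = ReplicatedStore(), ReplicatedStore(), ReplicatedStore()
a.set("adj:A", {"B": 1, "C": 5})
b.set("adj:B", {"A": 1, "C": 1})
c.set("adj:C", {"A": 5, "B": 1})

# Pairwise syncs stand in for flooding between neighbors.
a.merge(b)
b.merge(a)
b.merge(c)
c.merge(b)
a.merge(b)

print(shortest_paths(a, "A"))  # {'A': 0, 'B': 1, 'C': 2}
```

Because the store knows nothing about routing, a second application could publish different keys (link utilization samples, say) through the same merge mechanism without touching the shortest-path code.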
"We didn't want to get bogged down into discussions over the lower-level protocol details, such as frame formatting, handshakes, etc. and so we decided to simply leverage Thrift for all message encoding and use the well-documented and mature open source ZeroMQ library for all message exchange – be it intra- or inter-process," he said.
His blog post goes into additional technical detail and describes how Facebook tested Open/R's scalability. Adding more applications on top of routing, such as link utilization measurement or MPLS label allocation for segment routing, has proven straightforward, he said.
And in contrast to the trend of placing network intelligence in a central controller, Facebook believes that autonomous network functions are key, Lapukhov said.
"Using both centralized and distributed control throughout different domains in our network, often in a hybrid fashion, ultimately helps [make] the network more reliable and easier to manage," he wrote.
Facebook is testing Terragraph at its Menlo Park headquarters and plans a broader test deployment in San Jose, Calif.