Firewall Temporarily Downs Massachusetts’ 911 System, Highlighting the Need for Network Resiliency
To avoid outages of critical systems like 911, network operators should aim for network diversity and redundancy.
June 28, 2024
On June 18, a day after the Boston Celtics won the NBA championship, officials in Massachusetts announced an outage of the statewide 911 system while holding a news conference to discuss plans for the team’s parade, which was held on June 21. The outage also occurred in the middle of a Northeast heat wave.
"Never a dull moment, we just wanted to start actually with a notification that currently the statewide 911 system is down, and calls are not going through," Boston Mayor Michelle Wu said, per CNN, during the June 18 press conference.
The Massachusetts State 911 Department discovered the outage at about 1:15 p.m. ET. The outage lasted about two hours, and 911 services were fully restored at 3:15 p.m. ET that day, according to the State 911 Department.
During the outage, dispatch centers had the capability to identify phone numbers and return calls, but the State 911 Department said it did not receive reports of emergencies during the outage.
In a June 19 press release, the Massachusetts State 911 Department attributed the outage to a firewall that blocked 911 calls. The firewall was designed to protect the 911 system from cyberattacks and hacking, but instead, it prevented calls from reaching the 911 dispatch centers, which are called Public Safety Answering Points (PSAPs). In 2023, 8,800 calls a day reached Massachusetts’ 204 PSAPs combined, according to the State 911 Department.
During the outage, Massachusetts officials urged the public to call local police stations and firehouses or pull red fire call boxes on street corners if they had an emergency.
How the Massachusetts Outage Occurred
Comtech, Massachusetts’ 911 vendor, referred Network Computing to the State 911 Department for comment.
"In response to the recent statewide 911 system disruption occurring as the result of an automated action taken by a firewall, the Massachusetts State 911 and Comtech took immediate action to improve processes and implement technical adjustments to protect and manage the network that will prevent a future systemwide security interruption," a State 911 Department spokesperson told Network Computing in a statement. "These adjustments will allow both State 911 and Comtech to better mediate the automated response to anomalies to effectively meet public safety needs and ensure our hardworking first responders can meet the needs of our communities."
The firewall blocked the 911 calls because it detected anomalies in inbound data management, specifically in how traffic was traveling across the state’s network, according to the State 911 Department. The system is designed to spot anomalous network traffic behavior and, when it occurs, to isolate the routing of data, including calls. After the outage, the State 911 Department and Comtech quickly adjusted the firewall's rules to restore communications on the 911 network.
Massachusetts uses a Next Generation 911 (NG911) system, which Comtech maintains. NG911 incorporates a mapping and address database that MassGIS, the state’s geospatial data and mapping organization, manages. On May 21, Comtech announced a renewal of its contract with the Commonwealth of Massachusetts through July 31, 2029, with the option to renew until July 31, 2034.
Comtech determined that a cyberattack did not cause the outage, State 911 reported. The NG911 system cannot be reached from public networks, and an attacker would have to breach multiple security layers to get to it, according to the State 911 Department.
This event resulted in a false-positive security response, the State 911 Department reported. For now, the department will manually test updates before rolling them out.
Brandon Abley, chief technology officer for the National Emergency Number Association (NENA), says outages sometimes occur when robust border protection in a network leads to a configuration error that disrupts traffic entering or exiting a network.
“There are multiple things used at the edge of a 911 system to protect traffic at the border,” Abley tells Network Computing. “It could be any one of those things in the stacks that if there's an issue, it could affect your emergency calling traffic.”
Abley says a border could encompass an entire state or a single building, depending on the network and 911 system.
"You do see in large networks of this kind that have strong border protection that sometimes, if there's a configuration or deployment issue on those border elements, that can disrupt the ability for any traffic to enter or exit the network," Abley says. "It's not exactly a single point of network failure. Usually, these things are especially for 911. And next-generation 911 has to be designed for redundancy and to have hot standbys so that these types of errors don't happen."
For 911 systems, the industry aims for five 9s, or 99.999% availability of services, Abley says. However, he says three 9s or four 9s are more common.
“We shoot for five in 911 because 911 always needs to work,” Abley says. “It can't go down.”
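To put the "nines" in concrete terms, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, a 365-day year is assumed, and the 120-minute figure is the approximate outage length reported by the State 911 Department.

```python
# Availability arithmetic for "N nines". Assumes a 365-day year for simplicity.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(nines: int) -> float:
    """Minutes of downtime per year allowed by an availability of N nines."""
    return MINUTES_PER_YEAR * 10 ** (-nines)


def availability_with_outage(outage_minutes: float) -> float:
    """Yearly availability if the only downtime is one outage of this length."""
    return 1 - outage_minutes / MINUTES_PER_YEAR


if __name__ == "__main__":
    for nines in (3, 4, 5):
        print(f"{nines} nines: {downtime_budget_minutes(nines):.2f} min/year")
    # The Massachusetts outage lasted about two hours (120 minutes).
    print(f"after a 120-minute outage: {availability_with_outage(120):.5%}")
```

Under these assumptions, five 9s leaves a little over five minutes of downtime per year. A single two-hour outage is more than 20 times that budget and, on its own, puts the year between three and four 9s.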
A hot standby is a duplicate system that runs alongside the primary one, so that if the primary fails, traffic can switch over to the standby almost immediately.
In April, an outage halted 911 service in Texas, Nebraska, Nevada, and South Dakota when a fiber line was cut during a light pole installation. Lumen supports the 911 systems in these states. As in the Massachusetts outage, authorities in Las Vegas had the capability to call back people who tried reaching the 911 line. Callers heard a fast busy signal, according to an X post by Douglas County, Nebraska.
Behind the Network Infrastructure of 911 Systems Today
NENA is responsible for the standards behind NG911, according to Abley. The conventional 911 system was built on traditional telephone systems while NG911 is a digital IP-based system that’s based on Session Initiation Protocol (SIP).
“We support every kind of multimedia voice, video, and text. We support language marking, and the system is much more survivable,” Abley explains. “A lot of the elements for redundancy and diversity and backups and security are standardized and built into the architecture and should be there right out of the box.”
That means NG911 systems are less prone to errors and cyberattacks than traditional 911 systems, Abley suggests. About half of American citizens live near where an NG911 system is being built or is currently working, Abley says.
He adds that NG911 systems are more interoperable than traditional 911 and allow multiple government agencies to respond to a mutual aid event.
How to Keep 911 Systems from Going Down
To avoid outages of critical systems like 911, network operators should aim for diversity and redundancy, Abley advises. That includes “different physical layer transmission options” like wired and wireless, he says.
Also, because widespread failures like the 911 incidents usually occur during upgrades and updates, these improvements should be staged, according to Abley.
“They need to be rehearsed prior to going into production, and you need to be able to roll back from them,” Abley says.
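The staging and rollback Abley describes can be sketched roughly as follows. This is a simplified illustration, not how Comtech or the State 911 Department deploys changes: the stage names, the `staged_rollout` function, and the `healthy` callback are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stage names; a real deployment would define its own.
STAGES: List[str] = ["lab", "canary", "production"]


def staged_rollout(
    current: str,
    candidate: str,
    healthy: Callable[[str, str], bool],
    stages: List[str] = STAGES,
) -> Tuple[bool, Dict[str, str]]:
    """Apply `candidate` one stage at a time; restore `current` everywhere on failure."""
    deployed = {stage: current for stage in stages}
    for stage in stages:
        deployed[stage] = candidate
        if not healthy(stage, candidate):
            # Roll back every stage, including the one that just failed.
            deployed = {s: current for s in stages}
            return False, deployed
    return True, deployed
```

The point of the design is that production is touched last: a change that fails its check in an earlier stage never reaches it, and every stage returns to the known-good version.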
Companies such as satellite-enabled platform Somewear Labs offer an alternative to the primary cellular communications systems in case people lose a primary mode of communications during an emergency.
"When it comes to the architecture, we have to think about resiliency, and we have to think about contingencies in the core architecture, and not just the operations," James Kubik, cofounder and CEO of Somewear Labs, tells Network Computing.
He stresses that 911 systems need alternatives beyond “single-threaded systems.” Kubik also cites a U.S. military methodology called Primary, Alternate, Contingency, and Emergency (PACE), under which various stakeholders coordinate to ensure reliable communications despite any obstacles that occur. He believes network operators should apply this strategy to 911 systems by using parallel paths of servers.
Somewear Labs provides backup to primary 911 systems to democratize emergency communications beyond local cellular networks using third-party dispatching services. However, Kubik sees the potential to work more directly with public safety. He also says emergency pull boxes can be modernized to incorporate platforms such as Somewear Labs to support multiple modalities of communications, including satellite.
Parallel systems can include satellite communications, local cellular networks, land mobile radio networks, and local mesh networks that route back into a central system, Kubik says.
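One way to picture a PACE plan over these parallel systems is as an ordered list that is walked until a working path is found. The sketch below is illustrative only: the assignment of each transport to a PACE tier is an assumption made here, not something Kubik specifies, and `select_path` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical PACE ordering of the transports Kubik lists.
PACE_PLAN = [
    ("primary", "local cellular"),
    ("alternate", "satellite"),
    ("contingency", "land mobile radio"),
    ("emergency", "local mesh"),
]


@dataclass(frozen=True)
class Path:
    tier: str
    transport: str


def select_path(is_up: Dict[str, bool]) -> Optional[Path]:
    """Return the highest-priority path whose transport is reported up."""
    for tier, transport in PACE_PLAN:
        # A transport with no reported status is treated as down.
        if is_up.get(transport, False):
            return Path(tier, transport)
    return None
```

For example, `select_path({"local cellular": False, "satellite": True})` returns the alternate satellite path, and the function returns `None` only when every tier is down.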