If You Rebuild It, They Will Come

The newly reconstructed MLB.com Web site stayed online during the busiest opening day in the site's history, fielding queries from 10 million fans in one day -- the first time

August 19, 2003

Network Computing

"We had a lot of outages, and there was a lot of strain on the dynamic portion of Web servers," says Justin Shaffer, director of operations for MLB Advanced Media. Major League Baseball's 30 teams each hold an equal stake in MLB Advanced Media, which also runs the team sites.

Shaffer and the rest of the Web services team, brought on board for the 2001 season, were working so many hours tending to the site that they didn't have time to attend many games. "We were working 80 to 100 hours a week," he says. The front office wasn't happy with MLB.com's performance, either. "We'd all like to forget opening day '01. The site was down more than it was up," recalls Jim Gallagher, vice president of corporate communications for MLB. "We weren't prepared for the onslaught of fans online."

So MLB Advanced Media called for a changeup. The Web team rewrote most of MLB's internal applications and consolidated the site from a three-tier to a two-tier server architecture. The organization added content switching to off-load SSL and TCP/IP processing, and it compressed files for those with dial-up access. It also began caching site content.

The retooled site, which attracts 2 million to 3 million fans worldwide each day, runs about 15 percent faster than it did before, Shaffer says, using 30 percent less bandwidth.

From Scratch

When MLB Advanced Media first started brainstorming on how to revamp the site, it got more bad news: The Cable & Wireless data center in Staten Island, N.Y., that housed the MLB.com servers was closing its doors. But it was a blessing in disguise, giving MLB a chance to rebuild the site architecture from scratch.

Besides rewriting the applications, MLB Enterprises merged its Web and dedicated application servers. The site, relocated to C&W's Weehawken, N.J., facility earlier this year, runs on Sun Microsystems Solaris servers.

Shaffer and his team installed two NetScaler 9800 content switches to distribute traffic among the Web servers. Although the appliances spread site content around the servers, the heavy volume of JSP-based dynamic content was still a problem. Recompiling those scripts every time the Gameday statistics application recorded a hit or error in a game nearly crashed the Web servers, Shaffer says, especially with the constant updates to the site's Liveline scoreboard.
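As a rough illustration of what distributing traffic among Web servers involves, here is a minimal sketch in Python. The article does not say which balancing policy the NetScaler switches applied, so the round-robin rotation below is an assumption, and the class and server names are hypothetical.

```python
from itertools import cycle


class RoundRobinSwitch:
    """Toy content switch: hands each request to the next server in turn."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = cycle(self._servers)

    def route(self, request_path):
        # Pick the next server in rotation; this policy ignores the path.
        server = next(self._next)
        return server, request_path


# Hypothetical server names, for illustration only.
switch = RoundRobinSwitch(["web1", "web2", "web3"])
assignments = [switch.route("/scoreboard")[0] for _ in range(6)]
```

Each server receives every third request. As the article notes, even distribution of this kind does not help when every server is busy recompiling the same dynamic pages.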

"Our rate of change was so great that it became a downward spiral," he says. So Shaffer and the team mounted an NFS (Network File System) cluster on the Web servers to execute the JSP recompiles instead.

Certs and Perks

The site's content switches also handle SSL sessions. So when a fan subscribes to MLB's new MLBtv streaming video service that broadcasts all MLB games, his credit-card transaction goes to the NetScaler box, which processes the keys itself rather than having the Web servers do so. That eliminated the extra hardware accelerator cards installed in each Web server to handle the CPU-intensive encryption processing.

"And we also now only need one SSL digital certificate for the NetScaler boxes, rather than one for each Web server," Shaffer says. That eased the management headaches of maintaining certs for each server.

The site had been using VeriSign's digital certificate service, which cost $500 to $1,000 per server per year; it now subscribes to a root certificate from VeriSign just for the NetScaler boxes.

TCP/IP processing, too, had been a burden on the site's servers, so the NetScaler appliances also perform the TCP/IP handshake.

"That lets the application boxes handle JavaScripts and other tasks at hand," says Craig Currim, NetScaler's senior systems engineer who helped MLB set up the site. The NetScaler boxes also compress the site's hefty graphics, images and text, which can be too fat for a fan's dial-up or cellular modem connection.

The revamped site also relies on URL caching to store content in compressed formats with open-source Web Proxy caches running in front of some Web servers. And C&W's Footprint caching service lets MLB.com distribute its content geographically, too.
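The idea of caching content by URL in compressed form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name the proxy software or its compression format, so gzip, the `CompressedUrlCache` class and the `fake_origin` function are all hypothetical stand-ins.

```python
import gzip


class CompressedUrlCache:
    """Toy URL cache that stores each response body gzip-compressed."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url, fetch):
        """Return the compressed body for url, calling fetch(url) on a miss."""
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        body = fetch(url)  # raw bytes from the origin Web server
        compressed = gzip.compress(body)
        self._store[url] = compressed
        return compressed


def fake_origin(url):
    # Hypothetical stand-in for a Web server; repetitive HTML compresses well.
    return b"<tr><td>inning</td><td>0</td></tr>\n" * 200


cache = CompressedUrlCache()
first = cache.get("/scoreboard", fake_origin)   # miss: fetch and compress
second = cache.get("/scoreboard", fake_origin)  # hit: served from the cache
```

The second request never reaches the origin server, and the stored bytes are smaller than the original page, which is the combination that matters for a fan on a slow connection. A real cache would also need expiry rules so that fast-changing pages such as a live scoreboard do not go stale.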

Meanwhile, MLB Advanced Media recently installed two additional NetScaler devices for the Gameday application and online polling, to use with tasks such as All-Star voting and other fan feedback. The extra boxes will let the organization split individual applications into modules among the Web servers for even more efficiency.

Shaffer says his group also is considering grid computing for the Web servers. Today those servers handle one or two tasks at a time, such as JSP processing and browser redirects. But the ability to share CPU power could boost performance even more for the ever-expanding Web site. Because the site's Sun SunOne server farm consists mostly of Netra 1125s and SPARC 4500s, MLB has already laid the groundwork for going grid, Shaffer says. "When we feel grid is production-ready, we may move to it," he says.


The engineering and operations teams at MLB Advanced Media nearly struck out when they tried to sell a two-tier Web infrastructure to upper management at the end of the 2002 season. They were victims of their own success: Although MLB.com's scalability was limited, they had transformed it into a relatively reliable site.

"Everyone was skeptical about two tiers at first," says Justin Shaffer, director of operations for MLB Advanced Media, which built and runs Major League Baseball's Web site.

But everyone also agreed that MLB.com required some big changes. So after pitching the results of a matchup between the two-tier and three-tier models, Shaffer and the engineering teams convinced the company's CTO that the site no longer needed separate application servers.

"We knew we could do it all in a simple server engine," Shaffer says. "But it's always a tough decision to take apart everyone's hard work and start from scratch."

There was no opposition to the Web site overhaul and the addition of content switches. "The NetScaler boxes sold themselves, and we didn't have a lot of choice about our data center move," he says. Shaffer says he was able to squeeze the NetScaler boxes into the budget because they replaced the cost of the hardware encryption cards for SSL.

The next big project for MLB Advanced Media may be adding a disaster-recovery architecture. That would mean distributing the site's applications and traffic across multiple data centers and adding more backup sites for disaster recovery.

"I'm thinking a lot about disaster recovery, and flattening things out more," Shaffer says. That would also simplify any future moves from its collocation sites.

"This would mean a significant change to our applications ... but we would complete this kind of move without any downtime," Shaffer says.

Justin Shaffer, Director of Operations, MLB Advanced Media, New York

Justin Shaffer manages the MLB Advanced Media team that runs Major League Baseball's Web site and the sites of all 30 major league teams. Shaffer and his team led the overhaul of MLB.com. Shaffer, 21, has been in IT for eight years.

Next time, I'll: Orient the publishing/templating system toward building full pages, instead of components within a JSP (Java Server Pages) or other dynamic Web scripting language system. A site the scale of MLB.com, with event-driven traffic, needs to be built from the ground up with long-term scalability in mind.

Worst time for a site outage: During a high-traffic day, which would affect the baseball fan most. In the past, some of our highest-traffic days have been opening day, draft day and big game days during the postseason.

What makes MLB.com sexy: The site has a vast amount of rich editorial content in many formats--text, audio, video and interactive applications, as well as real-time statistics. There's Gameday, the live Flash-based text/graphical representation of each game, and MLBtv, and other interesting ways for fans to connect with their teams. Baseball fans around the world can experience the game in real time, whether or not they have access to it on TV.

Another day at the office: When the San Francisco Giants sold tickets online for the World Series last season, our traffic went from 100,000 page views to millions. We had a 1,000 percent increase in traffic in a couple of minutes.

Toughing it out: Sometimes it's difficult to represent the interest of all 30 clubs spread around the country.

Biggest mistake made in technology circles today: Overcomplicating and obfuscating what could be straightforward solutions to even the most complex projects/problems.

For fun: Playing ice hockey, backcountry skiing, sailing and tinkering with electronics projects.

Wheels: 1987 BMW 635csi--the last of the really gorgeous older BMWs.
