On Location: Chicago Tribune: Project and Vendor Management
The Tribune's IT shop got the story right: It's implementing initiatives that will meet needs and ensure the paper always gets published. Here are their keys to success.
August 13, 2004
Down to the Wire
The server-consolidation project touches all of the Tribune's core systems--and there is little room for errors or missed deadlines. The organization has been doing what many of us have done--moving systems off mainframes and onto smaller boxes. Although it still owns some mainframes, the paper has migrated many systems to Sun Microsystems minicomputers to reap the benefits of a distributed architecture and cut costs. All well and good, except IT found itself maintaining various Sun boxes at different OS levels and with different database versions. Disaster recovery was a concern as well, because all the Tribune's systems were at a single location.
The server-virtualization goal was to clean up the Sun machines and implement real-time disaster recovery. When the project is completed, two Sun Fire 15K boxes will replace as many as five smaller systems. A redundant fiber link will connect the data center and headquarters. Two separate networks will merge into a single network, the company will go from standby disaster recovery to active-active disaster recovery, and many applications will be upgraded.
Darko Dejanovic, vice president and CTO at both the Chicago Tribune and the Tribune Co., has given this project his full attention and backing. Some members of what Dejanovic calls his "talented staff" proposed merging these projects into a single venture that would reduce both cost and complexity. They took the time to sell the project, write the plan and arrange a capital financial budget that dwarfs some IT shops' entire annual expenditures.
What perceived benefit could drive such a large investment? Three-minute downtime. If the primary server in the server-consolidation project were to go offline, there would be only three minutes of downtime before users were automatically logged back in and could continue working. In an industry where every minute is precious, that's music to users' ears.
The Tribune's IT staffers are excited about the projects. They talk about "what's best for the business" and "assuring reliability," but the sparks really fly when they get into project details. No wonder--they've put together a network that would make any geek drool: An EMC SAN with Brocade switches, two of Sun's new Sun Fire 15K Solaris boxes, a dark-fiber loop through the heart of Chicago and a bevy of applications to move to the new servers. It's fast, it's new, and it meets business needs astoundingly well. What more could IT want?
How about realistic project time lines and a reputation in the user community for delivering quality on time? Well, according to everyone we talked to at the Tribune, the IT group has all that too--though it wasn't always this way; see "Tribune's New Goals Require Culture Shock," for a look at the bad old days. Staffers sometimes must put in a few extra hours, of course, but they meet their deadlines. Imagine that.
Naturally, deadlines are sometimes affected by outside forces. For this project, the Tribune got representatives from Sun, Nortel and AT&T together to hash out delivery time lines and project milestones. And they recorded it all. It sounds astounding, but these three vendors were able to work together to deliver in a timely manner. This, Dejanovic says, is key--you must hold your vendors accountable for on-time delivery.
We've seen too many projects led astray by vendors that couldn't do this. Don't be fooled into believing that smaller-budget IT shops couldn't tie vendors to hard delivery time lines and service levels in the same manner as the Tribune did. To your vendor's salesperson, your organization is money. Make him or her earn it by including wording in the contract that protects you from a failure to implement systems on time.
The Environment
The Sun Fire 15Ks are situated at geographically separate locations and linked by a local loop of AT&T fiber. These mighty machines can hold 18 system boards, each with four 1.4-GHz CPUs. The Tribune says it plans to move the workload from two Sun E10Ks, a Sun E4800 and two Sun Enterprise Server 65Ks onto these two boxes.
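To put those board and CPU figures in perspective, here is a minimal sketch of the capacity arithmetic. It assumes every one of the 18 board slots is populated, which the article does not confirm; the function and constant names are illustrative only.

```python
# Figures quoted in the article: up to 18 system boards per Sun Fire 15K,
# four CPUs per board, two servers in total.
BOARDS_PER_SERVER = 18
CPUS_PER_BOARD = 4
SERVERS = 2


def max_cpus(servers: int = SERVERS) -> int:
    """Upper bound on CPU count, assuming every board slot is filled."""
    return servers * BOARDS_PER_SERVER * CPUS_PER_BOARD


print(max_cpus(1))  # ceiling for one box: 72
print(max_cpus())   # ceiling for both boxes: 144
```

These are ceilings, not a statement of what the Tribune actually installed.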
[Figure: Server Consolidation Setup]
The fiber is split into eight 2-Gbps fiber channels for connection to the corporate SAN and interconnection of servers, and four 1-Gbps data channels. That's a big pipe by anyone's standard. The paper uses Layer 2 bridging to hook the virtual server network into the core corporate network and unify the channels for failover. The IT group says it thinks it has enough bandwidth to soon begin testing VoIP (voice over IP) between the two Chicago buildings. Two Cisco Catalyst 6500s at each fiber-loop endpoint connect the networks. To facilitate backups, the networks from the two buildings also were merged in the course of this project, giving them a unified network-addressing scheme.
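As a rough check on the "big pipe" claim, the channel counts above can be summed. This is a sketch using nominal line rates only; real throughput would be lower after protocol overhead, and the names below are illustrative.

```python
# Channel split described in the article (nominal rates, in Gbps).
SAN_CHANNELS = 8        # fiber channels for SAN and server interconnect
SAN_CHANNEL_GBPS = 2
DATA_CHANNELS = 4       # data channels
DATA_CHANNEL_GBPS = 1


def aggregate_gbps() -> dict:
    """Return nominal bandwidth per traffic type and in total."""
    san = SAN_CHANNELS * SAN_CHANNEL_GBPS
    data = DATA_CHANNELS * DATA_CHANNEL_GBPS
    return {"san": san, "data": data, "total": san + data}


print(aggregate_gbps())  # {'san': 16, 'data': 4, 'total': 20}
```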
During implementation, the IT staff tested for a single point of failure in the loop and found a spot where the two fibers ran through a single location. That single place for fiber to get clipped was eliminated.
Using Sun Cluster 3.1 plus Veritas File System and Volume Manager, the Tribune IT staff can partition these boxes into nine domains, or virtual machines. Seven will hold production applications; two will be for testing.
The SAN environment comprises three EMC Clariion boxes hosting terabytes of disk space with Brocade switches on the front end. All this was purchased through Dell, but support comes directly from EMC.
The SAN was in place before the server-consolidation project. Even so, server setup and configuration, the dark-fiber link and the SAN connections, along with the upgrade and migration of several end-user applications from smaller boxes, needed six full-time Tribune employees for three months. That's a tight time line to stick to and a small staff for such a large project. But there were representatives from all three vendors on-site, and these days Sun has a person in the building during normal business hours.
When we asked about application upgrades, the Tribune staffers said they don't see it as that big a deal. This uncovered an important doctrine that contributes to the reputation they have for always being on time and managing highly reliable systems: "We don't customize," says Deepak Agarwal, director of client systems. Dejanovic echoed the sentiment, which is one we think more IT shops should adopt. In the long run, thinking you are so much more special than others in your industry that you need massive customization just lands you in a crunch at delivery time. That's something to tell your users: The more customization you perform, the higher the risk.
The Tribune is able to ask application vendors, "Will this run on the new version of Solaris, and will it work with the newest release of Oracle?" We got the impression that there was an implied, "The answer had better be yes."
"Our vendors know not to surprise us," Agarwal says. "If we get a surprise, it comes for free."
So far, with the exception of one blip earlier this summer that caused printing to be delayed by five hours, those applications have worked just fine. Thinking about the volume of change the Tribune has introduced, and knowing how most IT shops operate, we don't see five hours as a huge loss. We know some projects at other shops that are years late, with losses in the millions.
When asked about the key to timely delivery, technology director Scott Tafelski says, "project management." And the Tribune puts its money where its mouth is: It uses a customized set of the Project Management Institute's standards, and if you're a project manager and want to become PMI-certified, the Tribune will pay your freight.
On time, within budget, is one goal Tribune IT project plans are built to ensure, and using milestones and a separate QA/Compliance group helps keep surprises to a minimum. QA is involved in each project, both formally and informally, often dropping in to check things out. A good tech's first reaction to that kind of involvement is, "Don't you trust me?" But remember: The group is also watching the people you're counting on to deliver in a timely manner, meaning you're more likely to know if a problem elsewhere is going to have an impact on your deadlines.
Another thing the Tribune IT department does well is help business management understand that IT often knows best. The department has told users that bringing in another database, for example, will cost more because of increased maintenance, training and management expenditures, and has kept core systems on Oracle, an astounding feat in this time of "business alignment." Educating users about the impact their decisions have on IT costs and man-hours is a critical part of "business alignment" that we often overlook. Far too often, the term is used to mask "give users whatever they want and to hell with long-term costs." Long term, that will leave you in a world of hurt. Apparently, the Tribune IT staff feels the same way.
The last point that Tafelski made--one we grudgingly agree with--is that if you truly budget a project, down to every last cent, you also must know all the steps and potential pitfalls. If you don't have that knowledge and you hit a snag that causes a vendor to tack on $20,000, there goes your budget. To avoid this, project managers are told not to sign off on "something you can't buy into." They are taught how to force vendors to commit to deadlines and guaranteed pricing.
By the end of the vendor meetings, the IT staff had an agreement among vendors as to who was responsible for what, and had recorded it all. Meanwhile, the team members learned a lot about the project and how it could be structured, and they were ready to negotiate for fixed bids. This is exactly the process they followed with AT&T, Nortel and Sun, getting them all in one room and recording their discussions about how the project would be implemented and who was responsible for each piece of the project. All of this work was done before the team members went in to ask for capital expenditures. That gave them a leg up on project planning because they had firm time lines and commitments--on tape--that the vendors would meet those dates. Making the overall milestone list must have been easy at that point--just list the delivery dates agreed to by your vendors. A willingness to use those tapes to walk away from a contract, or even to head to court if necessary, must be assumed.
In Journalism 101, budding reporters are taught that if your mother says she loves you, check it out. That diligence seems to have seeped from the newsroom to the IT shop, which has learned to do its homework. For example, Tribune employees visited a couple of shops to get an understanding of issues and stumbling blocks they would face during implementation. The research seems to have paid off: When we visited in June, the project was ahead of schedule.
The Tribune realized several benefits from this project, both planned and unplanned:
Because the 15Ks can split CPUs and memory across different process domains, they can shift processing power to where it is needed in a sort of "virtualization" mechanism. Nightly batch-processing has taken advantage of this architecture: The time needed for the task has dropped from about five hours to as little as one hour.
Fewer staffers are required to run the Sun Fire 15Ks, which means the Tribune can add applications without increasing head count.
The 15Ks are faster than previous servers because of the high-speed data link and clustering. Increased application performance can mean increased user productivity.
Then there's that tantalizing three-minute downtime if one of the Sun Fire 15K servers fails. If, like most of us, you are in an environment where downtime equals lost revenue, this benefit alone might justify implementation.
And, finally, there's the change-control factor. Instead of different machines with different revisions of the operating system, the staff can install and test each application on one of the test domains, and when all are ready to move to the newest OS release, they simply upgrade. Bam. No concerns about tracking multiple OSs on multiple hardware revisions. That's less specialized knowledge required of the IT staff, netting more time to handle forward-looking tasks.
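The batch-processing gain cited in the list above can be expressed as a speedup factor and a percentage reduction. This is a minimal sketch using the article's best-case figure of one hour; the helper names are illustrative.

```python
# Nightly batch window from the article: about five hours before,
# as little as one hour after (best case).
HOURS_BEFORE = 5.0
HOURS_AFTER = 1.0


def speedup(before: float, after: float) -> float:
    """How many times faster the job runs."""
    return before / after


def reduction(before: float, after: float) -> float:
    """Fraction of the original run time that was eliminated."""
    return (before - after) / before


print(speedup(HOURS_BEFORE, HOURS_AFTER))            # 5.0
print(f"{reduction(HOURS_BEFORE, HOURS_AFTER):.0%}")  # 80%
```

Because "as little as one hour" is a best case, typical nights may see a smaller gain.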
When we asked Agarwal what advice he has for network managers considering this kind of project, he said vendor-relationship management is the key. Without knowing what was required to be successful and lacking commitments from vendors to meet those requirements, the Tribune might have been in trouble, he says. We've seen much smaller projects get bogged down in the "vendor can't deliver in time" morass, while others fall into the "we need six more months to reimplement our customizations" trap. In our book, bringing a project this large in on time and within budget is a story worthy of stopping the presses. Not for long, of course--the paper must get out, after all.
Don MacVittie is a technology editor at Network Computing. He previously worked at WPS Resources as an application engineer.
Anticipate budget angst and get your ducks in a row before presenting your case. IT brought in finance people to help model an ROI report that took into account the cost of equipment, software and maintenance agreements if the projects were done individually compared with a consolidated initiative. Findings that the Tribune would save 130 percent on hardware and software maintenance agreements alone helped sway upper management.
Avoid customization. The more customization you perform, the higher the risk, and the less able you are to hold vendors' feet to the fire if porting proves difficult.
Think QA, QA, QA. Get project stakeholders past the "What, don't you trust me?" mind-set. Remind them that QA/Compliance teams are watching their backsides as well by ensuring deliverables that affect their pieces of the project are on track.
Technology Director, Chicago Tribune; Technology Director, Tribune Co.
At Work: Responsible for IT project management and quality assurance across the Chicago Tribune IT organization
At Home: 45 years old. Married, no children. Hobby: traveling
Alma Mater: Loyola Institute of Chicago, M.S. in organizational development
HOW HE GOT HERE: 2002 to present: Technology Director, the Chicago Tribune and Tribune Co.
1998 to 2002: Technology Director, the Chicago Tribune
MOUTHING OFF:
What I say to people who think disaster recovery is a luxury: "They are working in companies at risk."
What I say to those resisting migration from the mainframe: "They are not taking advantage of new technology."
Most surprising comment from a user: "Thank you."
I work at the Chicago Tribune because: "It is a good product."
The most misunderstood aspect of my job: "Finding the line between technology issues and user issues, and identifying project-management structure that overlies both."
Greatest business challenge: "Identifying the demands from advertisers and subscribers."
If I had the server-consolidation project to do over again, I would: "Have started earlier."
I love technology when: "It provides obvious business value."
I hate technology when: "It doesn't work."
My next career: "Teacher."
When I retire, I will: "Continue teaching."
Production Director, the Chicago Tribune
At Work: Responsible for IT operations for production, circulation and distribution.
At Home: 47 years old. Single. Favorite hobby: working with new technology
Alma Mater: DePaul University, M.S. in MIS
HOW HE GOT HERE: 2002 to present: Production Director, the Chicago Tribune
1999 to 2002: Senior Project Manager, the Chicago Tribune
MOUTHING OFF:
What I say to people who think disaster recovery is a luxury: "It ends up costing you more in the long run if you don't have it."
Worst moment of downtime in my career: "I was giving a presentation when my hard drive crashed. While I was fixing it, I was continuing the presentation."
Most off-the-wall complaint ever made by a user: "A user once complained that he couldn't make a connection. I asked if the modem line was operating, and he said, 'No, the power is off.' He couldn't understand that when the modem was off, he couldn't dial out."
I work at the Chicago Tribune because: "It is a dynamic organization that allows you to do the best you can, and gives you opportunities to grow."
The most misunderstood aspect of my job: "That most of the effort should be unseen by users. If we do it right, users shouldn't know who we are."
I love technology when: "It solves a difficult problem, and you see the results of your effort."
I hate technology when: "It takes too long to solve a difficult problem."
My next career: "Develop new applications."
When I retire, I will: "Read the newspaper every day."