Column
Cradle to Grave

August 20, 2001
By Stephen Litchfield


In today's ever-changing world, nothing lasts very long, and computer-based systems are no exception. In our shop, we sometimes find ourselves changing systems before the old ones even get into full production.



On the other hand, we have some stuff that's been running so long nobody can even recall when it was implemented. With our older mainframe systems, we have the luxury of backward compatibility. The hardware has gotten faster and smaller, but the software hasn't changed much at all. As a result, these systems have been a stable part of our operation since what seems like the beginning of time, and we know them stone cold.

Unfortunately, we can't say the same of our new systems. Many are evolving as technology changes, and we've found ourselves running production application systems that the operations staff knows little or nothing about. Everything is fine and dandy until something breaks and we haven't a clue how to fix it. Worse yet, we have to deal with hundreds of individual platforms -- each with its own personality, hardware and software.

Recently, we began a serious effort to document our systems, starting from their inception, to provide better support once the systems achieve production status. Aside from documenting the obvious nuts and bolts of a system, we include dependencies on other systems, steps involved for disaster recovery, key contact personnel, vendor phone numbers, security information and so on. The more information, the better. We're also putting it online so that, with appropriate security, it will be accessible from the nearest browser.
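As a rough illustration of what one such documentation entry might hold, here is a minimal Python sketch. The class name, field names and the sample system are all hypothetical; the column does not describe the actual format we use, so treat this as one possible layout rather than the real thing.

```python
from dataclasses import dataclass, field


@dataclass
class SystemRecord:
    """One documentation entry for a system (all field names are illustrative)."""
    name: str
    description: str
    depends_on: list[str] = field(default_factory=list)
    disaster_recovery_steps: list[str] = field(default_factory=list)
    contacts: list[str] = field(default_factory=list)
    vendor_phones: dict[str, str] = field(default_factory=dict)
    security_notes: str = ""

    def missing_sections(self) -> list[str]:
        """Return the labels of sections that nobody has filled in yet."""
        sections = {
            "dependencies": self.depends_on,
            "disaster recovery steps": self.disaster_recovery_steps,
            "key contacts": self.contacts,
            "vendor phone numbers": self.vendor_phones,
            "security information": self.security_notes,
        }
        return [label for label, value in sections.items() if not value]


# Hypothetical entry: only the dependencies have been documented so far.
record = SystemRecord(
    name="payroll-db",
    description="Hypothetical payroll database server",
    depends_on=["auth-server"],
)
print(record.missing_sections())
```

A check like missing_sections() is one way to spot thin documentation before a system reaches production, rather than at 3 in the morning.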

We have spent much effort setting standards for hardware and software platforms to promote consistency wherever possible. Having published standards means new systems have a better chance of fitting seamlessly into our environment. In the past, systems were sometimes developed without serious regard for operational issues. This created more "systems from hell" than I care to remember. With the number of applications we support today, it's impossible for us to know every idiosyncrasy. It's not easy, but we try to keep things as standardized as possible around here.

Over time, we have created a change-control process to follow the evolution of each system throughout its life. This process is aggressively backed by management and must be followed (or else!). Even when an emergency forces us to act now and follow up later, we make sure the change-control work gets done.

A change-control process has two main functions. First, it creates a mechanism whereby the proper people are notified of changes so they can provide input if any problems arise. Second, it ensures we update system documentation. Done properly, this method guarantees that we end up with a complete history of every system. And it's all just a click away. Surely, this beats scrounging to find the last person who worked on a specific system and then picking his or her brain.
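The two functions described above, notifying the proper people and keeping a history of each system, could be sketched in Python roughly as follows. Everything here is an assumption made for illustration: the class names, the idea of a per-system owner list, and the sample data are not taken from our actual process.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ChangeRecord:
    """A single change to a system (hypothetical structure)."""
    system: str
    summary: str
    changed_on: date
    emergency: bool = False  # True when the paperwork followed the fix


@dataclass
class ChangeLog:
    """Keeps a per-system history and knows who to notify for each system."""
    owners: dict[str, list[str]]
    history: dict[str, list[ChangeRecord]] = field(default_factory=dict)

    def record(self, change: ChangeRecord) -> list[str]:
        """Append the change to the system's history and return who to notify."""
        self.history.setdefault(change.system, []).append(change)
        return list(self.owners.get(change.system, []))


# Hypothetical owners and a hypothetical emergency change.
log = ChangeLog(owners={"mail-gateway": ["ops-team", "security-team"]})
to_notify = log.record(
    ChangeRecord(
        system="mail-gateway",
        summary="Replaced failed disk",
        changed_on=date(2001, 8, 20),
        emergency=True,
    )
)
print(to_notify)
print(len(log.history["mail-gateway"]))
```

Recording the change and producing the notification list in one step means the history cannot fall behind, even for emergency changes that are written up after the fact.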

Another benefit of a change-control process involves retired systems. Do you ever give much thought to the possible consequences of retiring a system? Probably not. Just erase the data and shut it down. R.I.P. A long time ago, we had a power outage that knocked out a bunch of old DECservers. When the power returned, the DECservers didn't work because they couldn't find the host VAX that provided them with their download software. Someone finally realized we had decommissioned the host VAX months earlier. If we had had our change-control system back then, we would have documented this dependency when the DECservers were installed and avoided the problem.
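The DECserver story above comes down to a lookup that documented dependencies make trivial: before retiring a system, ask what still depends on it. A minimal Python sketch, assuming dependencies are kept as a simple mapping from each system to the systems it needs (the function names and host names below are made up):

```python
def dependents_of(system: str, depends_on: dict[str, list[str]]) -> list[str]:
    """Return the systems that list `system` as a dependency, sorted by name."""
    return sorted(name for name, deps in depends_on.items() if system in deps)


def safe_to_retire(
    system: str,
    depends_on: dict[str, list[str]],
    retired: frozenset[str] = frozenset(),
) -> bool:
    """A system is safe to retire once nothing still in service depends on it."""
    in_service = [d for d in dependents_of(system, depends_on) if d not in retired]
    return not in_service


# Hypothetical inventory modeled on the DECserver outage.
inventory = {
    "decserver-1": ["host-vax"],
    "decserver-2": ["host-vax"],
    "host-vax": [],
}
print(dependents_of("host-vax", inventory))
print(safe_to_retire("host-vax", inventory))
```

With the dependency recorded when the DECservers were installed, the check would have flagged the host VAX as still in use before anyone shut it down.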

Most of the time, dealing with a change-control process seems like too much of an annoyance -- and sometimes it is. But when it's 3 in the morning and you're the one trying to get a sick server up and running again, having all this information available is anything but annoying.

Stephen Litchfield is a senior staff LAN analyst at a Massachusetts-based manufacturer. Send your comments on this column to him at slitchfield@nwc.com.

