Network Computing

Management Issues

Finances

Managers often find it nearly impossible to devote time to adequately understanding issues that do not contribute directly to the bottom line. Consequently, it is usually difficult to justify the expense associated with the network, and as we will see later on, the price tag goes up with the level of redundancy. One useful technique for securing adequate funding, however, is to lay out in black and white the cost of downtime.

The Cost of Downtime

In this simplistic example, we examine the cost of downtime for a mythical consumer-oriented business, such as an airline's or hotel's reservation center. The customers have a choice. If they cannot reach the reservation center, they will call a competitor and place their order there. Lost business is really gone for good.

Our hypothetical customer service center has a staff of 500 people, each of whom carries a burdened cost of $25 an hour. They make an average of 60 transactions per hour and an average of three high-priced sales per hour. Hours of operation are 24 hours a day, seven days a week, 365 days a year.

In actuality, line managers of the site should calculate the costs of downtime, not the IS staff. This information often is not forthcoming, however, so you may have to present an estimate yourself to give a general sense of the impact that downtime has on the bottom line. The goal is to open some eyes and generate some debate. Use this example as a guideline for how to estimate the cost of outages in your environment.





As we can see, the cost of outages in our hypothetical network with an availability rate of 99.9 percent is about half a million dollars a year. We have already bought the hardware and software necessary to do the job. We can consider this estimate a guideline on the additional budget to spend on providing redundancy. This is separate and apart from the funds required to provide a base level of network functionality.
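
The arithmetic behind an estimate of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the staff size, burdened rate, sales rate, hours of operation and availability come from the example above, while the sales rate is assumed to be per agent and the $30 margin per sale is a made-up placeholder that does not come from this article. The function names are likewise hypothetical.

```python
# Rough sketch of a cost-of-downtime estimate for the reservation center example.
HOURS_PER_YEAR = 24 * 365  # 24 hours a day, 365 days a year


def annual_downtime_hours(availability: float) -> float:
    """Hours of outage per year implied by an availability rate between 0 and 1."""
    return HOURS_PER_YEAR * (1.0 - availability)


def annual_downtime_cost(staff, burdened_rate, sales_per_agent_hour,
                         margin_per_sale, availability):
    """Split the yearly cost of outages into idle labor and lost sales."""
    hours = annual_downtime_hours(availability)
    idle_labor = staff * burdened_rate * hours
    lost_sales = staff * sales_per_agent_hour * margin_per_sale * hours
    return {
        "downtime_hours": hours,
        "idle_labor": idle_labor,
        "lost_sales": lost_sales,
        "total": idle_labor + lost_sales,
    }


estimate = annual_downtime_cost(
    staff=500,                # from the example
    burdened_rate=25.0,       # dollars per hour, from the example
    sales_per_agent_hour=3,   # ASSUMPTION: the three sales per hour are per agent
    margin_per_sale=30.0,     # ASSUMPTION: placeholder, not a figure from the article
    availability=0.999,       # 99.9 percent
)

for name, value in estimate.items():
    print(f"{name:>15}: {value:,.2f}")
```

At 99.9 percent availability the site is down for roughly 8.76 hours a year. Substitute figures supplied by your own line managers for the two assumed inputs before showing a number to anyone who controls a budget.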

Some additional industry statistics may help. An article in the September 1994 issue of HP Professional, "Down but not out," said:

The average company loses two to three percent of its gross sales within 10 days after losing its data processing, and critical business functions cannot continue for more than 4.8 days without a recovery plan in progress. Half of the companies that do not restore their data center to operation within 10 business days never fully recover. Ninety-three percent of the companies lacking a recovery plan are out of business within five years of a major disaster.

It is really not worth rushing headlong into designing a fault-tolerant network unless all parties agree on the implications that downtime has for the operation. This is the time to seek an executive sponsor to champion the process. Assuming there is a consensus on the real cost of downtime, we can move on to crafting a plan of action.

The Service Level Agreement

That plan should start with a service level agreement. A service level agreement is simply a contract between your corporate customer and the IS department. It formalizes the relationship on a customer/supplier basis and documents the understanding between the two. Some IS departments will view the process with skepticism. It can be unnerving to relinquish the upper hand if users have been viewed as mere consumers, not valued customers.

As a way to secure funding and, more importantly, to document the responsibilities and expectations of all parties, however, this process fits many medium-to-large organizations. It should be a win-win for all involved. We should strive to maximize results, concentrate effort, and recommend organizational change where appropriate.

At the outset of the plan, note that fault tolerance is not simply a response to failure. It involves an ongoing cycle of planning, design, daily monitoring, long-term trend analysis and regular re-evaluation. We should include all assumptions and forecasts as part of the plan and update it as assumptions or growth projections change.

Measurement of progress against the plan should not be cast in terms of how long a particular switch or server has been up. Tracking individual components and subsystems is obviously important, but component uptime does not by itself describe the service the customer receives. Rather, progress should reflect the ability of the system to meet the users' expectations as documented in the service level agreement.

The service level agreement should document the understanding between the parties about:

• The priority that systems or groups receive in a triage situation
• Mandatory or core functions that need extra protection versus desirable or support functions. In some situations, core functions can comprise as little as 20 to 30 percent of the total number of features.
• User responsibilities. For example, only approved software will be used.
• The understanding that no unauthorized software will be installed
• The responsibilities of all IS parties: development, support, database, network, operations, and vendors
• Time frames for response and repair
• Expected levels of unplanned outage
• Expected levels of planned outage
• Expected performance characteristics during normal conditions
• Expected performance characteristics during failure conditions
• Certification process for new systems
• Standards and guidelines for all components
• Resolution of inadequate performance
• Costs for different alternatives
• Process for changes in forecasts
• Exceptions, if any
• Escalation procedures
• Re-evaluation processes

Ideally, this should be applied to all parts of the system including central and remote sites. Don't forget to consider your partners and customers external to the organization.

Hold everybody's feet to the fire until you get participation. Draw up a set of assumptions based on your own experience with the applications and groups involved. Then, on an individual basis, set up interviews, meetings, surveys or whatever it takes to get buy-in.

If this sounds like a lot of extra effort, it is. But the IS business is about service. Excellence in customer service is the only real difference between you and the competition. It makes sense to apply a certain amount of rigor to the process. If yours is the type of shop that flies by the seat of its pants, now is a good time to re-evaluate that position. Again, it goes back to attitude. Organizational determination needs to exist in order to truly provide for fault tolerance. Remember, the responsibility for avoiding failures, recovering from them, and providing backup and restore falls entirely on your shoulders. No vendor can relieve you of this responsibility.

Conforming to the Service Level Agreement

The point is to ensure that we have consistency between design, implementation and our goals. We need to put methods in place to ensure that. The first step in ensuring the long-term quality of the effort is to determine which statistics will be tracked. At this stage, we will need a plan to track conformance to the service level agreement. The plan should include:

• recommended methods and tools
• change control
• configuration management
• daily and long-term statistics
• documentation plan for service level reporting

Be practical about how much data to store in your database of statistics. A roll-up, or summary, of statistics, if done with foresight, may be sufficient. Plan on keeping it to a reasonable size. Break it into sections which can be managed independently by different groups.
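
As one way to picture such a roll-up, the sketch below condenses individual outage records into one summary row per month and checks each month against an allowance for unplanned outage. Everything in it is hypothetical: the `Outage` record, the `monthly_rollup` function, the sample data and the 45-minute monthly allowance are illustrative assumptions, not figures or tools from this article.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Outage:
    """One outage as logged by operations (hypothetical record layout)."""
    day: date
    minutes: float
    planned: bool


def monthly_rollup(outages, sla_unplanned_minutes):
    """Summarize outages per (year, month) and flag months that exceed the allowance."""
    totals = defaultdict(lambda: {"planned": 0.0, "unplanned": 0.0})
    for outage in outages:
        key = (outage.day.year, outage.day.month)
        kind = "planned" if outage.planned else "unplanned"
        totals[key][kind] += outage.minutes
    return {
        key: {**t, "within_sla": t["unplanned"] <= sla_unplanned_minutes}
        for key, t in sorted(totals.items())
    }


# Made-up sample data for illustration only.
outages = [
    Outage(date(1996, 10, 3), 20, planned=False),
    Outage(date(1996, 10, 19), 30, planned=False),
    Outage(date(1996, 10, 26), 120, planned=True),
    Outage(date(1996, 11, 7), 15, planned=False),
]

# ASSUMPTION: the agreement allows 45 minutes of unplanned outage per month.
report = monthly_rollup(outages, sla_unplanned_minutes=45)

for (year, month), row in report.items():
    print(year, month, row)
```

Keeping only the monthly rows, rather than every raw record, is one way to hold the statistics database to a reasonable size while still reporting planned and unplanned outage separately, as the agreement requires.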

Scope

Unfortunately, in the real world, there is no way to limit scope. Disasters can occur anywhere in the network, at any level in the ISO communications model from the physical layer to the presentation layer. It can be useful, even necessary, however, to separate the problem into logical groups. For example, most corporate IS staff are divided into something like the following groups:

workstation support
network
server support
database
development, both client and server

In addition, two functions are shared across all groups:

planning
operations

In the following sections we will address the issues and responsibilities regarding fault tolerance as they apply to those groups. Emphasis will be placed on network considerations. However, remember that responsibility is shared across all groups.

November 15, 1996

Copyright © 2008 United Business Media LLC