The Disaster Recovery Team
A successful disaster recovery is not possible without a competent disaster recovery team. Securing talented people to perform the recovery is the first step toward being protected. It will be up to each of them to create, follow, and maintain the disaster recovery plan. Follow the guide below to ensure the proper personnel are recruited.
A succession of natural and man-made disasters in the United States in recent years (the San Francisco and Northridge earthquakes, the Chicago River flood, Hurricane Andrew) has spurred corporate interest in disaster recovery. Frequently, however, it is not the disasters themselves that prompt corporate leaders to invest in disaster recovery planning for their information technology systems. Typically, the move to disaster recovery planning originates from the mandate of financial institutions, the sting of a negative external audit, or the threat of a shareholder's lawsuit.
Even when companies acknowledge the benefits of disaster recovery, many executives do not feel a sense of urgency and fail to recognize the necessity of planning until sudden disaster strikes. The reality is that any business that relies on information technology, which includes most businesses, needs a disaster recovery plan. This is especially true for medium and small companies that, unlike large companies, have limited resources. These companies are often the first to succumb to a disaster.
Management support is essential because disaster recovery planning costs money and affects the entire company. At times, getting that support can be difficult. For a number of reasons, some managers are reluctant to invest in something that they probably, and hopefully, will never need. Others are optimists, believing disasters are things that happen to other companies. Still others believe that they are already prepared. Overcoming management objections to disaster recovery planning requires increased awareness of risks and their potential impact, together with a well-thought-out campaign strategy. It is important to have a project plan that defines a reasonable period of time for developing the disaster recovery plan, the resources available, the budget required, and key milestones for measuring the progress of the project.
Build Reliable Support
For any BRP effort to be successful, it needs senior management support. However, some disaster recovery professionals associated with large organizations claim that the support they received at the beginning of the development effort was not enough to sustain their project through to its completion. So what can be done to maintain senior management support? One suggestion, proposed by Glancy and Stamiezkin, is to help senior management relate to the effort and keep them interested in the project on an ongoing basis. The authors suggest that, for most organizations, a business impact analysis (BIA) is a helpful tool with which to start. The BIA helps to identify the major threats to the organization and the impact of having one of those disasters wipe out one or more of its operations, but organizations also quickly become aware of its limitations. The probability of any of the "BIG" threats occurring is fairly small, and most senior managers intuitively know that those threats exist and what the general impact would be to the organization. Glancy and Stamiezkin caution that ongoing support from senior managers will not be achieved by running "The Big One" up the flagpole over and over again.
On the other hand, keeping management focused on an effort that addresses the safety of the organization's people, mitigates the impact on customer service, and maintains the financial well-being of the company is a much more powerful motivator for senior management.
In another article, Iyer and Diez (1997a, pp. 11-12) encourage the use of an awareness training program for senior managers that does not merely restate news items and facts (such as Hurricane Andrew, the World Trade Center bombing, etc.). Rather, the program should clearly demonstrate how events that are relevant to an organization's physical location and business environment could lead to unwarranted and costly consequences. The suggested approach is to select a business process and run through scenarios based on relevant risks and threats likely in and around the location of the organization, to better illustrate the need for business continuity planning. Another approach is to select an event (flood, winter storm, or tornado in the local area) and interactively dissect the event's impact on all of the business processes. Table 4 lists a set of criteria that may be followed in designing an awareness training program. It is important that the audience leave the meeting or presentation with a clear understanding of the value added by the commitment of resources, time, and funding that senior managers make to the business continuity planning program.
Most BRPs are developed on a "worst case scenario" and, for most companies, it makes little difference if an earthquake or a flood, a fire or a bomb destroys a building. The building is gone and the company needs to recover. Consequently, some consultants believe that it is wise to invest most of the limited time, money, and energy available for developing the BRP at the functional or work group level as opposed to focusing on a detailed corporate-wide BIA.
Top 10 List of Criteria for Developing Senior Management Awareness Training Programs
1. Know your audience.
2. Aim for awareness and then commitment, not just approvals.
3. Seek to understand, not to be understood.
4. Examples, examples, examples, but relevant, relevant, relevant.
5. Make them listen, not just hear.
6. Fifteen minutes or less to sell.
7. Ask not what they will give you; ask what you can give them.
8. Is it merely a job, or is it an adventure?
9. Commitment starts after the presentation.
10. Now that you are the chosen one, what is your next step?
By moving quickly from the analysis stage to the development of the disaster recovery plans for the functional areas, companies may be able to identify ways to increase operating efficiency; build a stronger, more stable organization; and improve their position and marketability in the marketplace.
Another, more drastic strategy for raising management's consciousness regarding the importance of disaster recovery planning is proposed by Woodworth (1996). He recommends that if you are confronted with people to whom disaster recovery planning seems to be an easy project to postpone, try a new tack: test first. Experience, he claims, shows that tests (even though small in scale) can provide the catalyst needed to bring out the interest and enthusiasm necessary to make real progress in developing a crisis plan to cope with major incidents. If this tack is adopted, he suggests that the following steps be initiated:
1. Identify an area of particular importance to the company. Note: Pick an area that has considerable visibility in the company, probably from cash flow and/or customer service. It needs to be an area where its management would be reasonably receptive to a small test specifically tailored to their requirements.
2. Develop a reasonable scenario and intentionally limit the scope of the exercise. Note: A good scenario may be a fire that destroys a floor or two of a large facility or destroys a small, operationally important facility. Pick an operational area rather than a computer center; the test itself will most likely highlight the importance of computer information anyhow.
3. Ask the appropriate management to give verbal approval to this small scale test. Note: Assure the manager that routine business will be accomplished while the test is underway. A good point to remember at this stage is that a "surprise" test is not needed. In fact, a 100% surprise test would probably be counterproductive and leave the employees with a negative attitude toward disaster recovery planning in general.
4. Assemble a small planning/disaster recovery team. Note: The team should include persons knowledgeable about voice and data communications, facilities management, and two or three key employees from the area to be tested. During the pretest stage, the voice communications person should prepare a diagram of the area's phone network while the data communications person prepares the same for applicable data circuits. The primary reason for these diagrams is to disclose voice or data connectivity links to other areas of the company, many of which employees may not be aware of. The facilities person should pre-design a space assignment matrix based on essential personnel and on voice and data requirements for 24 hours, 48 hours, and one week. The chart should be about 2 x 3 feet and laminated so that dry erase markers can be used during the test.
5. Plan and execute the test. Note: The date/time should be pre-approved by the manager involved and the test team. Normally the best arrangement is to start the test about 8:30 AM with a test commencement meeting, have an update meeting at 1 PM, and hold a debriefing session (one hour maximum) around 10 AM the next day. Arrange a convenient place to be designated as the emergency operations center (EOC) from which to run the disaster recovery effort. Consider choosing a room in the building that is supposed to be damaged, rather than setting up a fancy, pre-equipped EOC at this early stage of plan development.
It is essential that the senior manager of the tested area personally kick off the test commencement meeting and designate a Disaster Recovery Team Chairman. The senior manager's remarks should be followed by a detailed test brief by the test coordinator.
The briefing should include the test goals, the scenario, any contributing factors (e.g., each department should simulate one casualty requiring first aid), test rules (e.g., no access to burned out floors for at least one week) and the debriefing and report writing requirements.
What should you expect to get out of the test?
1. Notification trees will be produced.
2. Space/voice/data planning for the short term will have been accomplished with a chart available for use in a real emergency.
3. Employees will have looked at their life support plans (e.g., first aid supplies and training, and assisting handicapped or injured persons).
4. Employees will have looked at whether there is satisfactory off-site record duplication.
5. Priority tasks will have been assigned.
6. Plans for responding to the media will have been covered.
7. The personnel department will have determined how they will handle next of kin notifications, pay, insurance claims, etc.
8. Disaster recovery insurance coverage will have been reviewed.
9. Many other important planning items will come to light.
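The first outcome in the list above, a notification tree, can be kept as a simple data structure so that the calling order is easy to print and rehearse. The following is a minimal sketch only: the `Contact` class, the `call_order` function, and every name and phone number in it are hypothetical placeholders, not part of any method described in the source.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Contact:
    """One person in the notification tree and the people they must call."""
    name: str
    phone: str
    calls: List["Contact"] = field(default_factory=list)


def call_order(root: Contact) -> List[Tuple[int, str, str]]:
    """Walk the tree breadth-first and return (round, caller, callee) tuples."""
    order = []
    queue = deque([(root, 1)])
    while queue:
        caller, round_no = queue.popleft()
        for callee in caller.calls:
            order.append((round_no, caller.name, callee.name))
            # People reached in this round make their own calls in the next one.
            queue.append((callee, round_no + 1))
    return order


# Hypothetical example tree; roles and numbers are placeholders.
tree = Contact("Team chairman", "555-0100", [
    Contact("Facilities lead", "555-0101", [
        Contact("Facilities staff", "555-0111"),
    ]),
    Contact("Communications lead", "555-0102", [
        Contact("Voice specialist", "555-0121"),
        Contact("Data specialist", "555-0122"),
    ]),
])

for round_no, caller, callee in call_order(tree):
    print(f"Round {round_no}: {caller} calls {callee}")
```

Grouping calls into rounds shows how many hand-offs it takes to reach everyone, which is one of the things a small test is likely to expose.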
When the test is over, a good report is necessary; however, for this initial test it doesn't have to be long. It is important that the senior manager sign the report, not the test coordinator. The report should be addressed to the appropriate executive management personnel (with copies if you wish) and should be due within two weeks of the test. A responsible person within the test area assigned by the senior manager should write the report. The test report should consist of a cover memo, possibly an executive summary and then a more detailed report including findings, recommendations and conclusions, along with any appropriate supporting documents.
The bottom line is that it is going to take commitment to develop a disaster recovery plan. It starts with commitment by senior executives, followed by:
The right person to drive the process. (This may be the hardest part of long-term planning.)
Commitment to startup costs.
Commitment to maintenance costs.
Commitment to test costs.
Commitment to changing the way the business operates.
Endurance to make the plan cost effective enough to pay for itself.
Secure and Prepare Resources
In order to muster support and resources, it is necessary to create a high-level strategy for the effort. Senior managers want to know the status of the disaster recovery project and how resources are being used. When positioning the project resources, there are a few methods from which to choose:
1. Have a central staff exclusively dedicated to developing plans for all areas.
2. Have each functional team pick a person to develop their own plan.
3. Develop a central control area responsible for coordinating the overall project, and enlist a small group of BRP planners who are located within the line operations.
Having a central staff area responsible for developing the BRP may provide some efficiency, but it puts the emphasis on disaster recovery in the wrong place in the organization. The functional business areas need to take full responsibility for developing and maintaining their own BRPs. After all, if an incident wipes out a business function, it won't be the people in the staff area who will recover the operation.
The second option presents a different set of issues. Making each functional business area responsible for developing its own plan places the responsibility for disaster recovery in the right place. However, this option lacks focus and consistency. The most significant drawback is the lack of coordination and prioritization among functional areas. With no central area coordinating the effort, the functional teams are left to their own resources to develop plans. Each functional area often sees itself as the most important area for the organization to recover. The quality and consistency of plans across functional teams may also be hard to establish and even harder to maintain. Also, functional teams often compete for limited resources, such as equipment and space, following a disaster.
Option 3 above seems to draw on the strengths of both options 1 and 2. In centralizing the coordination responsibility for the project, redundant disaster recovery requirements may be combined to support more than one functional unit (e.g., two or more teams may list the same resource as critical to their disaster recovery efforts).
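One way a central control area might spot such overlaps is to collect each team's list of critical resources and invert it. The sketch below is illustrative only: the function name `shared_resources`, the team names, and the resources are all hypothetical assumptions rather than details from the case described here.

```python
from collections import defaultdict
from typing import Dict, List


def shared_resources(requirements: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each resource listed by two or more teams to those teams."""
    teams_by_resource = defaultdict(list)
    for team, resources in requirements.items():
        for resource in set(resources):  # ignore duplicates within one team
            teams_by_resource[resource].append(team)
    return {
        resource: sorted(teams)
        for resource, teams in teams_by_resource.items()
        if len(teams) > 1
    }


# Hypothetical critical-resource lists submitted by three functional teams.
requirements = {
    "Claims": ["fax line", "imaging server", "20 desks"],
    "Customer service": ["call center switch", "imaging server"],
    "Payroll": ["check printer", "imaging server", "fax line"],
}

for resource, teams in sorted(shared_resources(requirements).items()):
    print(f"{resource}: needed by {', '.join(teams)}")
```

Resources that appear under several teams are candidates for a single shared recovery arrangement, and also the ones most likely to be contested after a disaster.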
In the case of one organization's initial concentrated effort to develop and implement the functional units' BRPs, two full-time resources staffed the central control area. This area reported directly to the president of the company and was responsible for coordinating the development effort, training the line planners, maintaining the central database and managing scarce resources. These individuals were also responsible for supporting the Incident Management Team (Emergency Management Team) and the Emergency Operations Center if these needed to be activated.
In a large organization, even after the infrastructure above is established, the scope of the project may still be prohibitively large. For example, the company initially determined that there were approximately 95 functional area teams covering 4,500 employees throughout North America that needed disaster recovery plans. It also estimated that it would take two years to complete the development of all those BRPs. One suggested approach, at this stage, is to focus on the development of the Incident Management Plan. In this way the members of the Incident Management Team, mostly senior-level managers, quickly become involved in the development of a comprehensive disaster recovery effort. Through a series of gradually more complex rehearsals of the Incident Management Plan, the team members also become increasingly aware of the importance of being prepared to handle any incident. In the case of this organization, it ended up with 177 plans and, because of efficiencies discovered along the way, completed the development of plans for all functional areas six months earlier than estimated.
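As a rough check on these figures, the planned and actual rates of plan development can be compared. This is only a back-of-the-envelope sketch: it assumes that "six months earlier" means 18 months against the two-year estimate, and that the 95 planned and 177 delivered plans are comparable units of work, neither of which the source states explicitly.

```python
# Figures from the case above. The 18-month actual duration is an assumption
# inferred from the two-year estimate minus six months.
estimated_plans, estimated_months = 95, 24
actual_plans, actual_months = 177, 24 - 6

estimated_rate = estimated_plans / estimated_months  # plans per month
actual_rate = actual_plans / actual_months
ratio = actual_rate / estimated_rate

print(f"Estimated: {estimated_rate:.1f} plans per month")
print(f"Actual:    {actual_rate:.1f} plans per month")
print(f"Ratio:     {ratio:.1f}x")
```

Under those assumptions the team produced plans at roughly two and a half times the rate originally planned.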
Appoint a Disaster Recovery Coordinator
Who is an ideal disaster recovery coordinator? There is no pat answer to this question. A disaster recovery coordinator does not need the highly technical skill set of a programmer, systems analyst, hardware specialist, or network administrator. It is important, however, that the candidate be able to communicate with technical staff and adequately interpret what they say in order to communicate it effectively to nontechnical users in reports, procedures, and other documentation. It is important that the coordinator be:
Organized
Detail oriented
A competent writer
Able to work through complex problems and issues
Experienced in managing vendors
Experienced in evaluating product offerings
Fluent in project management principles and techniques
Highly skilled in patience, perseverance, and diplomacy.
In most large organizations the job of a disaster recovery coordinator is a full-time task. In smaller companies that do not have a full-time coordinator, the heads of various departments throughout the company must share those responsibilities.
