Getting an organization to declare a
disaster can be a matter of perspective,
according to Don Stewart, director of professional services at Ongoing Operations,
a non-profit business continuity service
provider for U.S. credit unions. “In some
events, IT is so focused on fixing the problem that they don’t inform senior management of the disaster event,” said Stewart.
Some organizations have not defined
what a business disruption is; as a result,
senior management will hesitate to declare
a disaster if the event is perceived to be
minor, for instance a phone
system failure or delays in e-mail messaging.
Staying prepared requires more than
having a documented business continuity
plan; it requires teamwork from all stakeholders. Having a stake in planning at
this level helps ensure that business operations
will be maintained in the event of a disruption. Stewart recommends that a good
plan start with a risk impact analysis.
Most companies, according to Stewart,
will purchase an in-depth risk assessment
and then do nothing about it; “the report
just sits there with no further actions being
taken.” This is as effective as making a list
of essentials to pack in a kit in case of a
house fire but never assembling the kit.
The Strengths of Business
Continuity
Recently, the IT department of the U.S.
State of Ohio virtualized the data centers
that provide governmental social services
to residents with developmental disabilities. The goal of the project was to give employees and external users access
to service applications without any downtime, along with the ability to scale for future
growth. This project supports 80,000 Ohio
residents.
TechTarget reported on the project,
noting that it took
nine months of architecture planning and that
disaster recovery requirements were a top priority before the team began building the infrastructure.
By leveraging the experience and expertise of internal staff and
by working with a qualified third-party
IT service company from the beginning,
this project was completed on time and
currently supports 200 virtual machines.
More than 90 percent of the department’s
servers have been virtualized, TechTarget
reports. This project is an excellent example of how IT virtualization projects can
work in harmony with business continuity
objectives to deliver quality services.
Mercy Medical Center, Cedar Rapids,
Iowa, provides a success story of having
a business continuity plan in place for the
entire organization. They successfully put
their business continuity plan into action
during the Midwest floods of 2008, and
according to the hospital’s website, after
three weeks the hospital returned to full
operations. The Wall Street Journal’s
Health Blog has a compelling interview
about the plan’s evacuation and recovery
process.
In summary, companies that invest the
time, resources, and technology into business continuity plans are better prepared
to handle business disruptions. The preceding accounts of successful recoveries
affirm the value of disaster readiness.
The Weaknesses of
Technology and Business
Continuity
On the other hand, overconfidence in
the technology behind a business
continuity plan’s recovery point and recovery time objectives (RPO,
RTO) can be dangerous. As a
former data recovery engineer, I can attest
to the numerous questions IT administrators, business directors, and executives have
about their technology disaster and
how it happened to them.
Nobody wants a data loss or business
disruption on the systems they are responsible for, yet there is usually a cascade of
technological failures when
an IT disaster occurs. Discovering during the
course of an IT disaster that backups have
been failing or that backup software has
not been reporting media failures is
gut-wrenching. Too often, a serious data loss
or business disruption results in unemployment for those responsible or thought
to be responsible, and the equipment is no
longer viewed as reliable.
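One way to avoid learning about failed backups in the middle of a disaster is to check routinely whether the newest successful backup still falls within the recovery point objective. The following is a minimal Python sketch of that check, not a description of any particular backup product: the BackupJob record, the function names, and the sample dates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BackupJob:
    """One backup run as it might appear in a job log (hypothetical shape)."""
    finished_at: datetime
    succeeded: bool


def latest_success(jobs: List[BackupJob]) -> Optional[datetime]:
    """Return the finish time of the most recent successful job, if any."""
    times = [job.finished_at for job in jobs if job.succeeded]
    return max(times) if times else None


def rpo_violated(jobs: List[BackupJob], rpo: timedelta, now: datetime) -> bool:
    """True when the newest good backup is older than the RPO, or none exists."""
    last_good = latest_success(jobs)
    return last_good is None or (now - last_good) > rpo


# Made-up sample data: one good backup followed by two silent failures.
now = datetime(2024, 1, 10, 12, 0)
jobs = [
    BackupJob(datetime(2024, 1, 8, 1, 0), True),
    BackupJob(datetime(2024, 1, 9, 1, 0), False),
    BackupJob(datetime(2024, 1, 10, 1, 0), False),
]

# With a 24-hour RPO, the last good backup (59 hours old) is out of bounds.
print(rpo_violated(jobs, timedelta(hours=24), now))
```

A check like this only helps if it runs on a schedule and its result reaches someone who will act on it; it also assumes the job log itself is trustworthy, which the failures described above show cannot be taken for granted.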
The Cascade of Failures
Disaster recovery efforts went from bad
to worse for a company in Europe recently.
During routine maintenance on the company’s
SAN storage that housed its virtual
machines, the SAN was presented to a different
physical server by accident and was
automatically reformatted by IT staff. This
company’s disaster recovery infrastructure
included an identical SAN storage unit
located off-site which employed site replication
technology. The IT staff thought
that this event would be a minor business
disruption.
When the Threat is on the
Inside
A business merger in the United States suffered a disaster while the two companies’
IT departments were merging their data.
The first company’s virtual host server
held over 400 virtual machines across 20
storage LUNs. During the data merge,
someone with administrative access to
the virtual host server deleted the 400 virtual machines and their virtual disk files.
Evidence suggests the disaster was caused
by employee sabotage, and the cause is still
under investigation by computer forensic
investigators.
The merging company quickly engaged
emergency data recovery services and prioritized core servers that provided essential services. In three days, those systems
were up and running. For the next two
weeks emergency recovery efforts continued on the rest of the storage system. This
required extensive recovery engineering
efforts to search the unallocated areas of
the storage LUN for potential virtual disk
files, identifiable only by their file system
attributes.
Through a combined effort of backup
restoration and original volume recovery, data was recovered. Most of the virtual disk files were complete, while other
virtual disks required the file contents to