Category Archives: DRP

WFH Workers: What’s your Power and Internet DR Plan?

When we all worked in corporate offices, our enterprises (if they were large enough) developed DR plans for power and Internet connectivity, enough that it took a significant event to take a workforce offline. Through economy of scale, workers were all covered by DR plans for power, Internet, and environmental controls, resulting in a decent level of resilience.

APC UPS for laptop and WiFi

But now that many of us are home, our DR is not what it should be. In our home offices, we lack multiple ISP and power feeds. In almost every respect, our critical infrastructure at home (power, Internet, environmentals) is N+0: if our power goes out, or our Internet goes out, or if our heat or A/C goes out, there are probably no backup systems.

We are each responsible for our own DR in our home offices. If we want resilient power, we need a UPS and/or a generator. If we want resilient Internet, we have to have a fallback plan. If we want resilient heat or A/C, we must have another source of heating or cooling.

We live in the country, where there’s no landline, no cable, and no fiber. We’ve lived in WA for decades, and are accustomed to extended power outages due to severe weather events. For us, Internet outages are infrequent, but they happen, particularly at harvest time when farm equipment parks in front of our fixed wireless antenna, or when our local ISP is doing maintenance on network elements such as patching routers.

Fixed Wireless Internet Antenna

Here are the details of my primary and backup plans:

Resource | Primary | Backup
Internet | Fixed Wireless WiFi (12Mbps/12Mbps) | iPhone tethering (3Mbps/3Mbps)
Power | Utility Power | Cyberpower UPS for Internet POE (5 hrs); APC UPS for WiFi and Laptop Computer (6 hrs); Honda EU2200 Generator; Future: 10 circuit transfer switch + larger generator + EMP circuit protection
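
UPS runtimes like the ones listed above depend on battery capacity and on the load being carried. As a rough sketch of that arithmetic (the battery size, load, and efficiency below are hypothetical values chosen for illustration, not measurements from this setup):

```python
def estimated_runtime_hours(battery_wh: float, load_watts: float,
                            inverter_efficiency: float = 0.85) -> float:
    """Rough runtime estimate: usable battery energy divided by the load.

    The 0.85 efficiency default is an assumed figure, not a vendor spec.
    """
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    return battery_wh * inverter_efficiency / load_watts


# Hypothetical example: a 120 Wh battery feeding a 20 W radio and router.
print(round(estimated_runtime_hours(battery_wh=120, load_watts=20), 1))
```

Real runtimes also vary with battery age and temperature, so a timed test under actual load is the only figure worth trusting.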

Our utility power does go out from time to time. Since moving into our present home in mid-2020, we have had one outage lasting ~30 hours, and three outages lasting 1-3 hours. In the more prolonged outage, we fired up our larger generator (not pictured) to run our freezers and charge the UPSs.

Our source of water is a well that is on our property. When the power goes out, we have no water. For this reason, we have purchased a transfer switch that can enable us to run our well using our larger generator (I doubt that the little Honda can power it).

Gas generator

All of these measures have given me confidence that we will have power and Internet for short-term outages. For longer-term outages, the larger generator and transfer switch will enable us to run water and remain in our home for as long as we have generator fuel.

Since I do not work in a corporate office in a commercial building, I must implement my own resilience strategy. No one else is going to do it for me.

Smartphone WiFi hotspot

From a cybersecurity perspective, I’m pretty confident in our setup, but I’m not going to go into details here. We have a commercial NG-Firewall protecting our entire network, advanced anti-malware, secure DNS from cleanbrowsing.org, and other safeguards.

Recovery Capacity Objective: a new metric for BCP / DRP

Business continuity and disaster recovery planning professionals rely on well-known metrics that are used to drive planning of emergency operations procedures and continuity of operations procedures. These metrics are:

  • Maximum Tolerable Downtime (MTD) – this is an arbitrary time value that represents the greatest period of time that an organization is able to tolerate the outage of a critical process or system without sustaining permanent damage to the organization’s ongoing viability. The units of measure are typically days, but can be smaller (hours, minutes) or larger (weeks, months).
  • Recovery Point Objective (RPO) – this is a time value that represents the maximum potential data loss in a disaster situation. For example, if an organization backs up data for a key business process once per day, the RPO would be 24 hours. This should not be confused with recovery time objective.
  • Recovery Time Objective (RTO) – this is a time value that represents the maximum period of time that a business process or system would be incapacitated in the event of a disaster.  This is largely independent of recovery point objective, which is dependent on facilities that replicate key business data to another location, preserving it in case the primary location suffers a disaster that damages business data.
  • Recovery Consistency Objective (RCO) – expressed as a percentage, this represents the maximum loss of data consistency during a disaster. In complex, distributed systems, it may not be possible to perfectly synchronize all business records. When a disaster occurs, often there is some inconsistency found on a recovery site where some data is “fresher” than other data. Different organizations and industries will have varying tolerances for data consistency in a disaster situation.
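
As a minimal sketch of how these targets can be sanity-checked against each other, the snippet below records MTD, RTO, and RPO for one process and flags two problems. The class and function names are hypothetical. The rule that RTO should not exceed MTD is a common planning convention rather than something stated above; the backup-interval check follows the daily-backup example in the RPO definition.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTargets:
    name: str
    mtd_hours: float  # Maximum Tolerable Downtime
    rto_hours: float  # Recovery Time Objective
    rpo_hours: float  # Recovery Point Objective


def check_targets(targets: RecoveryTargets,
                  backup_interval_hours: float) -> list[str]:
    """Return a list of problems found; an empty list means none."""
    problems = []
    # Assumed convention: recovery must complete within tolerable downtime.
    if targets.rto_hours > targets.mtd_hours:
        problems.append("RTO exceeds MTD")
    # With periodic backups, up to one full interval of data can be lost.
    if backup_interval_hours > targets.rpo_hours:
        problems.append("backup interval cannot meet RPO")
    return problems


payroll = RecoveryTargets("payroll", mtd_hours=72, rto_hours=24, rpo_hours=24)
print(check_targets(payroll, backup_interval_hours=24))
```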

In my research on the topic of business continuity planning and disaster recovery planning, I have not come across a standard metric that represents the capacity for a recovery system to process business transactions, as compared to the primary system. Yet in professional dealings I have encountered this topic many times.

A new metric is proposed that is used to establish and communicate a recovery objective that represents the capacity of a recovery system:

  • Recovery Capacity Objective (RCapO) – expressed as a percentage, this represents the capacity of a recovery process or system as compared to the primary process or system.

Arguments for this metric:

  • Awareness. The question of recovery system capacity is not consistently addressed within an organization or to the users of a process or system.
  • Consistency. A standard metric for recovery system capacity gives organizations a common way to state and compare that capacity.
  • Planning. The users of a process or system can reasonably anticipate business conditions should a business process or system suffer a disaster that results in the implementation of emergency response procedures.
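
To make the proposed metric concrete, here is a small sketch of the percentage calculation. The function names and the transactions-per-hour figures are hypothetical; any capacity unit works as long as both systems are measured the same way.

```python
def recovery_capacity_percent(recovery_capacity: float,
                              primary_capacity: float) -> float:
    """Capacity of the recovery system as a percentage of the primary."""
    if primary_capacity <= 0:
        raise ValueError("primary_capacity must be positive")
    return 100.0 * recovery_capacity / primary_capacity


def meets_rcapo(recovery_capacity: float, primary_capacity: float,
                rcapo_percent: float) -> bool:
    """True if the recovery system reaches the stated RCapO target."""
    actual = recovery_capacity_percent(recovery_capacity, primary_capacity)
    return actual >= rcapo_percent


# Hypothetical: primary handles 1200 transactions/hour, recovery site 300.
print(recovery_capacity_percent(300, 1200))
print(meets_rcapo(300, 1200, rcapo_percent=50))
```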

Cloud based solutions bring disaster recovery within reach of small business

Backup and Data Recovery (BDR) solutions traditionally have been high priced luxuries out of the reach of many small to medium business owners. Tape drives remain very expensive hardware components, and offsite storage services are simply too expensive for many companies to use. But now, cloud based solutions are poised to bring BDR solutions within reach of every business from the sole proprietorship to the multisite enterprise.

Let’s look at what a company needs for BDR. Data must be securely backed up, available in case of need, but safe from any disaster that might strike the company. When all of your data resides only on your fileserver, it is at risk from hardware failures, theft, human error, fire or other catastrophe. Many companies use tapes to back up their systems, but do not use a reliable way to move those tapes off site to a secure storage location. The same fire that cooks your server will melt the tapes in the file cabinet, and so will the summer sun beating down on the car’s boot.

Even the least expensive courier services can cost hundreds of dollars a month, and relying on tapes to store your data means needing redundant hardware to recover your data in an emergency. Tape based solutions are simply out of reach for most SMBs, who choose instead to accept the risk of loss because they don’t have a viable solution. Or rather, they didn’t until BDR met the cloud.

Cloud based BDR solutions use your company’s Internet circuit to make a secure connection to your service provider’s network, and perform data backups continuously. Typically an agent is installed on each server and workstation you wish to back up. The agent examines data changes at the block level, replicating data either directly to the cloud service provider or to a staging appliance in your datacenter that can further compress the data and stage the most recently changed data for rapid restores if necessary.
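
The idea of examining changes at the block level can be sketched as follows. This is an illustration only, not how any particular vendor's agent works: it splits data into fixed-size blocks, hashes each one, and reports which blocks would need to be sent. The block size and function names are assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size in bytes


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """SHA-256 digest of each fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes,
                   block_size: int = BLOCK_SIZE) -> list[int]:
    """Indexes of blocks in new that differ from old or did not exist.

    Truncation (new shorter than old) is not handled in this sketch.
    """
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, digest in enumerate(new_h)
            if i >= len(old_h) or digest != old_h[i]]


before = b"a" * 8 + b"b" * 8 + b"c" * 8
after = b"a" * 8 + b"X" * 8 + b"c" * 8 + b"d" * 4
print(changed_blocks(before, after, block_size=8))
```

Only the blocks reported as changed need to cross the Internet circuit, which is what makes continuous backup practical on a modest connection.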

Rather than investing thousands or tens of thousands of dollars on hardware and software, cloud based BDR solutions typically operate on a monthly subscription basis, with graduated pricing based on total data stored. This means that SMBs can start using the services immediately, and keep their costs manageable. They can select a smaller total data level to start, and raise the level as their needs grow. Because costs are monthly and subscription based, the financial treatment of these costs is frequently very attractive as well, going to operations rather than assets.

Many of the cloud based providers of BDR services offer free trials, which enable the business owner or IT admin to take the service for a test ride, ensuring that they are comfortable with the requirements, performance, and availability of the service. Some services can offer individual users backup capabilities for their workstations that go hand in hand with server based backups, while others pool team based storage to further enhance the services available.

With your data securely backed up to a cloud provider’s network, you can rest easy knowing that if disaster strikes, your data is not lost. It is safe and secure in the cloud ready for you to pull down at need.

This guest post was written by Casper Manes on behalf of IT Channel Insight, a site for MSPs and Channel partners where you can find other articles related to disaster recovery.

Why Disaster Recovery Requires a Plan

Guest post from Casper Manes on behalf of IT Channel Insight

Whether you are a commercial pilot, an astronaut, a submarine weapons officer, or a Cylon, you know the importance of having a plan. There are certain tasks that, no matter how repetitious they may seem, are so important to get right the first time, and every time, that they have been boiled down to a checklist which any reasonably skilled and trained individual can walk through, step by step, in order, to accomplish the task. They are designed to be easy to follow, to spell out exactly what needs to be done, and the order in which it must be done, to get things going, and to require a minimum of creative thinking. Tasks are performed by rote, and verified each step of the way. That’s the perfect way to approach disaster recovery, and in this article we’ll discuss why you need a disaster recovery plan that is a little more detailed than “don’t panic!”

What is a disaster?

Let’s consider what, in business terms, can constitute a disaster. Sure, things like hurricanes and blizzards come to mind, perhaps even fires in the datacenter, but a disaster is more than just a weather phenomenon or catastrophic loss; it’s anything that significantly disrupts the normal operations of your business. If we limit ourselves to an IT perspective, that can include prolonged Internet outages, a severe flu epidemic that takes out half the staff, a virus that shuts down key servers, or a SAN failure. It can also include HVAC failures, power outages, or hardware failures on critical, but not redundant, systems. Anything that causes a significant and protracted impact to normal operations may be enough to declare a disaster situation, and require that you implement your recovery plan.

Disaster declared, now what?

In the best case disaster, you have experienced a hardware failure that will eventually be corrected by the vendor. But while systems are down, your phone is ringing off the hook, you’re getting pinged on email and IM, and someone is probably sticking their head in your cube every 30 seconds asking if it’s fixed yet. In the worst type of disaster, you and your colleagues are probably more worried about your families and your own property than the company’s, and that’s assuming your whole team even made it into the office. Hurricanes, blizzards, and other region impacting events can leave you with only a skeleton crew, and most of them are going to be worried about more than just how to get the website back online and email working. That’s why you want to work the plan.

By the numbers

Think back to how this article opened. When failure is not an option and there are countless distractions going on, you want people to have something to anchor themselves with, and to keep the need for creative thinking to a minimum. You also need to make sure that things are done in a certain order, and that nothing is missed, because most things have dependencies. A plan is the guide that your team will use to enable them to focus on specific and discrete tasks, without having to make it up as they go along. Make use of checklists. I mean actual paper documents on clipboards, with a check mark showing that each step is complete, so that:

a)     If something distracts you, it is easy to pick up where you left off without missing anything,

b)     You can hand off to someone else and they know exactly where to start, and

c)     Someone can audit that each step was done.

Paper checklists also have the distinct advantage of not relying on technology. I once saw an organization that kept all their DR procedures online, which looked great until they couldn’t get to them while the datacenter was down!
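
Because recovery steps have dependencies, the order of a checklist can be worked out ahead of time and then printed. The sketch below uses Python's standard graphlib module to do that; the step names are hypothetical and stand in for whatever your own plan contains.

```python
from graphlib import TopologicalSorter

# Hypothetical recovery steps: each step lists the steps it depends on.
steps = {
    "restore power": [],
    "bring up network": ["restore power"],
    "start database": ["bring up network"],
    "start web site": ["start database", "bring up network"],
}


def checklist(dependencies: dict[str, list[str]]) -> list[str]:
    """Order the steps so every step comes after the ones it depends on.

    Raises graphlib.CycleError if the steps depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Print in a form that can go on a clipboard.
for number, step in enumerate(checklist(steps), start=1):
    print(f"[ ] {number}. {step}")
```

Generating the list is only the preparation; the printed copy is what gets used when the datacenter is down.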

It’s a journey, not a destination

Disaster recovery planning is an ongoing process. Plans must be tested and revised as the company grows, new systems are brought into the environment, and old systems are deprecated. Real disasters don’t happen on schedule, so training must be thorough and testing must be performed to ensure that whoever is on the clock can handle the early steps of the process until more people can get online. Staffing changes will mean that this must happen frequently, and repeatedly. It’s just a part of the overall process, so accept it. And make sure that at least two people know how to perform any part of the disaster recovery plan since you have no way to know in advance whether everyone will be able to make it into the office when a disaster strikes. Redundancy of equipment is no more important than redundancy of skillsets, and a single point of failure could be the one guy who can’t get into the office because the roads are closed.

This article was written by Casper Manes on behalf of IT Channel Insight, a site for MSPs and Channel partners where you can find other related articles on how to set up a disaster recovery plan.

Ike: this is no time to think about disaster planning

Hurricane Ike

Thousands of businesses in Texas from Freeport to Houston are wondering, “How are we going to survive Hurricane Ike and continue business operations afterwards?”

If this is the first time this has crossed your mind, there’s precious little you can do now but kiss your systems and hope that they are still running when you see them again.  The storm surge is supposed to exceed 20 feet, which will prove disastrous to many businesses.

But when you get back to the workplace and things are back to normal (which I hope is not too long), start thinking seriously about disaster recovery planning.  A DR project does not have to be expensive or take a lot of resources, and it’s not just for large businesses.  Organizations of every size need a DR plan: the plan may be large and complex in big organizations, but in a small business it will be small, manageable, and inexpensive to develop.

Hurricane Ike's Path

Where do you begin?  At the beginning, of course, by identifying your most critical business processes, and all of the resources that those processes depend on.  Then you begin to figure out how you will continue those processes if one or more of those critical resources are not available.  The approach is systematic and simple, and repetitive: you go step by step through each process, identifying critical dependencies, figuring out how to mitigate those dependencies if they go “offline” at a critical time.
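
The step-by-step approach described above can be sketched as a simple inventory: list each critical process, list what it depends on, and see which dependencies still have no fallback. All of the process, resource, and fallback names below are hypothetical examples.

```python
# Hypothetical critical processes and the resources each one depends on.
processes = {
    "order entry": ["internet", "order database", "power"],
    "payroll": ["payroll server", "power"],
}

# Hypothetical fallbacks already in place for some resources.
fallbacks = {
    "internet": "cellular hotspot",
    "power": "generator",
}


def unmitigated(processes: dict[str, list[str]],
                fallbacks: dict[str, str]) -> dict[str, list[str]]:
    """For each process, the dependencies that have no fallback yet."""
    return {name: [r for r in resources if r not in fallbacks]
            for name, resources in processes.items()}


for name, gaps in unmitigated(processes, fallbacks).items():
    print(f"{name}: still needs a plan for {gaps}")
```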

Order yourself a great book that will get you started.  As one reviewer said, “It would be tempting to make all sorts of snide comments about a Dummies book that wants to take a serious look at disaster recovery of your IT area. But this is a Dummies title that you’ll actually go back to a number of times if you’re responsible for making sure your organization survives a disaster… IT Disaster Recovery Planning for Dummies by Peter Gregory. It’s actually the first book on the subject that I found interesting *and* readable to an average computer professional….” Read the rest of this review here and here.

Don’t put this off: strike while the iron is hot and get a copy of this now.  Don’t wait for the next hurricane to catch you off-guard.

I don’t want to see any business unprepared and fail as a result of a natural disaster.  If it were up to me, disaster preparedness would be required by law, but instead it’s a free choice for most business owners.  I just wish that more would choose the path of preparation and survival, but unfortunately many do not.  I wrote IT Disaster Recovery Planning For Dummies to help more people understand the importance of advance disaster recovery planning and how easy the planning process can be.

Press Release: Disaster Recovery Book Available in Electronic Edition

FOR IMMEDIATE RELEASE

CONTACT:  Rebecca Steele

rebecca.steele@ymail.com

Disaster Recovery Book Available in Electronic Edition

Book receiving critical acclaim from experts now available in Amazon Kindle edition

SEATTLE, Wash., September 5, 2008 – Technology author Peter H. Gregory’s 18th published book, IT DISASTER RECOVERY PLANNING FOR DUMMIES (John Wiley & Sons; $29.99; December, 2007), is receiving rave reviews from industry experts and professional reviewers.  The book is now available in electronic form on Amazon’s Kindle book reader.

According to Philip J. Rothstein, an industry expert on business continuity and disaster recovery planning and the owner of Rothstein & Associates, a disaster recovery planning consulting services firm, “Peter Gregory’s book helps to establish a realistic perspective for Disaster Recovery and provides a no-nonsense yet manageable foundation. He has identified many issues, techniques and tips which I found quite useful, despite my 25+ years involvement with business continuity and disaster recovery.”  Mr. Rothstein also wrote the Foreword to the book.  According to Thomas Duff, 25-year IT professional and Amazon “Top 100” reviewer, “It would be tempting to make all sorts of snide comments about a Dummies book that wants to take a serious look at disaster recovery of your IT area.  But this is a Dummies title that you’ll actually go back to a number of times if you’re responsible for making sure your organization survives a disaster…  It’s actually the first book on the subject that I found interesting *and* readable to an average computer professional.”

IT Disaster Recovery Planning For Dummies is now available in Amazon’s Kindle electronic format.  This will enable owners of the popular Amazon Kindle to purchase the book at a reduced price and have an electronic edition of the book (and hundreds of others) readily available.  “Not only will this make the book more convenient for people to read, but having an electronic edition could be especially handy during a disaster or other emergency situation,” cites Peter H. Gregory, author of the book.  “When offices are shuttered or unreachable, this and other important books can be pre-loaded on Kindle readers and available as back-up references for emergency planners and responders,” he adds.

IT Disaster Recovery Planning For Dummies is available in paperback form at local book dealers, and also from online dealers such as Amazon, Barnes & Noble, and Borders.  It is available as an e-book directly from the publisher, John Wiley & Sons, as well as for Amazon Kindle.

ABOUT PETER H. GREGORY:

Peter Gregory, (Graham, WA) CISA, CISSP, is a security and risk manager for a financial services organization and the author of twenty books on security and technology. Peter Gregory is a career security professional with experience in the government, banking, nonprofit, e-commerce, and telecommunications industries.  He serves on two boards of advisors for information security certificate programs for the University of Washington, and on the board of directors for the Evergreen State chapter of InfraGard, a partnership between the U.S. Federal Bureau of Investigation and the private sector.

IT DISASTER RECOVERY PLANNING FOR DUMMIES

Published by John Wiley & Sons, Inc.

Publication date: December 26, 2007

$29.99; Paperback; 360 pages; ISBN: 978-0-470-03973-1

*  *  *

Does your organization need a disaster recovery plan?

Many businesses, particularly those that have less than one thousand employees, think that disaster recovery planning is something that is too difficult or too expensive to undertake. Another response is that of the avoider: it won’t happen to me. These assumptions have been perpetuated to the detriment of many businesses that unnecessarily failed.

Disasters come in many forms. Most people think of massive earthquakes and hurricanes. However, there are hundreds of disasters that occur on a regular basis, but they’re too localized and small to make the news. And not all disasters are ‘acts of nature’: there are many man-caused disasters that occur on a regular basis that cripple businesses just like acts of nature do.

Disaster Recovery Planning need not be expensive, and most businesses can (and should!) get started right away with even a small amount of planning that could prove highly valuable, in case the unexpected occurs.

Get the book, build the plan!

Disaster recovery isn’t just for dummies

Disaster Recovery is not simply about Katrinas nor earthquakes nor 9/11 catastrophes. Sometimes, the focus on these monumental events could intimidate even the most committed IT manager from tackling Disaster Recovery Planning. Disaster Recovery is really about the ability to maintain business as usual – or as close to ‘as usual’ as is feasible and justifiable – whatever gets thrown at IT.

Read entire review

Find out more about the book, IT Disaster Recovery Planning for Dummies

The purpose for a criticality analysis

When the Maximum Tolerable Downtime (MTD), Recovery Point Objectives (RPO), and Recovery Time Objectives (RTO) targets have been established for each process, all of the processes can be compared to each other based upon these criteria. The point of the criticality analysis is to identify which processes in the organization are the most critical, based upon the objective measures that have been identified thus far in the Business Impact Assessment.

– From an upcoming book on data security
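
One way to sketch the comparison described above is to sort processes by their targets. The ordering rule here is an assumption made for illustration: a process with a shorter MTD is treated as more critical, with RTO and then RPO used to break ties. The process names and figures are hypothetical.

```python
from typing import NamedTuple


class ProcessTargets(NamedTuple):
    name: str
    mtd_hours: float  # Maximum Tolerable Downtime
    rto_hours: float  # Recovery Time Objective
    rpo_hours: float  # Recovery Point Objective


def rank_by_criticality(processes: list[ProcessTargets]) -> list[ProcessTargets]:
    """Most critical first: shortest MTD, then shortest RTO, then RPO."""
    return sorted(processes,
                  key=lambda p: (p.mtd_hours, p.rto_hours, p.rpo_hours))


# Hypothetical figures for three processes.
inventory = [
    ProcessTargets("payroll", mtd_hours=72, rto_hours=24, rpo_hours=24),
    ProcessTargets("order entry", mtd_hours=8, rto_hours=4, rpo_hours=1),
    ProcessTargets("email", mtd_hours=24, rto_hours=8, rpo_hours=4),
]

for rank, process in enumerate(rank_by_criticality(inventory), start=1):
    print(rank, process.name)
```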

Protecting data while in DR mode

While in disaster recovery mode – that is, during a disaster when critical business applications are operating in alternate locations – all of these protections are also needed:

  • DR servers need to be backed up
  • Backup media needs to be protected, usually through off-site storage
  • Transmitted data must be protected
  • Critical data must be stored on resilient storage systems

In disaster mode, business information and processes are just as critical as they are in times of normal operations. Consequently, the systems and processes in use during a disaster must provide the same level of protection as the primary systems and processes.

– from IT Disaster Recovery Planning for Dummies

On interim DR planning

Most organizations will immediately recognize the risks associated with the absence of a disaster recovery plan. Knowing that having a full DR plan in place and tested may be more than a year in the future, many organizations will have a strong desire to have something in place while waiting for the full DR plan to be completed.

Often the something that is needed is an interim DR plan. This is a plan that can be created quickly and with minimal effort. It will not, of course, be as comprehensive as a full DR plan. It is rather like tossing a tow rope in the back of a car, knowing that major engine work is needed.

Confidence in a DR plan

Disaster recovery plans aren’t much good if they don’t work. And if they don’t work, then the time devoted to their development has been pretty much wasted.

Decision makers in businesses, especially the executives, like certainty. They want to have confidence that things will go as planned. And while no one plans a disaster, they want to know that the recovery effort after a disaster will work.

The survival of the business may depend on it.

You can take your DR plan to a fortune teller, but I wouldn’t put much stock in that. Why not just try it?

– from IT Disaster Recovery Planning for Dummies

Server consolidation and disaster recovery planning

Server consolidation has been the talk of IT departments for several years, and represents a still popular cost cutting move. The concept is simple: rather than dedicate applications to individual servers, which can result in underutilized servers, install multiple applications onto servers in order to more efficiently utilize server hardware, thereby reducing costs.

I’m all for saving money, electricity, natural resources, and so on, and consolidating servers is a smart move to undertake, as long as you abide by this principle:

Server consolidation is something to undertake during peacetime, not solely for recovery purposes.

Let me expand on this. Consider an environment that is made up of dozens of underutilized servers dedicated to applications. The DR planning team wants to consider a DR strategy that consolidates these applications onto fewer servers as a way of providing a lower-cost recovery capability.

Well, it might work, but I’d want to test it very thoroughly and carefully. Combining applications that are used to having servers all to themselves may lead to unexpected interactions that could be difficult to troubleshoot and untangle.

If you want to undertake server consolidation, do it first in your production environment, and then take that consolidated architecture and apply it to a DR architecture.

– from IT Disaster Recovery Planning for Dummies

Aligning DR planning to the org chart in large organizations

Perhaps different segments of a large organization may push forward on DR planning at different rates. One’s lack of progress should not impede another. Instead, you might think of this as a DR plan for each cog in the organizational wheel. If this is how things get done in your organization, then perhaps the DR plan gets built in pieces, asynchronously. Progress has many faces.

– from IT Disaster Recovery Planning for Dummies

DR team selection

You can’t hand pick your recovery team members. The disaster will select them for you. It is for this reason that recovery procedures must be specific enough so that anyone with the basic relevant skills can carry them out confidently and correctly.

– from IT Disaster Recovery Planning for Dummies

DRP: the job is not done until the paperwork is done

Paperwork

The job is not done until the paperwork is done.

Nowhere is this pithy saying more true than in disaster recovery planning. Why? Because the paperwork in DRP is about how to jump-start the business when “the big one” hits. Depending upon where your business is located, the “big one” may be an earthquake, tornado, hurricane, flood, or a swarm of locusts.

The paperwork in DRP is simply this: the procedures and other documents that business personnel must refer to in order to get things going again after a disaster. The DRP procedures are especially important because they might be read and followed by persons who are not the foremost experts with the systems that support critical business processes. Still, those people are expected to rebuild critical systems in a short period of time in order to support critical processes that are probably going to be performed by people who likewise are not subject matter experts at the business process level.

And the business’s survival depends on the paperwork being right. There are no second chances.

You just love documentation, right? Thought so.

– from IT Disaster Recovery Planning for Dummies

Ninety percent of good disaster recovery planning is knowing what makes your environment run today.

– from IT Disaster Recovery Planning for Dummies

Building replacement workstations in a disaster

The need to build workstations on unfamiliar hardware platforms requires some out-of-the-box thinking for those who are required to build replacement workstations in a disaster. Straightaway, I recommend that workstation images be very well documented, so that they can be built from the ground up on new hardware platforms.

– from IT Disaster Recovery Planning for Dummies

Gap in PC Procedure Causes Corporate Crisis

Some years back, a colleague in another organization came to me for help. In this international organization and U.S. public company, the finance department was unable to close its quarterly financial books in time to meet a S.E.C. filing deadline.

It had missed the deadline for several days, and the matter had reached the CEO and the boardroom as an uproar.

The cause: an overseas subsidiary was unable to close its books. The reason: one of the steps to the overseas subsidiary’s completing its month and quarter-end financials was a procedure wherein a financial report was downloaded to a PC’s spreadsheet program, where a spreadsheet macro would perform some calculations that would be used in the subsidiary’s financial results.

This time, there was a problem: the macro had become corrupted and would not run.

There were no backups. A contractor had created the macro and was nowhere to be found. No one in the finance department knew what the macro did or how it worked. It was an undocumented step in this critical business process, and the original software was gone.

Be certain to keep this kind of scenario from occurring in your organization.

– from IT Disaster Recovery Planning for Dummies

Storing production data on end user workstations?

As I encounter cases where an employee’s workstation is, in fact, on the critical path for a critical business process, the first question I usually ask is:

Why?

Warnings go off in my head when I hear about an employee’s workstation in any process’s critical path.

– from IT Disaster Recovery Planning for Dummies

Now let me tell you why I think it’s a bad idea to store production data on end user workstations:

  • Workstation hard drives are not protected from failure by any RAID or mirroring technology. When the hard drive fails, the data is gone. IT servers often have RAID or mirroring, which protects the integrity and availability of the data.
  • Most users don’t back up their workstation hard drives. When the data is gone, it’s gone. IT servers are usually backed up regularly.
  • Most workstations have little or no power protection (plug strips hardly count). When sags, spikes, or brownouts occur, the workstation will take the brunt of this, possibly resulting in a crash or hardware failure. Sure, it’s unlikely, but it DOES happen. IT servers are usually protected by UPS and, sometimes, generators.
  • Users often tinker with workstations, which sometimes results in a disabled state and/or a reboot. This happens a lot less in most IT servers.
  • User workstations, particularly if they are laptops, are stolen far more frequently than IT servers, especially when those servers are locked up in server rooms.