Explain how you will test your backup and recovery strategy.
Continuity of operation
Testing the Continuity of operation
Here we discuss how to test the business continuity process, that is, the process of getting the business back to fully functional after a disaster or outage.
01. Exercises
Answer =>
A company runs different exercises to validate the readiness of its backup and recovery strategy.
Which exercises it runs depends mainly on the services it provides to its clients. The goal of these exercises is to validate the integrity of the backups created.
01. Testing for Failover: In scenarios where backups of entire virtual machines are created, use the supporting tools to recover the failed system and ensure that it boots up.
This is very important for mission-critical systems (defence, finance, etc.).
This task can also be performed manually.
02. Test the Functionality: Once the backup system is up and running, test that it functions properly.
03. Test the Data: Verify that the restored data is the same as the data on the server that failed.
The best way to ensure the recovered data is correct is to use it with the application and verify the results.
04. Test the Recovery Time: Measure the system recovery time during the exercise. This is an important parameter when deciding whether a strategy can go live.
05. Identify the Gaps: The exercise should be designed so that it exposes gaps and manual processes that can be automated in the future.
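The data and recovery-time checks above can be partly automated. The following is a minimal Python sketch, not a definitive implementation: it assumes the backup is restored as ordinary files on disk, and the names compare_trees, timed_restore and rto_seconds (the recovery time target) are hypothetical, chosen only for illustration.

```python
import hashlib
import time
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(source: Path, restored: Path) -> dict[str, list[str]]:
    """List files that are missing, unexpected, or changed after a restore."""
    src, dst = tree_digests(source), tree_digests(restored)
    return {
        "missing": sorted(set(src) - set(dst)),
        "unexpected": sorted(set(dst) - set(src)),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }


def timed_restore(restore, rto_seconds: float) -> tuple[float, bool]:
    """Run a restore callable; report elapsed seconds and whether the target was met."""
    start = time.monotonic()
    restore()  # hypothetical: whatever actually performs the restore
    elapsed = time.monotonic() - start
    return elapsed, elapsed <= rto_seconds
```

An exercise could call compare_trees on the original and the restored directory and treat any non-empty list as a finding for the report. A matching checksum only shows that the bytes are identical, so it complements, rather than replaces, verifying the data through the application.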
02. After-action reports
Answer =>
An after-action report provides insight into the company's response during one of the above exercises or during an actual disaster.
It explains how each entity involved in the action plan carried out its predefined tasks during the critical situation.
All these points need to be clear and precise. This is an iterative process that identifies flaws in the existing strategy and rectifies them.
A successful report should contain the items below.
01. Summary of the report.
02. Overview of the existing strategy.
03. Evaluation of the success/failure of the strategy.
04. Suggestions for improvements.
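To keep reports consistent from one exercise to the next, the four items above could be captured in a small data structure. This is only a sketch in Python: the class AfterActionReport and its field names are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass, field


@dataclass
class AfterActionReport:
    """Hypothetical container mirroring the four report items listed above."""
    summary: str
    strategy_overview: str
    succeeded: bool
    evaluation: str
    improvements: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the report as plain text, one numbered item per section."""
        outcome = "Success" if self.succeeded else "Failure"
        lines = [
            "01. Summary: " + self.summary,
            "02. Existing strategy: " + self.strategy_overview,
            f"03. Evaluation ({outcome}): " + self.evaluation,
            "04. Suggestions for improvement:",
        ]
        # Fall back to an explicit "none" so the section is never silently empty.
        lines += [f"    - {item}" for item in self.improvements] or ["    - none"]
        return "\n".join(lines)
```

Because the process is iterative, the improvements list from one report is a natural starting point for the next exercise.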
03. Failover
Answer =>
Failover operations are a critical part of disaster recovery plans: they restore service and thereby limit the damage during the disaster recovery process.
Failover testing validates a system's ability to allocate extra resources and to move operations to a backup server during a server failure.
Factors that need to be considered for successful failover testing are as below:
1. Adequate planning should be done in terms of requirements. These can include:
> Human resources.
> Server requirements.
> Necessary approvals from the company/clients, etc.
2. Generate a test plan as per the business requirements.
3. Execute the test plan successfully.
4. Prepare a good report that evaluates the plan, its execution, and any improvements required to meet the goal.
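The behaviour a failover test checks, moving operations to a backup server when the primary fails, can be illustrated with a small simulation. This is a sketch under stated assumptions: Server and FailoverRouter are hypothetical classes, the servers are in-memory stand-ins, and a real test would target the actual infrastructure and its failover tooling.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """In-memory stand-in for a real server (hypothetical)."""
    name: str
    healthy: bool = True

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class FailoverRouter:
    """Send requests to the active server; switch to the standby on failure."""

    def __init__(self, primary: Server, backup: Server) -> None:
        self.active = primary
        self.standby = backup
        self.failovers = 0

    def handle(self, request: str) -> str:
        try:
            return self.active.handle(request)
        except ConnectionError:
            # Promote the standby and retry once; if the standby is also
            # down, the second ConnectionError propagates to the caller.
            self.active, self.standby = self.standby, self.active
            self.failovers += 1
            return self.active.handle(request)


primary = Server("primary")
backup = Server("backup")
router = FailoverRouter(primary, backup)

primary.healthy = False          # simulate the server failure
reply = router.handle("order-1")
```

After the simulated failure, the request is answered by the backup server and router.failovers records that one switch took place, which is the kind of evidence the test report can evaluate.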
04. Alternate processing sites
Answer =>
An alternate processing site is one used by a business when its functions cannot be carried out at the normal operating site after a disaster has occurred.
The testing can be conducted in such a way that the failover happens at a different site.
This can be:
1. Another geographic location (different region/different datacenter) where the datacenter or work area recovery will happen.
2. Another geographic location (different region/different datacenter), other than the main facility, that can be used to conduct business functions during an outage.
3. Another geographic location (different region/different datacenter), other than the normal facility, used to process data and/or conduct critical business functions during an outage.
05. Alternate business process
Answer =>
This covers how a business can keep working in case of a longer recovery time or outage.
This could be a set of processes that can be done manually so that customers are not impacted.
Another option is to enable services partially, using minimal resources, to support the customers.
The input for this step is the reports generated above.
These reports help to validate whether the currently executed strategy met the goal.
If the current plan in place is not achieving the target, discuss and finalize a better plan.