Best Risk Management Practices: Business continuity, fault tolerance and disaster recovery

Risk management is a process that no business can avoid. Every business carries risks, so it is important to be conversant with the basic risk management practices that support business continuity. The following concepts are worth learning so that good practices can be put to use for the benefit and performance of the organization:

Business continuity concepts

Business impact analysis: Before committing to a business continuity process, one should analyze the impact that a disruption would have on the business, including its employees and the environment in which it operates. The analysis should show which functions matter most and how badly their loss would hurt, since those findings determine where the continuity effort should be focused.

Identification of critical systems and components: The critical systems and components of a business must be identified so that the continuity process protects them and does not cause unwanted changes to them.

Business continuity planning and testing: This activity entails analyzing all the activities of a business, along with their interdependencies and vulnerabilities, in order to set priorities and plan the recovery strategically. The resulting plan should then be tested so that gaps are found before a real disruption occurs.

Risk assessment: Risk assessment is another important business continuity concept. Here, one engages qualified risk assessment personnel and actuaries to calculate the risks the business faces. With their help, one can judge how much loss a given event would cause and whether the cost of a continuity measure is justified by the loss it prevents.

Continuity of operations: During a disruption, it is very important that the normal running of the business is not affected. The goal is for all essential business operations to keep running as usual, without interruption.

Disaster recovery: Disaster recovery is another important business continuity concept. It aims to ensure that information and data lost in a disaster can be recovered. Most businesses and organizations rely heavily on this data, so it is very important that it can be restored for future use.

Succession planning: Succession planning means deciding in advance who will take over the various posts in the business. For example, one might identify the best replacement for the financial manager, among other key personnel.

High availability: High availability means that one's systems remain available regardless of what happens. Redundancy alone does not offer complete assurance of this. With plain redundancy, one may need to manually enable the spare component when the primary fails, whereas in a high availability design the system is considered always available, so no manual enabling is needed.

High availability also usually means that many different components work together. In a network, for instance, one might have multiple firewalls, wide area network links, switches and routers arranged so that if one section fails, the others can pick up the load.

Redundancy: Redundancy is based on keeping things up and running in the organization even when an important component is lost. The idea is to maintain uptime. With redundancy, one needs to be sure that all network components and resources are working properly and that all available resources can be used, so that the organization continues to function as usual.

First, one needs to guard against hardware failure, for example with redundant servers or power supplies. In a power outage, the systems continue running because another power supply is available. With such redundancy, if one component fails, another is available and ready to take its place.

One should also guard against software failures. This can be done by installing software that reports a problem when it arises. One can also run other software on a different part of the network so that if one instance fails, the other can pick up.

Finally, one needs to guard against system failures. For instance, one may keep the network performing well by having redundant switches, firewalls and routers. With such redundant systems in place, the network is far more likely to stay running all the time.
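The benefit of a redundant component can be put into rough numbers. The sketch below is only an illustration: it assumes that components fail independently of each other, which real power supplies or switches sharing a rack often do not, and the function name combined_availability and the 99% figure are made up for the example.

```python
def combined_availability(single, count):
    """Chance that at least one of `count` identical components is up.

    Assumes failures are independent, which is an idealization.
    `single` is the availability of one component, between 0.0 and 1.0.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0.0 and 1.0")
    if count < 1:
        raise ValueError("count must be at least 1")
    # All `count` components must be down at once for the service to be down.
    return 1.0 - (1.0 - single) ** count


if __name__ == "__main__":
    # Hypothetical component that is up 99% of the time.
    for count in (1, 2, 3):
        print(count, "component(s):", f"{combined_availability(0.99, count):.4%}")
```

Under that assumption, each extra component multiplies the chance of a total outage by the failure rate of a single component, which is why even one spare makes a large difference.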

Fault tolerance

Hardware: Fault tolerance in hardware means having redundant hardware components. One can have multiple power supplies and multiple devices available for use.

RAID: One can also use multiple disks arranged as a Redundant Array of Independent Disks (RAID). With a redundant RAID level in a server, if one disk is lost, the system stays up and running and users do not notice the failure.
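One way an array can survive the loss of a disk is parity, the approach used by levels such as RAID 5 (the text above does not name a specific level, so this is an assumption). The sketch below is a toy illustration with three tiny in-memory "disks" and one parity block; the helper name xor_blocks and the sample data are hypothetical, and real controllers work on striped blocks rather than whole disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding one 4-byte block.
data_disks = [b"ACCT", b"MAIL", b"LOGS"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data_disks)

# Simulate losing the second disk: XOR the survivors with the parity block.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)

if __name__ == "__main__":
    print("rebuilt block:", rebuilt)
```

Because XOR-ing a value with itself cancels it out, combining the surviving blocks with the parity block leaves exactly the missing block.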

Clustering: To keep the resources on a server available and working properly, one might consider clustering servers together. That way, if a machine loses a motherboard, becomes unplugged or suffers a system failure, the other systems in the cluster keep everything up and running. Since the cluster machines communicate with each other, they can detect an outage, take over the affected resources and make sure that everyone can still run the systems they need.
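The detect-and-take-over behaviour can be sketched in a few lines. This is a simplified illustration, not how any particular cluster product works: it assumes each node periodically reports a heartbeat timestamp, and the node names, the 5-second timeout and both function names are hypothetical.

```python
def find_failed_nodes(last_heartbeat, now, timeout_seconds=5.0):
    """Return the nodes whose last heartbeat is older than the timeout."""
    return sorted(
        node for node, seen in last_heartbeat.items()
        if now - seen > timeout_seconds
    )


def take_over(owners, failed, healthy):
    """Reassign resources owned by failed nodes to healthy nodes in turn."""
    new_owners = dict(owners)
    if not healthy:
        return new_owners  # nothing can take over; leave ownership unchanged
    turn = 0
    for resource in sorted(owners):
        if owners[resource] in failed:
            new_owners[resource] = healthy[turn % len(healthy)]
            turn += 1
    return new_owners


if __name__ == "__main__":
    heartbeats = {"node-a": 100.0, "node-b": 93.0, "node-c": 99.5}
    owners = {"database": "node-b", "file-share": "node-a", "web": "node-b"}
    failed = find_failed_nodes(heartbeats, now=100.0)
    healthy = [n for n in sorted(heartbeats) if n not in failed]
    print("failed:", failed)
    print("owners:", take_over(owners, failed, healthy))
```

In this example node-b has been silent for longer than the timeout, so its two resources are handed to the remaining nodes while node-a keeps what it already had.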

Load balancing: Load balancing is another important concept, in which all of one's systems run all the time and share the load so that no specific system is overworked. If one system is lost, the load can shift to another without interruption. Without load balancing, one might need extra resources on the original machine to keep everything running.
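A minimal sketch of this idea is a round-robin balancer that hands requests to each server in turn and skips any server marked as down. Round-robin is only one of several possible policies and is chosen here for simplicity; the class name RoundRobinBalancer and the server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server to consider

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["s1", "s2", "s3"])
    print([balancer.pick() for _ in range(4)])
    balancer.mark_down("s2")  # simulate losing one system
    print([balancer.pick() for _ in range(3)])
```

Once a server is marked down, later requests simply flow to the remaining servers, which matches the point that losing one system need not interrupt service.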

Servers: For servers to stay functional and provide high availability, server clustering is very important. One can choose active/active clustering, where end users access different servers in the environment and all servers are active and communicating with each other. One can also choose active/passive clustering, where one server is active while the other is passive and waits for a failure so that it can pick up. Active/passive clustering is easier to implement than active/active clustering.

Disaster recovery concepts

Backup plans/policies: There are many strategies that can be employed for backing up data and information. One part of the plan is the choice of backup technology. Traditional tape is usually the most economical way to carry out backups, although in other cases one might use disk backup or optical media backup. Optical storage media allow large amounts of data to be stored in a very small space.

One can also back up databases, for which there are two methods. The first is replication, where all the data and information on one machine is copied to another machine. That way, there is always an exact replica of the data, so if it is lost on one machine, the same data can be obtained from the other. The second method is online database backup.

Email database backups not only provide backup facilities but also help meet regulatory and compliance requirements to store such data over a long period of time. With this strategy, one can back up not only the whole email server but also mailboxes and individual messages. This means that if a user deletes a message from the mailbox, recovery software can be used to reach into the backed-up database and restore it.

One can also use snapshots. On Windows, the Volume Shadow Copy Service makes it possible to back up files that are open and in use by the operating system. One gets an exact duplicate of the files even while they are being used.

Another backup strategy, used in both home environments and large organizations, is the image copy, where one takes everything on the system, makes an exact duplicate of it and writes it to an image file.

Backup execution/frequency: Backups are run so that one can restore a file even after losing it. A backup copies the data one has and creates a duplicate version of it.

One can run a full backup, which copies every file on the system. One can also run an incremental backup, which copies all files that have changed since the last backup of any kind, whether full or incremental. There is also the differential backup, which differs from the incremental in that it copies all files that have changed since the last full backup. Each backup strategy has its pros and cons.
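The difference between the three types comes down to which point in time each one compares against. The sketch below illustrates that with plain numbers standing in for timestamps. It assumes the common definition in which an incremental backup is measured from the most recent backup of any kind, and the function name files_to_back_up and the file names are hypothetical.

```python
def files_to_back_up(modified_at, kind, last_full, last_backup):
    """Pick the files a backup run would copy.

    modified_at: mapping of file name to its last-modified time
    kind:        "full", "incremental" or "differential"
    last_full:   time of the most recent full backup
    last_backup: time of the most recent backup of any kind
    """
    if kind == "full":
        return sorted(modified_at)  # everything, changed or not
    if kind == "incremental":
        cutoff = last_backup        # changes since the previous backup
    elif kind == "differential":
        cutoff = last_full          # changes since the last full backup
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(name for name, when in modified_at.items() if when > cutoff)


if __name__ == "__main__":
    # Full backup ran at time 10, an incremental ran at time 20.
    files = {"ledger.db": 5, "notes.txt": 12, "report.doc": 25}
    for kind in ("full", "incremental", "differential"):
        print(kind, files_to_back_up(files, kind, last_full=10, last_backup=20))
```

This also shows the trade-off: incrementals stay small but a restore needs the full backup plus every incremental since, while a differential grows over time but a restore needs only the full backup and the latest differential.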

One can also approach backup execution in terms of the data retention period. A short-term backup retains files that have recently changed, for version control. A long-term backup keeps files and information to satisfy legal requirements or company policies.
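A retention policy of this kind can be expressed as a simple rule that flags backups older than their allowed period. In the sketch below, the 30-day and seven-year periods are placeholder assumptions, not legal guidance, and the names RETENTION, expired_backups and the backup labels are hypothetical.

```python
from datetime import date, timedelta

# Placeholder retention periods; real values come from law or company policy.
RETENTION = {
    "short-term": timedelta(days=30),
    "long-term": timedelta(days=7 * 365),
}


def expired_backups(backups, today):
    """Return the names of backups that are older than their retention period.

    backups maps a backup name to a (kind, date_taken) pair.
    """
    return [
        name
        for name, (kind, taken) in backups.items()
        if today - taken > RETENTION[kind]
    ]


if __name__ == "__main__":
    backups = {
        "daily-2024-06-25": ("short-term", date(2024, 6, 25)),
        "daily-2024-04-01": ("short-term", date(2024, 4, 1)),
        "yearend-2020": ("long-term", date(2020, 12, 31)),
    }
    print(expired_backups(backups, today=date(2024, 6, 30)))
```

Here only the older short-term backup is flagged, while the year-end backup is kept because its long-term period has not yet run out.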

Cold site: A cold site is a disaster recovery site that is essentially an empty building. There may be cooling systems and other backup systems, but there is no computing hardware in it. Should an emergency occur, it is one's responsibility to bring along everything required. No data is stored at the facility, so one must know how to get data to the site. There are also no staff at the facility, so one may need to bring one's own team.

Hot site: A hot site is a recovery facility where one has a duplicate of everything: all the systems are up and running, and one has a complete duplicate of the data centre. Whenever one buys hardware, one buys a duplicate for the hot site. It is like buying everything in twos. Applications and software are also constantly updated, creating a sort of automatic replication of every component. With a hot site, it is very easy to switch resources from one site to the other in a very short period of time.

Warm site: A warm site differs from a cold site in that one has all the equipment at the location, but the hardware is stored in a separate room rather than installed. If there is a disaster, people show up at the facility and start putting things in racks, since the recovery site provides enough room. The most important thing is to have all the recovery data and software at hand so that one can load it on the systems and begin the recovery process.

Business continuity, fault tolerance and disaster recovery concepts are very important to anyone running a business, and it is worth making the effort to learn them. With them in hand, one can devise sound strategies for tackling the hurdles one faces and keep the business in good shape.

