Understanding the Zone-Based Incident Recovery Model

The Zone-Based Incident Recovery Model (ZBIRM) is a recovery strategy that Optiv Enterprise Incident Management (EIM) uses to guide your business through the incident response lifecycle as quickly and securely as possible, while maximizing containment of your compromised environment. Typically, we recommend this recovery strategy to clients who are experiencing a major ransomware incident or have detected an attack that has reached a severe or unknown depth in the target environment.

 

The ZBIRM arranges your organization’s information systems into three zones aligned to the “confidentiality, integrity and availability” (CIA) triad framework:

 

Figure 1: Optiv's Zone-Based Recovery Model

 

Red Zone (No CIA):

  • These systems are compromised, are likely to be compromised or have a highly questionable state of trust.
  • Systems in the red zone are “contained” either at a network, endpoint or configuration level and are for the most part in a non-operational state.
  • These systems must be decommissioned.
  • These systems have the lowest CIA rating and cannot be trusted. Because the environment is contained, its systems cannot be used.

 

Yellow Zone (Medium CIA):

  • These systems are restored into a semi-operational state with a level of risk accepted by the business.
  • Typically, these are mission-critical systems that must be restored quickly to sustain the business. A partial restoration of these systems allows the incident response team more time to continue forensic analysis, recovery and other efforts, but it may expose the same vulnerabilities or threats present in the red zone.
  • These systems must eventually be decommissioned, but they can be sustained in the short term with a level of business risk acceptance. The systems must be closely monitored for evidence of malicious activity.
  • These systems have a medium CIA rating and cannot be merged into the green zone (see below). They can be used short term to facilitate recovery efforts and keep the business operational.

 

Green Zone (High CIA):

  • These systems have been restored from a known good backup, rebuilt from scratch or vetted with extremely high confidence as “not compromised.”
  • The green zone becomes your new production and operational environment, and it is where your business will be hosted moving forward.
  • These systems have the highest CIA rating. Business owners have high confidence in the integrity and the availability of the systems.
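The zone definitions above can be sketched as a small data model. This is a minimal illustration rather than Optiv's tooling: the names `Zone`, `Provenance`, `System` and `assign_zone` are hypothetical, and the rule that a yellow-zone system needs both business risk acceptance and close monitoring is a simplifying assumption drawn from the bullets above.

```python
from dataclasses import dataclass
from enum import Enum


class Zone(Enum):
    RED = "red"        # no CIA: contained and not usable
    YELLOW = "yellow"  # medium CIA: short-term use under accepted risk
    GREEN = "green"    # high CIA: the new production environment


class Provenance(Enum):
    ORIGINAL = "original"            # still the pre-incident build
    KNOWN_GOOD_BACKUP = "backup"     # restored from a known good backup
    REBUILT = "rebuilt"              # rebuilt from scratch
    VETTED_CLEAN = "vetted"          # vetted with extremely high confidence


@dataclass
class System:
    name: str
    provenance: Provenance
    risk_accepted: bool = False  # hypothetical flag: business signed off on interim use
    monitored: bool = False      # hypothetical flag: under close monitoring


def assign_zone(system: System) -> Zone:
    """Map a system to a zone using the simplified rules described above."""
    # Only trusted provenance qualifies for the green zone.
    if system.provenance is not Provenance.ORIGINAL:
        return Zone.GREEN
    # An untrusted system may run short term only with risk acceptance
    # and close monitoring (assumption: both are required).
    if system.risk_accepted and system.monitored:
        return Zone.YELLOW
    # Everything else stays contained.
    return Zone.RED


def may_join_green(system: System) -> bool:
    """Yellow and red systems are never merged into the green zone."""
    return assign_zone(system) is Zone.GREEN
```

In this sketch a yellow-zone system can only reach the green zone by changing its provenance, that is, by being rebuilt, restored from a known good backup or vetted, which mirrors the rule that yellow systems are eventually decommissioned rather than merged.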

 

When Should I Use the Zone-Based Incident Recovery Model?

The ZBIRM is currently the accepted best practice for helping a business recover from a major cybersecurity incident. Typically, Optiv recommends this strategy to clients when:

 

  1. It is not possible to fully identify the scope or origin of a cyberattack.
  2. Evidence indicates that the organization is responding to a threat actor with multiple footholds in the network or the capability to regain access to the environment.
  3. The environment has been fully encrypted by ransomware or sustained a deep level of compromise.

 

Common Pitfalls to Avoid

If you’re at the point of considering the ZBIRM, it is important to proactively consider the following common pitfalls that organizations experience when responding to security incidents.

 

  1. Restoring too quickly before the incident has been fully scoped
    It is necessary to fully understand the scope of any incident you are responding to. Restoring systems before you are aware of a threat actor’s capabilities, or the extent of the malware and persistence mechanisms they may have deployed in your environment, puts your organization at risk of being compromised again and of further delaying incident recovery.
  2. Waiting to recover until after the investigation is complete
    While it is prudent to correctly follow the incident response lifecycle, you should begin parallel recovery efforts as soon as possible. Your incident response team should have a dedicated recovery team that is focused on building out a green zone, while the rest of the team isolates the compromised environment, preserves logs and event data and conducts the forensic investigation.
  3. Pushing recovery efforts out all at once without adequately testing restored systems or alerting help desk personnel of potential incoming issues
    Incident response teams are often stretched thin. Complicating recovery efforts by deploying untested systems and ineffectively communicating with help desk staff will create more work for both the business and the incident response team. A guiding philosophy when managing timelines and expectations for incident recovery is “Slow is smooth, smooth is fast.”
  4. Lack of prioritization of system recovery in a criticality-based order
    Not every system is mission critical. We always ask our clients what their “crown jewels” are when assisting with recovery matters, because it is easy to lose focus when responding to incidents in large organizations with thousands of systems. Start by identifying the systems necessary to keep the business and IT infrastructure operational. Common examples of mission-critical systems include email servers, enterprise resource planning (ERP) and billing systems, manufacturing and operational technology (OT) systems, and certain application servers. Every business has its own priorities, and team stakeholders must identify what they need to restore operations. We encourage organizations to create and update business continuity and disaster recovery (BCDR) plans as part of their incident preparedness programs.
  5. Mismanaging people resources
    Although 24/7/365 threat monitoring and recovery efforts are necessary to eradicate a threat and restore operations as quickly as possible, overworking staff is not a viable means to achieve this goal. Burned-out teams are less likely to perform efficient or quality work, both of which are key to a successful recovery. We generally recommend capping the workday at a maximum number of hours for all members of your incident response and remediation teams. If you do not have sufficient staff to maintain a 24/7/365 response, recovery and remediation effort, contractors and managed service providers (MSPs) can fill the gaps. Alternatively, you can advocate to stakeholders for longer recovery timelines.
  6. Lack of clear incident command structure to guide workflows and segment incident response efforts
    Because of the time-sensitive nature of incident response, incident command by committee can be ineffective. Although committees are constructive in making complex, strategic business decisions, considerable time may be required to reach consensus and obtain the necessary approvals. Technical teams, however, often need direct, timely and authoritative guidance to respond to incidents quickly. An effective incident command structure typically consists of an incident commander, with a designated backup, who is accountable to a committee of business leaders. The incident commander is authorized to make fast decisions and coordinate efforts between the business and technical teams. With this structure, only two individuals are dedicated full time to managing and guiding technical and business discussions, projects and efforts, while the rest of the team can be dedicated to specific projects and tasks.
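Pitfalls 3 and 4 above (releasing everything at once, and recovering without a criticality order) can be illustrated with a short planning sketch. It is a hedged example under stated assumptions, not a prescribed process: the names `RecoveryItem`, `plan_waves` and `ready_to_release` are hypothetical, the numeric criticality scale (1 for “crown jewels,” larger numbers for less critical systems) is an assumption, and the wave size is a value each team would choose for itself.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class RecoveryItem:
    name: str
    criticality: int      # assumed scale: 1 = crown jewel, larger = less critical
    tested: bool = False  # restored system has passed its acceptance tests


def plan_waves(items: Sequence[RecoveryItem], wave_size: int) -> List[List[RecoveryItem]]:
    """Order systems by criticality, then split them into small release waves."""
    if wave_size < 1:
        raise ValueError("wave_size must be at least 1")
    # Most critical systems first; name breaks ties so the plan is deterministic.
    ordered = sorted(items, key=lambda item: (item.criticality, item.name))
    return [ordered[i:i + wave_size] for i in range(0, len(ordered), wave_size)]


def ready_to_release(wave: Sequence[RecoveryItem]) -> bool:
    """A wave goes out only when every system in it has been tested."""
    return all(item.tested for item in wave)
```

Releasing in small, tested waves also gives help desk staff a predictable list of systems to expect issues from, instead of a single large cutover.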

 

Leveraging the Value of the Zone-Based Incident Recovery Model

The ZBIRM is a proven incident remediation strategy that will continue to benefit organizations facing critical cybersecurity incidents. Alignment with the ZBIRM establishes a clearly defined, shared vocabulary between technical and leadership teams. This model also serves as a guide to avoid many of the pitfalls that organizations experience when attempting to restore operations post-incident. To learn more about Optiv’s approach to the ZBIRM, please reach out to us.

Justin Safa
Digital Forensics and Incident Response Consultant | Optiv
Justin Safa is a digital forensics and incident response consultant on Optiv’s Enterprise Incident Management team. He is a subject matter expert in Microsoft 365, Cisco Security, Carbon Black, SentinelOne and a variety of other technologies. Justin has led many incident response engagements involving ransomware, targeted attacks, zero-days and other sophisticated threats. He has worked for a diverse range of clients across industries, including various levels of government and the Fortune 500.