Managing Operational Resilience


Posted: February 9, 2016 | By: Julia H. Allen, Pamela Curtis, Nader Mehravari

A search at your favorite news aggregator for keywords such as “malware,” “computer virus,” or “data breach” will return results in the tens of thousands. For most organizations it’s not a question of if a cyber attack will occur, but when. And when an attack happens, the tempo of response must be fast, so an organization must already have practices in place covering how to respond. These practices should reflect a strategic approach that balances actions that protect assets such as customer data and intellectual property with actions that sustain services and operations.

A recommended approach to address both protection and sustainment is the application of resilience management practices. Operational resilience is the ability of an entity to prevent disruptions to its mission from occurring, continue to meet its mission if a disruption or incident does occur, and return to normalcy when the disruption is eliminated. The concept of operational resilience applies to entities such as organizations, systems, networks, supply chains, critical infrastructure, cyberspace, Armed Forces, and even nations.

Operational resilience management includes all the practices of planning, integrating, executing, and governing activities to ensure that an entity can

  • identify and mitigate operational risks that could lead to service disruptions before they occur
  • prepare for and respond to disruptive events (realized risks) in a manner that demonstrates command and control of incident response and service continuity
  • recover and restore mission-critical services and operations following an incident within acceptable time frames

Operational resilience management draws from several complex and evolving disciplines, including risk management, business continuity, disaster recovery, information security, incident and emergency management, information technology (IT), service delivery, workforce management, and supply-chain management, each with its own terminology, principles, and solutions. The practices described here reflect the convergence of these distinct, often siloed disciplines. As resilience management becomes an increasingly relevant and critical attribute of their missions, organizations should strive for a deeper coordination and integration of its constituent activities.

Our discussion of operational resilience management has four parts. First, we set the context by providing an answer to the question “Why is operational resilience management challenging?” A set of recommended practices for operational resilience management follows. We then briefly address how an organization can achieve effective results by following these practices. We conclude with a list of selected resources to help you learn more about operational resilience management. Also, we’ve added links to various sources to help amplify some points.

Every organization is different; judgment is required to implement these practices in a way that benefits your organization. In particular, be mindful of your mission, goals, existing processes, and culture. All practices have limitations. Some of these practices will be more relevant to your situation than others, and their applicability will depend on the context in which you apply them. To gain the most benefit, you need to evaluate each practice for its appropriateness and decide how to adapt it, striving for an implementation in which the practices meet your business objectives. Monitor your adoption and use of these practices, and adjust as appropriate.

Why is managing operational resilience challenging?

Over the past 10 years, organizations have invested a tremendous amount of resources in cybersecurity. Nevertheless, regardless of how much has been spent on protection, cyber attackers continue to penetrate systems. We have reached a point in the battle for information and cybersecurity where security investment should shift from a narrow focus on avoiding cyber attacks to a more balanced focus on both avoiding attacks and planning how to recover from them.

Operational resilience management has two sides—protect and sustain—and both are equally important. An organization must learn about the threat environment, maintain situational awareness of the context in which it operates, and create a risk-management plan that is as thorough and reliable as possible. But when an attack occurs, can the organization sustain its critical services and operations? Can it adequately recover its systems and get them back online as quickly as possible? Can it restore and recover service within a prescribed recovery time and according to its recovery-point objectives? An organization must ask, where can we not afford to have something bad happen, and where can we afford to have something bad happen and bounce back as quickly as we can? The need for organizations to achieve a balance between protect and sustain is why operational resilience management is so important.
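
To make the sustain-side questions concrete, the following minimal Python sketch checks whether a single incident stayed within a recovery time objective (RTO) and a recovery point objective (RPO). It is an illustration only: the class names, field names, timestamps, and thresholds are hypothetical, and it assumes the usual reading of RTO as maximum tolerable downtime and RPO as maximum tolerable data-loss window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Hypothetical targets an organization sets for one critical service."""
    rto: timedelta  # recovery time objective: maximum tolerable downtime
    rpo: timedelta  # recovery point objective: maximum tolerable data-loss window


@dataclass
class Incident:
    """Hypothetical timeline of one disruption."""
    last_good_backup: datetime   # most recent recoverable copy of the data
    disruption_start: datetime   # when the service went down
    service_restored: datetime   # when the service was back online


def meets_objectives(incident: Incident, objectives: RecoveryObjectives) -> dict:
    """Compare actual downtime and data-loss window against the targets."""
    downtime = incident.service_restored - incident.disruption_start
    data_loss_window = incident.disruption_start - incident.last_good_backup
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= objectives.rto,
        "rpo_met": data_loss_window <= objectives.rpo,
    }


# Illustrative values only: a 4-hour RTO and a 1-hour RPO.
objectives = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
incident = Incident(
    last_good_backup=datetime(2016, 2, 9, 8, 30),
    disruption_start=datetime(2016, 2, 9, 9, 0),
    service_restored=datetime(2016, 2, 9, 14, 0),
)
result = meets_objectives(incident, objectives)
```

In this made-up timeline the service was down for five hours against a four-hour target, so the RTO is missed, while the 30-minute gap since the last backup stays within the one-hour RPO. A check like this covers only the measurable part of the sustain side; deciding which services warrant which targets remains a judgment call for the organization.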

Operational resilience management is challenging for several reasons:

1. Making a long-term commitment: Operational resilience is an emergent property. An emergent property is not something an organization can buy and put in place or assemble by buying its parts. For a property to emerge within an organization, the organization must execute a certain set of activities in a coordinated manner and do so with consistent discipline. Our own health makes a good analogy: we would all like to have good health, but we cannot buy it at any store. To become healthy, we must do certain good things, such as eat well, exercise, sleep enough, and get checkups. And we must do these things in a disciplined manner for a long time. Achieving operational resilience requires an organization to make a similar long-term commitment to perform certain activities with consistency. The activities involved in operational resilience management must become part of the organization’s daily habits across the enterprise.

2. Understanding the big picture: To be operationally resilient, organizations must address operational risk on many dimensions simultaneously, including people, technology, information, facilities, supply-chain, management, cyber, and physical dimensions. This requires careful planning, coordination, and training across many interdependent domains, as well as understanding how the organization’s capabilities along these dimensions contribute to mission success.

3. Overcoming organizational hurdles: An organization may encounter these barriers to operational resilience management:

  • the vague and abstract nature of operational risk management
  • compartmentalization of operational risk-management activities, such as segmenting responsibilities for information security and business continuity/disaster recovery
  • focusing on technology instead of on all the dimensions listed in Challenge 2
  • the proliferation of practices for operational resilience management
  • insufficient funding and staff
  • insufficient success stories and measurements
  • (over)reliance on people
  • regulatory climate
  • existing policies
  • the tendency to ignore current information to avoid a painful reality and the need to act
  • competitive pressures or short-term goals
