Bridging Fault Tolerance and Game Theory for Assuring Cyberspace

Spring 2016: Volume 4 Issue 1

Posted: March 8, 2016 | By: Dr. Kevin A. Kwiat, Charles A. Kamhoua

Two in-house efforts funded by the Air Force Office of Scientific Research (AFOSR) have shaped the way that AFRL/RI has bridged fault tolerance and game theory: “Fault Tolerance for Fight-Through (FTFT)” and “STORM: Survivability Through Optimizing Resilient Mechanisms”. FTFT was the forerunner of STORM. This ordering is also logical from a historical perspective, because fault tolerance is an older discipline than game theory. Fault-tolerant computing formally originated when John von Neumann introduced the concept to electronic computers. Although the introduction of game theory is also credited to von Neumann, it was much earlier, in 1837, that Charles Babbage gave evidence of fault tolerance’s existence. In [1], he wrote that a complicated formula could be algebraically arranged in several ways such that, if the same values are assigned to the variables and the results agree, then the accuracy of the computation is secure. Babbage, of course, was referring to the work of clerical staff, the “computers” of his time. Note that Babbage advocated the use of diversity to secure a computation [1]. As digital computers developed, diversity became a key consideration when seeking fault tolerance, and throughout the history of computers, fault tolerance was often coupled with diversity to add assurance to computing [2-3]. In FTFT, we used fault tolerance and diversity to address the more contemporary concern of cyber defense.
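Babbage’s observation can be illustrated with a minimal sketch. The formula below, (a + b)^2, its three algebraic arrangements, and all function names are assumptions chosen for illustration; they are not taken from [1] or from FTFT. Exact rational arithmetic is used so that agreement can be tested by simple equality.

```python
from fractions import Fraction


def expanded(a, b):
    # Arrangement 1 (illustrative): a^2 + 2ab + b^2
    return a * a + 2 * a * b + b * b


def factored(a, b):
    # Arrangement 2 (illustrative): (a + b)^2
    s = a + b
    return s * s


def via_difference(a, b):
    # Arrangement 3 (illustrative): (a - b)^2 + 4ab
    d = a - b
    return d * d + 4 * a * b


def faulty_expanded(a, b):
    # A deliberately wrong arrangement: the factor of 2 is dropped,
    # standing in for a slip made by one "computer".
    return a * a + a * b + b * b


def check_by_diversity(arrangements, a, b):
    """Evaluate every arrangement on the same inputs.

    Returns (agreed, results), where agreed is True only if all
    results are equal. Agreement does not prove correctness, but a
    disagreement shows that at least one evaluation is in error.
    """
    results = [f(a, b) for f in arrangements]
    agreed = all(r == results[0] for r in results)
    return agreed, results


if __name__ == "__main__":
    a, b = Fraction(3, 2), Fraction(5, 7)
    print(check_by_diversity([expanded, factored, via_difference], a, b))
    print(check_by_diversity([expanded, factored, faulty_expanded], a, b))
```

With the three correct arrangements the check reports agreement; swapping in the faulty arrangement produces a mismatch. Such a check detects a discrepancy only when the diverse evaluations do not all fail in the same way.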

Fault-tolerant computing shares conceptual similarities with cyber defense. For example, fault tolerance deals with the detection and treatment of failures, whereas cyber defense deals with the detection and treatment of compromises; both can cause a computer to deviate from its specification. Traditionally, fault-tolerant computing dealt with deviations stemming from randomly occurring faults, not faults resulting from intelligent attack. Whereas faults caused by naturally occurring phenomena are tolerable using established, standard approaches, attacker-induced faults require a more aggressive approach that also ushers in cyber defense. New challenges arise in transforming fault tolerance into attack tolerance. As information systems become ever more complex and the interdependency between these systems increases, it is beyond the abilities of most system developers to predict or anticipate every type of component failure and cyber-attack. Attempting to predict and protect against every conceivable failure and attack soon becomes exceedingly cumbersome and costly. Therefore, the more realistic goal became the design of a fight-through capability that can absorb the damage and then rebound, so that it can serve as the basis for restoring critical services. We sought adaptations of fault-tolerant computing concepts to address this need in cyber defense. Optimal decisions have to be made both early in the design phase and during mission execution to maximize fault tolerance. To achieve that, we found an appropriate source in military strategist John Boyd, who conceived and developed the Observe, Orient, Decide, and Act Loop (OODA Loop) [4]. He applied the OODA Loop to the combat operations process, including the engagement of fighter aircraft in aerial combat. Figure 1 shows a basic OODA Loop.