Achieving resilience in distributed software systems via self-reconfiguration


Improvements in mobile networking combined with the ubiquitous availability and adoption of low-cost development boards have enabled the vision of mobile platforms of Cyber-Physical Systems (CPS), such as fractionated spacecraft and UAV swarms. Computation and communication resources, sensors, and actuators that are shared among different applications characterize these systems. The cyber-physical nature of these systems means that physical environments can affect both the resource availability and software applications that depend on resource availability. While many application development and management challenges associated with such systems have been described in existing literature, resilient operation and execution have received less attention. This paper describes our work on improving runtime support for resilience in mobile CPS, with a special focus on our runtime infrastructure that provides autonomous resilience via self-reconfiguration. We also describe the interplay between this runtime infrastructure and our design-time tools, as the later is used to statically determine the resilience properties of the former. Finally, we present a use case study to demonstrate and evaluate our design-time resilience analysis and runtime self-reconfiguration infrastructure.

Journal of Systems and Software