Optimal Periodic Software Rejuvenation Policies in Discrete Time—Survey and Applications
Published in Mangey Ram, Modeling and Simulation Based Analysis in Reliability Engineering, 2018
Tadashi Dohi, Junjun Zheng, Hiroyuki Okamura
Present-day applications in computer systems impose stringent requirements in terms of software dependability, because system failure, caused by software failure in almost all cases, may lead to huge economic loss or risk to human life. Guaranteeing that these requirements are met is very difficult, especially in applications of nontrivial complexity. In recent years, considerable attention has been paid to continuously running software systems whose performance characteristics degrade gradually over time. When a software application executes continuously for a long period of time, some of its faults cause the software to age because of error conditions that accrue with time and/or load. This phenomenon is called software aging and can be observed in many real software systems [1–6]. Common experience suggests that most software failures are transient in nature [7]. Since transient failures disappear if the operation is retried later in a slightly different context, it is difficult to characterize their root origin. Therefore, the residual software faults are obvious in the operational phase. Grottke and Trivedi [8] classify several types of software bugs and point out that resource exhaustion in computer systems causes software aging. A complementary approach to handling transient software failures is software rejuvenation [9], which can be regarded as a preventive and proactive solution that is particularly useful for counteracting software aging. It involves occasionally stopping the running software, cleaning its internal state, and restarting it. Cleaning the internal state of software may involve garbage collection, flushing operating system kernel tables, reinitializing internal data structures, etc. An extreme but well-known example of rejuvenation is a hardware reboot. Software rejuvenation is thus becoming increasingly popular as a lightweight software fault-tolerance technique.
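As a rough illustration of the stop, clean, restart cycle described above, the following Python sketch simulates a toy service in discrete time. It is not taken from the chapter: the class `AgingService`, the function `count_failures`, and the parameters `capacity`, `leak_per_step` and `rejuvenation_interval` are hypothetical names chosen for this example, and the "aging" is modelled simply as a fixed resource leak per time step.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgingService:
    """Toy long-running service that leaks a fixed amount of resource per step.

    All names and numbers here are illustrative assumptions.
    """
    capacity: int = 100       # hypothetical resource limit
    leak_per_step: int = 3    # hypothetical resource leaked at each time step
    leaked: int = 0
    cache: List[object] = field(default_factory=list)

    def step(self) -> bool:
        """Run one time step; return False once the resource is exhausted."""
        self.leaked += self.leak_per_step
        self.cache.append(object())
        return self.leaked < self.capacity

    def rejuvenate(self) -> None:
        """Stand-in for stop / clean internal state / restart."""
        self.cache.clear()
        self.leaked = 0


def count_failures(steps: int, rejuvenation_interval: Optional[int]) -> int:
    """Count aging-related failures over `steps` discrete time steps.

    With rejuvenation_interval=None the service is only restarted after a
    failure; otherwise it is also restarted preventively every
    `rejuvenation_interval` steps.
    """
    service = AgingService()
    failures = 0
    for t in range(1, steps + 1):
        if not service.step():
            failures += 1
            service.rejuvenate()  # reactive recovery after a failure
        elif rejuvenation_interval and t % rejuvenation_interval == 0:
            service.rejuvenate()  # preventive, periodic rejuvenation
    return failures


if __name__ == "__main__":
    print("failures without rejuvenation:", count_failures(1000, None))
    print("failures with interval 20:   ", count_failures(1000, 20))
```

In this toy model the service fails every 34 steps when left alone, so any rejuvenation interval shorter than that removes the failures entirely. A real policy would also have to weigh the downtime cost of each restart, which this sketch ignores.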
Modelling and dynamic behaviour analysis of the software rejuvenation system with periodic impulse
Published in Mathematical and Computer Modelling of Dynamical Systems, 2021
Huixia Huo, Houbao Xu, Zhuoqian Chen
The failure of computers is mainly due to the failure of software rather than hardware [1]. Nowadays, software comes in multiple releases to keep it updated and relevant in the market and to patch vulnerabilities. Common experience suggests that most software failures, attributed to so-called Mandelbugs, are transient in nature [2–4]. Since all ageing-related bugs are Mandelbugs, they must be tolerated in the operational phase of a software system regardless of the software version [5,6]. Software ageing refers to the phenomenon in which the increasing failure rate and/or performance decline of a long-running software system is caused by the activation and spread of ageing-related bugs [7], in systems such as the Apache web server [1], telecommunication switching and billing systems [8], cloud computing infrastructure [9] and Android operating systems [10]. The fault-error-failure chains shown in [11] suggest that the activation of ageing-related bugs causes the system to shift from a correct state to a failure-prone one. After a long period of execution or a large number of accumulated errors, such bugs can lead to ageing-related failures that can result in economic loss or endanger human life [12].
Memory-loss resilient controller design for temporal logic constraints
Published in Cyber-Physical Systems, 2021
M. Abate, W. Stuckey, L. Lerner, E. Feron, S. Coogan
In instances where software ageing degrades system performance, regular software restarts can be employed to improve system robustness [7–10]. This method, referred to in the literature as software rejuvenation, increases system resiliency by regularly reinstalling mission objectives from a trusted mission planner [11]. For controlled dynamical systems, methods exist for enforcing set invariance [12] and tracking control objectives [13] in the presence of such memory losses. Methods in this paradigm do not exist, however, for enforcing more complex logical and temporal system objectives.