Performance assessment and reliability analysis of dependable and distributed computing systems based on BDD and recursive merge

作者:

Highlights:

摘要

System reliability evaluation, sensitivity analysis, failure frequency analysis, importance measures, and optimal design are important issues that have become research topics for distributed dependable computing. Finding all of the Minimal File Spanning Trees (MFST) and avoiding repeatedly computing the redundant MFSTs have been key techniques for evaluating the reliability of a distributed computing system (DCS) in previous works. However, identifying all of the disjointed MFSTs is difficult and time consuming for large-scale networks. Although existing algorithms have been demonstrated to work well on medium-scale networks, they have two inherent drawbacks. First, they do not support efficient manipulation of Boolean algebra. The sum-of-disjoint-products method used by these algorithms is inefficient when dealing with large Boolean functions. Second, the tree-based partitioning algorithm does not merge isomorphic sub-problems, and therefore, redundant computations cannot be avoided. In this paper, we propose a new efficient algorithm for the reliability evaluation of a DCS based on the recursive merge and the binary decision diagram (BDD). Using the BDD substitution method, we can easily apply our algorithm to a network with imperfect nodes. The experimental results show a significant improvement in the execution time compared to previous works.

论文关键词:Reliability,OBDD,Fault-coverage,System availability,Distributed system

论文评审过程:Available online 1 June 2010.

论文官网地址:https://doi.org/10.1016/j.amc.2010.05.075