An Efficient Algorithm-Based Fault Tolerance Design Using the Weighted Data-Check Relationship

IEEE Trans. on Computers, vol. 50, no. 4, pp. 371-383, Apr. 2001 (SCI)

Hee Yong Youn, C. G. Oh, Hyunseung Choo, Jin-Wook Chung, and Dongman Lee


VLSI-based processor arrays have been widely used for computation-intensive applications such as matrix and graph algorithms. Algorithm-based fault tolerance designs employing various encoding/decoding schemes have been proposed for such systems to effectively tolerate operation-time faults. In this paper, we propose an efficient algorithm-based fault tolerance design using the weighted data-check relationship, where the checks are obtained from the weighted data. The relationship is systematically defined as a new (n, k, Nw) Hamming checksum code, where n is the size of the code word, k is the number of information elements in the code word, and Nw is the number of weights employed. The proposed design with various weights is evaluated in terms of time and hardware overhead as well as overflow probability and round-off error. Two different schemes employing the (n, k, 2) and (n, k, 3) Hamming checksum codes are illustrated using important matrix computations. Comparison with other schemes reveals that the (n, k, 3) Hamming checksum scheme is very efficient while incurring only a small hardware overhead.
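To make the idea of checks obtained from weighted data concrete, the following is a minimal sketch of a generic two-check weighted checksum. It is not the (n, k, Nw) Hamming checksum code defined in the paper. The function names `encode` and `check_and_correct`, the weights 1 and i + 1, and the single-error assumption are all illustrative choices. Integer data is used so that round-off error does not arise.

```python
def encode(data):
    """Append two checks to the data (illustrative weights, not the paper's code).

    c1 uses weight 1 for every element; c2 uses weight i + 1 for element i.
    """
    c1 = sum(data)
    c2 = sum((i + 1) * d for i, d in enumerate(data))
    return list(data) + [c1, c2]


def check_and_correct(word):
    """Locate and correct at most one erroneous element of an encoded word.

    Returns (data, position), where position is the index of the corrected
    data element, or None if no data element needed correction.
    """
    *data, c1, c2 = word
    # Syndromes: difference between recomputed and stored checks.
    s1 = sum(data) - c1
    s2 = sum((i + 1) * d for i, d in enumerate(data)) - c2

    if s1 == 0 and s2 == 0:
        return data, None  # word is consistent
    if s1 == 0 or s2 == 0:
        # Exactly one check disagrees: under the single-error assumption
        # the fault is in that check element, so the data is intact.
        return data, None
    if s2 % s1 != 0:
        raise ValueError("syndromes inconsistent with a single error")

    # A single error e at index j gives s1 = e and s2 = (j + 1) * e.
    pos = s2 // s1 - 1
    if not 0 <= pos < len(data):
        raise ValueError("computed error position is out of range")
    data[pos] -= s1
    return data, pos


word = encode([3, 1, 4, 1, 5])
word[2] = 9  # inject a single fault into a data element
recovered, position = check_and_correct(word)
```

In this sketch the ratio of the two syndromes identifies the faulty position and the unweighted syndrome gives the error magnitude. Larger weights make the check values grow faster, which relates to the overflow probability the abstract mentions as an evaluation criterion.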



Algorithm-based fault tolerance, Hamming correcting code, matrix computations, overflow, round-off error, VLSI processor array

