ComFIRM
A Communication Fault Injection Tool to Validate Linux Clusters

Fábio Olivé Leite
Conectiva S.A.

<olive@conectiva.com.br>

This talk presents some of the current Linux clustering technologies, and especially how a communication fault injection tool called ComFIRM can be used to validate the communication protocols they rely on. Fault injection is an established technique among fault tolerance researchers. It allows experimental validation of implementations of fault tolerance mechanisms, helping to ensure that such implementations actually support the failure models they were designed to support.

Communication fault injection in ComFIRM is done by creating a set of rules that are evaluated at every packet reception or transmission. The rules are written in a special bytecode that represents packet selection and manipulation primitives, and can therefore be used to inject faults such as delayed or dropped packets into any protocol implemented on Linux.
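The rule mechanism described above can be illustrated with a small user-space sketch. This is not ComFIRM's actual bytecode or kernel code: the opcode names (MATCH_BYTE, DROP, DELAY, SET_BYTE), the rule layout, and the functions run_rule and inject are all hypothetical, chosen only to show how selection primitives and manipulation primitives might combine into a rule that is evaluated against every packet.

```python
# Illustrative sketch only: these opcodes and this rule layout are hypothetical,
# not ComFIRM's real bytecode.
from dataclasses import dataclass

# Hypothetical selection primitive
MATCH_BYTE = "MATCH_BYTE"  # (offset, value): continue only if packet[offset] == value
# Hypothetical manipulation primitives
DROP = "DROP"              # (): discard the packet
DELAY = "DELAY"            # (ms,): hold the packet back for ms milliseconds
SET_BYTE = "SET_BYTE"      # (offset, value): overwrite one byte of the packet


@dataclass
class Verdict:
    action: str        # "pass", "drop" or "delay"
    packet: bytes      # packet contents after any manipulation
    delay_ms: int = 0


def run_rule(rule, packet):
    """Run one rule against a packet; return None if the rule does not match."""
    data = bytearray(packet)  # work on a copy so a failed match changes nothing
    delay_ms = 0
    for op, *args in rule:
        if op == MATCH_BYTE:
            offset, value = args
            if offset >= len(data) or data[offset] != value:
                return None
        elif op == DROP:
            return Verdict("drop", bytes(data))
        elif op == DELAY:
            delay_ms += args[0]
        elif op == SET_BYTE:
            offset, value = args
            if offset < len(data):
                data[offset] = value
        else:
            raise ValueError(f"unknown opcode: {op}")
    return Verdict("delay" if delay_ms else "pass", bytes(data), delay_ms)


def inject(rules, packet):
    """Evaluate rules in order; the first matching rule decides the outcome."""
    for rule in rules:
        verdict = run_rule(rule, packet)
        if verdict is not None:
            return verdict
    return Verdict("pass", packet)


# Byte 9 of an IPv4 header holds the protocol number (17 = UDP, 6 = TCP).
rules = [
    [(MATCH_BYTE, 9, 17), (DROP,)],      # drop every UDP packet
    [(MATCH_BYTE, 9, 6), (DELAY, 200)],  # delay every TCP packet by 200 ms
]
udp_packet = bytes([0x45] + [0] * 8 + [17] + [0] * 10)  # minimal fake IPv4 header
print(inject(rules, udp_packet).action)
```

In this sketch the rules work on raw bytes and offsets rather than on named protocol fields, which is what would let the same small set of primitives apply to any protocol, as the paragraph above describes.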

ComFIRM is a flexible and powerful communication fault injection tool, implemented inside the Linux kernel and created by Fábio Olivé Leite. It is available as a set of patches with some documentation at http://www.conectiva.com.br/~olive/ComFIRM. It is stable and usable, although a few things still need to be enhanced or fixed.

Fábio Olivé Leite is a member of the Conectiva High Availability Development Team. He is currently finishing an MSc in Fault Tolerance, and holds a BSc in Computer Science and a Technician degree in Industrial Electronics. He has published a few works on Fault Tolerance and Distributed Computing, and enjoys working with reliable communication, clusters and other cool distributed stuff.


Last modified: April 4, 2001