ALS 2000 Abstract
Sequence Analysis on a 216-Processor Beowulf Cluster
Katerina Michalickova,
Moyez Dharsee, and
Christopher W. V. Hogue, Samuel Lunenfeld Research Institute
Abstract
In this work we describe the implementation of a
216-processor Beowulf cluster with switched gigabit Ethernet
networking. The design uses an 8-CPU high-performance
midrange computer with 8 gigabit ports as the cluster
head, which limits I/O contention. We have
been developing applications software for bioinformatics
research in protein folding, as well as the MoBiDiCK
system for managing cluster applications that is
extensible to general purpose distributed computing. In
addition to the cluster architecture, we present a new
cluster application for bioinformatics, a variant of the
BLAST family of sequence comparison programs.
MOBLAST performs the BLAST algorithm in an
exhaustive manner, avoiding its initial heuristic approach
to finding hits. This effectively slows BLAST down to
approach the speed of other comprehensive search
methods, such as Smith-Waterman alignment.
MOBLAST requires a sizeable cluster to run. We
describe the development of MOBLAST and its use in
making an exhaustive MxN database of alignments where
M is the set of protein sequences with known 3-D
structures, and N is the set of all protein sequences. This
MxN database of protein alignments will facilitate further
research in protein folding, the ultimate aim of our work
with Beowulf cluster technology. Furthermore, we
describe a general algorithm for partitioning MxN
problems and implement it in the MoBiDiCK computing
model.
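
The abstract does not spell out the partitioning algorithm, so the
following is only a minimal sketch of one common way to split an MxN
comparison across cluster nodes: tile the M x N grid of sequence pairs
into rectangular blocks and hand the blocks out round-robin. The names
partition_mxn and assign_round_robin, the block sizes, and the
round-robin policy are all illustrative assumptions, not the MoBiDiCK
implementation.

```python
from typing import Iterable, Iterator, List, Tuple

# A block is a pair of index ranges: (rows from M, columns from N).
Block = Tuple[range, range]


def partition_mxn(m: int, n: int, block_m: int, block_n: int) -> Iterator[Block]:
    """Yield rectangular tiles that together cover an m x n grid exactly once.

    Hypothetical helper: m indexes sequences with known structure,
    n indexes all protein sequences.
    """
    if block_m <= 0 or block_n <= 0:
        raise ValueError("block sizes must be positive")
    for i in range(0, m, block_m):
        for j in range(0, n, block_n):
            # Clamp the last tile in each direction to the grid edge.
            yield range(i, min(i + block_m, m)), range(j, min(j + block_n, n))


def assign_round_robin(blocks: Iterable[Block], workers: int) -> List[List[Block]]:
    """Distribute blocks over workers in round-robin order (assumed policy)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    queues: List[List[Block]] = [[] for _ in range(workers)]
    for k, block in enumerate(blocks):
        queues[k % workers].append(block)
    return queues


if __name__ == "__main__":
    # Toy sizes only; real M and N would be database sizes.
    for worker_id, queue in enumerate(assign_round_robin(partition_mxn(5, 7, 2, 3), 4)):
        print(worker_id, [(list(rows), list(cols)) for rows, cols in queue])
```

Each block is an independent unit of work, so a node needs only its
own slice of M and of N. Smaller blocks tend to balance load better
at the cost of more scheduling overhead.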