A Text Retrieval Package for the Unix Operating System
Liam R. E. Quin
SoftQuad Inc.
(lee@sq.com)
Abstract
This paper describes lq-text, an inverted index text retrieval package
written by the author. Inverted index text retrieval provides a fast
and effective way of searching large amounts of text. It works by
building an index of all the natural-language words that occur in
the text. The actual text remains unaltered in place,
or, if desired, can be compressed or archived; the index allows rapid
searching even if the data files have been altogether removed.
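
The idea of an inverted index can be illustrated with a minimal sketch. This is not lq-text's actual implementation or file format; the names build_index and find_phrase, the word pattern, and the sample documents are all hypothetical, chosen only to show how a word-to-position index lets a phrase be located without rescanning the text.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of ASCII letters; lq-text's own rules may differ.
WORD = re.compile(r"[A-Za-z]+")

def build_index(documents):
    """Map each lowercased word to a list of (document name, word position)."""
    index = defaultdict(list)
    for name, text in documents.items():
        for position, match in enumerate(WORD.finditer(text)):
            index[match.group().lower()].append((name, position))
    return index

def find_phrase(index, phrase):
    """Return sorted (document name, start position) pairs where phrase occurs."""
    words = [w.lower() for w in WORD.findall(phrase)]
    if not words:
        return []
    # Start from occurrences of the first word, then keep only those
    # followed by each later word at the expected offset.
    candidates = set(index.get(words[0], []))
    for offset, word in enumerate(words[1:], start=1):
        following = set(index.get(word, []))
        candidates = {(name, pos) for name, pos in candidates
                      if (name, pos + offset) in following}
    return sorted(candidates)

# Hypothetical sample data, for illustration only.
documents = {
    "a.txt": "Inverted index text retrieval is fast.",
    "b.txt": "Plain text search scans every file; text retrieval via an index does not.",
}
index = build_index(documents)

# Once the index exists, the original text is no longer needed for searching.
del documents
hits = find_phrase(index, "text retrieval")
```

In this sketch a query consults only the index, which is why searching remains possible after the source text has been compressed, archived, or removed; retrieving the matching passages themselves would still require the original files.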
The design and implementation of lq-text are discussed, and
performance measurements are given for comparison with other text
searching programs such as grep and agrep. The functionality provided
is compared briefly with other packages such as glimpse and zbrowser.
The lq-text package is available in source form, has been successfully
integrated into a number of other systems and products, and is in use
at over 100 sites.