We describe nine cache replacement algorithms proposed in recent studies,
which attempt to minimize various cost metrics, such as
miss ratio, byte miss ratio, average latency, and total cost.
In describing the various algorithms,
it is convenient to view each request for a document
as being satisfied in the following way:
the algorithm brings the
newly requested document into the cache
and then evicts documents until
the capacity of the cache is no longer exceeded.
Algorithms are then distinguished by how they choose
which documents to evict.
This view allows for the possibility that the requested document
itself is evicted upon arrival, in which case it
replaces no other document in the cache.
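As a minimal sketch of this fetch-then-evict view (the class and names are illustrative, not from the cited studies), with LRU as the example eviction rule:

```python
from collections import OrderedDict

class FetchThenEvictCache:
    """Bring each requested document into the cache, then evict documents
    until the capacity is no longer exceeded.  Replacement algorithms
    differ only in which victim they pick; LRU (evict the least recently
    requested document) is used here as the example policy."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.docs = OrderedDict()   # doc id -> size, oldest request first
        self.used = 0

    def request(self, doc, size):
        """Return True on a cache hit, False on a miss."""
        if doc in self.docs:
            self.docs.move_to_end(doc)        # refresh recency on a hit
            return True
        self.docs[doc] = size                 # bring the new document in
        self.used += size
        while self.used > self.capacity:      # evict until we fit again
            victim, victim_size = self.docs.popitem(last=False)
            self.used -= victim_size
        return False
```

With an empty 100-byte cache, requesting a 500-byte document admits it and then immediately evicts it in the same loop, which is the case noted above where the new document replaces nothing.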
- Least-Recently-Used (LRU) evicts the document which was requested
the least recently.
- Least-Frequently-Used (LFU) evicts the document which
has been accessed least frequently.
- SIZE [WASAF96] evicts the largest document.
- LRU-Threshold [ASAWF95] is the same as LRU, except that documents larger than a
certain threshold size are never cached.
- Log(Size)+LRU [ASAWF95] evicts the least-recently-used document
among those with the largest value of log(size).
- Hyper-G [WASAF96] is a refinement of LFU that also considers
last access time and size.
- Pitkow/Recker [WASAF96] removes the least-recently-used
document,
unless all documents in the cache were accessed today, in which case the largest one is
removed.
- Lowest-Latency-First [WA97] tries to minimize
average latency by removing the document with the lowest download latency
first.
- Hybrid, introduced in [WA97],
is aimed at reducing the total latency.
A function is computed for each document which is designed to
capture the utility of retaining a given document in the cache.
The document
with the smallest function value is then evicted. The function for a document p
located at server s depends on
the following parameters:
c_s, the time to connect with server s;
b_s, the bandwidth to server s;
n_p, the number of times p has been requested
since it was brought into the cache; and
z_p, the size (in bytes)
of document p.
The function is defined as:

    (c_s + W_B / b_s) * n_p^(W_N) / z_p

where W_B and W_N are constants.
Estimates for c_s and b_s are based on the times to fetch documents from server s
in the recent past.
- Lowest Relative Value (LRV), introduced in [LRV97],
includes the cost and size of a document in the calculation of a value
that estimates the utility of keeping a document in the cache. The
algorithm evicts the document with the lowest value.
The calculation of the value is based on
extensive empirical analysis of trace data.
For a given i, let P_i
denote the probability that a document is
requested i+1 times given that it is requested i times.
P_i is estimated in an online manner by
taking the ratio D_{i+1} / D_i, where D_i is the total number
of documents seen so far which have been requested at least i times in the trace.
P_i(s) is the same as P_i except that the counts are
restricted to documents of size s.
Furthermore, let 1 - D(t) be the probability that a page is requested again
as a function of the time t (in seconds) since its last request; D(t) is
estimated by a function fitted empirically to the trace data.
Then for a particular document d of size s and cost c, if the last
request to d is the i-th request to it, and the last request was made
t seconds ago, d's value in LRV is calculated as:

    V(i, t, s) = P_1(s) * (1 - D(t)) * c / s   if i = 1,
    V(i, t, s) = P_i * (1 - D(t)) * c / s      otherwise.

Among all documents, LRV evicts the one with the lowest value.
Thus, LRV takes into account locality, cost and size of a document.
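As a sketch of the two scoring rules just described (these helper functions and default constants are illustrative; W_B, W_N, the P_i ratios, and D(t) must be estimated as the papers describe):

```python
def hybrid_utility(conn_time, bandwidth, n_requests, size,
                   W_B=8192.0, W_N=0.9):
    """Hybrid's retention utility (c_s + W_B/b_s) * n_p**W_N / z_p for a
    document of `size` bytes on a server with connection time `conn_time`
    and bandwidth `bandwidth`.  The document with the smallest utility is
    evicted.  W_B and W_N here are placeholder constants, not tuned values."""
    return (conn_time + W_B / bandwidth) * (n_requests ** W_N) / size

def lrv_value(i, D_t, P, P1_of_s, cost, size):
    """LRV's value of keeping a document whose last request was its i-th.
    `D_t` is the estimate of D(t) for the time since that request, `P`
    maps i to the estimated P_i ratio, and `P1_of_s` is the
    size-conditioned estimate P_1(s).  The document with the lowest
    value is evicted."""
    p = P1_of_s if i == 1 else P[i]
    return p * (1.0 - D_t) * cost / size
```

Under both rules, a document that has been requested often and recently (large n_p, small D(t)) scores high and is therefore retained.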
Existing studies using actual Web proxy traces have narrowed the choice of proxy replacement algorithms
to LRU, SIZE, Hybrid and LRV.
Results in [WASAF96, ASAWF95] show that SIZE performs better than
LFU, LRU-Threshold,
Log(Size)+LRU, Hyper-G and Pitkow/Recker.
Results in [WASAF96] also show that SIZE outperforms LRU in most
situations. However, a different study [LRV97] shows that LRU
outperforms SIZE in terms of byte hit rate.
Comparing LFU and LRU, our experiments show that though LFU can outperform LRU
slightly when the cache size is very small, in most cases LFU performs
worse than LRU.
In terms of minimizing latency, [WA97] show that Hybrid performs
better than Lowest-Latency-First.
Finally, [LRV97] shows that LRV outperforms both LRU and SIZE
in terms of hit ratio and byte hit ratio.
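The distinction between hit ratio and byte hit ratio that drives these conflicting results can be made concrete (a small illustrative computation, not data from the cited studies):

```python
def hit_ratios(requests, hit_indices):
    """requests: list of (doc, size) pairs; hit_indices: indices of the
    requests served from the cache.  Hit ratio weights every request
    equally, while byte hit ratio weights each request by its size --
    so a policy like SIZE, which keeps many small documents, can score
    well on the first metric and poorly on the second."""
    total_bytes = sum(size for _, size in requests)
    hit_bytes = sum(requests[i][1] for i in hit_indices)
    return len(hit_indices) / len(requests), hit_bytes / total_bytes
```

For example, serving three 1-byte documents from the cache while missing one 97-byte document yields a 75% hit ratio but only a 3% byte hit ratio.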
One disadvantage of both Hybrid and LRV is their heavy parameterization, which
leaves their performance across different access streams uncertain.
Moreover, the studies offer no conclusion on which algorithm a proxy should
use. Essentially, the problem is to find an algorithm that combines
the observed access pattern with cost and size considerations.
Pei Cao
Thu Oct 23 18:04:42 CDT 1997