Since trends in sensor observation may change over time, a model
constructed using historical data may no longer reflect current
trends--the model parameters become stale and need to be updated to
regain energy efficiency. PRESTO proxies periodically retrain the model
to refine its parameters. The retraining phase is similar to the
initial training--all data since the previous retraining phase is gathered
and the least squares method is used to recompute the model parameters
[1]. The key difference between the initial training
and the retraining lies in the data set used to compute model parameters.
For the initial training, an actual time series of sensor observations is used to compute model parameters. However, once the system is operational, sensors report an observation only when it deviates significantly from the predicted value. Consequently, the proxy has access to only a small subset of the observations made at each sensor, and the model must be retrained with incomplete information. The time series used during the retraining phase contains all values that were either pushed or pulled from a sensor; all missing values in the time series are replaced by the corresponding model predictions. Note that these prior predictions are readily available in the proxy cache; furthermore, they are guaranteed to be a good approximation of the actual observations, since these are precisely the values for which the sensor did not push the actual observation (had the deviation been significant, the sensor would have reported it). This approximate time series is used to retrain the model and recompute the new parameters.
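The construction of the approximate time series and the least squares refit can be sketched as follows. This is only an illustrative sketch, not the PRESTO implementation: the function names `build_retraining_series` and `fit_ar1_least_squares` are hypothetical, and a first-order autoregressive model x[t] = a * x[t-1] + b is assumed here as a simple stand-in for the actual prediction model.

```python
def build_retraining_series(cached_predictions, reported):
    """Build the approximate time series used for retraining.

    cached_predictions: list of model predictions, one per time step
                        (assumed to be available in the proxy cache).
    reported: dict mapping time step -> actual observation that was
              pushed by or pulled from the sensor.
    """
    series = list(cached_predictions)
    for t, value in reported.items():
        series[t] = value  # an actual observation overrides the prediction
    return series


def fit_ar1_least_squares(series):
    """Fit x[t] = a * x[t-1] + b by ordinary least squares.

    The AR(1) form is an assumption made for illustration only.
    """
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    if n < 2:
        raise ValueError("need at least three samples to fit two parameters")
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    sxx = sum((p - mean_prev) ** 2 for p in x_prev)
    sxy = sum((p - mean_prev) * (q - mean_next)
              for p, q in zip(x_prev, x_next))
    if sxx == 0:
        raise ValueError("series is constant; slope is undefined")
    a = sxy / sxx
    b = mean_next - a * mean_prev
    return a, b
```

In this sketch, time steps without a reported value keep the cached prediction, so the refit sees a complete series even though the sensor transmitted only a few observations.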
For the temperature monitoring application that we implemented,
the models are retrained at the end of each
day.3 The new parameters
are then pushed to each sensor for future predictions. In
practice, the parameters need to be pushed only if they deviate from
the previously computed parameters by a non-trivial amount (i.e., only
if the model has actually changed).
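
The push-only-on-change rule could be expressed as a simple threshold test, sketched below. The function name `should_push_parameters` and the absolute tolerance of 0.01 are assumptions chosen for illustration; the text does not specify how a "non-trivial" deviation is measured.

```python
def should_push_parameters(old_params, new_params, tolerance=0.01):
    """Return True if any parameter moved by more than `tolerance`.

    old_params, new_params: equal-length sequences of floats.
    tolerance: hypothetical absolute threshold for a non-trivial change.
    """
    if len(old_params) != len(new_params):
        raise ValueError("parameter vectors must have the same length")
    return any(abs(new - old) > tolerance
               for old, new in zip(old_params, new_params))
```

A proxy using this check would skip the transmission whenever the retrained parameters are effectively unchanged, which avoids spending sensor energy on receiving an update that would not alter its predictions.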