In this section, we review prior work on distributed sensor data management and time-series prediction.
Sensor data management has received considerable attention in recent years. As we described in Section 1, approaches include in-network querying techniques such as Directed Diffusion [11] and Cougar [23], stream-based querying in TinyDB [13], acquisitional query processing in BBQ [3], and distributed indexing techniques such as DCS [20]. Our work differs from all of these in that we split the complexity of data management between the sensor and the proxy, thereby achieving longer sensor lifetime together with low-latency query responses.
The problem of sensor data archival has also been considered in prior work. ELF [2] is a log-structured file system for local storage on flash memory that provides load leveling, and Matchbox is a simple file system that is packaged with the TinyOS distribution [10]. Our prior work, TSAR [5], addressed the problem of constructing a two-tier hierarchical storage architecture. Any of these techniques can serve as the archival framework for the techniques that we propose in this paper.
A key component of our work is the use of ARIMA prediction models. Most relevant to our work on prediction models are the approaches proposed in BBQ [3], in which multi-variate Gaussian models were used to address spatial correlations and dynamic Kalman filters were used to address temporal correlations. Our work differs in that we propose model-driven push instead of pull, and we split modeling complexity between the proxy and sensor tiers rather than using only the proxy tier. ARIMA models for time-series analysis have also been studied extensively in other contexts such as Internet workloads, for instance in [9].