A Difference World: High-performance, NVM-invariant, Software-only Intermittent Computation

Harrison Williams (hrwill@vt.edu), Saim Ahmad (saim19@vt.edu), Matthew Hicks (mdhicks2@vt.edu)



### Mobile and IoT deployments are reaching massive scales

#### **Billions of IoT devices**

- Market estimate: over 25 billion devices by 2030
- Dominated by tiny, resource-limited sensor nodes

#### **Massive-scale applications**

- Industrial IoT
- Wearables
- Smart cities

Battery power is a **non-starter** at this scale





#### Batteryless systems enable new deployments

System-level benefits







### Intermittent software execution



## **SRAM-based checkpoints**

# Typical checkpointing depends on performant NVM

- Flash: high-power, endurance limited
- FRAM/MRAM/ReRAM: limited adoption/availability

# TotalRecall (ASPLOS '20): store checkpoints in SRAM

- SRAM retains data at voltages well below the MCU's minimum operating voltage
- Full retention for hours to days
- Verify integrity with a checksum
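The checksum-based integrity check can be sketched as follows. This is a minimal illustration, not TotalRecall's implementation: the choice of CRC-16/CCITT-FALSE and the names `crc16`, `checkpoint_t`, `checkpoint_commit`, and `checkpoint_valid` are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/CCITT-FALSE computed bit by bit (polynomial 0x1021, initial
 * value 0xFFFF). The checksum algorithm is an assumption for this sketch. */
uint16_t crc16(const uint8_t *data, size_t len) {
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Hypothetical checkpoint record: the data stays in place in SRAM and
 * only a checksum over it is stored. */
typedef struct {
    uint16_t checksum;
} checkpoint_t;

/* Before power is lost: record a checksum over the SRAM region. */
void checkpoint_commit(checkpoint_t *cp, const uint8_t *sram, size_t len) {
    cp->checksum = crc16(sram, len);
}

/* After reboot: the region is trusted only if the checksum still matches,
 * i.e. no detectable corruption occurred while the supply was low. */
int checkpoint_valid(const checkpoint_t *cp, const uint8_t *sram, size_t len) {
    return cp->checksum == crc16(sram, len);
}
```

On reboot, a system along these lines would call `checkpoint_valid` and resume from the in-place state only when it returns nonzero, restarting from scratch otherwise.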



"Checkpoint" is a checksum over all SRAM



## Many operations require rollback



Execution must roll back to beginning of atomic operation

Correctness, performance, programmability challenges



## Task-based models make rollback tractable






Question: how can we apply in-place SRAM checkpoints to task-based intermittent systems?



# Camel: mixed-volatility SRAM worlds

[Figure: SRAM divided into a volatile world and a "non-volatile" (checksum-backed) world]

Store working data in volatile SRAM

Store known-good state in checksum-backed region of SRAM

Main design considerations:

- SRAM is scarce → minimize memory overhead
- Checksum is expensive → minimize writes to NV world



# Alternating world volatility

#### NVM-Based Task Model

task sense()

**COMPUTER SCIENCE** 






## Efficient state rollback after power failures

| Variable    | temp | x | y | result |
|-------------|------|---|---|--------|
| Initial     | 0    | 1 | 2 | 4      |
| Execution 1 | 3    | 1 | 2 | 7      |
| Execution 2 | 3    | 1 | 2 | 10     |

Task code: `void task_compute() { GV(temp) = GV(x) + GV(y); GV(result) = GV(result) + GV(temp); }`

Access patterns: write-first (`temp`), read-only (`x`, `y`), write-after-read (WAR) (`result`)
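The table's values (4, then 7, then 10) can be reproduced with a small sketch that re-executes the task after a simulated power failure. The struct `globals_t` and the two helper functions are hypothetical names introduced for illustration; the plain save-and-restore of `result` stands in for whatever rollback mechanism a real system uses.

```c
/* Plain-C stand-in for the task's global variables (GV(...) on the slide). */
typedef struct {
    int temp;
    int x;
    int y;
    int result;
} globals_t;

/* Task body from the slide, written against the struct above. */
void task_compute(globals_t *g) {
    g->temp = g->x + g->y;             /* temp: written before it is read */
    g->result = g->result + g->temp;   /* result: read, then written (WAR) */
}

/* Power fails after the first execution but before the task commits, so
 * the task runs again on the partially updated state. */
int rerun_without_rollback(globals_t g) {
    task_compute(&g);   /* execution 1, never committed */
    task_compute(&g);   /* execution 2 sees the already updated result */
    return g.result;
}

/* Same scenario, but the WAR variable is restored before re-execution.
 * temp, x, and y are left alone: temp is overwritten before use, and
 * x and y are never modified. */
int rerun_with_war_rollback(globals_t g) {
    int saved_result = g.result;   /* known-good copy of the WAR variable */
    task_compute(&g);              /* execution 1, never committed */
    g.result = saved_result;       /* rollback after reboot */
    task_compute(&g);              /* execution 2 starts from clean state */
    return g.result;
}
```

Starting from `{temp = 0, x = 1, y = 2, result = 4}`, the first helper returns 10 (the incorrect "Execution 2" row) and the second returns 7, the value a single uninterrupted execution would produce.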






Camel compiler identifies the minimum set of variables to roll back for correctness
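One way to picture this analysis is a classifier over a task's ordered variable accesses. The sketch below is an assumption-laden simplification, not Camel's compiler pass: it works on a straight-line access trace rather than real control flow, and all type and function names (`access_t`, `var_class_t`, `classify`, `needs_rollback`) are hypothetical.

```c
#include <stddef.h>

typedef enum { ACC_READ, ACC_WRITE } access_kind_t;

/* One access in program order: which variable, and read or write. */
typedef struct {
    int var;
    access_kind_t kind;
} access_t;

typedef enum {
    VAR_UNUSED,       /* never accessed by the task */
    VAR_READ_ONLY,    /* only read */
    VAR_WRITE_FIRST,  /* first access is a write */
    VAR_WAR           /* read first, written later (write-after-read) */
} var_class_t;

/* Classify one variable by scanning the task's accesses in order. */
var_class_t classify(const access_t *trace, size_t n, int var) {
    var_class_t cls = VAR_UNUSED;
    for (size_t i = 0; i < n; i++) {
        if (trace[i].var != var) {
            continue;
        }
        if (cls == VAR_UNUSED) {
            cls = (trace[i].kind == ACC_READ) ? VAR_READ_ONLY : VAR_WRITE_FIRST;
        } else if (cls == VAR_READ_ONLY && trace[i].kind == ACC_WRITE) {
            cls = VAR_WAR;
        }
        /* A write-first variable stays write-first: re-execution
         * overwrites it before any read can observe stale data. */
    }
    return cls;
}

/* In this simplified model only WAR variables must be rolled back. */
int needs_rollback(var_class_t cls) {
    return cls == VAR_WAR;
}
```

For `task_compute`, the access order is read `x`, read `y`, write `temp`, read `result`, read `temp`, write `result`, so only `result` lands in the rollback set.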


## **Evaluation scenarios and benchmarks**

#### Two target platforms

- MSP430G2955 (Flash)
- MSP430FR6989 (FRAM)

#### Hardware and simulation

- Hardware: RF energy harvester
- Simulation: measure CPU cycles, deep program instrumentation

#### **Baselines + benchmarks**

- TotalRecall and prior task-based systems
- 8 benchmarks for correctness and performance



#### Efficient, correct SRAM-based intermittent execution



- Camel eliminates the need for on-chip voltage monitoring
- 3-5x performance improvement over TotalRecall

| Benchmark | TotalRecall | Camel |
|-----------|-------------|-------|
| Transmit  | Fails       | ✓     |
| Actuate   | Fails       | ✓     |
| Sense     | Hangs       | ✓     |

Camel correctly executes peripheral-centric software



#### Differential buffer design cuts software overhead



- Camel's buffer design outperforms the next-best task-based system by 2x
- Differential buffer approach improves *all* intermittent systems

|             | AR   | BC  | CEM | CF  | RSA  | avg. |
|-------------|------|-----|-----|-----|------|------|
| DINO [25]   | 1136 | 717 | 259 | 324 | 1830 | 788  |
| Chain [5]   | 2008 | 717 | 231 | 452 | 315  | 744  |
| Alpaca [28] | 2008 | 717 | 225 | 452 | 315  | 743  |
| CAMEL       | 1999 | 709 | 114 | 385 | 254  | 692  |

**Commit count** 


High-performance, NVM-invariant intermittent computation

Camel brings efficient, correct intermittent computation to the largest class of devices today

Camel's differential buffer design substantially improves task-based systems on *any* intermittent platform

See the paper for more: memory consumption, checkpoint cycle overhead, integrity check methods, etc.

**Group**: forte-research.com

**Me**: harriswms.github.io

