ROC approach
Collect data to see why services fail
Create benchmarks to measure ACME
- Use failure data as workload for benchmarks
- Benchmarks inspire researchers / humiliate companies to spur improvements in ACME
Apply Margin of Safety from Civil Engineering to Availability target: Need more 9s?
Create and Evaluate techniques to help ACME
- Identify best practices of Internet services
- Make human-machine interactions synergistic vs. antagonistic
- ROC focuses on fast repair (failures are facts of life) vs. FT focuses on longer time between failures (failures are problems)
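
The contrast between fast repair and longer time between failures, and the question about "more 9s", can be made concrete with the standard steady-state relation availability = MTTF / (MTTF + MTTR). The sketch below is illustrative only: the function names, the example numbers, and in particular the way a safety factor is applied to the availability target are assumptions for this example, not something the slide specifies.

```python
import math


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the service is up."""
    return mttf_hours / (mttf_hours + mttr_hours)


def nines(avail: float) -> float:
    """Number of 'nines' of availability, e.g. 0.999 -> about 3."""
    return -math.log10(1.0 - avail)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime per (365-day) year at a given availability."""
    return (1.0 - avail) * 365 * 24 * 60


def target_with_margin(required_avail: float, safety_factor: float) -> float:
    """ASSUMPTION: one possible reading of 'margin of safety' for availability.

    Divide the allowed unavailability by a safety factor, the way a civil
    engineer designs for a multiple of the expected load.
    """
    return 1.0 - (1.0 - required_avail) / safety_factor


if __name__ == "__main__":
    # Hypothetical baseline: one failure per 1000 hours, one hour to repair.
    base = availability(mttf_hours=1000.0, mttr_hours=1.0)
    # ROC-style improvement: repair 10x faster.
    faster_repair = availability(mttf_hours=1000.0, mttr_hours=0.1)
    # FT-style improvement: fail 10x less often.
    fewer_failures = availability(mttf_hours=10000.0, mttr_hours=1.0)

    for label, a in [("baseline", base),
                     ("10x faster repair", faster_repair),
                     ("10x fewer failures", fewer_failures)]:
        print(f"{label:>20}: {a:.6f} "
              f"({nines(a):.2f} nines, "
              f"{downtime_minutes_per_year(a):.1f} min/year down)")

    # A 3-nines requirement with an assumed safety factor of 10.
    print("design target:", target_with_margin(0.999, safety_factor=10.0))
```

With these example numbers, cutting repair time by 10x and stretching time between failures by 10x give the same availability, moving the service from roughly three nines to roughly four. Under this model the two approaches are equivalent in effect, which is the argument for treating repair time as a first-class target.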