The Actor Model and the Queue or {“Batch} is the New {Black”}

James Whitehead II

Thursday, November 02, 2017 - 2:45 pm–3:30 pm

James Whitehead II, Chief Scientist, Formularity

Abstract:

This presentation will explain how two simple, decades-old computer paradigms can be combined and used to build the world’s largest and most resilient computing solutions. Real world examples will be shown.

In 1974, Carl Hewitt published his paper on the Actor model. In computing, an Actor is a computer program that uses information fed to it through messages to 1) create new Actors, 2) send messages to other Actors, and 3) make limited, often binary, decisions. Just as the binary on-off state of a single transistor can be built into the 2.6 billion(!) transistor Intel i7 microprocessor, Actors can be built into the most complex processing systems. If the Actor model sounds familiar, it’s because it is the basis for Microservices, one of the hottest new topics in cloud computing. Just another example that “…what has been will be again, what has been done will be done again; there is nothing new under the sun.”

The Actor Model is only half of the solution. The key to using Actors to build infinitely scalable real-world systems is how you connect them together. Typically, in Microservices, you send or “push” messages from one Microservice to another. When you reach the throughput of a Microservice instance, you clone a few more instances. When you reach the CPU or memory utilization limits of the virtual machine, you fire up more VMs. The key is that you “push” messages. This however, is the wrong approach. We all know what happens when you push something hard enough—it will fall over. Think of the classic scene from the “I Love Lucy” television program where Lucille Ball is wrapping chocolate candies on a conveyer belt. This graphically demonstrates that the “push” model is the wrong approach.

In Douglas Adam’s “The Hitchhiker’s Guide to the Galaxy”, the quote is “We'll be saying a big hello to all intelligent lifeforms everywhere and to everyone else out there, the secret is to bang the rocks together, guys.” To paraphrase Mr. Adams, the secret to scalable processing systems is really to “pull”, not “push” messages between Actors.

Rather than send messages directly between Actors, the messages are deposited into queues from which Actors can “pull” messages. As each Actor becomes available, it pulls the next message out of the queue and processes it. This has a number of advantages over “pushing” messages, such as increased Actor process stability, load balancing, predictive monitoring, and transparent redundancy.

Actors are computer programs and as such they aren’t lazy. An Actor will process messages as fast as its execution environment permits. If messages begin to back up in a queue, then you know, long before it becomes critical, that more Actor processes are required. As these new Actor processes become available, there is no need to add them to a load balancer. Each new Actor connects to the same queue and starts asynchronously removing and processing messages. Similarly, when queues become empty, redundant Actors can be terminated. Finally, by using network routing, it’s possible to route messages to redundant queues. If the primary queue fails, Actors can “failover” to a redundant queue and continue processing without message loss.

While the Actor model is 42 years old, the queue data structure was originally described by Alan Turing 70 years ago, in a paper published in 1947!

While these two “ancient” computing paradigms form the basis for modern, infinitely scaling systems, there are a number of details that must be dealt with, including how to handle work lost when Actors fail; how to maintain state or context; how to handle long-running processes; how to handle “split brain” network failures in light of redundant messages queues; synchronization of redundant message queues, etc. This presentation will discuss these issues as well. The goal of the presentation is to outline for software developers, the framework they can use to develop highly scalable, highly resilient processing systems.

James Whitehead II, Chief Scientist, Formularity

Brad Whitehead is Chief Scientist for Formularity, an electronic forms company dedicated to the secure collection and processing of personal information. Formerly, he was a Partner and Master Technology Architect with Accenture. Brad has architected and implemented several national-scale information processing systems based on the Actor model and queues. One such system processes billions of biometric transactions per day for the Republic of India, while another handles millions of biometric identification transactions each day while safeguarding the borders of the United States. He has served as a security advisor to several US Federal agencies, including the Department of Homeland Security, the Department of Defense, and the United States Postal Service. Brad holds a BS from Carnegie Mellon University and an MS from the University of Liverpool. He can be reached at brad.whitehead@formularity.com.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX

@conference {207265,
author = {James Whitehead II},
title = {The Actor Model and the Queue or {{\textquotedblleft}Batch} is the New {Black{\textquotedblright}}},
year = {2017},
address = {San Francisco, CA},
publisher = {USENIX Association},
month = oct
}

The Actor Model and the Queue or “Batch is the New Black”

James Whitehead II, Chief Scientist, Formularity

Open Access Media

Presentation Video

Presentation Audio