Envoy event dispatcher
Andreas Hohmann March 22, 2024 #envoy #proxy
Like most high-performance servers, Envoy performs IO asynchronously using event loops on a relatively small number of worker threads. How does Envoy's implementation work? Where does this functionality live?
Libevent
Envoy's event dispatcher uses the libevent library, which defines an abstraction layer on top of the platform-dependent asynchronous IO APIs such as epoll on Linux and kqueue on FreeBSD/Mac, similar to the libuv library used by node.js. Libevent also offers a buffer API and even an HTTP library on top of that, but Envoy uses only the low-level event dispatching.
The main purpose of an event library is to provide an event loop and some mechanism that allows us to react to IO events. We also need timer events (for example, for timeouts) and a way to schedule general asynchronous tasks. Another important aspect is the ability to cancel these operations. This is not only needed for timeouts but also more generally for concurrent tasks that depend on each other. Let's see how libevent provides this functionality, starting with the event loop.
The state of a libevent dispatcher is kept in an event_base structure. Libevent is a typical C library that hides its structures behind opaque pointers and exposes a set of functions for creating, manipulating, and destroying these structures in an object-oriented fashion. The functions start with the name of the structure followed by _new, _free, or some custom function name.
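The _new/_free convention can be illustrated without libevent itself. The following sketch defines a hypothetical toy_base type (not part of libevent) behind an opaque pointer and shows one possible way for a C++ caller to tie the _free call to a std::unique_ptr with a custom deleter. All names are assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>

// "Header" part of a hypothetical C-style library: callers only see an
// incomplete type plus functions named after the structure.
struct toy_base;
toy_base* toy_base_new();
void toy_base_free(toy_base* base);
int toy_base_event_count(const toy_base* base);

// "Implementation" part: the layout is known only here.
struct toy_base {
  int event_count;
};

// Counts live objects so that the effect of toy_base_free can be observed.
static int g_live_bases = 0;

toy_base* toy_base_new() {
  ++g_live_bases;
  toy_base* base = new toy_base;
  base->event_count = 0;
  return base;
}

void toy_base_free(toy_base* base) {
  if (base != nullptr) {
    --g_live_bases;
    delete base;
  }
}

int toy_base_event_count(const toy_base* base) { return base->event_count; }

// A C++ caller can let a smart pointer call the _free function.
struct ToyBaseDeleter {
  void operator()(toy_base* base) const { toy_base_free(base); }
};
using ToyBasePtr = std::unique_ptr<toy_base, ToyBaseDeleter>;

// Creates a base in an inner scope and returns the number of live objects
// once that scope has ended.
int liveBasesAfterScope() {
  {
    ToyBasePtr base(toy_base_new());
    assert(toy_base_event_count(base.get()) == 0);
  }
  return g_live_bases;
}
```

The smart pointer only changes who calls the _free function; the library side keeps its plain C interface.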
To run a libevent event loop, we have to create an event_base with event_base_new or event_base_new_with_config, run it with event_base_loop, and eventually destroy it with event_base_free.
struct event_base* base = event_base_new();
// ... register initial events
event_base_loop(base, 0);
event_base_free(base);
The event loop can be stopped by calling event_base_loopexit or event_base_loopbreak, for example, from a task on the event loop or from another thread. event_base_loopexit stops the event loop after an optional timeout and after the callbacks of all currently active events have run, whereas event_base_loopbreak stops the loop as soon as the currently running callback returns.
An "event" in the libevent library is the combination of the specification of the event we are interested in and the callback function that will be called by libevent when the event occurs. An event specification could be "READ on file descriptor 5" or "timer in 10 seconds". libevent callbacks are specified, again in typical C fashion, using a function pointer and a void* pointer to arbitrary data that is passed back to the callback function. An event callback, for example, takes the file descriptor, the events that occurred (as bits), and the void* context object that was given to libevent when registering the callback function.
typedef void (*event_callback_fn)(evutil_socket_t fd, short events, void *arg);
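To illustrate the function-pointer-plus-void* convention, here is a minimal sketch of a callback with the same shape as libevent's event callback. The toy_callback_fn alias, the ReadStats context, the bit values, and the toy_dispatch helper are hypothetical stand-ins; no libevent code is involved.

```cpp
#include <cassert>

// Same shape as a libevent event callback: descriptor, event bits, context.
using toy_callback_fn = void (*)(int fd, short events, void* arg);

// Illustrative bit values (assumptions for this sketch).
const short TOY_READ = 0x02;
const short TOY_WRITE = 0x04;

// Arbitrary application data that travels through the void* pointer.
struct ReadStats {
  int calls;
  int last_fd;
  bool saw_read;
  bool saw_write;
};

// The callback casts the context back to the type that was registered.
void onEvent(int fd, short events, void* arg) {
  ReadStats* stats = static_cast<ReadStats*>(arg);
  stats->calls += 1;
  stats->last_fd = fd;
  stats->saw_read = (events & TOY_READ) != 0;
  stats->saw_write = (events & TOY_WRITE) != 0;
}

// Stand-in for the library side: it only knows the function pointer and the
// opaque context pointer it was given at registration time.
void toy_dispatch(toy_callback_fn callback, int fd, short events, void* arg) {
  assert(callback != nullptr);
  callback(fd, events, arg);
}
```

The library never inspects the context; correctness depends on the callback casting the pointer back to the same type that was registered.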
An event (that is, an event specification with callback) is represented as the event structure that is created with event_new and destroyed with event_free following libevent's object-oriented pattern. The event_new function handles the creation of all event types (IO, timers, tasks). libevent defines macros that look like event functions for different event types, but they all boil down to calls of the event API with different parameters.
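struct event *event_new(struct event_base *base, evutil_socket_t fd, short events, event_callback_fn callback, void *callback_arg);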
event_new creates the event on the heap. Alternatively, one can initialize an existing event structure with event_assign. That's useful for wrapper libraries (like the one defined in Envoy) that embed the event in a wrapper structure whose memory is managed by the application. The event_del function removes an event from the event loop (making it non-pending) without freeing its memory.
Creating an event starts the event's lifecycle in the initialized state. It does not schedule it yet. To schedule it, one has to add the event to the event dispatcher (event_base) by calling event_add, which moves the event to the pending state. Once the conditions of the event occur, it becomes active. While an event is pending, we can remove it from the dispatcher with event_del and put it back by calling event_add again.
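The lifecycle can be summarized as a small state machine. The following sketch is a simplified model, not libevent code: ToyEvent and the toy* functions are hypothetical, and the persist field anticipates the effect of the EV_PERSIST flag by keeping the event pending after its callback has run.

```cpp
#include <cassert>

enum class EventState { Initialized, Pending, Active };

struct ToyEvent {
  EventState state;
  bool persist;  // models the effect of EV_PERSIST
  int callback_runs;
};

ToyEvent toyEventNew(bool persist) {
  ToyEvent ev;
  ev.state = EventState::Initialized;
  ev.persist = persist;
  ev.callback_runs = 0;
  return ev;
}

// Models event_add: schedules an initialized event.
void toyAdd(ToyEvent& ev) {
  if (ev.state == EventState::Initialized) ev.state = EventState::Pending;
}

// Models event_del: takes the event out of the loop again.
void toyDel(ToyEvent& ev) { ev.state = EventState::Initialized; }

// Models the loop noticing that the event's condition occurred.
void toyTrigger(ToyEvent& ev) {
  if (ev.state == EventState::Pending) ev.state = EventState::Active;
}

// Models the loop running the callback of an active event.
void toyRunCallback(ToyEvent& ev) {
  if (ev.state != EventState::Active) return;
  ++ev.callback_runs;
  ev.state = ev.persist ? EventState::Pending : EventState::Initialized;
}

// Walks one event through add, trigger, and callback; returns the final state.
EventState stateAfterOneCycle(bool persist) {
  ToyEvent ev = toyEventNew(persist);
  toyAdd(ev);
  assert(ev.state == EventState::Pending);
  toyTrigger(ev);
  assert(ev.state == EventState::Active);
  toyRunCallback(ev);
  return ev.state;
}
```

In this model a non-persistent event has to be added again after each callback, while a persistent one stays pending until it is removed explicitly.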
Looking at the event_new function, you may have missed a timeout duration. libevent does not consider the timeout part of the event specification and instead takes the timeout as a second argument to the event_add function.
int event_add(struct event *ev, const struct timeval *timeout);
As mentioned above, libevent uses this single event API for all the supported kinds of events. Besides the callback, we have three parameters at our disposal:
- file descriptor: socket file descriptor, also used for the signal number of signal events
- events: bit set of EV_TIMEOUT, EV_READ, EV_WRITE, EV_SIGNAL, EV_PERSIST, and EV_ET
- timeout
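The EV_ET flag stands for "edge triggered" and controls whether we are notified when the readiness state changes (edge-triggered) or for as long as the descriptor remains ready (level-triggered, the default). EV_PERSIST keeps an event pending after it triggers, saving us a call to event_add.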
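The difference between level-triggered and edge-triggered notification can be observed directly on the underlying platform API. The sketch below assumes Linux and uses epoll itself rather than libevent; the readableReports helper is a hypothetical name. It writes one byte into a pipe, never reads it, and polls the read end twice.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/epoll.h>
#include <unistd.h>

// Returns how many of two consecutive polls reported the pipe as readable,
// or -1 if one of the system calls failed.
int readableReports(bool edge_triggered) {
  int fds[2];
  if (pipe(fds) != 0) return -1;
  int ep = epoll_create1(0);
  if (ep < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  // Register the read end, optionally in edge-triggered mode.
  epoll_event ev = {};
  uint32_t events = EPOLLIN;
  if (edge_triggered) events |= EPOLLET;
  ev.events = events;
  ev.data.fd = fds[0];

  int reports = -1;
  const char byte = 'x';
  if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0 &&
      write(fds[1], &byte, 1) == 1) {
    reports = 0;
    // Poll twice without blocking and without consuming the byte.
    for (int i = 0; i < 2; ++i) {
      epoll_event out = {};
      if (epoll_wait(ep, &out, 1, 0) == 1) ++reports;
    }
  }

  close(ep);
  close(fds[0]);
  close(fds[1]);
  return reports;
}
```

With level triggering, both polls report the unread byte, so readableReports(false) returns 2. With edge triggering, only the first poll reports the change, so readableReports(true) returns 1. This is why edge-triggered handlers usually have to drain the descriptor before returning.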
The following table shows the meaning of the event parameters for the different event types. The prefix is combined with the new, add, and pending function names, for example, evtimer_new to create a timer event.
event type | file descriptor | events | timeout | prefix |
---|---|---|---|---|
file (socket) | socket file descriptor | READ, WRITE, PERSIST, ET | timeout (NULL = forever) | event_ |
timer/task | -1 | (none) | delay (zero timeval = as soon as possible) | evtimer_ |
signal | signal number | SIGNAL, PERSIST | timeout (NULL = forever) | evsignal_ |
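As a rough model of how the type-specific helpers reduce to the single event API, the following sketch maps each row of the table to one generic constructor. ToySpec, the toy_* functions, and the TOY_* bit values are hypothetical and only mirror the table; they are not libevent definitions.

```cpp
#include <cassert>

// Illustrative flag bits (assumptions for this sketch).
const short TOY_READ = 0x02;
const short TOY_SIGNAL = 0x08;
const short TOY_PERSIST = 0x10;

// What the generic constructor records: the descriptor slot and the bits.
struct ToySpec {
  int fd;
  short events;
};

// Generic constructor, standing in for the single event API.
ToySpec toy_event_new(int fd, short events) {
  assert(fd >= -1);
  ToySpec spec;
  spec.fd = fd;
  spec.events = events;
  return spec;
}

// File row: a real descriptor plus the IO bits of interest.
ToySpec toy_read_new(int fd) {
  return toy_event_new(fd, static_cast<short>(TOY_READ | TOY_PERSIST));
}

// Timer/task row: no descriptor and no event bits; the delay is supplied
// later, when the event is added.
ToySpec toy_timer_new() { return toy_event_new(-1, 0); }

// Signal row: the descriptor slot carries the signal number.
ToySpec toy_signal_new(int signal_number) {
  return toy_event_new(signal_number,
                       static_cast<short>(TOY_SIGNAL | TOY_PERSIST));
}
```

The helpers add no behavior of their own; they only fix some of the parameters of the generic call.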
Here is a small C program reading from a UNIX socket and intercepting the SIGINT signal for a graceful shutdown. We use the cleanup attribute supported by gcc and clang to keep the error handling under control without RAII or a defer operator.
typedef struct Buffer Buffer;
void
void
void
void
void
int
The program opens an existing UNIX socket (that can be created with nc -l -U /tmp/foo), creates the libevent dispatcher, and registers an event reading from this socket with a 1 second timeout. It also registers a signal event for SIGINT (Ctrl-C on Linux) that exits the event loop with event_base_loopbreak. The two callbacks demonstrate how we can pass arbitrary data such as the buffer or the event base to the callback functions.
Using libevent's buffer functionality, we could also write a TCP server or client with little effort. However, libevent does not offer asynchronous file IO. If we need file IO, we have to resort to multi-threading or jump to the newer asynchronous Linux APIs such as io_uring to keep the blocking functions out of the event loop.
Envoy event dispatching
Now that we have seen how libevent works (at least on the event level), we can turn to Envoy's event dispatcher implementation. The following diagram shows some of the key classes.
Envoy's Dispatcher interface encapsulates an event loop. Besides the lifecycle methods such as run and exit, this interface allows for registering callbacks for the various event types (files, timers, signals) and offers higher-level methods for managing server and client connections. There are two related interfaces: The Scheduler, which provides a single method to register a timer, and the CallbackScheduler, which allows for scheduling a callback for immediate (within the current event dispatch cycle) or asynchronous execution.
LibeventScheduler implements the two scheduler interfaces by wrapping a libevent event_base, and DispatcherImpl, the implementation of the Dispatcher interface, owns an instance of the LibeventScheduler.
Let's take a closer look at Envoy's use of the libevent API. The event_base_loop is called from the LibeventScheduler implementation.
void
We can find the creation of the file (socket) events in the assignEvents method of the FileEventImpl. The FileEventImpl contains the libevent event structure as a plain attribute (not a pointer) in its ImplBase base class and therefore uses event_assign rather than event_new. The callback passed to event_assign is the closure calling the mergeInjectedEventsAndRunCb method, and the callback object is the FileEventImpl itself.
void
Most of the code handles the conversion between Envoy's FileReadyType and FileTriggerType and libevent's event type constants. Note that Envoy always sets the EV_PERSIST flag, that is, all file events stay in the event loop after becoming active and have to be explicitly removed (using event_del) at some point. This is done in the destructor of Envoy's event base class ImplBase.
FileEventImpl's constructor first creates the event with assignEvents and then adds it to the event loop with event_add.
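The pattern described here (an event structure embedded by value, initialized in place, registered in the constructor, and removed in the destructor) can be sketched with a toy event loop. The toy_event, toy_loop, and toy_* functions below are hypothetical stand-ins for libevent's event, event_base, event_assign, event_add, and event_del, and ToyFileEvent is not Envoy's FileEventImpl; the sketch only mirrors the structure.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <utility>

struct toy_event;

// Hypothetical loop that only tracks which events are registered.
struct toy_loop {
  std::set<toy_event*> pending;
};

struct toy_event {
  toy_loop* loop;
  int fd;
  void (*cb)(int fd, short events, void* arg);
  void* arg;
};

// Initializes caller-owned memory instead of allocating a new event.
void toy_assign(toy_event* ev, toy_loop* loop, int fd,
                void (*cb)(int, short, void*), void* arg) {
  ev->loop = loop;
  ev->fd = fd;
  ev->cb = cb;
  ev->arg = arg;
}
void toy_add(toy_event* ev) { ev->loop->pending.insert(ev); }
void toy_del(toy_event* ev) { ev->loop->pending.erase(ev); }

// Fires a registered event; it stays registered (persistent behavior).
void toy_fire(toy_event* ev, short events) {
  if (ev->loop->pending.count(ev) > 0) ev->cb(ev->fd, events, ev->arg);
}

class ToyFileEvent {
 public:
  ToyFileEvent(toy_loop& loop, int fd, std::function<void(short)> cb)
      : cb_(std::move(cb)) {
    // A captureless lambda converts to a plain function pointer; `this`
    // travels through the void* context and is cast back in the callback.
    toy_assign(&raw_event_, &loop, fd,
               [](int, short events, void* arg) {
                 static_cast<ToyFileEvent*>(arg)->cb_(events);
               },
               this);
    toy_add(&raw_event_);
  }
  // Persistent events stay registered, so the wrapper removes them itself.
  ~ToyFileEvent() { toy_del(&raw_event_); }

  ToyFileEvent(const ToyFileEvent&) = delete;
  ToyFileEvent& operator=(const ToyFileEvent&) = delete;

  toy_event* raw() { return &raw_event_; }

 private:
  std::function<void(short)> cb_;
  toy_event raw_event_;  // embedded by value, not allocated separately
};
```

Because the loop holds a pointer to the embedded structure and to the wrapper itself, the wrapper is made non-copyable in this sketch, and its destructor unregisters the event before the memory goes away.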
Envoy uses the libevent API in a similar fashion for timers and scheduled tasks. The main difference is that these events are not added automatically to the event loop. Instead, the Timer and SchedulableCallback interfaces offer methods to enable and disable the events explicitly. These methods call libevent's event_add and event_del functions, respectively, to add and remove the event from the event loop.
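The explicit enable/disable style can be sketched in the same spirit. ToyLoop is a hypothetical loop with a virtual clock whose add and del methods stand in for event_add and event_del; ToyTimer and its method names are assumptions chosen for illustration and are not Envoy's actual classes.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <map>
#include <utility>
#include <vector>

typedef std::chrono::milliseconds Ms;

class ToyTimer;

// Hypothetical event loop with a virtual clock.
class ToyLoop {
 public:
  void add(ToyTimer* timer, Ms delay) { pending_[timer] = now_ + delay; }
  void del(ToyTimer* timer) { pending_.erase(timer); }
  bool isPending(ToyTimer* timer) const { return pending_.count(timer) > 0; }
  void advance(Ms elapsed);  // moves the clock and fires due timers

 private:
  Ms now_{0};
  std::map<ToyTimer*, Ms> pending_;
};

class ToyTimer {
 public:
  // Creating the timer does not schedule it.
  ToyTimer(ToyLoop& loop, std::function<void()> cb)
      : loop_(loop), cb_(std::move(cb)) {}
  ~ToyTimer() { disableTimer(); }

  ToyTimer(const ToyTimer&) = delete;
  ToyTimer& operator=(const ToyTimer&) = delete;

  void enableTimer(Ms delay) { loop_.add(this, delay); }  // like event_add
  void disableTimer() { loop_.del(this); }                // like event_del
  bool enabled() { return loop_.isPending(this); }
  void fire() { cb_(); }

 private:
  ToyLoop& loop_;
  std::function<void()> cb_;
};

void ToyLoop::advance(Ms elapsed) {
  now_ += elapsed;
  std::vector<ToyTimer*> due;
  for (const auto& entry : pending_) {
    if (entry.second <= now_) due.push_back(entry.first);
  }
  // Timers are one-shot: remove each due timer before running its callback.
  for (ToyTimer* timer : due) {
    if (pending_.erase(timer) > 0) timer->fire();
  }
}
```

A timer created this way does nothing until it is enabled, fires at most once per enable call, and can be cancelled at any time before its deadline.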
As we can see, Envoy's event management is a thin layer on top of libevent, shielding the rest of the application from the libevent API and taking care of the event lifecycles. The Envoy team considered moving to a newer event library during the early years, but for now libevent is working just fine, and any change to such a fundamental aspect of Envoy is not worth the effort and risk.