Envoy factory registry
Andreas Hohmann March 01, 2024 #envoy #proxy
Envoy is an incredibly flexible proxy server. Almost every aspect of the processing chain from the listeners all the way to the management of the upstream connections is configurable. This is even more impressive given that the configuration can be changed on the fly (by reloading a file or via the xDS API) without restarting the server. Moreover, we are not restricted to Envoy's built-in features but can add our own components with their custom configuration.
How does Envoy pull this off given that C++ is not known as a particularly "dynamic" language? This post tries to shed some light on Envoy's factory registries that form the foundation of this mechanism.
Let's start with the caller side. In various places, Envoy has to construct objects implementing some interface using a given configuration. Here is an example from the code handling the upstream connections.
[Listing: the createTransportSocketFactory method, returning Network::UpstreamTransportSocketFactoryPtr]
The createTransportSocketFactory method creates a socket factory based on the cluster configuration. The key is the getAndCheckFactory method that takes the transport_socket protobuf configuration and returns a factory object of type UpstreamTransportSocketConfigFactory.
TransportSocket is one of Envoy's fully dynamic configuration objects using a TransportSocket.typed_config, a protobuf Any message. The getAndCheckFactory method calls getFactoryByType with the typed config. If this does not result in a factory, the method tries a lookup by name instead.
[Listing: the static getAndCheckFactory method, returning Factory*]
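The fallback from a lookup by type to a lookup by name can be sketched in a few lines. This is a toy model, not Envoy's code: ToyFactory, ToyExtensionConfig, ToyRegistry, and lookupFactory are hypothetical names, and plain std::map tables stand in for the registry. How a failed lookup is reported is left out; the sketch simply returns a null pointer.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a factory interface; not an Envoy type.
struct ToyFactory {
  std::string label;
};

// Hypothetical stand-in for an extension config: a name plus the type URL
// of its typed config (empty when no typed config is present).
struct ToyExtensionConfig {
  std::string name;
  std::string type_url;
};

// Toy lookup tables standing in for the registry.
struct ToyRegistry {
  std::map<std::string, ToyFactory*> by_type;
  std::map<std::string, ToyFactory*> by_name;
};

// Try the lookup by config type first, then fall back to the lookup by name.
// Returns nullptr when both lookups fail.
inline ToyFactory* lookupFactory(const ToyRegistry& registry, const ToyExtensionConfig& config) {
  if (!config.type_url.empty()) {
    const auto by_type = registry.by_type.find(config.type_url);
    if (by_type != registry.by_type.end()) {
      return by_type->second;
    }
  }
  const auto by_name = registry.by_name.find(config.name);
  return by_name != registry.by_name.end() ? by_name->second : nullptr;
}

// Example: the type lookup wins when it succeeds; otherwise the name is used.
inline void lookupExample() {
  ToyFactory tls{"tls"};
  ToyFactory raw{"raw_buffer"};
  ToyRegistry registry;
  registry.by_type["type.example/Tls"] = &tls;
  registry.by_name["raw_buffer"] = &raw;

  assert(lookupFactory(registry, {"anything", "type.example/Tls"}) == &tls);
  assert(lookupFactory(registry, {"raw_buffer", ""}) == &raw);
  assert(lookupFactory(registry, {"unknown", ""}) == nullptr);
}
```

Calling lookupExample() runs the three cases: a hit by type, a hit by name, and a miss.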
Now we have to understand the lookup of a factory by type (typed config) and by name. Let's start with the lookup by name.
[Listing: the static getAndCheckFactoryByName method, returning Factory*]
getAndCheckFactoryByName calls a static method of a "factory registry" that is parameterized by the type of the factories we are interested in. There must be a static registry object per Factory type that contains all the factories that implement this type, and Envoy must somehow register all the available factories in this registry. The FactoryRegistry is indeed just a collection of static methods, that is, a singleton similar to a Scala or Kotlin object.
Static objects in C++ are notorious for their unpredictable construction and destruction order across translation units, known as the "static initialization order fiasco" or SIOF for short. That's why many C++ style guides disallow static objects altogether. How does Envoy get around these problems? A "registry" that keeps factory objects in some map is definitely a rich object with a non-trivial constructor and destructor.
The first step is to apply "initialization on first use" and place the static variable inside a function instead of at the top level. The variable is initialized when the function is called for the first time.
[Listing: a function containing a static local variable]
This technique is also known as Meyers' Singleton. While using static local objects solves the initialization problem, they may still cause trouble during destruction because of dependencies between these static objects. That's why the "initialization on first use" pattern recommends using pointers, allocating the objects on the heap, and never deallocating them. While this theoretically creates a memory leak, the objects live for the duration of the program, and the operating system will release the memory at the end of the process. The destructors will never be called, however, so those objects must not rely on destructors that do anything meaningful besides freeing memory.
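The two variants look like this side by side. The names Table, localStaticTable, and leakedTable are made up for this illustration. Since C++11, the initialization of a function-local static is thread-safe in both variants.

```cpp
#include <cassert>
#include <map>
#include <string>

using Table = std::map<std::string, int>;

// Variant 1: function-local static object (Meyers' Singleton).
// Constructed on the first call, destroyed during static destruction at exit.
inline Table& localStaticTable() {
  static Table table;
  return table;
}

// Variant 2: function-local static pointer to a heap-allocated object.
// Constructed on the first call and intentionally never deleted, so its
// destructor never runs and cannot interfere with other static destructors.
inline Table& leakedTable() {
  static Table* table = new Table();
  return *table;
}

// Example: every call returns the same object.
inline void singletonExample() {
  localStaticTable()["a"] = 1;
  leakedTable()["b"] = 2;
  assert(&localStaticTable() == &localStaticTable());
  assert(localStaticTable().at("a") == 1);
  assert(leakedTable().at("b") == 2);
}
```

The only difference visible to callers is at program exit: the first table is destroyed in some order relative to other statics, the second one is simply left alone.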
FactoryRegistry defines the static map of factories (per factory type parameter Base) in the factories method. Note how the definition as a template gives us a new registry for a Base factory type by just calling one of the static methods.
[Listing: the FactoryRegistry class template with its static factories method and the lookup by name]
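A stripped-down version of this idea, assuming nothing beyond the standard library: ToyFactoryRegistry and the three factory types below are hypothetical, and std::map replaces the hash map that Envoy uses. The point is that each Base gets its own map just because the template is instantiated for it.

```cpp
#include <cassert>
#include <map>
#include <string>

template <class Base> class ToyFactoryRegistry {
public:
  // One map per Base, created on first use and intentionally never destroyed.
  static std::map<std::string, Base*>& factories() {
    static auto* factories = new std::map<std::string, Base*>();
    return *factories;
  }

  // Lookup by name; nullptr if nothing is registered under that name.
  static Base* getFactory(const std::string& name) {
    const auto it = factories().find(name);
    return it != factories().end() ? it->second : nullptr;
  }
};

// Two unrelated, hypothetical factory interfaces and one implementation.
struct ToyFilterFactory {
  virtual ~ToyFilterFactory() = default;
};
struct ToySocketFactory {
  virtual ~ToySocketFactory() = default;
};
struct ToyGzipFilterFactory : ToyFilterFactory {};

// Example: the registries for the two interfaces are independent.
inline void registryExample() {
  static ToyGzipFilterFactory gzip;
  ToyFactoryRegistry<ToyFilterFactory>::factories()["gzip"] = &gzip;
  assert(ToyFactoryRegistry<ToyFilterFactory>::getFactory("gzip") == &gzip);
  assert(ToyFactoryRegistry<ToySocketFactory>::getFactory("gzip") == nullptr);
}
```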
Now that we know where the factories are kept and how the lookup by name is performed, let's figure out how the factories are registered. The registerFactory method stores the pointer to the given factory in the map under the given name.
[Listing: the FactoryRegistry::registerFactory method]
Note that this method is not threadsafe. We have to make sure that all factories are registered before they are used from other threads. Envoy provides a couple of helper classes and macros to encourage this. The helper class RegisterFactory captures the registration of a single factory. The factory is constructed as a field (using the factory's default constructor) and registered in the constructor.
[Listing: the RegisterFactory class template]
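Here is a minimal sketch of such a registration helper together with a toy registry. All names (ToyFactoryRegistry, ToyRegisterFactory, ToyCodecFactory, ToyGzipCodecFactory) are hypothetical, and the registry is reduced to what the helper needs; Envoy's real classes do more.

```cpp
#include <cassert>
#include <map>
#include <string>

template <class Base> class ToyFactoryRegistry {
public:
  static std::map<std::string, Base*>& factories() {
    static auto* factories = new std::map<std::string, Base*>();
    return *factories;
  }

  // Stores a non-owning pointer; the caller keeps the factory alive.
  // Not thread-safe: call this before other threads use the registry.
  static void registerFactory(Base& factory, const std::string& name) {
    factories()[name] = &factory;
  }
};

// Owns one default-constructed T and registers it under T::name().
template <class T, class Base> class ToyRegisterFactory {
public:
  ToyRegisterFactory() { ToyFactoryRegistry<Base>::registerFactory(instance_, instance_.name()); }

private:
  T instance_{};
};

struct ToyCodecFactory {
  virtual ~ToyCodecFactory() = default;
  virtual std::string name() const = 0;
};
struct ToyGzipCodecFactory : ToyCodecFactory {
  std::string name() const override { return "gzip"; }
};

// Example: constructing the helper is all it takes to register the factory.
inline void registrationExample() {
  static ToyRegisterFactory<ToyGzipCodecFactory, ToyCodecFactory> registration;
  auto& factories = ToyFactoryRegistry<ToyCodecFactory>::factories();
  assert(factories.count("gzip") == 1);
  assert(factories.at("gzip")->name() == "gzip");
}
```

The helper object must outlive every lookup, because the registry only stores a pointer to the factory held inside it. That is why the example keeps it in a static variable.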
The factory class T is implicitly assumed to derive from Base, to be default-constructible, and to have a name method returning a string. With C++20 we can require this concept explicitly:
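```cpp
// Needs <concepts> and <string>, plus the FactoryRegistry template from above.
template <typename F, typename Base>
concept Factory = std::default_initializable<F>
    && std::derived_from<F, Base>
    && requires(F f) {
      { f.name() } -> std::convertible_to<std::string>;
    };

template <class F, class Base>
  requires Factory<F, Base>
class RegisterFactory {
public:
  RegisterFactory() {
    FactoryRegistry<Base>::registerFactory(instance_, instance_.name());
  }
private:
  F instance_;
};
```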
Now we have a registration class, but we still need to instantiate this class for a concrete factory type. To this end, Envoy uses the static local variable trick once more. The REGISTER_FACTORY macro defines a top-level forceRegister function containing the static pointer to the RegisterFactory object:
[Listing: the REGISTER_FACTORY macro definition]
Here is the macro call for the DecompressorFilterFactory as an example:
[Listing: the REGISTER_FACTORY call for DecompressorFilterFactory]
The registration functions are then called explicitly:
[Listing: a void function that calls the forceRegister functions]
Note that the definition of the macro depends on the ENVOY_STATIC_EXTENSION_REGISTRATION flag. If set, the factory registration is a plain static object and the forceRegister function is empty.
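To see how the pieces fit together, here is a toy version of the variant without the flag: a macro that generates a forceRegister function holding the registration in a static local pointer, plus one function that calls it explicitly. TOY_REGISTER_FACTORY, ToyPlugin, ToyRegister, plugins, and registerAllPlugins are invented for this sketch and only mimic the shape of Envoy's macro.

```cpp
#include <cassert>
#include <map>
#include <string>

struct ToyPlugin {
  virtual ~ToyPlugin() = default;
  virtual std::string name() const = 0;
};

// Toy registry: a single map created on first use and never destroyed.
inline std::map<std::string, ToyPlugin*>& plugins() {
  static auto* map = new std::map<std::string, ToyPlugin*>();
  return *map;
}

// Owns one T and registers it under its own name on construction.
template <class T> struct ToyRegister {
  ToyRegister() { plugins()[instance_.name()] = &instance_; }
  T instance_{};
};

// Hypothetical macro: defines forceRegister<FACTORY>(). The first call
// constructs the registration object, which is never freed; later calls
// do nothing.
#define TOY_REGISTER_FACTORY(FACTORY)                       \
  inline void forceRegister##FACTORY() {                    \
    static auto* registered = new ToyRegister<FACTORY>();   \
    (void)registered;                                       \
  }

struct EchoPlugin : ToyPlugin {
  std::string name() const override { return "echo"; }
};
TOY_REGISTER_FACTORY(EchoPlugin)

// All registrations happen here, at a point the program controls.
inline void registerAllPlugins() { forceRegisterEchoPlugin(); }
```

Nothing is registered until registerAllPlugins() runs, so the program decides when registration happens instead of leaving it to static initialization order.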
We followed the factory lookup by name all the way to the registration. This leaves the lookup by configuration (protobuf) type that we noticed at the very beginning in the getAndCheckFactory method:
[Listing: the static getAndCheckFactory method, returning Factory*]
The getFactoryByType method delegates to the static FactoryRegistry method of the same name. This getFactoryByType follows the same patterns as the getFactory method:
[Listing: the static FactoryRegistry::getFactoryByType method, returning Base*]
The only difference is the factoriesByType call instead of the factories call. The registration by type is not performed when a factory is registered. Instead, the map from type name to factory is created lazily by the buildFactoriesByType method and stored in yet another static pointer variable in factoriesByType. To be threadsafe, this method has to be called once in the main thread after all factories have been registered for a given factory type but before getFactoryByType is called from another thread.
The buildFactoriesByType method implicitly assumes that the Base factory interface has a configTypes method returning the set of type strings under which to register the factory.
[Listing: the static buildFactoriesByType method, returning std::unique_ptr<absl::flat_hash_map<std::string, Base*>>]
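The lazy construction of the type map can be sketched as follows. This is a simplified model under my own assumptions: ToyUntypedFactory, ToyFactoryRegistry, and ToyTlsFactory are hypothetical, std::map replaces absl::flat_hash_map, and the type string is made up.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>

struct ToyUntypedFactory {
  virtual ~ToyUntypedFactory() = default;
  virtual std::string name() const = 0;
  // Default: the factory is not registered under any config type.
  virtual std::set<std::string> configTypes() { return {}; }
};

template <class Base> class ToyFactoryRegistry {
public:
  using Map = std::map<std::string, Base*>;

  static Map& factories() {
    static auto* factories = new Map();
    return *factories;
  }

  // Built once, on the first call, from whatever is registered at that point.
  static Map& factoriesByType() {
    static Map* by_type = buildFactoriesByType().release();
    return *by_type;
  }

  static Base* getFactoryByType(const std::string& type) {
    const auto it = factoriesByType().find(type);
    return it != factoriesByType().end() ? it->second : nullptr;
  }

private:
  static std::unique_ptr<Map> buildFactoriesByType() {
    auto mapping = std::make_unique<Map>();
    for (const auto& entry : factories()) {
      for (const auto& type : entry.second->configTypes()) {
        mapping->emplace(type, entry.second);
      }
    }
    return mapping;
  }
};

struct ToyTlsFactory : ToyUntypedFactory {
  std::string name() const override { return "tls"; }
  std::set<std::string> configTypes() override { return {"type.example/TlsConfig"}; }
};

// Example: register by name first, then look the factory up by config type.
inline void typeLookupExample() {
  using Registry = ToyFactoryRegistry<ToyUntypedFactory>;
  static ToyTlsFactory tls;
  Registry::factories()["tls"] = &tls;
  assert(Registry::getFactoryByType("type.example/TlsConfig") == &tls);
  assert(Registry::getFactoryByType("type.example/Other") == nullptr);
}
```

Because the type map is a snapshot taken on the first call, a factory registered afterwards would not show up in it. In this sketch that is the reason all registrations have to be finished before the first lookup by type.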
As a C++20 concept, this would read:
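```cpp
// Needs <concepts>, <set>, and <string>.
template <typename Base>
concept FactoryBase = requires(Base& base) {
  { base.configTypes() } -> std::convertible_to<std::set<std::string>>;
};
```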
So, in the end, both the name and the type names are defined by the factory itself through the name and configTypes methods. The name method is called on the concrete factory objects whereas the configTypes method must exist in the base factory interface. It is therefore a virtual method in the UntypedFactory that all factories derive from. The default implementation returns an empty type name set. The name method is also a virtual method in this class, but does not strictly have to be in the base factory interface.
[Listing: the UntypedFactory class with its virtual name and configTypes methods]
This completes our little tour through the Envoy factory registration implementation.
Update 2024-03-22: After writing this post I stumbled upon Abseil's NoDestructor class that solves the destruction order problem of static objects by not running the destructor of the wrapped object. In contrast to the static pointer to a heap-allocated object that is never freed, NoDestructor lets us keep the object in static storage and save one pointer indirection:
[Listing: a static object wrapped in NoDestructor]
In the case of Envoy's FactoryRegistry, we could wrap the static factory hash map in a NoDestructor:
[Listing: FactoryRegistry with the static factory map wrapped in a NoDestructor]
How does NoDestructor work? It's mainly a wrapper around placement new, constructing the wrapped object in a plain char array and never calling the destructor. Fortunately, template argument packs and perfect forwarding are tailor-made for such a wrapper.
[Listing: a simplified NoDestructor implementation]
The actual NoDestructor template contains this implementation and adds the operators that make the NoDestructor wrapper look like a pointer.
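To make the mechanism concrete, here is a toy wrapper in the same spirit. ToyNoDestructor is my own simplified sketch, not Abseil's implementation: it constructs the object with placement new into an aligned byte buffer, forwards the constructor arguments, and has no destructor of its own, so the wrapped object's destructor never runs. The Tracked type and the counters function exist only for the illustration, and std::launder requires C++17.

```cpp
#include <cassert>
#include <map>
#include <new>
#include <string>
#include <utility>

template <typename T> class ToyNoDestructor {
public:
  // Forward any constructor arguments to T, built in place in storage_.
  template <typename... Args> explicit ToyNoDestructor(Args&&... args) {
    new (storage_) T(std::forward<Args>(args)...);
  }
  ToyNoDestructor(const ToyNoDestructor&) = delete;
  ToyNoDestructor& operator=(const ToyNoDestructor&) = delete;

  // Pointer-like access to the wrapped object.
  T* get() { return std::launder(reinterpret_cast<T*>(storage_)); }
  T& operator*() { return *get(); }
  T* operator->() { return get(); }

private:
  // Raw bytes only: destroying the wrapper does not destroy the T.
  alignas(T) unsigned char storage_[sizeof(T)];
};

// Helper type that counts destructor calls through a caller-owned counter.
struct Tracked {
  explicit Tracked(int* counter) : counter_(counter) {}
  ~Tracked() { ++*counter_; }
  int* counter_;
};

// A static object kept in static storage, without a heap allocation.
inline std::map<std::string, int>& counters() {
  static ToyNoDestructor<std::map<std::string, int>> counters;
  return *counters;
}

// Example: a plain Tracked is destroyed at scope exit, a wrapped one is not.
inline void noDestructorExample() {
  int destroyed = 0;
  { Tracked plain(&destroyed); }
  assert(destroyed == 1);
  { ToyNoDestructor<Tracked> wrapped(&destroyed); }
  assert(destroyed == 1);
}
```

The example uses a local wrapper only to make the skipped destructor observable; the intended use is the static variable in counters().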
That's the best solution for static objects in C++ that I'm aware of (besides not using static objects to begin with; see, for example, LLVM's rule).