
I'm trying to imagine how I could implement Cloud Run's scale-to-zero feature, to educate myself. Let's say I'm running either containers (restored with CRIU) or KVM images; the scenario would be:

  • A client starts a request (the protocol might be HTTP, TCP, UDP, ...)
  • The node receives the request
  • If a container is ready to serve, forward the connection as normal
  • If no container is available, first start it, then forward the connection

I can imagine implementing this via a load balancer (eBPF? A custom app?), which would be in charge of terminating connections, but I'm fuzzy on the details (a rough sketch of what I have in mind follows the questions below).

  • Wouldn't the connection possibly time out while the container is starting? I could mitigate this by using CRIU for fast boots
  • Are there already projects covering this?
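
To make the question concrete, here is a rough sketch of the custom-app variant I have in mind (TCP only; the container name, the port numbers, and the podman invocation are all made up for illustration):

    # Rough sketch: listen on a front port, start the backend container on
    # the first connection, then splice bytes both ways.
    import socket
    import subprocess
    import threading
    import time

    BACKEND = ("127.0.0.1", 8080)  # where the container listens once up

    def ensure_backend_running():
        # "podman start" on an already-running container is a no-op;
        # real code would inspect the container state instead.
        subprocess.run(["podman", "start", "mybackend"], check=False)
        deadline = time.monotonic() + 30
        while time.monotonic() < deadline:  # crude readiness probe
            try:
                socket.create_connection(BACKEND, timeout=1).close()
                return
            except OSError:
                time.sleep(0.1)
        raise TimeoutError("backend did not come up in time")

    def pump(src, dst):
        # Copy bytes one way until EOF, then half-close the peer.
        while data := src.recv(65536):
            dst.sendall(data)
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

    def handle(client):
        ensure_backend_running()
        upstream = socket.create_connection(BACKEND)
        threading.Thread(target=pump, args=(upstream, client), daemon=True).start()
        pump(client, upstream)

    with socket.create_server(("", 9000)) as srv:
        while True:
            conn, _ = srv.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()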

1 Answer


I can imagine implementing this via a load balancer

Systemd can do that. So could xinetd, if you're really oldskool.
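
For illustration, a minimal systemd socket-activation pair might look like this (the unit and binary names are made up; the matching basenames are what tie the two units together):

    # myservice.socket: systemd itself listens on the port while the
    # service is down
    [Socket]
    ListenStream=9000

    [Install]
    WantedBy=sockets.target

    # myservice.service: started on the first incoming connection; it
    # inherits the listening socket as fd 3
    [Service]
    ExecStart=/usr/local/bin/myservice

After systemctl enable --now myservice.socket, systemd holds the port open and spawns the service lazily on first use.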

Wouldn't the connection possibly time out while the container is starting?

That depends on the application. But typically, what happens is that the launching supervisor (e.g., systemd) accepts the TCP connection, starts the service (in whatever way), and passes in the connected socket as a file descriptor (of course, that doesn't work with VMs).
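
For the per-connection variant (Accept=yes in the .socket unit), the service instance is handed the already-connected socket as the first passed descriptor, which is fd 3 under systemd's LISTEN_FDS protocol. A minimal sketch of the receiving side (Python here just for illustration):

    # Sketch of a socket-activated service picking up the passed socket.
    import os
    import socket

    SD_LISTEN_FDS_START = 3  # first fd passed by systemd

    if int(os.environ.get("LISTEN_FDS", "0")) < 1:
        raise SystemExit("not started via socket activation")

    # With Accept=yes this is the connected socket for one client;
    # with Accept=no it would be the listening socket instead.
    conn = socket.socket(fileno=SD_LISTEN_FDS_START)
    conn.sendall(b"hello from a freshly started service\n")
    conn.close()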

If your application gives up quickly when nothing is replied within a few milliseconds of connection establishment, then yes, that would happen.

Are there already projects covering this?

As said, systemd on any modern Linux server or desktop system already does that. Here's a relatively in-depth discussion of how that works. You're looking at what Poettering calls "Type 2", so do read that article and the one linked from there; even if you don't end up using systemd but some other socket-activation manager, it's good to have read it to understand what you're up against.

With modern versions of container engines (most importantly podman and related), you can start a systemd manager within a container, so you get a "managed" version of systemd for this particular purpose as well.

Note that starting containers (pods) is a pretty typical thing for systemd units to do; podman even has functionality to generate systemd unit files for existing containers, so you can directly say: "to provide this service, this container's systemd unit needs to be started, and through its dependencies it starts all the other pods/containers it needs".
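
For instance, podman generate systemd --new --name mybackend (container name hypothetical) emits a unit roughly along these lines, abridged here since the exact output varies with the podman version:

    [Unit]
    Description=Podman container mybackend

    [Service]
    Type=notify
    ExecStart=/usr/bin/podman run --rm --sdnotify=conmon --replace --name mybackend -d my-image:latest
    ExecStop=/usr/bin/podman stop -t 10 mybackend

    [Install]
    WantedBy=default.target

Dropping that file into ~/.config/systemd/user/ makes the container a first-class systemd service that other units can depend on.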

  • Yeah, this really does sound like what inetd (xinetd? Modern upstart!) did in the old days (insert happy-memory sigh of a less complicated age...). Of course, an app startup was fractions of a second; a container startup is slower, and a VM startup slower still! Commented Dec 3 at 1:25
  • @StephenHarris ah, starting a container can be pretty lightweight (in the end, it's really just a few syscalls, unpacking a container image, and starting whatever process the container defines as its entry point); the bottleneck for container starts in cloud environments is often the artificially throttled IO speed at which the image is pulled and at which it can be unpacked to storage. Both can be worked around by avoiding cold container starts. As to VMs, I'm assuming Mascarpone is thinking of something like Firecracker or similar… Commented Dec 3 at 10:17
  • … micro-VM things that integrate into container lifetime managers. And then you're back down to milliseconds of startup time (because you wouldn't cold-boot these, but resume from a snapshot, probably) Commented Dec 3 at 10:20
  • @MarcusMüller thanks!!! I will look up the Firecracker code to educate myself! Following your advice I managed to build an L4 prototype, but now I have more questions: unix.stackexchange.com/questions/801813/… Commented Dec 3 at 11:36
