Detour into detached net namespace, bridged networking with CNI and another daemon #28
richiejp announced in Announcements
This week I got much deeper into Buildkit's networking, and a variety of other problems, than I had planned. I thought I could quickly add rootless mode, a zero-dependency install and a massive performance improvement (previous post #27).
I did achieve two of those things, and a single-binary installation is closer (#26). However, one issue leads to another and I've had to dig deeper into how Buildkit runs containers than I wanted to right now. This is not because I dislike Buildkit, I think it is fantastic; it's just that this whole thing is distracting from the assistant plugin #22.
I spent a fair amount of time on things like a vertical tab character messing with terminal output. This is the ASCII character with the '\v' escape code, which started showing up in terminal output when I enabled TTYs for containers running in Buildkit. Neither Nerdctl nor Docker ever outputs it (8f675c7).
This is the kind of issue that some changes create in bulk. However it is relatively minor compared to the problems created with networking.
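As a rough illustration of the vertical tab problem, here is a minimal sketch of filtering the character out of a chunk of terminal output. It is not necessarily what commit 8f675c7 does; the function name `strip_vertical_tabs` is made up for this example, and it assumes simply dropping the byte is acceptable.

```python
def strip_vertical_tabs(chunk: bytes) -> bytes:
    """Drop ASCII VT bytes (0x0b, the '\\v' escape) from terminal output."""
    return chunk.replace(b"\x0b", b"")


if __name__ == "__main__":
    # Hypothetical chunk of TTY output with a stray vertical tab in it.
    raw = b"step 1 done\x0b\r\nstep 2 done\r\n"
    print(strip_vertical_tabs(raw))
```

Working on bytes rather than decoded text avoids having to guess the encoding of whatever the container process writes.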
A somewhat pleasant surprise to me is that Buildkit specifically supports running processes in containers in a way more similar to the `docker run` CLI than the `RUN` step in a Dockerfile, with the caveat that options relating to networking are largely missing. The features there are good for setting up the network to make outgoing connections, but for incoming connections things are more limited. This is expected, as the build process usually involves downloading stuff, not serving resources. The exception is when testing is built into the build process, but then it's probably two processes or threads in the same container connecting over the loopback device.
However, Buildkit kind of supports incoming connections by accident, and Rootlesskit, which is used to wrap Buildkit so that it runs as a fake root user rather than the real one, explicitly supports port forwarding.
Buildkit has a number of networking options. One is to run in host network mode, meaning that the container process has the same network namespace as whatever Buildkit is running in. That could be the root namespace, where your real ethernet and WiFi devices are, or a virtual one.
When run in Rootlesskit with host networking, the container processes all run in the network namespace created by Rootlesskit. So I thought I would just run Buildkitd in Rootlesskit and use Rootlesskit's port forwarding. Since Ayup only supports running one app at a time, we don't need to worry about multiple apps needing the same port.
It turns out, though, that running Buildkit inside a network namespace invites some issues. It couldn't connect to the OpenTelemetry daemon running on localhost, so I was not getting traces from Buildkit, and getting telemetry from Buildkit is a feature I really like.
No doubt there are other issues too, which is why `--detach-netns` was added to Rootlesskit. This makes Rootlesskit create a new network namespace, but does not run Buildkitd inside of it. Instead it creates a file that Buildkitd can reference to get the namespace and temporarily enter it when some network setup is needed.
So I turned that on, which fixed telemetry and broke container networking. As far as I could tell, the container was now running in the real host's network. Buildkit did not appear to put it inside the detached namespace, so all isolation was removed.
However, Buildkitd does support using the detached namespace, just not for host networking, the simplest kind. It does support it for "bridge" or "cni" networking, both of which use CNI plugins (https://github.com/containernetworking/cni).
I'm guessing that "cni" networking allows you to submit some arbitrary CNI config, but I haven't touched it. Bridge networking uses the CNI bridge plugin to create a bridge device with individual virtual ethernet devices for each container attached to it. This means each container gets its own IP address which we can access while inside the Rootlesskit network namespace.
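For readers who haven't seen one, a generic CNI bridge plugin configuration looks roughly like the dict built below. This is a sketch based on the fields documented for the upstream bridge and host-local plugins, not the config Buildkit actually generates; the network name, bridge device name and subnet are made-up values.

```python
import json


def bridge_conf(name: str, bridge: str, subnet: str) -> dict:
    """Build a generic CNI bridge network config (illustrative values only)."""
    return {
        "cniVersion": "1.0.0",
        "name": name,
        "type": "bridge",        # use the bridge plugin
        "bridge": bridge,        # name of the bridge device to create
        "isGateway": True,       # give the bridge itself an IP
        "ipMasq": True,          # masquerade outgoing container traffic
        "ipam": {
            "type": "host-local",               # allocate IPs from a local range
            "ranges": [[{"subnet": subnet}]],
            "routes": [{"dst": "0.0.0.0/0"}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names and subnet, chosen only for the example.
    print(json.dumps(bridge_conf("demo-net", "demo0", "10.10.0.0/16"), indent=2))
```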
Enabling bridge networking solves the issue with detach-netns, but it creates a new one: it breaks Rootlesskit's port forwarding, because each container is now isolated. There is no way in Buildkit (that I could see in the time I have) to forward ports from individual containers.
Perhaps this could be added upstream, but for now I decided to add another daemon which wraps buildkitd and runs inside of Rootlesskit. It can also attach itself to the network namespace and forward ports. It also performs some other fixups unrelated to networking.
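The port forwarding part of such a daemon boils down to a TCP relay. Below is a minimal sketch of one, assuming the process is already in a network namespace from which the container's IP is reachable; joining the namespace is left out entirely. The names `forward_port`, `_handle` and `_pump` are invented for this example and are not taken from Ayup.

```python
import socket
import threading


def _pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches EOF or errors."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        # Pass the EOF on; the opposite direction may still be flowing.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def _handle(client: socket.socket, target: tuple[str, int]) -> None:
    """Relay one accepted connection to the target address."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    back = threading.Thread(target=_pump, args=(upstream, client), daemon=True)
    back.start()
    _pump(client, upstream)
    back.join()
    client.close()
    upstream.close()


def forward_port(listen_addr: tuple[str, int], target: tuple[str, int]) -> socket.socket:
    """Listen on listen_addr and relay every connection to target.

    Returns the listening socket so the caller can read the bound port
    and close it when done.
    """
    server = socket.create_server(listen_addr)

    def accept_loop() -> None:
        while True:
            try:
                client, _ = server.accept()
            except OSError:
                return  # accept failed, e.g. the listening socket was closed
            threading.Thread(
                target=_handle, args=(client, target), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server
```

A call such as `forward_port(("0.0.0.0", 8080), ("10.10.0.2", 8080))` would relay a host-facing port to a container, where the container address is a placeholder. One thread per direction keeps the sketch short; a real daemon would likely want timeouts and connection limits.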
Finally, there is one more problem: finding out which IP address a particular container has. Buildkit doesn't appear to care what IP a container has, but the CNI itself does publish this information to a file. So once I read that file I got port forwarding back.
There are still some other issues to iron out with delivering signals to Buildkit and the containers.
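Reading the address back could look something like the sketch below. Which file is read, and its exact layout, are not stated above, so this assumes a JSON document in the CNI result format, where `ips` is a list of entries with a CIDR `address`. It also accepts the result nested under a `result` key, as I believe libcni's cache files do. The function name `container_ips` is invented for the example.

```python
import ipaddress
import json


def container_ips(result_json: str) -> list[str]:
    """Return the bare IP addresses listed in a CNI result document."""
    doc = json.loads(result_json)
    # Assumption: the result is either the whole document or sits under
    # a "result" key (as in a libcni cache entry).
    result = doc.get("result", doc)
    ips = []
    for entry in result.get("ips", []):
        # "address" is in CIDR form, e.g. "10.10.0.2/16"; keep only the host IP.
        ips.append(str(ipaddress.ip_interface(entry["address"]).ip))
    return ips


if __name__ == "__main__":
    # Made-up sample in the shape of a CNI result.
    sample = json.dumps(
        {"cniVersion": "1.0.0", "ips": [{"address": "10.10.0.2/16", "gateway": "10.10.0.1"}]}
    )
    print(container_ips(sample))
```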