Rewrite the gRPC API to ditch google.protobuf.Any in favor of protobuf native oneof #2880
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Hi,
This pull request introduces a major change to the GoBGP API: transitioning from
google.protobuf.Any
to protobuf's nativeoneof
feature.Why
The BGP protocol uses a lot of list of various types, that reflects in the gRPC API. For example, path attributes can be any type of a specific subset. Currently, the API uses runtime dynamic types in the form of
Any
, allowing to hold any protobuf type. Which values are accepted is documented by a comment above the field, indicating which types are legal.There are, however, a few issues with this:
void*
in C, and one has to rely on the comment for thisIn a nutshell, the main motivation is type safety and discoverability.
In addition to those usability aspects, a performance aspect is present as well, since it requires dynamic typing and parsing of the actual type URL. The GC will also be happier, as it means less allocations and garbage (it was, in fact, looking at pprof that initially triggered this initiative).
There also is an improvement on the wire usage, as all variants of a oneof share the same memory once encoded. Since that is always a sub-message for gobgp, it's always a pointer, so all those oneofs only eat up a pointer on the wire. A net win.
How
The API structure did not change. All
Any
s were taken one by one, and a correspondingoneof
message was written.TunnelEncapTLV
'stlvs
has its oneof message nested since it's only used in this message)NLRI
,RouteDistinguisher
)ExtendedCommunity
that does even have its own.proto
file)Once those were identified, the various
Marshal
/Unmarshal
helpers were used to infer the correct set of accepted values in the oneof (many documentation comments were out of sync, especially for stuff used at several places).Due to include cycles, the big
attribute.proto
file had to be broken down in smaller files, likeextcom.proto
,common.proto
,nlri.proto
, etc.Result
In the end, the API is much more self-describing, avoids wrangling with Any, and (subjectively) better overall. A simple look at some of the main oneofs ought to be enough: NLRI, Attribute, ExtendedCommunity just to name those three.
For the performance aspect, I measured notable improvements on the gRPC API. The tests were performed using EVPN type-2 routes and a small Rust program for route injection.
Note: "unary backend" means calling the "AddPath" gRPC endpoint, while "stream backend" means using a single "AddPathStream" gRPC call and streaming all routes. I include both for completeness: most users will use the unary backend and won't see a meaningful improvement, while heavy users (like myself) will use the stream backend and will see meaningful improvements.
CPU time was measured by sampling
/proc/pid/stat
before and after applying the workload. It is included as an indicator of how much less work gobgp does, and that the increased performance does not come at a CPU cost. We do less work.The tests were performed on a Scaleway POP2-4C-16G instance running Ubuntu 24.04.
Downsides
A few downsides are present with this approach:
(it could be documented via comments that a variant is illegal in some situation, which is IMHO better, as to explicitly document exceptions)
IMHO this is a language / tooling issue, as a simple patch in protoc-gen-prost would allow to box enum variants, and reduce the enum size to a pointer.
However, I think the benefits far outweigh the drawbacks.
Other words
I don't expect this pull request to be merged straight away, or as-is. It definitely could, however. But I don't come to get this merged ASAP, and I'm open to discussion. This is a big changed, and we already discussed on slack about this. I hope this clear most of the doubts that could cloud this initiative.
On a side note, I did not fully nuke Anys from the proto. Deep down SRv6, there is a pair of Anys left, which I didn't touch, because I'm far from being knowledgeable enough to touch this (though I'll read up on it and end up doing it, it's a tiny leaf struct anyways).EDIT: now doneThank you for considering this :)