Skip to content

GEP 3779 - East/West Identity-Based Authorization #3822

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

LiorLieberman
Copy link
Member

@LiorLieberman LiorLieberman commented May 28, 2025

/kind gep

First iteration of provisional GEP - focused on "What?"/"Why?"/"Who?"

What this PR does / why we need it:
To support e/w authorization with Gateway For Mesh APIs

Which issue(s) this PR fixes:

Fixes #3779

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 28, 2025
@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/gep PRs related to Gateway Enhancement Proposal(GEP) labels May 28, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 28, 2025
Copy link
Contributor

@howardjohn howardjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is another draft gep for the same thing, is it feasible for you 2 to work together on a single proposal

@LiorLieberman
Copy link
Member Author

LiorLieberman commented Jun 2, 2025

there is another draft gep for the same thing, is it feasible for you 2 to work together on a single proposal

We are, on this PR. opened another one just technical reasons asked aryan to close the other PR

@LiorLieberman LiorLieberman changed the title add draft authz gep e/w GEP 3779 - Identity Based Authorization for east-west Jun 4, 2025
@LiorLieberman LiorLieberman changed the title GEP 3779 - Identity Based Authorization for east-west GEP 3779 - East/West Identity-Based Authorization Jun 4, 2025
@LiorLieberman LiorLieberman marked this pull request as ready for review June 4, 2025 20:25
@LiorLieberman LiorLieberman requested a review from howardjohn June 4, 2025 20:25
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 4, 2025
@k8s-ci-robot k8s-ci-robot requested review from kflynn and youngnick June 4, 2025 20:25
@LiorLieberman
Copy link
Member Author

/cc @robscott as well

@k8s-ci-robot k8s-ci-robot requested a review from robscott June 4, 2025 20:26
@k8s-ci-robot
Copy link
Contributor

@LiorLieberman: GitHub didn't allow me to request PR reviews from the following users: as, well.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @robscott as well

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label Jun 4, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Jun 4, 2025
Provide a method for configuring Gateway API Mesh implementations to enforce east-west identity-based Authorization controls. At the time of writing this we leave Authentication for specific implementation and outside of this proposal scope.


## Goals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is any attributes other than identity and port considered?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would mean entering the L7 space (which I think require more research about where we can standardize). So for the first iteration -- no.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest making this clear in the non-goals, and then we can consider this resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shane - it is in deffered goals. Think we should add it to Non Goals in addition?


(Using the [Gateway API Personas](../../concepts/roles-and-personas.md))

* A way for Ana the Application Developer to configure a Gateway API for Mesh implementation to enforce authorization policy that **allows** or **denies** identity or multiple identities to talk with some set of the workloads she controls.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

many users of k8s see k8s namespace as a security boundary . Is the namespace policy (e.g specifies which identities are allowed to talk to my namespace) a special case of the workload policy?


* A way for Ana the Application Developer to configure a Gateway API for Mesh implementation to enforce authorization policy that **allows** or **denies** identity or multiple identities to talk with some set of the workloads she controls.

* A way for Chihiro, the Cluster Admin, to configure a Gateway API for Mesh implementation to enforce non-overridable cluster-wide, authorization policies that **allows** or **denies** identity or multiple identities to talk with some set of the workloads in the cluster.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're introducing hierarchical policies (cluster wide overridable/non-overridable, namespace, workload, port being the scopes i could make out), should we add a goal that Ana should be able to easily and determistically figure out the effective policy for their workload (port)?

feel free to resolve if this doesn't need to be called out or if it doesn't make sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good call. I am not sure this should be a part of this specific proposal. IMO there should be a different effort in gateway to actually help discoverability of policies, and policy attachment.

Will leave this open in case others have other opinions. But its something the community had talked about many times in the past

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the fact that overridable/non-overridable policies gets this a little more complicated to understand effective policy so thats a good point. Will review some of the prior art here, but I dont think it should be part of the first iteration

Copy link

@aryan16 aryan16 Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LiorLieberman Hierarchical policies are fine, but I am confused about the behavior of non-overridable policies. Can you explain it more?

I was thinking making policies overridable (basically merge all if the policy somehow applies to the workloads).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to stretch goals, we can elaborate more on the community meeting and/or when the we progress with the iterations of this GEP


More on https://docs.cilium.io/en/stable/internals/security-identities/ & https://docs.cilium.io/en/stable/security/network/identity/


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably also worth reminding everyone that GAMMA has a some implicit authorization policy already - once one or more HTTPRoutes are added to a Service, then anything that does not match something in a HTTPRoute is expected to be denied.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LiorLieberman @aryan16 can one of you add a note about this in this GEP?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend not coupling routing and authorization. At least considering a sidecar architecture they are completely different since routing is client-side which cannot be used for security

Copy link
Contributor

@mikemorris mikemorris Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then anything that does not match something in a HTTPRoute is expected to be denied

This might be one of the things that needs clarification as per #3804 - the intent, as John said, is not to imply routing and authorization are linked. If an HTTPRoute is bound a to Service parentRef, Service-targeted traffic (like GET foo.local) not matching a route shouldn't be routed to the backend, but this is not intended to imply AuthZ controls that would prevent e.g. connections dialing the pod IP directly from succeeding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am +1 to @howardjohn and @mikemorris comments.

@robscott @youngnick - any concerns if we leave this outside?

(See [status definitions](../overview.md#gep-states).)


## TLDR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that's not clear to me from this GEP is what types of traffic we are targeting.

Currently, GAMMA is specified to only be for HTTP traffic - that is, traffic that is routed using HTTP metadata, which requires the routing agent to have access to the unencrypted HTTP stream. So that rules out TCP, UDP, and TLS handling for GAMMA.

Is this proposal intended to cover other use cases than HTTP traffic? If so, that's a significant expansion of the current GAMMA spec, and we probably need to talk about it separately.

If not, we probably should say that explicitly, maybe via listing non-HTTP protocols in non-goals or something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will work within existing GAMMA scope, but should be sufficiently flexible to support more protocols if/when GAMMA allows for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this as a starting point. I hadn't realized the initial GAMMA scope was so narrow, so having an API that could grow beyond that limited scope as needed would be helpful here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Nick's suggestion that we add this as a non-goal for now, but I also agree with Rob's point that there's room for growth here. I suggest we add a non-goal as suggested, and then in "alternatives considered" we add a brief summary about the potential for growth here, but suggest that needs to be a follow-up GEP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other protocols are already standard: https://gateway-api.sigs.k8s.io/geps/gep-1294/?h=xroute

Copy link
Contributor

@mikemorris mikemorris Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, GAMMA is specified to only be for HTTP traffic - that is, traffic that is routed using HTTP metadata, which requires the routing agent to have access to the unencrypted HTTP stream.

I don't think this is accurate?

Is this proposal intended to cover other use cases than HTTP traffic? If so, that's a significant expansion of the current GAMMA spec, and we probably need to talk about it separately.

We've largely focused on HTTP traffic for mesh, but gRPC, TLS, TCP and UDP are defined in GAMMA https://gateway-api.sigs.k8s.io/geps/gep-1294/?h=xroute#route-types (I think John was intending to link this too, just the wrong subheading).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather leave this as is, and make sure we definetly support HTTP traffic. Would prefer to avoid adding to non-goals and have a sufficient flexiblilty to support more protocols if we can. Later iteration of this GEP would obviously explore the implementation in more detail

Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @LiorLieberman and @aryan16! A couple comments, but otherwise LGTM.

(See [status definitions](../overview.md#gep-states).)


## TLDR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this as a starting point. I hadn't realized the initial GAMMA scope was so narrow, so having an API that could grow beyond that limited scope as needed would be helpful here.


More on https://docs.cilium.io/en/stable/internals/security-identities/ & https://docs.cilium.io/en/stable/security/network/identity/


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LiorLieberman @aryan16 can one of you add a note about this in this GEP?

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 16, 2025
Copy link
Member

@shaneutt shaneutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today is the time limit for merging. My review of the PR at this point is that it is not particularly contentious and just needs some comments resolved by making clearer distinctions about particulars in the GEP.

So I approve:

/approve

But hold until we resolve those last couple of comments, then we can merge:

/hold

(See [status definitions](../overview.md#gep-states).)


## TLDR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Nick's suggestion that we add this as a non-goal for now, but I also agree with Rob's point that there's room for growth here. I suggest we add a non-goal as suggested, and then in "alternatives considered" we add a brief summary about the potential for growth here, but suggest that needs to be a follow-up GEP.

Provide a method for configuring Gateway API Mesh implementations to enforce east-west identity-based Authorization controls. At the time of writing this we leave Authentication for specific implementation and outside of this proposal scope.


## Goals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest making this clear in the non-goals, and then we can consider this resolved.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LiorLieberman, robscott, shaneutt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shaneutt shaneutt self-assigned this Jun 17, 2025
@shaneutt shaneutt moved this to Review in Release v1.4.0 Jun 17, 2025
@shaneutt shaneutt added this to the v1.4.0 milestone Jun 17, 2025
@@ -0,0 +1,265 @@
# GEP-3779: Identity Based Authz for East-West Traffic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not sure I understand the relationship to the work going on in NetworkPolicy around this, which has a ~100% overlap but is not mentioned at all in this GEP?

Copy link
Contributor

@mikemorris mikemorris Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had some preliminary joint discussions with NetPol WG stakeholders, but nothing identity-based is anywhere close to a proposal/KEP state over there AFAIK (AdminNetworkPolicy is still IP-focused).

The last place I remember leaving these discussions was that there was generally a preference for having these as separate layers with no "interleaving" (identity-based policy only applicable if IP-based policy has already allowed the connection) because most implementations - Cilium being the most notable exception - would only be able to or want to support one layer.

There was an open unanswered question on whether any sort of interop/signaling between layers to enable a belt-and-suspenders approach with restrictive IP-based policies would be feasible (or desirable) for use cases like deny-all at L3 but generating policies to allow IP connectivity if identity-based policy would allow the connection - something like mapping ServiceAccounts to pod IPs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs to block this GEP, but I think the scenario that comes to my mind if we do strict layering (i.e. NetPol first then identity) looks something like this: if a NetPol provider has an identity concept, I think this GEP means they must make 2 authz passes if the user wants to express something like: "only allow from this CIDR with identity X OR any other CIDR with identity Y". In fact, I don't know how the user effectively authors that policy in a strictly layered API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, +1. thanks Mike.

And I am working (slowly) on pushing more discussions (netpol wg included) for a more broad proposal in this space, no flavored by who/what is implementing it, to allow identity based authorization in Kubernetes.

If something like this ever lands, we'll see how we converge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/gep PRs related to Gateway Enhancement Proposal(GEP) release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: Review
Development

Successfully merging this pull request may close these issues.

GEP: Support E/W authorization
10 participants