Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Enable webhook authorization options #3198

Merged
merged 6 commits into from
Dec 18, 2024
Merged

Conversation

xrstf
Copy link
Contributor

@xrstf xrstf commented Nov 13, 2024

Summary

This PR allows to configure an external webhook for authorization. Previously the relevant flags have been explicitly disabled in kcp, because kcp has its own authorizer. The webhook authorizer will sit next to the existing authorizers (allowPath, allowGroup) and so is an alternative to the built-in authorizers.

In the documentation, it looks like this:

2024-12-05_18-22-44

I chose to let the webhook authorizer run in parallel to the existing RBAC chain so that admins have full freedom to tweak the authorization. This comes with the warning that if your webhook is configured to deny everything, it will block kcp's startup (thankfully, kcp recovers quickly when the webhook stops denying everything and starts to either have no opinion or allow things explicitly).

Since the webhook authorizer is part of a union with the others, and more importantly comes after the allowGroups/Paths authorizers, webhooks cannot prevent healthchecks from functioning.


Originally I started by simply enabling the authz options in the built-in server options, i.e. remove the o.GenericControlPlane.Authorization = nil // we have our own line, to surface the existing webhook options from Kubernetes.

However I did not manage to integrate the webhook well into kcp. When the webhook authorizer is enabled, the code would then not run the RBAC PostStart hook in the server anymore, so RBAC policies like the cluster-admin ClusterRole were simply not being created anymore. I fear to break a lot of things if I hack around in the Kubernetes fork and so ultimately decided to not use the authz options in the built-in server options, but instead add them to the kcp authz options (where we also configure --always-allow-paths, for example).

This leads to a bit more code, because some steps like setting up the CLI flags have to be done manually, but allowed much greater control over how the webhook is loaded and was ultimately simpler to do. Now whatever kcp does with the webhook flags, will not affect Kubernetes' internal bootstrapping anymore.

I also renamed a few variables to make things a bit more clear.


In addition to the code change, I also revamped the authorizer documentation for kcp. The old one still mentioned non-existing authorizers (like the "top-level org authorizer") or behaviour that simply doesn't exist (like the workspace content authorizer alledgedly injecting data into the auth context for the authorizers downstream, which it doesn't do in reality).

Since the original ASCIIChart looked terrible due to my font settings, I was very glad to see that mkdocs-material comes with a Mermaid integration and so I replaced the oringinal chart with a much easier to maintain Mermaid diagram.

This PR also includes a dependency bump to include kcp-dev/kubernetes#151, which will take care of including the cluster name in the webhook's payload. In that dependency bump is also, slightly hidden, a cleanup for the bump-k8s script, which failed at least since the 1.31 bump since the legacy-cloud-providers module doesn't exist anymore.


Also I vote that we rename the "System CRD Authorizer" to something like "System Resource Authorizer", because the authorizer does not actually handle CustomResourceDefinitions at all. It handles APIBindings and APIExports. Calling it a "CRD Authorizer" is confusing to me.

Release Notes

Add support for external webhook authorization.

@kcp-ci-bot kcp-ci-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has signed the DCO. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 13, 2024
@xrstf xrstf marked this pull request as draft November 13, 2024 15:57
@kcp-ci-bot kcp-ci-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 13, 2024
@xrstf
Copy link
Contributor Author

xrstf commented Nov 14, 2024

/test all

@xrstf
Copy link
Contributor Author

xrstf commented Nov 29, 2024

/test all

@xrstf
Copy link
Contributor Author

xrstf commented Nov 29, 2024

/test all

@kcp-ci-bot kcp-ci-bot added dco-signoff: no Indicates the PR's author has not signed the DCO. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. dco-signoff: yes Indicates the PR's author has signed the DCO. labels Dec 5, 2024
@kcp-ci-bot kcp-ci-bot added dco-signoff: yes Indicates the PR's author has signed the DCO. and removed dco-signoff: no Indicates the PR's author has not signed the DCO. labels Dec 5, 2024
pkg/server/options/options.go Show resolved Hide resolved
@xrstf
Copy link
Contributor Author

xrstf commented Dec 5, 2024

/test all

@xrstf xrstf marked this pull request as ready for review December 6, 2024 10:47
@kcp-ci-bot kcp-ci-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 6, 2024
@xrstf xrstf requested a review from mjudeikis December 6, 2024 17:48
Copy link
Contributor

@mjudeikis mjudeikis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nits.
I would really like to have some example of running webhook authz server as example. Or reference to one in upstream? By any chance you had one done while doing this PR in testing?

And maybe adding some e2e for this would great too. Myabe out of scope for this PR but great.

cmd/kcp/kcp.go Show resolved Hide resolved
pkg/server/options/authorization.go Show resolved Hide resolved
hack/bump-k8s.sh Show resolved Hide resolved
@xrstf
Copy link
Contributor Author

xrstf commented Dec 9, 2024

Or reference to one in upstream? By any chance you had one done while doing this PR in testing?

I used xrstf/httest to quickly spawn a local HTTPS server with CA/serving cert, which I can then use to inspect the webhook requests (it's a spleen for me, I like to see what's actually going over the wire).

So during development I did

$ httest --tls --response allow.json --trace
INFO[2024-12-09T10:38:54+01:00] Creating private key…                         file=.httest/ca.key
INFO[2024-12-09T10:38:54+01:00] Creating CA certificate…                      file=.httest/ca.crt
INFO[2024-12-09T10:38:54+01:00] Creating private key…                         file=.httest/serving.key
INFO[2024-12-09T10:38:54+01:00] Creating serving certificate…                 cn=localhost file=.httest/serving.crt
INFO[2024-12-09T10:38:54+01:00] Listening securely…                           address=localhost:8080 ca=.httest/ca.crt domains=[localhost 127.0.0.1]

Then I created my auth.kubeconfig using the CA key:

apiVersion: v1
kind: Config
clusters:
  - name: httest
    cluster:
      certificate-authority: /home/xrstf/gospace/src/github.com/kcp-dev/kcp/.httest/ca.crt
      server: https://localhost:8080/
current-context: webhook
contexts:
  - name: webhook
    context:
      cluster: httest

I had prepared a few standard responses, like the deny.json:

{
   "apiVersion": "authorization.k8s.io/v1beta1",
   "kind": "SubjectAccessReview",
   "status": {
     "allowed": false,
     "denied": true,
     "reason": "computer says no"
   }
}

I could then observe requests coming in to the webhook, like

INFO[2024-12-09T10:41:30+01:00] Request                                       method=POST path=/?timeout=30s protocol=HTTP/2.0 remote=127.0.0.1:46418 ua=Go-http-client/2.0
POST /?timeout=30s HTTP/2.0
Host: localhost:8080
Accept: application/json, */*
Accept-Encoding: gzip
Content-Length: 434
Content-Type: application/json
User-Agent: Go-http-client/2.0

{"kind":"SubjectAccessReview","apiVersion":"authorization.k8s.io/v1beta1","metadata":{"creationTimestamp":null},"spec":{"resourceAttributes":{"verb":"create","group":"tenancy.kcp.io","version":"v1alpha1","resource":"workspacetypes"},"user":"system:kcp:bootstrapper","group":["system:kcp:tenancy:workspace-bootstrapper","system:authenticated"],"extra":{"authorization.kubernetes.io/cluster-name":["root"]}},"status":{"allowed":false}}

@xrstf xrstf marked this pull request as draft December 9, 2024 14:01
@kcp-ci-bot kcp-ci-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 9, 2024
@xrstf
Copy link
Contributor Author

xrstf commented Dec 10, 2024

/test all

@xrstf
Copy link
Contributor Author

xrstf commented Dec 10, 2024

/retest

@xrstf xrstf marked this pull request as ready for review December 10, 2024 15:41
@kcp-ci-bot kcp-ci-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Dec 10, 2024
On-behalf-of: @SAP christoph.mewes@sap.com
…e kcp options

On-behalf-of: @SAP christoph.mewes@sap.com
On-behalf-of: @SAP christoph.mewes@sap.com
On-behalf-of: @SAP christoph.mewes@sap.com
@kcp-ci-bot kcp-ci-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 13, 2024
On-behalf-of: @SAP christoph.mewes@sap.com
On-behalf-of: @SAP christoph.mewes@sap.com
@mjudeikis
Copy link
Contributor

/lgtm
/approve

We can iterate on this. We need to test this "in prod"

@kcp-ci-bot kcp-ci-bot added the lgtm Indicates that a PR is ready to be merged. label Dec 18, 2024
@kcp-ci-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 776ad297c7ee4faa29c0f8abdc617d5a3adf4351

@kcp-ci-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mjudeikis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kcp-ci-bot kcp-ci-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 18, 2024
@kcp-ci-bot kcp-ci-bot merged commit 1afb207 into kcp-dev:main Dec 18, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has signed the DCO. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants