Make "no resources" error configurable for autoscaling platforms #1843

Open
aledegano opened this issue Apr 18, 2024 · 2 comments

@aledegano
Contributor

When I resume a session and not enough resources are available, I immediately get an error in the UI.

That makes sense on our current infrastructure: if the resources aren't there, they won't become available (at least for a while). On platforms where autoscaling is enabled (like the AWS PoC I'm carrying out), however, that's not necessarily true, since additional resources might come up shortly.

The error itself does not prevent the session from starting in the background, but it might be misleading for the user.

Can we make the time interval before the error is shown configurable?
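
To make the request concrete, here is a rough sketch of what such a setting could look like. Everything in it is hypothetical: the class name, the `UNSCHEDULABLE_GRACE_PERIOD_SECONDS` variable, and the idea that the service knows how long a session has been unschedulable are assumptions for illustration, not existing code in this project.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class UnschedulableGracePolicy:
    """How long a session may stay unschedulable before an error is reported."""

    # 0 means "report immediately", i.e. the behaviour described in this issue.
    grace_period_seconds: float = 0.0

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "UnschedulableGracePolicy":
        # UNSCHEDULABLE_GRACE_PERIOD_SECONDS is a hypothetical setting name.
        env = os.environ if env is None else env
        raw = env.get("UNSCHEDULABLE_GRACE_PERIOD_SECONDS", "0")
        # Negative values are clamped to 0 so a typo cannot hide the error forever.
        return cls(grace_period_seconds=max(0.0, float(raw)))

    def should_report_error(self, seconds_unschedulable: float) -> bool:
        """Report the error only once the grace period has elapsed."""
        return seconds_unschedulable >= self.grace_period_seconds
```

With the default of 0 nothing changes for deployments without autoscaling, while an autoscaling deployment could set the value to roughly the time a new node needs to join the cluster.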

@rokroskar
Member

Is there some indication from k8s that a resource is being provisioned to satisfy the (currently unschedulable) request?

@aledegano
Contributor Author

Shortly after the pod is scheduled I see this event:

apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2024-04-19T08:05:16Z"
involvedObject:
  apiVersion: v1
  kind: Pod
  name: foo-40bar--an-2daws-2dproject-57f05e85-0
  namespace: renku
  resourceVersion: "21922912"
  uid: 467b94d8-e56a-432c-8ca1-108843aab5ec
kind: Event
lastTimestamp: "2024-04-19T08:05:16Z"
message: 'Pod should schedule on: nodeclaim/core-services-lrkls'
metadata:
  creationTimestamp: "2024-04-19T08:05:16Z"
  name: foo-40bar--an-2daws-2dproject-57f05e85-0.17c79fd3fcc0e889
  namespace: renku
  resourceVersion: "21922946"
  uid: 072773f3-e2b0-4f73-bd9f-4baca5511b66
reason: Nominated
reportingComponent: karpenter
reportingInstance: ""
source:
  component: karpenter
type: Normal

There is certainly more information available from Karpenter, but that might be a bit too specific/platform-dependent...
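
One way to keep the platform-specific part small would be to match only on the event's reporting component and reason, and make that list configurable. A minimal sketch, assuming the events are already available as plain dictionaries shaped like the one above; the function names and the `DEFAULT_PROVISIONING_SIGNALS` constant are made up for illustration, and only the `karpenter`/`Nominated` pair comes from the event I actually observed:

```python
from typing import Any, FrozenSet, Iterable, Mapping, Tuple

# (reporting component, reason) pairs treated as "capacity is on its way".
# Only the Karpenter pair is taken from the event above; other autoscalers
# would need their own entries, which is why this is a parameter below.
DEFAULT_PROVISIONING_SIGNALS: FrozenSet[Tuple[str, str]] = frozenset(
    {("karpenter", "Nominated")}
)


def is_provisioning_event(
    event: Mapping[str, Any],
    pod_name: str,
    signals: FrozenSet[Tuple[str, str]] = DEFAULT_PROVISIONING_SIGNALS,
) -> bool:
    """Return True if the event says capacity is being provisioned for this pod."""
    involved = event.get("involvedObject") or {}
    if involved.get("kind") != "Pod" or involved.get("name") != pod_name:
        return False
    # Fall back to source.component when reportingComponent is missing or empty.
    component = event.get("reportingComponent") or (event.get("source") or {}).get("component")
    return (component, event.get("reason")) in signals


def capacity_is_being_provisioned(
    events: Iterable[Mapping[str, Any]],
    pod_name: str,
    signals: FrozenSet[Tuple[str, str]] = DEFAULT_PROVISIONING_SIGNALS,
) -> bool:
    """Return True if any event for the pod matches a known provisioning signal."""
    return any(is_provisioning_event(e, pod_name, signals) for e in events)
```

If a check like this returns true, the UI could keep showing a "waiting for resources" state instead of the error; if not, the current behaviour would still apply.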
