Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: enhanced in-place update module to support vertical scaling #1353

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

LavenderQAQ
Copy link

@LavenderQAQ LavenderQAQ commented Aug 2, 2023

Ⅰ. Describe what this PR does

Enhanced the in-place update module to enable kruise's workload to modify resources without restarting the pod.

Ⅱ. Does this pull request fix one issue?

Fixes #1212

Ⅲ. Describe how to verify it

Need to kubernetes v1.27 version, and open feature gate InPlacePodVerticalScaling. Here is my kind configuration:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  "InPlacePodVerticalScaling": true
nodes:
- role: control-plane
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
- role: worker
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
- role: worker
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
- role: worker
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72

Build a cloneset (or other workload that includes in-place updates) and set updateStrategy to InPlaceIfPossible. Here's an example:

apiVersion: apps.kruise.io/v1alpha1
kind: CloneSet
metadata:
  name: nginx-cloneset
spec:
  updateStrategy:
    type: InPlaceIfPossible
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-1
        image: nginx:1.21.1
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: "1"
            memory: "256Mi"
          requests:
            cpu: "0.5"
            memory: "128Mi"

Change the workload's resources, wait a while, and you'll find that the pod's resources will change, and the pod won't restart.

Ⅳ. Special notes for reviews

The control of Status for vertical scaling requires a higher version of the k8s api, which makes this pr need to wait for kruise version upgrade. It cannot be merged at this time because it is not yet complete.

@kruise-bot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign veophi for approval by writing /assign @veophi in a comment. For more information see:The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sonatype-lift
Copy link

sonatype-lift bot commented Aug 2, 2023

Sonatype Lift is retiring

Sonatype Lift will be retiring on Sep 12, 2023, with its analysis stopping on Aug 12, 2023. We understand that this news may come as a disappointment, and Sonatype is committed to helping you transition off it seamlessly. If you’d like to retain your data, please export your issues from the web console.
We are extremely grateful and thank you for your support over the years.

📖 Read about the impacts and timeline

@kruise-bot kruise-bot added the size/L size/L: 100-499 label Aug 2, 2023
@LavenderQAQ LavenderQAQ changed the title feat: enhanced in-place update module to support vertical scaling feat: enhanced inplace update module to support vertical scaling Aug 2, 2023
@LavenderQAQ LavenderQAQ changed the title feat: enhanced inplace update module to support vertical scaling feat: enhanced in-place update module to support vertical scaling Aug 2, 2023
@LavenderQAQ LavenderQAQ force-pushed the feat/inplace-workload-vertical-scaling branch 4 times, most recently from 08c47c7 to 04b5eb8 Compare August 7, 2023 03:04
@codecov-commenter
Copy link

codecov-commenter commented Aug 7, 2023

Codecov Report

Attention: Patch coverage is 73.04688% with 69 lines in your changes missing coverage. Please review.

Project coverage is 50.22%. Comparing base (0d0031a) to head (ee835ac).
Report is 122 commits behind head on master.

Files with missing lines Patch % Lines
pkg/util/inplaceupdate/inplace_update_defaults.go 74.16% 25 Missing and 6 partials ⚠️
pkg/util/inplaceupdate/inplace_update_vertical.go 79.46% 14 Missing and 9 partials ⚠️
pkg/controller/cloneset/core/cloneset_core.go 0.00% 9 Missing ⚠️
...kg/controller/cloneset/sync/cloneset_sync_utils.go 0.00% 3 Missing ⚠️
pkg/util/inplaceupdate/inplace_update.go 40.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1353      +/-   ##
==========================================
+ Coverage   47.91%   50.22%   +2.30%     
==========================================
  Files         162      193      +31     
  Lines       23491    24908    +1417     
==========================================
+ Hits        11256    12509    +1253     
- Misses      11014    11113      +99     
- Partials     1221     1286      +65     
Flag Coverage Δ
unittests 50.22% <73.04%> (+2.30%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@LavenderQAQ LavenderQAQ force-pushed the feat/inplace-workload-vertical-scaling branch from 0470ce2 to 4e13c0d Compare August 7, 2023 03:56
@kruise-bot kruise-bot added needs-rebase size/XL size/XL: 500-999 and removed size/L size/L: 100-499 labels Aug 8, 2023
@LavenderQAQ LavenderQAQ force-pushed the feat/inplace-workload-vertical-scaling branch from 35c4939 to 28406cb Compare August 8, 2023 11:46
@LavenderQAQ LavenderQAQ force-pushed the feat/inplace-workload-vertical-scaling branch 2 times, most recently from 48126ff to aca435f Compare August 8, 2023 12:04
@LavenderQAQ LavenderQAQ marked this pull request as ready for review August 8, 2023 13:16
@kruise-bot kruise-bot requested a review from veophi August 8, 2023 13:16
@LavenderQAQ
Copy link
Author

LavenderQAQ commented Aug 8, 2023

The complete logic needs to wait until the k8s api of kruise is upgraded.

@LavenderQAQ
Copy link
Author

/hold

@ABNER-1 ABNER-1 force-pushed the feat/inplace-workload-vertical-scaling branch 11 times, most recently from 3f4af6f to a12055b Compare September 14, 2024 09:30
@ABNER-1
Copy link
Member

ABNER-1 commented Sep 14, 2024

I encountered two errors while adding e2e test cases:

  1. When modifying the image and increasing resources simultaneously, I encountered a RunContainerError. I have already submitted an issue to the Kubernetes community at [InPlacePodVerticalScaling]Got RunContainerError when patch pod image and resources kubernetes/kubernetes#127356.

  2. When I scaled the CPU limit of resources from 100m to 200m, I observed that occasionally the status was first changed by kubelet from 100m InProcess to 200m, and then it was set back to 100m InProcess by kubelet, and after a while, it finally became 200m.

@ABNER-1 ABNER-1 force-pushed the feat/inplace-workload-vertical-scaling branch from a12055b to eea019c Compare September 14, 2024 09:55
Makefile Show resolved Hide resolved
apis/apps/pub/inplace_update.go Show resolved Hide resolved
@ABNER-1 ABNER-1 force-pushed the feat/inplace-workload-vertical-scaling branch 2 times, most recently from 6339a56 to d0db898 Compare November 7, 2024 04:04
pkg/util/inplaceupdate/inplace_update_vertical.go Outdated Show resolved Hide resolved
@@ -344,6 +401,11 @@ func defaultCalculateInPlaceUpdateSpec(oldRevision, newRevision *apps.Controller
}
updateSpec.MetaDataPatch = patchBytes
}
// Need to distinguish whether only resources have been updated
if len(updateSpec.ContainerResources) > 0 && len(updateSpec.ContainerImages) == 0 && !updateSpec.UpdateEnvFromMetadata {
updateSpec.VerticalUpdateOnly = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems that one can infer the VerticalUpdateOnly using ContainerResources, ContainerImages and UpdateEnvFromMetadata, so VerticalUpdateOnly is not necessary, consider change the field VerticalUpdateOnly to a function

pkg/util/inplaceupdate/inplace_update_defaults.go Outdated Show resolved Hide resolved
pkg/util/inplaceupdate/inplace_update_defaults.go Outdated Show resolved Hide resolved
pkg/util/inplaceupdate/inplace_update_vertical.go Outdated Show resolved Hide resolved
config/crd/bases/apps.kruise.io_clonesets.yaml Outdated Show resolved Hide resolved
pkg/controller/cloneset/core/cloneset_core.go Outdated Show resolved Hide resolved
// UpdateResource implements vertical updates by directly modifying the container's resources,
// conforming to the k8s community standard
func (v *VerticalUpdate) UpdateContainerResource(container *v1.Container, newResource *v1.ResourceRequirements) {
if container == nil || newResource == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can not return directly since it is possible to set or unset container resource for a pod of burstable qos class

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should recreate pod when try to change qos class.

// 7. empty exist exist => this container is not inplace-updated by kruise, ignore
func (v *VerticalUpdate) IsContainerUpdateCompleted(pod *v1.Pod, container *v1.Container, containerStatus *v1.ContainerStatus, lastContainerStatus appspub.InPlaceUpdateContainerStatus) bool {
// case 5-7
if lastContainerStatus.Resources.Limits == nil && lastContainerStatus.Resources.Requests == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we simply check pod.status.resize, if the field is not empty, the update is not completed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we only resize limits,filed .status.resize will not be changed by api-server and kubelet

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested for resize requests only, and sometimes this situation occurs: the internal mechanism of kubelet will first set .status.resize to None before proceeding with the work. In this case, if we use such a method to determine whether the resizing has been completed, it will cause problems.

image

if err != nil || len(oldTemp.Spec.Containers) <= idx {
return fmt.Errorf("invalid container index: %s", op.Path)
}
quantity, err := resource.ParseQuantity(op.Value.(string))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about the following patch

  { "op": "remove", "path": "/spec/containers/0/resources/limits/cpu" }

it is possible to unset the container resource for one of the container in the pod of burstable qos class

@ABNER-1 ABNER-1 force-pushed the feat/inplace-workload-vertical-scaling branch from d0db898 to fe87fa5 Compare November 19, 2024 12:38
LavenderQAQ and others added 5 commits November 20, 2024 10:08
Signed-off-by: LavenderQAQ <lavenderqaq.cs@gmail.com>
Signed-off-by: LavenderQAQ <lavenderqaq.cs@gmail.com>
…tations

Signed-off-by: LavenderQAQ <lavenderqaq.cs@gmail.com>
Signed-off-by: LavenderQAQ <lavenderqaq.cs@gmail.com>
Signed-off-by: Abner-1 <yuanyuxing.yyx@alibaba-inc.com>
@ABNER-1 ABNER-1 force-pushed the feat/inplace-workload-vertical-scaling branch from fe87fa5 to e5f2c6d Compare November 20, 2024 02:10
Signed-off-by: Abner-1 <yuanyuxing.yyx@alibaba-inc.com>
@ABNER-1 ABNER-1 force-pushed the feat/inplace-workload-vertical-scaling branch from e5f2c6d to ee835ac Compare November 20, 2024 03:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/XL size/XL: 500-999
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[feature request] In-place udpate support resources
5 participants