add enhanced livenessprobe controller #1535

Open
wants to merge 1 commit into
base: master
from

Conversation

BH4AWS
Collaborator

@BH4AWS BH4AWS commented Mar 21, 2024

Ⅰ. Describe what this PR does

This PR adds a controller that converts a pod's livenessProbe config into NodePodProbe config.
The controller is one part of the enhanced livenessProbe solution.

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how to verify it

When a pod with a livenessProbe config is created, the native livenessProbe is replaced by a pod annotation, and this controller immediately converts that annotation into NodePodProbe config.
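
The conversion step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the annotation key, the `containerProbe` and `podProbe` types, and the `convert` function are all hypothetical stand-ins for the real Kruise types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// containerProbe is a hypothetical, simplified stand-in for one container's
// liveness probe as it might be serialized into the pod annotation.
type containerProbe struct {
	ContainerName string `json:"containerName"`
	HTTPPath      string `json:"httpPath"`
	Port          int    `json:"port"`
	PeriodSeconds int    `json:"periodSeconds"`
}

// podProbe is a hypothetical, simplified stand-in for one pod's entry in a
// per-node probe object.
type podProbe struct {
	Namespace string
	Name      string
	UID       string
	Probes    []containerProbe
}

// annotationKey is an assumed key for illustration, not the one used by the PR.
const annotationKey = "example.io/enhanced-liveness-probe"

// convert builds the per-node entry for a pod from its annotations.
// It returns ok=false when the pod carries no probe annotation.
func convert(ns, name, uid string, annotations map[string]string) (podProbe, bool, error) {
	raw, found := annotations[annotationKey]
	if !found {
		return podProbe{}, false, nil
	}
	var probes []containerProbe
	if err := json.Unmarshal([]byte(raw), &probes); err != nil {
		return podProbe{}, false, fmt.Errorf("decode %s: %w", annotationKey, err)
	}
	return podProbe{Namespace: ns, Name: name, UID: uid, Probes: probes}, true, nil
}

func main() {
	ann := map[string]string{
		annotationKey: `[{"containerName":"app","httpPath":"/healthz","port":8080,"periodSeconds":10}]`,
	}
	entry, ok, err := convert("default", "web-0", "uid-1", ann)
	fmt.Println(entry, ok, err)
}
```

A pod without the annotation yields no entry, which is one way to verify that only opted-in pods are converted.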

Ⅳ. Special notes for reviews

@kruise-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zmberg for approval by writing /assign @zmberg in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@BH4AWS BH4AWS force-pushed the add_enhanced_livenessprobe_controller_v1 branch from 05ebc64 to 11b35e2 Compare March 21, 2024 06:09

codecov bot commented Mar 21, 2024

Codecov Report

Attention: Patch coverage is 48.93617% with 144 lines in your changes are missing coverage. Please review.

Project coverage is 47.93%. Comparing base (fd7e86e) to head (11b35e2).
Report is 9 commits behind head on master.

❗ Current head 11b35e2 differs from pull request most recent head 10b0f9f. Consider uploading reports for the commit 10b0f9f to get more accurate results

Files                                                    Patch %   Lines
...venessprobemapnodeprobe/probemapnodeprobe_utils.go    63.90%    46 Missing and 15 partials ⚠️
...nessprobemapnodeprobe/livenessprobemapnodeprobe.go    37.83%    38 Missing and 8 partials ⚠️
...apnodeprobe/probemapnodeprobe_pod_event_handler.go    0.00%     37 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1535      +/-   ##
==========================================
+ Coverage   47.91%   47.93%   +0.01%     
==========================================
  Files         162      165       +3     
  Lines       23483    23763     +280     
==========================================
+ Hits        11252    11390     +138     
- Misses      11011    11131     +120     
- Partials     1220     1242      +22     
Flag Coverage Δ
unittests 47.93% <48.93%> (+0.01%) ⬆️

@hantmac
Member

hantmac commented Mar 21, 2024

@BH4AWS Would you explain what the enhanced livenessProbe does? In other words, what are the advantages of the enhanced livenessProbe?

@furykerry furykerry added this to the 1.7 milestone Mar 22, 2024
@BH4AWS BH4AWS force-pushed the add_enhanced_livenessprobe_controller_v1 branch 8 times, most recently from a5e88a3 to 4f9bbfa Compare March 26, 2024 13:36
Signed-off-by: jicheng.sk <jicheng.sk@alibaba-inc.com>
@BH4AWS BH4AWS force-pushed the add_enhanced_livenessprobe_controller_v1 branch from 4f9bbfa to 10b0f9f Compare March 26, 2024 13:41
@BH4AWS
Collaborator Author

BH4AWS commented Mar 27, 2024

@BH4AWS Would you explain what the enhanced livenessProbe does? In other words, what are the advantages of the enhanced livenessProbe?

The new link is #1544.

Kubernetes provides a standard livenessProbe feature. For applications configured with a liveness probe, the kubelet periodically checks whether the probed service is healthy. If the probe fails, the kubelet directly restarts the service container.
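
The native behaviour described above can be sketched as a counter of consecutive failures. `failureThreshold` is a real field of a Kubernetes probe; the `livenessTracker` type and its `observe` method are illustrative names only and do not mirror kubelet code.

```go
package main

import "fmt"

// livenessTracker counts consecutive probe failures for one container.
// It is an illustrative model, not kubelet's implementation.
type livenessTracker struct {
	failureThreshold int
	consecutiveFails int
}

// observe records one probe result and reports whether the container
// should be restarted after this result.
func (t *livenessTracker) observe(success bool) bool {
	if success {
		t.consecutiveFails = 0
		return false
	}
	t.consecutiveFails++
	if t.consecutiveFails >= t.failureThreshold {
		t.consecutiveFails = 0 // the restarted container starts with a clean count
		return true
	}
	return false
}

func main() {
	tracker := &livenessTracker{failureThreshold: 3}
	for i, ok := range []bool{true, false, false, false} {
		fmt.Printf("probe %d success=%v restart=%v\n", i, ok, tracker.observe(ok))
	}
}
```

Note that the decision is made entirely from the results seen for one container on one node; nothing in this loop knows how many other replicas are being restarted at the same time.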

However, for online applications this is a dangerous operation with no resilience protection, especially when the probe is misconfigured. Liveness probes are often misconfigured because the configuration and its contents are complicated. Once such a probe takes effect, the whole service container is restarted, which causes a service outage (or even triggers an avalanche of failures across services during the restart).

Secondly, the community design defines no higher-level concept for this: it relies entirely on detection on the node and a restart mechanism on the node.

Furthermore, from a stability perspective, the community-native solution lacks application-level, global high availability and any resilience protection policy. Given these defects, we propose to design and implement an enhanced livenessProbe application survival solution in the OpenKruise suite, and we plan to release it as an open-source core capability in the future so that more people can use it.

Details can be found in our proposal.

@furykerry
Member

@BH4AWS please submit a design proposal first to get more feedback from the community.

}
}()

err = r.syncPodContainersLivenessProbe(request.Namespace, request.Name)
Member

Please double-check whether the pod is using the enhanced liveness probe.

Collaborator Author

pkg/controller/livenessprobemapnodeprobe/probemapnodeprobe_pod_event_handler.go
The event handler in this file checks, via the 'usingEnhancedLivenessProbe' function, that the pod has the annotation.
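
A minimal sketch of this kind of guard, applied both when filtering events and again inside reconcile (the double check the reviewer asks for). The annotation key and the helper names `usesEnhancedProbe`, `shouldEnqueue`, and `reconcile` are assumptions for illustration; the signature of the PR's own 'usingEnhancedLivenessProbe' function is not shown in this thread.

```go
package main

import "fmt"

// enhancedProbeAnnotation is an assumed key, for illustration only.
const enhancedProbeAnnotation = "example.io/enhanced-liveness-probe"

// usesEnhancedProbe reports whether the annotations carry a non-empty
// enhanced liveness probe config.
func usesEnhancedProbe(annotations map[string]string) bool {
	return annotations[enhancedProbeAnnotation] != ""
}

// shouldEnqueue models the event-handler filter: only pods that opted in
// are queued for reconciliation.
func shouldEnqueue(annotations map[string]string) bool {
	return usesEnhancedProbe(annotations)
}

// reconcile models the second guard: even if a request reaches the
// reconciler, it is skipped when the pod no longer carries the annotation.
func reconcile(annotations map[string]string) string {
	if !usesEnhancedProbe(annotations) {
		return "skipped"
	}
	return "synced"
}

func main() {
	optedIn := map[string]string{enhancedProbeAnnotation: `[{"name":"app"}]`}
	plain := map[string]string{}
	fmt.Println(shouldEnqueue(optedIn), reconcile(optedIn))
	fmt.Println(shouldEnqueue(plain), reconcile(plain))
}
```

Repeating the check in reconcile covers the case where the annotation is removed between the event being queued and the request being processed.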

}

func (r *ReconcileEnhancedLivenessProbe2NodeProbe) GetPodNodeName(pod *v1.Pod) string {
	return pod.Spec.NodeName
Member

This func is too trivial; just use pod.Spec.NodeName directly instead.

Collaborator Author

This function has been removed; the code now uses pod.Spec.NodeName directly.

podContainerProbe := podNewNppCloneSpec.PodProbes[index]
if podContainerProbe.Name == pod.Name && podContainerProbe.Namespace == pod.Namespace &&
	podContainerProbe.UID == fmt.Sprintf("%v", pod.UID) {
	continue
Member

We can mark newPodProbeTmp as changed here, so that the deep-equal comparison at L90 is no longer needed.
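
A minimal sketch of the idea, assuming the loop filters the current pod's entry out of the list: the filter itself reports whether anything was dropped, so the caller can decide whether to write without comparing the old and new objects. The `podProbe` type and the `removePod` function are hypothetical simplifications, not the PR's code.

```go
package main

import "fmt"

// podProbe is a hypothetical, simplified per-pod entry.
type podProbe struct {
	Namespace string
	Name      string
	UID       string
}

// removePod returns the entries without the given pod and a flag that is
// true only when an entry was actually dropped.
func removePod(entries []podProbe, ns, name, uid string) ([]podProbe, bool) {
	kept := make([]podProbe, 0, len(entries))
	changed := false
	for _, e := range entries {
		if e.Namespace == ns && e.Name == name && e.UID == uid {
			changed = true // this entry is dropped, so an update is needed
			continue
		}
		kept = append(kept, e)
	}
	return kept, changed
}

func main() {
	entries := []podProbe{
		{Namespace: "default", Name: "web-0", UID: "uid-1"},
		{Namespace: "default", Name: "web-1", UID: "uid-2"},
	}
	kept, changed := removePod(entries, "default", "web-0", "uid-1")
	if changed {
		fmt.Println("update needed, remaining entries:", len(kept))
	}
}
```

The flag costs one boolean, whereas a deep-equal comparison walks both objects on every reconcile.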

return err
}

podNewNppClone := nppClone.DeepCopy()
Member

The deep copy is not necessary, since nppClone and podNewNppClone are not used here.

isHit = true
// diff the current pod container probes vs the npp container probes
newPodContainerProbes := generatePodContainersProbe(pod, containersLivenessProbeConfig)
podContainerProbe.Probes = newPodContainerProbes
Member

Can we break out of the loop once there is a hit?

podNewNppCloneSpec := podNewNppClone.Spec.DeepCopy()

isHit := false
for index := range podNewNppCloneSpec.PodProbes {
Member

Replace this loop with a func called getOrCreatePodProbe(), so that this func can be greatly simplified:

podNewNppCloneSpec.PodProbes = append(podNewNppCloneSpec.PodProbes, getOrCreatePodProbe(pod, containersLivenessProbeConfig))
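
One possible shape for such a helper is sketched below. It is an assumption, not the reviewer's or the author's implementation: the signature differs from the one-liner above (it takes the slice by pointer and returns the entry to update), and the `podProbe` type is a simplified stand-in. It also returns at the first hit, which covers the earlier question about breaking out of the loop.

```go
package main

import "fmt"

// podProbe is a hypothetical, simplified per-pod entry.
type podProbe struct {
	Namespace, Name, UID string
	Probes               []string
}

// getOrCreatePodProbe returns a pointer to the entry for the given pod.
// When no entry exists, an empty one is appended first. The returned
// pointer is only valid until the slice is appended to again.
func getOrCreatePodProbe(entries *[]podProbe, ns, name, uid string) *podProbe {
	for i := range *entries {
		e := &(*entries)[i]
		if e.Namespace == ns && e.Name == name && e.UID == uid {
			return e // first hit ends the search
		}
	}
	*entries = append(*entries, podProbe{Namespace: ns, Name: name, UID: uid})
	return &(*entries)[len(*entries)-1]
}

func main() {
	entries := []podProbe{{Namespace: "default", Name: "web-0", UID: "uid-1"}}

	// Existing pod: the entry is updated in place.
	getOrCreatePodProbe(&entries, "default", "web-0", "uid-1").Probes = []string{"app#liveness"}

	// New pod: an entry is created, then filled in.
	getOrCreatePodProbe(&entries, "default", "web-1", "uid-2").Probes = []string{"sidecar#liveness"}

	fmt.Println(len(entries), entries[0].Probes, entries[1].Probes)
}
```

With a helper like this, the caller no longer needs the isHit flag or a separate append branch.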

return err
}

podNewNppClone := nppClone.DeepCopy()
Member

This clone is unnecessary, since podNewNppClone is not used other than being assigned to podNewNppCloneSpec.

@kruise-bot

@BH4AWS: PR needs rebase.


stale bot commented Jul 6, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@zmberg zmberg modified the milestones: 1.7, 1.8, 1.9 Jul 25, 2024

stale bot commented Oct 28, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Oct 28, 2024
@furykerry furykerry removed the wontfix This will not be worked on label Oct 29, 2024
5 participants