[release-v3.26] Auto pick #8913: updating the logic for stale endpoint management #9027

coutinhop
Contributor

Cherry pick of #8913 on release-v3.26.

#8913: updating the logic for stale endpoint management

Original PR Body below

Description

Background:
This issue surfaces when running the Kubernetes e2e test for Hyper-V containers: https://github.com/kubernetes/kubernetes/blob/6381e6504ac210297e382c22029791267d440d9e/test/e2e/windows/service.go#L51. To reproduce it, set up a test cluster using capz (windows-testing/capz/readme.md at master · kubernetes-sigs/windows-testing (github.com)) and run this specific test. Alternatively, see the Test Grid logs for details of the test failure: sig-windows-experimental Test Grid (kubernetes.io)

In the test, Calico fails to recognize that a container endpoint is in the ready state. Its logic treats an endpoint as attached only when the HNSEndpoint SharedContainers field is populated, and that field is empty for Hyper-V containers: https://github.com/microsoft/hcsshim/blob/8beabacfc2d21767a07c20f8dd5f9f3932dbf305/internal/hns/hnsendpoint.go#L146

Calico's logic for deciding that an endpoint is stale therefore needs to be updated. This PR does so by using the HNS endpoint State attribute instead.
PR for hcsshim: microsoft/hcsshim#2177
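
To illustrate the difference between the two checks, here is a minimal sketch in Go. It is not the Felix code: the `endpoint` struct is a hypothetical stand-in for the relevant fields of hcsshim's HNSEndpoint, and the state names "Attached" and "AttachedSharing" are assumptions about the values the State attribute reports for a live endpoint.

```go
package main

import "fmt"

// endpoint is a hypothetical stand-in for the fields of hcsshim's
// HNSEndpoint that matter here; it is not the real struct.
type endpoint struct {
	Name             string
	State            string   // assumed state name, e.g. "Attached"
	SharedContainers []string // empty for Hyper-V isolated containers
}

// isStaleByContainers sketches the old heuristic: an endpoint with no
// shared containers is considered stale.
func isStaleByContainers(ep endpoint) bool {
	return len(ep.SharedContainers) == 0
}

// isStaleByState sketches the new heuristic: an endpoint is considered
// stale unless its state is one of the (assumed) attached states.
func isStaleByState(ep endpoint) bool {
	return ep.State != "Attached" && ep.State != "AttachedSharing"
}

func main() {
	// A Hyper-V container endpoint: attached, but SharedContainers is empty.
	hyperv := endpoint{Name: "hyperv-pod", State: "Attached"}

	fmt.Println(isStaleByContainers(hyperv)) // true: wrongly treated as stale
	fmt.Println(isStaleByState(hyperv))      // false: treated as live
}
```

With the container-based check, an attached Hyper-V endpoint looks stale because its container list is empty; the state-based check does not depend on that list.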

The change has been backported to hcsshim 0.11 and 0.12, and this PR picks up the latest tag to ingest the dependent changes.

This change was tested by running the Kubernetes e2e tests locally for both Hyper-V containers and process-isolated containers. All tests pass.

Related issues/PRs

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Fix Felix failing to recognise Hyper-V containers as ready.

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

ritikaguptams and others added 3 commits July 17, 2024 19:43
Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>

Updating stale endpoint logic for readability

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>

Fixing typo

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>

Removing paranthesis to make lint check happy

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>

Adding comments about HNS dependency

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>
@coutinhop coutinhop requested a review from a team as a code owner July 18, 2024 02:46
@coutinhop coutinhop added release-note-required Change has user-facing impact (no matter how small) cherry-pick-candidate docs-not-required Docs not required for this change labels Jul 18, 2024
@marvin-tigera marvin-tigera added this to the Calico v3.26.5 milestone Jul 18, 2024
@coutinhop coutinhop self-assigned this Jul 18, 2024
fasaxc and others added 4 commits July 17, 2024 21:23
Use github.com/Microsoft/hcsshim v0.11.4 instead of the fork github.com/projectcalico/hcsshim v0.8.9-calico.

The fork was necessary at the time of v0.8.9 in order to access the endpoint's containers, but it is no longer needed since now that information is accessible via the SharedContainers field in the HNSEndpoint struct in the upstream hcsshim package.
Signed-off-by: scyda <scyda@outlook.com>
@coutinhop coutinhop merged commit 65f4054 into projectcalico:release-v3.26 Jul 18, 2024
2 checks passed
@coutinhop coutinhop deleted the auto-pick-of-#8913-upstream-release-v3.26 branch July 18, 2024 17:52
coutinhop added a commit to coutinhop/calico that referenced this pull request Jul 30, 2024
Revert windows endpoint mgr logic change from projectcalico#9027 which was breaking windows policies and add extra debug logging.
coutinhop added a commit that referenced this pull request Jul 31, 2024
Revert windows endpoint mgr logic change from #9027 which was breaking windows policies and add extra debug logging.
Labels
docs-not-required Docs not required for this change release-note-required Change has user-facing impact (no matter how small)