Updating the logic for stale endpoint management #8913

Merged (5 commits) on Jul 9, 2024

Conversation

ritikaguptams
Contributor

@ritikaguptams ritikaguptams commented Jun 14, 2024

Description

Background:
This issue surfaces when running the Kubernetes e2e test for Hyper-V containers (https://github.com/kubernetes/kubernetes/blob/6381e6504ac210297e382c22029791267d440d9e/test/e2e/windows/service.go#L51). To reproduce it, set up a test cluster using CAPZ, following windows-testing/capz/readme.md at master · kubernetes-sigs/windows-testing (github.com), and run this specific test. Alternatively, see the test-grid logs for details of the failure: sig-windows-experimental Test Grid (kubernetes.io)

In the test, Calico fails to recognize that a container endpoint is in the ready state and treats it as unattached, because the logic relies on the HNSEndpoint sharedContainers field, which is empty for Hyper-V containers. https://github.com/microsoft/hcsshim/blob/8beabacfc2d21767a07c20f8dd5f9f3932dbf305/internal/hns/hnsendpoint.go#L146

Calico's logic for identifying a stale container endpoint needs to be updated. This PR does so by using the HNS endpoint state attribute.
PR for hcsshim: microsoft/hcsshim#2177
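
To illustrate the difference between the two checks, here is a minimal Go sketch. It is not the Felix or hcsshim code: the endpoint struct, the endpointState type, its constant names, and both helper functions are hypothetical stand-ins, and which state values count as "ready" is an assumption made for the example.

```go
package main

import "fmt"

// endpointState is a hypothetical stand-in for the HNS endpoint state
// attribute; the real type and values live in hcsshim and may differ.
type endpointState int

const (
	// stateCreated: endpoint exists but is not attached (assumed name).
	stateCreated endpointState = iota
	// stateAttached: endpoint is attached to a container (assumed name).
	stateAttached
)

// endpoint is a simplified, hypothetical model of an HNS endpoint.
type endpoint struct {
	Name             string
	SharedContainers []string
	State            endpointState
}

// readyBySharedContainers models the old check: an endpoint counts as
// ready only if it lists at least one shared container.
func readyBySharedContainers(ep endpoint) bool {
	return len(ep.SharedContainers) > 0
}

// readyByState models the updated check: readiness is derived from the
// endpoint state instead of the shared-containers list.
func readyByState(ep endpoint) bool {
	return ep.State == stateAttached
}

func main() {
	endpoints := []endpoint{
		{Name: "process-isolated", SharedContainers: []string{"c1"}, State: stateAttached},
		// Hyper-V containers leave SharedContainers empty.
		{Name: "hyperv", State: stateAttached},
		{Name: "unattached", State: stateCreated},
	}
	for _, ep := range endpoints {
		fmt.Printf("%s: sharedContainers check=%v, state check=%v\n",
			ep.Name, readyBySharedContainers(ep), readyByState(ep))
	}
}
```

In this sketch the Hyper-V endpoint fails the shared-containers check but passes the state check, which is the mismatch described above.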

The change has been backported to hcsshim 0.11 and 0.12, and this PR picks up the latest tag to bring in the dependent changes.

This change was tested by running the Kubernetes e2e tests locally for both Hyper-V containers and process-isolated containers. All tests pass.

Related issues/PRs

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Fix Felix failing to recognise Hyper-V containers as ready.

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@marvin-tigera marvin-tigera added this to the Calico v3.29.0 milestone Jun 14, 2024
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels Jun 14, 2024
@CLAassistant

CLAassistant commented Jun 14, 2024

CLA assistant check
All committers have signed the CLA.

@ritikaguptams ritikaguptams marked this pull request as ready for review June 18, 2024 05:25
@ritikaguptams ritikaguptams requested a review from a team as a code owner June 18, 2024 05:25
@fasaxc
Member

fasaxc commented Jun 18, 2024

/sem-approve

@fasaxc
Member

fasaxc commented Jun 18, 2024

Thanks for the contribution; CI failed with a code formatting check, please can you run make fix in the felix dir?

@fasaxc fasaxc added docs-not-required Docs not required for this change and removed docs-pr-required Change is not yet documented labels Jun 18, 2024
@coutinhop coutinhop self-assigned this Jun 18, 2024
@coutinhop
Contributor

/sem-approve

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>

Updating stale endpoint logic for readability

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>

Fixing typo

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>

Removing paranthesis to make lint check happy

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>

Adding comments about HNS dependency

Signed-off-by: ritikaguptams <85255050+ritikaguptams@users.noreply.github.com>
@coutinhop
Contributor

/sem-approve

@coutinhop
Contributor

/sem-approve

@coutinhop
Contributor

/sem-approve

Contributor

@coutinhop coutinhop left a comment

LGTM, thanks @ritikaguptams!

@coutinhop
Contributor

/sem-approve

coutinhop added a commit that referenced this pull request Jul 18, 2024
…elease-v3.28

[release-v3.28] Auto pick #8913: updating the logic for stale endpoint management
coutinhop added a commit that referenced this pull request Jul 18, 2024
…elease-v3.27

[release-v3.27] Auto pick #8913: updating the logic for stale endpoint management
coutinhop added a commit that referenced this pull request Jul 18, 2024
…elease-v3.26

[release-v3.26] Auto pick #8913: updating the logic for stale endpoint management
@coutinhop coutinhop added cherry-pick-completed PR has been cherry-picked and removed cherry-pick-candidate labels Jul 18, 2024
coutinhop added a commit to coutinhop/calico that referenced this pull request Jul 26, 2024
Revert hcsshim version bump from v0.11.6 back to v0.11.4,
as this was breaking Windows policies.

(original PR projectcalico#8913)
@coutinhop coutinhop mentioned this pull request Jul 26, 2024
3 tasks
coutinhop added a commit to coutinhop/calico that referenced this pull request Jul 26, 2024
Revert hcsshim version bump from v0.11.6 back to v0.11.4,
as this was breaking Windows policies.

(original PR projectcalico#8913)
coutinhop added a commit to coutinhop/calico that referenced this pull request Jul 29, 2024
Revert windows endpoint mgr logic change from projectcalico#8913 which was breaking windows policies and add extra debug logging.

CAPZ Win FV test fixes:
- fix .sh scripts permissions
- fix log file retrieval
- add .gitignore
- use -O for scp in generate_helpers.sh
- get KIND_VERSION from metadata.mk
- adjust timeouts
Labels
cherry-pick-completed PR has been cherry-picked docs-not-required Docs not required for this change release-note-required Change has user-facing impact (no matter how small)
6 participants