Error from server (BadRequest): container "cilium-agent" in pod "cilium-hf7zn" is waiting to start: PodInitializing #33405

Closed
3 tasks done
chanyshev opened this issue Jun 26, 2024 · 6 comments
Labels
kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. need-more-info More information is required to further debug or fix the issue. needs/triage This issue requires triaging to establish severity and next steps.

Comments


chanyshev commented Jun 26, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

I'm getting an error on all Cilium agents.

Cilium Version

1.15.5

Kernel Version

6.6.30-talos

Kubernetes Version

v1.30.1

Regression

No response

Sysdump

No response

Relevant log output

Error from server (BadRequest): container "cilium-agent" in pod "cilium-hf7zn" is waiting to start: PodInitializing

Anything else?

Talos: v1.7.2

I deployed Cilium with Helm:

helm upgrade --install \
    cilium \
    cilium/cilium \
    --namespace kube-system \
    --set ipam.mode=kubernetes \
    --set=kubeProxyReplacement=true \
    --set=securityContext.capabilities.ciliumAgent="{CHOWN,KILL,NET_ADMIN,NET_RAW,IPC_LOCK,SYS_ADMIN,SYS_RESOURCE,DAC_OVERRIDE,FOWNER,SETGID,SETUID}" \
    --set=securityContext.capabilities.cleanCiliumState="{NET_ADMIN,SYS_ADMIN,SYS_RESOURCE}" \
    --set=cgroup.autoMount.enabled=false \
    --set=cgroup.hostRoot=/sys/fs/cgroup \
    --set=k8sServiceHost=localhost \
    --set hubble.relay.enabled=true \
    --set hubble.ui.enabled=true \
    --set=k8sServicePort=7445

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct
@chanyshev chanyshev added kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish severity and next steps. labels Jun 26, 2024
@youngnick
Contributor

That error is just saying that something is stopping that Cilium agent from starting. Cilium requires one cilium-agent per node, along with a cilium-operator deployment. If those pods are not running, then Cilium is not working.

Without more information, we are unable to help. I recommend reading through the Troubleshooting page at https://docs.cilium.io/en/stable/operations/troubleshooting/, particularly the process of collecting a sysdump.

At the very least, we'd need a kubectl describe pod -n kube-system cilium-hf7zn, which will show the reason why the Pod is still Initialising in the Events section.
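The diagnostic step suggested above can be sketched as a pair of small shell helpers. This is only an illustration: the function names diagnose_pod and init_logs are hypothetical, the namespace defaults to kube-system as in the original command, and the init container name passed to init_logs has to be taken from the output of the first helper.

```shell
# Sketch: show why a pod is stuck in PodInitializing.
# Hypothetical helper. Usage: diagnose_pod <pod> [namespace]
diagnose_pod() {
    pod="$1"
    ns="${2:-kube-system}"

    # The Events section at the end of this output usually names the failing step.
    kubectl describe pod "$pod" -n "$ns"

    # One line per init container: its name, then its state as reported by the API.
    kubectl get pod "$pod" -n "$ns" \
        -o jsonpath='{range .status.initContainerStatuses[*]}{.name}{"\t"}{.state}{"\n"}{end}'
}

# Sketch: print the logs of one init container.
# Hypothetical helper. Usage: init_logs <pod> <init-container> [namespace]
init_logs() {
    kubectl logs "$1" -n "${3:-kube-system}" -c "$2"
}
```

For the pod in this report, that would be diagnose_pod cilium-hf7zn, followed by init_logs cilium-hf7zn with whichever init container is shown as not yet terminated.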

@youngnick youngnick added the need-more-info More information is required to further debug or fix the issue. label Jun 28, 2024
@chanyshev
Author

I managed to deploy v1.14.0 successfully, but something goes wrong with v1.15.5.

@github-actions github-actions bot added info-completed The GH issue has received a reply from the author and removed need-more-info More information is required to further debug or fix the issue. labels Jun 28, 2024
@youngnick
Contributor

Thanks for the extra feedback @chanyshev, but without further information, as I suggested above, it will be difficult for anyone to investigate.

@youngnick youngnick added need-more-info More information is required to further debug or fix the issue. and removed info-completed The GH issue has received a reply from the author labels Jul 1, 2024

Ji993 commented Aug 5, 2024

I have the same issue on a different setup.
This installation method worked well in the past: I successfully ran Cilium on older versions of Kubernetes (v1.23-1.27) using the same setup. Now the cilium-operator pods are running, but the cilium pods just hang.
Also, @chanyshev's workaround didn't work for me (on v1.14.0 the cilium pods still hang).

OS:
Rocky Linux 9.4

Kernel Version
5.14.0-427.13.1.el9_4.x86_64

Kubernetes Version
v1.30.3

Container Runtime
CRI-O v1.30.3

Cilium Version Tested
1.16.0, 1.15.7, 1.14.13, 1.14.0

Cilium install method
helm + ipam.mode=kubernetes, k8s.requireIPv4PodCIDR=true


Ji993 commented Aug 6, 2024

I just solved the issue by using the solution from here -> #23838

The /opt/cni/bin directory needs to be owned by root and permissions set to 0755. Unbelievable, I wasted so much time reinstalling and testing different versions.
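A minimal sketch of how that condition could be checked on a node before reinstalling anything. The helper name check_cni_dir is hypothetical, and the stat flags assume GNU coreutils (as found on Rocky Linux); the expected owner and mode are the ones stated above.

```shell
# Sketch: report whether a directory is owned by root with mode 755.
# Hypothetical helper. Usage: check_cni_dir [dir]   (defaults to /opt/cni/bin)
check_cni_dir() {
    dir="${1:-/opt/cni/bin}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    # GNU stat: %U is the owner name, %a is the octal mode.
    owner=$(stat -c '%U' "$dir")
    mode=$(stat -c '%a' "$dir")
    if [ "$owner" = "root" ] && [ "$mode" = "755" ]; then
        echo "ok: $dir is root:$mode"
        return 0
    fi
    echo "mismatch: $dir is owned by $owner with mode $mode (expected root, 755)"
    return 1
}
```

If the check reports a mismatch, running chown root:root /opt/cni/bin and chmod 0755 /opt/cni/bin as root on the affected node should bring the directory to the state described above.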

@youngnick
Contributor

Thanks for that @Ji993. I'm going to close this issue for now, but @chanyshev should feel free to comment with more info and I'll reopen.
