Kubernetes Events Input Segfault #9543

Open
Evesy opened this issue Oct 31, 2024 · 4 comments

Comments

@Evesy

Evesy commented Oct 31, 2024

Bug Report

Describe the bug
Running the kubernetes_events input eventually results in a segfault.

To Reproduce
We cannot yet reproduce this reliably, but we see segfaults every few hours with the config below:

[SERVICE]
    Flush                      1
    Grace                      5
    Log_Level                  debug
    Daemon                     off

    HTTP_Server                On
    HTTP_Listen                0.0.0.0
    HTTP_Port                  2020

[FILTER]
    Name   record_modifier
    Alias  add_cloud_metadata
    Match  *
    Record cloud_project_id <redacted>

[FILTER]
    Name           nest
    Operation      nest
    Alias          nest.cloud_data
    Match          kube.*
    Wildcard       cloud_*
    Remove_Prefix  cloud_
    Nest_Under     cloud

[FILTER]
    Name           nest
    Operation      nest
    Alias          nest.meta_data
    Match          kube.*
    Wildcard       cloud
    Nest_Under     meta

[INPUT]
    name            kubernetes_events
    tag             k8s_events
    kube_url        http://app.kubernetes:80


[OUTPUT]
    Name               es
    Match              k8s_events
    Alias              es.k8s_events
    Retry_Limit        5

    Host               ${FLUENT_ELASTICSEARCH_HOST}
    Port               ${FLUENT_ELASTICSEARCH_PORT}
    Compress           gzip

    Logstash_Format    On
    Logstash_Prefix    fluent-kubernetes
    Write_Operation    create
    Buffer_Size        False
    Trace_Error        On
    Generate_ID        On
    Suppress_Type_Name On
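
A side note on the config above: the input tags records as k8s_events, while the two nest filters use Match kube.*. The sketch below checks each Match rule against that tag. It uses Python's fnmatch as a stand-in for Fluent Bit's wildcard tag matching, which is an assumption, and the dictionary keys are simply the Alias values from the config.

```python
from fnmatch import fnmatchcase

# Tag set on the kubernetes_events input in the config above.
TAG = "k8s_events"

# Alias -> Match pattern, copied from the [FILTER] and [OUTPUT] sections.
MATCH_RULES = {
    "add_cloud_metadata": "*",
    "nest.cloud_data": "kube.*",
    "nest.meta_data": "kube.*",
    "es.k8s_events": "k8s_events",
}

def applies(tag, pattern):
    # Assumption: fnmatch-style wildcards approximate Fluent Bit's Match rules.
    return fnmatchcase(tag, pattern)

result = {alias: applies(TAG, pattern) for alias, pattern in MATCH_RULES.items()}

for alias, hit in result.items():
    print(f"{alias}: {'matches' if hit else 'skipped'}")
```

If that approximation holds, only the record_modifier filter and the es output see the events, so the nest filters are unlikely to be involved in the crash.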

Expected behavior
Fluent Bit should not crash.

Output

[2024/10/30 09:54:15] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream closed by api server. Reconnect will happen on next interval.
[2024/10/30 09:54:15] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream disconnected, ret=1
[2024/10/30 09:54:15] [ info] [input:kubernetes_events:kubernetes_events.0] Requesting /api/v1/events?watch=1&resourceVersion=62152862
[2024/10/30 10:47:35] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream closed by api server. Reconnect will happen on next interval.
[2024/10/30 10:47:35] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream disconnected, ret=1
[2024/10/30 10:47:35] [ info] [input:kubernetes_events:kubernetes_events.0] Requesting /api/v1/events?watch=1&resourceVersion=62156442
[2024/10/30 11:31:05] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream closed by api server. Reconnect will happen on next interval.
[2024/10/30 11:31:05] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream disconnected, ret=1
[2024/10/30 11:31:05] [ info] [input:kubernetes_events:kubernetes_events.0] Requesting /api/v1/events?watch=1&resourceVersion=62158888
[2024/10/30 12:25:19] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream closed by api server. Reconnect will happen on next interval.
[2024/10/30 12:25:19] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream disconnected, ret=1
[2024/10/30 12:25:19] [ info] [input:kubernetes_events:kubernetes_events.0] Requesting /api/v1/events?watch=1&resourceVersion=62160843
[2024/10/30 13:00:47] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream closed by api server. Reconnect will happen on next interval.
[2024/10/30 13:00:47] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream disconnected, ret=1
[2024/10/30 13:00:47] [ info] [input:kubernetes_events:kubernetes_events.0] Requesting /api/v1/events?watch=1&resourceVersion=62163615
[2024/10/30 13:45:42] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream closed by api server. Reconnect will happen on next interval.
[2024/10/30 13:45:42] [ info] [input:kubernetes_events:kubernetes_events.0] kubernetes stream disconnected, ret=1
[2024/10/30 13:45:43] [ info] [input:kubernetes_events:kubernetes_events.0] Requesting /api/v1/events?watch=1&resourceVersion=62165276
[2024/10/30 14:00:00] [engine] caught signal (SIGSEGV)
#0  0x55f89bd8ee34      in  ???() at ???:0
#1  0x55f89c342326      in  ???() at ???:0
#2  0xffffffffffffffff  in  ???() at ???:0
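
To see whether the crash lines up with the reconnect cycle, the log above can be summarized with a short script. This is only a sketch and not part of Fluent Bit: parse_ts and summarize are hypothetical helper names, and the sample lines are copied from the output above.

```python
import re
from datetime import datetime

# Matches the leading "[YYYY/MM/DD HH:MM:SS]" timestamp of a Fluent Bit log line.
TS_RE = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\]")

def parse_ts(line):
    """Return the line's timestamp, or None if it has none (e.g. backtrace frames)."""
    m = TS_RE.match(line)
    return datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S") if m else None

def summarize(lines):
    """Return (seconds between successive watch requests,
    seconds from the last watch request to the SIGSEGV, or None)."""
    requests, crash = [], None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if "Requesting /api/v1/events" in line:
            requests.append(ts)
        elif "caught signal (SIGSEGV)" in line:
            crash = ts
    gaps = [(b - a).total_seconds() for a, b in zip(requests, requests[1:])]
    to_crash = None
    if crash is not None and requests:
        to_crash = (crash - requests[-1]).total_seconds()
    return gaps, to_crash

sample = [
    "[2024/10/30 13:00:47] [ info] [input:kubernetes_events:kubernetes_events.0] Requesting /api/v1/events?watch=1&resourceVersion=62163615",
    "[2024/10/30 13:45:43] [ info] [input:kubernetes_events:kubernetes_events.0] Requesting /api/v1/events?watch=1&resourceVersion=62165276",
    "[2024/10/30 14:00:00] [engine] caught signal (SIGSEGV)",
    "#0  0x55f89bd8ee34      in  ???() at ???:0",
]

gaps, to_crash = summarize(sample)
print("seconds between watch requests:", gaps)
print("seconds from last watch request to crash:", to_crash)
```

In this excerpt the segfault comes about 14 minutes after the last watch request, rather than immediately after a reconnect.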

Your Environment

  • Version used: 3.1.9
  • Configuration: See above
  • Environment name and version (e.g. Kubernetes? What version?): GKE v1.30.5-gke.1014001
  • Container Image: bitnami/fluent-bit:3.1.9
  • Operating System and version: Google COS VERSION=113 BUILD_ID=18244.151.27
  • Filters and plugins:

Additional context
We have other Fluent Bit instances with identical configuration, apart from using other inputs instead of kubernetes_events, and we have yet to see any segfaults on those.

@patrick-stephens
Contributor

I notice you're using a Bitnami image - does it happen with the actual image we produce here for OSS?

@Evesy
Author

Evesy commented Oct 31, 2024

I will switch to fluent/fluent-bit:3.1.9 and see whether it also happens with that image. I will close this issue if I don't see any recurrence.

@HaveFun83

We saw the same error, but we are also on Bitnami images.

@Evesy
Author

Evesy commented Nov 4, 2024

I've been running fluent/fluent-bit:3.1.9 over the weekend and can see it has segfaulted ~5 times. Happy to try to grab more information; let me know what would be useful.
