Falco 0.38.2 crashed with 139 #3404

Open
chenliu1993 opened this issue Nov 15, 2024 · 5 comments

chenliu1993 commented Nov 15, 2024

Describe the bug

We are running Falco on RHEL, on both physical servers and cloud instances. Falco crashes with exit code 139 about once per week.

Last State:     Terminated
      Reason:       Error
      Exit Code:    139
      Started:      Fri, 15 Nov 2024 08:05:39 +0800
      Finished:     Fri, 15 Nov 2024 10:15:38 +0800
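
For context on the exit code above: container runtimes conventionally report 128 plus the signal number when a process is terminated by a signal, so 139 would correspond to signal 11. A minimal Python sketch of that decoding, assuming the 128+N convention applies here (the helper name `decode_exit_code` is illustrative and not part of Falco or Kubernetes):

```python
import signal

def decode_exit_code(code: int) -> str:
    """Map a container exit code to a signal name, assuming the 128+N convention."""
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            return f"unknown signal {code - 128}"
    return "exited normally" if code == 0 else f"exited with status {code}"

print(decode_exit_code(139))  # SIGSEGV on Linux
```

Under the same convention, a process killed with SIGKILL (signal 9) would show exit code 137 rather than 139.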

How to reproduce it

Falco is deployed through a DaemonSet; both the kmod and modern_ebpf drivers are used. The crash is not reproducible in our test environment, only on nodes with high traffic and a large number of containers. Before the crash, CPU usage spiked and, on physical servers only, the events buffer drop count suddenly increased.

Expected behaviour

This crash should not happen.

Screenshots

Environment

  • Falco version: 0.38.2
  • System info: 9.4
  • Cloud provider or hardware configuration:
  • OS: rhel
  • Kernel: Linux master-2 5.14.0-427.22.1.el9_4.x86_64 Mon Jun 10 09:23:36 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Installation method: kubernetes

Additional context

Author

chenliu1993 commented Nov 15, 2024

{"evt.source":"syscall","evt.time":1731635439972082379,"falco.container_memory_used_mb":38.3,"falco.cpu_usage_perc":1.5,"falco.duration_sec":6298,"falco.evts_rate_sec":8361.1,"falco.host_boot_ts":1728034694000000000,"falco.host_cpu_usage_perc":12.6,"falco.host_memory_used_mb":10179.9,"falco.host_num_cpus":8,"falco.host_open_fds":9600,"falco.host_procs_running":2,"falco.kernel_release":"5.10.224-212.876.amzn2.x86_64","falco.memory_pss_mb":90.2,"falco.memory_rss_mb":120.8,"falco.memory_vsz_mb":1321.7,"falco.n_added_fds":6780850,"falco.n_added_threads":32959,"falco.n_cached_fd_lookups":36798081,"falco.n_cached_thread_lookups":54188390,"falco.n_containers":97,"falco.n_drops_full_threadtable":0,"falco.n_failed_fd_lookups":29433,"falco.n_failed_thread_lookups":180103,"falco.n_fds":211749,"falco.n_missing_container_images":0,"falco.n_noncached_fd_lookups":16660402,"falco.n_noncached_thread_lookups":6213273,"falco.n_removed_fds":6631880,"falco.n_removed_threads":31091,"falco.n_retrieve_evts_drops":12230011,"falco.n_retrieved_evts":6758363,"falco.n_store_evts_drops":0,"falco.n_stored_evts":6777209,"falco.n_threads":1878,"falco.num_evts":53295547,"falco.num_evts_prev":45770150,"falco.outputs_queue_num_drops":0,"falco.rules.Admin_user_activity":0,"falco.rules.Clear_Log_Activities":0,"falco.rules.Contact_K8S_API_Server_From_Container":0,"falco.rules.Create_Hardlink_Over_Sensitive_Files":0,"falco.rules.Create_Symlink_Over_Sensitive_Files":0,"falco.rules.Debugfs_Launched_in_Privileged_Container":0,"falco.rules.Detect_Directory_Change":0,"falco.rules.Detect_File_Permission_or_Ownership_Change":0,"falco.rules.Detect_New_File":0,"falco.rules.Detect_Write_Below_etc_hosts":0,"falco.rules.Detect_Write_To_proc_sys_fs_protected_symlinks":0,"falco.rules.Detect_release_agent_File_Container_Escapes":0,"falco.rules.Detect_su_or_sudo":0,"falco.rules.Directory_traversal_monitored_file_read":0,"falco.rules.Disallowed_SSH_Connection_Non_Standard_Port":0,"falco.rules.Drop_and_execute_new_binary_in_container":0,"falco.rules.Execution_from_dev_shm":0,"falco.rules.Fileless_execution_via_memfd_create":0,"falco.rules.Find_AWS_Credentials":0,"falco.rules.Inbound_SSH_Connection":0,"falco.rules.Kernel_Module_Modification":0,"falco.rules.Launch_Package_Management_Process_on_Host":0,"falco.rules.Linux_Kernel_Module_Injection_Detected":0,"falco.rules.Listen_on_New_Port":0,"falco.rules.Mount_Launched_in_Privileged_Container":0,"falco.rules.Netcat_Remote_Code_Execution_in_Container":0,"falco.rules.Node_Created_in_Filesystem":0,"falco.rules.Outbound_SSH_Connection":0,"falco.rules.PTRACE_anti_debug_attempt":0,"falco.rules.PTRACE_attached_to_process":0,"falco.rules.Packet_socket_created_in_container":0,"falco.rules.Read_sensitive_file_trusted_after_startup":0,"falco.rules.Read_sensitive_file_untrusted":0,"falco.rules.Redirect_STDOUT_STDIN_to_Network_Connection_in_Container":0,"falco.rules.Remove_Bulk_Data_from_Disk":0,"falco.rules.Run_shell_untrusted":0,"falco.rules.Search_Private_Keys_or_Passwords":0,"falco.rules.Sudo_Potential_bypass_of_Runas_user_restrictions_CVE_2019_14287":0,"falco.rules.System_user_interactive":0,"falco.rules.Terminal_shell_in_container":0,"falco.rules.Unexpected_file_access_readwrite_for_fluentd":0,"falco.rules.Unexpected_spawned_process_fluentd":0,"falco.rules.matches_total":0,"falco.sha256_config_file.falco":"7accf6fdd865ac25af1925a313d360b7f90690214f2a0193ff2d1a8058f698e4","falco.sha256_rules_file.falco_rules":"788c614cde7485976de0d71a2b739ca6212c4b1e50ac34e4b4ef723631da90e6","falco.sha256_rules_file.falco_rules_preload":"c5cc6494fec621de756ce99fc34ce969b7bb1cc2b53a5f8656003a7c18f110f7","falco.sha256_rules_file.falco_rules_volterra_10_exceptions":"ba70e9f0f27a32f8ddd74cc009c5826f73ec13745172a77833d444d68e70ca5a","falco.sha256_rules_file.falco_rules_volterra_20_security":"2229b9f25968ca44652f437dec00fcff71881aab9d3262db7b5665a2a1c9369e","falco.sha256_rules_file.falco_rules_volterra_30_apps":"52901031f61330430f7d01e4831905a45b5c62affd08ee030ce3f56b3b8d66e0","falco.sha256_rules_file.falco_rules_volterra_40_fim":"7230673a0c9122e2bc95534c592e05976d527ef4e85bc6f7c07bf7d3e358799e","falco.sha256_rules_file.falco_rules_volterra_50_cve":"e9df3c057434f86c0f721d5b492e30b0f2ae37dc0660286598ad64098a803730","falco.start_ts":1731629141065998658,"falco.version":"0.38.2","scap.engine_name":"modern_bpf","scap.evts_drop_rate_sec":0.0,"scap.evts_rate_sec":8366.1,"scap.n_drops":6,"scap.n_drops_buffer_clone_fork_enter":0,"scap.n_drops_buffer_clone_fork_exit":0,"scap.n_drops_buffer_close_exit":0,"scap.n_drops_buffer_connect_enter":0,"scap.n_drops_buffer_connect_exit":0,"scap.n_drops_buffer_dir_file_enter":0,"scap.n_drops_buffer_dir_file_exit":0,"scap.n_drops_buffer_execve_enter":0,"scap.n_drops_buffer_execve_exit":0,"scap.n_drops_buffer_open_enter":0,"scap.n_drops_buffer_open_exit":0,"scap.n_drops_buffer_other_interest_enter":0,"scap.n_drops_buffer_other_interest_exit":0,"scap.n_drops_buffer_proc_exit":0,"scap.n_drops_buffer_total":0,"scap.n_drops_perc":1.3280494671865537e-05,"scap.n_drops_prev":5,"scap.n_drops_scratch_map":6,"scap.n_evts":53326919,"scap.n_evts_prev":45797079},"priority":"Informational","rule":"Falco internal: metrics snapshot","source":"internal","time":"2024-11-15T01:50:39.972082379Z"}
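
To make the drop counters in this snapshot easier to read, here is a minimal Python sketch that recomputes a per-interval drop percentage from the `scap.*` counters and their `*_prev` counterparts. The values are copied from the snapshot; the helper name `interval_drop_perc` is hypothetical, and the assumption that `scap.n_drops_perc` is computed per interval is inferred from the numbers, not taken from Falco's source.

```python
# Counter values copied from the metrics snapshot above.
snapshot = {
    "scap.n_drops": 6,
    "scap.n_drops_prev": 5,
    "scap.n_evts": 53326919,
    "scap.n_evts_prev": 45797079,
    "falco.outputs_queue_num_drops": 0,
}

def interval_drop_perc(m: dict) -> float:
    """Percentage of kernel-side events dropped since the previous snapshot."""
    new_drops = m["scap.n_drops"] - m["scap.n_drops_prev"]
    new_evts = m["scap.n_evts"] - m["scap.n_evts_prev"]
    return 100.0 * new_drops / new_evts if new_evts else 0.0

print(interval_drop_perc(snapshot))
```

The result is roughly 1.33e-05 percent, in line with the `scap.n_drops_perc` value reported in the snapshot: one new drop out of about 7.5 million events in the interval.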

@chenliu1993
Author

There are also cases where it crashed without any preceding CPU spike.

Contributor

FedeDP commented Nov 21, 2024

Hi! Thanks for opening this issue!
We just released Falco 0.39.2; are you able to test it?
By the way, perhaps you have an event spike and the outputs queue grows too large; you can try tuning the outputs_queue.capacity option.

@chenliu1993
Author

Thank you, I will try upgrading.
One more question: under what conditions will outputs_queue become full?
From the last log I can see "falco.outputs_queue_num_drops":0, which means no drops have happened, so the queue is not full yet.

Contributor

FedeDP commented Nov 22, 2024

> under what condition outputs_queue will be full?

By default, the outputs queue is unbounded, so it never becomes full; it can only keep growing until Falco gets killed by the OS.
This is because we don't want to lose any output events by default.

> from last log I can see "falco.outputs_queue_num_drops":0, this means no drop happens so buffer is not full yet

That's true! But during event spikes, the outputs queue can grow very large in just a few seconds.
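
The trade-off described above can be illustrated with a toy Python model. This is not Falco's implementation: it uses Python's standard `queue.Queue`, where `maxsize=0` means unbounded, and it assumes that a full queue drops the newest item, which the thread does not confirm. The function name `enqueue_spike` is hypothetical.

```python
import queue

def enqueue_spike(capacity: int, n_events: int) -> tuple[int, int]:
    """Push n_events into a queue with no consumer; return (queued, dropped).

    capacity=0 makes queue.Queue unbounded, so nothing is ever dropped
    and memory use grows with every event.
    """
    q = queue.Queue(maxsize=capacity)
    dropped = 0
    for i in range(n_events):
        try:
            q.put_nowait(i)
        except queue.Full:
            dropped += 1  # bounded queue: count the event as dropped
    return q.qsize(), dropped

print(enqueue_spike(0, 10_000))      # (10000, 0)
print(enqueue_spike(1_000, 10_000))  # (1000, 9000)
```

In the unbounded case the drop counter stays at zero while the queue keeps growing, which matches the observation that `falco.outputs_queue_num_drops` being 0 does not rule out a very large queue.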
