Hi, I have a Kubernetes cluster with 1 master and 9 workers; each node has 4 CPU cores and 4GB of RAM. I am facing an issue where, when the query frontend executes a query with a very large time interval or a large limit such as 400000 log lines, the loki-query-frontend pod consumes a huge amount of memory and crashes when its memory limit is reached. Are there any suggestions to overcome this issue? I think it is related to the behavior of the query frontend, which merges the responses returned from the split queries in memory (please correct me if I am wrong). I have read about the following parameter and think it might be useful; I am considering setting it based on the memory available to the pod.
-frontend.max-query-bytes-read value
Max number of bytes a query can fetch. Enforced in log and metric queries only when TSDB is used. The default value of 0 disables this limit.
In the Helm chart this would be:
limits_config:
max_query_bytes_read: xxxB
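For example, something along these lines is what I have in mind. The 1GB figure is only a placeholder that I would derive from the memory actually available to the pod, and max_querier_bytes_read is a related limit I found in the docs that I have not tested yet:

limits_config:
  # placeholder value: to be sized from the pod's memory request, not a recommendation
  max_query_bytes_read: 1GB
  # related per-querier limit from the docs (untested assumption on my side)
  max_querier_bytes_read: 150MB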
My current Helm chart deployment:
# Setup the DNS service used in the cluster; in our case it is CoreDNS
global:
  dnsService: "kube-dns"
  dnsNamespace: "kube-system"

# Distributed mode is preferred for more than 1TB of logs per day
deploymentMode: Distributed

# Begin Loki configuration of the system
loki:
  # 100MB memory ballast used by the Go garbage collector to reduce total CPU utilization during garbage collection cycles
  ballast_bytes: 104857600
  # disable multi-tenancy because it is not required
  auth_enabled: false
  # The schema config defines how data is stored in the Loki storage
  schemaConfig:
    configs:
      # "from" specifies the date from which the following schema configuration applies.
      # It tells Loki to use the TSDB indexing format for logs written from this date onward.
      # This also helps maintain backward compatibility with older storage formats.
      - from: "2024-10-25"
        # Specifies which indexing format/schema is used (tsdb is the same format used in Prometheus)
        store: tsdb
        # Indicates the object storage backend where chunks of log data will be stored
        object_store: s3
        # Selects the version of the schema being used
        schema: v13
        index:
          # Index files are prefixed with loki_index_ when stored in the S3 storage
          prefix: loki_index_
          # Time period for creating new index tables and storing them in the S3 storage (every 24 hours).
          # Never change this value: the compactor only works if the period is 24h.
          period: 24h

  # The following configuration sends all TSDB index operations through the index gateway
  storage_config:
    tsdb_shipper:
      # The location the ingesters use to write TSDB index files before flushing them to the storage
      active_index_directory: /var/loki/tsdb-shipper-active
      # The location where index files are cached in the index gateway. This is a very critical component of the cluster and must be configured properly.
      cache_location: /var/loki/tsdb-shipper-cache
      # Both of the previous locations need to be backed by persistent storage.
      # How long the index files are cached in the index gateway
      cache_ttl: 48h0m0s
      # Time interval at which the cached index files are resynced. The compactor sometimes merges index files and chunks, so the index files need to be pulled again.
      resync_interval: 10m
      # Index gateway client configuration
      index_gateway_client:
        # The service name of the index gateway is taken from the Helm deployment
        server_address: '{{ include "loki.indexGatewayAddress" . }}'
    # HTTP hedging sends multiple requests to the storage for read operations. If any of the sent requests succeeds, the operation is considered successful and the remaining requests are cancelled. Hedging works around requests that take a long time to respond or get dropped because of network issues or high-latency routes, by sending additional requests after a specific duration.
    hedging:
      # Wait 300ms; if the first request has not succeeded by then, start sending subsequent requests
      at: "300ms"
      # Send at most two additional requests when no response has been returned after 300ms
      up_to: 2
      # Maximum number of hedged requests allowed per second. This caps the total hedged requests in the cluster so the storage is not overwhelmed.
      max_per_second: 15

  server:
    http_listen_port: 3100
    grpc_listen_port: 9095
    # Maximum duration for reading the entire request from the client
    http_server_read_timeout: 5m
    # Maximum duration for writing a response to the client. If the server takes too long to send the response back, the connection is closed.
    http_server_write_timeout: 5m
    # 0 allows unlimited TCP connections from clients
    http_listen_conn_limit: 0
    # enable metrics endpoints
    register_instrumentation: true
    # close idle connections after this timeout
    http_server_idle_timeout: 5m
    # cluster logs are emitted in JSON format
    log_level: debug
    log_format: "json"
    log_source_ips_enabled: true
    # Maximum message sizes between cluster services and the number of concurrent streams
    grpc_server_max_recv_msg_size: 104857600  # 100MB
    grpc_server_max_send_msg_size: 104857600  # 100MB
    grpc_server_max_concurrent_streams: 256

  # define cluster limits
  limits_config:
    # remove service name discovery
    discover_service_name: []
    # allow rejection of outdated logs
    reject_old_samples: true
    # Only reject logs that are older than this age relative to the timestamp of the last log received per unique stream.
    # For example, with a 1h value, if the last log received for a stream has a timestamp of 10 PM, any new log with a timestamp before 9 PM is dropped.
    # Always set this parameter to half of max_chunk_age.
    # reject_old_samples_max_age: 1h
    reject_old_samples_max_age: 1d
    # A result is only cached if the timestamp of the newest log entry in it is older than 5 minutes.
    max_cache_freshness_per_query: 5m
    # Each query is split into smaller queries. For instance, a query with a 1 hour time interval is split into 2 smaller queries of 30 minutes each.
    split_queries_by_interval: 30m
    # Maximum number of split sub-queries that may be scheduled in parallel per query when the TSDB index is used
    tsdb_max_query_parallelism: 256
    # how long to wait before a query is considered dead
    query_timeout: 5m
    # enable volume (log size in the storage) tracking
    volume_enabled: true
    # Maximum number of returned log lines per query. This limit is shown to the user when they set the limit in Grafana to more than this value.
    max_entries_limit_per_query: 300000
    # How long logs stay in the storage before being deleted (180 days = 4320 hours)
    retention_period: 4320h0m0s
    # Limit query intervals to 180 days to prevent looking for logs older than 180 days
    max_query_lookback: 4320h0m0s
    # if you want to set a different retention per stream:
    # retention_stream:
    #   - selector: '{host="haproxy1"}'
    #     priority: 1
    #     period: 24h
    # total number of concurrent log streams allowed per user across all ingester instances (0 means unlimited)
    max_global_streams_per_user: 0
    # total number of concurrent log streams allowed per user on a single ingester instance (0 means unlimited)
    max_streams_per_user: 0
    # per-second rate limit per unique stream
    per_stream_rate_limit: 16MB
    # maximum accepted burst for a push message with a huge number of logs (burst mode from Promtail)
    per_stream_rate_limit_burst: 64MB
    # do not execute queries that match more than 300 log streams (the client needs to rewrite the query to reduce the number of returned unique log streams)
    max_streams_matchers_per_query: 300
    # shard queries based on the expected bytes per shard
    tsdb_max_bytes_per_shard: 600MB
    # maximum allowed ingestion burst size in MB (default 6);
    # useful if a client sends very many log messages at once in a single request
    ingestion_burst_size_mb: 100
    # maximum ingestion rate in MB allowed per user (default 4);
    # when Promtail pulls data from Apache Kafka, the ingesters see only a single user sending data
    ingestion_rate_mb: 100000
    ingestion_rate_strategy: local
    # maximum size in KB for a single received log line
    max_line_size: 256KB
    # log lines larger than 256KB are dropped; set to true to truncate them instead
    max_line_size_truncate: false

  commonConfig:
    # No need for ingester replication
    replication_factor: 1

  ingester:
    # enable snappy compression for the data in the chunks (preferred)
    chunk_encoding: snappy
    # How long chunks should be retained in memory after they have been flushed
    chunk_retain_period: 10s
    # How long chunks should sit in memory with no updates (no log messages received for them) before being flushed, if they don't hit the max block size
    chunk_idle_period: 1h
    # the maximum age a chunk may reach in memory before being flushed
    max_chunk_age: 2h0m0s
    wal:
      # enable the write-ahead log (WAL)
      enabled: true
      # Time interval at which WAL checkpoints are created.
      # Loki stores the logs in the WAL directory as small 32KB files. When the checkpoint duration is reached these small files are merged into a single file called a checkpoint; an older checkpoint file from previous cycles is merged as well. At the same time Loki deletes the WAL entries for chunks that have already been flushed to the S3 storage.
      checkpoint_duration: 5m0s
      # the location where the write-ahead log (WAL) is written. This needs to be persistent across restarts.
      dir: /var/loki/wal
      # flush in-memory chunks on shutdown so they are not lost
      flush_on_shutdown: true
      # after an incident, the amount of memory the ingester may use to replay the WAL files. The preferred value is 75% of the available memory.
      replay_memory_ceiling: 3GB

  querier:
    # number of concurrent queries each querier pod executes;
    # this parameter should be set to the number of CPU cores reserved for the pod
    max_concurrent: 16

  compactor:
    # Enable retention to compact both index and chunks
    retention_enabled: true
    # Temporary location for compactor storage. It holds the index and chunk files that will be compacted, and the compacted files, before they are stored in the storage.
    working_directory: /tmp/loki/compactor
    # Target store for deletion requests (e.g., compacted indexes and chunks)
    delete_request_store: s3
    # Frequency of applying compaction and retention
    compaction_interval: 10m
    # Delay before deleting chunks marked for deletion
    retention_delete_delay: 1h
    # Number of workers handling chunk deletions
    retention_delete_worker_count: 32

  tracing:
    enabled: true

  storage:
    type: s3
    bucketNames:
      chunks: "loki-test"
      # The following would choose the bucket the ruler stores its results in:
      # ruler: "loki-test"
    s3:
      # an s3 URL can be used to specify the endpoint, access key, secret key, and bucket name
      endpoint: s3.net
      # AWS secret access key
      secretAccessKey: ****
      # AWS access key ID
      accessKeyId: ****
      # AWS signature version (e.g., v2 or v4)
      signatureVersion: v4
      # Forces path-style addressing for S3 (true/false)
      s3ForcePathStyle: true
      # Allows insecure (HTTP) connections (true/false)
      insecure: true
      # HTTP configuration settings
      http_config:
        timeout: 15s
      backoff_config:
        # Minimum backoff time for an S3 GetObject retry
        min_period: 100ms
        # Maximum backoff time for an S3 GetObject retry
        max_period: 3s
        # Maximum number of retries for an S3 GetObject
        max_retries: 5

ingester:
  replicas: 3
  maxUnavailable: 1
  zoneAwareReplication:
    enabled: false
  persistence:
    enabled: true
    claims:
      - name: data
        size: 25G
        mountPath: /var/loki/
        storageClass: local-storage
  # extraArgs:
  #   # print the entire YAML configuration to stderr
  #   - "-print-config-stderr"

querier:
  replicas: 3
  maxUnavailable: 2
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 6
    targetCPUUtilizationPercentage: 70
    targetMemoryUtilizationPercentage: 50
  # extraArgs:
  #   - "-print-config-stderr"

gateway:
  enabled: true
  replicas: 1
  basicAuth:
    # -- Enables basic authentication for the gateway
    enabled: true
    # -- The basic auth username for the gateway
    username: ahsifer
    # -- The basic auth password for the gateway
    password: Master12345
  service:
    type: NodePort
    nodePort: 31000
    port: 80

queryFrontend:
  replicas: 3
  maxUnavailable: 1

frontend:
  # log queries that take more than 10 seconds to execute
  log_queries_longer_than: 10s

queryScheduler:
  replicas: 2
  maxUnavailable: 1

distributor:
  replicas: 3
  maxUnavailable: 2

compactor:
  replicas: 1

indexGateway:
  replicas: 2
  maxUnavailable: 1
  persistence:
    # -- Enable creating PVCs, which is required when using boltdb-shipper
    enabled: true
    # -- Use emptyDir with ramdisk for storage. **Please note that all data in indexGateway will be lost on pod restart**
    inMemory: false
    storageClass: local-storage
    size: 24G

test:
  enabled: false

lokiCanary:
  enabled: false
  extraArgs:
    - "-user=ahsifer"
    - "-pass=Master12345"

resultsCache:
  # -- Specifies whether the memcached-based results-cache should be enabled
  enabled: true
  # -- Specify how long cached results should be stored in the results-cache before being expired
  defaultValidity: 12h
  # -- Memcached operation timeout
  timeout: 500ms
  # -- Total number of results-cache replicas
  replicas: 2
  # -- Port of the results-cache service
  port: 11211
  # -- Amount of memory allocated to results-cache for object storage (in MB)
  allocatedMemory: 1024
  # -- Maximum item size for the results-cache memcached (in MB)
  maxItemMemory: 20
  # -- Maximum number of connections allowed
  connectionLimit: 16384
  # -- Max memory to use for cache write-back
  writebackSizeLimit: 500MB
  # -- Max number of objects to use for cache write-back
  writebackBuffer: 500000
  # -- Number of parallel threads for cache write-back
  writebackParallelism: 8

chunksCache:
  # -- Specifies whether the memcached-based chunks-cache should be enabled
  enabled: true
  # -- Batch size for sending and receiving chunks from the chunks cache
  batchSize: 8
  # -- Parallel threads for sending and receiving chunks from the chunks cache
  parallelism: 5
  # -- Memcached operation timeout
  timeout: 2000ms
  # -- Specify how long cached chunks should be stored in the chunks-cache before being expired
  defaultValidity: 0s
  # -- Total number of chunks-cache replicas
  replicas: 2
  # -- Port of the chunks-cache service
  port: 11211
  # -- Amount of memory allocated to chunks-cache for object storage (in MB)
  allocatedMemory: 2048
  # -- Maximum item memory for chunks-cache (in MB)
  maxItemMemory: 20
  # -- Maximum number of connections allowed
  connectionLimit: 16384
  # -- Max memory to use for cache write-back
  writebackSizeLimit: 500MB
  # -- Max number of objects to use for cache write-back
  writebackBuffer: 500000
  # -- Number of parallel threads for cache write-back
  writebackParallelism: 8
  # extraVolumes:
  #   - name: my-emptydir
  #     emptyDir: {}
  # extraVolumeMounts:
  #   - name: my-emptydir
  #     mountPath: /data
  # The following enables on-disk caching for memcached; it is only used when eviction from memory occurs. This can boost cluster read performance if used properly. An SSD could also be used for caching with memcached.
  # extraArgs:
  #   memcached.ext_path: "/data/file:8G"
  # persistence:
  #   # -- Enable creating PVCs for the results-cache
  #   enabled: true
  #   # -- Size of persistent disk, must be in G or Gi
  #   storageClass: local-storage
  #   storageSize: 10G
  #   # -- Volume mount path
  #   mountPath: /data

bloomCompactor:
  replicas: 0

bloomGateway:
  replicas: 0

ruler:
  replicas: 0

backend:
  replicas: 0

read:
  replicas: 0

write:
  replicas: 0

singleBinary:
  replicas: 0
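As a stopgap I am also thinking about giving the query-frontend pods explicit Kubernetes resource requests and limits in the same values file, so an expensive merge at least fails in a contained way. The numbers below are only guesses for my 4GB nodes, not tested values:

queryFrontend:
  replicas: 3
  maxUnavailable: 1
  # assumption: sizing guesses for 4GB nodes; adjust to the pod's real needs
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      memory: 2Gi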
I had the same issue. I started Loki yesterday on a Google Cloud instance with 4GB of memory. Today the pod crashed because it ran out of memory; there are approximately 800,000 logs in my instance today.
I am attempting to add the following configuration to limits_config:
max_query_bytes_read: 67108864 # 64MB
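For context, this is roughly how I expect the limits block to look once I add it; the values are only what I am experimenting with on a 4GB instance, not recommendations:

limits_config:
  # experiment for a 4GB instance, not a recommendation
  max_query_bytes_read: 67108864  # 64MB
  # assumption on my side: capping returned log lines should also help with memory
  max_entries_limit_per_query: 5000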
This is my loki-config.yaml without max_query_bytes_read