Guidance / recommended configurations for running Astra at scale #795
-
Starting a discussion on running kaldb at scale. Can you share insight into how Slack runs this open source project internally (which configuration params do you keep as default and which do you override)? How do you pick the number of each node type?
Replies: 3 comments 5 replies
-
@bryanlb Another configuration question:
But it leads me to a question about the ideal relationship between the Kafka topic retention and the indexer's chunk configuration. The way I see it, since the Kafka topic is a WAL, the ideal configuration would be for any partition to retain more data than a single chunk covers. Is this how Slack configures the Kafka retention? AFAIK, Kafka topics can have retention by bytes or by time on the topic as a whole. Can you share the Kafka WAL retention configuration and the corresponding chunk settings?
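For illustration, the relevant topic-level knobs are retention.bytes (which Kafka enforces per partition) and retention.ms. A hypothetical example, shown as plain key/value (in practice these are set via kafka-configs or whatever topic tooling you use); the values are made up, not Slack's:

```yaml
# Illustrative per-topic retention overrides for the WAL topic (values made up).
# retention.bytes applies per partition; retention.ms is a time bound.
retention.bytes: 20000000000   # ~20GB per partition, comfortably above a 15GB chunk
retention.ms: 86400000         # 24 hours
```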
-
I'm still working to build out a complete recommendation list, but here's a snapshot of what we're currently using for some of the more critical configs. The CPU range indicates the Kubernetes request/limit settings.
**Index** (r5d.24xlarge)
- cpu: 2-5
- memory: 32GB
- jvm: 6GB
- localdisk: 90Gi
- maxBytesPerChunk: 15000000000 # 15GB
- Scaled to 4MB/s per indexer; a 40MB/s cluster would be 10 nodes (rough sizing sketch below)
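To make the Index numbers above concrete, here is a rough sketch of how they might map onto a Kubernetes container spec, together with the back-of-the-envelope sizing math. The use of JAVA_OPTS for the heap and the assumption that the memory request equals the limit are mine for illustration, not confirmed from Slack's manifests.

```yaml
# Hypothetical resource block for an Astra index pod, using the numbers above.
resources:
  requests:
    cpu: "2"          # lower end of the 2-5 CPU range
    memory: 32Gi
  limits:
    cpu: "5"          # upper end of the 2-5 CPU range
    memory: 32Gi      # assumes memory request == limit; only the CPU range was stated
env:
  - name: JAVA_OPTS   # assumed mechanism for passing the 6GB heap
    value: "-Xms6g -Xmx6g"
# Back-of-the-envelope sizing at the quoted rates:
#   40 MB/s cluster ingest / 4 MB/s per indexer  = 10 indexers
#   15 GB maxBytesPerChunk / 4 MB/s per indexer ~= 3,750 s, so each indexer rolls
#   a chunk roughly every ~62 minutes (ignoring compression and overhead)
```

Presumably the gap between the 32GB of container memory and the 6GB heap is deliberate headroom for off-heap memory and the OS page cache.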
**Recovery** (r5d.24xlarge)
- cpu: 2-5
- memory: 24GB
- jvm: 20GB
- localdisk: 100Gi
- Autoscaled on CPU > 60%, min 2 nodes (example autoscaler sketched below)
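A minimal sketch of what "autoscaled on CPU > 60%, min 2 nodes" could look like as a Kubernetes HorizontalPodAutoscaler, assuming the recovery nodes run as a Deployment; the object names and the maxReplicas ceiling are made up for illustration:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: astra-recovery            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: astra-recovery          # hypothetical Deployment
  minReplicas: 2                  # matches "min 2 nodes"
  maxReplicas: 10                 # illustrative ceiling; not stated in the thread
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60  # matches "CPU > 60%"
```

With the Resource metric, utilization is measured against the pod's CPU request, so 60% here means 60% of the 2-CPU request rather than of the 5-CPU limit.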
**Manager** (m5.24xlarge)
- cpu: 0.5-2
- memory: 12GB
- jvm: 8GB
- 1 instance per cluster
**Query** (m5.24xlarge)
- cpu: 1-4
- memory: 32GB
- jvm: 28GB
- requestTimeout: 60s
- Scaled to 3-10 nodes, depending on query load
**Cache** (i3en.24xlarge)