My Garnet server recently became unable to recover from its last checkpoint. Notably, it used to recover the HybridLog Stats fine and then initiate the AOF replay, but despite no changes to the config, it did this on the last server reset:
2024-11-09 06:42:49 10::42::49 info: TsavoriteKV [main][0] ********* Primary Recovery Information ********
2024-11-09 06:42:49 10::42::49 info: StoreWrapper[0] Error during recovery of store; storeVersion = -1; objectStoreVersion = -1 Tsavorite.core.TsavoriteException: Unable to find valid HybridLog token
   at Tsavorite.core.TsavoriteKV`4.FindRecoveryInfo(Int64 requestedVersion, HybridLogCheckpointInfo& recoveredHlcInfo, IndexCheckpointInfo& recoveredICInfo) in /src/libs/storage/Tsavorite/cs/src/core/Index/Recovery/Recovery.cs:line 327
   at Tsavorite.core.TsavoriteKV`4.Recover(Int32 numPagesToPreload, Boolean undoNextVersion, Int64 recoverTo) in /src/libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Tsavorite.cs:line 349
   at Garnet.server.StoreWrapper.RecoverCheckpoint(Boolean recoverMainStoreFromToken, Boolean recoverObjectStoreFromToken, Guid storeIndexToken, Guid storeHlogToken, Guid objectStoreIndexToken, Guid objectStoreHlogToken) in /src/libs/server/StoreWrapper.cs:line 245
2024-11-09 06:42:49 10::42::49 info: TsavoriteLog [aof][0] Unable to recover using any available commit
2024-11-09 06:42:49 10::42::49 info: StoreWrapper[0] Recovered AOF: begin address = 64, tail address = 64
2024-11-09 06:42:49 10::42::49 dbug: Session[0] [] [0139BFB8] Starting RespServerSession Id=0
2024-11-09 06:42:49 10::42::49 info: StoreWrapper[0] Begin AOF recovery
2024-11-09 06:42:49 10::42::49 info: StoreWrapper[0] Begin AOF replay
2024-11-09 06:42:49 10::42::49 info: StoreWrapper[0] Completed full AOF log replay of 0 record
It's worth pointing out that I'm using a configuration that involves a primary Garnet server along with a secondary that functions as a low compute/memory backup whose only purpose is to write to the AOF in case the primary is down. My guess is that this strategy failed and somehow corrupted the checkpoint. However, I did test this in development, and the primary was able to pick up the data that the secondary wrote while the primary was down, so I hadn't expected this kind of failure. Given what I'm doing, I'm not sure whether this is a bug or a bad config based on a misunderstanding of how AOF works. Here are my config files:
docker-compose.yml
garnet.conf
garnet-aof.conf