This repo contains code for SOSP'23 paper: FIFO queues are all you need for cache eviction
As a cache eviction algorithm, FIFO has a lot of attractive properties, such as simplicity, speed, scalability, and flash-friendliness. The most prominent criticism of FIFO is its low efficiency (high miss ratio).
In this work, we demonstrate a simple, scalable FIFO-based algorithm with three static queues (S3-FIFO). Evaluated on 6594 cache traces from 14 datasets, we show that S3-FIFO has lower miss ratios than state-of-the-art algorithms across traces. Moreover, S3-FIFO’s efficiency is robust — it has the lowest mean miss ratio on 10 of the 14 datasets and is among the top algorithms on the other datasets. The use of FIFO queues enables S3-FIFO to achieve good scalability with 6× higher throughput compared to optimized LRU at 16 threads.
Our insight is that most objects in Zipf workloads will only be accessed once in a short window, so it is critical to evict them early. And the key of S3-FIFO is a small FIFO queue that filters out most objects from entering the main cache. We show that filtering with a small static FIFO queue has a guaranteed eviction time and higher eviction precision compared to state-of-the-art adaptive algorithms.
The repo is a snapshot of libCacheSim, modified cachelib, and distComp.
You can compile libCacheSim, which will provide a cachesim
binary, then you can run simulations with
# compile libcachesim
pushd libCacheSim/scripts && bash install_dependency.sh && bash install_libcachesim.sh && popd;
Use cacheSim to run cache simulations
# ./cachesim DATAPATH TRACE_FORMAT EVICTION_ALGO CACHE_SIZE [OPTION...]
./cachesim DATA oracleGeneral fifo,arc,lecar,s3fifo 0 --ignore-obj-size 1
Detailed instructions can be found at libCacheSim.
# generate data
python3 libCacheSim/scripts/data_gen.py -m 1000000 -n 100000000 --alpha 1.0 --bin-output cachelib/mybench/zipf1.0_1_100.oracleGeneral.bin
git clone https://github.com/Thesys-lab/cachelib-sosp23
cd cachelib/mybench;
# turnoff turobo boose and change to performance mode, this is very important for getting consistent results
bash turboboost.sh disable;
# build cachelib
bash build.sh;
# usage: bash run.sh algo size
bash run.sh s3fifo 4000
This will generate binaries of different caches under _build/
. You can then run
./_build/algo trace size_mb hashpower n_thread
If you need to scale up the computation by using more nodes, you would need to use distComp. See here for more details.
Please see artifact evaluation for detailed instructions.
The traces we used can be downloaded here.
The binary traces are zstd compressed and have the following format:
struct {
uint32_t timestamp;
uint64_t obj_id;
uint32_t obj_size;
int64_t next_access_vtime; // -1 if no next access
}
The compressed traces can be used with libCacheSim without decompression. And libCacheSim provides a tracePrint tool to print the trace in human-readable format.
We greatly thank the following people and organizations that made this work possible.
We greatly appreciate the resources and support provided by Cloudlab and PDL for performing large-scale evaluations.
- Tencent Block (Download)
- Tencent Photo (Download)
- Wikimedia CDN
- Alibaba Block
- MSR
- FIU
- CloudPhysics
- Meta
This work was supported in part by Meta Fellowship, NSF grants CNS 1901410 and 1956271, and an AWS grant.
@inproceedings{yang2023-s3fifo,
title={FIFO queues are all you need for cache eviction},
author={Yang, Juncheng and Zhang, Yazhuo and Qiu, Ziyue and Yue, Yao and Rashmi, K.V.},
booktitle={Symposium on Operating Systems Principles (SOSP'23)},
year={2023}
}
Copyright 2023, Carnegie Mellon University
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.