GSoC Information
Twemcache inherits the multithreaded design and locking mechanism of Memcached (1.4.4, to be exact). While it offers great performance and low CPU utilization in general, lock contention is often the bottleneck that prevents Twemcache from taking advantage of the many cores on modern CPUs. Re-designing locking in Twemcache could push single-instance throughput to a whole new level.
Please refer to this investigation into Twemcache's scalability and lock contention. There are various locks used in Twemcache, but the most important ones are those used to protect the hashtable(s) and slabs.
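To make the design space concrete, one commonly considered direction is lock striping: instead of a single lock over the whole hashtable, an array of mutexes each guards a fixed subset of buckets, so threads touching different stripes no longer contend. The sketch below is purely illustrative and is not Twemcache's actual code; all names and sizes are invented for the example:

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024
#define NSTRIPES 16             /* must divide NBUCKETS evenly */

struct item {
    struct item *next;
    char         key[32];
    int          value;
};

static struct item     *buckets[NBUCKETS];
static pthread_mutex_t  stripes[NSTRIPES];

/* FNV-1a, standing in for the real hash function. Because NSTRIPES
 * divides NBUCKETS, keys that land in the same bucket always map to
 * the same stripe, so each bucket is guarded by exactly one lock. */
static uint32_t hash_key(const char *key)
{
    uint32_t h = 2166136261u;
    for (; *key != '\0'; key++) {
        h ^= (uint8_t)*key;
        h *= 16777619u;
    }
    return h;
}

void cache_init(void)
{
    for (int i = 0; i < NSTRIPES; i++) {
        pthread_mutex_init(&stripes[i], NULL);
    }
}

void cache_set(const char *key, int value)
{
    uint32_t h = hash_key(key);
    pthread_mutex_t *lock = &stripes[h % NSTRIPES];

    pthread_mutex_lock(lock);
    struct item **head = &buckets[h % NBUCKETS];
    for (struct item *it = *head; it != NULL; it = it->next) {
        if (strcmp(it->key, key) == 0) {
            it->value = value;          /* update in place */
            pthread_mutex_unlock(lock);
            return;
        }
    }
    struct item *it = calloc(1, sizeof(*it));
    strncpy(it->key, key, sizeof(it->key) - 1);
    it->value = value;
    it->next = *head;                   /* push onto the chain */
    *head = it;
    pthread_mutex_unlock(lock);
}

/* Returns 1 and fills *value if the key exists, 0 otherwise. */
int cache_get(const char *key, int *value)
{
    uint32_t h = hash_key(key);
    pthread_mutex_t *lock = &stripes[h % NSTRIPES];
    int found = 0;

    pthread_mutex_lock(lock);
    for (struct item *it = buckets[h % NBUCKETS]; it != NULL; it = it->next) {
        if (strcmp(it->key, key) == 0) {
            *value = it->value;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(lock);
    return found;
}
```

A real candidate design would also have to cover eviction, expiration, and the interaction with the slab allocator's own locks, which is where much of the difficulty in this project lies.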
As shown above, the current locking design clearly has room for improvement. We are re-examining how key lookup and memory allocation are implemented, and how locking is woven into these procedures. The goal is to come up with a new, preferably simple design that realizes as much of the theoretical gain of a lock-less system as possible.
To get there, we may need to look into various possible designs, compare their pros and cons, build prototypes and run performance tests against them. It's more important to know how to evaluate a solution and iterate based on results than to come up with the perfect solution on the first try.
Twemcache is a key-value store that treats values mostly as data blobs. Adding the most basic data structures, such as lists, would simplify how many users use Twemcache. On top of that, scripting support (such as Lua) would allow users to define, store and operate on arbitrary data structures, and do so without multiple round trips or atomicity concerns.
The Memcached protocol is known for its simplicity. For the most part, it is an in-memory key-value store that treats data as blobs. The only commands that try to understand what the data means are incr and decr; the only other commands that try to incrementally update a blob are append and prepend. Overall, it doesn't do much with the data.
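For reference, this is what those commands look like on the wire in the Memcached ASCII protocol (client lines and server replies interleaved; the # annotations are ours, not part of the protocol):

```
set counter 0 0 1      # set <key> <flags> <exptime> <bytes>
1
STORED
incr counter 5         # the only case where the blob is interpreted as a number
6
append counter 0 0 2   # append two more bytes to the stored blob
00
STORED
get counter
VALUE counter 0 3
600
END
```

Everything else in the protocol moves blobs around without looking inside them.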
Data structure support has proven particularly valuable for some use cases, for example, building indices and/or structured records. It's a big step toward better usability in general. Redis, the most popular memcache(d) alternative, supports many variations of lists, sets and hashes. These data structures make Redis capable of performing more complicated operations on the server side, saving network traffic and improving latency by avoiding transferring data back and forth. Along the same line of avoiding data transfer as much as possible, it also makes sense to allow users to customize their operations in ways unforeseen by the server. For example, a user might want to append to a list and truncate it if the total length is greater than a threshold; if there is a way to combine these two operations into one "command" without changing the server binary, the user can optimize their operations dynamically to improve efficiency and simplify application logic.
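The append-then-truncate example might look like the following as a server-side script. This is purely illustrative: Twemcache has no scripting API today, and every primitive used here (list.rpush, list.len, list.trim) is an invented name, not an existing interface.

```lua
-- Hypothetical user-registered command; all list.* primitives are invented.
-- Appends a value to a list and caps its length in one atomic server-side
-- step, avoiding a read-modify-write round trip from the client.
function push_capped(key, value, maxlen)
    list.rpush(key, value)
    if list.len(key) > maxlen then
        -- drop the oldest entries, keeping only the newest maxlen
        list.trim(key, -maxlen, -1)
    end
end
```

Because the whole sequence runs inside the server, the client pays one round trip and never observes the list in a half-updated state.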
Twemcache inherits the Memcached protocol and its basic design. Overall, we like the multithreaded setup, the way memory management works (with our additional features), and the very small codebase that is easy to understand and maintain. These are among the reasons why we didn't simply switch to Redis. On the other hand, there is high demand from some users for better data structure support and greater flexibility, and we want to enable that without compromising Twemcache's strengths.
There are two goals we want to achieve. The basic goal is adding native implementations of the most common data structures: lists would be the most useful, followed by sets, sorted sets and hashes. The second and more difficult goal is to add scripting support, such as Lua, to register and execute user-defined logic inside the server, while maintaining Twemcache's low latency and predictable performance.
We will need to address two challenges in this project: 1) deciding which data structures and commands to support natively, which to push into scripting, and which to ignore; 2) providing data structure support and flexible command execution without sacrificing predictability and performance.
For the first challenge, we borrow the ASIC / FPGA / CPU (as a general-purpose processor) analogy to compare the various styles of command and data structure support. Native support is like an ASIC: it's the fastest, but relatively inflexible, with long development cycles. Scripting is like an FPGA: there are primitives that allow the user to configure the "hardware", in this case the server, on the fly to perform specific tasks, but it will almost always be slower than a server-native implementation. A naive key-value store is like a CPU: users can write any logic against it, but it will probably be the slowest. A well-designed system chooses the right technology for each task based on its invocation frequency, complexity and implementation cost.
For the second challenge, we need to take a good look at the resource requirements with respect to the nature of the computation and the request volume, and at how these operations interact with memory management. Testing will be essential to guaranteeing good performance. If backed by data, it might even be necessary to re-design the architecture to make sure the system doesn't run into major bottlenecks, which may tie into our other project: Cache Lock Revamp.