LRU Cache Algorithm
The LRU algorithm is a cache eviction strategy. Its principle is not difficult, but writing a bug-free implementation in an interview takes skill, as it involves several layers of abstraction and the decomposition of data structures. This article will guide you through writing elegant code.
The key data structure used in the LRU algorithm is the hash-linked list, LinkedHashMap. The Hands-on Guide to Implementing Hash Linked List in the data structure basics section explains the principles and code implementation of hash-linked lists. If you haven't read it, it's okay; this article will explain the core principles of hash-linked lists again to facilitate the implementation of the LRU algorithm.
Computer cache capacity is limited. When the cache is full, some content must be deleted to make room for new content. But which content should be deleted? We want to evict the data that is no longer useful and keep the useful data in the cache for future access. So what kind of data counts as "useful"?
The LRU cache eviction algorithm is one such common strategy. LRU stands for Least Recently Used: we consider recently used data "useful," and data that hasn't been used for a long time useless. When the cache is full, we delete the data that has gone unused the longest.
For example, Android phones allow apps to run in the background. If I open "Settings," "Phone Manager," and "Calendar" one after another, the order in the background is like this:
But if I then access the "Settings" interface, "Settings" will be moved to the front, like this:
Assume my phone only allows three apps to run simultaneously, and it's already full. If I open a new app "Clock," I must close one app to free up space for "Clock." Which one should be closed?
According to the LRU strategy, the bottom "Phone Manager" is closed because it is the least recently used, and the new app is placed at the top:
Now you should understand the LRU (Least Recently Used) strategy. Of course, there are other cache eviction strategies, such as evicting based on access frequency (LFU strategy) rather than access order. Each has its application scenarios. This article explains the LRU algorithm strategy, and I will explain the LFU algorithm in LFU Algorithm Details.
1. LRU Algorithm Description
LeetCode Problem 146 "LRU Cache" requires you to design a data structure:
First, it should accept a `capacity` parameter as the maximum cache capacity, then implement two APIs: a `put(key, val)` method to store key-value pairs, and a `get(key)` method to retrieve the `val` corresponding to a `key`, returning -1 if the `key` does not exist.
Note that the get and put methods must have a time complexity of $O(1)$. Let's look at a specific example to see how the LRU algorithm works:
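For instance, with a capacity of 2, a sequence of `put` and `get` calls behaves as sketched below. Since the `LRUCache` class is what we are about to design, this runnable trace stands in Java's built-in `LinkedHashMap` with access ordering for it (the `lru` helper and the `LRUDemo` class name are our own, used only for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUDemo {
    // accessOrder=true keeps entries ordered by access; overriding
    // removeEldestEntry evicts the least recently used entry once
    // the capacity is exceeded
    static <K, V> LinkedHashMap<K, V> lru(int capacity) {
        return new LinkedHashMap<K, V>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        LinkedHashMap<Integer, Integer> cache = lru(2);
        cache.put(1, 1);                               // cache: [1]
        cache.put(2, 2);                               // cache: [1, 2]
        System.out.println(cache.getOrDefault(1, -1)); // 1; key 1 becomes most recent
        cache.put(3, 3);                               // capacity full: key 2 is evicted
        System.out.println(cache.getOrDefault(2, -1)); // -1; key 2 is gone
    }
}
```

Note how accessing key 1 "refreshes" it, so when the cache overflows it is key 2, not key 1, that gets evicted.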
2. LRU Algorithm Design
Analyzing the above operations, to ensure that the `put` and `get` methods run in $O(1)$ time, we can summarize the necessary conditions for our cache data structure:
1. Clearly, the elements in the `cache` must have a sequence to distinguish between recently used and long-unused data. When the capacity is full, the least recently used element should be removed to make space.
2. We need to quickly find whether a certain `key` exists in the `cache` and obtain the corresponding `val`.
3. Each time a `key` in the `cache` is accessed, this element needs to become the most recently used, which means the `cache` must support quick insertion and deletion of elements at any position.
So, what data structure meets these conditions? A hash table allows fast lookup, but the data is unordered. A linked list is ordered and allows fast insertion and deletion, but lookup is slow. Therefore, combining them forms a new data structure: a hash-linked list, LinkedHashMap.
The core data structure of the LRU cache algorithm is this hash-linked list, a combination of a doubly linked list and a hash table. It looks like this:
Using this structure, let's analyze the three conditions one by one:
1. If we always add elements to the tail of the list, then clearly the closer an element is to the tail, the more recently it has been used; the closer to the head, the longer it has gone unused.
2. For a specific `key`, we can quickly locate the node in the linked list via the hash table and obtain the corresponding `val`.
3. A linked list naturally supports quick insertion and deletion at any position by adjusting pointers. However, a traditional linked list cannot quickly access an element at a specific index; here, with the help of the hash table, we can map a `key` directly to any linked list node and then insert or delete it.
Readers might ask: why do we need a doubly linked list rather than a singly linked list? And since the `key` is already stored in the hash table, why must the linked list node store both `key` and `val` instead of just `val`?
These questions can only be answered through implementation. The rationale behind this design will become clear once we personally implement the LRU algorithm, so let's start coding!
3. Code Implementation
Many programming languages have built-in linked hashmaps or library functions with LRU-like functionality. But to help you understand the nuts and bolts of the algorithm, let's build an LRU cache from scratch first, then implement it again using Java's built-in LinkedHashMap.
First, let's define the node class for our doubly linked list. For simplicity, both key and val are int types:
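A minimal sketch of such a node class (the name `Node` and the `int` fields follow the description above):

```java
class Node {
    int key, val;
    Node next, prev;

    public Node(int k, int v) {
        this.key = k;
        this.val = v;
    }
}
```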
Next, we'll use our Node type to build a doubly linked list with the essential APIs that our LRU algorithm needs:
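One way to sketch this list (the `DoubleList` name and its API, `addLast`, `remove`, `removeFirst`, and `size`, are our choices; the `Node` class is repeated so the sketch compiles on its own):

```java
class Node {
    int key, val;
    Node next, prev;
    Node(int k, int v) { this.key = k; this.val = v; }
}

class DoubleList {
    // sentinel head and tail nodes simplify boundary handling
    private final Node head, tail;
    // number of elements in the list
    private int size;

    public DoubleList() {
        head = new Node(0, 0);
        tail = new Node(0, 0);
        head.next = tail;
        tail.prev = head;
        size = 0;
    }

    // append node x at the tail, O(1)
    public void addLast(Node x) {
        x.prev = tail.prev;
        x.next = tail;
        tail.prev.next = x;
        tail.prev = x;
        size++;
    }

    // remove node x, which is guaranteed to be in the list, O(1)
    // (this is why we need a doubly linked list: x.prev is available)
    public void remove(Node x) {
        x.prev.next = x.next;
        x.next.prev = x.prev;
        size--;
    }

    // remove and return the first node, O(1)
    public Node removeFirst() {
        if (head.next == tail) return null;
        Node first = head.next;
        remove(first);
        return first;
    }

    public int size() { return size; }
}
```

The sentinel nodes mean `addLast` and `remove` never have to special-case an empty list or a head/tail node.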
If you're not comfortable with linked list operations, check out Hands-On Guide to Implementing a Doubly Linked List.
Now we can answer the earlier question: "Why do we need a doubly linked list?" It's because we need to perform deletions. Deleting a node requires not only a pointer to the node itself, but also access to its predecessor's pointer. Only a doubly linked list lets you find the predecessor directly, keeping the operation at $O(1)$ time complexity.
::: important Important
Note that our doubly linked list API only inserts at the tail. This means data near the tail was recently used, while data near the head is the least recently used.
:::
With the doubly linked list in place, all we need to do is combine it with a hash map in our LRU algorithm. Let's start with the code skeleton:
Don't rush into implementing the `get` and `put` methods just yet. Since we're simultaneously maintaining a doubly linked list `cache` and a hash map `map`, it's easy to miss an operation. For example, when deleting a key, you might remove the corresponding `Node` from `cache` but forget to also remove the key from `map`.
The best way to avoid this kind of bug is to provide an abstraction layer of APIs on top of these two data structures.
The idea is to keep the main `get` and `put` methods from directly manipulating the details of `map` and `cache`. Let's implement a few helper functions first:
This also answers the earlier question: "Why store both `key` and `val` in the linked list node instead of just `val`?" Look at the `removeLeastRecently` function: we need to get `deletedKey` from `deletedNode`.
Here's the thing: when the cache is full, we don't just delete the last `Node`; we also need to remove the corresponding key from `map`, and the only way to obtain that key is from the `Node` itself. If a `Node` stored only `val`, we'd have no way of knowing which key to delete from `map`, and that would cause bugs.
These helper methods are simple wrappers that keep us from directly touching the `cache` linked list and the `map` hash table. Now let's implement the `get` method for the LRU algorithm:
The put method is a bit more involved. Let's draw a diagram to clarify its logic:
With that clear picture, we can easily write the code for put:
At this point, you should have a solid grasp of both the theory and implementation of the LRU algorithm. Here's the complete implementation:
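Putting the pieces together, a complete, self-contained sketch might look like the following (the helper names `makeRecently`, `addRecently`, `deleteKey`, and `removeLeastRecently` are our own choices for the abstraction layer described above):

```java
import java.util.HashMap;

class Node {
    int key, val;
    Node next, prev;
    Node(int k, int v) { this.key = k; this.val = v; }
}

class DoubleList {
    // sentinel head and tail simplify boundary handling
    private final Node head, tail;
    private int size;

    DoubleList() {
        head = new Node(0, 0);
        tail = new Node(0, 0);
        head.next = tail;
        tail.prev = head;
    }

    // append x at the tail (most recently used position), O(1)
    void addLast(Node x) {
        x.prev = tail.prev;
        x.next = tail;
        tail.prev.next = x;
        tail.prev = x;
        size++;
    }

    // remove a node known to be in the list, O(1)
    void remove(Node x) {
        x.prev.next = x.next;
        x.next.prev = x.prev;
        size--;
    }

    // remove and return the head node (least recently used), O(1)
    Node removeFirst() {
        if (head.next == tail) return null;
        Node first = head.next;
        remove(first);
        return first;
    }

    int size() { return size; }
}

class LRUCache {
    // key -> Node(key, val)
    private final HashMap<Integer, Node> map = new HashMap<>();
    // ordered by recency: head = least recent, tail = most recent
    private final DoubleList cache = new DoubleList();
    // maximum capacity
    private final int cap;

    public LRUCache(int capacity) { this.cap = capacity; }

    // promote an existing key to most recently used
    private void makeRecently(int key) {
        Node x = map.get(key);
        cache.remove(x);
        cache.addLast(x);
    }

    // insert a new key as the most recently used element
    private void addRecently(int key, int val) {
        Node x = new Node(key, val);
        cache.addLast(x);
        map.put(key, x);
    }

    // delete a key from both structures
    private void deleteKey(int key) {
        cache.remove(map.get(key));
        map.remove(key);
    }

    // evict the least recently used element
    private void removeLeastRecently() {
        Node deletedNode = cache.removeFirst();
        // the node stores its key, so we know what to remove from map
        map.remove(deletedNode.key);
    }

    public int get(int key) {
        if (!map.containsKey(key)) return -1;
        makeRecently(key);
        return map.get(key).val;
    }

    public void put(int key, int val) {
        if (map.containsKey(key)) {
            // remove the stale entry, then re-insert as most recent
            deleteKey(key);
            addRecently(key, val);
            return;
        }
        if (cache.size() == cap) {
            removeLeastRecently();
        }
        addRecently(key, val);
    }
}
```

Because `get` and `put` only ever go through the helpers, `map` and `cache` can never drift out of sync.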
You can also use Java's built-in LinkedHashMap or the MyLinkedHashMap from Hands-On Guide to Implementing a Linked HashMap to implement LRU. The logic is exactly the same:
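A sketch of that approach: we keep a `LinkedHashMap<Integer, Integer>` in insertion order and re-insert a key whenever it's used, so the least recently used key is always at the head (the `makeRecently` helper name is our own):

```java
import java.util.LinkedHashMap;

class LRUCache {
    private final int cap;
    // LinkedHashMap keeps keys in insertion order; we maintain the
    // invariant that the most recently used key sits at the tail
    private final LinkedHashMap<Integer, Integer> cache = new LinkedHashMap<>();

    public LRUCache(int capacity) {
        this.cap = capacity;
    }

    public int get(int key) {
        if (!cache.containsKey(key)) return -1;
        makeRecently(key);
        return cache.get(key);
    }

    public void put(int key, int val) {
        if (cache.containsKey(key)) {
            cache.put(key, val);
            makeRecently(key);
            return;
        }
        if (cache.size() >= cap) {
            // the head of the internal list is the least recently used key
            int oldestKey = cache.keySet().iterator().next();
            cache.remove(oldestKey);
        }
        cache.put(key, val);
    }

    private void makeRecently(int key) {
        int val = cache.get(key);
        // remove and re-insert the key to move it to the tail
        cache.remove(key);
        cache.put(key, val);
    }
}
```

(Java's `LinkedHashMap` also supports access order directly via its three-argument constructor plus `removeEldestEntry`, but the version above mirrors the hand-rolled logic one-to-one.)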
And there you have it: nothing mysterious about the LRU algorithm anymore. For more data structure design problems, check out Classic Data Structure Design Exercises.