Classic DP: Edit Distance
Last updated
Last updated
Translator: Master-cai
Author: labuladong
Few days ago, I saw an interview paper of Tencent. In this paper, most of the algorithm problems are Dynamic programming. The last question is to write a function which calculates the shortest Editing distance. Today I wrote an article specifically to discuss this problem.
I personally like this problem because it looks very hard, yet the solution is surprisingly simple and beautiful. Though it's a rare algorithm, it's is very useful (yech, I recognize that many algorithm problems are not very useful). The problem is as follows:
Why did I say this problem is hard? Obviously, it's just hard, making people helpless and frightened.
And why did I say this problem is useful? Because a few days ago I used the algorithm in my daily life. I had an article in my Wechat Official Account and I wrote some words out of place by mistake. So I decided to modify this part to make the logic suitable. However, the Wechat Official Account article can only be modified 20 words at most, and it only supports addition, deletion and replacement (exactly same as the editing distance problem). So I used the algorithm to find the best way to solve the problem in just 16 steps.
Another advanced example is that the edit distance can be used to measure the similarity of two DNA sequences. The DNA sequence is a sequence consisting of A, G, C and T, which is similar to a string. The shorter the editing distance is, the more similar the two DNA are. Maybe the owners of these DNAs were ancient relatives.
Let's get to the point, I will explain you how to edit the distance in detail, and I hope you could obtain something fruitful.
The editing distance is a problem where we are given two strings s1
and s2
with only three operations, and we have to transform s1
into s2
in fewest steps. The first thing to be sure of is that the results of s1
to s2
and s2
to s1
are the same. So we will use s1
to s2
as an example.
Mentioned in the early paper "The longest common subsequence", I said that to solve the dynamic programming problem of two strings, we normally use two pointers i
, j
to point to the ends of the two strings, and then go forward step by step to reduce the size of the problem.
Assuming that the two strings are "rad" and "apple", in order to change s1
tos2
, the algorithm works like this:
Remember this gif in order to solve the editing distance problem. The key is how to make
the right operation which I will discuss later.
According to the above gif, we can figure out that there are not only three operations; in fact, there is a fourth operation which is skip. For example:
As the two characters are the same, obviously there should be no operation to minimize the distance. Just move i
, j
.
Another simple situation is when j
has finished s2
, if i
has not finished s1
, then you can only delete s1
to make them the same. For example:
Similarly, if i
finished s1
and j
has not finished s2
, you can only insert all the remaining characters of s2
into s1
. As you see, the two cases are the base case of the algorithm.
Let's look at how to change your ideas into code. Sit tight, it's time to go.
First we sort out our ideas:
The base case is when i
finished s1
or j
finished s2
, we can directly return the remaining length of the other string.
For each pair of characters, s1[i]
and s2[j]
, there are four operations:
With this framework, the problem has been solved. Maybe you will ask, how to chose the "three choices"? It's very simple: try it all, and chose the smallest one. we need some recursive skills here. Look at the code:
Let's explain this recursive code in detail. There is no need to explain the base case, so I'll mainly explain the recursive part.
It is said that recursive code is very interpretable. It does make sense. As long as you understand the definition of a function, you can clearly understand the logic of the algorithm. The function dp(i, j)
is defined like this:
Remember this definition, let's look at the code:
if s1[i] != s2[j]
, we should recurse the three operations which needs a bit of thing:
Now, you should fully understand this short and clever code. Another small problem is that this is a brute force solution. There are many overlapping subproblems, which should be optimized by dynamic programming techniques.
How can we see the overlapping subproblems at a glance? As mentioned in the previous article "Regular Expressions for Dynamic Programming", we need to abstract the recursive framework of the algorithm in this article:
For the subproblem dp(i-1, j-1)
, how can we get it from the original question dp(i, j)
? Once we found a repetitive path, it means that there is a huge number of repetitive paths, which is the overlapping subproblem. For example: dp(i, j)-> #1
and dp(i, j)->#2->#3
.
For the overlapping subproblems, we introduced in the previous article "Detailed Explanation of Dynamic Programming" in detailed. The optimization is nothing more than a memo or a DP table.
The memo is easy to append, just modified the original code slightly.
We mainly explain the DP table solution.
First, we declare the meaning of the dp array. The dp array is a two-dimensional array, which looks like this:
With the foundation of the previous recursive solution, it's easy to understand. dp [..][0]
and dp [0][..]
correspond to the base case. The meaning of dp [i][j]
is similar to the previous dp function:
The base case of the dp function is that i, j
is equal to -1. However, the array index is at least 0, so the dp array is offset by one position.
Since the dp array has the same meaning as the recursive dp function, you can directly apply the previous ideas to write code. The only difference is that the DP table is solved bottom-up, and the recursive solution is solved top-down:
Generally speaking, when dealing with the dynamic programming of two strings, we just follow the ideas of this article, making the DP table. Why? Because it's easy to find out the relationship of the state transitions, such as the DP table of the edit distance:
There is another detail: since every dp[i][j]
is only related to the three status, the space complexity can be reduced to O(min(M, N)) (M, N is the length of the two strings). It's not very difficult but the code is harder to read. You can try to optimize it by yourself.
You may also ask, As we only found the minimum editing distance, how can we know the every step? In the example of modifying the article you mentioned earlier, only a editing distance is definitely not enough. You must know how to modify it.
Actually, it's very simple, just slightly modify the code and add additional information to the dp array:
The val
attribute is the value of the previous dp array, and the choice
attribute represents the operation. When making the best choice, record the operation and then infer the specific operation from the result.
Our final result is dp [m] [n]
, where val
holds the minimum edit distance, and choice
holds the last operation, such as the insert operation, then you can move one space to the left:
Repeating this process, you can return to the starting point dp [0] [0]
step by step to form a path. Editing according to the operations on this path is the best solution.
The above is the entire content of the edit distance algorithm.