Find Subsequence With Binary Search
Translator: youyun
Author: labuladong
Binary search is not hard to understand. It is rather hard to apply. Sometimes, you can't even link a question with binary search. In another article Longest Increasing Subsequence, we could even apply binary search in a poker game.
Let's discuss another interesting question that we can use binary search: how to determine if a given string s
is subsequence of another string t
(assume s
is much shorter as compared to t
)? Look at the two examples below:
s = "abc", t = "ahbgdc", return true.
s = "axc", t = "ahbgdc", return false.
This is a straightforward question which looks simple. But can you relate this with binary search?
1. Problem Analysis
Here is an intuitive solution:
The idea is to use two pointers i, j
to point to s, t
respectively. While moving forward, try to match the characters:
Some people may claim this is the optimal solution, given the time complexity is O(N) while N is the length of t
.
In fact, this solution is good enough for this problem alone. However, there is a follow-up:
Given a list of string s1,s2,...
and a string t
, determine if each string s
is a subsequence of t
(assume each s
is much shorter as compared to t
).
We can still apply the same logic inside a for
loop. However, the time complexity for each s
is still O(N). If binary search is applied, the time complexity can be reduced to O(MlogN). Since N >> M
, the efficiency will be improved significantly.
2. Using Binary Search
To begin with binary search, we need to pre-process t
by storing the indices of each character in a dictionary index
.
Refer to the diagram below, since we've matched "ab", the next one to be matched should be "c":
If we apply the first solution, we need to traverse linearly using j
to find "c". With the information in index
, we can use binary search to find an index that is greater than j
in index["c"]
. In the diagram above, we need to find an index from [0, 2, 6]
that is greater than 4:
In this way, we can directly get the index of next "c". The problem becomes how to find the smallest index that is greater than 4? We can use binary search to find the left boundary.
3. More about Binary Search
In another article Detailed Binary Search, we discussed in details how to implement binary search in 3 different ways. When we use binary search to return the index of target val
to find the left boundary, there is a special property:
When val
does not exist, the index returned is the index of the smallest value which is greater than val
.
It means that when we try to find element 2 in array [0,1,3,4]
, the algorithm will return index 2, where element 3 is located. And element 3 is the smallest element that is greater than 2 in this array. Hence, we can use binary search to avoid linear traversal.
The binary search above is to find the left boundary. Its details can be found in Detailed Binary Search. Let's apply it.
4. Implementation
We take a single string s
as an example for the case of multiple strings. The part of pre-processing can be extracted out.
The gif below illustrates how the algorithm executes:
We can see that the efficiency can be significantly improved using binary search.
Last updated