# Find Subsequence With Binary Search

Binary search is not hard to understand. It is rather hard to apply. Sometimes, you can't even link a question with binary search. In another article Longest Increasing Subsequence, we could even apply binary search in a poker game.

Let's discuss another interesting question that we can use binary search: how to determine if a given string

`s`

is subsequence of another string `t`

(assume `s`

is much shorter as compared to `t`

)? Look at the two examples below:s = "abc", t = "ahbgdc", return true.s = "axc", t = "ahbgdc", return false.

This is a straightforward question which looks simple. But can you relate this with binary search?

Here is an intuitive solution:

bool isSubsequence(string s, string t) {

int i = 0, j = 0;

while (i < s.size() && j < t.size()) {

if (s[i] == t[j]) i++;

j++;

}

return i == s.size();

}

The idea is to use two pointers

`i, j`

to point to `s, t`

respectively. While moving forward, try to match the characters:gif

Some people may claim this is the optimal solution, given the time complexity is O(N) while N is the length of

`t`

.In fact, this solution is good enough for this problem alone.

**However, there is a follow-up**:Given a list of string

`s1,s2,...`

and a string `t`

, determine if each string `s`

is a subsequence of `t`

(assume each `s`

is much shorter as compared to `t`

).boolean[] isSubsequence(String[] sn, String t);

We can still apply the same logic inside a

`for`

loop. However, the time complexity for each `s`

is still O(N). If binary search is applied, the time complexity can be reduced to O(MlogN). Since `N >> M`

, the efficiency will be improved significantly.To begin with binary search, we need to pre-process

`t`

by storing the indices of each character in a dictionary `index`

.int m = s.length(), n = t.length();

ArrayList<Integer>[] index = new ArrayList[256];

// record down the indices of each character in t

for (int i = 0; i < n; i++) {

char c = t.charAt(i);

if (index[c] == null)

index[c] = new ArrayList<>();

index[c].add(i);

}

Refer to the diagram below, since we've matched "ab", the next one to be matched should be "c":

If we apply the first solution, we need to traverse linearly using

`j`

to find "c". With the information in `index`

, **we can use binary search to find an index that is greater than**`j`

**in****. In the diagram above, we need to find an index from**`index["c"]`

`[0, 2, 6]`

that is greater than 4:In this way, we can directly get the index of next "c". The problem becomes how to find the smallest index that is greater than 4? We can use binary search to find the left boundary.

In another article Detailed Binary Search, we discussed in details how to implement binary search in 3 different ways. When we use binary search to return the index of target

`val`

to find **the left boundary**, there is a special property:**When**

`val`

**does not exist, the index returned is the index of the smallest value which is greater than**

**.**

`val`

It means that when we try to find element 2 in array

`[0,1,3,4]`

, the algorithm will return index 2, where element 3 is located. And element 3 is the smallest element that is greater than 2 in this array. Hence, we can use binary search to avoid linear traversal.// binary search to find the left boundary

int left_bound(ArrayList<Integer> arr, int tar) {

int lo = 0, hi = arr.size();

while (lo < hi) {

int mid = lo + (hi - lo) / 2;

if (tar > arr.get(mid)) {

lo = mid + 1;

} else {

hi = mid;

}

}

return lo;

}

The binary search above is to find the left boundary. Its details can be found in Detailed Binary Search. Let's apply it.

We take a single string

`s`

as an example for the case of multiple strings. The part of pre-processing can be extracted out.boolean isSubsequence(String s, String t) {

int m = s.length(), n = t.length();

// pre-process t

ArrayList<Integer>[] index = new ArrayList[256];

for (int i = 0; i < n; i++) {

char c = t.charAt(i);

if (index[c] == null)

index[c] = new ArrayList<>();

index[c].add(i);

}

// the pointer in t

int j = 0;

// find s[i] using index

for (int i = 0; i < m; i++) {

char c = s.charAt(i);

// character c does not exist in t

if (index[c] == null) return false;

int pos = left_bound(index[c], j);

// c is not found in the binary search interval

if (pos == index[c].size()) return false;

// increment pointer j

j = index[c].get(pos) + 1;

}

return true;

}

The gif below illustrates how the algorithm executes:

We can see that the efficiency can be significantly improved using binary search.

Last modified 3yr ago