algo-en
  • Introduction
  • I. Dynamic Programming
    • Dynamic Programming in Details
    • Classic DP: Edit Distance
    • Classic DP: Super Egg
    • Classic DP: Super Egg(Advanced Solution)
    • Class DP: Longest Common Subsequence
    • Classis DP: Game Problems
    • Regular Expression
    • The Strategies of Subsequence Problem
    • Greedy: Interval Scheduling
    • 4 Keys Keyboard
    • What is DP Optimal Substructure
    • Longest Increasing Subsequence
    • KMP Algorithm In Detail
    • House Robber Problems
    • Stock Buy and Sell Problems
  • II. Data Structure
    • Binary Heap and Priority Queue
    • LRU Cache Strategy in Detail
    • Collections of Binary Search Operations
    • Special Data Structure: Monotonic Stack
    • Special Data Structure: Monotonic Stack
    • Design Twitter
    • Reverse Part of Linked List via Recursion
    • What's the Best Algo Book
    • Queue Implement Stack/Stack implement Queue
    • Framework for learning data structures and algorithms
  • III. Algorithmic thinking
    • My Way to Learn Algorithm
    • The Framework for Backtracking Algorithm
    • Binary Search in Detail
    • The Tech of Double Pointer
    • The Key Concept of TwoSum Problems
    • Divide Complicated Problem: Implement a Calculator
    • Prefix Sum Skill
    • FloodFill Algorithm in Detail
    • Interval Scheduling: Interval Merging
    • Interval Scheduling: Intersections of Intervals
    • String Multiplication
    • Pancake Soring Algorithm
    • Sliding Window Algorithm
    • Some Useful Bit Manipulations
    • Russian Doll Envelopes Problem
    • Recursion In Detail
    • Backtracking Solve Subset/Permutation/Combination
    • Several counter-intuitive Probability Problems
    • Shuffle Algorithm
  • IV. High Frequency Interview Problem
    • How to Implement LRU Cache
    • How to Find Prime Number Efficiently
    • How to Calculate Minimium Edit Distance
    • How to Solve Drop Water Problem
    • How to Remove Duplicate From Sorted Sequence
    • How to Find Longest Palindromic Substring
    • How to Reverse Linked List in K Group
    • How to Check the Validation of Parenthesis
    • How to Find Missing Element
    • How to Pick Elements From a Arbitrary Sequence
    • How to use Binary Search
    • How to Scheduling Seats
    • Union-Find Algorithm in Detail
    • Union-Find Application
    • Find Subsequence With Binary Search
    • Problems can be solved by one line
    • How to Find Duplicate and Missing Element
    • How to Check Palindromic LinkedList
  • V. Common Knowledge
    • Difference Between Process and Thread in Linux
    • You Must Know About Linux Shell
    • You Must Know About Cookie and Session
    • Cryptology Algorithm
    • Some Good Online Pratice Platforms
Powered by GitBook
On this page
  • 1. Problem Analysis
  • 2. Using Binary Search
  • 3. More about Binary Search
  • 4. Implementation

Was this helpful?

  1. IV. High Frequency Interview Problem

Find Subsequence With Binary Search

PreviousUnion-Find ApplicationNextProblems can be solved by one line

Last updated 5 years ago

Was this helpful?

Translator:

Author:

Binary search is not hard to understand. It is rather hard to apply. Sometimes, you can't even link a question with binary search. In another article Longest Increasing Subsequence, we could even apply binary search in a poker game.

Let's discuss another interesting question that we can use binary search: how to determine if a given string s is subsequence of another string t (assume s is much shorter as compared to t)? Look at the two examples below:

s = "abc", t = "ahbgdc", return true.

s = "axc", t = "ahbgdc", return false.

This is a straightforward question which looks simple. But can you relate this with binary search?

1. Problem Analysis

Here is an intuitive solution:

bool isSubsequence(string s, string t) {
    int i = 0, j = 0;
    while (i < s.size() && j < t.size()) {
        if (s[i] == t[j]) i++;
        j++;
    }
    return i == s.size();
}

The idea is to use two pointers i, j to point to s, t respectively. While moving forward, try to match the characters:

Some people may claim this is the optimal solution, given the time complexity is O(N) while N is the length of t.

In fact, this solution is good enough for this problem alone. However, there is a follow-up:

Given a list of string s1,s2,... and a string t, determine if each string s is a subsequence of t (assume each s is much shorter as compared to t).

boolean[] isSubsequence(String[] sn, String t);

We can still apply the same logic inside a for loop. However, the time complexity for each s is still O(N). If binary search is applied, the time complexity can be reduced to O(MlogN). Since N >> M, the efficiency will be improved significantly.

2. Using Binary Search

To begin with binary search, we need to pre-process t by storing the indices of each character in a dictionary index.

int m = s.length(), n = t.length();
ArrayList<Integer>[] index = new ArrayList[256];
// record down the indices of each character in t
for (int i = 0; i < n; i++) {
    char c = t.charAt(i);
    if (index[c] == null) 
        index[c] = new ArrayList<>();
    index[c].add(i);
}

Refer to the diagram below, since we've matched "ab", the next one to be matched should be "c":

If we apply the first solution, we need to traverse linearly using j to find "c". With the information in index, we can use binary search to find an index that is greater than j in index["c"]. In the diagram above, we need to find an index from [0, 2, 6] that is greater than 4:

In this way, we can directly get the index of next "c". The problem becomes how to find the smallest index that is greater than 4? We can use binary search to find the left boundary.

3. More about Binary Search

When val does not exist, the index returned is the index of the smallest value which is greater than val.

It means that when we try to find element 2 in array [0,1,3,4], the algorithm will return index 2, where element 3 is located. And element 3 is the smallest element that is greater than 2 in this array. Hence, we can use binary search to avoid linear traversal.

// binary search to find the left boundary
int left_bound(ArrayList<Integer> arr, int tar) {
    int lo = 0, hi = arr.size();
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (tar > arr.get(mid)) {
            lo = mid + 1;
        } else {
            hi = mid;
        } 
    }
    return lo;
}

4. Implementation

We take a single string s as an example for the case of multiple strings. The part of pre-processing can be extracted out.

boolean isSubsequence(String s, String t) {
    int m = s.length(), n = t.length();
    // pre-process t
    ArrayList<Integer>[] index = new ArrayList[256];
    for (int i = 0; i < n; i++) {
        char c = t.charAt(i);
        if (index[c] == null) 
            index[c] = new ArrayList<>();
        index[c].add(i);
    }

    // the pointer in t
    int j = 0;
    // find s[i] using index
    for (int i = 0; i < m; i++) {
        char c = s.charAt(i);
        // character c does not exist in t
        if (index[c] == null) return false;
        int pos = left_bound(index[c], j);
        // c is not found in the binary search interval
        if (pos == index[c].size()) return false;
        // increment pointer j
        j = index[c].get(pos) + 1;
    }
    return true;
}

The gif below illustrates how the algorithm executes:

We can see that the efficiency can be significantly improved using binary search.

In another article , we discussed in details how to implement binary search in 3 different ways. When we use binary search to return the index of target val to find the left boundary, there is a special property:

The binary search above is to find the left boundary. Its details can be found in . Let's apply it.

youyun
labuladong
Detailed Binary Search
Detailed Binary Search
gif