algo-en
  • Introduction
  • I. Dynamic Programming
    • Dynamic Programming in Details
    • Classic DP: Edit Distance
    • Classic DP: Super Egg
    • Classic DP: Super Egg(Advanced Solution)
    • Class DP: Longest Common Subsequence
    • Classis DP: Game Problems
    • Regular Expression
    • The Strategies of Subsequence Problem
    • Greedy: Interval Scheduling
    • 4 Keys Keyboard
    • What is DP Optimal Substructure
    • Longest Increasing Subsequence
    • KMP Algorithm In Detail
    • House Robber Problems
    • Stock Buy and Sell Problems
  • II. Data Structure
    • Binary Heap and Priority Queue
    • LRU Cache Strategy in Detail
    • Collections of Binary Search Operations
    • Special Data Structure: Monotonic Stack
    • Special Data Structure: Monotonic Stack
    • Design Twitter
    • Reverse Part of Linked List via Recursion
    • What's the Best Algo Book
    • Queue Implement Stack/Stack implement Queue
    • Framework for learning data structures and algorithms
  • III. Algorithmic thinking
    • My Way to Learn Algorithm
    • The Framework for Backtracking Algorithm
    • Binary Search in Detail
    • The Tech of Double Pointer
    • The Key Concept of TwoSum Problems
    • Divide Complicated Problem: Implement a Calculator
    • Prefix Sum Skill
    • FloodFill Algorithm in Detail
    • Interval Scheduling: Interval Merging
    • Interval Scheduling: Intersections of Intervals
    • String Multiplication
    • Pancake Soring Algorithm
    • Sliding Window Algorithm
    • Some Useful Bit Manipulations
    • Russian Doll Envelopes Problem
    • Recursion In Detail
    • Backtracking Solve Subset/Permutation/Combination
    • Several counter-intuitive Probability Problems
    • Shuffle Algorithm
  • IV. High Frequency Interview Problem
    • How to Implement LRU Cache
    • How to Find Prime Number Efficiently
    • How to Calculate Minimium Edit Distance
    • How to Solve Drop Water Problem
    • How to Remove Duplicate From Sorted Sequence
    • How to Find Longest Palindromic Substring
    • How to Reverse Linked List in K Group
    • How to Check the Validation of Parenthesis
    • How to Find Missing Element
    • How to Pick Elements From a Arbitrary Sequence
    • How to use Binary Search
    • How to Scheduling Seats
    • Union-Find Algorithm in Detail
    • Union-Find Application
    • Find Subsequence With Binary Search
    • Problems can be solved by one line
    • How to Find Duplicate and Missing Element
    • How to Check Palindromic LinkedList
  • V. Common Knowledge
    • Difference Between Process and Thread in Linux
    • You Must Know About Linux Shell
    • You Must Know About Cookie and Session
    • Cryptology Algorithm
    • Some Good Online Pratice Platforms
Powered by GitBook
On this page
  • Analysis
  • Summary

Was this helpful?

  1. IV. High Frequency Interview Problem

How to Find Duplicate and Missing Element

PreviousProblems can be solved by one lineNextHow to Check Palindromic LinkedList

Last updated 5 years ago

Was this helpful?

Translator:

Author:

Today we are going to talk about a simple but skillfull problem: find duplicate and missing element. It seems to be similar to the previous problem , but there are some difference between these two problems.

Here is the detailed description of this problem(LeetCode 645: Set Mismatch)

The set Soriginally contains numbers from 1 to n. But unfortunately, due to the data error, one of the numbers in the set got duplicate to another number in the set, which results in repetition of one number and loss of another number.

Given an array nums representing the data status of this set after the error. Your task is to firstly find the number occurs twice and then find the number that is missing. Return them in the form of an array.

Example 1:

Input: nums = [1,2,2,4]
Output: [2,3]

Actually, it's easy to solve this problem. Firstly, traverse over the whole nums array and use HashMap to store the number of times each element of the array. After this, we can consider every number from 1 to n, and check for its presence in map.

But here's a problem. This solution requires a HashMap that means the space complexity is O(n). We check the condition again. Consider the numbers from 1 to n, which happens to be one duplicate element and one missing element. There must be something strange about things going wrong.

We must traverse over the whole nums array of size n for each of the numbers from 1 to n. That means the time complexity is O(n). So we can think how to save the space used to reduce the space complexity to O(1).

Analysis

The characteristic of this problem is that each element has a certain correspondence with the array index.

Let's change the condition of the problem temporarily. Change the elements in nums array to [0..N-1]. Therefore, each element corresponds exactly to an array index, which is easy to understand.

We assume that there are no duplicate or missing elements in the array. Therefore, each element corresponds to a unique index value.

But the question now is one number is repeated that results which results in loss of another number. What would happen? This will result in two elements corresponding to the same index, and there will be an index with no elements to correspond.

If we can somehow find the duplicate corresponding index, which means we find the duplicate element. Then find the index that no element to correspond that also means we find the missing element.

So, how do you determine how many elements of an index correspond to without using extra space? Here is the subtlety of the question.

By turning the element corresponding to each index into a negative number, it indicates that this index has been mapped once.

If we find a duplicate element 4, the intuitive result is that the element corresponding to index 4is already negative.

For the missing element 3, the intuitive result is that the element corresponding to index 3is positive.

Therefore, we can code as follows:

vector<int> findErrorNums(vector<int>& nums) {
    int n = nums.size();
    int dup = -1;
    for (int i = 0; i < n; i++) {
        int index = abs(nums[i]);
        // nums[index] < 0  means find the duplicate element
        if (nums[index] < 0)
            dup = abs(nums[i]);
        else
            nums[index] *= -1;
    }

    int missing = -1;
    for (int i = 0; i < n; i++)
        // nums[i] > 0 means find the missing element
        if (nums[i] > 0)
            missing = i;

    return {dup, missing};
}

Now, the question is basically solved. But don't forget that we have just assumed that the elements in nums array is from 0 to N-1. Actually, it should be 1 to N. So we need to modify two places to get the right answer to the original question.

vector<int> findErrorNums(vector<int>& nums) {
    int n = nums.size();
    int dup = -1;
    for (int i = 0; i < n; i++) {
        // Now, elements  start at 1
        int index = abs(nums[i]) - 1;
        if (nums[index] < 0)
            dup = abs(nums[i]);
        else
            nums[index] *= -1;
    }

    int missing = -1;
    for (int i = 0; i < n; i++)
        if (nums[i] > 0)
            // Convert index to element
            missing = i + 1;

    return {dup, missing};
}

In fact, it makes sense for elements to start from 1, and it must start with a non-zero number. If the element starts from 0, the opposite number of 0 is still itself. So when the number 0 is repeated or missing, we can't deal with this situation. Our previous assumption was just to simplify the problem and make it easier to understand.

Summary

The key point is that elements and indexes appear in pairs for this kind of problems. Common methods include Sorting, XOR, and Map

The idea of Map is the above analysis. Mapping each index and element, and recording whether an element is mapped with a sign.

The Sorting method is also easy to understand. For this problem, we can assume that if all elements are sorted from smallest to largest. If we find that the corresponding elements of the index didn't match, so we find duplicate and missing elements.

We can stop by now.

XOR operation is also commonly used. The XOR operation (^) has a special property: the result of a number XOR itself is 0, and the result of a number with 0 is itself. For instance: a ^ a = 0, a ^ 0 = a. If we take XOR of the index and element at the same time, the paired index and element can be eliminated, and the remaining are duplicate or missing elements. You can look at the previous article which introduce this method.

bryceustc
labuladong
How to Find Missing Elements
Find Missing Elements