1722. Minimize Hamming Distance After Swap Operations


Problem Description

In this problem, we have two arrays named source and target, both of the same length n. Additionally, we are given an array allowedSwaps containing pairs of indices [ai, bi]. These pairs indicate that we can swap the elements at these particular indices in the source array as many times as we want.

The task is to find the minimum Hamming distance between the source and target arrays after performing any number of allowed swaps. The Hamming distance is defined as the count of positions where the elements of the two arrays differ.

Thus, the main objective is to minimize the differences between the two arrays using the swaps permitted by the allowedSwaps array.

Intuition

To minimize the Hamming distance, we want to arrange the elements in source such that as many of them as possible match with the corresponding elements in target.

The intuition behind the solution is to consider that each swap allows us to group certain elements together. If we think of these swaps as connections in a graph, we can use the Union-Find (Disjoint Set Union) algorithm to find the groups of indices that can be considered one interchangeable set. Within these sets, we can rearrange the elements freely.

Once we form the groups, we check for each element in the target array: if the corresponding group in source contains that element, we reduce the count for that element (as we can match it with the target). If it's not present, we increment the result, signifying a Hamming distance increase.

This approach efficiently calculates the minimum Hamming distance by determining and utilizing the flexibility provided by allowedSwaps.

Learn more about Depth-First Search and Union Find patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Breadth first search can be used to find the shortest path between two nodes in a directed graph.

Solution Approach

The solution utilizes the Union-Find data structure, also known as the Disjoint Set Union (DSU) algorithm. Union-Find helps in managing the grouping and merging of index sets that can be interchanged freely due to the allowed swaps.

Here are the key steps and elements of the solution:

  1. Initialization: We start by initializing a parent array p, where each index is initially the parent of itself. This represents that initially, each index can only be swapped within its own singleton set.

  2. Union Operation: We iterate through each swap pair in allowedSwaps and perform a union operation. The union is done by finding the parent of both elements in the swap pair and making one the parent of the other. This effectively merges the sets, allowing all indices in the merged set to be interchangeable.

    The find function is used to recursively find the root parent of an index, implementing path compression by assigning the root parent directly to each index along the path. This process reduces the complexity of subsequent find operations.

  3. Grouping Elements of source: We use a dictionary mp that maps each set (identified by its parent index) to a Counter that tracks the frequency of each element in that set within the source array.

  4. Calculating the Hamming Distance: We iterate over each element in the target array and look for the set in source that corresponds to its index. If the element in target exists in this set (the set's Counter has a count greater than zero), we reduce the count for that element and do not increment the Hamming distance. If it does not exist, we increment the result as it represents a difference between source and target.

  5. Returning the Result: After checking all elements, the res variable holds the minimum Hamming distance, which we return as the final output.

The Union-Find algorithm is crucial here, as it allows us to efficiently determine which elements in source can be swapped to match elements in target. The use of the Counter within the groups formed by Union-Find enables us to track and match elements between source and target, minimizing the Hamming distance. The end result is the minimum number of positions where source and target arrays differ after performing the swaps.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which of the following shows the order of node visit in a Breadth-first Search?

Example Walkthrough

Let's illustrate the solution approach with a given example:

  • source = [1, 2, 3, 4]
  • target = [2, 1, 4, 5]
  • allowedSwaps = [[0, 1], [2, 3]]

We want to find the minimum Hamming distance between source and target after performing any number of allowed swaps.

  1. Initialization: Create a parent array p of the same length as source. Each index starts with itself as its parent: p = [0, 1, 2, 3] - This implies each index is initially in its own set.

  2. Union Operation:

    • For the first pair in allowedSwaps ([0, 1]), we perform a union operation. We find the parents for 0 and 1, which are themselves. Then, we make one of them the parent of the other. Assume we make 1 the parent of 0: p = [1, 1, 2, 3].
    • For the second pair ([2, 3]), we similarly merge these two indices' sets. 2 becomes the parent of 3: p = [1, 1, 2, 2].
  3. Grouping Elements of source: We create a dictionary mp which will hold Counter objects representing the frequency of each element in source, grouped by their parent index after all unions: mp = {1: Counter({1: 1, 2: 1}), 2: Counter({3: 1, 4: 1})}

  4. Calculating the Hamming Distance:

    • We start with res = 0. For every element in target, we find its corresponding set in source using p. For example:
      • target[0] = 2 should look in the set at p[0] which is 1. Since 2 is present there, we match source[0] = 1 with target[0] = 2 without incrementing res.
      • target[1] = 1 looks in set p[1] = 1. The element 1 is present there, so no increment to res.
      • target[2] = 4 refers to the set at p[2] = 2. The element 4 is present, so no change to res.
      • target[3] = 5 goes to the set at p[3] = 2, but 5 is not in Counter({3: 1, 4: 1}). So, res increments by 1.
  5. Returning the Result: We've determined the Hamming distance by iterating through all the elements in target. As only one element, 5, couldn't be matched after performing allowed swaps, our final Hamming distance is res = 1.

After applying the Union-Find algorithm and efficiently tracking elements in each group, we conclude that the minimum number of differing positions between source and target is 1, which we return as the final result.

Solution Implementation

1from collections import defaultdict, Counter
2from typing import List
3
4class Solution:
5    def minimumHammingDistance(self, source: List[int], target: List[int], allowedSwaps: List[List[int]]) -> int:
6        # Get the length of source which is the same as the length of the target
7        n = len(source)
8      
9        # Initialize parent array for disjoint set (union-find)
10        parent = list(range(n))
11
12        # Function to find the representative (root) of a set for element x
13        def find(x):
14            if parent[x] != x:
15                parent[x] = find(parent[x])  # Path compression
16            return parent[x]
17
18        # Perform union operations for each allowed swap
19        for i, j in allowedSwaps:
20            # Union by setting the parent of the representative of i to the representative of j
21            parent[find(i)] = find(j)
22
23        # Map to store counts of elements grouped by their disjoint set representative
24        disjoint_set_counter = defaultdict(Counter)
25        # Populate the counts of elements for each representative
26        for i in range(n):
27            disjoint_set_counter[find(i)][source[i]] += 1
28
29        # Variable to store the result of minimum Hamming distance
30        result = 0
31        # Calculate the Hamming distance
32        for i in range(n):
33            # If the element in the target is in the disjoint set and count is greater than 0
34            if disjoint_set_counter[find(i)][target[i]] > 0:
35                # Decrement the count of the target element in the disjoint set
36                disjoint_set_counter[find(i)][target[i]] -= 1
37            else:
38                # If the element is not found in the set, increment result
39                result += 1
40
41        # Return the final result of minimum Hamming distance
42        return result
43
1class Solution {
2    // Parent array representing the root of each element's set.
3    private int[] parent;
4
5    // This function calculates the minimum Hamming distance between source and target arrays.
6    public int minimumHammingDistance(int[] source, int[] target, int[][] allowedSwaps) {
7        int n = source.length;
8        parent = new int[n];
9      
10        // Initialize parent array so that each element is its own root.
11        for (int i = 0; i < n; i++) {
12            parent[i] = i;
13        }
14      
15        // Apply union operation for allowed swaps.
16        for (int[] swap : allowedSwaps) {
17            int rootA = find(swap[0]);
18            int rootB = find(swap[1]);
19            parent[rootA] = rootB;
20        }
21      
22        // Create a mapping from root to a frequency map of the elements associated with that root.
23        Map<Integer, Map<Integer, Integer>> frequencyMap = new HashMap<>();
24        for (int i = 0; i < n; i++) {
25            int root = find(i);
26            Map<Integer, Integer> rootFrequencyMap = frequencyMap.computeIfAbsent(root, k -> new HashMap<>());
27            int elementCount = rootFrequencyMap.getOrDefault(source[i], 0);
28            rootFrequencyMap.put(source[i], elementCount + 1);
29        }
30      
31        // Calculate Hamming distance after applying allowed swaps.
32        int minDistance = 0;
33        for (int i = 0; i < n; i++) {
34            int root = find(i);
35            Map<Integer, Integer> rootFrequencyMap = frequencyMap.get(root);
36            if (rootFrequencyMap.getOrDefault(target[i], 0) > 0) {
37                // If the element in target exists in the frequency map, decrement its count.
38                rootFrequencyMap.put(target[i], rootFrequencyMap.get(target[i]) - 1);
39            } else {
40                // Otherwise, increment the Hamming distance.
41                minDistance++;
42            }
43        }
44      
45        return minDistance; // Return the minimum Hamming distance.
46    }
47
48    // The 'find' function implements union-find algorithm to find the root of an element.
49    private int find(int x) {
50        if (parent[x] != x) {
51            parent[x] = find(parent[x]); // Path compression.
52        }
53        return parent[x];
54    }
55}
56
1#include <vector>
2#include <unordered_map>
3
4class Solution {
5public:
6    std::vector<int> parent; // The 'p' vector is renamed to 'parent' for better readability.
7
8    // Helper function to find the root of an element using path compression.
9    int find(int x) {
10        if (parent[x] != x) {
11            parent[x] = find(parent[x]);
12        }
13        return parent[x];
14    }
15
16    // Main function to calculate the minimum Hamming distance after allowed swaps.
17    int minimumHammingDistance(std::vector<int>& source, std::vector<int>& target, std::vector<std::vector<int>>& allowedSwaps) {
18        int n = source.size();
19        parent.resize(n);
20        // Initialize each element's root to be the element itself.
21        for (int i = 0; i < n; ++i) {
22            parent[i] = i;
23        }
24        // Apply union operations for each allowed swap.
25        for (auto& e : allowedSwaps) {
26            parent[find(e[0])] = find(e[1]);
27        }
28      
29        // Use a map to count occurrences of elements within each connected component.
30        std::unordered_map<int, std::unordered_map<int, int>> countMap;
31        for (int i = 0; i < n; ++i) {
32            ++countMap[find(i)][source[i]];
33        }
34      
35        int result = 0; // This will hold the final result, the minimum Hamming distance.
36        // Compare the target array against the count map to calculate the Hamming distance.
37        for (int i = 0; i < n; ++i) {
38            if (countMap[find(i)][target[i]] > 0) {
39                --countMap[find(i)][target[i]];
40            } else {
41                ++result;
42            }
43        }
44        return result; // Return the minimum Hamming distance calculated.
45    }
46};
47
1// Import statements are not required in TypeScript for vector-like structures; 
2// instead, we use Array and TypeScript's included Map object.
3
4// Global variable 'parent' to track the root of each element.
5let parent: number[] = [];
6
7// Helper function to find the root of an element using path compression.
8const find = (x: number): number => {
9    if (parent[x] !== x) {
10        parent[x] = find(parent[x]);
11    }
12    return parent[x];
13};
14
15// Main function to calculate the minimum Hamming distance after allowed swaps.
16const minimumHammingDistance = (source: number[], target: number[], allowedSwaps: number[][]): number => {
17    const n = source.length;
18    parent = Array.from({ length: n }, (_, index) => index); // Initialize each element's root to be the element itself.
19
20    // Apply union operations for each allowed swap.
21    allowedSwaps.forEach(([a, b]) => {
22        parent[find(a)] = find(b);
23    });
24
25    // Use a map to count occurrences of elements within each connected component.
26    const countMap: Map<number, Map<number, number>> = new Map();
27    source.forEach((value, index) => {
28        const root = find(index);
29        if (!countMap.has(root)) {
30            countMap.set(root, new Map());
31        }
32        const componentCountMap = countMap.get(root);
33        if (componentCountMap) {
34            componentCountMap.set(value, (componentCountMap.get(value) || 0) + 1);
35        }
36    });
37
38    let result = 0; // This will hold the final result, the minimum Hamming distance.
39    // Compare the target array against the count map to calculate the Hamming distance.
40    target.forEach((value, index) => {
41        const root = find(index);
42        const componentCountMap = countMap.get(root);
43        if (componentCountMap) {
44            if (componentCountMap.get(value)) {
45                componentCountMap.set(value, componentCountMap.get(value)! - 1);
46                // If an item is present in both target and source within the same component, decrement the count.
47                if (componentCountMap.get(value) === 0) {
48                    componentCountMap.delete(value); // Remove the entry if count becomes zero to save space.
49                }
50            } else {
51                result++; // Increment the result if the item was not found.
52            }
53        }
54    });
55
56    return result; // Return the minimum Hamming distance calculated.
57};
58
Not Sure What to Study? Take the 2-min Quiz:

Depth first search can be used to find whether two components in a graph are connected.

Time and Space Complexity

The code snippet provided is a solution for computing the minimum Hamming Distance by allowing swaps between indices as dictated by the allowedSwaps list. Let's analyze the time and space complexity of the given code.

Time Complexity

The time complexity of the code can be broken down as follows:

  1. The find function essentially implements the path compression technique in union-find and it runs in amortized O(1) time.

  2. The loop over allowedSwaps to unite indices using the union-find data structure performs O(m) operations, where m is the length of allowedSwaps. Since the find function has an amortized complexity of O(1) for each operation, this step will have a time complexity of O(m).

  3. Building the mp dictionary, which maps the root parent from the union-find structure to a counter of elements, involves iterating over each element in source. For each element, we call find, which is O(1) amortized, so the loop results in a time complexity of O(n).

  4. The final for-loop iterates over target and checks if the corresponding element exists in the counter associated with its group's root parent. This is once again O(n) time because it involves n lookups and updates to the dictionary.

Combining these steps, the overall time complexity is O(m + 2n) which simplifies to O(m + n).

Space Complexity

The space complexity of the code considers the following factors:

  1. The disjoint set data structure (union-find) represented by the list p which uses O(n) space.

  2. The mp dictionary containing at most n keys (in the worst case, no elements are swapped) and their associated counters, which in total will not exceed n elements. Therefore, mp utilizes O(n) space.

  3. There is a negligible space used for variables like i, j, and res, which do not scale with the input.

Thus, the space complexity of the algorithm is also O(n).

In summary:

  • Time Complexity: O(m + n)
  • Space Complexity: O(n)

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

What are the most two important steps in writing a depth first search function? (Select 2)


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫