1055. Shortest Way to Form String


Problem Description

The problem involves finding how to form a given target string by concatenating subsequences of a source string. A subsequence can be obtained by deleting zero or more characters from a string without changing the order of the remaining characters. We need to determine the minimum number of these subsequences from source needed to create the target string. If it's not possible to form the target from subsequences of source, the function should return -1.

To visualize the problem, think about how you can create the word "cat" using letters from the word "concentrate". You can select c, omit o, pick a, skip n, c, e, n, and then pick t and skip r, a, t, e. That forms one subsequence "cat". Moreover, if you had a target like "catat", you'd need two subsequences from "concentrate" to form it—"cat" and "at".

Intuition

The intuition for the solution is built on the idea that we can iterate over the source and target strings simultaneously, matching characters from target and moving through source. We do this in a loop that continues until we've either created the target string or determined it's impossible.

For each iteration (which corresponds to constructing a single subsequence), we do the following:

  1. Start from the beginning of the source string and look for a match for the current character in target.
  2. Each time we find a match, we move to the next character in target but continue searching from the current position in source.
  3. If we reach the end of the source without finding the next character in target, we start over from the beginning of source and increment our subsequence count.
  4. If when restarting the source string we don't make any progress in target (meaning we didn't find even the next character in the target), we conclude that the target cannot be formed and return -1.

The concept is similar to using multiple copies of the source string side by side, and crossing out characters as we match them to target. Whenever we reach the end of a source string copy and still have characters left to match in target, we move on to the next copy of source, symbolizing this with an increment in our subsequence counter. This continues until we've matched the entire target string or have verified that it's impossible.

Learn more about Greedy and Two Pointers patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

How does merge sort divide the problem into subproblems?

Solution Approach

The solution uses a two-pointer technique to iterate through both the source and target strings. One pointer (i) traverses the source string, while the other pointer (j) iterates over the target string. Here's a step-by-step breakdown of the key components of the algorithm:

  1. Function f(i, j): This is a helper function that takes two indices, i for the source and j for the target. The purpose of f is to try to match as many characters of target starting from index j with the source starting from index i until we reach the end of source. The function runs a while loop until either i or j reaches the end of their respective strings. Inside this loop:

    • If the characters at source[i] and target[j] match, increment j to check for the next character in target.
    • Whether or not there is a match, always increment i because we can skip characters in source.
    • The function returns the updated index j after traversing through source.
  2. Main Algorithm: Once we have our helper function, the main algorithm proceeds as follows:

    • We initialize two variables, m and n as the lengths of source and target respectively, and ans and j to keep track of the number of subsequences needed and the current index in target.
    • We use a while loop that continues as long as j < n, meaning there are still characters in target that have not been matched.
    • Inside the while loop, we call our helper function f(0, j) which tries to match target starting from the current j index with source starting from 0. If the returned index k is the same as j, it means no further characters from target could be matched and we return -1 as it's impossible to form target.
    • If k is different from j, this means we've managed to match some part of target, and we update j to k and increment ans to signify the creation of another subsequence from source.
    • The process repeats until all characters of target are matched.
  3. Return Value: The loop ends with two possibilities; either we were able to form target successfully, hence we return the ans which is the count of subsequences needed, or we determined that target cannot be formed from source and returned -1.

Complexity Analysis

  • Time Complexity: O(m * n), where m is the length of source and n is the length of target. In the worst case, we iterate through the entire source for every character in target.
  • Space Complexity: O(1), we only use a fixed amount of extra space for the pointers and the ans variable regardless of the input size.

By thoroughly understanding the definition of a subsequence and carefully managing the iteration through both strings, this solution efficiently determines the minimum number of subsequences of source required to form target or establishes that it's not possible.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

What is the worst case running time for finding an element in a binary search tree(not necessarily balanced) of size n?

Example Walkthrough

Let's walk through a small example to illustrate the solution approach.

Example:

  • source: "abcab"
  • target: "abccba"

Walkthrough:

  1. Initialize the count of subsequences (ans) needed to 0 and the index j in the target to 0.

  2. Since j < n (where n is the length of target), start the iteration and call the helper function f with f(0, 0).

  3. Inside f(0, 0), iterate over source and target. For each character in source, check if it matches the current target[j].

    • For source[0] = 'a' and target[0] = 'a', there's a match, increment j to 1 (next character in target).
    • Continue to source[1] = 'b', which matches target[1] = 'b', increment j to 2.
    • Continue to source[2] = 'c', which matches target[2] = 'c', increment j to 3.
    • The character source[3] = 'a' does not match target[3] = 'c', so just increment i.
    • The character source[4] = 'b' does not match target[3] either, increment i again and now i reaches the end of source.
  4. The f function returns j which is now 3. Since j has increased from 0 to 3, one subsequence "abc" has been matched from source.

    • Increment ans to 1 and start matching the next subsequence with f(0, 3).
  5. In the second call to f(0, 3), we iterate from the start of source again:

    • Skip source[0] = 'a', since it doesn't match target[3] = 'c'.
    • Skip source[1] = 'b', since it doesn't match target[3] either.
    • source[2] matches target[3], so increment j to 4.
    • source[3] matches target[4], increment j to 5.
    • No more characters in source match target[5] = 'a', but once we reach the end of source, f returns j which is now 5.
  6. Increment ans to 2 and start matching the last character with f(0, 5).

  7. In the third call to f(0, 5), the first character source[0] = 'a' matches the last character target[5] = 'a'. The j is incremented to 6, which is the length of target, so the entire target string has been matched.

  8. Increment ans to 3 which is our final answer.

Return:

The function would return 3 as it takes three subsequences of source to form the target string "abccba".

This example collapses the entire iteration into a concise explanation, demonstrating how the algorithm works in practice and matches subsequences in the source to form the target string.

Solution Implementation

1class Solution:
2    def shortestWay(self, source: str, target: str) -> int:
3        # Helper function to find the first unmatched character in 'target'
4        # starting from 'target_index' by iterating through 'source'.
5        def find_unmatched_index(source_index, target_index):
6            # Iterate over both 'source' and 'target' strings.
7            while source_index < len_source and target_index < len_target:
8                # If the current characters match, move to the next character in 'target'.
9                if source[source_index] == target[target_index]:
10                    target_index += 1
11                # Move to the next character in 'source'.
12                source_index += 1
13            # Return the index in 'target' where the characters stop matching.
14            return target_index
15
16        # Initialize the length variables of 'source' and 'target'.
17        len_source, len_target = len(source), len(target)
18      
19        # Initialize 'subsequences_count' to 0 to count the subsequences of 'source' needed.
20        subsequences_count = 0
21      
22        # Initialize 'target_index' to keep track of progress in the 'target' string.
23        target_index = 0
24      
25        # Main loop to iterate until the entire 'target' string is checked.
26        while target_index < len_target:
27            # Find the index of the first unmatched character after 'target_index'.
28            unmatched_index = find_unmatched_index(0, target_index)
29          
30            # Check if 'target_index' did not move forward; if so, 'target' cannot be constructed.
31            if unmatched_index == target_index:
32                return -1
33          
34            # Update 'target_index' to the index of the first unmatched character.
35            target_index = unmatched_index
36          
37            # Increment the count of subsequences used.
38            subsequences_count += 1
39      
40        # Return the total number of subsequences from 'source' needed to form 'target'.
41        return subsequences_count
42
1class Solution {
2    // Method to find the minimum number of subsequences of 'source' which concatenate to form 'target'
3    public int shortestWay(String source, String target) {
4        // 'sourceLength' is the length of 'source', 'targetLength' is the length of 'target'
5        int sourceLength = source.length(), targetLength = target.length();
6        // 'numSubsequences' will track the number of subsequences used
7        int numSubsequences = 0;
8        // 'targetIndex' is used to iterate through the characters of 'target'
9        int targetIndex = 0;
10
11        // Continue until the whole 'target' string is covered
12        while (targetIndex < targetLength) {
13            // 'sourceIndex' is used to iterate through characters of 'source'
14            int sourceIndex = 0;
15            // 'subsequenceFound' flags if a matching character was found in the current subsequence iteration
16            boolean subsequenceFound = false;
17
18            // Loop both 'source' and 'target' strings to find subsequence matches
19            while (sourceIndex < sourceLength && targetIndex < targetLength) {
20                // If the characters match, move to the next character in 'target'
21                if (source.charAt(sourceIndex) == target.charAt(targetIndex)) {
22                    subsequenceFound = true; // A match in the subsequence was found
23                    targetIndex++;
24                }
25                // Always move to the next character in 'source'
26                sourceIndex++;
27            }
28
29            // If no matching subsequence has been found, it's not possible to form 'target'
30            if (!subsequenceFound) {
31                return -1;
32            }
33            // A subsequence that contributes to 'target' was used, so increment the count
34            numSubsequences++;
35        }
36
37        // Return the minimum number of subsequences needed to form 'target'
38        return numSubsequences;
39    }
40}
41
1class Solution {
2public:
3    // Function to find the minimum number of subsequences of 'source' required to form 'target'.
4    int shortestWay(string source, string target) {
5        int sourceLength = source.size(), targetLength = target.size(); // Source and target lengths
6        int subsequencesCount = 0; // Initialize the count of subsequences needed
7        int targetIndex = 0; // Pointer for traversing the target string
8      
9        // Loop until the entire target string is covered
10        while (targetIndex < targetLength) {
11            int sourceIndex = 0; // Reset source pointer for each subsequence iteration
12            bool subsequenceFound = false; // Flag to check if at least one matching character is found in this iteration
13          
14            // Traverse both source and target to find the subsequence
15            while (sourceIndex < sourceLength && targetIndex < targetLength) {
16                // If the characters match, move pointer in target string to find the next character
17                if (source[sourceIndex] == target[targetIndex]) {
18                    subsequenceFound = true;
19                    ++targetIndex;
20                }
21                ++sourceIndex; // Always move to the next character in the source string
22            }
23          
24            // If no matching character was found, it's impossible to form target from source
25            if (!subsequenceFound) {
26                return -1;
27            }
28          
29            ++subsequencesCount; // A new subsequence is found for this iteration
30        }
31      
32        // Return the total count of subsequences required
33        return subsequencesCount;
34    }
35};
36
1// Function to find the minimum number of subsequences of 'source' required to form 'target'.
2function shortestWay(source: string, target: string): number {
3    let sourceLength: number = source.length; // Source length
4    let targetLength: number = target.length; // Target length
5    let subsequencesCount: number = 0; // Initialize the count of subsequences needed
6    let targetIndex: number = 0; // Pointer for traversing the target string
7
8    // Loop until the entire target string is covered
9    while (targetIndex < targetLength) {
10        let sourceIndex: number = 0; // Reset source pointer for each subsequence iteration
11        let subsequenceFound: boolean = false; // Flag to check if at least one matching character is found in this iteration
12
13        // Traverse both source and target to find the subsequence
14        while (sourceIndex < sourceLength && targetIndex < targetLength) {
15            // If the characters match, move pointer in the target string to find the next character
16            if (source.charAt(sourceIndex) === target.charAt(targetIndex)) {
17                subsequenceFound = true;
18                targetIndex++; // Move to the next character in target
19            }
20            sourceIndex++; // Always move to the next character in the source string
21        }
22
23        // If no matching character was found, it's impossible to form target from source
24        if (!subsequenceFound) {
25            return -1;
26        }
27
28        subsequencesCount++; // A new subsequence is found for this iteration
29    }
30
31    // Return the total count of subsequences required
32    return subsequencesCount;
33}
34
Not Sure What to Study? Take the 2-min Quiz:

A person thinks of a number between 1 and 1000. You may ask any number questions to them, provided that the question can be answered with either "yes" or "no".

What is the minimum number of questions you needed to ask so that you are guaranteed to know the number that the person is thinking?

Time and Space Complexity

Time Complexity

The primary function of the algorithm, shortestWay, iterates over the target string while repeatedly scanning the source string to find subsequences that match the target. The function f(i, j) is called for each subsequence found and runs in a while loop that continues until either the end of the source or target string is reached. The worst-case scenario occurs when every character in the source has to be visited for every character in the target.

Given:

  • m is the length of source
  • n is the length of target

The worst-case time complexity can be roughly bounded by O(n * m) since, in the worst case, the substring search could traverse the entire source string for each character in the target string.

Space Complexity

The space complexity of the algorithm is O(1) as it only uses a fixed number of integer variables m, n, ans, j, and k, and does not allocate any additional space proportional to the input size.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which data structure is used to implement priority queue?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫