2052. Minimum Cost to Separate Sentence Into Rows


Problem Description

In this problem, you are given a sentence which is a string of words separated by spaces. Additionally, you are given an integer k that represents the maximum number of characters allowed in each row. The task is to break the sentence into rows where each row has at most k characters.

The constraints you need to consider are:

  • A word cannot be split across two rows.
  • Each word must be used exactly once and in the same order as it appears in the given sentence.
  • Rows should not begin or end with spaces.
  • Between two words in a row, there must be exactly one space.
  • The 'cost' of a row is calculated as (k - n)^2, where n is the length of the row. The total cost is the sum of the individual costs for all rows, except the last one, which isn't included in the total cost calculation.

Your goal is to find the arrangement of words into rows that results in the minimum possible total cost.

Intuition

The intuition behind solving this problem is to use dynamic programming to explore all possible ways to break the sentence into rows. This process can be visualized as a decision tree where each node represents a state in which certain words have been grouped into a row, and each branch is a decision to start a new row with the next word. The solution involves recursively computing the cost of each decision and then combining those costs to find the minimum cost arrangement.

Dynamic programming is appropriate here because:

  1. The problem has an optimal substructure - the solution to the problem can be composed of the solutions to its subproblems. For example, the minimum cost for the entire sentence depends on the minimum cost for the remainder of the sentence after forming each possible row.
  2. It has overlapping subproblems - as we compute costs for different partitions of the sentence, we will encounter the same remaining sentence multiple times.

The recursive function dfs starts at the beginning of the sentence and makes decisions to split the sentence at various points such that the length of the row is at most k. It keeps track of the minimum cost found so far for a given partition. Through memoization (using the @cache decorator), we remember the results of subproblems, which saves us from recalculating the same state multiple times, thus reducing the time complexity of the algorithm.

The solution approach is as follows:

  • First, convert the sentence into a list of word lengths (t) and use accumulate to create an array s that holds the accumulated sum of word lengths at each index, which helps in quickly checking the length of a range of words.
  • The dfs function then recursively explores different points where the sentence can be split, calculating the cost of creating a new row.
  • The base case of the recursive function is when there are enough characters left to fit the rest of the sentence into one row.
  • If the base case condition is not met, the function iteratively tries fitting more words into the current row and calls itself to compute the minimum cost for the rest of the sentence from that point onwards.
  • The aim is to find the minimum cost across all possible splits.
  • The result of the DFS will be the minimum total cost for separating the sentence into rows, excluding the cost of the last row (as per the problem statement).

The provided solution uses the depth-first search (DFS) algorithm and is implemented efficiently thanks to Python's memoization capabilities.

Learn more about Dynamic Programming patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece

What is the worst case running time for finding an element in a binary tree (not necessarily binary search tree) of size n?

Solution Approach

The provided Python solution employs a top-down dynamic programming approach, also known as memoization, to solve the problem. Let's go through the different components of the solution:

Data Structures:

  • An array t is used to store the length of each word from the sentence.
  • An array s is created using the accumulate function from the itertools module, which represents the accumulated sum of lengths of words up to a particular index. This helps in efficiently computing the length of any sequence of words in constant time.

Dynamic Programming Function dfs (Depth First Search):

  • The dfs function is the heart of the solution. It takes an index i as its argument, which indicates the starting point for the current row within the list of words.
  • It returns the minimum cost of fitting the remaining words into rows, starting from index i (inclusive).
  • The function is decorated with @cache, which automatically remembers the result of each call with particular arguments (i in this case) so that it doesn't need to be computed again.

Base Case:

  • The base case for the recursive dfs function is when all remaining words from index i can fit into one row, in which case the cost is 0 because the last row's cost is not included in the total.

Recursive Case:

  • For other cases, we initialize a variable, ans to infinity (inf), representing the minimum cost found so far.
  • We iterate through the words starting from index i+1 with a variable j. In this loop, we do the following:
    • Compute the current length t of the sequence of words from i to j-1 (inclusive) as s[j] - s[i] + j - i - 1. The + j - i - 1 accounts for the spaces between the words.
    • If this length t does not exceed k, we compute the cost of this row as (k - t) ** 2 and add it to the result of the recursive call to dfs(j) (which represents the minimum cost for the rest of the sentence starting from j).
    • Update ans with the minimum of its current value and the cost calculated above.
    • Increment j to include the next word in the current row or to start a new row from the next word.

Once the loop finishes, ans will hold the minimum cost found for breaking the sentence into rows from index i.

Triggering the Dynamic Programming:

  • We call dfs(0) to start the process from the beginning of the sentence. The return value is the minimum total cost for breaking the sentence into rows as per the problem's constraint.

In summary, the algorithm cleverly combines memoization and the intuitive approach of trying every possible break point to ensure all scenarios are covered, yet avoids redundant computation by caching the costs of subproblems already solved.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which of the following shows the order of node visit in a Breadth-first Search?

Example Walkthrough

Let's illustrate the solution approach using a small example. Consider the sentence "LeetCode is great" and let's say the maximum number of characters allowed per row, k, is 10.

First, we prepare our data structures:

  • The word lengths are t = [8, 2, 5].
  • The accumulated sum array is s = [8, 11, 17] using accumulate.

Now, we invoke dfs(0) to begin our dynamic programming search:

  • At i = 0, the sentence's first word "LeetCode" is 8 characters, which fits within the 10 character limit.
  • We can't add the next word "is" to this row because it would exceed our limit (8 (LeetCode) + 1 (space) + 2 (is) = 11 > k).
  • Therefore, we split after "LeetCode", yielding a cost of (10 - 8)^2 = 4 for this row, and recursively invoke dfs(1) for the remaining sentence "is great".

Now, dfs(1) takes over:

  • Starting at i = 1, we have a remaining length of 7 characters for "is great", which fits into a row, so we consider the next word.
  • Since "great" is the last word, we terminate here. The cost for this penultimate row is (10 - 7)^2 = 9.
  • However, since "great" forms the last row, we do not include this cost in our total, and the function returns 0 for the cost at this index.

For our initial call dfs(0), the minimum cost would be equal to the cost of the first row plus the cost returned from the call to dfs(1): 4 + 0 = 4.

Therefore, our rows are "LeetCode" and "is great" with a minimum total cost of 4, with the split chosen such that the cost for the penultimate row is minimal, and the algorithm efficiently finds this by memoization to avoid repetitive calculations.

Solution Implementation

1from functools import lru_cache
2from itertools import accumulate
3from math import inf
4
5class Solution:
6    def minimumCost(self, sentence: str, k: int) -> int:
7        # Define a recursive function to check minimum cost with memoization (caching)
8        @lru_cache(maxsize=None)  # This decorator enables caching of results
9        def min_cost_dfs(start_index):
10            # Base Case: If the entire remaining string fits in one line, the cost is 0
11            if total_lengths[-1] - total_lengths[start_index] + num_words - start_index - 1 <= k:
12                return 0
13          
14            # Initialize the minimum cost to infinity
15            minimum_cost = inf
16          
17            # Start with the next word
18            next_word_index = start_index + 1
19          
20            # Try placing each next word if it fits in the current line and calculate its cost
21            while next_word_index < num_words and \
22                  (current_line_length := total_lengths[next_word_index] - total_lengths[start_index] + next_word_index - start_index - 1) <= k:
23                cost = (k - current_line_length) ** 2 + min_cost_dfs(next_word_index)
24                minimum_cost = min(minimum_cost, cost)  # Select the minimum cost
25                next_word_index += 1
26              
27            return minimum_cost
28
29        # Split the sentence into words and calculate their lengths
30        words = sentence.split()
31        word_lengths = [len(word) for word in words]
32        num_words = len(words)
33      
34        # Create a prefix sum array of word lengths to calculate line lengths efficiently
35        total_lengths = list(accumulate(word_lengths, initial=0))
36      
37        # Start the recursive calculation from the first word
38        return min_cost_dfs(0)
39
40# Example usage:
41# sol = Solution()
42# print(sol.minimumCost("The quick brown fox jumps over the lazy dog", 10))
43
1import java.util.Arrays;
2
3public class Solution {
4    private static final int INFINITY = Integer.MAX_VALUE;
5    private int[] memoization;
6    private int[] prefixSumLengths;
7    private int totalWords;
8
9    // Calculates the minimum cost of formatting a sentence into lines with width <= k
10    public int minimumCost(String sentence, int k) {
11        // Split the sentence into words
12        String[] words = sentence.split(" ");
13        totalWords = words.length;
14        prefixSumLengths = new int[totalWords + 1];
15
16        // Compute the prefix sum of word lengths to enable O(1) length calculation later
17        for (int i = 0; i < totalWords; ++i) {
18            prefixSumLengths[i + 1] = prefixSumLengths[i] + words[i].length();
19        }
20
21        // Initialize memoization array with INFINITY to indicate uncomputed subproblems
22        memoization = new int[totalWords];
23        Arrays.fill(memoization, INFINITY);
24
25        // Start the Depth-First Search from the first word
26        return depthFirstSearch(0, k);
27    }
28
29    // Recursive function with memoization to minimize cost
30    private int depthFirstSearch(int index, int k) {
31        // If the cost for the current index is already computed, return it
32        if (memoization[index] != INFINITY) {
33            return memoization[index];
34        }
35
36        // If all remaining words fit within a single line, there is no cost
37        if (prefixSumLengths[totalWords] - prefixSumLengths[index] + totalWords - index - 1 <= k) {
38            memoization[index] = 0;
39            return 0;
40        }
41
42        int minCost = INFINITY;
43        // Try placing each following word on the same line and calculate the cost
44        for (int j = index + 1; j < totalWords; ++j) {
45            int totalLengthWithSpaces = prefixSumLengths[j] - prefixSumLengths[index] + j - index - 1;
46            // If the total length fits within the given width `k`
47            if (totalLengthWithSpaces <= k) {
48                // Calculate and update the minimum cost
49                minCost = Math.min(
50                    minCost,
51                    (k - totalLengthWithSpaces) * (k - totalLengthWithSpaces) + depthFirstSearch(j, k)
52                );
53            }
54        }
55
56        // Memoize the result for the current index
57        memoization[index] = minCost;
58        return minCost;
59    }
60}
61
1#include <vector>
2#include <string>
3#include <sstream>
4#include <climits>
5
6class Solution {
7public:
8    int n; // The number of words in the sentence
9
10    // This function calculates the minimum cost to split the sentence into lines with a width limit of k
11    int minimumCost(std::string sentence, int k) {
12        std::istringstream iss(sentence);
13        std::vector<std::string> words;
14        std::string word;
15
16        // Split the sentence into words
17        while (iss >> word) {
18            words.push_back(word);
19        }
20      
21        n = words.size();
22
23        // Prefix sum array to hold the length of the words so far
24        std::vector<int> prefixSum(n + 1, 0);
25        for (int i = 0; i < n; ++i) {
26            prefixSum[i + 1] = prefixSum[i] + words[i].size();
27        }
28      
29        // Memoization array to keep track of computed costs
30        std::vector<int> memo(n, INT_MAX);
31
32        // Start the DFS from the first word
33        return dfs(0, k, prefixSum, memo);
34    }
35
36    // Helper DFS function to compute the minimum cost
37    int dfs(int index, int k, std::vector<int>& prefixSum, std::vector<int>& memo) {
38        // If the cost for the current index is already computed, return it from the memo array
39        if (memo[index] != INT_MAX) return memo[index];
40
41        // Base case: if the remaining words can fit into the last line, no extra cost is incurred
42        if (prefixSum[n] - prefixSum[index] + n - index - 1 <= k) {
43            memo[index] = 0;
44            return 0;
45        }
46
47        int ans = INT_MAX;
48
49        // Try splitting after every possible word and calculate the minimum cost
50        for (int j = index + 1; j < n; ++j) {
51            int lineWidth = prefixSum[j] - prefixSum[index] + j - index - 1;
52            if (lineWidth <= k) {
53                int cost = (k - lineWidth) * (k - lineWidth); // The cost of the current line
54                // Recursively calculate the cost for the remaining words and update the answer
55                ans = std::min(ans, cost + dfs(j, k, prefixSum, memo));
56            }
57        }
58      
59        // Save the computed minimum cost in the memo array
60        memo[index] = ans;
61      
62        return ans;
63    }
64};
65
1// Importing required modules for readline and streams
2import * as readline from 'readline';
3import { Readable } from 'stream';
4
5// Declare necessary variables
6let n: number; // The number of words in the sentence
7
8// Function to calculate the minimum cost to split the sentence into lines with a width limit of k
9function minimumCost(sentence: string, k: number): number {
10    // Creating a readable stream from a string
11    const stream = Readable.from(sentence);
12    const rl = readline.createInterface({ input: stream });
13
14    // Array to hold words of the sentence
15    const words: string[] = [];
16
17    // Promise to handle the asynchronous nature of readline
18    const wordsPromise = new Promise<void>((resolve) => {
19        // Read the sentence and split into words
20        rl.on('line', (line) => {
21            words.push(...line.split(' '));
22        }).on('close', () => {
23            resolve();
24        });
25    });
26
27    return wordsPromise.then(() => {
28        n = words.length;
29
30        // Prefix sum array to hold the length of words so far
31        const prefixSum: number[] = Array(n + 1).fill(0);
32        for (let i = 0; i < n; ++i) {
33            prefixSum[i + 1] = prefixSum[i] + words[i].length;
34        }
35
36        // Memoization array to keep track of computed costs
37        const memo: number[] = Array(n).fill(Number.MAX_SAFE_INTEGER);
38
39        // Start the DFS from the first word
40        return dfs(0, k, prefixSum, memo);
41    });
42}
43
44// Helper DFS function to compute the minimum cost
45function dfs(index: number, k: number, prefixSum: number[], memo: number[]): number {
46    // Check memo array for already computed cost at the current index
47    if (memo[index] !== Number.MAX_SAFE_INTEGER) return memo[index];
48
49    // If remaining words fit into the last line, no extra cost is incurred
50    if (prefixSum[n] - prefixSum[index] + n - index - 1 <= k) {
51        memo[index] = 0;
52        return 0;
53    }
54
55    let ans = Number.MAX_SAFE_INTEGER;
56
57    // Compute the cost for each split point
58    for (let j = index + 1; j < n; ++j) {
59        const lineWidth = prefixSum[j] - prefixSum[index] + j - index - 1;
60        if (lineWidth <= k) {
61            // The cost for the current line based on remaining space
62            const cost = (k - lineWidth) ** 2;
63            // Recurse for the remaining sentence and update the minimum answer
64            ans = Math.min(ans, cost + dfs(j, k, prefixSum, memo));
65        }
66    }
67
68    // Cache computed cost in memo array
69    memo[index] = ans;
70
71    return ans;
72}
73
Not Sure What to Study? Take the 2-min Quiz

How many times is a tree node visited in a depth first search?

Time and Space Complexity

Time Complexity: O(n * k)

The solution uses dynamic programming with memoization (indicated by the use of the @cache decorator). The function dfs(i) is recursively called while iterating through each word from the index i to the end of the sentence. In the worst case, for every word, it checks all other words up to the maximum allowed length k. Therefore, the time complexity is O(n * k) where n is the number of words.

Space Complexity: O(n)

The space complexity comes from two main contributions: the memoization of the dfs function and the creation of the prefix sum array s. Since the dfs function will compute a result for each starting index i, the cache will at most store n results. Additionally, the prefix sum array s is of size n+1. Thus, the overall space complexity is O(n) due to these factors.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which of the following shows the order of node visit in a Breadth-first Search?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.

←
↑TA đŸ‘šâ€đŸ«