-
Notifications
You must be signed in to change notification settings - Fork 0
Example: Text Justification
Roberto Fronteddu edited this page May 15, 2025
·
22 revisions
Given a sequence of words, and a limit on the number of characters that can be put in one line (line width). Put line breaks in the given sequence such that the lines are printed neatly.
Example: Tushar Roy likes to code
- Tushar - 6
- Roy - 3
- likes - 5
- to - 2
- code - 4
- First we compute the matrix of cost of spaces where
- the cost is
$(w-len(x[i..j]))^2$ , we use the square because it works better empirically - x[i..j] is the sentence from word i to word j, note that space between words are not counted
- 0..0: 10 - 6 = 4x4
- 0..1: 10 - 10 = 0
- 0..2: inf because word doesn't fit
- the cost is
0 | 1 | 2 | 3 | 4 | |
---|---|---|---|---|---|
0 | 16 | 0 | max | max | max |
1 | 49 | 1 | max | max | |
2 | 25 | 4 | max | ||
3 | 64 | 9 | |||
4 | 36 |
We now compute 2 arrays as long as the number of sentences
-
r: for the best cost from i to end
-
s: for the split [i, j) if needed
-
With i,j = 4 we have that the cost is 36 and that there is no split (so we use 5 to indicate that since we have 4 sentences)
0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|
36 | ||||
5 |
- With i = 3 j = 4
0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|
9 | 36 | |||
3 | 5 | 5 |
- With i = 2 j = 4
- 2..4 doesn't fit, we need a split, note that once we consider the first part, we can use a previously computed result from the table to get the best second we found
- first we consider split = 4 -> 2..3 (4) 4..4 (36) => 40
- split = 3 => 2..2 (25) 3..4 (9) => 34 FOUND
- 2..4 doesn't fit, we need a split, note that once we consider the first part, we can use a previously computed result from the table to get the best second we found
0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|
34 | 9 | 36 | ||
3 | 5 | 5 |
- With i = 1 j = 4
- 1..4 doesn't fit
- j = 4 ? 1..3 (max) doesn't fit
- j = 3 ? 1..2 (1) 3..4 (9) => 10 FOUND
- j = 2 ? 1..1 (49) 2..4 (34)
- 1..4 doesn't fit
0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|
10 | 34 | 9 | 36 | |
3 | 3 | 5 | 5 |
- With i = 0 j = 4
- 0..4 doesn't fit
- j=4? 0..3 (max)
- j=3? 0..2 (max)
- j=2? 0..1 (0) + 2..4 (34)
- j=1? 0..0 (16) + 1..4 (10) = 10
- 0..4 doesn't fit
0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|
26 | 10 | 34 | 9 | 36 |
1 | 3 | 3 | 5 | 5 |
We interpret the table as following:
- 26 is the smallest cost we can get from 0..4
- to get 26 we do the following split
- [0..1), [1..3), [3..5)
justify(x, w)
n = x.len
let cost[0..n-1, 0..n-1] be a table
let r[0..n-1] and s[0..n-1] be an array
// compute penalties
for i = 0 to n-1
for j = i to n-1
l = len(x[i..j]) + (j-i) // concatenation of words from i to j with 1 space in-between
cost[i,j] = w >= l ? (w-l) ** 2 : max
for i = n-1 to 0:
r[i] = MAX
for j = i + 1 to n:
if cost[i,j-1] == MAX:
continue
c = cost[i][j-1] + (j < n) ? r[j] : 0
if c < r[i]:
r[i] = c
s[i] = j
print("Minimum cost:", r[0])
i = 0
while i < n:
print("[", i, "," s[i], ")")
i = s[i]