Skip to content

Example: Text Justification

Roberto Fronteddu edited this page May 15, 2025 · 22 revisions

Given a sequence of words, and a limit on the number of characters that can be put in one line (line width). Put line breaks in the given sequence such that the lines are printed neatly.

Example: Tushar Roy likes to code

  1. Tushar - 6
  2. Roy - 3
  3. likes - 5
  4. to - 2
  5. code - 4
  • First we compute the matrix of cost of spaces where
    • the cost is $(w-len(x[i..j]))^2$, we use the square because it works better empirically
    • x[i..j] is the sentence from word i to word j, note that space between words are not counted
      • 0..0: 10 - 6 = 4x4
      • 0..1: 10 - 10 = 0
      • 0..2: inf because word doesn't fit
0 1 2 3 4
0 16 0 max max max
1 49 1 max max
2 25 4 max
3 64 9
4 36

We now compute 2 arrays as long as the number of sentences

  • r: for the best cost from i to end

  • s: for the split [i, j) if needed

  • With i,j = 4 we have that the cost is 36 and that there is no split (so we use 5 to indicate that since we have 4 sentences)

0 1 2 3 4
36
5
  • With i = 3 j = 4
0 1 2 3 4
9 36
3 5 5
  • With i = 2 j = 4
    • 2..4 doesn't fit, we need a split, note that once we consider the first part, we can use a previously computed result from the table to get the best second we found
      • first we consider split = 4 -> 2..3 (4) 4..4 (36) => 40
      • split = 3 => 2..2 (25) 3..4 (9) => 34 FOUND
0 1 2 3 4
34 9 36
3 5 5
  • With i = 1 j = 4
    • 1..4 doesn't fit
      • j = 4 ? 1..3 (max) doesn't fit
      • j = 3 ? 1..2 (1) 3..4 (9) => 10 FOUND
      • j = 2 ? 1..1 (49) 2..4 (34)
0 1 2 3 4
10 34 9 36
3 3 5 5
  • With i = 0 j = 4
    • 0..4 doesn't fit
      • j=4? 0..3 (max)
      • j=3? 0..2 (max)
      • j=2? 0..1 (0) + 2..4 (34)
      • j=1? 0..0 (16) + 1..4 (10) = 10
0 1 2 3 4
26 10 34 9 36
1 3 3 5 5

We interpret the table as following:

  • 26 is the smallest cost we can get from 0..4
  • to get 26 we do the following split
    • [0..1), [1..3), [3..5)
justify(x, w)
    n = x.len
    let cost[0..n-1, 0..n-1] be a table
    let r[0..n-1] and s[0..n-1] be an array
    
    // compute penalties
    for i = 0 to n-1
        for j = i to n-1
            l = len(x[i..j]) + (j-i) // concatenation of words from i to j with 1 space in-between
            cost[i,j] = w >= l ? (w-l) ** 2 : max
    
    for i = n-1 to 0:
        r[i] = MAX
        for j = i + 1 to n:
            if cost[i,j-1] == MAX:
                continue
            c = cost[i][j-1] + (j < n) ? r[j] : 0
            if c < r[i]:
                r[i] = c
                s[i] = j


    print("Minimum cost:", r[0])
    i = 0
    while i < n:
        print("[", i, "," s[i], ")")
        i = s[i]   
    
Clone this wiki locally