Dynamic Programming - Basics

Written by Sandro Feuz. Translated by Joël Mathys.

Mathematics is often referred to as the queen of the sciences. Similarly, from my point of view, dynamic programming is the princess of algorithms. While most of the other algorithms presented here solve only one specific problem (Dijkstra's algorithm finds (only) shortest paths, Kruskal finds (only) minimum spanning trees), dynamic programming (DP) is much more powerful. DP is not simply a single algorithm, but rather describes a whole class of algorithms: a way of tackling and solving algorithmic problems.

A motivating example

If I had to summarize DP in one sentence, it would be

Don’t waste time calculating things twice.

Let’s take a look at an example.

0, 1, 1, 2, 3, 5, 8, \dots

This is the beginning of the famous Fibonacci sequence, and the $i$-th Fibonacci number $f_i$ (for $i = 0, 1, \dots$) can be formally expressed as

\begin{align*} f_0 &= 0\\ f_1 &= 1\\ f_i &= f_{i-2} + f_{i-1} \end{align*}

Let’s implement this in Python. We can handle the two base cases ($i=0$ and $i=1$) in a single if condition, and the general case can be copied directly from the definition.

def fib(i):
  if i < 2:
    return i
  else:
    return fib(i-2) + fib(i-1)
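For small inputs this works as expected:

print(fib(10))  # 55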

When we try to calculate the Fibonacci numbers $f_5$, $f_{10}$, $f_{50}$ and $f_{100}$, we quickly notice that the program is far too slow: it does not finish for $f_{100}$ even after hours. What’s happening? We are violating the principle of dynamic programming and wasting time recalculating the same thing over and over.

Our function fib works recursively. When we call fib(100), fib(99) and fib(98) are called. The fib(99) call then calls fib(98) and fib(97). And so on. This is summarized in the following diagram.

fib(100)+
        |
        +---fib(99)+
        |          |
        |          +----fib(98)+
        |          |           |
        |          |           +...
        |          |
        |          +----fib(97)+
        |                      |
        |                      +...
        +---fib(98)+
                   |
                   +...

In particular, we can see that the fib(98) call happens twice, and of course it returns the same number both times (the 98th Fibonacci number). Now let’s try to make sure that we don’t throw away all the calculations done in the first call of fib(98). It would be much more clever to save its return value and then simply return it in the second fib(98) call without any additional calculation. Let’s assume that the three “\dots” subtrees in the diagram all take about the same amount of time (which is roughly true). By saving the fib(98) return value we save one of these three calls, and thus reduce the runtime of our program by 33%, just by storing a single number! While the program would have taken 10 years to calculate the 100th Fibonacci number before, we can now do it in less than 7 years. Yay!
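As a minimal sketch of this single-value optimization (the global cache_98 and this variant of fib are introduced here purely for illustration; the general solution follows in the next section):

cache_98 = None

def fib(i):
  global cache_98
  # Save the result of the first fib(98) call and reuse it afterwards.
  if i == 98:
    if cache_98 is None:
      cache_98 = fib(96) + fib(97)
    return cache_98
  if i < 2:
    return i
  return fib(i-2) + fib(i-1)

Of course, fib(100) is still hopelessly slow this way; we have only shaved off roughly a third of the work.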

Complexity analysis

We have seen that the calculation of the 100th Fibonacci number does not terminate with our program, so it seems likely that the runtime is exponential. That’s also not too difficult to see: in the code we only ever return the numbers $0$ and $1$, and the recursive calls merely sum them up. This means that by the end of the program, at least as many additions have been performed as there are “ones” among the returned values, and the number of these ones is exactly the value of the $n$-th Fibonacci number itself. The $n$-th Fibonacci number is approximately $1.6^n$, so exponential in $n$.
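We can make this argument concrete: a function that counts how many ones the naive recursion returns satisfies exactly the same recurrence as fib itself (count_ones is a helper introduced here just to illustrate the claim):

def count_ones(i):
  # Count how many calls in the naive recursion return a 1.
  if i < 2:
    return i  # the base case returns a single 0 or 1
  return count_ones(i-2) + count_ones(i-1)

# The counts reproduce the Fibonacci sequence itself.
print([count_ones(n) for n in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]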

Fibonacci with dynamic programming

We can generalize the idea above. Every time we calculate a Fibonacci value, we save it. Before we calculate a new value, we first check whether we have already computed it. The program becomes only slightly more complicated.

stored = {}
def fib(i):
  if i < 2:
    return i
  if i not in stored:
    stored[i] = fib(i-2) + fib(i-1)
  return stored[i]

When we run this new program, we see that the 100th Fibonacci number is calculated in milliseconds instead of 7 years: for every $i$ from $1$ to $n$, only constant work is done. Wow, long live Princess DP!
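Trying it out:

print(fib(100))  # 354224848179261915075, printed in milliseconds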

Memoization vs Bottom Up

We have seen the most important idea of DP: don’t calculate the same thing several times; store it instead. Often there are two “orders” in which the values can be computed and stored: either from the “bottom up” or from the “top down” (memoization; note, no “r”).

In the DP Fibonacci code of the previous section we calculated the values from the top: we took the existing (slow) recursive formulation of the problem and stored the values just before returning them. The above code is therefore a memoization solution for Fibonacci.

The corresponding bottom-up solution, which calculates the values “from below”, starts instead from the values we already know at the very beginning. These are the anchor values of the recursion, i.e. $f_0 = 0$ and $f_1 = 1$. Then we ask ourselves: now that we know the first two values, can we calculate the third one directly? Of course, because $f_2 = f_0 + f_1 = 0 + 1 = 1$. Now we have the first three values; can we calculate the fourth? Yes, and so on.

In code, that could look like this:

def fib(n):
  stored = {0: 0, 1: 1}
  for i in range(2, n + 1):
    stored[i] = stored[i-2] + stored[i-1]
  return stored[n]

We abandon the recursive formulation completely and fill the helper table directly from below.

Most of the time it doesn’t matter whether we write a memoization or a bottom-up solution; both are considered dynamic programming, and they typically have the same complexity (as in our example).

The Four Steps of Dynamic Programming

Let’s leave the Fibonacci example. Cormen et al. [1] summarized dynamic programming somewhat more formally as follows

  1. Characterize - Characterize the structure of an optimal solution.
  2. Define - Recursively define the value of an optimal solution.
  3. Compute - Compute the value of an optimal solution in a bottom-up fashion.
  4. Construct - Construct an optimal solution from computed information.

and we want to try to go through them step by step using a new example.

Maximum Triangular Sum Path

The task we’re going to solve is probably older than you! It was presented at the International Olympiad in Informatics 1994 [2]:

A number pyramid of height $n$ is given. The task is to find a maximum path from the tip to the base, where in each step you move down one row, continuing either to the field on the lower left or on the lower right. The value of a path is the sum of its numbers. Example:

    7
   3 8
  8 1 0
 2 7 4 5
4 5 2 6 5

The correct output for the example is 30, realized by the path 7, 3, 8, 7, 5.

0) Greedy is Wrong

Before we start on a dynamic programming solution, we should make sure that no simpler solution strategy works. A first candidate is often “greedy”: always take the locally best number. In the above example, greedy returns the path 7, 8, 1, 7, 5, which has only the value 28. That is worse than the best solution and therefore wrong.
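As a quick check, here is a sketch of this greedy strategy in Python (the pyramid is stored as a list of lists; the function name is mine):

T = [[7], [3, 8], [8, 1, 0], [2, 7, 4, 5], [4, 5, 2, 6, 5]]

def greedy_path_sum(T):
  # Always step down to the larger of the two fields below.
  j = 0
  total = T[0][0]
  for i in range(1, len(T)):
    # From (i-1, j) we can reach (i, j) and (i, j+1).
    j = j if T[i][j] >= T[i][j+1] else j + 1
    total += T[i][j]
  return total

print(greedy_path_sum(T))  # 28, but the optimum is 30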

1) Characterize the structure of an optimal solution

Let’s think about what characterizes an optimal solution. In every step we go one row lower and one to the left or right. Now let’s consider the final step in the optimal solution.

    .
   . .
  . . .
 2 7 . .
. 5 . . .

The field with the 5 can only be reached by passing through either the 2 or the 7 in the previous step. This applies to every path down to the 5, in particular to an optimal path down to the 5. We have seen in the example above that the optimal path to the 5 comes through the 7. What do we know, then, about the partial path from the tip of the triangle to the 7? Among all paths from the start to the 7, this path must be one with maximum sum. Why? There is a “cut & paste” argument: if this were not the case, there would be a path from the tip to the 7 with a higher sum. But then we could extend that better path to the 7 into a better path to the 5 overall. This is a contradiction, because we assumed our solution was optimal.

In other words, an optimal solution up to 5 must always consist of an optimal solution to one of its two possible predecessor fields. We will use this insight in the next step.

2) Recursively define the value of an optimal solution

Let’s try to formulate the problem recursively. Let $opt(i, j)$ be the maximum sum of a path from the tip of the triangle to the field $(i, j)$, that is, the $j$-th field in the $i$-th row. For example, the $8$ in the third row is the field $(3, 1)$. The solution (the value of the best path from the tip to any field in the bottom row) is then the maximum over the opt values in the last row:

\max_{j=1,\dots,n} opt(n, j).

How can $opt(i, j)$ be calculated? Let’s consider how the field $(i, j)$ can be reached: either from the field above it on the right, $(i-1, j)$, or from the field above it on the left, $(i-1, j-1)$. Since we know from 1) that an optimal path must consist of an optimal partial path to one of its two predecessor fields, we can now specify $opt(i, j)$ directly: it is the better of the two optimal paths to the predecessors, plus the field’s own value. So:

opt(i, j) = T[i][j] + \max\big(opt(i-1, j-1),\; opt(i-1, j)\big),

where $T[i][j]$ denotes the number in the pyramid at field $(i, j)$ (predecessor fields outside the triangle are ignored). For the start field $(1, 1)$ itself there is only one path (which is of course the optimal one up to there):

opt(1, 1) = T[1][1].

This recursive formulation can be translated almost directly into code. Here is an example implementation in C++ (we change the numbering of the fields so that it now starts at 0 instead of 1).

#include <algorithm>
#include <vector>

std::vector<std::vector<int>> T = {
{     7       },
{   3,  8     },
{  8, 1,  0   },
{ 2, 7, 4, 5  },
{4, 5, 2, 6, 5}};

// Return the value of an optimal path until the field in
// line i and column j.
int MaximumTriangularSumPathRecursive(int i, int j) {
  // The top of the pyramid.
  if (i == 0 && j == 0) {
    return T[i][j];
  }
  int left = 0;
  int right = 0;

  // All fields on the very left have no left predecessor.
  if (j > 0) {
    left  = MaximumTriangularSumPathRecursive(i - 1, j - 1);
  }
  // All fields on the very right have no right predecessor.
  if (j < i) {
    right = MaximumTriangularSumPathRecursive(i - 1, j);
  }
  return std::max(left, right) + T[i][j];
}

Task for the reader: What is the runtime of this recursive solution? Does the program still find a solution in the foreseeable future for a triangle with 100 lines?

3) Compute the value of an optimal solution in a bottom-up fashion

Spoiler alert: the program will not be able to finish in a reasonable time, because the runtime is $O(2^n)$ (exponential), and for $n=100$ the number of operations is therefore far too high. (A further task for the reader: how does one see that the runtime is $O(2^n)$?)

What now? Well, the solution is of course Dynamic Programming. This time we choose a Bottom Up solution directly, but I would like to point out that memoization would also work.
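For completeness, a memoization variant could look like this in Python (a sketch mirroring the recursive solution above; the names are mine):

T = [[7], [3, 8], [8, 1, 0], [2, 7, 4, 5], [4, 5, 2, 6, 5]]
stored = {}

def max_path(i, j):
  # Value of an optimal path from the tip to field (i, j), memoized.
  if i == 0 and j == 0:
    return T[0][0]
  if (i, j) not in stored:
    # Missing predecessors count as 0, as in the C++ version.
    left = max_path(i-1, j-1) if j > 0 else 0
    right = max_path(i-1, j) if j < i else 0
    stored[(i, j)] = max(left, right) + T[i][j]
  return stored[(i, j)]

n = len(T)
print(max(max_path(n-1, j) for j in range(n)))  # 30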

For the calculation we use an array $opt[n][n]$ and fill in the solution step by step. We start at the tip of the pyramid (where there is only one path) and work our way down row by row. This way we have already processed all rows

0, \dots, i-1

when handling the $i$-th row, and can therefore use the recursive description of the solution from 2).

std::vector<std::vector<int>> T = {
{     7       },
{   3,  8     },
{  8, 1,  0   },
{ 2, 7, 4, 5  },
{4, 5, 2, 6, 5}};

// Return the value of an optimal path from the tip to the base.
int MaximumTriangularSumPathBottomUp() {
  const int n = T.size();
  std::vector<std::vector<int>> opt(n, std::vector<int>(n));
  opt[0][0] = T[0][0];
  for (int i=1; i < n; i++) {
    for (int j=0; j<=i; j++) {
      if (j == 0) {
        // left border.
        opt[i][j] = T[i][j] + opt[i-1][j];
      } else if (j == i) {
        // right border.
        opt[i][j] = T[i][j] + opt[i-1][j-1];
      } else {
        opt[i][j] = T[i][j] + std::max(opt[i-1][j], opt[i-1][j-1]);
      }
    }
  }
  // std::max_element (from <algorithm>) returns an iterator to the
  // largest element; the star dereferences the iterator
  // (takes the value it points to).
  return *std::max_element(opt[n-1].begin(), opt[n-1].end());
}

How fast is this solution? We note that we only do a constant amount of work for each field of the pyramid. Since there are about $\frac{n^2}{2}$ fields, the runtime is $O(n^2)$ (i.e. linear in the input size, since the input for a pyramid of height $n$ itself consists of $O(n^2)$ numbers).

4) Construct an optimal solution from computed information

We now have an efficient solution for finding the value of an optimal path, and depending on the task, we are done. But sometimes this is not enough, and we want to calculate not only the value of an optimal path but also the path itself. To do this, we first find the end of the optimal path (as derived above, it is the maximum over the fields of the last row). Then we check from which predecessor field the optimal value was obtained, and walk step by step from the bottom back up to the tip of the pyramid. This should become clearer in the following code:

// T = as above...
// Return an optimal path for T as a vector of length n, where
// path[i] is the field used in the i-th row.
std::vector<int> MaximumTriangularSumPathBottomUpWithPath() {

  // as above...

  // the path consists of the fields with coordinates
  //   (0, path[0]), (1, path[1]), ..., (n-1, path[n-1])
  std::vector<int> path(n);

  // Find the field in the last row in which the optimal path ends.
  // As above, std::max_element returns an iterator to this field, and
  // std::distance returns the distance from the first element to this
  // iterator (which is the column index of the field, starting at zero).
  path[n-1] = std::distance(opt[n-1].begin(),
                            std::max_element(opt[n-1].begin(), opt[n-1].end()));

  for (int i=n-2;i>=0;--i) {
    if (path[i+1] == 0) {
      // left most field only has one possible predecessor.
      path[i] = 0;
    } else if (path[i+1] == i+1) {
      // right most field only has one possible predecessor.
      path[i] = i;
    } else {
      if (opt[i][path[i+1]-1] > opt[i][path[i+1]]) {
        // The value came from the left predecessor.
        path[i] = path[i+1] - 1;
      } else {
        // The value came from the right predecessor.
        path[i] = path[i+1];
      }
    }
  }
  return path;
}
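For the example pyramid above, this returns the path [0, 0, 0, 1, 1], i.e. the fields 7, 3, 8, 7, 5 with total value 30.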

Further material

Dynamic programming is a very powerful but also a difficult topic, and I personally believe that the easiest way to master it is to solve as many problems as possible yourself (or at least to try for a few hours before looking at the solution). That’s why you should follow up this article with practice examples. A list of sample tasks can be found here, or here together with theory and solutions, or you can ask your favorite SOI organizer (or me [3] :-)) for a list of their most interesting DP tasks.

[1] In this book, here (German) resp. here (English).
[2] Task 1 of day 1, here. A version of the task can also be found on Project Euler.
[3] sandro.feuz@gmail.com