
# Algorithmic Forays Part 7
In this column, we will discuss memoization - a technique to speed up recursive computations.

## Recursion

In mathematics and computer science, recursion is a way of specifying something (usually a mathematical object or part of a computer program) by reference to itself. More precisely (and to dispel the appearance of circularity in the definition), "complicated" instances are defined in terms of "simpler" instances, and the "simplest" instances are given explicitly.

One interesting "application" of recursion is the definition of the set of natural numbers. We can define a natural number recursively as follows:

- 0 is natural
- If *n* is natural, then *n + 1* is also natural
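This definition can be transcribed into code almost word for word. Here is a small sketch (the function name `is_natural` is my own, not part of any library) that decides naturality exactly the way the definition reads:

```cpp
#include <cassert>

// A number is natural if it is 0 (the explicit "simplest" instance),
// or if its predecessor is natural. Negative numbers are rejected.
bool is_natural(long n)
{
    if (n == 0)
        return true;            // the base case, given explicitly
    if (n < 0)
        return false;           // cannot be reached from 0 by adding 1
    return is_natural(n - 1);   // n is natural if n - 1 is natural
}
```

Calling `is_natural(2)` unwinds just like the prose above: 2 reduces to 1, 1 reduces to 0, and 0 is natural by definition.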
So, 0 is natural by definition. 1 is natural, because 0 is natural and 1 = 0 + 1. 2 is natural because 2 = 1 + 1 and 1 is natural, and so on.

## Fibonacci numbers

A canonical example of recursion is the computation of Fibonacci numbers:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946 ...

This sequence can be defined recursively as follows (F(n) is the n-th member of the sequence):

- if n = 0, F(n) = 0
- if n = 1, F(n) = 1
- otherwise, F(n) = F(n - 1) + F(n - 2)
You can easily follow this definition and generate some Fibonacci numbers for yourself - you will get precisely the sequence given above. How would we generate this sequence in code? It's quite simple - recursive definitions are easily reflected in recursive function calls. Here is the C++ code that returns any Fibonacci number (constrained by execution time and the limitations of the C++ `long` type):

```cpp
long fib_rec(long index)
{
    if (index < 2)
        return index;
    else
        return fib_rec(index - 1) + fib_rec(index - 2);
}
```

Note how gracefully the mathematical definition is translated to the code of `fib_rec`. This elegance comes at a price, however: the function is very slow. For example, finding the 40th Fibonacci number (which is, by the way, 102,334,155) using this routine takes about 4 seconds on my machine. The 42nd number (267,914,296) takes 11 seconds to compute, and the time grows very quickly (the 45th, which is 1,134,903,170, takes 47 seconds, etc.)

One reason for this is the cost of function calls (of which there are many in the recursive solution). When a function is called, there is always a certain amount of overhead. For small functions, this overhead can be comparable to the time required to execute the function itself. This results in a performance hit.

However, this is not the main reason for recursion being slow for the computation of Fibonacci numbers. The principal cause, in this case, is the vast amount of repetition involved. To demonstrate this, let's trace a sample execution of the recursive computation, `fib_rec(5)`:

```
fib_rec(5)
|
|---fib_rec(4)
|   |
|   |---fib_rec(3)
|   |   |
|   |   |---fib_rec(2)
|   |   |   |
|   |   |   |---fib_rec(1)
|   |   |   |---fib_rec(0)
|   |   |
|   |   |---fib_rec(1)
|   |
|   |---fib_rec(2)
|       |
|       |---fib_rec(1)
|       |---fib_rec(0)
|
|---fib_rec(3)
    |
    |---fib_rec(2)
    |   |
    |   |---fib_rec(1)
    |   |---fib_rec(0)
    |
    |---fib_rec(1)
```

When called with index 0 or 1, `fib_rec` returns immediately, without making further calls. Now, do you notice anything funny about this call tree? It shouldn't be hard to spot the scandalous number of times the same calls are made. For instance, the call `fib_rec(2)` is made three times, and `fib_rec(3)` twice - each time redoing work whose result is already known. For larger indexes the amount of such repetition is astronomical, which explains the runtimes we saw.

## Memoized Fibonacci

Memoization literally means "putting into memory".
An alternative name for it is caching. In programming, memoization plays a role similar to a hardware cache. It is a technique used to speed up computer programs by saving intermediary answers for later use, rather than recomputing them. If you look at the call tree of `fib_rec(5)` above, you can see how much computation could be spared if we only saved results that were already computed.

So what should we do in order to memoize the Fibonacci computation? First, we should set up some data structure to serve as a cache of computations. Then, when asked to compute a Fibonacci number, we should first consult the cache. If the result is in the cache, it can be returned without any further computations. If it isn't in the cache, it means we haven't computed it yet, so we compute it and add it to the cache. Let's see how this is translated to code:

```cpp
long fib_memoized_aux(vector<long>& cache, long index)
{
    if (cache[index] >= 0)
        return cache[index];

    cache[index] = fib_memoized_aux(cache, index - 1) +
                   fib_memoized_aux(cache, index - 2);
    return cache[index];
}

long fib_memoized(long index)
{
    if (index < 2)      // base cases; also keeps the seeding below in bounds
        return index;

    vector<long> cache(index + 1, -1);
    cache[0] = 0;
    cache[1] = 1;

    return fib_memoized_aux(cache, index);
}
```

Here, `fib_memoized_aux` does the real work, while `fib_memoized` is a wrapper whose job is to set up the cache - a vector in which -1 marks entries that haven't been computed yet, seeded with the two base cases.

What about performance, then? While for indexes up to about 30 `fib_rec` performs tolerably, beyond that its runtime deteriorates rapidly; `fib_memoized` returns even the 45th number instantly, because each intermediate result is computed exactly once and every repeated request is answered from the cache.

There's another major speed improvement here, which may not be immediately obvious. Imagine that during the runtime of our program, we need to calculate Fibonacci numbers not just once, but many times. While using the plain method we'd go through the computations again and again, using memoization we can just reuse the cache from call to call. Chances are that most of the computations will be answered in constant time - because the result will already be in the cache! The assiduous reader may implement a simple class for Fibonacci number calculation. This class will have the cache as a member, which will be initialized only once. The Fibonacci calculation method will use this cache and update it at times (when yet un-calculated results are requested).
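Such a class might look like the following sketch (the class name and its details are my own invention, not code from this article). The cache is a member, seeded once with the base cases, and each call only extends it as far as needed:

```cpp
#include <vector>

// Hypothetical sketch: a Fibonacci calculator that keeps its cache
// between calls, so repeated requests cost a single vector lookup.
class FibCalculator
{
public:
    FibCalculator()
    {
        cache_.push_back(0);  // F(0)
        cache_.push_back(1);  // F(1)
    }

    long fib(long index)
    {
        // Extend the cache up to the requested index. Each new entry
        // is the sum of the two previously computed entries, so no
        // result is ever computed twice - even across calls.
        while ((long)cache_.size() <= index)
            cache_.push_back(cache_[cache_.size() - 1] +
                             cache_[cache_.size() - 2]);
        return cache_[index];
    }

private:
    std::vector<long> cache_;
};
```

A second call to `fib` with a smaller or equal index is answered directly from the cache, in constant time.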
## Alternative Fibonacci implementations

Personally, I don't feel good about the Fibonacci calculation example. Though it's educationally sound, I find it somewhat contrived, because there are fast implementations of Fibonacci that don't require memoization. For example, it's very easy to come up with a simple iterative solution. Since a Fibonacci number is simply the sum of the previous two, we can use a loop and keep track of just two numbers to generate any Fibonacci:

```cpp
long fib_iter(long index)
{
    if (index < 2)
        return index;

    long cur = 1;
    long one_back = 0;

    for (long i = 2; i <= index; ++i)
    {
        long temp = cur;
        cur = cur + one_back;
        one_back = temp;
    }

    return cur;
}
```

This will calculate Fibonacci numbers as fast as the memoized implementation - in linear time. An even faster solution can utilize Binet's closed-form formula:

```cpp
long fib_binet(long index)
{
    double sqrt_5 = 2.2360679774997896964091736687313;

    return (long) floor((pow(1 + sqrt_5, index) - pow(1 - sqrt_5, index)) /
                        (pow(2, index) * sqrt_5) + 0.5);  // round to nearest,
                                                          // guarding against
                                                          // floating-point error
}
```

Don't just sit there and gasp in horror :-) Calculation of Fibonacci numbers is a fascinating topic and you can learn a lot by browsing the web a little - start by Googling for "Fibonacci numbers".

I must note, just to be fair, that these fast non-recursive implementations lack the caching-between-calls property of memoization. That is, if we use memoization to save results between function calls (as discussed at the end of the previous section), we can get most results at the cost of a trivial array lookup - faster than the non-recursive implementations. But to be even more fair, huge Fibonacci numbers are rarely needed in practice, and even when they are, the iterative or formula-based implementations can provide numbers as big as we'll ever need in negligible time. So let's examine another problem, one with no simple alternative to the recursive solution.
## Counting change

As we saw, it isn't hard to come up with a simple iterative Fibonacci algorithm (the same goes for the factorial function, another common example of recursion in programming books and tutorials). In contrast, consider the following problem: how many different ways can we make change of $1.00, given half-dollars ($0.50), quarters ($0.25), dimes ($0.10), nickels ($0.05), and cents ($0.01)? More generally, can we design an algorithm to compute the number of ways to change any given amount of money? While at first sight an innocuous puzzle that might interest supermarket cashiers, this is a close relative of an important algorithmic problem - the knapsack problem.

Let's start with an example, to make sure that the problem is understood. In how many ways can we make change from 10 cents? One is ten one-cent coins. Two is one nickel and five cents. Three is two nickels. Four is a dime. So, there are four ways.

In fact, this problem has a simple recursive solution. Suppose we think of the types of coins available as arranged in some order. Then the following relation holds: the number of ways to change amount *a* using *n* kinds of coins equals

- the number of ways to change amount *a* using all but the first kind of coin, plus
- the number of ways to change amount *a - d* using all *n* kinds of coins, where *d* is the denomination of the first kind of coin.
Rationale: given a task to make change, we can divide the ways into two complementary groups: ways that don't use the first kind of coin, and ways that do. The total number of ways is the sum of those two groups - that is, the number of ways to make change without using the first kind of coin, plus the number of ways to make change assuming that we do use it. But if we've used the first kind, what remains to be changed is the amount minus the denomination of the first kind. Thus, we've found a way to solve the problem by reducing it to two smaller problems (in the first, the number of coin kinds is smaller; in the second, the sum is smaller). This is just what recursion is about - reducing problems to simpler problems. What we're lacking is an explicit solution for the "simplest" problems:

- If the amount *a* is 0, there's only one way to make change (no coins)
- If the amount *a* is negative, there is no way to make change
- If *n* is 0, there is no way to make change (we have no coins to choose from...)
To convince you that the reduction and boundary cases work, let's look again at the 10-cent problem (note that we're not interested in 25- and 50-cent coins in this case). To change 10 using the coins 10, 5 and 1 (ordered thus), we sum (1) the number of ways to change 10 using all but the first kind of coin, that is, using 5 and 1 only, and (2) the number of ways to change amount 10 - 10 = 0 using all kinds of coins. (2) is simple - there's one way to change amount 0 (by the first boundary case). (1) can be further reduced to (3) the number of ways to change amount 10 using 1-cent coins only, plus (4) the number of ways to change 10 - 5 = 5 using all kinds of coins. (3) is only one way - ten 1-cent coins; (4) can be reduced further, etc. (we get two ways from (4) - one five-cent coin, or five one-cent coins). When the algorithm finishes, we'll end up with 4 ways.

To take care of the coin ordering, we'll define a helper function:

```cpp
long first_denomination(long n_kinds_of_coins)
{
    switch (n_kinds_of_coins)
    {
        case 5: return 50;
        case 4: return 25;
        case 3: return 10;
        case 2: return 5;
        case 1: return 1;
        default: assert(0); return 0;  // not reached
    }
}
```

Given how many kinds of coins we can use, this function returns the denomination of the first coin. It sets up the following ordering of the coins: 50, then 25, then 10, then 5, then 1. Now we're ready to implement the change-counting procedure itself. Being a true recursive algorithm, it translates into code very naturally:

```cpp
long count_change_aux(long amount, long n_kinds_of_coins)
{
    if (amount == 0)
        return 1;
    else if (amount < 0 || n_kinds_of_coins == 0)
        return 0;
    else
    {
        return count_change_aux(amount, n_kinds_of_coins - 1) +
               count_change_aux(amount - first_denomination(n_kinds_of_coins),
                                n_kinds_of_coins);
    }
}

long count_change(long amount)
{
    return count_change_aux(amount, 5);
}
```
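To check the worked example above, here is a compact variation of my own that uses an explicit table of denominations instead of the `first_denomination` switch (the logic is otherwise identical to `count_change_aux`):

```cpp
#include <cassert>

// The same recursion as count_change_aux, but reading denominations
// from a table. kCoins[0] is "the first kind of coin" when all 5
// kinds are available, kCoins[1] when 4 kinds are, and so on.
const long kCoins[] = { 50, 25, 10, 5, 1 };

long ways(long amount, long n_kinds_of_coins)
{
    if (amount == 0)
        return 1;   // exactly one way to change 0: use no coins
    if (amount < 0 || n_kinds_of_coins == 0)
        return 0;   // overshot the amount, or ran out of coin kinds
    return ways(amount, n_kinds_of_coins - 1) +           // skip the first kind
           ways(amount - kCoins[5 - n_kinds_of_coins],    // or use it once more
                n_kinds_of_coins);
}
```

`ways(10, 3)` (dimes, nickels and cents only) returns 4, matching the hand count above.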
To get some intuition of the problem's cost, suppose that we run `count_change` on growing amounts. Changing a dollar (100 cents) is answered almost instantly - there are 292 ways, by the way. Larger amounts, however, quickly become impractical: changing 2000 cents takes about 2.5 minutes on my machine, and the runtime keeps exploding from there.

Contrary to the Fibonacci problem, here we don't have any simple way to formulate a swift iterative algorithm that completes the same task (if you find one, let me know!). But we'd still like to compute change for large sums in a reasonable time. The solution is memoization.

## Memoized change counting

We will proceed in a manner similar to the memoization of `fib_rec`. The only slightly problematic point is that we can't just use a simple array for the cache as we did in `fib_memoized`: here a computation is identified by two parameters - the amount and the number of coin kinds - rather than by a single index.

```cpp
long count_change_memoized_aux(map<pair<long, long>, long>& cache,
                               long amount, long n_kinds_of_coins)
{
    pair<long, long> entry = make_pair(amount, n_kinds_of_coins);

    if (cache.find(entry) != cache.end())
        return cache[entry];

    if (amount == 0)
        cache[entry] = 1;
    else if (amount < 0 || n_kinds_of_coins == 0)
        cache[entry] = 0;
    else
    {
        cache[entry] =
            count_change_memoized_aux(cache, amount, n_kinds_of_coins - 1) +
            count_change_memoized_aux(cache,
                                      amount - first_denomination(n_kinds_of_coins),
                                      n_kinds_of_coins);
    }

    return cache[entry];
}

long count_change_memoized(long amount)
{
    map<pair<long, long>, long> cache;

    return count_change_memoized_aux(cache, amount, 5);
}
```

Note that the structure of the computation is exactly the same as in the plain version; all that was added is the cache management. Here I use a `map` keyed by a `pair` of the two parameters that identify a computation. Before doing any work, the function consults the map; after computing a new result, it stores it there for later reuse.

And indeed, benchmarking shows considerable speed improvement. Change from 2000 cents now takes only 0.02 seconds to compute (vs. 2.5 minutes for the non-memoized version). Change from 5000 takes 0.06 seconds (I gave up waiting for the non-memoized version to finish this calculation). The runtime of the memoized version grows roughly linearly with the size of the problem - as expected.

## Wrapping up

In this article you've been introduced to memoization - a simple and general technique for speeding up recursive computations by caching intermediary results. You probably noticed the similarity between memoizing the Fibonacci implementation and memoizing the change-counting algorithm. Indeed, memoization is quite simple to apply automatically, if the programming language allows it.
For example, in Lisp, where functions are data just like any other data, memoizing an arbitrary function is trivial. In Perl it is a little trickier, but there is an excellent module called Memoize that will automatically memoize any function for you. As far as I know, there is no simple way to achieve this in C++.
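That said, C++ has grown since this article was written. With the function objects and lambdas introduced in C++11 (which postdates the article), a limited generic memoizer can at least be sketched. The names below are my own, and this version wraps only single-argument functions; it does not memoize a recursive function's internal calls unless they also go through the wrapper:

```cpp
#include <functional>
#include <map>
#include <memory>

// Sketch: wrap any function from Arg to Result in a caching layer.
// The cache lives in a shared_ptr so the returned closure can be
// copied freely while all copies share one cache.
template <typename Result, typename Arg>
std::function<Result(Arg)> memoize(std::function<Result(Arg)> f)
{
    auto cache = std::make_shared<std::map<Arg, Result>>();
    return [f, cache](Arg a) -> Result {
        auto it = cache->find(a);
        if (it != cache->end())
            return it->second;   // answered from the cache
        Result r = f(a);         // compute once...
        (*cache)[a] = r;         // ...and remember
        return r;
    };
}
```

This mirrors what Perl's Memoize does, though far less transparently: the caller must wrap the function explicitly, e.g. `auto fast = memoize<long, long>(slow);`.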
© 1999-2011 Gamedev.net. All rights reserved.