Written by Bibin Muttappillil and Timon Gehr.
At SOI we sometimes ask you to justify why your algorithm solves a given problem. This usually involves four points:
- a description of the idea on how to solve the problem
- an explanation why this idea solves the problem correctly for all inputs
- an analysis on how fast the proposed algorithm is and how much memory it needs
- a rough sketch of a pseudo-code with the important pieces
where the most essential part is the second point.
Like writing algorithms, proving their correctness is not an easy task. There is no step-by-step guide on how to reason correctly as it requires a lot of creativity and thinking. Despite this we try to collect some tips and techniques on how to write theoretical solutions.
We will start with some guidelines on how to write down your solutions.
Then we introduce you to some techniques on how to show that a program solves something correctly (second point).
And we will look on how to analyze runtime and memory (third point).
Note: These techniques are not only useful for theoretical tasks, but for actual programming. Knowing how to show that something is correct helps debugging a faulty program as you can observe where your explanation and therefore your code breaks. It also helps to save time as you can see faster which algorithms are wrong and don’t waste time implementing a wrong solution.
In mathematics there are so-called ‘proofs’, which are formal descriptions of an explanation of why something is correct.
We at SOI don’t expect you to write such things, but we expect you to give a good explanation containing all important ideas and steps involved in showing the correctness of your algorithm.
It doesn’t need to be formally correct, but we need to see that you thought about all possible cases where your idea could go wrong. In the following we will refer to those explanations as ‘reasonings’.
One essential point for a clean reasoning is structure. With structure it is easier for you to build logical arguments and to see why they are correct or if something is missing (also applies to us as correctors). For that you can define lemmas, constructs and use facts.
The latter is just knowledge you can assume are known at SOI (e.g. something written in the wiki, a well-known mathematical property, …). Be sure that this fact is true and try to include a reference (e.g. wiki, lecture of ours, name of the property, …) if you know it.
Sometimes you may want to use special property which is useful for you reasoning, for example looking only at a continuous subsequence which is strictly increasing (e.g. looking at
[2,5,7] and
[1,2,9] in the array
A=[9,2,5,7,6,5,1,2,9,8]). Then instead of writing the properties over and over again you can define a
construct and then reason what properties it has and what consequences follow from this construct. In the example before you could define a ‘rise’ (the name doesn’t matter) as two indices
a and
b such that for every
i, where
a≤i<b, it holds that
A[i]<A[i+1]. With this definition you don’t have two write ‘continuous subsequence which is strictly increasing’ every time, it is clear what you mean and you can build on top of this definition, for example you can define the length of a rise as
len(rise):=b−a+1 (where
:= is the symbol for
defined as).
The most important thing for the structure are probably lemmas. Usually one property is not enough to reason that something is correct, but you need a few key observations. You can view this small observations and properties as checkpoints in your whole reasoning. We call these checkpoints lemmas. An example of a lemma would be that two rises (from before) don’t overlap. If you have several lemmas it is good to name or number them, so we could name the property from before as ‘Lemma of overlap’ or just ‘Lemma 1’. Don’t forget that you also need to reason why a lemma you used is correct. With lemmas it is easier to see what you already showed, what you can use in your reasoning and what we can give you points for.
We will use these structure elements in the following sections so that you can see how they could be applied.
Here are some rules of thumb if you are not sure if your reasoning is correct:
- Try to justify everything or say it’s well known (only if it is actually known or really obvious).
- Try to think of counterexamples and see if your reasoning holds against them (meaning your have an argument against it). Counterexample are the easiest and most powerful tool against wrong reasoning (and also against wrong algorithms).
Before we begin we need to agree on some notation. Let’s say that A and B are statements. A statement can be any claim or sentence which is either true or false. For example A=“There are 56 cards in a deck of Tichu” is a true statement, B=“42 is the largest natural number” would be a wrong statement and F(n)=“The n-th Fibonacci number is 1” is true for n=1 and n=2 but false for the other numbers. Every statement can also be negated, meaning we can also say the opposite (e.g. notA=“There are not 56 cards in a deck of Tichu”). As you saw we write the negation with the ‘not’ word. The key property of the negation is that a true statements becomes false and vice versa:
A |
notA |
false |
true |
true |
false |
We also need the implication, denoted by →. A statement can imply another statement, A→B. This means that if A is true, than B must also be true. For example A=“x is divisible by 4” and B=“x is even”. If A is true B must be true, but if A is false then B can be true or false. So the implication only goes one way. Note that an implication A→B is again a statement and therefore it can be again true or false. For example A=“V is the father of L” and B=“L is the son of V” then it is not necessarily true that A→B holds (L could be the daughter of V). This table summarizes when A→B is true:
A |
B |
A→B |
false |
false |
true |
false |
true |
true |
true |
false |
false |
true |
true |
true |
Now to the interesting part. An easy yet powerful tool is called
proof by contradiction. This is the technique were you assume the opposite of the thing you want to reason about and show that the opposite leads to a contradiction and therefore must be impossible. With this you can conclude that the original statement is true. In other words if we want to say that a statement
A is correct we try to assume
notA, then conclude stuff with this assumptions (e.g.
notA→B0→B1,
notA→C0, …) and find two statements that contradict each other (e.g.
B1 is the same as
notC0 but
C0 and
notC0 can’t both be true).
A small example to see why it might be easier to argue with the negated statements:
Proof why there are infinite natural numbers:
Let’s try to use
proof by contradiction. We assume the opposite that there are finite many natural numbers. Now we can say that there exists a largest natural number (for example by testing every number), let’s call it
M. We also know that
M+1 is also a natural number and that
M+1 is larger (
M+1>M). This contradicts with the definition of
M being the largest, so our assumption of the finiteness of natural numbers must be false. (Note: You don’t need to be so formal as to specify which statement is
A,
B or
C0, these are just here as helping tools).
A similar technique to contradiction are
indirect proofs. They work in a similar manner, but you work with a beginning statement. So instead of reasoning why
B is true you want to show that
B is true if
A is true (
A→B). The rest is very similar except our end goal. We assume
notB, show some in between steps (
notB→C0→C1→…) and instead of leading to a contradiction we want to show
notA. In summary we say that
A→B is correct by showing
notB→notA. The intuition is that if
A→B is true it means that
B has to be true if
A is true. So in case
B is false this means that
A has to be false as well, otherwise this leads to a contradiction. And the last statement can be read as
notB→notA.
A |
B |
A→B |
notB→notA |
false |
false |
true |
true |
false |
true |
true |
true |
true |
false |
false |
false |
true |
true |
true |
true |
A small example, proving that if
n2 is even then
n is also even:
Instead of showing
n2 is even→n is even we want to show
n is not even→n2 is not even.
Well
n is not even implies that
n is odd meaning we can write
n=2⋅k+1. Then it holds that
n2=4⋅k2+4⋅k+1=2⋅(2⋅k2+2⋅k)+1=2⋅C+1 (where
C=2⋅k2+2⋅k). This again shows that
n2 is odd and therefore not even.
With this we showed that
n is not even→n2 is not even and therefore also
n2 is even→n is even.
Two of the most common outputs of our problems are either ‘yes/no’ or the size of an optimal solution.
In the ‘yes/no’ category you usually need to show that something is either possible or impossible for a given input. Be careful in your reasoning as you actually need to reason two things here:
- You need to argue why for every possible case you print ‘yes’ and
- why for every impossible case you print ‘no.’
And for both there are several ways to tackle the reasoning.
Either you try to reason it directly (
possible input→output ’yes’) or try it
indirectly and show that ‘no’ would be a wrong answer (
output ’no’→impossible input) or use
contradictions (
output ’no’ and possible input→contradiction). They all reason about the same thing, but the start and ending point are different, giving you several options to choose from. Depending on the situation one might be easier than the others.
The ‘no’ case is analogous, either show directly (
impossible input→output: ’no’), indirectly (
output ’yes’→possible input) or contradiction
(output ’yes’ and impossible input→contradiction).
The other category is finding an optimal solution or at least the size of the optimal solutions. Let’s say the problem asks us to find a solution with an ‘x’ as big as possible. Here again we need to reason about two things:
- Why is your output is correct (so why there exists a solution with the size of your output) and
- why there is no better solution.
The first part is usually easier, you need to show why your algorithm could output a correct construction that fulfills the requirements of the desired output. The second part requires more thinking as you need to show that there can’t exist a better solution (better meaning with a larger ‘x’). Here we mostly use the contradiction method: assume there exists a solution that is greater than the solution found by your algorithm then try to show that this leads directly to a contradiction or show that your algorithm would have found this solution.
The details for both kind of problems are dependent on the specific problem and usually involve other proof techniques for a full reasoning, but it at least shows what you need to do and where to start.
Often, we want to show that a statement is true for all elements of a certain set, for example the set of natural numbers N={0,1,2,3,…}, or the set of possible inputs to your program. Sometimes, it is possible to argue that the statement is true for all elements by picking an arbitrary unknown representative of the set and arguing for that specific representative.
For example, for an arbitrary natural number n, we know that n+1 is larger than n. Therefore, we know that for all natural numbers n, n+1 is larger than n.
Unfortunately, for some statements, this kind of argument is not strong enough. The reason is that in order to show something holds true for the natural number n, we may already want to use that the same thing is true for natural numbers that are smaller than n.
For example, the sum 0+1+2+⋯+n is equal to 2n(n+1). We can easily check that this is true for n=0, in which case both terms are 0. Assume that for some fixed n>0, we already know that 0+1+⋯+(n−1) is 2(n−1)n. We can then conclude that 0+1+⋯+n=(0+1+⋯+(n−1))+n=2(n−1)n+n=2n(n+1). We have now shown that equality holds for n=0, and for all n>0, if the equality holds for n−1, then it also holds for n. This means, if we want to show that the equality holds for some specific natural number k, we can start at n=0, and use our argument k times until we reach n=k. Therefore, the equality is true for all natural numbers.
More generally, imagine we want to show that the statement P(n) is true for all natural numbers n. It then sufficies to show that P(0) is true and that for all natural numbers n>0, we have P(n−1)→P(n). This observation is known as the principle of mathematical induction. In our first example, the statement P(n) was 0+1+2+⋯+n=2n(n−1).
(Note that a well-known alternative argument is to note that 2⋅(0+1+2+⋯+n)= (0+1+2+⋯+n)+(n+(n−1)+⋯+0)=(0+n)+(1+(n−1))+⋯+(n+0)= n+n+⋯+n=(n+1)⋅n, but what we discuss above is actually a bit simpler.)
Note that the principle of mathematical induction is equivalent to the observation that any non-empty set of natural numbers has a smallest element: If we want to show that the statement P(n) is true for all natural numbers n, we can argue by contradiction. Assume to the contrary that there is some natural number n which is a counterexample, i.e., P(n) is false. In particular, consider the smallest n for which P(n) is false. (This exists because the set of counterexamples has a smallest element.) We then show that P(0) is true and we conclude that n must be larger than 0. Then, we show P(n−1)→P(n) or, equivalently notP(n)→notP(n−1). From notP(n) we can then derive notP(n−1). This is however a contradiction, because n was the smallest counterexample, but now we found one that is even smaller. Therefore our assumption had to be wrong and P(n) is true for all natural numbers n.
A well-known variant is the principle of complete induction. It follows from the principle of mathematical induction using the statement P(n) that says that some statement Q(n) is true for all natural numbers m that are strictly smaller than n.
To show that Q(n) is true for all natural numbers n using the principle of complete induction, we pick an arbitrary natural number n, assume that Q(m) is true for all m<n and then show that Q(n) is true.
The principle of complete induction is particularly helpful for arguing about the correctness of recursive functions. Consider the following algorithm for sorting a list of numbers:
def mergeSort(list):
if len(list) <= 1:
return list
else:
(firstHalf, secondHalf) = split(list)
return merge(mergeSort(firstHalf), mergeSort(secondHalf))
Here, the function split splits a list of n numbers into two halves of length ⌊n/2⌋ and ⌈n/2⌉ (i.e., n/2, once rounded down and once rounded up to the next integer), and the function merge takes two sorted lists and merges them into a single sorted list containing the same numbers. We can use complete induction to show that the function mergeSort correctly sorts the list of numbers.
In particular, we will show that the algorithm sorts lists of length n for an arbitrary n. By the principle of complete induction, we can assume that the algorithm correctly sorts lists that contain fewer than n elements.
If n is 0 or 1, the list is already sorted and therefore the algorithm returns it unchanged. Otherwise, the list is split into two halves that both contain strictly fewer than n elements. Therefore, the algorithm sorts each half correctly. Finally, the function merge combines the two sorted halves into a single sorted list.
See the Greedy script to see how to reason about greedy algorithms.