Introduction to the theory of algorithms
This article is very important because it explains terms and concepts that come up in all of the algorithms throughout the courses. I'm telling you now that I'm going to try to keep things as simple as possible and demonstrate them with examples, so don't expect a bunch of equations and numbers. Some of you may be disappointed by this decision, but I'm sure the majority of you will be pleased. However, this doesn't mean we won't talk about scientific matters and terms; you'll encounter them here as well. We're going to explain what an algorithm is and focus on asymptotic time complexity and stability. I believe that anything and everything can be explained using simple words and examples, and I'll keep following this pattern of simplicity with the other algorithms. Now that that's all been established, let's go over how computers solve their tasks and introduce you to algorithms and what we solve and compute with them.
Algorithm
The first thing we'll have to do is agree on what an algorithm is. Simply put, an algorithm is a recipe used to solve a problem. From a human's point of view, an example of an algorithm could be how to get up in the morning. Although it might sound simple, it can get very complicated very quickly. Computers are machines, and machines don't think. Therefore, we have to describe all of the algorithm's steps precisely. With this, we get to the first property of an algorithm - it must be elementary (consist of a finite number of simple and comprehensible steps, i.e. commands). "Get out of bed" certainly isn't an algorithm. "Open your eyes, take the blanket off, sit up, put your legs on the ground and stand up" - this one sounds concrete enough to be a true algorithm. However, we're going to keep things relevant to IT, so we'll solve tasks like sorting items by their size or searching for elements by their values. Sorting and searching are two basic tasks which computers do all the time, so they have to be thought out and optimized very well. Otherwise, they'd take up too much time that could be used for other processes. As for other algorithm examples, think of solving a quadratic equation or solving a Sudoku.
What if we wanted the computer to swap 2 numbers (meaning to swap the values of 2 variables, e.g. of variables a and b)? As you may already know, the following code wouldn't work:
a = b; b = a;
The value of the variable b is assigned to the variable a. Then, the value of the variable a is assigned to the variable b. However, the original value of a has already been overwritten by the value of b, so both variables end up with the value of b. Therefore, we'll have to introduce an auxiliary variable, e.g. c:
c = a; a = b; b = c;
Now, the values of variables a and b have been swapped. We now have our first algorithm, an algorithm for swapping 2 numbers. Notice how we can't just say "swap the numbers". We have to describe precisely what the program should do using a set of commands.
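If you'd like to see it as a complete program, here's a minimal sketch in C (the concrete values 10 and 20 are just my example):

#include <stdio.h>

int main(void)
{
    int a = 10;                         /* example values, any would do */
    int b = 20;

    int c = a;                          /* remember the original value of a */
    a = b;                              /* a now holds the value of b */
    b = c;                              /* b gets the remembered value of a */

    printf("a = %d, b = %d\n", a, b);   /* prints: a = 20, b = 10 */
    return 0;
}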
Another important property of an algorithm is that it should be deterministic. At any given moment, we should be able to tell whether the algorithm has already ended or how it's going to continue. An algorithm should also end up with the correct result and cover all possible inputs. In our simple example, it should be able to swap the values regardless of whether they are a=10; b=20; or a=999; b=102. It's also important that an algorithm is general as well as finite. It can't afford to get stuck at any point, so it always has to end after a finite number of steps.
Time complexity of an algorithm and the asymptotic notation
Time complexity is, as the name suggests, related to the duration for which an algorithm (the program) runs. The duration depends on the input size (sorting 10 numbers will surely be faster than sorting a million numbers). The problem is, no one can say: "This algorithm can do it in 5 seconds", because the duration depends on the speed of the computer the algorithm is running on. It also depends on things like the programming language used. We can't even count the processor's instructions, since they would vary depending on the processor's architecture. Therefore, instead of time, we count the commands the computer performs. By command, we mean an elementary operation like assigning a value or comparing 2 values (not filling an entire array - that would be a series of assignments). Simple arithmetic operations count as commands as well. We count them according to the input size.
Consider an algorithm which finds the minimum (meaning the smallest element) in an array. I'm sure you know how we're going to do it: we'll assume that the first element is the minimum and then go through all of the array elements from beginning to end. If we find anything smaller, we'll take it as the new minimum and continue in the same way. Once we get to the end, we'll have the array's overall smallest element. We have to go through the entire array. If it has n elements, we have to compare the stored minimum with the current element n times, increase the loop's control variable n times and assign the new minimum as needed. Let's say we have to do about 3n operations in an average case (30 operations for an array of 10 elements). The time complexity would then be O(3n). However, O(3n) is the same as just O(n). Why is that? Simply because we're interested in the quality of an algorithm, and constants are not significant in the calculation. The definition of asymptotic time complexity is very similar to the definition of the mathematical limit. Let's demonstrate how to compare 2 algorithms with an example.
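Before we get to the comparison, here's roughly what the minimum search described above could look like in C (a sketch of my own; the function name find_minimum and the sample array are made up for illustration). You can see the roughly 3n elementary operations right in the loop:

#include <stdio.h>

/* Returns the smallest element of the array; assumes length >= 1. */
int find_minimum(const int array[], int length)
{
    int minimum = array[0];             /* assume the first element is the minimum */
    for (int i = 1; i < length; i++)    /* one comparison and one increment per element */
    {
        if (array[i] < minimum)         /* found something smaller? */
            minimum = array[i];         /* it becomes the new minimum */
    }
    return minimum;
}

int main(void)
{
    int numbers[] = { 7, 3, 9, 1, 4 };
    printf("%d\n", find_minimum(numbers, 5));   /* prints: 1 */
    return 0;
}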
Consider an algorithm for sorting an array of numbers with a time complexity of O(n²). It'll be something like selection sort which, simply put, finds the minimum in O(n) time and puts it at the beginning of the array. Then, it finds the minimum of the remaining elements and puts it right after the previous one. This is done n times, so the resulting time complexity is O(n * n) = O(n²).
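To make the O(n * n) reasoning concrete, here's a minimal selection sort sketch in C (my own illustration, not a reference implementation): the outer loop runs n times and each inner minimum search is O(n).

#include <stdio.h>

/* Sorts the array in ascending order by repeatedly selecting the minimum
   of the not-yet-sorted part and swapping it to the front - O(n^2) overall. */
void selection_sort(int array[], int length)
{
    for (int i = 0; i < length - 1; i++)        /* runs n times */
    {
        int min_index = i;
        for (int j = i + 1; j < length; j++)    /* O(n) minimum search */
        {
            if (array[j] < array[min_index])
                min_index = j;
        }
        int temp = array[i];                    /* swap via an auxiliary variable, */
        array[i] = array[min_index];            /* just like in the first algorithm */
        array[min_index] = temp;
    }
}

int main(void)
{
    int numbers[] = { 5, 2, 8, 1, 9 };
    selection_sort(numbers, 5);
    for (int i = 0; i < 5; i++)
        printf("%d ", numbers[i]);              /* prints: 1 2 5 8 9 */
    printf("\n");
    return 0;
}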
Then, we'll consider a second, really dumb algorithm, which tries all the permutations of the elements until the array happens to be sorted. There are O(n!) permutations, so we'll omit all additional operations since the complexity is already high enough.
Last of all, we'll talk about QuickSort, which I'm not going to explain here to keep things simple. Let's just say that it can sort elements with an average time complexity of O(n log n).
To be a bit more illustrative, we'll also consider an algorithm with a time complexity of O(n), even though sorting can't normally be done that fast.
For this example, let's imagine running each algorithm on the same computer and sorting arrays of various sizes. We'll say that 1 operation takes 1 ms:
Elements | O(n) | O(n log n) | O(n²) | O(n!) |
---|---|---|---|---|
10 | 0.01 seconds | 0.03 seconds | 0.1 seconds | 1 hour |
15 | 0.015 seconds | 0.06 seconds | 0.2 seconds | 41 years |
100 | 0.1 seconds | 0.7 seconds | 10 seconds | incalculable |
1000 | 1 second | 10 seconds | half an hour | incalculable |
Let's say the computer runs its processor at a frequency of 3 GHz. So why don't we just use a faster one? The bad algorithms would go faster as well, wouldn't they? No, they wouldn't. Here is where we can actually see the algorithms' quality. Although the algorithm with O(n) complexity would be able to handle 10 times more elements if we used a computer that is 10 times faster, we can't say the same about the other ones. The relationship isn't directly proportional: the speed of the processor is a constant, and if a function grows very quickly, constants become completely negligible. Let's look at how the times would change if we sped the computer up 10x, 100x, and even 1000x:
Let's say we'll sort 15 elements with the O(n) algorithm and fix that amount of time as our budget (0.015 seconds according to the table above). Let's look at how many elements the other algorithms would manage in the same time and whether speeding up the computer helps at all:
CPU frequency | O(n) | O(n log n) | O(n²) | O(n!) |
---|---|---|---|---|
3 GHz (no speed-up) | 15 elements | 6 elements | 4 elements | 4 elements |
30 GHz | 150 elements | 30 elements | 12 elements | 5 elements |
300 GHz | 1500 elements | 200 elements | 40 elements | 6 elements |
3 000 GHz | 15000 elements | 1400 elements | 120 elements | 8 elements |
Values are rounded.
As you can see, even if we speed the computer up 1000 times to 3 000 GHz, we'd only be able to sort 4 more elements with the O(n!) complexity. The algorithm's quality is bad and increasing the machine's speed isn't a solution at all. We'd have to look for another algorithm with a better time complexity.
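If you'd like to play with these numbers yourself, here's a small sketch of my own (not how the original table was produced) that estimates the largest n each complexity class can handle within a given budget of elementary operations; with budgets of 15, 150, 1500 and 15000 operations it lands in the same ballpark as the rounded table above:

#include <math.h>
#include <stdio.h>

/* Largest n such that cost(n) <= budget; a simple linear search is fine
   here because the answers are small. */
static long max_n(double (*cost)(long), double budget)
{
    long n = 1;
    while (cost(n + 1) <= budget)
        n++;
    return n;
}

static double cost_linear(long n)    { return (double)n; }
static double cost_nlogn(long n)     { return n * log2((double)n); }
static double cost_quadratic(long n) { return (double)n * n; }
static double cost_factorial(long n)
{
    double f = 1.0;
    for (long i = 2; i <= n; i++)
        f *= (double)i;
    return f;
}

int main(void)
{
    /* 15 operations corresponds to the "15 elements with O(n)" budget;
       multiplying by 10, 100 and 1000 simulates a faster CPU. */
    double budgets[] = { 15, 150, 1500, 15000 };

    for (int i = 0; i < 4; i++)
    {
        double b = budgets[i];
        printf("budget %6.0f ops: O(n)=%ld  O(n log n)=%ld  O(n^2)=%ld  O(n!)=%ld\n",
               b,
               max_n(cost_linear, b),
               max_n(cost_nlogn, b),
               max_n(cost_quadratic, b),
               max_n(cost_factorial, b));
    }
    return 0;
}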
Now, what have we learned? The time complexity splits algorithms into certain qualitative categories. It doesn't matter at all if we speed up an algorithm a million times, since it will never leave its category, and there will always be an input size from which an algorithm from a better category is faster. In other words, from some number of elements on, an algorithm that sorts in O(n log n) executed on an old 486 computer will beat a sorting algorithm with a time complexity of O(n²) executed on a modern 100-core server. Speeding the computer up might hide the algorithm's handicap for a while, but it'll always bite back with large inputs, where the constant becomes completely negligible. Some of you may have noticed that I've basically just described the definition of a limit.
The uppercase O symbol is called the big O (omicron) notation. It means that the time complexity of a given algorithm is asymptotically less than or equal to the expression in the parentheses. It indicates the "time complexity category" which the algorithm will never surpass. You may also encounter the uppercase Omega (asymptotically greater than or equal), as well as the lowercase omega and o, which indicate strict inequality. Finally, the uppercase Theta indicates asymptotic equality.
Other algorithm properties
Stability
We say that an algorithm is stable if it maintains the original order of elements which are equal according to the comparison criterion. If we're just sorting an array of numbers, stability means nothing to us. However, the moment we start sorting objects or other structures, it can be a great benefit. Let's say that we have employees sorted by their names in an array and we want to sort them by their ages. If we use an unstable sorting algorithm, 2 employees of the same age (Adam and Zack) could end up with their original order switched: Zack, Adam. A stable algorithm maintains the original order of equal elements, i.e. Adam, Zack. Employees of the same age will then still be ordered alphabetically.
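The C standard library's qsort(), for example, isn't guaranteed to be stable. A common trick, sketched below with made-up employee data (Adam and Zack from the example above, plus a Brian of mine), is to remember each element's original position and use it as a tiebreaker, which makes any comparison sort behave stably:

#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    const char *name;
    int age;
    size_t original_index;   /* position before sorting, used as a tiebreaker */
} Employee;

/* Compare by age; when ages are equal, fall back to the original position
   so that equal elements keep their original (alphabetical) order. */
static int by_age_stable(const void *a, const void *b)
{
    const Employee *x = a;
    const Employee *y = b;
    if (x->age != y->age)
        return (x->age > y->age) - (x->age < y->age);
    return (x->original_index > y->original_index) - (x->original_index < y->original_index);
}

int main(void)
{
    /* Already sorted alphabetically; Adam and Zack share the same age. */
    Employee staff[] = { { "Adam", 30, 0 }, { "Brian", 25, 0 }, { "Zack", 30, 0 } };
    size_t count = sizeof staff / sizeof staff[0];

    for (size_t i = 0; i < count; i++)
        staff[i].original_index = i;    /* remember the original order */

    qsort(staff, count, sizeof staff[0], by_age_stable);

    for (size_t i = 0; i < count; i++)
        printf("%s (%d)\n", staff[i].name, staff[i].age);
    /* prints Brian (25), then Adam (30) before Zack (30) */
    return 0;
}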
In place
We say that an algorithm sorts "in place" if it doesn't need any memory beyond the input array (apart from a few auxiliary variables). If it does need extra memory, there's a risk of running out of it in some cases. On the other hand, by accepting this memory handicap, we can sometimes gain a better time complexity or other benefits.
That would be all for the introduction. You can now browse through our algorithm collection without a problem.
For the tables, I used Unicorn College materials from professor Ondřej Kučera.