Computational Complexity: Decomposition into prime factors explained step by step.
The analysis of the computational complexity of an algorithm is the study of the amount of time or space that it will need for its execution. We do the study of time and space differently. In this article, we will focus on temporal computational complexity. Always talking about classical computing, the study of complexity will radically change when you can work with quantum computers.
This discipline is critical and quite tricky to understand. On many occasions, it is confused with temporal or statistical studies of the execution time of an algorithm. These studies are also necessary, but they give rise to local optimizations (1).
What is sought in computational complexity is the study of the algorithm under conditions of data sets of infinite length. It is possible to establish similarities by studying the limits of a function when the function is approaching infinity.
The “magic” of the study of computational complexity is that it can indicate aspects such as that an algorithm will not improve its performance substantially even if we increase the processor’s power. On the contrary, that algorithm will continue to behave excellent performance even if we execute it in a low-performance computer.
The study of computational complexity is based on the algorithm’s internal analysis, emphasizing aspects such as loops, their nesting, etc.
The study of computational complexity uses the following notation:
There are more performance measures, but I have indicated the basic ones.
In the graphical representation of the previous functions, we can see how the temporal cost of the algorithm increases a lot when the data size increases in the case of O(n^2) and O(n^n). We observe that it remains linear in the case of O (n) and that the increase in size does not have much impact if the cost is O(log n) or O(sqrt(n)). Finally, if the cost is O (1), it is constant, which indicates that increasing the size of the data does not affect the response time.
Making another mathematical analogy, the slopes of these functions indicate growth rates in time cost when the size of the data increases.
In an ideal world, all of our algorithms would have O (1), O (log n), or O(complexity sqrt(n)). Unfortunately, the very nature of the problems means that on many occasions, this is not possible.
(1) Starting the explanation of algorithmic complexity by making specific execution time measurements of an algorithm is a grave mistake because it breaks the essence of the discipline itself. These measurements depend, for example, on the power of the processor used. As we have said previously, the computational complexity study results are independent of the computer’s power to execute the algorithm.
Decomposition into prime factors:
To see how the study of computational complexity works, we will apply it to the algorithm of decomposing a number into prime factors (2). We will make three attempts.
Basic definitions first:
- A prime is a number that can only be divided by unity and by itself.
- We can rewrite every non-prime number as a product of prime numbers. These prime numbers that allow its rewriting are its factors.
90 = 3^2* 2 * 5
Since multiplication is commutative, we can always rewrite this decomposition, starting with the smallest number.
90 = 2 * 3^2* 5
It comes from the traditional way we are taught in school when children calculate the prime factors.
If we convert this algorithm into code directly, we are left with something like this:
This algorithm has cost O (n), where n is the number that we want to decompose. Although we make n small within the loop, if n tends to infinity, we must consider that we make infinite turns of the loop. The key is how many turns we make of the loop.
Let’s make a little change. First, let’s check if the number is even, that is, if it is divisible by 2, and we calculate how many times it is divisible by 2.
Then we make the previous loop, but we start at three and increase “i” by two to test only the odd numbers. It is clear that the only prime even number is two, and we have already checked it in the first loop.
To study complexity, we are going to look at each loop separately. We can see that the cost is O(log(2,n)) in the first loop. It can be better verified if we decompose an even number that only has a factor of 2.
32 = 2^5 , log(2,32) = 5 => 5 loops.
So even if we think of a huge even number but whose only factor is 2, the cost will still be O(log(2,n)). Remembering the graph of functions from the previous section, we see that we are doing much better than in the first attempt.
Now let’s look at the second loop. This time we go through half the numbers since we only go through the odd ones. But thinking of a number close to infinity, half infinity is still infinity, so as happened in the first attempt, the cost is still O(n).
The final cost would be O(log(2,n)) + O(n). When we think of infinity, we know that O(n) grows much faster than O(log(2,n)). In this way, we can simplify the sum by leaving O(n)
The first loop still has cost O(log(2,n)).
The second loop has changed its condition to this While (i <=n) Why?
Let’s look at this statement. Given any non-prime number, there must be at least one factor less than or equal to its square root. To prove it, let’s think otherwise. Suppose it doesn’t exist.
If it does not exist we have the number decomposed into prime factors a and b => n = a * b, where a and b> sqrt(n). Then in the simplest case, at least a and b =sqrt(n) + 1 =>
n = a * b = sqrt(n) + 1 * sqrt(n) + 1 => (sqrt(n) + 1)^2= n + 1 + 2 sqrt(n) != n
Then the opposite has to happen.
So even if it is an infinitely large number now, the cost of the second loop is O(sqrt(n))
The final cost would be O(log(2,n)) + O(sqrt(n)), which again, thinking in terms of limits and infinity leaves us in O(sqrt(n)). But O(sqrt(n)) is much better than O(n).
In any case, we must consider the first O loop (log(2,n)) as an excellent local optimization.
As an engineer who develops real projects, I think that those local optimizations are significant. It is also clear that the algorithm’s complexity study is the fundamental factor in determining if the algorithm will behave correctly.
With some small local optimizations, it would look like this:
(2) I have done the prime factorization example because it is the first example in some engineering studies. I can’t entirely agree at all with this decision. I prefer to start with studies on the complexity of search types in vectors of integers and continue with the complexity in sorting algorithms.
We have seen that the complexity of our algorithm is O(sqrt(n)). If we consult many bibliographies, we will see that the decomposition into prime factors is a typical example of the NP problem. Therefore, a polynomial cost solution is unknown, which contradicts our O(sqrt(n)) result.
It is because we have ignored the cost of the division or modulo operations performed within the loop. The divisions will end up executed in binary, and the binary divisions have a cost of O(2^b), where b is the number of bits needed to represent the number. Then, if n is 2^b, we have in the previous loop O(sqrt(n)) = O(sqrt(2^b)), and the total cost is the cost of the previous loop times the cost of a binary division.
O(2^b)* O(sqrt(2^b))= O(2^b)*O(2^b^(1/2))= O(2^b) * O(2^(b/2)) = O(2^(3b/2))
So the cost is exponential. In this case, we must go down to the binary level to find what is known as cryptographic complexity. In most algorithms, it is not necessary to go down to the binary level.