Algorithms: Part 1 - Runtime Analysis and Insertion Sort
Runtime Analysis
In this article, we will go into detail on analysing the runtime of an algorithm.
We will use insertion sort as the example.
Insertion Sort
#include <stdbool.h>
#include <stdio.h>
int insertion_sort(int *arr, int size) {
if (arr == NULL)
return false;
if (size < 2)
return false;
for (int i = 1; i < size; i++) {
int curr_val = arr[i];
int prev_idx = i - 1;
while (prev_idx >= 0 && arr[prev_idx] > curr_val) {
arr[prev_idx + 1] = arr[prev_idx];
prev_idx--;
}
arr[prev_idx + 1] = curr_val;
}
return true;
}
Using insertion sort above, I’ll break down each line in terms of time cost. I’ll skip over the base case checks since they are not as important but we can note that those checks have c = constant
time term.
for (int i = 1; i < size; i++)
- This is a loop over n
times, the length of the array. The operations of assigning i, checking i is less then size and incrementing i are all constant time operations, so we can focus on the dominant term for time, which is n
.
The assigninments in the loop are constant time, repeated
n
times within the loop.Each constant time assignment can be expressed as:
c1, c2
int curr_val = arr[i];
int prev_idx = i - 1;
- The inner loop iterates in reverse towards 0 from index and shifting items right, it is repeated in worst case scenario
n-1
times, essentially for each entry. This number of comparisons and subsequent loops in the worst case scenario forms an arithmetic series.
while (prev_idx >= 0 && arr[prev_idx] > curr_val) {
arr[prev_idx + 1] = arr[prev_idx];
prev_idx--;
}
- The loop can be more formally represented as:
$ \sum_{j=1}^{n-1} j = \frac{(n-1) \cdot n}{2} $
For
j (prev_idx)
ton-1
, compare each pair for sorting. The formula is a summation of an arithmetic series, because we are essentially comparing each pair in the loop.The final assignment of swapping positions is constant time and repeated in the outer loop
n
times.
arr[prev_idx + 1] = curr_val;
- outer loop:
n
- assignments in inner loop:
c1*n
andc2*n
- inner loop (arithmetic series):
(n-1 * n / 2)
- inner loop assignments and decrement:
c3*n-1
andc4*n-1
- final assignemnt:
c5*n
$ T = n + c1*n + c2*n + (n-1 * n/ 2) + c3*n-1 + c4*n-1 + c5*n $
- The constant time operations we can assume to have a direct CPU operation or in our case of some kind of virtualized machine, the operation is relatively cheap. We can ignore all constant time operations.
$ T = n + n + n + (n-1 * n / 2) + n-1 + n-1 + n $
- We can also ignore all the like terms
n
andn-1
since the inner loop arithmetic summation already takes into account the outer loopn
$ T = n(n-1)/2 $
- When we simplify the formula:
$ T = n^2 - n / 2 $
When looking at run times of algorithms we look at the dominant term and usually only the worst case.
We can see the dominant term is n^2
, which tells us the worst-case runtime of insertion sort is:
$ \theta({n^2}) $
We can identify that this runtime is quadratic.
Arithmetic Series
As a quick aside, an arithmetic series is a sequence of numbers that are sorted and have a constant difference between the terms.
For example:
$ [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] $
- This is a sorted sequence and has a constant difference in terms of 1.
The summation formula works because of of the arithmetic sequence properties.
If we calculate the sum naively:
$ 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55 $
For an array of a
, we can identify that, for each ai + an-1
pair, the result will be constant.
ai + an-1 = c
1 + 10 = 11
2 + 9 = 11
3 + 8 = 11
4 + 7 = 11
5 + 6 = 11
We can see that each pair in ai + an-1
has the same constant results.
We can also calculate the number of pairs as n/2
. Therefore there is a formula that
will find the sum without calculating each pair.
$ S = n/2 * (ai + an) $
We are using the arithmetic series to demonstrate in the inner loop in insertion sort since we are comparing pairs in an arithmetic series.
This formula be applied in a situation where, we have a list of sorted numbers with a constant factor between the terms, this would result in a constant time runtime instead of the classic linear but this depends on the array holding the characteristics of an arithmetic series.
Run times
We looked at the run time of insertion sort as
$ \theta(n^2) $
There are other runtimes that we will come across in subsequent blog posts.
- A constant time algorithm e.g. arithmetic series summation if the properties of the series holds
$ \theta(1) $
- A base 2 logarithmic runtime that grows according to the size of an array
$ \theta(log n) $
- A linear runtime that grows according to the size of an array
$ \theta(n) $
- A runtime that grows n time log n of the size of an array
$ \theta(nlogn) $
- An exponential runtime
$ \theta(2^n) $
- Factorial
$ \theta(!n) $