Algorithms: Part 2 - Runtime Analysis and Insertion Sort

Nov 2, 2024ยท
Christopher Coverdale
Christopher Coverdale
ยท 5 min read

Runtime Analysis

In this article, we will go into detail on analysing the runtime of an algorithm.

We will use insertion sort as the example.

Insertion Sort

#include <stdbool.h>
#include <stdio.h>

int insertion_sort(int *arr, int size) {
    if (arr == NULL)
        return false;

    if (size < 2)
        return false;

    for (int i = 1; i < size; i++) {
        int curr_val = arr[i];
        int prev_idx = i - 1;

        while (prev_idx >= 0 && arr[prev_idx] > curr_val) {
            arr[prev_idx + 1] = arr[prev_idx];
            prev_idx--;
        }

        arr[prev_idx + 1] = curr_val;
    }

    return true;
}

Using insertion sort above, I’ll break down each line in terms of time cost.

I’ll skip over the base case checks since they are not as important but we can note that those checks have a contant time c.

  • Starting with the main loop:
for (int i = 1; i < size; i++)

Loop over n times, the length of the array. The operations of assigning the variable i, checking i is less then size and incrementing i are all constant time operations, so we can focus on the dominant term, which is n.

  • The assigninments in the loop are constant time, repeated n times within the loop.

  • Each constant time assignment can be expressed as: c1, c2

int curr_val = arr[i];
int prev_idx = i - 1;
  • The inner loop iterates in reverse towards 0 from index and shifting items right, it is repeated in worst case scenario n-1 times, essentially for each entry.

  • The number of comparisons and subsequent loops in the worst case scenario forms an arithmetic series.

while (prev_idx >= 0 && arr[prev_idx] > curr_val) {
    arr[prev_idx + 1] = arr[prev_idx];
    prev_idx--;
}
  • The loop can be more formally represented as:

$ \sum_{j=1}^{n-1} j = \frac{(n-1) \cdot n}{2} $

  • For j (prev_idx) to n-1, compare each pair for sorting. The formula is a summation of an arithmetic series, because we are essentially comparing each pair in the loop.

  • The final assignment of swapping positions is constant time and repeated in the outer loop n times.

arr[prev_idx + 1] = curr_val;

In review of the algorithm in terms of time cost:

  • outer loop: n
  • assignments in inner loop: c1*n and c2*n
  • inner loop (expressed as an arithmetic series): (n-1 * n / 2)
  • inner loop assignments and decrement: c3*n-1 and c4*n-1
  • final assignemnt: c5*n

$ T = n + c1*n + c2*n + (n-1 * n/ 2) + c3*n-1 + c4*n-1 + c5*n $

  • The constant time operations we can assume to have a direct CPU operation or in our case of some kind of virtualized machine, the operation is relatively cheap. We can ignore all constant time operations.

$ T = n + n + n + (n-1 * n / 2) + n-1 + n-1 + n $

  • We can also ignore all the like terms n and n-1 since the inner loop arithmetic summation already takes into account the outer loop n

$ T = n(n-1)/2 $

  • When we simplify the formula:

$ T = n^2 - n / 2 $

When looking at run times of algorithms we look at the dominant term and usually only the worst case.

We can see the dominant term is n^2, which tells us the worst-case runtime of insertion sort is:

$ \theta({n^2}) $

We can identify that this runtime is quadratic.

Arithmetic Series

As a quick aside, an arithmetic series is a sequence of numbers that are sorted and have a constant difference between the terms.

For example:

$ [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] $

  • This is a sorted sequence and has a constant difference in terms of 1.

The summation formula works because of of the arithmetic sequence properties.

If we calculate the sum naively:

$ 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55 $

For an array of a, we can identify that, for each a_i + a_n-1 pair, the result will be constant.

a_i + a_n-1 = c

1 + 10 = 11
2 + 9  = 11
3 + 8  = 11
4 + 7  = 11
5 + 6  = 11

We can see that each pair in a_i + a_n-1 has the same constant results.

We can also calculate the number of pairs as n/2. Therefore there is a formula that will find the sum without calculating each pair.

$ S = n/2 * (ai + an) $

We are using the arithmetic series to demonstrate in the inner loop in insertion sort since we are comparing pairs in an arithmetic series.

This formula can be applied in a situation where, we have a list of sorted numbers with a constant factor between the terms, this would result in a constant time runtime instead of the classic linear but this depends on the array holding the characteristics of an arithmetic series.

Run times

We looked at the run time of insertion sort as

$ \theta(n^2) $

There are other runtimes that we will come across in subsequent blog posts.

  • A constant time algorithm e.g. arithmetic series summation if the properties of the series holds

$ \theta(1) $

  • A base 2 logarithmic runtime that grows according to the size of an array

$ \theta(log n) $

  • A linear runtime that grows according to the size of an array

$ \theta(n) $

  • A runtime that grows n times log n of the size of an array

$ \theta(nlogn) $

  • An exponential runtime

$ \theta(2^n) $

  • Factorial

$ \theta(!n) $

Did you find this page helpful? Consider sharing it ๐Ÿ™Œ