Algorithms: Part 3 - Divide and Conquer and Merge Sort
Divide and Conquer
In this article, we will go into detail on the Divide and Conquer algorithmic strategy, using the Merge Sort algorithm and the Maximum Subarray problem to explore the idea.
Divide and Conquer algorithms are usually recursive and focus on splitting a problem into smaller subproblems. Once the subproblems are as small as they can be, they are solved directly, and their results are used to solve the larger problems.
The Divide and Conquer strategy can be split into the following 3 stages:
Divide
- A strategy to divide the problem into subproblems, usually a recursive function.
Conquer
- The logic required for solving each subproblem. In the case of Merge Sort, after the array is split in two, each half is sorted.
Combine
- This stage combines the solved subproblems to solve the parent problem. In a recursive problem, this is using the results of the child subproblems to solve the problem at the current level of recursion.
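To make the three stages concrete before looking at sorting, here is a minimal sketch on a simpler problem: summing an array recursively. The dc_sum name and the half-open [start, end) range convention are illustrative choices (the same convention the Merge Sort code below uses).
#include <stdio.h>

// Divide and Conquer on a simple problem: sum arr over [start, end).
// Assumes a non-empty range.
int dc_sum(const int *arr, size_t start, size_t end) {
    // Conquer: a single element is the smallest subproblem.
    if (end - start == 1) {
        return arr[start];
    }
    // Divide: split the range in half.
    size_t mid = start + (end - start) / 2;
    // Combine: the sum of the range is the sum of its two halves.
    return dc_sum(arr, start, mid) + dc_sum(arr, mid, end);
}

int main(void) {
    int arr[] = {1, 2, 3, 4, 5};
    printf("%d\n", dc_sum(arr, 0, 5)); // prints 15
    return 0;
}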
Merge Sort
void merge_sort(int *arr, size_t start, size_t end) {
    // Merge sort base case bottoms out at 1 element.
    if (end - start <= 1) {
        return;
    }
    size_t mid = start + (end - start) / 2;
    merge_sort(arr, start, mid);
    merge_sort(arr, mid, end);
    merge(arr, start, mid, end);
}
The merge_sort function is a recursive function that breaks the problem down into further subproblems until the recursion reaches an array of only 1 element.
Merge Sort takes an array, splits it in the middle, and calls itself on each half. Once the recursion bottoms out, the array is “Conquered” using the merge function.
So far the Divide step has been covered: splitting the problem into subproblems.
void merge(int *arr, size_t start, size_t mid, size_t end) {
    size_t left_len = mid - start;
    size_t right_len = end - mid;
    // Variable-length auxiliary arrays holding each half.
    int left[left_len], right[right_len];
    // Copy each side of the input array into its auxiliary array.
    for (size_t i = 0; i < left_len; i++) {
        left[i] = arr[start + i];
    }
    for (size_t j = 0; j < right_len; j++) {
        right[j] = arr[mid + j];
    }
    size_t i = 0;
    size_t j = 0;
    size_t k = start;
    // Repeatedly copy the smaller front element back into the input array.
    while (i < left_len && j < right_len) {
        if (left[i] <= right[j]) {
            arr[k] = left[i];
            i++;
        } else {
            arr[k] = right[j];
            j++;
        }
        k++;
    }
    // Copy any leftover elements; they are larger than everything merged so far.
    while (i < left_len) {
        arr[k] = left[i];
        i++;
        k++;
    }
    while (j < right_len) {
        arr[k] = right[j];
        j++;
        k++;
    }
}
The merge function is the Conquer step. This is the implementation that sorts the given array.
It starts by creating two auxiliary arrays, left and right, which represent the two sides of the array. Both sides of the input array are copied into the auxiliary arrays.
A while loop iterates over both auxiliary arrays, compares their current values, and copies the smaller value back into the input array.
Whichever auxiliary array has leftover items has the remainder copied back into the input array, since the remaining items are all larger than everything in the other auxiliary array.
An analogy for this strategy is that the two auxiliary arrays represent two piles of playing cards. The sorter flips over the top card from each pile, compares the two cards, and puts the smaller card on a new pile. This continues until one pile is exhausted. After that, the rest of the other pile is placed on the new pile. This way, the new pile of cards is fully sorted.
After the merge function is called, the subproblem is solved and the recursive function returns up the stack. This means the results of the subproblems are used as input into the larger problem one level up in the recursion. The results of the recursive calls are once again used in merge; this represents the Combine step.
This process repeats until merge_sort is back at the root level, using the two sorted halves of the array to perform the final merge.
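To make the half-open [start, end) convention concrete, here is a minimal driver sketch; the sample values are arbitrary, and it assumes merge_sort and merge from above are compiled in the same file.
#include <stdio.h>

int main(void) {
    int arr[] = {5, 2, 4, 7, 1, 3, 2, 6};
    size_t n = sizeof arr / sizeof arr[0];
    // Sort the whole array: the range is [0, n).
    merge_sort(arr, 0, n);
    for (size_t i = 0; i < n; i++) {
        printf("%d ", arr[i]); // prints 1 2 2 3 4 5 6 7
    }
    printf("\n");
    return 0;
}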
Two Way Merge
A two way merge is an algorithm that merges two sorted arrays into one sorted array. This is also known as the “Two-Pointer Technique”.
This strategy is used in the merge function in Merge Sort. It uses three pointers: one pointer into each array being merged, and one pointer to track the array that is receiving the merge.
The merge function can be expressed through a loop invariant.
Initialization
Before the first iteration, the merged prefix of the receiving array, arr[start:k] (k being the receiving pointer), is empty and therefore trivially sorted. The pointers into the auxiliary arrays (i and j) are not exhausted and point to the beginning of each array respectively.
Maintenance
Before and after each iteration, arr[start:k] in the receiving array is sorted from lowest to highest. On each iteration, the values at the auxiliary pointers i and j are compared; the smaller value is copied into arr[k], the pointer it came from is incremented, and k moves to the next position.
Termination
Termination occurs once k reaches the end of the merging range, at which point both i and j have reached the end of their auxiliary arrays and the merged array is in sorted order.
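As a standalone illustration of the two-pointer technique, here is a minimal sketch that merges two already-sorted arrays into a third; the two_way_merge name and sample values are illustrative.
#include <stdio.h>

// Merge sorted arrays a (length n) and b (length m) into out.
void two_way_merge(const int *a, size_t n, const int *b, size_t m, int *out) {
    size_t i = 0, j = 0, k = 0;
    while (i < n && j < m) {
        // Invariant: out[0:k] is sorted and holds the k smallest elements.
        if (a[i] <= b[j]) {
            out[k++] = a[i++];
        } else {
            out[k++] = b[j++];
        }
    }
    // Copy whichever array still has elements left.
    while (i < n) {
        out[k++] = a[i++];
    }
    while (j < m) {
        out[k++] = b[j++];
    }
}

int main(void) {
    int a[] = {1, 4, 9};
    int b[] = {2, 3, 10, 12};
    int out[7];
    two_way_merge(a, 3, b, 4, out);
    for (size_t k = 0; k < 7; k++) {
        printf("%d ", out[k]); // prints 1 2 3 4 9 10 12
    }
    printf("\n");
    return 0;
}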
Recurrence Equation
The Recurrence Equation is used to express the run time of a recursive Divide and Conquer algorithm.
$ T(n) = a \cdot T\left(\frac{n}{b}\right) + D(n) + C(n) $
$ T(n) $: The runtime of the algorithm
$ a $: The number of subproblems divided at each level, e.g. in Merge Sort it’s 2 since the array is divided in half at each level
$ T\left(\frac{n}{b}\right) $: The time complexity of recursively sorting a subarray of size $n/b$
$ b $: How much smaller the subproblem becomes, e.g. in Merge Sort it’s 2 since the subproblems are half of the parent problem size. In Binary Search $b$ is also 2, while $a = 1$ since only one side of the split is used
$ D(n) $: The time it takes to divide the problem into subproblems
$ C(n) $: The time it takes to combine the results of the subproblems
The Recurrence Equation can also be written by folding $D(n)$ and $C(n)$ into a single non-recursive term $f(n)$.
$ T(n) = a \cdot T\left(\frac{n}{b}\right) + f(n) $
We can formally represent the Recurrence Equation of Merge Sort as below:
$ T(n) = \begin{cases} \Theta(1) & \text{if } n = 1, \\ 2T\left(\frac{n}{2}\right) + \Theta(n) & \text{if } n > 1. \end{cases} $
If the size of the array is 1, we don’t need to recursively solve the problem. This is the base case and we simply return. This takes constant time.
$a = 2$ because we are dividing the problem into two subproblems each time.
$b = 2$ because when we divide the subproblems, we are dividing the problem by half.
The merge function that combines the sorted halves is $\Theta(n)$.
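To see where the runtime comes from before applying any formal method, the recurrence can be unrolled level by level (a sketch, writing the $\Theta(n)$ term as $cn$ for some constant $c$):
$ T(n) = 2T\left(\frac{n}{2}\right) + cn = 4T\left(\frac{n}{4}\right) + 2cn = 8T\left(\frac{n}{8}\right) + 3cn = \cdots = 2^i T\left(\frac{n}{2^i}\right) + i \cdot cn $
Setting $n/2^i = 1$, i.e. $i = \log_2 n$, gives $T(n) = n \cdot \Theta(1) + cn \log_2 n = \Theta(n \log n)$.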
Master Method
The Master Method gives us a way to determine the run time of the Recurrence Equation.
We need to explain the variables that will be used in the Master Method.
$ f(n) = \text{the non-recursive "conquer" function} $
$ p = \log_b a $
- $p$ is the exponent that describes the growth rate of the recursive work
$ \epsilon = |p - k|, \text{ where } k \text{ is the exponent of } f(n) $
- $\epsilon$ shows the growth rate difference of $f(n)$ to $n^p$
The Master Method essentially compares the growth of $f(n)$ to $n^p$ to determine the Time Complexity using $\epsilon$.
There are 3 cases:
- $\epsilon > 0$ and $f(n)$ grows slower than $n^p$
Time Complexity is $\Theta(n^p)$
This means that the non-recursive (conquer) function grows slower than the recursive work.
The recursive term of the Recurrence Equation dominates, so the time complexity is exactly $\Theta(n^p)$
- $\epsilon = 0$
Time Complexity is $\Theta(n^p \log n)$
This means both the recursive and the non-recursive function make equivalent contributions to the Time Complexity
The $\log n$ factor comes from the number of levels of recursion
- $\epsilon > 0$ and $f(n)$ grows faster than $n^p$
Time Complexity is $\Theta(f(n))$
This means the non-recursive function dominates the Time Complexity, therefore the Time Complexity is $\Theta(f(n))$
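As a quick worked example beyond Merge Sort, consider Binary Search, whose recurrence is $T(n) = T(n/2) + \Theta(1)$ (one subproblem, half the size, constant non-recursive work):
$ p = \log_2 1 = 0, \quad f(n) = \Theta(1) = \Theta(n^0), \quad k = 0, \quad \epsilon = p - k = 0 $
This matches the second case, so the Time Complexity is $\Theta(n^0 \log n) = \Theta(\log n)$.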
Recursion Tree Method
The Recursion Tree Method is a useful method for deducing a runtime.
The method requires drawing the recursion tree.
It can be summarized in 3 parts.
- Write down the Recurrence Equation
$ T(n) = aT(n/b) + f(n) $
- e.g. Merge Sort
$ T(n) = 2T(n/2) + \Theta(n) $
- Visualize the tree
Start at the root, which will perform $f(n)$ work
Each node has $a$ children at the next level, and each child’s subproblem is $b$ times smaller. For example, in Merge Sort $a = b = 2$, so each node splits into 2 children, each working on half the range
With Merge Sort, the subproblem size at successive levels can be represented as $n/2, n/4, n/8, \ldots$
This split will bottom out when the problem size reaches $1$
The subproblem size at level $i$ is $n/2^i$, so the recursion bottoms out when $n/2^i = 1$; therefore the number of levels in the tree is $\log_b n$ (we divide $n$ by $b$ until it reaches 1)
- Calculate the work per level
Work per level = (num of subproblems) * (work per subproblem)
Using Merge Sort as an example:
- Level 0:
n = 1 * n
- Level 1:
n = 2 * (n/2)
- Level 2:
n = 4 * (n/4)
We can see that the number of subproblems doubles at each level while the work per subproblem is divided in half.
- If we substitute $n = 1$, we can see we get a constant result (work per level):
- $1 = 2 * (1/2)$
- $1 = 4 * (1/4)$
- $1 = 8 * (1/8)$
- Calculate the sum of all work across all the levels.
Total work = (work per level) * (number of levels)
- $(n + n + \cdots + n)$, one $n$ term per level
- $= n \cdot (\text{number of levels})$
For Merge Sort:
Work per level $= n$
Number of levels $= \log_2 n$
- Total work: $n \cdot \log_2 n = n \log_2 n$
Therefore the run time is $\Theta(n \log n)$
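The same total can be written as an explicit sum over the levels of the tree (a sketch for the Merge Sort recurrence, with $2^i$ subproblems of size $n/2^i$ at level $i$):
$ \sum_{i=0}^{\log_2 n - 1} 2^i \cdot \frac{n}{2^i} = \sum_{i=0}^{\log_2 n - 1} n = n \log_2 n $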
We can verify the runtime with the Master Method.
Steps to use the Master Method
Calculate $p$
Calculate $\epsilon$
If $f(n)$ is a polynomial with exponent $k$ then:
- $\epsilon = p - k$
- Subtract from the higher term, e.g. if $k$ is higher then do $\epsilon = k - p$
If $f(n)$ is a logarithm then $\epsilon$ will always be 0
If $f(n)$ has mixed terms, e.g. $f(n) = n^2 \log n$, then use the dominant term $n^2$ as $k$
- $\epsilon = p - k$
- Check the cases according to:
Case 1: $\epsilon > 0$ and $f(n)$ grows slower than $n^p$
- Runtime is: $\Theta(n^p)$
Case 2: $\epsilon = 0$
- Runtime is: $\Theta(n^p \log n)$
Case 3: $\epsilon > 0$ and $f(n)$ grows faster than $n^p$
- Runtime is: $\Theta(f(n))$
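As a worked example of Case 3, consider a hypothetical algorithm with a quadratic conquer step, $T(n) = 2T(n/2) + \Theta(n^2)$:
$ p = \log_2 2 = 1, \quad k = 2, \quad \epsilon = k - p = 1 > 0 $
$f(n) = \Theta(n^2)$ grows faster than $n^p = n$, so Case 3 applies and the runtime is $\Theta(n^2)$.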
Maximum Subarray
The Maximum Subarray problem is a very common Divide and Conquer problem that seeks to find the contiguous subarray with the largest sum within an array.
It has applications in:
Finance:
- Profit maximization when buying and selling stocks
- Analyzing trends in financial data
Signal Processing:
- Identifying regions with the strongest signal in a noisy stream.
We can use the Tree Recursion Method and the Master Method to deduce the Runtime.
#include "../../../common/include/max.h"
#include "../../../common/include/utils.h"
#include <limits.h>
#include <stdio.h>

int find_crossmax(int *arr, int start, int mid, int end) {
    int MAX_L = INT_MIN;
    int MAX_R = INT_MIN;
    int sum = 0;
    // Scan left from the midpoint for the best sum ending at mid - 1.
    for (int i = mid - 1; i >= start; i--) {
        sum += arr[i];
        if (sum > MAX_L) {
            MAX_L = sum;
        }
    }
    sum = 0;
    // Scan right from the midpoint for the best sum starting at mid.
    for (int i = mid; i < end; i++) {
        sum += arr[i];
        if (sum > MAX_R) {
            MAX_R = sum;
        }
    }
    return MAX_L + MAX_R;
}
int max_subarray(int *arr, size_t start, size_t end) {
    // Handle the case of an empty arr.
    if (arr == NULL || end == 0) {
        return 0;
    }
    // Divide:
    if (end - start <= 1) {
        return arr[start];
    }
    // Conquer:
    size_t mid = start + (end - start) / 2;
    const int left = max_subarray(arr, start, mid);
    const int right = max_subarray(arr, mid, end);
    // Combine:
    const int cross_max = find_crossmax(arr, start, mid, end);
    int *max_val = (int *)max3(&left, &right, &cross_max, INT_TYPE);
    return *max_val;
}
find_crossmax is the non-recursive function that finds the largest subarray sum that crosses the midpoint; this covers the case where the maximum subarray straddles the left and right halves of the array.
This works because the maximum subarray must either lie entirely in the left half, lie entirely in the right half, or cross the midpoint. The recursive calls cover the first two cases and find_crossmax covers the third.
At the Combine stage, the recursive calls return the best sums for the left and right subproblems, and the cross max is calculated at the current level. The Combine stage then returns the largest of the three values (the largest subarray sum seen so far).
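A hypothetical driver for the code above (the sample values are arbitrary, and it assumes find_crossmax, max_subarray, and the author’s max3 helper are compiled together):
int main(void) {
    int arr[] = {13, -3, -25, 20, -3, -16, -23, 18, 20, -7, 12, -5, -22};
    size_t n = sizeof arr / sizeof arr[0];
    // The best subarray is {18, 20, -7, 12}, which sums to 43.
    printf("%d\n", max_subarray(arr, 0, n));
    return 0;
}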
Let’s use the Recursion Tree Method to calculate the runtime:
- Write down the Recurrence Equation
$ T(n) = aT(n/b) + f(n) $
We can observe that the find_crossmax function has a linear runtime, since the function always iterates from mid down to start and from mid up to end; together these scans cover the whole range.
The max_subarray function divides the array into $left$ and $right$ subproblems. It divides the problem into 2 sub branches, halving the problem each time.
We can substitute these values into the Recurrence Equation.
$ T(n) = 2T(n/2) + \Theta(n) $
- Visualize the Tree
Starting at the root, there will be $f(n)$ work, which is $\Theta(n)$
The next level will split into 2 subproblems, dividing the work in half
$2 * \Theta(n/2)$
We can continue to represent the next level
$4 * \Theta(n/4)$
This will eventually bottom out when the subproblem can no longer be divided; at level $i$ this can be represented as:
$ 2^i \cdot \Theta\left(\frac{n}{2^i}\right) = \Theta(n) $
- $i$ represents the level of the tree where the recursion bottoms out. We can see that the $2^i$ factors cancel out to leave just $\Theta(n)$.
- Calculate the work per level
We can see at each level there is a constant work of $\Theta(n)$
There are $\log_b n$ levels of work
If we add all the terms:
The work $n$ is performed once for each of the $\log_2 n$ levels
$(n + n + \cdots + n)$, one term per level
- Which is the same as saying $n$ multiplied by $\log n$
$n \cdot \log n$
- Therefore there is $\Theta(n \log n)$ work, which is the Runtime.
We can validate this using the Master Method
- Calculate $p$
$ p = \log_b a $
$ p = \log_2 2 = 1 $
- Calculate $\epsilon$
- $k$ is the exponent of $f(n)$, which is 1 in this case
$ \epsilon = p - k $
$ \epsilon = 1 - 1 = 0 $
- Check the cases:
Case 1:
$\epsilon > 0$ and $f(n)$ grows slower than $n^p$: False
Case 2:
$\epsilon = 0$: True
Therefore the runtime is $\Theta(n log n)$
Intuition
We can summarize our understanding of the typical runtimes for Divide and Conquer/Recursive Algorithms.
n log n
We have the non-recursive function as $\Theta(n)$
The subproblem size decreases geometrically as the array is divided into smaller problems.
At each level, the subproblems together perform the same total amount of work - $\Theta(n)$
Since every level contributes a constant amount of work - $\Theta(n)$, we need to sum the work repeated at every level
Therefore the run time is:
$\Theta(n \log n)$
n^2
In the case where the non-recursive function grows at a faster rate than $\Theta(n)$, e.g. $\Theta(n^2)$:
The subproblem size decreases geometrically as the array is divided into smaller problems.
At each level, the work performed by the subproblems ($\Theta(n^2)$ at the root) decreases quadratically with the subproblem size.
The work performed at each level becomes drastically smaller.
The dominant contribution comes from the root level; since the subproblems’ contributions shrink so drastically, their work is insignificant compared to the root level.
Therefore we ignore the depth of the tree ($\log_b n$) and use the dominant term $\Theta(n^2)$
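This can be made precise with the level sum for $T(n) = 2T(n/2) + \Theta(n^2)$ (a sketch; level $i$ has $2^i$ subproblems of size $n/2^i$):
$ \sum_{i=0}^{\log_2 n - 1} 2^i \cdot \left(\frac{n}{2^i}\right)^2 = n^2 \sum_{i=0}^{\log_2 n - 1} \left(\frac{1}{2}\right)^i \leq 2n^2 = \Theta(n^2) $
The level sums form a decreasing geometric series, so the total is at most a constant multiple of the root’s work.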