3) Sorting Algorithms Lesson

Merge Sort

12 min to complete · By Jon Fincher

The first three sorting algorithms tackled the problem of sorting a dataset as a whole entity -- you scan the entire dataset, comparing and moving values as necessary. For a large dataset, this can be memory intensive as well as time consuming. It was for larger datasets that Merge Sort was invented.

Merge Sort Algorithm

The Merge Sort takes a divide and conquer approach to sorting a dataset. Merge Sort works by dividing the dataset in half, then sorting each half using the Merge Sort algorithm. When each half of the dataset is too small to split any further (i.e. when each half contains only one value), the two halves are merged back together in sorted order. This merging continues until you have a completely sorted dataset.

Merge Sort Animation

Here's what these actions look like with a small dataset:

Merge Sort animation

Credit: Swfung8, CC BY-SA 3.0, via Wikimedia Commons

Merge Sort Big O Complexity

The divide and conquer approach to sorting means that Merge Sort has a Big O rating of $`O(n log\ n)`$ -- dividing the data in sort() provides the $`log\ n`$ portion, while the single loop merge() provides the initial $`n`$. However, there is the problem of making copies of the sub-arrays in merge() which means Merge Sort has a memory requirement of $`O(n)`$, doubling the memory needed for any dataset.

Merge Sort Examples

It's time to tackle how the Merge Sort accomplishes these tasks one at a time -- first, how the dataset is split, then how it is merged back together. Note that it is assumed you are storing integer data in an array called nums.

Merge Sort Example in Java

The first step is splitting the dataset, which requires you to find the middle. If you call the first item in the dataset the left item, and the last item in the dataset the right item, then the middle can be defined as (left + right) / 2. Now you can refer to the first half as nums[left] through nums[middle], and the second half as nums[middle+1] through nums[right].

What if the nums array only has one item in it? In that case, both left and right will be the same, and there is nothing to split.

And then what do you do with those two sub-arrays? Simple -- you call sort() again on each half. Here's what that looks like in code:

void sort(int nums[], int left, int right) {
  // Is there anything to sort here?
  if (left < right) {
    // Find the middle point
    int middle = (left + right) / 2;

    // Sort (recursive) first and second halves
    sort(nums, left, middle);
    sort(nums, middle + 1, right);

    // Merge the sorted halves
    merge(nums, left, middle, right);
  }
}

The if statement checks to see if the sub-array has more than one item in it by ensuring left is less than right. If so, you find the middle, and use it to define two new sub-arrays, each of which you sort(). Once they are sorted, you can merge them together. Your initial call is sort(nums, 0, nums.length-1) to indicate you are sorting the whole array.

So how does the final merge happen? It's easier to look at the code:

void merge(int nums[], int left, int middle, int right) {
  // Find sizes of two subarrays to be merged
  int n1 = middle - left + 1;
  int n2 = right - middle;

  // Create temp arrays
  int leftHalf[] = new int[n1];
  int rightHalf[] = new int[n2];

  // Copy data to temp arrays
  for (int i = 0; i < n1; ++i)
    leftHalf[i] = nums[left + i];
  for (int j = 0; j < n2; ++j)
    rightHalf[j] = nums[middle + 1 + j];

  /* Merge the temp arrays */

  // Initial indexes of first and second subarrays
  int i = 0, j = 0;

  // Initial index of merged subarry array
  int k = left;
  while (i < n1 && j < n2)
    if (leftHalf[i] <= rightHalf[j])
      nums[k++] = leftHalf[i++];
    else
      nums[k++] = rightHalf[j++];


  /* Copy remaining elements of leftHalf[] if any */
  while (i < n1)
    nums[k++] = leftHalf[i++];

  /* Copy remaining elements of rightHalf[] if any */
  while (j < n2)
    nums[k++] = rightHalf[j++];

}

First, you create two copies of the sub-arrays. Then you define three separate indexes to manage the merge, one for each of the sub-array copies (i and j), and one for the nums array (k). Then you can start the merge

Each time the while loop iterates, the lowest item from either the leftHald[] or rightHalf[] arrays is picked and copied to the nums array. This continues until one array has been exhausted of all data. When that occurs, you copy everything that remains from the other array back to nums.

As previously mentioned, the divide and conquer approach to sorting means that Merge Sort has a Big O rating of $`O(n log\ n)`$ -- dividing the data in sort() provides the $`log\ n`$ portion, while the single loop merge() provides the initial $`n`$. However, there is the problem of making copies of the sub-arrays in merge() which means Merge Sort has a memory requirement of $`O(n)`$, doubling the memory needed for any dataset. You can still do better.

Merge Sort Example in Python

The example below defines a merge_sort() funtion in Python. The function takes a list nums as a parameter and sorts the nums list using the merge sort. The merge sort recursively divides the array into halves until it reaches subarrays of size one or zero, and then merges them in a sorted manner. The merge() function is used to combine two sorted halves into a single sorted list.

def merge_sort(nums):
  """
  This function performs the merge sort algorithm on a list.
  :param nums: A list of comparable elements
  :return: The sorted list in ascending order
  """
  # If the array has one or zero elements, it is already sorted
  if len(nums) <= 1:
    return nums

  # Split the array into two halves
  mid = len(nums) / 2
  left_half = nums[:mid]
  right_half = nums[mid:]

  # Recursively sort both halves
  left_half = merge_sort(left_half)
  right_half = merge_sort(right_half)

  # Merge the sorted halves
  return merge(left_half, right_half)


def merge(left, right):
  """
  This function merges two sorted lists into a single sorted list.
  :param left: The left half of the original list
  :param right: The right half of the original list
  :return: The merged and sorted list
  """
  result = []
  i = j = 0

  # Compare elements from both halves and merge them in sorted order
  while i < len(left) and j < len(right):
    if left[i] < right[j]:
      result.append(left[i])
      i += 1
    else:
      result.append(right[j])
      j += 1

    # Append the remaining elements from both halves (if any)
    result.extend(left[i:])
    result.extend(right[j:])

    return result

Summary: Merge Sort

The Merge Sort, $`O(n log\ n)`$, uses recursion to divide the dataset into half, sort each half, then merge them back into a single dataset. The Merge Sort is efficient and well suited to sorting large datasets, though it duplicates the dataset being sorted, which is something you can improve upon.