3) Sorting Algorithms Lesson

Quick Sort

13 min to complete · By Jon Fincher

Note: This article is scheduled to be updated shortly with Python code samples. For now, this article focuses on the quick sort in Java.

Another divide and conquer algorithm is Quick Sort. Like Merge Sort, it divides the dataset into pieces and sorts each piece using Quick Sort. From there, however, it quickly changes direction.

Rather than divide the dataset into equal parts, Quick Sort selects a semi-random pivot value from the dataset. It then partitions the dataset into two ranges -- the first range consists of data that is less than the pivot, and the second range contains data greater than the pivot. Once the ranges are defined, they are each sorted using the same process.

Quick Sort Animation

Here's what these actions look like with a small dataset:

Quick Sort animation

Credit: Swfung8, CC BY-SA 3.0, via Wikimedia Commons

Quick Sort Big O Complexity

Like the Merge Sort, the Big O analysis of the Quick Sort is $`O(log\ n)`$. Unlike Merge Sort, it doesn't require copying the entire dataset to sort -- a single swap variable is all that is needed. Because of this, datasets can be sorted in place using Quick Sort.

Quick Sort Java Example

Again, code is king. First, you will explore how a dataset is partitioned. As before, it is assumed that you are sorting an integer array called nums:

int partition(int nums[], int low, int high) {
  // Pick a semi-random pivot value
  int pivot = nums[high];

  // Where should values less than the pivot go?
  int i = (low - 1); // index of smaller element

  // Check every value in the range except the pivot
  for (int j = low; j < high; j++) {
    // If current element is smaller than or equal to pivot
    if (nums[j] <= pivot) {
      // Move the index up
      i++;

      // Move the smaller number down with a swap
      int temp = nums[i];
      nums[i] = nums[j];
      nums[j] = temp;
    }
  }

  // You've scanned everything -- move the pivot into place
  int temp = nums[i + 1];
  nums[i + 1] = nums[high];
  nums[high] = temp;

  // Return the index of the pivot value
  return i + 1;
}

In most implementations of Quick Sort, the pivot value is picked to be the last item in the range to be sorted -- in this case, that's nums[high]. The variable i tracks a location in the array and is placed initially below the low end of the range -- it will be moved as the partitioning commences.

As the for loop iterates, you check each value to see if it's less than, or equal to, the pivot value. If so, you need to move it to the beginning of the range, but where exactly? This is where the i value comes in. It is always either less than or equal to j, so the value at j can be swapped with it. You have already checked that value, so moving it later in the range is fine.

Once the loop is done, you move the pivot into place -- now everything to the left of the pivot is lower than it, and everything to the right is greater. You return the index of the pivot so it can be used for a later partition.

Again, here's what this looks like for a random dataset. The last item in each range is selected as the pivot value (indicated by the red column), then the items in each range are sorted against this pivot value, and finally, the new sub-ranges are sorted using the same method:

Quick Sort animation

Credit: Swfung8, CC BY-SA 3.0, via Wikimedia Commons

How is this all used? Here's how everything starts:

void sort(int nums[], int low, int high) {
  if (low < high) {
    // pi is partitioning index, nums[pi] is now at right place
    int pi = partition(nums, low, high);

    // Recursively sort elements before and after the partition
    sort(nums, low, pi - 1);
    sort(nums, pi + 1, high);
  }
}

You call sort() with the array to sort the indexes of the first and last items in the array. As long as there is more than one item to be sorted, the array is partitioned. The partition index is stored in pi (not to be confused with Math.PI), and each sub-array is sorted by calling the sort() method recursively.

Why do you use pi - 1 and pi + 1 when making the recursive calls? Remember that the pivot value is the last item in the range -- if your first recursive call were sort(nums, low, pi), then the same pivot value would always be chosen. Since every item in that range is lower than the pivot value, the first range would never get smaller. You would be stuck in an infinite recursive loop, which would crash with an out-of-memory error.

As previously mentioned, like the Merge Sort, the Big O analysis of the Quick Sort is $`O(log\ n)`$. Unlike Merge Sort, it doesn't require copying the entire dataset to sort -- a single swap variable is all that is needed. Because of this, datasets can be sorted in place using Quick Sort.

Alternate Partitioning Schemes

There is a problem with the Quick Sort if the data is already sorted or substantially sorted. By choosing the last value in a sorted range as the pivot, you ensure every item is less than it, resulting in one subrange with all the data and one with no data in it. This means the performance can degrade to $`O(n^2)`$ when working on sorted datasets.

To work around the degradation problem, you can adopt a different partitioning scheme. This algorithm selects the middle value as the pivot and doesn't swap anything unless it's necessary:

int partition2(int nums[], int low, int high) {
  // Pick the middle value as the pivot
  int pivot = nums[(high+low)//2];

  // Where should values less than the pivot go?
  int i = (low - 1);

  // Where should values greater than the pivot go?
  int j = (high + 1);

  // Keep doing this
  while (true){
    // Move i up until you exceed the pivot
    while (nums[++i] < pivot);

    // Move j down until you're lower than the pivot
    while (nums[--j] > pivot);

    // Have you passed each other? j is where the pivot is
    if (i >= j) return j;

    // Swap the items at i and j
    int temp = nums[i];
    nums[i] = nums[j];
    nums[j] = temp;
  }
}

However, to use this, you have to modify your sort() method as well:

void sort2(int nums[], int low, int high) {
  if (low < high) {
    // pi is partitioning index, nums[pi] is now at right place
    int pi = partition2(nums, low, high);

    // Recursively sort elements before and after the partition
    sort2(nums, low, pi);       // Note the pivot is included
    sort2(nums, pi + 1, high);
  }
}

Because you are selecting the middle value as the pivot, you need to include the pivot value so it gets sorted properly.

This partitioning scheme avoids degradation because of the choice of the pivot. Data in the range isn't swapped unless it is on the wrong side of the pivot value. This is not the case in a sorted dataset, so the outer while (true) loop will only iterate once, and each range will be equally sized.

Summary: Quick Sort

Quick Sort, $`O(log\ n)`$, uses recursion to partition the dataset into two ranges, using a pivot value as a separator, then sorting each range separately using Quick Sort as well. The size of the ranges is not fixed as in Merge Sort, and the availability of alternate partitioning algorithms makes Quick Sort both efficient and adaptable.