simple_complexity/part4.txt at master · loqi/simple_complexity

43 lines (32 loc) · 6.84 KB
<a title="Simple Complexity" href="http://stickybits.azurewebsites.net/?p=41">jump to beginning of series</a><a title="Simple Complexity 3" href="http://stickybits.azurewebsites.net/?p=401"> or the previous page</a>
Let's build on what we know. Consider this pseudocode:
<pre>1. loop 10 times
2.      loop n times
3.           sleep 1 millisecond
4. loop n times
5.      loop 100 times
6.           sleep 5 milliseconds</pre>
Lines 1 through 3 run in linear time [math]O(1 * n * 1) = O(n)[/math] and lines 4 through 6 are also linear time [math]O(n * 1 * 1) = O(n)[/math]. So what's this [math]O(n) + O(n)[/math] sum? Our code is structured serially to do all the 1 ms sleeps first and then all the 5 ms sleeps, but we could have interspersed them in other ways to the same effect. Here's the whole thing as a one-liner:
<pre>1. sleep (510*n) milliseconds</pre>
Does this run an O(n) function 510 times, or an O(510) function n times? Well, doing a little thing many times is exactly the same as doing a big thing once. That's why we signify constant time as O(1) and never O(510). It turns out that [math]O(n) + O(n) = O(n)[/math] because doing two little things is the same as doing one bigger thing, even if you do those things n number of times.
Your big O expression broadly classifies your code as sometimes needing to execute a constant-time operation as many as some number of times. But what's this <em>operation</em> thing anyway? Big O doesn't care. It's just some constant-time quantity of computational work that needs to be performed the number of times specified by your big O expression, and as we saw earlier, constant-time operations can be tiny or huge.
This brings us to another key point. Your algorithm doesn't need to do one discrete step [math]n^2[/math] number of times in order to be classified as [math]O(n^2)[/math] time complexity. Let's look at a simple selection sort algorithm in pseudocode.
<pre>function selectionSort(sortArray)
  <strong>unsortedUpTo</strong> = <strong>sortArray.length</strong> - 1
  <strong>loop</strong> (<strong>sortArray.length</strong> - 1) times
      biggestAt = 0
      <strong>for</strong> compareAt in 1..<strong>unsortedUpTo</strong>
          if sortArray[biggestAt] &lt;= sortArray[compareAt]
              biggestAt = compareAt
      swap sortArray[unsortedUpTo], sortArray[biggestAt]
      <strong>decrement unsortedUpTo</strong>
  return sortArray</pre>
The bold text shows we have a loop nested inside a loop, both of which iterate for some flavor of sortArray.length number of times.
The outer loop iterates once per array element. Each time the inner loop runs, it stops one element short of the previous time. This means we're doing approximately [math]n^2/2[/math] number of compare operations: precisely the (n-1)'th triangular number of compares. We're also doing (n-1) number of swaps. Selection sort is classified as a polynomial-time [math]O(n^2)[/math] algorithm. Why not classify this particular selection sort as [math]O(n^2/2 + n/2 - 1)[/math] to be precise?
This is traditionally explained by noting that only the degree of a polynomial makes a significant contribution to the resolution of the function as n approaches infinity. That's a reasonably correct explanation, but in my opinion there's a much more intuitively satisfying way of understanding this. Since big O doesn't care how big one operations is, performing slightly more than [math]n^2/2[/math] operations is equivalent to performing exactly [math]n^2[/math] number of slightly-bigger-than-half-size operations. Little operations and fragments of operations are still operations, so in fact you are doing <em>something</em> exactly [math]n^2[/math] number of times.
We could re-write our selection sort to remember both the greatest and least values and grow a sorted region at each end, and it's still [math]O(n^2)[/math] time complexity because it's doing about a quarter of a compare and an n'th of a swap, about [math]n^2[/math] number of times. Okay, it's actually a quarter of two compares and two-n'ths of a swap, but you take my meaning. Running this algorithm against ten times more data will cause it to do about a hundred times more half-compares, and a hundred times more tenth-of-a-swap operations. Such optimizations may cut the actual running time, which matters a lot, but they won't promote the same algorithm to a flatter big O. No amount of optimization will change the time complexity. Only if we rewrite to an altogether different algorithm can we achieve a flatter complexity profile.
Big O is a lossy notation. It doesn't attempt to account for everything your code does. The sequence in which we execute the operations is irrelevant to the time complexity. We could walk an array once, doing something with each element and then walk the array a second time performing a second operation; or we could walk it just once, with both operations together. Either way, we can simplify all that detail and say we're doing one linear-time [math]O(n)[/math] operation on that array because we're doing <em>something</em> as many as n times, whether we make the choice to do both halves of that something separately or together, and whether that something includes the overhead of a second O(n) loop does not affect our big O expression. [math]O(n) + O(n) = O(n)[/math]
The size of constant-time operations is irrelevant to your time complexity analysis. You're quantifying the number of constant-time operations you have to do with respect to your data as it scales bigger or smaller. Remember [math]O(1) + O(1) = O(1)[/math] so all your constant-time once-per-n operations coalesce onto a single once-per-n operation. Do you have an n-bound loop inside another n-bound loop or do you have a 100-bound loop inside an n-bound loop? That's the difference between [math]O(n^2)[/math] and [math]O(n)[/math] complexities. As n gets big, runtime approaches a proportion of the square of n, or approaches a proportion of just n, respectively. The time complexity reflects this difference, but instead of being measured by a clock or a profiler, it quantifies constant-time operations of unspecified size.
In the selection sort example, we're doing [math]n^2[/math] number of constant-time operations to get the sorting job done. As n grows, the number of steps rises in a steepening pattern, while the amount of work per step stays the same. We could look at this from the inside-out perspective and say we're doing a constant number of polynomial-time steps. This would be just as correct, since multiplication is commutative. As you gain insights into complexity, you'll be able to manipulate these concepts in surprising ways. For now, it's fairly easy to conceptualize a nested loop as a linear-time operation (inner loop) executed a linear number of times (outer loop) combining to run in quadratic time.
Now, let's <a title="Simple Complexity 5" href="http://stickybits.azurewebsites.net/?p=4631">analyze our analysis.</a>
Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FilesExpand file tree

part4.txt

Latest commit

History

part4.txt

File metadata and controls