Worst:
Average: same.
Principle:
- merging two ordered lists of size
ncan be done inO(n)timeO(n)memory. - start with lists of size 1 (always ordered), of items side by side, then 2, then 4, etc.
log(n)times.
The input comes into an array:
input: 4 2 1 3
We first divide it in 1 sized chunks:
4 | 2 | 1 | 3
The first step is to merge:
-
two sorted lists: one containing only
2and the other only44 | 2 | 1 3 ^ ^This gives:
2 | 4 | 1 3 ^ ^ -
two sorted lists: one containing only
1and the other only32 4 | 1 | 3 ^ ^Since they are already sorted, this gives:
2 4 | 1 | 3 ^ ^
Keep in mind that an array with a single element is always sorted.
Next we split things into 2 sized chunks:
2 4 | 1 3
And we merge two sorted lists 2 4 and 1 3 with
1 2 3 4
If there were still more elements, we would take sizes 4, 8 and so on.
In another section we will go into detail on how to do the merge of two sorted arrays efficiently.
The merge step takes two ordered arrays of same size and merges them.
It is the key for the efficiency of this algorithm.
Let's merge the two following ordered arrays: 2 4 and 1 3, both of size n = 2.
They are both on a single input array side by side:
input: 2 4 | 1 3
The | is just to ease the visualization: it does not exist on the array.
First make a copy of the input (O(n) memory):
input: 2 4 | 1 3
output: 2 4 1 3
Now point to their elements with indexes k, i and j as:
input: 2 4 | 1 3
^ ^
i j
output: 2 4 1 3
^
k
Compare values of
Put the smallest one into
input: 2 4 | 1 3
^ ^
i j
output: 1 4 1 3
^
k
So the smallest was 1 (the other was 2).
Move
input: 2 4 | 1 3
^ ^
i j
output: 1 4 1 3
^
k
This is the end of the step.
In total we did:
- one comparison:
$A[i]$ and$A[j]$ - one value copy:
$A[i]$ or$A[j]$ to$A[k]$ - two increments:
$i$ and$j$
Repeat step:
input: 2 4 | 1 3
^ ^
i j
output: 1 2 1 3
^
k
Repeat step:
input: 2 4 | 1 3
^ ^
i j
output: 1 2 3 3
^
k
Oops,
Last rule: if one of them falls out, it equals infinity.
We can now just copy the remaining elements of the other side into the output
with an efficient memcpy (in this case a single element 4):
input: 2 4 | 1 3
^ ^
i j
output: 1 2 3 4
^
k
Opps,
As we can see, the output contains the elements of both inputs and is ordered.
We took each step exactly
-
$k$ increases once per step - the step ends when
$k$ falls out of the output.
Therefore each merge step of two arrays of size
Let
We have:
We can then verify that the solution has the form:
where:
Let
We need to work with powers of 2, so let
For a simple analysis, we can just fill all values between
We are doing:
-
$N/2$ merges with$n = 1$ -
$N/4$ merges with$n = 2$ -
$N/8$ merges with$n = 4$ - ...
-
$N/2^{m+1}$ merges with$n = 2^m$
Since each merge of size
-
$N/2$ times$Cn$ operations -
$N/4$ times $2Cn$ operations -
$N/8$ times $4Cn$ operations - ...
-
$N/2^{m+1}$ times$n = 2^m$
which makes a total of:
-
$C*N/2$ operations -
$C*N/2$ operations - ...
-
$C*N/2$ operations
m times.
But since
- one comparison
- one value copy
- two increments
The memory worst case was all on the last merge, which required merging