Merge Sort
Merge sort is an algorithm based on the principle of divide and conquer (in Latin divide et impera). Meaning that if we're unable to solve a problem as a whole we should split it into multiple smaller and simpler pieces and apply the same procedure on those pieces. In other words, we split them again into even smaller pieces (recursion is a common solution) until we get to a point where we're able to solve the problem without any issues. A common goal in sorting is to get an array with a size of 1 which would be considered automatically sorted.
Merge sort is all about very quickly and linearly merging two pre-sorted arrays into one so that the resulting array is sorted too. We'll start out with one unsorted array. However, we're able to split it into two halves and do the same to the resulting halves and so on. In the deepest level of recursion, we'll be at a point where we'll have arrays with only one item. One item arrays can technically be seen as sorted - more so, sorted trivially. Then, we merge those one item arrays into two-item arrays, then, into four item arrays and so on. Once we get two big arrays, we'd merge them as well to get a sorted version of our original array. Since we always merge after splitting into halves, we'll en up doing so "log n" times (where the base of the logarithm is 2 because we split it into 2 halves). The merging itself is done in O(n) units in time, so the resulting time complexity of the algorithm will be O(n log n). Let's take a look at how the arrays are merged first.
Merging
We have 2 arrays, which are already sorted:
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Notice that the arrays do not have to have the same size. If we split an array with seven values in Merge sort, we'd get something like these two as the result. The problem with Merge sort is that we need an auxiliary memory to work with it, which would be empty for now.
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Auxiliary memory:
Now, let's move on to merging. In each array, we'll need 2 indexes from the beginning at the first index (shown in bold below).
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Auxiliary memory:
Compare items at index positions and place the lower item into the auxiliary array at a position which is the sum of these indexes. Both indexes are currently at position 0 (the first position), so the position of the auxiliary array will be 0 + 0 = 0. We'll compare and find out that 1 < 2, so we place 1 in the 0th position in the auxiliary array and increment the index by 1.
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Auxiliary memory:
1 |
2 < 4, so we place two at 0 + 1 = 1 and move the index.
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Auxiliary memory:
1 | 2 |
3 < 4, so we place 3 at 1 + 1 = 2 and move the index.
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Auxiliary memory:
1 | 2 | 3 |
So on and so forth...
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Auxiliary memory:
1 | 2 | 3 | 4 |
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Auxiliary memory:
1 | 2 | 3 | 4 | 5 |
Now we're at a point where we got at least one of the indexes out of the array, so let's check whether the second index is still in the array and merge the rest of the items.
2 | 3 | 5 | |
1 | 4 | 7 | 8 |
Auxiliary memory:
1 | 2 | 3 | 4 | 5 | 7 | 8 |
That's it, the resulting array is now sorted! In practice, we wouldn't get two pre-sorted arrays, instead, we'd get 1 shuffled array. Therefore, we'd have to get to the one item arrays using recursion and then begin to merge them.
Process
The process is shown in the following diagram:
Possible improvements
The pitfall here is the need of the auxiliary memory. If we look at the diagram above, we can see everything that the function must hold in the memory. However, we could sort it another way - take the original array as one item arrays and merge these into two-item arrays, then, four item arrays, and so on. We'd still need an auxiliary memory, but with a bit of work, we'd only need one which would be of the same size as the original array. There, we'd save the merged result and then merge it into the original array and vice-versa. This way, we'd pour data alternately between those arrays.
Algorithm properties
Time complexity | O(n log n) |
---|---|
Stability | Yes |
Speed | Fast |
Note: the time complexity is based on average cases and the speed is like this when compared to all of the other sorting algorithms.
Source code
// merge two sorted arrays into one public static void merge(int[] list, int[] left, int[] right) { int i = 0; int j = 0; // until we get out of one array while ((i < left.length) && (j < right.length)) { // place the smaller item from both arrays and move the index if (left[i] < right[j]) { list[i + j] = left[i]; i++; } else { list[i + j] = right[j]; j++; } } // merge the rest from the non-empty array if (i < left.length) { while (i < left.length) { list[i + j] = left[i]; i++; } } else { while (j < right.length) { list[i + j] = right[j]; j++; } } } // sorting public static void merge_sort(int[] list) { if (list.length <= 1) // recursion condition return ; int center = list.length / 2; // the center of the array int[] left = new int[center]; // create and fill the left array for (int i = 0; i < center; i++) left[i] = list[i]; int[] right = new int[list.length - center]; // create and fill the right array for (int i = center; i < list.length; i++) // create and fill the right array right[i - center] = list[i]; merge_sort(left); // recursion call to both new arrays merge_sort(right); merge(list, left, right); // merge the arrays }
// merge two sorted arrays into one public static void merge(int[] list, int[] left, int[] right) { int i = 0; int j = 0; // until we get out of one array while ((i < left.Length) && (j < right.Length)) { // place the smaller item from both arrays and move the index if (left[i] < right[j]) { list[i + j] = left[i]; i++; } else { list[i + j] = right[j]; j++; } } // merge the rest from the non-empty array if (i < left.Length) { while (i < left.Length) { list[i + j] = left[i]; i++; } } else { while (j < right.Length) { list[i + j] = right[j]; j++; } } } // sorting public static void merge_sort(int[] list) { if (list.Length <= 1) // recursion condition return ; int center = list.Length / 2; // the center of the array int[] left = new int[center]; // create and fill the left array for (int i = 0; i < center; i++) left[i] = list[i]; int[] right = new int[list.Length - center]; // create and fill the right array for (int i = center; i < list.Length; i++) // create and fill the right array right[i - center] = list[i]; merge_sort(left); // recursion call to both new arrays merge_sort(right); merge(list, left, right); // merge the arrays }
// merge two sorted arrays into one procedure merge(var list, left, right: array of integer); var i, j: integer; begin i:=0; j:=0; // until we get out of one array while ((i < length(left)) and (j < length(right))) do begin // place the smaller item from both arrays and move the index if (left[i] < right[j]) then begin list[i + j]:=left[i]; inc(i); end else begin list[i + j]:=right[j]; inc(j); end; end; // merge the rest from the non-empty array if (i < length(left)) then begin while (i < length(left)) do begin list[i + j]:=left[i]; inc(i); end; end else begin while (j < length(right)) do begin list[i + j]:=right[j]; inc(j); end; end; end; // sorting procedure merge_sort(var list: array of integer); var center, i: integer; left, right: array of integer; begin if (length(list) <= 1) then exit; // recursion condition center:=length(list) div 2; // the center of the array setlength(left, center); // create and fill the left array for i:=0 to (center - 1) do left[i]:=list[i]; setlength(right, length(list) - center); // create and fill the right array for i:=center to (length(list) - 1) do right[i - center]:=list[i]; merge_sort(left); // recursion call to both new arrays merge_sort(right); merge(list, left, right); // merge the arrays end;
# merge two sorted arrays into one def merge(list, left, right) i = 0 j = 0 # until we get out of one array while ((i < left.length) && (j < right.length)) # place the smaller item from both arrays and move the index if (left[i] < right[j]) list[i + j] = left[i] i += 1 else list[i + j] = right[j] j += 1 end end # merge the rest from the non-empty array if (i < left.length) while (i < left.length) list[i + j] = left[i] i += 1 end else while (j < right.length) list[i + j] = right[j] j += 1 end end end # sorting def merge_sort(list) return if (list.length <= 1) # recursion condition center = list.length / 2 # the center of the array left = Array.new(center) # create and fill the left array center.times { |i| left[i] = list[i] } right = Array.new(list.length - center) # create and fill the right array center.upto(list.length - 1) { |i| right[i - center] = list[i] } merge_sort(left) # recursion call to both new arrays merge_sort(right) merge(list, left, right) # merge the arrays end