Merge sort failing to work in N logN

Question

I have written a merge sort algorithm, but it fails to sort in N log N. The code for sorting an array list is:

void merge(int start, int mid, int end) {
    int i,j,q;

    for (i=start; i<=end; i++) 
        l[i]=list[i]; 

    i=start; j=mid+1; q=start; 

    while (i<=mid and j<=end) { 
        if (l[i]<l[j]) 
            list[q++]=l[i++];
        else
            list[q++]=l[j++];
    }

    while (i<=mid) 
            list[q++]=l[i++];
}


void mergesort(int start, int end) {
    int mid;
    if (start<end) {
        mid=(start+end)/2;
        mergesort(start, mid);   
        mergesort(mid+1, end);  
        merge(start, mid, end); 
    }
}

However, if I for example sort 7800 numbers, the runtime is approximately 1.243 ms. The same sample is sorted by std::sort in 0.668 ms, and I know that sort has a complexity of N logN. What is wrong with my code? I cannot seem to find the waste of time.

Measurement of time:

#include <time.h>

clock_t start = clock();

//SORTING ALGORITHM HERE//

clock_t stop = clock();
double elapsed =(stop - start) * 1000.0 / CLOCKS_PER_SEC;

Show source
| sorting   | c++   | algorithm   | big-o   2017-01-05 14:01 2 Answers

Answers to Merge sort failing to work in N logN ( 2 )

  1. 2017-01-05 14:01

    Assuming your implementation is correct, two O(N logN) won't necessarily run at the same amount of time. Asymptoptic complexity is a measure of how much the resources required to run a program grow as the input becomes very large. Just to give you an example, the following loops are both O(1), since each of them always run a constant number of steps:

    for (i = 0; i < 10; i++) {
        printf("%d\n", i);
    }
    
    for (i = 0; i < 1000000000; i++) {
        printf("%d\n", i);
    }
    

    But there's no question that the second will take much longer to run. As a matter of fact, the runtime gap between these two loops will be significantly larger than the gap you are observing for your sort algorithm versus std::sort. That's because asymptoptic analysis neglects constants.

    Moreover, asymptoptic complexity is usually for the average or worst case scenario. The same algorithm can run in more or less time for inputs of equal size depending on the data.

    Not to mention that std::sort is very likely not a single sorting algorithm. It probably uses different strategies depending on the size of the array. In fact, somple implementations of std::sort use a mixture of algorithms.

    The proper way to analyze a program's complexity is by reading the code. For a numerical approach, the closest you can do is to run your program, without comparing it to other programs, for several inputs of different sizes. Plot a graph an observe the curve.

  2. 2017-01-05 17:01

    In the case of Visual Studio, std::sort() is a mix of quick sort, heap sort (only to prevent worst case O(n^2) time complexity), and insertion sort, while std::stable_sort(), is a mix of merge sort and insertion sort. Both are reasonably fast, but it's possible to write faster code. The example code in the question is copying data before every merge, which consumes time. This can be avoided by doing a one time allocation of a working buffer, and switching merge direction based on level of recursion, using a pair of mutually recursive functions (shown below), or a boolean parameter to control merge direction (not used in example below).

    Example C++ code for top down merge sort that is reasonably optimized (bottom up merge sort would be slightly faster, as it skips the recursion used to generate indices, using iteration instead).

    // prototypes
    void TopDownSplitMergeAtoA(int a[], int b[], size_t ll, size_t ee);
    void TopDownSplitMergeAtoB(int a[], int b[], size_t ll, size_t ee);
    void TopDownMerge(int a[], int b[], size_t ll, size_t rr, size_t ee);
    
    void MergeSort(int a[], size_t n)       // entry function
    {
        if(n < 2)                           // if size < 2 return
            return;
        int *b = new int[n];
        TopDownSplitMergeAtoA(a, b, 0, n);
        delete[] b;
    }
    
    void TopDownSplitMergeAtoA(int a[], int b[], size_t ll, size_t ee)
    {
        if((ee - ll) == 1)                  // if size == 1 return
            return;
        size_t rr = (ll + ee)>>1;           // midpoint, start of right half
        TopDownSplitMergeAtoB(a, b, ll, rr);
        TopDownSplitMergeAtoB(a, b, rr, ee);
        TopDownMerge(b, a, ll, rr, ee);     // merge b to a
    }
    
    void TopDownSplitMergeAtoB(int a[], int b[], size_t ll, size_t ee)
    {
        if((ee - ll) == 1){                 // if size == 1 copy a to b
            b[ll] = a[ll];
            return;
        }
        size_t rr = (ll + ee)>>1;           // midpoint, start of right half
        TopDownSplitMergeAtoA(a, b, ll, rr);
        TopDownSplitMergeAtoA(a, b, rr, ee);
        TopDownMerge(a, b, ll, rr, ee);     // merge a to b
    }
    
    void TopDownMerge(int a[], int b[], size_t ll, size_t rr, size_t ee)
    {
        size_t o = ll;                      // b[]       index
        size_t l = ll;                      // a[] left  index
        size_t r = rr;                      // a[] right index
        while(1){                           // merge data
            if(a[l] <= a[r]){               // if a[l] <= a[r]
                b[o++] = a[l++];            //   copy a[l]
                if(l < rr)                  //   if not end of left run
                    continue;               //     continue (back to while)
                while(r < ee)               //   else copy rest of right run
                    b[o++] = a[r++];
                break;                      //     and return
            } else {                        // else a[l] > a[r]
                b[o++] = a[r++];            //   copy a[r]
                if(r < ee)                  //   if not end of right run
                    continue;               //     continue (back to while)
                while(l < rr)               //   else copy rest of left run
                    b[o++] = a[l++];
                break;                      //     and return
            }
        }
    }
    

Leave a reply to - Merge sort failing to work in N logN

◀ Go back