Probably you know sorting at good level, wrote quicksort and other fast algorithms thousand times. The best sorting algorithms which as main operation have comparing one value to another has O(nlogn) complexity. With the assumption of comparing values they couldn’t be faster. But are there any sorting algorithms, that sorts and are faster than comparision algorithms?
The answer is: yes.
In this article I will introduce counting sort which does not do any comparing, just counting indexes and adding them together. It has O(n) complexity, so it’s much faster than comparision algorithms.
Let’s do simple test and see implementation of both algorithms in c++.
#include// printf, puts #include // srand, rand #include // time, clock, time_t #define MAX 10000000 // array size #define RANGE 1000 // range void generate(int*, int); void write(int*, int); void quick_sort(int *, int, int); void counting_sort(int *, int, int); int main(int argc, char** argv) { int* T = new int[MAX]; time_t start, end; long dif; srand(time(NULL)); puts("quicksort, O(nlog(n))"); generate(T, MAX); //write(T,MAX); start = clock(); quick_sort(T, 0, MAX - 1); end = clock(); dif = end - start; //write(T,MAX); printf("%ld\n", dif / 1000); puts("counting sort, O(n)"); generate(T, MAX); //write(T,MAX); start = clock(); counting_sort(T, MAX, RANGE); end = clock(); dif = end - start; //write(T,MAX); printf("%ld\n", dif / 1000); delete[] T; return 0; } // helpers void generate(int* a, int max) { for (int i = 0; i < max; ++i) a[i] = rand() % RANGE; } void write(int* a, int max) { for (int i = 0; i < max; ++i) printf(" %d", a[i]); printf("\n"); } // sorting functions void quick_sort(int *a, int s, int k) { int left = s, right = k, med = a[(s + k) / 2], tmp; do { while (a[left] < med) ++left; while (a[right] > med) --right; if (left <= right) { tmp = a[left]; a[left] = a[right]; a[right] = tmp; ++left; --right; } } while (left < right); if (left < k) quick_sort(a, left, k); if (s < right) quick_sort(a, s, right); } // assumptions: // - values of A are greater than 0 // - max value of A is known void counting_sort(int *A, int size, int range) { // tmp array int *B = new int[size]; // array for counting indexes int *C = new int[range + 1]; for (int i = 0; i <= range; ++i) C[i] = 0; // count appearances of values for (int i = 0; i < size; ++i) ++C[A[i]]; // count the places where values // should be after sorting - // just add previous indexes counts for (int i = 1; i <= range; ++i) C[i] += C[i - 1]; // copy values from A to the right // place in B, and decrement C[A[i]] by one - // because you already place value on this index for (int i = size - 1; i >= 0; --i) { B[C[A[i]] - 1] = A[i]; --C[A[i]]; } // copying back to A for (int i = 0; i < size; ++i) A[i] = B[i]; delete[] B; delete[] C; }
prints:
quicksort, O(nlog(n)) 2780 counting sort, O(n) 430
As you see, taking under consideration array size of 10’000’000 elements counting sort is 6-7 times faster than quicksort. Using counting sort when assumptions given above in code are met (if not – you can do some mapping, for example: if you have values below zero – just add to them some constant, and remove it in result array!) may improve speed of your applications.
Another examples of O(n) sorting algorithms are bucket sort and radix sort, which are also very helpful. I’ll describe them in future in another articles.
See also other linear complexity sorting algorithms:
- C++ radix sort implementation,
- C++ bucket sort implementation – list sorting.