A sorting algorithm is an algorithm that puts elements of a list into an order. We’ve already discussed common sorting algorithms in a previous post – Common Sorting Algorithms
However, in this post we’ll explore three sorting algorithms namely — Counting Sort, Radix Sort, and Bucket Sort, that have linear time complexity under certain conditions. These sorting algorithms are particularly useful when the range of input elements is relatively small compared to the number of elements being sorted.
Traditional sorting algorithms like Quick Sort, Merge Sort, and Heap Sort have average time complexities of O(n logn), making them efficient for general-purpose sorting. However, when the input elements have a limited range, sorting can be done more efficiently in linear time.
Counting Sort, Radix Sort, and Bucket Sort are linear-time sorting algorithms that exploit specific properties of the input data to achieve efficient sorting. They are particularly effective when the range of input elements is known and relatively small compared to the number of elements.
Counting Sort
Counting Sort is a non-comparison-based sorting algorithm that works efficiently when the range of input elements is small. It counts the occurrences of each distinct element and then uses this information to place each element in its correct sorted position. Also, it’s a stable sorting algorithm.
Pseudocode
// A is the input array, k is the range of elements in array A
countingSort (A, k):
n = length(A)
count = array of size k+1 initialized with zeros
output = array of size n
// Count the occurrences of each element
for i = 1 to n:
count[A[i]] += 1
// Calculate cumulative counts
// count[i] contains number of elements
// less than or equal to i
for i = 1 to k:
count[i] += count[i - 1]
// Place each element in its correct sorted position
for i = n downto 1:
output[count[A[i]]] = A[i]
count[A[i]] -= 1
return output
Code Implementation
//
// main.cpp
// Linear Time Sorting (Counting Sort)
//
// Created by Himanshu on 05/03/24.
//
#include <iostream>
#include <vector>
using namespace std;
void printList (vector<int> arr) {
int n = (int) arr.size();
for (int i=0; i<n; i++) {
cout<<arr[i]<<" ";
}
}
vector<int> countingSort (vector<int> arr, int k) {
int n = (int) arr.size();
vector<int> count(k + 1, 0);
vector<int> output(n);
// Count the occurrences of each element
for (int i = 0; i < n; i++) {
count[arr[i]]++;
}
// Calculate cumulative counts
// count[i] contains number of elements
// less than or equal to i
for (int i = 1; i <= k; i++) {
count[i] += count[i - 1];
}
// Place each element in its correct sorted position
for (int i = n - 1; i >= 0; i--) {
output[count[arr[i]] - 1] = arr[i];
count[arr[i]]--;
}
return output;
}
int main () {
vector<int> arr = {4, 2, 2, 8, 3, 3, 1};
int k = 8;
vector<int> sortedList = countingSort(arr, k);
cout<<"Counting Sort:"<<endl;
printList(sortedList);
return 0;
}
Output
Counting Sort:
1 2 2 3 3 4 8
Time complexity: O(n + k), where n is the number of elements in the input array and k is the range of input elements.
Auxiliary Space: O(n + k), for the count array and to store the input elements.
Radix Sort
Radix Sort is a non-comparison-based sorting algorithm that sorts elements by their individual digits or bytes. It sorts the elements from the least significant digit (or byte) to the most significant digit (or byte) using a stable counting sort or bucket sort as a subroutine.
A stable sorting algorithm maintains the relative order of records with equal keys. This means that when a collection is sorted with a stable sorting algorithm, items with the same sort keys preserve their order.
Pseudocode
radixSort (A, d):
for i = 1 to d:
use a stable sort to sort array A on digit i
Code Implementation
//
// main.cpp
// Linear Time Sorting (Radix Sort)
//
// Created by Himanshu on 05/03/24.
//
#include <iostream>
#include <vector>
#include <cmath>
using namespace std;
void printList (vector<int> arr) {
int n = (int) arr.size();
for (int i=0; i<n; i++) {
cout<<arr[i]<<" ";
}
cout<<endl;
}
// Radix Sort implementation
// Assuming a stable 'Counting Sort' as a subroutine
// Here, d is the maximum number of digits in any element
vector<int> radixSort (vector<int> arr, int d) {
int n = (int) arr.size();
// Perform counting sort for each digit from least significant to most significant
for (int i = 1; i <= d; i++) {
// Counting sort on digit i
vector<int> count(10, 0);
vector<int> output(n);
// Count the occurrences of each digit
for (int j = 0; j < n; j++) {
// calculating ith digit of the element arr[j] from the last
int digit = (arr[j] / int(pow(10, i - 1))) % 10;
count[digit]++;
}
// Calculate cumulative counts
// count[j] contains number of elements
// with ith digit (from the last) less than or equal to j
for (int j = 1; j < 10; j++) {
count[j] += count[j - 1];
}
// Place each element in its correct sorted position
for (int j = n - 1; j >= 0; j--) {
int digit = (arr[j] / int(pow(10, i - 1))) % 10;
output[count[digit] - 1] = arr[j];
count[digit]--;
}
// Update the original array
arr = output;
printf("Array after %d iteration: \n", i);
printList(arr);
}
return arr;
}
int main () {
vector<int> arr = {329, 457, 657, 839, 436, 720, 355};
int d = 3; // Maximum number of digits in any element
vector<int> sortedList = radixSort(arr, d);
cout<<endl<<"Radix Sort:"<<endl;
printList(sortedList);
return 0;
}
Output
Array after 1 iteration:
720 355 436 457 657 329 839
Array after 2 iteration:
720 329 436 839 355 457 657
Array after 3 iteration:
329 355 436 457 657 720 839
Radix Sort:
329 355 436 457 657 720 839
Notice how after ith iteration, array elements are sorted by ith digit from the last.
Time complexity: O(d * (n + k)), where d is the maximum number of digits or bytes in any element, n is the number of elements, and k is the base of the number system being used.
Auxiliary Space: O(n + k), for the count array and to store the sorted output array.
Bucket Sort
Bucket Sort is a distribution-based sorting algorithm that divides the input into a finite number of buckets, each containing a subset of the input elements. It then sorts each bucket individually, either recursively or using another sorting algorithm, and concatenates the sorted buckets to obtain the final sorted array.
Pseudocode
bucketSort (A):
n = length(A)
B = array of empty lists
for i = 1 to n:
insert A[i] into B[floor(n * A[i])]
for i = 0 to n-1:
sort list B[i] using any sorting algorithm
concatenate the lists B[0], B[1], ..., B[n-1] together in order
Code Implementation
//
// main.cpp
// Linear Time Sorting (Bucket Sort)
//
// Created by Himanshu on 05/03/24.
//
#include <iostream>
#include <vector>
using namespace std;
void printList (vector<float> arr) {
int n = (int) arr.size();
for (int i=0; i<n; i++) {
cout<<arr[i]<<" ";
}
}
void insertionSort (vector<float>& arr) {
int n = (int) arr.size();
for (int i = 1; i < n; i++) {
float key = arr[i];
int j = i - 1;
while (j >= 0 && arr[j] > key) {
arr[j + 1] = arr[j];
j = j - 1;
}
arr[j + 1] = key;
}
}
vector<float> bucketSort (vector<float> arr) {
int n = (int) arr.size();
vector<vector<float>> B(n);
// Distribute elements into buckets
for (int i = 0; i < n; i++) {
int bucket_index = n * arr[i];
B[bucket_index].push_back(arr[i]);
}
// Sort each bucket
for (int i = 0; i < n; i++) {
insertionSort(B[i]);
}
// Concatenate the sorted buckets
vector<float> sorted;
for (int i = 0; i < n; i++) {
for (float num : B[i]) {
sorted.push_back(num);
}
}
return sorted;
}
int main () {
vector<float> arr = {0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68};
vector<float> sortedList = bucketSort(arr);
cout<<"Bucket Sort:"<<endl;
printList(sortedList);
return 0;
}
Output
Bucket Sort:
0.12 0.17 0.21 0.23 0.26 0.39 0.68 0.72 0.78 0.94
Time complexity: O(n + k), where n is the number of elements in the input array and k is the number of buckets. However, in the worst case, when all the elements are in the same bucket, the time complexity of bucket sort is O(n2), as all the elements need to be compared to each other.
Auxiliary Space: O(n + k), for the buckets.
Time complexity analysis
Here, worst case time complexity could be reduced to O(n logn), if we use an O(n logn) sorting algorithm, such as Merge Sort or Heap Sort, to sort each bucket.
Bucket Sort divides the input array into a number of buckets based on the range of input values and distributes elements into these buckets.
If the elements are uniformly distributed across the buckets, each bucket will contain approximately n/k elements, where n is the total number of elements in the input array, and k is the number of buckets.
- Sorting each bucket individually using an efficient algorithm typically takes:
O(n/k * log(n/k)) time. - Considering there are k buckets, the total time spent on sorting all the buckets is approximately:
k * (n/k * log(n/k)) , which simplifies to O(n * log(n/k)) . - If the number of buckets ( k ) is chosen to be proportional to n (i.e., k ≈ n ), the time complexity becomes O(n * log(1)) = O(n) . Here, you could avoid
log(1)
, since time complexity of sorting a single element is O(1).
Therefore, in the average case scenario with uniform distribution of elements among the buckets, the average time complexity of Bucket Sort approaches O(n).