
Faster Processing of Sorted Arrays in C++
In C++, processing a sorted array is often faster than processing the same data unsorted, even when the loop does exactly the same amount of work. The reasons lie in how the CPU predicts branches and accesses memory. In this article, we see why this is the case and how sorting can improve the performance of array processing. First, let's look at an example that compares the processing time of a sorted and an unsorted array.
Processing Time: Sorted vs Unsorted Array
In the example code below, we use the clock() function from the <ctime> header of the C++ standard library to measure the time taken to process an unsorted array and a sorted array. The code counts how many elements in the array are less than N/2, where N is the size of the array.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iomanip>
#include <iostream>
using namespace std;

const int N = 100001;

int main() {
    int arr[N];

    // Assign random values to the array
    for (int i = 0; i < N; i++)
        arr[i] = rand() % N;

    // Run a 'for' loop over the unsorted array
    int count = 0;

    // Start the clock to measure time
    clock_t start = clock();
    for (int i = 0; i < N; i++)
        if (arr[i] < N / 2)
            count++;
    // End the clock
    clock_t end = clock();

    // Force fixed-point format with 6 decimal places
    cout << fixed << setprecision(6);

    // Print the time taken for the unsorted array
    cout << "Time for unsorted array : "
         << (double(end - start) / CLOCKS_PER_SEC) << endl;

    // Now sort the array
    sort(arr, arr + N);

    // Run the same 'for' loop over the sorted array
    count = 0;
    start = clock();
    for (int i = 0; i < N; i++)
        if (arr[i] < N / 2)
            count++;
    end = clock();

    cout << "Time for sorted array : "
         << (double(end - start) / CLOCKS_PER_SEC) << endl;

    return 0;
}
A sample run of the above code produced the following output:
Time for unsorted array : 0.000280
Time for sorted array : 0.000058
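Note: The exact timings vary with the compiler, the optimization level, the hardware, and the random values generated. With optimizations enabled, a compiler may replace the branch with branch-free or vectorized instructions, or drop the loop because count is never printed, which can shrink or remove the gap. In the run shown here, the sorted array was processed in roughly one fifth of the time taken for the unsorted array.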
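Because clock() has a coarse resolution and a single pass over the array takes only a fraction of a millisecond, it can help to repeat the measurement and average it. The following is a minimal sketch of such a measurement using std::chrono::steady_clock. The function names (count_below, average_pass_us, make_random_data), the fixed seed, and the sink parameter are assumptions made for this illustration and are not part of the original program.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Count the elements smaller than `threshold` (the same loop as in the article).
std::size_t count_below(const std::vector<int>& data, int threshold) {
    std::size_t count = 0;
    for (int value : data)
        if (value < threshold) ++count;
    return count;
}

// Average time in microseconds for one pass over `data`, measured over
// `repeats` passes. The counts are added to `sink` so that the result of
// every pass is actually used by the program.
double average_pass_us(const std::vector<int>& data, int threshold,
                       int repeats, std::size_t& sink) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r)
        sink += count_below(data, threshold);
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / repeats;
}

// Build `n` pseudo-random values in [0, n) from a fixed seed
// (hypothetical helper, used only to get repeatable input).
std::vector<int> make_random_data(int n, unsigned seed = 42) {
    assert(n > 0);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, n - 1);
    std::vector<int> data(n);
    for (int& value : data) value = dist(gen);
    return data;
}
```

To compare the two cases, call average_pass_us on the vector returned by make_random_data, sort the vector with std::sort, and call average_pass_us again with the same threshold. The numbers still depend on the compiler flags and the machine, so they should be read as a rough comparison rather than a precise benchmark.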
Why Is a Sorted Array Faster?
The difference comes from how the hardware executes the loop. Two effects are relevant when processing sorted data: branch prediction and cache locality.
1. Branch Prediction
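Modern CPUs use a technique called branch prediction to guess which way a branch (like an if statement) will go before the condition has actually been evaluated. The CPU then starts executing the guessed path in advance. When the array is sorted, the outcome of the condition follows a simple pattern, which leads to fewer mispredictions and faster execution.
The illustration below uses the condition arr[i] < 2000 on an array holding the values 0 to 4000, a scaled-down version of arr[i] < N/2 from the program above. In the sorted array, the condition is true for every element before 2000 and false for every element from 2000 onwards, so the outcome changes only once. The branch predictor learns this pattern quickly and guesses correctly almost every time. Note that the prediction is made by the processor at run time, not by the compiler, and the condition is still evaluated; a correct guess only means the CPU does not have to wait for the result of the comparison.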
// Sorted array
arr[] = {0, 1, 2, 3, 4, 5, 6, ..., 1999, 2000, 2001, ..., 4000}
        {T, T, T, T, T, T, T, ..., T,    F,    F,    ..., F   }

// Unsorted array
arr[] = {4000, 1, 2000, 3, 4, 5, 6, ..., 1999, 1876, 500, 1000}
        {F,    T, F,    T, T, T, T, ..., T,    T,    T,   T   }

T = if condition true
F = if condition false
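The effect of such true/false patterns can be made concrete with a toy predictor. The sketch below simulates a single 2-bit saturating counter, a textbook prediction scheme, and counts how often it guesses wrong. Real CPUs use far more elaborate predictors, so this is only an illustration under that simplifying assumption; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Count the mispredictions of one 2-bit saturating counter on the branch
// `value < threshold`. States 0 and 1 predict "false", states 2 and 3
// predict "true".
std::size_t count_mispredictions(const std::vector<int>& data, int threshold) {
    int state = 2;  // start in the "weakly true" state
    std::size_t misses = 0;
    for (int value : data) {
        const bool taken = value < threshold;
        const bool predicted = state >= 2;
        if (predicted != taken) ++misses;
        // Move the counter toward the actual outcome, saturating at 0 and 3.
        if (taken) state = std::min(state + 1, 3);
        else       state = std::max(state - 1, 0);
    }
    return misses;
}

// The values 0..n-1 in ascending order (hypothetical helper for the demo).
std::vector<int> make_sorted(int n) {
    assert(n > 0);
    std::vector<int> data(n);
    std::iota(data.begin(), data.end(), 0);
    return data;
}

// The same values shuffled with a fixed seed.
std::vector<int> make_shuffled(int n, unsigned seed = 1) {
    std::vector<int> data = make_sorted(n);
    std::mt19937 gen(seed);
    std::shuffle(data.begin(), data.end(), gen);
    return data;
}
```

For the sorted values 0 to 4000 and a threshold of 2000, count_mispredictions(make_sorted(4001), 2000) returns 2: the counter is wrong only at the two elements right after the switch from true to false. For the shuffled values, the outcome is close to a coin flip at every step, so this simple predictor is wrong for a large share of the elements.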
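In the unsorted array, the condition can be true or false at any point, so the predictor guesses wrong far more often. After each misprediction, the CPU has to discard the work it started on the wrong path and resume from the correct one, which costs extra cycles and slows down the execution.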
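One way to see that the branch is responsible is to write the loop so that it contains no conditional jump at all. The sketch below shows both forms side by side; the function names are chosen for this illustration. Whether the first form really compiles to a branch depends on the compiler and its flags, since optimizers often perform this rewrite on their own.

```cpp
#include <cstddef>
#include <vector>

// Branching version: the `if` typically becomes a conditional jump
// unless the compiler rewrites it.
std::size_t count_below_branchy(const std::vector<int>& data, int threshold) {
    std::size_t count = 0;
    for (int value : data)
        if (value < threshold) ++count;
    return count;
}

// Branch-free version: the comparison yields 0 or 1, which is added directly.
std::size_t count_below_branchless(const std::vector<int>& data, int threshold) {
    std::size_t count = 0;
    for (int value : data)
        count += static_cast<std::size_t>(value < threshold);
    return count;
}
```

Both functions return the same count for any input. Because the second form has no outcome to predict inside the loop body, its running time should depend much less on whether the data is sorted.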
2. Cache Locality
Sorting does not change where the elements of an array are stored. An array always occupies contiguous memory, so the loop in the program above reads the sorted and the unsorted array in exactly the same sequential order, with the same cache behaviour. Cache locality therefore does not explain the timing difference in that program.
Cache locality does matter when the values of the array are used to reach other data, for example as indices into a second array (called table here purely for illustration). With sorted values, consecutive accesses land on nearby memory locations, so most of them are served from the cache. With unsorted values, the accesses jump between distant memory locations. This leads to more cache misses and slower access times, because the CPU has to fetch data from main memory more often.
// Sorted array
arr[] = {0, 1, 2, 3, ..., 1999, 2000, 2001, ..., 4000}
// Entries read when accessing table[arr[i]]: sequential
table[0], table[1], table[2], table[3], ..., table[1999], table[2000], table[2001], ..., table[4000]

// Unsorted array
arr[] = {4000, 1, 2000, 3, ..., 1999, 1876, 500, 1000}
// Entries read when accessing table[arr[i]]: scattered
table[4000], table[1], table[2000], table[3], ..., table[1999], table[1876], table[500], table[1000]
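Cache effects are easiest to observe when the array values are used as indices into another, larger array. The sketch below is a minimal example of that access pattern; the names sum_lookup, table, and indices are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum table[index] for every entry of `indices`. The order of `indices`
// decides whether the reads walk through `table` or jump around in it.
std::uint64_t sum_lookup(const std::vector<std::uint32_t>& table,
                         const std::vector<std::size_t>& indices) {
    std::uint64_t total = 0;
    for (std::size_t index : indices) {
        assert(index < table.size());  // every index must refer to a valid entry
        total += table[index];
    }
    return total;
}
```

The result is the same for any ordering of the indices. When table is much larger than the CPU caches, passing the indices in sorted order would be expected to run faster than passing them shuffled, since neighbouring entries share cache lines.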