Abstract: Sorting is becoming increasingly important in modern computing, on systems ranging from small-scale Internet of Things (IoT) devices to supercomputers. To improve sorting performance, different systems adopt different algorithms, including Intro sort, Merge sort, Heap sort, and Insertion sort. However, the performance of a sorting algorithm depends on many factors, and our analysis shows that the optimal algorithm varies: no single algorithm consistently outperforms the others. In this paper, we first analyze the data-internal factors (data size, distribution, data type) and external factors (number of threads, hardware) that affect sorting performance. We use widely adopted sorting algorithms such as STL sort and Merge sort, as well as state-of-the-art algorithms such as Ips4o sort and Aips2o sort. In addition to sequential sorting algorithms, we implement Parallel Intro sort and use the parallel versions of the state-of-the-art algorithms with varying numbers of threads. Based on this analysis, we present an adaptive sorting algorithm selection model for heterogeneous workloads and systems, called AS2 (Adaptive Sorting Algorithm Selection). Its goal is to determine the optimal algorithm among existing sorting algorithms for heterogeneous workloads and systems. AS2 uses various ML models to build a performance model for each sorting algorithm from the data-internal and external factors of various datasets. AS2 then chooses the optimal sorting algorithm based on the performance predicted by these models. We evaluate AS2 using a representative dataset that covers a range of data-internal and external factors. The results show that AS2 accurately predicts the performance of the sorting algorithms, with minimum and maximum R-squared values of 0.83 and 0.99, respectively. 
In addition, by choosing the algorithm with the shortest predicted sorting time, AS2 selects the optimal algorithm in our evaluation scenario with up to 99.68% accuracy, improving performance by up to 1.83× compared to the state-of-the-art algorithm. We also evaluate AS2 on a real-world dataset, where it selects the optimal algorithm with 87.50% accuracy.
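
The selection step described above (one performance model per algorithm, then pick the algorithm with the shortest predicted time) can be illustrated with a minimal sketch. This is not the AS2 implementation: the abstract does not specify the ML models or the feature encoding, so the sketch assumes a single feature, n·log2(n), and an ordinary least-squares line per algorithm. The algorithm names `algo_a` and `algo_b`, the helper functions, and the timing values are all hypothetical and synthetic, chosen only to show a crossover between a low-overhead algorithm and a higher-throughput one.

```python
import math


def feature(n):
    """Single assumed feature: n * log2(n). AS2 uses more factors than this."""
    return n * math.log2(n)


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a * x + b."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def train(measurements):
    """Build one model per algorithm from {name: [(n, seconds), ...]}."""
    models = {}
    for name, rows in measurements.items():
        xs = [feature(n) for n, _ in rows]
        ys = [seconds for _, seconds in rows]
        models[name] = fit_line(xs, ys)
    return models


def predict(model, n):
    """Predicted sorting time in seconds for input size n."""
    a, b = model
    return a * feature(n) + b


def select(models, n):
    """Return the algorithm name with the shortest predicted time."""
    return min(models, key=lambda name: predict(models[name], n))


# Synthetic, made-up timings (NOT measurements from the paper):
# algo_a has no fixed overhead but a larger per-element cost,
# algo_b pays a fixed 1 ms setup cost but scales better.
sizes = [1_000, 10_000, 100_000, 1_000_000]
measurements = {
    "algo_a": [(n, 2e-8 * feature(n)) for n in sizes],
    "algo_b": [(n, 1e-8 * feature(n) + 1e-3) for n in sizes],
}

models = train(measurements)
small_choice = select(models, 100)
large_choice = select(models, 1_000_000)
print(small_choice, large_choice)
```

With these synthetic numbers, the fixed setup cost dominates for small inputs and the per-element cost dominates for large ones, so the selected algorithm changes with input size. A fuller model would add the other factors the abstract names (distribution, data type, thread count, hardware) as features.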