80/20 rule; Cloud computing; Cluster computing; Google cluster; Load balancing; Rescheduling; Google+; Jobs scheduling; Task constraints; Task executions; Task priorities; Software; Computer Networks and Communications
Abstract :
Cloud architecture and its operations interest both general consumers and researchers. Google, as a technology giant, offers cloud services globally. This paper analyzes the Google cluster usage trace, focusing on three key aspects: task execution times, rescheduling frequency, and the relationship between task priority and rescheduling. First, we examine how memory and processor performance affect task execution times across different machines. Next, we investigate how the number of task constraints influences rescheduling frequency and overall environmental efficiency. Finally, we analyze how task priority affects rescheduling and explore its correlation with task constraints. The results reveal that doubling the memory size can accelerate tasks by a factor of nine and that 90% of rescheduling is associated with tasks having fewer than seven constraints. We aim to enhance data center performance by identifying bottlenecks in the Google Cluster Dataset and providing recommendations for all cloud service providers. Our key findings indicate that memory plays a more significant role than the processor, and that tasks with more constraints have a less pronounced impact on rescheduling than anticipated.
Precision for document type :
Review article
Disciplines :
Computer science
Author, co-author :
Shahmirzadi, Danyal; Graduate School of Engineering Science and Technology, National Yunlin University of Science and Technology, Douliou, Taiwan
Khaledian, Navid; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CritiX
Rahmani, Amir Masoud; Future Technology Research Center, National Yunlin University of Science and Technology, Douliou, Taiwan
External co-authors :
yes
Language :
English
Title :
Analyzing the impact of various parameters on job scheduling in the Google cluster dataset