A Spark-based Configuration Optimization Technology
Spark is becoming increasingly important in power applications where massive data must be processed rapidly, but its configuration parameter space is large and the relationships between parameters are complex. Manually tuning parameters from experience to obtain the best performance is extremely difficult. This paper therefore proposes a configuration optimization method for Spark. The configuration parameters that have a significant impact on Spark performance are selected, and the dataset is generated through MCMC sampling and a generative adversarial network (GAN). The performance model is constructed through hierarchical modeling. The optimal configuration for an application is then searched efficiently in the parameter space by the particle swarm optimization (PSO) algorithm. The experimental results show that the proposed method improves Spark performance by an average of 25% compared with experience-based manual tuning.
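
To illustrate the final search step, the following is a minimal sketch of a textbook PSO minimizing a predicted runtime over a bounded configuration space. It is not the paper's implementation: the three parameters, their bounds, the PSO coefficients, and the quadratic `predicted_runtime` function (a stand-in for the learned performance model) are all assumptions chosen only to make the example runnable.

```python
import random

# Hypothetical search bounds for three illustrative parameters; the actual
# parameter set selected in the paper is not given in the abstract.
BOUNDS = {
    "executor_memory_gb": (1.0, 16.0),
    "executor_cores": (1.0, 8.0),
    "shuffle_partitions": (50.0, 1000.0),
}
NAMES = list(BOUNDS)


def predicted_runtime(x):
    """Stand-in for the learned performance model (assumption): a smooth bowl
    with its minimum of 60 s at 8 GB, 4 cores, 400 partitions."""
    mem, cores, parts = x
    return 60.0 + (mem - 8.0) ** 2 + 4.0 * (cores - 4.0) ** 2 + ((parts - 400.0) / 50.0) ** 2


def clip(value, low, high):
    return max(low, min(high, value))


def pso(objective, bounds, n_particles=30, n_iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO with inertia weight w and acceleration terms c1, c2."""
    rng = random.Random(seed)
    lows = [b[0] for b in bounds]
    highs = [b[1] for b in bounds]
    dim = len(bounds)
    # Initialize particles uniformly inside the bounds, with zero velocity.
    pos = [[rng.uniform(lows[d], highs[d]) for d in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull each particle toward its own best and the swarm's best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clip(pos[i][d] + vel[i][d], lows[d], highs[d])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best, best_val = pso(predicted_runtime, [BOUNDS[n] for n in NAMES])
best_config = dict(zip(NAMES, best))
```

In a real setting the objective would be the trained performance model's prediction for a candidate configuration, and integer-valued parameters would need to be rounded before evaluation.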