Morris Factor Screening Method Based on Robust Statistics
The Morris method is usually applied to identify non-influential inputs of a computationally costly mathematical model or of a model with a large number of inputs. As a first step, it can be used to simplify a model by identifying the factors with low influence, which can then be fixed. Because it is highly efficient and model-free, it is widely used in factor screening studies where the number of model inputs is large and the computer model is computationally expensive. It is also known as the Elementary Effect (EE) method, which measures the importance of an input by estimating two statistics, the mean and the standard deviation (or variance), of the distribution of the EE. However, in industrial production, the presence of outliers due to measurement errors, instrument failures, or noise makes data contamination unavoidable. When there are outliers in the output, the mean and standard deviation of the EE deviate from the true central tendency and variation, so the conventional Morris method cannot accurately identify the effect of the input on the output. Thus, in this paper, we propose to replace the mean and standard deviation with robust estimators of the location and scale parameters of the EE. In this way, the improved Morris method can accurately estimate the linear and nonlinear effects of the input when faced with data contamination. The proposed Morris method is also widely applicable regardless of whether the output follows a normal distribution. The Morris method is model-free and can identify the linear and nonlinear effects of inputs in complex situations. In this paper, we adopt the median and the Hodges-Lehmann (HL) estimator to estimate the factors' linear effects, and employ the Median Absolute Deviation (MAD) and the Shamos estimator to estimate the factors' nonlinear effects. The statistical properties of the robust estimators (median, HL, MAD, and Shamos) are investigated. The breakdown point is used to measure the robustness of the robust estimators. Median and MAD have a higher breakdown point than
HL and Shamos, whereas the relative efficiency of HL and Shamos is higher. The procedure for estimating the factors' effects is then introduced. The proposed method does not increase the computational cost compared with the traditional method. Next, two test functions (the Ishigami function and a 20-input Sobol' G function) and a real case (the HyMOD model) are introduced to verify the robustness and convergence of the proposed Morris method. To verify the validity of the proposed method in the presence of outliers in the output, a certain percentage of the initial outputs are randomly replaced with outliers by means of a contamination function. For the Ishigami function, we investigate separately how the number of outliers and the size of the outliers in the output affect the experimental results. The maximum proportion of outliers in the output is 0.3, which exceeds the breakdown points of the robust estimators HL and Shamos. For the Sobol' G function, the convergence speed of the proposed method is compared with that of the traditional Morris method. Several interesting conclusions are drawn. First, when there are no outliers in the output, or the percentage of outliers is not greater than the breakdown point of the robust estimator, the proposed method can effectively perform factor screening. Second, the number of outliers in the output affects the result: when the percentage of outliers in the output exceeds the breakdown point of the robust estimator, the proposed method fails, whereas the size of the outliers in the output has no effect on the result. Third, when there are no outliers in the output, the proposed method obtains accurate results at a smaller cost; that is, it is more efficient than the traditional Morris method. Moreover, when there are outliers in the output, the number of repeated tests needs to be increased appropriately to obtain accurate screening results. In summary, the improved method has wide applicability
and high robustness: it can perform factor screening regardless of the presence of outliers in the model response, and it is more efficient than the traditional Morris method.
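The four robust estimators named above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the elementary effects of one input are already available as a one-dimensional array. The function names, the choice of the Hodges-Lehmann variant that uses only pairs with i < j, and the normal-consistency constants 1.4826 (MAD) and 1.048358 (Shamos) are assumptions made for illustration.

```python
import numpy as np

# Consistency constants under a normal distribution (assumed here for
# illustration; the paper may scale the estimators differently).
MAD_CONST = 1.4826
SHAMOS_CONST = 1.048358


def _pairs(x):
    """Return the two members of every pair (x_i, x_j) with i < j."""
    x = np.asarray(x, dtype=float)
    if x.size < 2:
        raise ValueError("at least two elementary effects are required")
    i, j = np.triu_indices(x.size, k=1)
    return x[i], x[j]


def hodges_lehmann(x):
    """Location: median of the pairwise averages (x_i + x_j) / 2, i < j."""
    a, b = _pairs(x)
    return float(np.median((a + b) / 2.0))


def mad(x):
    """Scale: scaled median of absolute deviations from the median."""
    x = np.asarray(x, dtype=float)
    return float(MAD_CONST * np.median(np.abs(x - np.median(x))))


def shamos(x):
    """Scale: scaled median of the pairwise absolute differences, i < j."""
    a, b = _pairs(x)
    return float(SHAMOS_CONST * np.median(np.abs(a - b)))


def robust_ee_summary(ee):
    """Robust location and scale estimates for one input's elementary effects."""
    ee = np.asarray(ee, dtype=float)
    return {
        "median": float(np.median(ee)),  # location, high breakdown point
        "hl": hodges_lehmann(ee),        # location, higher efficiency
        "mad": mad(ee),                  # scale, high breakdown point
        "shamos": shamos(ee),            # scale, higher efficiency
    }


if __name__ == "__main__":
    # Hypothetical elementary effects with one gross outlier.
    ee = [1.0, 2.0, 3.0, 4.0, 100.0]
    print("mean / std :", np.mean(ee), np.std(ee, ddof=1))
    print("robust     :", robust_ee_summary(ee))
```

In this hypothetical sample, the single outlier pulls the mean and standard deviation far from the bulk of the data, while the four robust estimates stay close to the uncontaminated values. The pairwise estimators need memory proportional to the square of the number of elementary effects, which is usually small in a Morris design.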