Morris Factor Screening Method Based on Robust Statistics

Xie En (谢恩)¹, Ma Yizhong (马义中)¹, Liu Lijun (刘丽君)¹, Zhang Fengxia (张凤霞)¹

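Author information

1. School of Economics and Management, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China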

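Abstract (translated from the Chinese)

Because it is efficient and model-free, the Morris method is widely used in factor screening studies where the number of input factors is large or the computational model is expensive to run. However, the traditional Morris method assumes that the data contain no outliers, so it struggles with factor screening when outliers are present. This paper therefore improves the traditional Morris method by using robust statistics to measure factor effects, which effectively addresses factor screening when the model response contains outliers. We first introduce robust statistics and briefly analyze their properties. We then replace the mean and standard deviation with robust location and scale statistics, so that the screening results are not affected by outliers. Finally, two commonly used test functions and one real case are used to verify the effectiveness and applicability of the proposed method. The experimental results show that the proposed method is widely applicable: it screens factors effectively whether or not the model response contains outliers, and it is more efficient than the traditional Morris method.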
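The idea of swapping the mean and standard deviation of the elementary effects for robust location and scale statistics can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simplified radial one-at-a-time design on the unit cube instead of Morris trajectories, a hypothetical linear toy model, and the median and MAD as the example robust statistics. All function names, the step `delta`, and the MAD scaling constant 1.4826 are assumptions made for illustration.

```python
import numpy as np


def run_design(model, k, r, delta, rng):
    """Evaluate `model` on r radial one-at-a-time designs in [0, 1]^k.

    Returns outputs of shape (r, k + 1): column 0 is the base point,
    column i + 1 is the point with factor i increased by delta.
    """
    y = np.empty((r, k + 1))
    for t in range(r):
        # Base point sampled so that x + delta stays inside the unit cube.
        x = rng.uniform(0.0, 1.0 - delta, size=k)
        y[t, 0] = model(x)
        for i in range(k):
            x_step = x.copy()
            x_step[i] += delta
            y[t, i + 1] = model(x_step)
    return y


def effects_from_outputs(y, delta):
    """Elementary effects, shape (r, k): (y_step - y_base) / delta."""
    return (y[:, 1:] - y[:, [0]]) / delta


def classic_summary(ee):
    """Mean and sample standard deviation of the elementary effects."""
    return ee.mean(axis=0), ee.std(axis=0, ddof=1)


def robust_summary(ee):
    """Median and scaled MAD of the elementary effects."""
    med = np.median(ee, axis=0)
    mad = 1.4826 * np.median(np.abs(ee - med), axis=0)
    return med, mad


def toy_model(x):
    """Hypothetical linear model: factor 1 has no influence."""
    return 3.0 * x[0] + 0.0 * x[1] + 1.0 * x[2]


delta = 0.5
rng = np.random.default_rng(0)
y = run_design(toy_model, k=3, r=20, delta=delta, rng=rng)

# Corrupt a single model output (the step in factor 1 of replicate 0).
y_bad = y.copy()
y_bad[0, 2] = 1e3

mean_bad, std_bad = classic_summary(effects_from_outputs(y_bad, delta))
med_bad, mad_bad = robust_summary(effects_from_outputs(y_bad, delta))
print("mean / std  :", mean_bad, std_bad)
print("median / MAD:", med_bad, mad_bad)
```

In this toy setting a single corrupted output makes the inert factor look influential under the mean and standard deviation, while the median and MAD still report no effect. The design costs r(k + 1) model runs either way, so the robust summary adds no model evaluations.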

Abstract

The Morris method is typically used to identify non-influential inputs of a mathematical model that is computationally costly or has a large number of inputs. As a first step, it can simplify a model by identifying low-influence factors that can be fixed. Because it is efficient and model-free, it is widely used in factor screening studies where the number of inputs is large or the computer model is computationally expensive. It is also known as the Elementary Effect (EE) method, which measures the importance of an input by estimating two statistics of the distribution of its EEs: the mean and the standard deviation (or variance). In industrial production, however, outliers caused by measurement errors, instrument failures, or noise make data contamination unavoidable. When the output contains outliers, the mean and standard deviation of the EEs deviate from the true central tendency and variation, and the conventional Morris method can no longer accurately identify the effect of an input on the output. In this paper, we therefore propose replacing the mean and standard deviation with robust estimators of the location and scale parameters of the EE distribution. In this way, the improved Morris method can accurately estimate the linear and nonlinear effects of the inputs in the presence of data contamination. The proposed method is also applicable regardless of whether the output is normally distributed, and, like the original Morris method, it is model-free and can identify the linear and nonlinear effects of inputs in complex situations. We adopt the median and the Hodges-Lehmann (HL) estimator to estimate factors' linear effects, and the Median Absolute Deviation (MAD) and the Shamos estimator to estimate factors' nonlinear effects. The statistical properties of these robust estimators (median, HL, MAD, and Shamos) are investigated, with the breakdown point used to measure robustness. The median and MAD have a higher breakdown point than HL and Shamos, whereas HL and Shamos have higher relative efficiency. The procedure for estimating factors' effects is then introduced; the proposed method does not increase the computational cost relative to the traditional method. Two test functions (the Ishigami function and a 20-input Sobol' G function) and a real case (the HyMOD model) are used to verify the robustness and convergence of the proposed Morris method. To verify its validity when the output contains outliers, a certain percentage of the initial outputs is randomly replaced with outliers by means of a contamination function. For the Ishigami function, we investigate separately how the number and the size of the outliers in the output affect the experimental results. The maximum proportion of outliers in the output is 0.3, which is larger than the breakdown point of the robust estimators HL and Shamos. For the Sobol' G function, the convergence speed of the proposed method is compared with that of the traditional Morris method. Several conclusions are drawn. First, when the output contains no outliers, or the percentage of outliers does not exceed the breakdown point of the robust estimator, the proposed method screens factors effectively. Second, the number of outliers in the output affects the result, whereas their size does not: when the percentage of outliers exceeds the breakdown point of the robust estimator, the proposed method fails. Third, when the output contains no outliers, the proposed method obtains accurate results at a lower cost and is more efficient than the traditional Morris method. Moreover, when the output contains outliers, the number of repeated runs needs to be increased appropriately to obtain accurate screening results. In summary, the improved method has wide applicability and high robustness: it can perform factor screening regardless of whether the model response contains outliers, and it is more efficient than the traditional Morris method.
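The four robust estimators named in the abstract can be written down in a few lines. The sketch below uses their textbook definitions and is not taken from the paper: HL is the median of pairwise (Walsh) averages, Shamos is the median of pairwise absolute differences, and the constants 1.482602 and 1.048358 are the usual consistency factors under a normal model. The `contaminate` helper is a hypothetical stand-in for the paper's contamination function, whose exact form the abstract does not give.

```python
import numpy as np


def hodges_lehmann(x):
    """Location: median of the Walsh averages (x_i + x_j) / 2 over i <= j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x))
    return np.median((x[i] + x[j]) / 2.0)


def shamos(x):
    """Scale: scaled median of |x_i - x_j| over i < j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)
    return 1.048358 * np.median(np.abs(x[i] - x[j]))


def mad(x):
    """Scale: scaled median absolute deviation from the median."""
    x = np.asarray(x, dtype=float)
    return 1.482602 * np.median(np.abs(x - np.median(x)))


def contaminate(y, fraction, magnitude, rng):
    """Hypothetical contamination: overwrite a random fraction of y with `magnitude`."""
    y = np.array(y, dtype=float)
    n_out = int(round(fraction * len(y)))
    idx = rng.choice(len(y), size=n_out, replace=False)
    y[idx] = magnitude
    return y


rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=1.0, size=100)
dirty = contaminate(clean, fraction=0.10, magnitude=1e6, rng=rng)

print("mean   :", clean.mean(), "->", dirty.mean())
print("median :", np.median(clean), "->", np.median(dirty))
print("HL     :", hodges_lehmann(clean), "->", hodges_lehmann(dirty))
print("std    :", clean.std(ddof=1), "->", dirty.std(ddof=1))
print("MAD    :", mad(clean), "->", mad(dirty))
print("Shamos :", shamos(clean), "->", shamos(dirty))
```

With 10% contamination the mean and standard deviation are dominated by the outliers, while all four robust estimators stay on the scale of the clean data, and increasing `magnitude` does not change that. The asymptotic breakdown point of HL and Shamos is commonly cited as about 29%, which is consistent with the abstract's remark that a 0.3 outlier proportion exceeds it. Both pairwise estimators use O(n²) memory as written, which is acceptable for the small number of elementary effects per factor in a screening study.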

Keywords

Morris method / elementary effects / factor screening / robust estimator / outlier

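Funding

National Natural Science Foundation of China (71931006)

Jiangsu Funding Program for Excellent Postdoctoral Talent (2022ZB259)

Jiangsu Funding Program for Excellent Postdoctoral Talent (2022ZB260)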

Publication year

2024

Operations Research and Management Science (运筹与管理)

Operations Research Society of China

Indexed in: CSTPCD, CHSSCD, Peking University Core Journals (北大核心)

Impact factor: 0.688

ISSN: 1007-3221

Number of references: 2
