α-Optimal Subsampling for Joint Mean and Variance Models Under Heteroscedasticity Big Data
Zhengyu Xiong (熊正榆) 1, Liucang Wu (吴刘仓) 1, Lanjun Yang (杨兰军) 1
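Author information
- 1. Faculty of Science, Kunming University of Science and Technology, Kunming 650500, China; Center for Applied Statistics, Kunming University of Science and Technology, Kunming 650500, China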
Abstract
With the development of information technology, fields such as economics, finance, and industry generate extremely large volumes of data, and these data are often heteroscedastic. Traditional statistical models and methods struggle to handle the modeling of such big data. Subsampling is an important method for dealing with big data. In this paper, we study subsampling for joint mean and variance models in a heteroscedastic big data setting. The main contributions are as follows: joint mean and variance models are built for heteroscedastic big data, and, under certain conditions, the consistency and asymptotic normality of the subsample parameter estimator are established based on the A-optimality and L-optimality criteria; an α-optimal subsampling algorithm for joint mean and variance models under heteroscedastic big data is proposed for the first time. Numerical simulations and a real-data analysis show that the proposed subsampling algorithm improves estimation accuracy and reduces computational cost.
Keywords
Heteroscedasticity big data / joint mean and variance models / α-optimal subsampling
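Funding
National Natural Science Foundation of China (12261051)
Key Project of the Yunnan Province Basic Research Program (202401AS070061)
Philosophy and Social Sciences Research Innovation Team of Kunming University of Science and Technology (CXTD2023005)
Publication year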
2024