Corrections to the two-sided probability and hypothesis test statistics on binomial distributions

Wang Jiankang¹

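Author information

1. Institute of Crop Sciences, Chinese Academy of Agricultural Sciences / State Key Laboratory of Crop Gene Resources and Breeding, Beijing 100081, China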

Abstract

Binomial distributions are discrete probability distributions that occur widely in nature and human society. In mathematical statistics, a random variable following the binomial distribution $B(n,p)$ is the sum of $n$ independent random variables, each following the Bernoulli distribution $B(1,p)$. Estimation and testing of the parameter $p$ of $B(n,p)$ are therefore equivalent to estimation and testing of the parameter $p$ of $B(1,p)$. This article corrects three problems found in common textbooks, concerning the calculation of the two-tailed probability and the construction of hypothesis test statistics for binomial distributions. (1) Let $p_k$ ($k=0,1,\ldots,n$) be the probabilities of $B(n,p)$, and let $p_{(k)}$ denote these probabilities sorted in ascending order. Given the observed value $k$, the exact two-tailed probability equals $\sum_{i=0}^{k} p_{(i)}$. (2) For testing the difference between the parameter $p$ of $B(n,p)$ and a given value $p_0$, the test statistic is corrected to $u=(\hat{p}-p_0)/\sqrt{\hat{p}\hat{q}/n}$, which approximately follows the normal distribution $N(p-p_0,1)$ in large samples. (3) For testing the difference between the parameters $p_1$ and $p_2$ of the binomial distributions $B(n_1,p_1)$ and $B(n_2,p_2)$, the test statistic is corrected to $u=(\hat{p}_1-\hat{p}_2)/\sqrt{\hat{p}_1\hat{q}_1/n_1+\hat{p}_2\hat{q}_2/n_2}$, which approximately follows the normal distribution $N(p_1-p_2,1)$ in large samples. The corrected two-tailed probability is exact and avoids the problem of a probability exceeding one. Whether or not the null hypothesis holds, the large-sample normal approximations of the two corrected statistics have variance one, which makes them better suited to studying the power of the test under the alternative hypothesis. The article also introduces exact tests on binomial parameters in small samples and compares exact with approximate tests. It discusses the theoretical basis of the corrected statistics, gives criteria for deciding what counts as a small probability and as a large sample, and lists the similarities and differences between testing the parameter of a Bernoulli distribution and testing the mean of a normal distribution. The author hopes that readers will gain an overall picture of hypothesis testing and statistical inference as core topics of statistics.
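The three quantities named in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the estimates are taken as the sample proportions (p-hat = k/n, q-hat = 1 - p-hat), and the upper limit of the sum in item (1) is read as the rank of the observed outcome's probability in the ascending list, so that the two-tailed probability is the sum of all probabilities no larger than that of the observed k.

```python
from math import comb, sqrt


def binomial_pmf(n, p):
    """Return the probabilities p_k of B(n, p) for k = 0, 1, ..., n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def two_sided_exact_probability(k_obs, n, p, rel_tol=1e-7):
    """Sum every probability of B(n, p) that does not exceed that of k_obs.

    Assumption: this is our reading of item (1) in the abstract.
    rel_tol is a hypothetical tolerance so that outcomes whose
    probabilities tie with the observed one are not lost to rounding.
    """
    pmf = binomial_pmf(n, p)
    threshold = pmf[k_obs] * (1 + rel_tol)
    # Sort ascending, as in the abstract, then add from the smallest up.
    return sum(sorted(q for q in pmf if q <= threshold))


def u_one_sample(k, n, p0):
    """u = (p_hat - p0) / sqrt(p_hat * q_hat / n), item (2).

    Undefined when p_hat is 0 or 1 (zero standard error).
    """
    p_hat = k / n
    q_hat = 1 - p_hat
    return (p_hat - p0) / sqrt(p_hat * q_hat / n)


def u_two_sample(k1, n1, k2, n2):
    """u = (p1_hat - p2_hat) / sqrt(p1_hat*q1_hat/n1 + p2_hat*q2_hat/n2), item (3).

    Undefined when both sample proportions are 0 or 1.
    """
    p1_hat, p2_hat = k1 / n1, k2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat) / se


if __name__ == "__main__":
    print(two_sided_exact_probability(2, 10, 0.5))
    print(u_one_sample(60, 100, 0.5))
    print(u_two_sample(60, 100, 50, 100))
```

For n = 10, p = 0.5 and observed k = 2, the outcomes with probability no larger than p_2 are k = 0, 1, 2, 8, 9, 10, so the sum is 112/1024 = 0.109375. Because only probabilities from a single distribution are added, the result cannot exceed one apart from floating-point rounding.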

Key words

binomial distribution/normal distribution/hypothesis test/test statistic/correction/power of test

Funding

National Natural Science Foundation of China (31861143003)

Innovation Project of the Chinese Academy of Agricultural Sciences ()

Publication year

2024

Acta Agronomica Sinica

Crop Science Society of China; Institute of Crop Sciences, Chinese Academy of Agricultural Sciences

CSTPCD / CSCD / Peking University Core Journals

Impact factor: 1.803

ISSN: 0496-3490

Number of references: 15