High-dimensional linear discriminant analysis using nonparametric methods
Full-text links: NSTL | Elsevier
The classification of high-dimensional data is an important problem that has been studied for a long time. Many studies have proposed linear classifiers based on Fisher's linear discriminant rule (LDA), which requires estimating the unknown covariance matrix and the mean vector of each group. In particular, when the data dimension p exceeds the number of observations n (p > n), the sample covariance matrix is a poor estimator of the covariance matrix because of its well-known rank deficiency. To address this problem, many studies have modified the LDA classifier through diagonalization or regularization of the covariance matrix. In this paper, we categorize existing methods into three cases and discuss the shortcomings of each. To compensate for these shortcomings, our baseline idea is to estimate the high-dimensional mean vector and covariance matrix jointly, whereas existing methods focus on a shrinkage estimator of either the mean vector or the covariance matrix alone. We provide a theoretical result showing that the proposed method succeeds under both sparse and dense mean-vector structures, whereas some existing methods work well only in specific situations. We also present numerical studies showing that our method outperforms existing methods across various simulations and real data examples, including electroencephalography (EEG), gene expression microarray, and Spectro datasets. (C) 2021 Elsevier Inc. All rights reserved.
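For context on the setting the abstract describes: Fisher's rule assigns an observation x to group 1 when (x − (μ₁ + μ₂)/2)ᵀ Σ⁻¹ (μ₁ − μ₂) > 0, which fails when the pooled sample covariance is singular in the p > n regime. Below is a minimal illustrative sketch of one of the standard "regularization" fixes the abstract mentions (a ridge-type shrinkage of the pooled covariance), not the authors' proposed nonparametric method; the function names, the parameter gamma, and the simulated data are assumptions for illustration only.

```python
import numpy as np

def fit_regularized_lda(X1, X2, gamma=0.1):
    """Fisher's rule with a ridge-regularized pooled covariance.
    Illustrative sketch for the p > n setting; gamma is a hypothetical
    shrinkage parameter, not from the paper."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2, p = X1.shape[0], X2.shape[0], X1.shape[1]
    # Pooled sample covariance; rank-deficient when p > n1 + n2 - 2.
    S = ((X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)) / (n1 + n2 - 2)
    # Ridge shrinkage toward a scaled identity restores invertibility.
    S_reg = S + gamma * (np.trace(S) / p) * np.eye(p)
    w = np.linalg.solve(S_reg, mu1 - mu2)   # discriminant direction
    b = -0.5 * w @ (mu1 + mu2)              # threshold at the midpoint
    return w, b

def predict(X, w, b):
    # Assign group 1 when the discriminant score is positive.
    return np.where(X @ w + b > 0, 1, 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n = 50, 20                           # p > n scenario
    X1 = rng.normal(0.5, 1.0, size=(n, p))  # group 1 with shifted mean
    X2 = rng.normal(0.0, 1.0, size=(n, p))  # group 2
    w, b = fit_regularized_lda(X1, X2)
    print(predict(np.vstack([X1, X2]), w, b))
```

The diagonalization alternative the abstract refers to would instead replace S with diag(S) before solving, which also sidesteps the rank deficiency but, like the ridge fix above, adjusts only the covariance estimate rather than treating the mean vector and covariance matrix together.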
Keywords: Empirical Bayes; Kiefer-Wolfowitz estimator; Linear classification rule; Nonparametric maximum likelihood estimator; Singular value decomposition
Indexed keywords: EMPIRICAL BAYES ESTIMATION; MAXIMUM-LIKELIHOOD ESTIMATOR; CLASSIFICATION; PRECISION MATRIX