Data synthesis method preserving correlation of features


NSTL
Elsevier

Abstract: Abundant data are essential for improving the performance of machine learning algorithms. Thus, if only limited data are available, data synthesis can be used to enlarge datasets. Data synthesis methods based on the covariance matrix are useful because of their fast data synthesis capabilities. However, artificial datasets generated via classical techniques show statistical discrepancies when compared to the original datasets. To address this problem, we developed a new data synthesis method that preserves the correlation (between features) observed in the original dataset. This preservation was realized by considering not only the correlation but also the random noise used in the data synthesis process. This method was applied to various biosignals (i.e., electrocorticography, electromyogram, and electrocardiogram), for which data points are insufficient. Several classifiers (i.e., convolutional neural network, support vector machine, and k-nearest neighbor) were used to verify that the classification accuracy can be improved by the proposed data synthesis method. (c) 2021 Elsevier Ltd. All rights reserved.
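The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' published algorithm: the abstract does not give the procedure, so the noise-whitening step and the function names `synthesize_classical` and `synthesize_whitened` below are assumptions chosen for illustration. The classical approach draws Gaussian noise and colors it with a Cholesky factor of the sample covariance. Because a finite noise sample is never exactly uncorrelated, the synthetic data drift from the original correlation. One plausible remedy is to whiten the noise sample before coloring it.

```python
import numpy as np


def synthesize_classical(X, n_samples, rng):
    """Classical covariance-based synthesis: x = mu + L z, with z ~ N(0, I)."""
    mu = X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(X, rowvar=False))  # cov(X) = L L^T
    Z = rng.standard_normal((n_samples, X.shape[1]))
    return mu + Z @ L.T


def synthesize_whitened(X, n_samples, rng):
    """Illustrative variant (an assumption, not the paper's method):
    whiten the noise sample so its sample covariance is exactly I,
    then color it. Requires n_samples > number of features."""
    mu = X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(X, rowvar=False))
    Z = rng.standard_normal((n_samples, X.shape[1]))
    Z = Z - Z.mean(axis=0)                       # zero sample mean
    Lz = np.linalg.cholesky(np.cov(Z, rowvar=False))
    Z = np.linalg.solve(Lz, Z.T).T               # sample covariance becomes I
    return mu + Z @ L.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Small correlated toy dataset standing in for a scarce biosignal feature set.
    A = np.array([[1.0, 0.0, 0.0], [0.8, 0.6, 0.0], [0.3, 0.2, 0.9]])
    X = rng.standard_normal((30, 3)) @ A.T

    target = np.corrcoef(X, rowvar=False)
    for name, fn in [("classical", synthesize_classical),
                     ("whitened", synthesize_whitened)]:
        Y = fn(X, 50, rng)
        err = np.abs(np.corrcoef(Y, rowvar=False) - target).max()
        print(f"{name}: max correlation error = {err:.2e}")
```

In this sketch the whitened variant reproduces the sample mean and correlation matrix of the original data up to floating-point error, whereas the classical variant matches them only in expectation.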

Keywords:

Data synthesis; Correlation; Artificial dataset; Random noise; FAULT-DIAGNOSIS; GENERATION; FRAMEWORK; NETWORKS

Authors:

Yang, Wonseok; Nam, Woochul

Affiliation:

Chung Ang Univ

Publication year:

2022

DOI:

10.1016/j.patcog.2021.108241

Pattern Recognition

EI; SCI

ISSN: 0031-3203

Year, volume (issue): 2022, 122

Citations: 3
References: 43