An Early Classification Algorithm for Small-Sample Transient Sources Based on Machine Learning

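We propose TXW, a machine-learning-based early classification algorithm for small-sample transient sources that can identify the class of a transient accurately and in real time even when few training samples are available. TXW is an improved small-sample metric learning method: it combines a temporal convolutional network, which extracts features from photometric data, with an extreme gradient boosting tree, which computes class probability scores, and adds a new weighting algorithm to address the problem of noise being mistaken for features when the signal source disappears prematurely. Experimental results show that, compared with five other algorithms, TXW improves classification accuracy on small-sample datasets by 4.33 percentage points on average, raises the average precision of the early-epoch precision-recall curves by 0.25, and raises the macro-average area under the receiver operating characteristic curve by 0.08. TXW also performs well in accuracy, robustness, and noise resistance. It enables early classification of small-sample transients and has practical value for identifying rare transient events such as electromagnetic counterparts of gravitational waves.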
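As an illustration of how a temporal convolutional network reads a light curve, the following minimal Python sketch stacks dilated causal convolutions with residual connections. It is not the authors' implementation: the kernel weights, the dilation schedule, the single input channel, and the names `causal_dilated_conv`, `residual_block`, and `tcn_features` are all assumptions made for this example. A real TCN learns its weights and uses many channels.

```python
from typing import List, Sequence


def causal_dilated_conv(x: Sequence[float], kernel: Sequence[float],
                        dilation: int) -> List[float]:
    """1-D causal convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # samples before the start act as zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """ReLU(conv(x)) + x: the skip connection passes the input through."""
    conv = causal_dilated_conv(x, kernel, dilation)
    return [max(0.0, c) + xi for c, xi in zip(conv, x)]


def tcn_features(x: Sequence[float],
                 dilations: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Stack residual blocks with growing dilation (hypothetical fixed kernel)."""
    h = list(x)
    for d in dilations:
        h = residual_block(h, kernel=[0.5, 0.5], dilation=d)
    return h


# Hypothetical single-band flux samples from a partially observed light curve.
flux = [0.0, 1.0, 3.0, 2.0, 1.0, 0.5]
features = tcn_features(flux)
```

Because every output at time t depends only on samples at or before t, features computed from an incomplete light curve do not change when later observations arrive, which is the property that makes this kind of network suitable for early classification.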
An Early Classification Algorithm for Small-Sample Transient Sources Based on Machine Learning
Objective: Transient sources play a crucial role in studying the origins of the universe and physical phenomena in extreme environments. One of the primary objectives of the SVOM mission is to detect target of opportunity (ToO) events, including electromagnetic counterparts of gravitational waves and other types of transients. Transients decay rapidly, and millions of transient events are detected by sensors every night. Hence, a rapid and accurate classification algorithm is essential for confirming their nature early on. Early classification aids not only subsequent follow-up observations but also the study of the physical properties and progenitor systems of transients. Currently, early photometric data of transients often consist of incomplete light curves, which poses a challenge for traditional classification algorithms that typically require complete data sets. Existing early classification algorithms rely heavily on large data sets and may therefore overlook transients with low occurrence rates or those undetected by current methods. Developing early classification algorithms tailored to small-sample transients is therefore necessary to improve detection efficiency.

Methods: We propose an early classification algorithm for small-sample transient sources based on machine learning: the TXW algorithm, which combines a temporal convolutional network (TCN) and eXtreme gradient boosting (XGBoost) with a weight module. The algorithm uses a small-sample metric learning method. First, input data are converted into feature vectors, after which the classifier calculates similarity scores for all classes. The transient object is assigned to the class with the highest score. The TCN module in the TXW algorithm extracts features from the photometric data of transients, while the XGBoost module calculates probability scores for each candidate class of transient objects. We propose a novel weighting algorithm in the weight module to reduce the noise in time-series photometric data from transient sources. This addresses cases in which signal sources disappear prematurely and noise is mistaken for features. The experimental data consist of four types of open-source multi-band simulated transient data provided by the Photometric LSST Astronomical Time-Series Classification Challenge (PLAsTiCC): tidal disruption events (TDE), kilonovae (KN), Type Ia supernovae (SNIa), and Type I super-luminous supernovae (SLSN-I). We use simulated photometric transient data from the g, r, and i bands in the PLAsTiCC dataset, as these bands align with the ground-based telescope observation bands used in the SVOM mission. After preprocessing steps such as time correction, de-reddening, light curve fitting, and data augmentation, a suitable dataset is established for the models. We evaluate the performance of the TXW algorithm by comparing it with other classifiers (LSTM, Transformer, Rapid, and TXW without the weight module) on the same testing set.

Results and Discussions: We compare the real-time classification accuracy of the different algorithms. As shown in Table 1, the TXW classification accuracy is 21.98 percentage points higher than that of LSTM, 18.23 percentage points higher than that of Transformer, 4.33 percentage points higher than that of Rapid, and 0.81 percentage points higher than that of the TXW algorithm without the weight module. These results demonstrate that the TXW algorithm offers high accuracy and strong noise resistance. We consider the results at 2 d post-trigger as the early-epoch transient classification results, and those at 24 d post-trigger as the late-epoch results. This paper uses confusion matrices, precision-recall (PR) curves, and receiver operating characteristic (ROC) curves as performance indicators for the algorithms. Figure 5 displays the confusion matrices, showing that the TXW results at 2 d and 24 d post-trigger are superior to those of Rapid. Additionally, the accuracy of the TXW algorithm at 2 d post-trigger exceeds 0.5. Precision-recall curves and average precision (AP) values are presented in Fig. 6. At 2 d post-trigger, the average AP of the TXW algorithm is 0.25 higher than that of Rapid, with TDE higher by 0.03, KN by 0.1, SNIa by 0.21, and SLSN-I by 0.16. At 24 d post-trigger, the average AP of the TXW algorithm is 0.17 higher than that of Rapid, with TDE higher by 0.02, KN by 0.03, SNIa by 0.09, and SLSN-I by 0.13. ROC curves and area under the curve (AUC) values are shown in Fig. 7. At 2 d post-trigger, the micro-average and macro-average AUC of the TXW algorithm are higher than those of Rapid by 0.1 and 0.08, respectively, with TDE higher by 0.02, KN by 0.09, SNIa by 0.19, and SLSN-I by 0.09. At 24 d post-trigger, the micro-average AUC is higher than that of Rapid by 0.04, the macro-average by 0.05, TDE by 0.04, KN by 0.02, SNIa by 0.1, and SLSN-I by 0.05. Figure 8 shows the AUC over time for the TXW and Rapid algorithms. Both algorithms improve over time. However, after t > 40, the AUC of the Rapid algorithm decreases due to the influence of noise, whereas the TXW algorithm mitigates noise effects. The maximum AUC of the Rapid algorithm is greater than 0.8, while that of the TXW algorithm exceeds 0.9. Overall, the TXW algorithm consistently outperforms the Rapid algorithm in both early-epoch and late-epoch results, showing higher accuracy and better noise resistance, which is particularly beneficial for early classification of small-sample transients.

Conclusions: We propose an early classification algorithm, TXW, for small-sample transients. In the design of the TXW algorithm, the TCN has stronger feature extraction abilities than the gated recurrent unit (GRU). The TXW algorithm not only possesses the advantages of the XGBoost algorithm, including high accuracy and strong robustness, but also, owing to the TCN module, addresses a shortcoming of random forest (RF) and XGBoost, which ignore correlations between attributes in datasets. Additionally, the residual block in the algorithm resolves the issue of CNN overfitting. Because transients have short time scales, we propose a new weighting formula to address the issue where noise from prematurely disappearing signal sources is misclassified as features. We compare the classification results of TXW with those of LSTM, Transformer, Rapid, and TXW without the weight module. We also analyze the results using performance indicators such as accuracy, confusion matrix, PR curve, AP value, ROC curve, and AUC value. The results show that the TXW algorithm has high accuracy, strong robustness, and good noise resistance. The comprehensive performance of the TXW algorithm is better than that of the Rapid algorithm. The TXW algorithm contributes significantly to research on small-sample transients.
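The abstract reports per-class, micro-average, and macro-average AUC values. The sketch below shows one common way these three quantities relate in a one-vs-rest setting, using only the Python standard library. The probability scores are made-up toy numbers, not results from the paper, and the helper names `binary_auc` and `macro_micro_auc` are hypothetical. The paper's exact averaging procedure is not stated in the abstract, so this is an assumption about the usual convention.

```python
from typing import List, Sequence, Tuple

CLASSES = ["TDE", "KN", "SNIa", "SLSN-I"]


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_micro_auc(y_true: Sequence[int], proba: Sequence[Sequence[float]],
                    n_classes: int) -> Tuple[List[float], float, float]:
    """One-vs-rest AUC per class, plus macro and micro averages."""
    per_class: List[float] = []
    flat_labels: List[int] = []
    flat_scores: List[float] = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[c] for row in proba]
        per_class.append(binary_auc(labels, scores))
        flat_labels += labels      # micro: pool every (sample, class) pair
        flat_scores += scores
    macro = sum(per_class) / n_classes  # macro: unweighted mean over classes
    micro = binary_auc(flat_labels, flat_scores)
    return per_class, macro, micro


# Toy class-probability scores for six objects (illustrative values only).
y_true = [0, 1, 2, 3, 2, 0]
proba = [
    [0.6, 0.1, 0.2, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.4, 0.3],
    [0.1, 0.1, 0.3, 0.5],
    [0.3, 0.1, 0.5, 0.1],
    [0.3, 0.4, 0.2, 0.1],
]
# Each object is assigned to the class with the highest score.
predicted = [max(range(len(CLASSES)), key=row.__getitem__) for row in proba]
per_class, macro, micro = macro_micro_auc(y_true, proba, len(CLASSES))
```

The macro average weights every class equally, so a rare class such as KN counts as much as a common one, while the micro average pools all decisions and is dominated by the more populous classes. This difference matters when the classes of interest are rare.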

transient source; classification; photometry; small sample; machine learning

Li Mengci, Liu Chengzhi, Wu Chao, Kang Zhe, Deng Shiyu, Li Zhenwei


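Changchun Observatory, National Astronomical Observatories, Chinese Academy of Sciences, Changchun 130117, Jilin, China

University of Chinese Academy of Sciences, Beijing 100049, China

Key Laboratory of Space Object and Debris Observation, Purple Mountain Observatory, Chinese Academy of Sciences, Nanjing 210008, Jiangsu, China

National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China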



2024

Acta Optica Sinica
Chinese Optical Society; Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.931
ISSN: 0253-2239
Year, volume (issue): 2024, 44(24)