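Studying Lyman limit systems (LLS) is important for understanding the large-scale structure of the universe, galaxy evolution, and the distribution of gas within galaxy clusters. However, because of the distinctive nature of LLS absorption features, current studies mainly rely on traditional methods and identify LLS in small samples with column densities of 10^{19} cm^{-2} ≤ N(HI) < 10^{20.3} cm^{-2}. Using deep learning on mock spectra from the Dark Energy Spectroscopic Instrument (DESI), this paper optimizes a convolutional neural network (CNN) model and raises the identification accuracy for LLS (10^{18.5} cm^{-2} ≤ N(HI) ≤ 10^{20.0} cm^{-2}) to 95%. The completeness and purity of the model are then validated, and the column density and redshift of the LLS are estimated. The results show that, for S/N > 6, when 10^{18.5} cm^{-2} < N(HI) < 10^{19.0} cm^{-2}, the completeness of the CNN model exceeds 0.5 and the purity exceeds 0.2; when 10^{19.0} cm^{-2} < N(HI) < 10^{20.0} cm^{-2}, the completeness exceeds 0.9 and the purity exceeds 0.7; and when 10^{18.5} cm^{-2} < N(HI) < 10^{20.0} cm^{-2}, the mean difference between the estimated and true LLS column density is -0.05161 with a standard deviation of 0.239, while the mean difference between the estimated and true LLS redshift is -0.0003 with a standard deviation of 0.0009. These results show that the completeness of the model is generally higher than its purity. This is most pronounced at low column densities, where the LLS absorption features in the spectrum are very narrow and easily confused with other wavelength regions, so the model produces more false positive (FP) samples. In addition, the CNN estimates of LLS column density and redshift are slightly lower than the true values, and the scatter of the estimation errors is small. This study provides a reference method for future LLS research and encourages researchers to adapt and adopt CNN models for a variety of spectral analyses.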
Searching for Lyman limit systems in Dark Energy Spectroscopic Instrument mock spectra using a convolutional neural network
Studying Lyman limit systems (LLS) is crucial for a deeper understanding of the large-scale structure of the universe, the evolutionary history of galaxies, and the distribution of gas within galaxy clusters. Because LLS absorption features are distinctive, current research relies predominantly on traditional methods and focuses on identifying and analyzing small samples with column densities of 10^{19} cm^{-2} ≤ N(HI) < 10^{20.3} cm^{-2}. This study aims to move beyond these constraints by using deep learning to investigate a wider and more inclusive sample, which allows LLS with lower column densities to be detected and characterized. We use high-quality simulated spectra for the Dark Energy Spectroscopic Instrument (DESI) as the experimental foundation. By optimizing a convolutional neural network (CNN) model, we raise the identification accuracy for LLS with column densities of 10^{18.5} cm^{-2} ≤ N(HI) ≤ 10^{20.0} cm^{-2} in DESI mock spectra to 95%. We then validate the completeness and purity of the model under different signal-to-noise ratios and column densities, and analyze the differences between the CNN model's estimated and true values of column density and redshift. The results indicate that, when the signal-to-noise ratio exceeds 6, for LLS with column densities of 10^{18.5} cm^{-2} < N(HI) < 10^{19.0} cm^{-2} the completeness of the CNN model exceeds 0.5 and the purity exceeds 0.2. For LLS with column densities of 10^{19.0} cm^{-2} < N(HI) < 10^{20.0} cm^{-2}, the completeness exceeds 0.9 and the purity exceeds 0.7. Both the completeness and the purity of the model increase as the column density and the signal-to-noise ratio increase. Comparing the CNN model's estimated and true values within the range 10^{18.5} cm^{-2} < N(HI) < 10^{20.0} cm^{-2}, we find that the mean difference in LLS column density is -0.05161 with a standard deviation of 0.239, and that the mean difference in LLS redshift is -0.0003 with a standard deviation of 0.0009. These findings indicate that the model's completeness generally surpasses its purity, particularly at low column densities, where the absorption features of LLS are relatively weak and easily confused with other spectral features, resulting in a higher number of false positive (FP) samples. The CNN model also tends to slightly underestimate the column density and redshift of LLS, yet the estimation errors are tightly concentrated, which indicates that the model is robust. This study provides a novel analytical approach for future LLS research and encourages researchers to adopt and adapt CNN models for a broader range of spectral analyses, opening new avenues in cosmological research.
Dark Energy Spectroscopic Instrument quasi-stellar object spectra; Lyman limit systems; convolutional neural network; completeness and purity
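The abstract reports completeness, purity, and the mean and standard deviation of the differences between estimated and true values, but does not spell out how these are computed. The sketch below shows one common way to compute them. It assumes the usual definitions, completeness = TP / (TP + FN) and purity = TP / (TP + FP), and takes each difference as estimated minus true, which matches the negative means reported for a model that underestimates. The function names and the toy inputs are hypothetical and are not taken from the paper.

```python
import numpy as np


def completeness_and_purity(is_true_lls, is_predicted_lls):
    """Return (completeness, purity) from per-candidate truth and prediction flags.

    Assumed definitions (not stated in the abstract):
      completeness = TP / (TP + FN), the fraction of real LLS that are recovered
      purity       = TP / (TP + FP), the fraction of detections that are real LLS
    """
    truth = np.asarray(is_true_lls, dtype=bool)
    pred = np.asarray(is_predicted_lls, dtype=bool)
    tp = np.sum(truth & pred)    # real LLS that the model detected
    fn = np.sum(truth & ~pred)   # real LLS that the model missed
    fp = np.sum(~truth & pred)   # detections with no real LLS behind them
    completeness = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    purity = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    return float(completeness), float(purity)


def bias_and_scatter(estimated, true):
    """Return (mean, standard deviation) of estimated - true.

    A negative mean means the model underestimates on average.
    """
    diff = np.asarray(estimated, dtype=float) - np.asarray(true, dtype=float)
    return float(diff.mean()), float(diff.std())


# Toy labels, invented for illustration: 4 real LLS, 3 recovered, 2 false positives.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 1, 1, 0, 0]
print(completeness_and_purity(truth, pred))  # completeness = 0.75, purity = 0.6
```

In the toy example the completeness is higher than the purity because false positives outnumber missed systems, which is the same pattern the abstract describes at low column densities. Note that `diff.std()` is the population standard deviation; whether the paper uses that or the sample estimate is not stated.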