Short-term Volatility Prediction of Gold Futures Based on High-frequency Data and EN-LSTM
In 2020, the sudden outbreak of the COVID-19 pandemic triggered a profound integration of cutting-edge fields such as the internet, big data, and artificial intelligence into the financial markets. This integration has transformed the trading, settlement, and information dissemination modes of futures, causing a noticeable increase in the instability and uncertainty of the gold futures market. Exploring the inherent patterns of gold futures price volatility under these new conditions is essential for providing early warnings and preventing "black swan" risks for all participants in the gold futures market. The marginal contribution of this paper lies in two main areas. In terms of model improvement, an EN-LSTM combination is employed to predict high-frequency volatility in gold futures based on characteristics extracted from high-frequency data, demonstrating that the predictive performance of the integrated model is significantly superior to that of the LSTM model alone. In terms of empirical application, the paper achieves real-time out-of-sample forecasting of high-frequency data by dynamically contracting the rolling time window, enhancing the practicality of financial time series forecasting. The paper integrates an Elastic Net (EN) model and a Long Short-Term Memory (LSTM) model (referred to as EN-LSTM) for predicting high-frequency volatility in gold futures. Drawing inspiration from the contemporary practice of combining LASSO and LSTM models, a penalty term is introduced into the traditional linear regression model, with improvements made on the LASSO penalty, forming the EN model. The EN model is then used for variable shrinkage, reducing overfitting primarily through variable selection and regularization, which results in a novel integrated prediction model, EN-LSTM. The sample chosen in this study is the standard continuous main contract of gold futures from the Shanghai Futures Exchange, with a sample period from January 2, 2019, to December 31, 2020. High-frequency raw data is sourced from the Tonghuashun database. The paper first shrinks the 20-dimensional set of input variables using the EN model, then feeds the selected variables into the LSTM prediction model for training, ultimately outputting the high-frequency returns of Shanghai gold futures. The absolute value of the difference in returns is employed as a proxy variable for short-term volatility changes in Shanghai gold futures. From the empirical results, the following conclusions can be drawn. Firstly, in terms of data frequency, the prediction accuracy is higher with high-frequency data. Secondly, in terms of training time steps, the prediction performance is best with 15 training time steps. Furthermore, a comparison of returns before and after the impact of the pandemic confirms that the EN-LSTM prediction model accurately captures the changes brought about by the pandemic, reflecting the micro effects of macro environmental shifts in a timely manner. Additionally, further research is needed to determine the applicability of the EN model for the analysis and prediction of daily, monthly, and other data. The real-time dynamic contraction of the rolling time window also requires further development.
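The variable-shrinkage step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the data are synthetic, the penalty settings (`alpha`, `l1_ratio`) are assumed values, the coordinate-descent solver is a generic textbook Elastic Net, and the helper names (`elastic_net_cd`, `make_windows`, `volatility_proxy`) are hypothetical. The proxy is taken to be the absolute first difference of returns, which is one reading of the abstract. The windowed arrays are what a sequence model such as an LSTM would typically consume; the LSTM itself is omitted.

```python
import numpy as np

def soft_threshold(z, t):
    # Soft-thresholding operator that produces exact zeros (the L1 part).
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net_cd(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    # Coordinate descent for the objective
    # (1/2n)||y - Xb||^2 + alpha*l1_ratio*||b||_1 + 0.5*alpha*(1-l1_ratio)*||b||_2^2
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]          # remove feature j's contribution
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1.0 - l1_ratio))
            resid -= X[:, j] * beta[j]          # add the updated contribution back
    return beta

def make_windows(features, target, steps=15):
    # Each sample holds `steps` consecutive rows; the label is the next target value.
    X_seq = np.stack([features[i:i + steps] for i in range(len(features) - steps)])
    return X_seq, target[steps:]

def volatility_proxy(returns):
    # Assumed reading of the proxy: absolute first difference of returns.
    return np.abs(np.diff(returns))

# Synthetic stand-in for 20 high-frequency input variables (not real market data).
rng = np.random.default_rng(0)
n, p = 500, 20
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [1.0, -0.8, 0.6]                # only three variables carry signal
returns = X @ true_beta + 0.1 * rng.standard_normal(n)

# Standardize inputs and center the target before fitting.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
y_c = returns - returns.mean()

beta = elastic_net_cd(X_std, y_c)
selected = np.flatnonzero(beta != 0.0)          # variables kept by the Elastic Net

X_seq, y_seq = make_windows(X_std[:, selected], returns, steps=15)
proxy = volatility_proxy(returns)

print("selected variable indices:", selected)
print("sequence input shape:", X_seq.shape)
```

The 15-step window mirrors the training time step count reported as best in the abstract; in practice the penalty strength would be chosen by validation on a rolling window rather than fixed in advance.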