To improve the accuracy of traffic flow prediction and to address the high-dimensional nonlinearity and spatio-temporal dependence of traffic flow, a traffic flow forecasting model that combines feature distillation with a variational Bayesian encoder (ST-DVBE) is proposed. First, multi-modal time slots and spatial slots are constructed to extract the time-window characteristics corresponding to each time series. Second, the spatio-temporal slot feature extraction model feeds a feature knowledge distillation architecture, which extracts a crystallization of spatio-temporal features; the teacher model guides the learning process of the student model, improving the student model's generalization ability. Finally, a variational Bayesian encoder captures the latent variables of the traffic flow data by encoding the crystallized spatio-temporal features, and a decoder reconstructs the generated latent variables into new predicted values. Experimental results demonstrate a significant improvement in predictive performance with the proposed model, particularly greater robustness in mid- and long-term forecasting.
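The three stages summarized above (slot construction, teacher-guided feature distillation, and variational encoding) can be illustrated with a minimal NumPy sketch. It is not the ST-DVBE implementation: the abstract does not specify the slot layout, the distillation loss, or the latent prior, so the function names, the mean-squared-error feature loss, and the standard-normal prior below are all assumptions chosen because they are common defaults for these techniques.

```python
import numpy as np


def sliding_windows(series, window, horizon):
    """Split a 1-D series into (input window, target) pairs.

    Hypothetical stand-in for time-slot construction; the paper's
    multi-modal slots are not described in the abstract.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series too short for the requested window and horizon")
    X = np.stack([series[i:i + window] for i in range(n)])
    y = np.stack([series[i + window:i + window + horizon] for i in range(n)])
    return X, y


def feature_distillation_loss(student_feat, teacher_feat):
    """Mean squared error between student and teacher features (assumed loss)."""
    student_feat = np.asarray(student_feat, dtype=float)
    teacher_feat = np.asarray(teacher_feat, dtype=float)
    return float(np.mean((student_feat - teacher_feat) ** 2))


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)): summed over latent dims, averaged over batch."""
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)
    return float(np.mean(kl))
```

In a full model of this kind, the training objective would typically add a reconstruction (prediction) error to the KL term and to a weighted distillation term; the weighting is a design choice that the abstract does not state.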