Design of Surveillance Video Summarization System Based on Dynamic Transformer
A surveillance video summarization system is an important technical tool for extracting key information from large and complex surveillance videos, and it provides effective support for security management and event analysis. With the spread of surveillance devices and the rapid growth of surveillance video data, traditional manual summarization methods can no longer meet the demand for fast processing and accurate extraction of the required information. Modern deep learning methods commonly suffer from high computational complexity and large parameter counts. To address this issue, a dynamic Transformer-based surveillance video summarization model is proposed. The model automatically assigns an appropriate number of tokens to each input video frame: it cascades multiple Transformer models with gradually increasing numbers of generated tokens and activates them adaptively in sequence. Once a sufficiently confident prediction is produced, the inference process terminates. The model also adopts feature reuse and attention reuse techniques to reduce redundant computation, which significantly lowers the computational complexity. Experimental tests show that, compared with traditional models, the dynamic Transformer model improves the accuracy and F-score metrics by 3.7% and 0.9% on two publicly available datasets, respectively, while reducing the computational complexity by 40%. The model can meet the precision requirements of surveillance applications and demonstrates good generalization performance.
video summarization techniques; dynamic Transformer; computational complexity; feature reuse; attention reuse
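
The cascaded, early-terminating inference described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token counts (16, 49, 196), the confidence threshold of 0.9, and the names `cascade_predict` and `make_toy_stage` are all assumptions introduced for illustration, and each "stage" is a toy function standing in for a Transformer. The list passed from one stage to the next only hints at where feature reuse would plug in.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A stage takes a frame plus features reused from the previous stage and
# returns (class probabilities, features handed on to the next stage).
Stage = Callable[
    [Sequence[float], Optional[List[float]]],
    Tuple[List[float], List[float]],
]


def cascade_predict(
    frame: Sequence[float],
    stages: Sequence[Stage],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> Tuple[int, float, int]:
    """Run stages from fewest to most tokens; stop at the first confident one.

    Returns (predicted label, confidence, number of stages executed).
    """
    if not stages:
        raise ValueError("at least one stage is required")
    reused: Optional[List[float]] = None
    for used, stage in enumerate(stages, start=1):
        probs, reused = stage(frame, reused)
        confidence = max(probs)
        if confidence >= threshold:
            break  # early termination: later, costlier stages are skipped
    return probs.index(confidence), confidence, used


def make_toy_stage(num_tokens: int) -> Stage:
    """Hypothetical stand-in for a Transformer that uses `num_tokens` tokens."""

    def stage(frame, reused):
        # Toy rule: the mean of the frame acts as its "difficulty", and more
        # tokens shrink the uncertainty. A real model would run attention here.
        difficulty = sum(frame) / len(frame)
        confidence = max(0.5, 1.0 - difficulty * 16 / num_tokens)
        # Pass earlier features forward (placeholder for feature reuse).
        features = (reused or []) + [float(num_tokens)]
        return [confidence, 1.0 - confidence], features

    return stage


# Three cascaded stages with increasing token counts (assumed values).
stages = [make_toy_stage(n) for n in (16, 49, 196)]

easy_frame = [0.05, 0.05]
hard_frame = [0.4, 0.4]
print(cascade_predict(easy_frame, stages))
print(cascade_predict(hard_frame, stages))
```

In this toy setup an easy frame exits after the first, cheapest stage, while a hard frame runs through all three, which is the source of the average compute saving that the abstract reports.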