Efficient title text detection using multi-loss
Springer Nature
YouTube's "Video Chapter" feature segments videos into sections marked by timestamps on the slider, which improves user navigation. Given the vast volume of video data, processing it efficiently demands substantial time and computational resources. This paper addresses two key objectives: reducing the computational cost of deep model training for text detection and enhancing overall performance with minimal effort. We introduce a classroom-based multi-loss learning approach for text detection and extend it to title detection without requiring annotations. In deep learning, loss functions play a crucial role in updating model weights. Our proposed multi-loss functions converge faster than baseline methods. Additionally, we present a novel technique for handling annotation-less data that uses a text grouping method to differentiate between regular text and title text. Experimental results on the COCO-Text and Slidin' Videos AI-5G Challenge datasets demonstrate the efficacy and practicality of our approach.
Classroom learning; Deep learning; Multi-loss; Slidin' videos AI-5G challenge; Title text spotting
Shitala Prasad, Anuj Abraham
School of Mathematics and Computer Science, Indian Institute of Technology Goa, Farmagudi, Ponda, Goa 403401, India
Technology Innovation Institute, Masdar City, Abu Dhabi 9639, UAE