Handling data scarcity through data augmentation for detecting offensive speech
Detecting offensive speech is challenging because there is no universally accepted definition of where its boundaries lie. The scarcity of labeled data adds a further obstacle to training robust offensive speech detection models. In this paper, we propose an approach that handles data scarcity through data augmentation techniques tailored to offensive speech detection. By augmenting the existing labeled data with speech samples generated through noise injection, our method expands the training dataset and enables more comprehensive model training. We evaluate our approach on the Vera am Mittag (VAM) corpus and demonstrate significant improvements in offensive speech detection performance over training without data augmentation. Our findings highlight the efficacy of data augmentation in mitigating data scarcity and enhancing the reliability of offensive speech detection systems in real-world scenarios.
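As a rough illustration of noise-injection augmentation, the sketch below adds white Gaussian noise to each labeled waveform at a chosen signal-to-noise ratio and keeps the original label. The abstract does not specify the noise type, the SNR levels, or the data format, so the Gaussian noise, the 20 dB and 10 dB levels, and the hypothetical helpers `add_noise` and `augment` are all assumptions made for this example, not the paper's actual pipeline.

```python
import math
import random


def add_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise at `snr_db` dB SNR.

    Assumption: white Gaussian noise; the paper does not state the noise type.
    """
    if rng is None:
        rng = random.Random()
    if not signal:
        return []
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # Noise power that yields the requested signal-to-noise ratio.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def augment(dataset, snr_levels_db=(20, 10), seed=0):
    """Expand (waveform, label) pairs with one noisy copy per SNR level.

    Assumption: the SNR levels are illustrative, not taken from the paper.
    """
    rng = random.Random(seed)
    augmented = list(dataset)  # keep the original samples
    for waveform, label in dataset:
        for snr_db in snr_levels_db:
            # The label is unchanged: noise should not alter offensiveness.
            augmented.append((add_noise(waveform, snr_db, rng), label))
    return augmented
```

With two SNR levels, a set of N labeled clips grows to 3N. In practice the augmented copies would be added to the training split only, so that evaluation still reflects clean recordings.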