Toxic comments detection based on bidirectional capsule network
Existing detection models struggle to accurately identify toxic comments that vary in linguistic style and carry implicit semantics. To address this, a toxic comment detection model based on a bidirectional capsule network is proposed. First, the BERT model is used to compute word embeddings for the comment text, producing an input matrix. This matrix is passed to a bidirectional feature extraction layer composed of stacked LSTMs, bidirectional capsule networks, and attention networks, which captures the deep semantic information of the text in the forward and backward directions simultaneously. The resulting forward and backward matrices are concatenated and fed into an attention mechanism, which focuses on the words most related to toxicity and generates an output vector. Second, the output vector is concatenated with a contextual auxiliary feature vector to enrich the feature representation. Finally, the concatenated vector is fed into a fully connected layer, and the comment text is classified through the Sigmoid activation function. Experiments on the Wikipedia toxic comment dataset show that, compared with existing work, the proposed model achieves significant performance improvements: it captures richer semantic information in comment texts and detects toxic comments effectively.
BERT language model; bidirectional capsule network; contextual auxiliary features; toxic comments detection
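
The last stages of the pipeline described in the abstract (concatenating the forward and backward matrices, attention pooling, appending the contextual auxiliary feature vector, and Sigmoid classification) can be illustrated with a small standard-library sketch. It is not the authors' implementation. The BERT, LSTM, and capsule layers are replaced by random toy matrices, the dot-product attention with a learned `query` vector is an assumed scoring form (the abstract does not specify one), and all names and sizes (`attention_pool`, `classify`, `T`, `d`, `c`, `n_labels`) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_pool(H, query):
    """Weight each token vector in H by its dot product with `query`
    (an assumed scoring function) and return the weighted sum."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in H]
    weights = softmax(scores)
    dim = len(H[0])
    pooled = [sum(w * h[k] for w, h in zip(weights, H)) for k in range(dim)]
    return pooled, weights


def classify(forward, backward, context, query, W, b):
    # Concatenate forward and backward features token by token.
    H = [f + bk for f, bk in zip(forward, backward)]
    # Attention produces a single output vector for the comment.
    pooled, weights = attention_pool(H, query)
    # Append the contextual auxiliary feature vector.
    features = pooled + context
    # Fully connected layer followed by a per-label Sigmoid.
    probs = [
        sigmoid(sum(w * x for w, x in zip(row, features)) + bias)
        for row, bias in zip(W, b)
    ]
    return probs, weights


def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


# Toy sizes, all hypothetical: T tokens, d features per direction,
# c contextual auxiliary features, n_labels output labels.
random.seed(0)
T, d, c, n_labels = 4, 3, 2, 2
forward = [rand_vec(d) for _ in range(T)]    # stand-in for forward capsule output
backward = [rand_vec(d) for _ in range(T)]   # stand-in for backward capsule output
context = rand_vec(c)                        # stand-in for auxiliary features
query = rand_vec(2 * d)                      # untrained attention query
W = [rand_vec(2 * d + c) for _ in range(n_labels)]
b = [0.0] * n_labels

probs, weights = classify(forward, backward, context, query, W, b)
print("attention weights:", weights)
print("label probabilities:", probs)
```

Because each label gets its own Sigmoid output rather than sharing a softmax, the probabilities need not sum to one, so a comment can be assigned several labels at once. In a trained model the query vector, `W`, and `b` would be learned rather than sampled at random.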