Multi-Label Text Classification Method Based on Label Concept
Multi-label text classification is an important and challenging task in natural language processing. Existing methods concentrate on text representation learning and predict labels from the information inside each text, but they ignore the key information shared by all instances that belong to a given label. To address this, we propose a multi-label text classification method based on the label concept. In the proposed method, word frequency and latent Dirichlet allocation (LDA) are used to extract the keywords corresponding to each label from all instances of the training set, and these keywords are then encoded in the same way as the text to obtain the label concept representation. During training and prediction, the label concepts most similar to the text representation are retrieved to assist classification, and a contrastive loss between the label concept representations and the text representation is added, so that global label concept information is fully learned during text encoding. Experimental results show that integrating the proposed method into commonly used multi-label text classification models significantly improves the performance of the respective models.
label concept; global key information; contrastive loss; multi-label text classification
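
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: keywords are selected by word frequency only (the LDA step is omitted), a normalized bag-of-words vector stands in for the neural text encoder, and the contrastive term is written as an InfoNCE-style loss with a hypothetical temperature of 0.1. All function names and the toy data are illustrative.

```python
import math
from collections import Counter


def extract_label_keywords(docs, labels, top_k=2):
    """Per label, keep the top_k most frequent words over all of its training texts.
    Assumption: frequency only; the LDA step from the abstract is left out."""
    counts = {}
    for text, doc_labels in zip(docs, labels):
        for label in doc_labels:
            counts.setdefault(label, Counter()).update(text.lower().split())
    return {label: [w for w, _ in c.most_common(top_k)] for label, c in counts.items()}


def encode(tokens, vocab):
    """Toy encoder: L2-normalized bag-of-words vector (stand-in for a neural encoder).
    The same function encodes texts and label keywords, as the abstract describes."""
    vec = [0.0] * len(vocab)
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Dot product; equals cosine similarity because inputs are already normalized."""
    return sum(x * y for x, y in zip(a, b))


def most_similar_label(text_vec, concept_vecs):
    """Retrieve the label whose concept representation is closest to the text."""
    return max(concept_vecs, key=lambda label: cosine(text_vec, concept_vecs[label]))


def contrastive_loss(text_vec, concept_vecs, gold_labels, temperature=0.1):
    """InfoNCE-style loss (an assumed form): pulls the text toward the concepts of
    its gold labels and pushes it away from the concepts of all other labels."""
    logits = {l: cosine(text_vec, v) / temperature for l, v in concept_vecs.items()}
    log_z = math.log(sum(math.exp(s) for s in logits.values()))
    return -sum(logits[l] - log_z for l in gold_labels) / len(gold_labels)


# Toy training set (illustrative only; stop-word filtering is not needed here).
docs = [
    "goal striker match goal",
    "match referee goal",
    "election vote parliament",
    "vote election senate vote",
    "stadium vote match election",
]
labels = [{"sports"}, {"sports"}, {"politics"}, {"politics"}, {"sports", "politics"}]

# Shared vocabulary: word -> index, in order of first appearance.
vocab = {}
for text in docs:
    for word in text.lower().split():
        vocab.setdefault(word, len(vocab))

keywords = extract_label_keywords(docs, labels)
concept_vecs = {label: encode(words, vocab) for label, words in keywords.items()}

query_vec = encode("goal match striker".split(), vocab)
print(most_similar_label(query_vec, concept_vecs))
print(contrastive_loss(query_vec, concept_vecs, {"sports"}))
```

In a full model, the contrastive term would presumably be added to the usual classification loss and the encoder would be trained end to end; the sketch only shows how keyword extraction, shared encoding, retrieval, and the contrastive term fit together.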