Research on the construction and automatic annotation of a Tibetan sentiment corpus
Tibetan sentiment analysis faces several problems: basic training corpora are lacking, models require large amounts of data, and traditional manual annotation consumes considerable human and material resources and generalizes poorly. To address this, a fine-grained Tibetan sentiment corpus and a sentiment dictionary are constructed. First, each word is annotated with a sentiment intensity by three annotators. Then, the corpus is matched against the dictionary according to a set of rules. Finally, the average sentiment intensity score is used to represent the sentiment category of the text. The fine-grained sentiment corpus resources constructed in this paper can, to some extent, shorten the development cycle of large-scale annotated corpora and reduce the labor cost of corpus annotation.
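The three steps above can be illustrated with a minimal sketch. It is not the paper's implementation: the intensity scale (here -3 to 3), the exact-token matching rule, the 0.5 threshold, the function names `build_lexicon` and `label_text`, and the placeholder tokens are all assumptions made for illustration, since the abstract does not specify them.

```python
from statistics import mean

def build_lexicon(annotations):
    """Average the annotators' intensity scores for each word.

    `annotations` maps a word to the list of scores given by the
    three annotators (the -3..3 scale is an assumption).
    """
    return {word: mean(scores) for word, scores in annotations.items()}

def label_text(tokens, lexicon, threshold=0.5):
    """Return (average intensity, category) for a tokenized text.

    Matching is plain dictionary lookup per token, and the threshold
    that separates the categories is hypothetical.
    """
    matched = [lexicon[t] for t in tokens if t in lexicon]
    if not matched:
        # No dictionary word found: treat the text as neutral.
        return 0.0, "neutral"
    score = mean(matched)
    if score > threshold:
        return score, "positive"
    if score < -threshold:
        return score, "negative"
    return score, "neutral"

# Placeholder tokens stand in for segmented Tibetan words.
annotations = {
    "w_good": [2, 3, 2],
    "w_bad": [-3, -2, -3],
    "w_mild": [1, 0, 1],
}
lexicon = build_lexicon(annotations)
score, category = label_text(["w_good", "w_other", "w_mild"], lexicon)
```

In this sketch, tokens missing from the dictionary (such as `w_other`) are simply ignored, so the text score is the mean over matched words only. A real system would also need Tibetan word segmentation before the lookup step.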