Tibetan Semantic Chunking Classification and Labeling based on Tibetan Syllables and BiLSTM-CRF
A Tibetan semantic chunking recognition method that combines Tibetan syllable vectors with a BiLSTM-CRF hybrid model is proposed to address the diverse semantic types and ambiguities encountered in the semantic analysis of Tibetan sentences. First, 13 semantic chunking annotation standards were developed, and a semantic chunking annotation corpus comprising 13,211 sentences was constructed. On this basis, the Tibetan semantic chunking recognition and classification model was trained using the TS-BiLSTM-CRF method. In the comprehensive test experiment, the accuracy rate, the recall rate, and the F1 value are 75.03%, 76.52%, and 75.77%, respectively. Across the semantic chunk types, the INS class is recognized with a much higher accuracy rate than the other types, at 90.87%, while the ORG class has the lowest accuracy rate, at 66.67%. This study validates that the TS-BiLSTM-CRF model exhibits strong performance in Tibetan semantic chunking recognition and analysis tasks.
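The three overall scores can be checked against one another. The sketch below assumes that the reported "accuracy rate" plays the role of precision, so that F1 is the harmonic mean of that value and the recall rate. The abstract does not state this reading, and the function name `f1_score` is illustrative rather than taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall scores reported in the abstract, in percent.
# Assumption: the "accuracy rate" is treated as precision here.
reported_precision = 75.03
reported_recall = 76.52

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # F1 = 75.77%
```

Under this assumption the computed value agrees with the reported F1 of 75.77%.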