[Objective] This paper uses knowledge distillation to improve the performance of a small-parameter model under the guidance of a high-performance large-parameter model when labeled samples are insufficient. It aims to address sample scarcity and reduce the cost of high-performance large-parameter models in natural language processing. [Methods] First, we used noise purification to obtain valuable data from an unlabeled corpus. Then, we assigned pseudo labels to these data to increase the number of labeled samples. Meanwhile, we added a knowledge review mechanism and a teaching assistant model to the traditional distillation model to realize comprehensive knowledge transfer from the large-parameter model to the small-parameter model. [Results] We conducted text classification and sentiment analysis tasks with the proposed model on the IMDB, AG_NEWS, and Yahoo! Answers datasets. With only 5% of the original data labeled, the new model's accuracy was only 1.45%, 2.75%, and 7.28% lower than that of the traditional distillation model trained with the original data. [Limitations] We only examined the new model on text classification and sentiment analysis tasks in natural language processing; other tasks need to be explored in the future. [Conclusions] The proposed method achieves a better distillation effect and improves the performance of the small-parameter model.
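As a rough illustration of the pipeline outlined in the Methods (not the authors' actual implementation), the following PyTorch-style sketch shows how confidence-filtered pseudo-labeled samples and an intermediate guiding model (teacher or teaching assistant) might be combined with a standard softened-logit distillation loss; the function names, confidence threshold, temperature, and loss weighting are all assumptions for illustration.

```python
# Hypothetical sketch of teacher -> teaching assistant -> student distillation
# with pseudo-labeled data; thresholds, temperature, and loss weights are
# illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn.functional as F


def pseudo_label(teacher, unlabeled_loader, threshold=0.9):
    """Keep only unlabeled samples whose teacher prediction is confident
    (a simple stand-in for a noise-purification step)."""
    teacher.eval()
    kept_inputs, kept_labels = [], []
    with torch.no_grad():
        for x in unlabeled_loader:
            probs = F.softmax(teacher(x), dim=-1)
            conf, labels = probs.max(dim=-1)
            mask = conf >= threshold
            if mask.any():
                kept_inputs.append(x[mask])
                kept_labels.append(labels[mask])
    return torch.cat(kept_inputs), torch.cat(kept_labels)


def distill_step(student, guide, x, y, optimizer, temperature=2.0, alpha=0.5):
    """One distillation step: cross-entropy on (possibly pseudo) labels plus
    KL divergence to the softened outputs of the guiding model
    (the teacher or the teaching assistant)."""
    guide.eval()
    student.train()
    with torch.no_grad():
        guide_logits = guide(x)
    student_logits = student(x)
    ce = F.cross_entropy(student_logits, y)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(guide_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    loss = alpha * ce + (1 - alpha) * kd
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Two-stage transfer: the large teacher first distills into a mid-sized
# teaching assistant via distill_step(assistant, teacher, ...), which then
# distills into the small student via distill_step(student, assistant, ...).
# (Full training loops over labeled + pseudo-labeled batches are omitted.)
```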