To address the problem of limited Tibetan corpus resources and the scarcity of available pre-trained models for training, two pre-trained models with strong encoding capabilities are established: T-Transformer-XL and T-XLNet. These models are trained on a self-built large-scale Tibetan dataset, T-News. Considering the unique structure of the Tibetan script, the byte-pair encoding in the SentencePiece tokenization model is used to tokenize the Tibetan data. The tokenization strategy and objective function are adjusted to address Tibetan text generation under different computational budgets and application scenarios. The T-Transformer-XL model incorporates the segment-level recurrence mechanism and relative positional encoding to effectively model the contextual features of long texts, while the T-XLNet model applies permutation language modeling, using a two-stream self-attention mechanism to extract text features. Finally, a self-supervised manifold-based data augmentation method is employed, using a masked language model to generate realistic augmented samples that enrich the output text of the pre-trained models. Experimental results show that T-Transformer-XL and T-XLNet perform excellently in text generation tasks. The appropriate model can be selected according to the specific task requirements, available computational resources, and performance demands to achieve optimal application results.
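As a hedged illustration of the tokenization step described above, the minimal sketch below trains a byte-pair-encoding SentencePiece model on a raw Tibetan text file and applies it to a sample string; the file name t_news_raw.txt, the vocabulary size, and the character coverage are placeholder assumptions for the example, not settings reported in the paper.

import sentencepiece as spm

# Train a BPE SentencePiece model on raw Tibetan text.
# Input file, vocab size, and character_coverage are illustrative assumptions.
spm.SentencePieceTrainer.train(
    input="t_news_raw.txt",        # one Tibetan sentence per line (assumed file)
    model_prefix="tibetan_bpe",    # writes tibetan_bpe.model / tibetan_bpe.vocab
    vocab_size=32000,
    model_type="bpe",
    character_coverage=0.9995,     # retain nearly all Tibetan characters
)

# Load the trained model and split a Tibetan string into subword pieces and ids.
sp = spm.SentencePieceProcessor(model_file="tibetan_bpe.model")
pieces = sp.encode("བོད་ཡིག", out_type=str)
ids = sp.encode("བོད་ཡིག", out_type=int)
print(pieces)
print(ids)

The resulting subword vocabulary can then be shared by both pre-trained models, so the same tokenizer output feeds either T-Transformer-XL or T-XLNet depending on the chosen training objective.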