Exploring Effective Factors Leading to Data Leakage in Pre-trained Language Models
Pre-trained language models are widely used to learn general language representations from massive training corpora. While pre-training has substantially improved performance on downstream natural language processing tasks, the over-fitting tendency of deep neural networks means that a pre-trained language model may risk leaking private information from its training corpus. This paper takes widely used pre-trained language models such as T5, GPT, and OPT as research objects and uses model inversion attacks to explore the factors that affect data leakage from pre-trained language models. In the experiments, each pre-trained language model is used to generate a large number of samples, and the samples most likely to pose a leakage risk are selected for verification using metrics such as perplexity. The results show that models such as T5 exhibit different degrees of data leakage; for the same model, the larger the model scale, the greater the possibility of data leakage; and adding a specific prefix makes it easier to extract leaked data. Finally, future data leakage problems and their defense methods are discussed.
natural language processing; pre-trained language models; private data leakage; model inversion attack; model architecture
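The sampling-and-ranking pipeline described in the abstract can be illustrated with a minimal sketch. The snippet below assumes a HuggingFace GPT-2 checkpoint as a stand-in for the paper's models (T5, GPT, OPT); the prefix string, sampling settings, and sample counts are illustrative choices, not the paper's exact configuration.

```python
# Sketch of a model inversion / extraction pipeline: sample from a PLM,
# then rank the samples by perplexity to flag candidates for leakage checks.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device).eval()

def perplexity(text: str) -> float:
    # Perplexity = exp(mean negative log-likelihood of the tokens).
    ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean NLL per token
    return torch.exp(loss).item()

# Step 1: generate many samples, optionally seeded with a specific prefix
# (the paper finds that prefixes make leaked data easier to recover).
prefix = "My email address is"  # hypothetical prefix for illustration
inputs = tokenizer(prefix, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    do_sample=True,
    top_k=40,
    max_new_tokens=64,
    num_return_sequences=100,
    pad_token_id=tokenizer.eos_token_id,
)
samples = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# Step 2: rank by perplexity; unusually low perplexity means the model is
# unusually confident, which flags candidates for manual verification
# against the training corpus.
ranked = sorted(samples, key=perplexity)
for text in ranked[:10]:
    print(f"{perplexity(text):8.2f}  {text!r}")
```

Ranking by the target model's own perplexity is the simplest selection criterion; in practice such pipelines often refine it, for example by comparing against a reference model's perplexity, to filter out generically low-perplexity text.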