Virtual try-on methods based on mask synthesis are prone to severe occlusion artifacts, and insufficient clothing feature extraction causes complex garment details to be altered. To address this, a diffusion model is adopted as the backbone framework for the virtual try-on task, and a virtual try-on method based on a denoising diffusion generative model is proposed. At each time step, Gaussian noise is added to the person (model) image, and the generative model is trained to gradually transform the Gaussian noise distribution back into the data distribution. The target clothing image serves as the condition that guides the reverse sampling of the diffusion model, producing a clean image of the person wearing the target clothing. A U-Net is employed as the backbone of the noise prediction network, and a texture alignment module based on the cross-attention mechanism is embedded in different layers of the U-Net. To keep the identity information of the person and the background unchanged, a blended diffusion method mixes the noised images of the two stages at the same step through a mask, so that the otherwise diverse diffusion output is constrained to image reconstruction within the region that changes. Experimental results show that the proposed method improves the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) of the generated images and reduces the average Fréchet inception distance (FID).
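The two operations named above, Gaussian forward noising and mask-based mixing of same-step noised images, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `forward_noise` and `blend_step`, the flat-list image representation, the value of `alpha_bar_t`, and the stand-in denoiser output are all assumptions made for illustration, and the forward step uses the standard DDPM closed form, which the abstract does not state explicitly.

```python
import math
import random


def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t from N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).

    Standard DDPM closed-form forward step (assumed, not taken from the source).
    x0 is a flat list of pixel values; alpha_bar_t lies in [0, 1].
    """
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def blend_step(generated_t, original_t, mask):
    """Mix two noised images of the same step through a mask.

    Where mask == 1 (the region allowed to change, e.g. the clothing area),
    the generated pixel is kept; where mask == 0, the noised original is kept,
    so identity and background are preserved.
    """
    return [m * g + (1.0 - m) * o for g, o, m in zip(generated_t, original_t, mask)]


# Hypothetical 4-pixel example.
rng = random.Random(0)
person = [0.2, 0.5, 0.8, 0.1]        # original person image (assumed values)
mask = [0.0, 1.0, 1.0, 0.0]          # 1 marks the region that may change
alpha_bar_t = 0.5                    # assumed noise level at step t

original_t = forward_noise(person, alpha_bar_t, rng)
generated_t = [0.0, 0.9, -0.3, 0.0]  # stand-in for the denoiser output at step t
mixed_t = blend_step(generated_t, original_t, mask)
```

In a full sampler, `blend_step` would be applied after every reverse step, so that only the masked region is ever synthesized while the remaining pixels track the noised original.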