To address the inconsistency between generated images and their semantic layouts in semantic image synthesis, as well as unsatisfactory image quality and implausible combinations of local image regions, a semantic image synthesis method is proposed that combines a local-global network model with instance-adaptive normalization. In the global network, the instance-adaptive normalization model applies stochastic sampling over instances of the same class to modulate the parameters of each instance independently, improving instance matching at the global level. A local generation network is then built, with a sub-generator constructed for each instance according to its instance label, so that finer details can be captured. Finally, the local and global networks are combined, fusing local instance image features into the generated image so that individual instances are rendered clearly at the global scale. The optimized model was tested on COCO-Stuff, ADE20K, and Cityscapes. The experimental results show that the generated images are visually finer and more realistic, with an average increase of 2.7 in mIoU and an average decrease of 5.1 in FID.
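To make the idea of instance-wise modulation concrete, the following is a minimal single-channel sketch in plain Python. It is not the proposed model: it assumes that each class carries a Gaussian distribution over a scale and a shift, and that one (gamma, beta) pair is sampled per instance and applied to the normalized features of that instance. The function name `instance_adaptive_norm`, the `class_stats` layout, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
import random


def instance_adaptive_norm(feat, inst_map, inst_class, class_stats, rng, eps=1e-5):
    """Illustrative single-channel instance-wise modulation (assumed form).

    feat:        H x W list of floats (one feature channel)
    inst_map:    H x W list of instance ids
    inst_class:  dict mapping instance id -> class label
    class_stats: dict mapping class label ->
                 (gamma_mean, gamma_std, beta_mean, beta_std)  # assumed layout
    rng:         random.Random instance used for sampling
    """
    # Normalize the whole channel to zero mean and unit variance.
    values = [v for row in feat for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    inv_std = 1.0 / math.sqrt(var + eps)

    # Sample one (gamma, beta) pair per instance from its class distribution,
    # so two instances of the same class can receive different modulation.
    params = {}
    for inst_id in sorted(inst_class):
        g_mu, g_sd, b_mu, b_sd = class_stats[inst_class[inst_id]]
        params[inst_id] = (rng.gauss(g_mu, g_sd), rng.gauss(b_mu, b_sd))

    # Apply each instance's own scale and shift to its pixels.
    out = []
    for feat_row, inst_row in zip(feat, inst_map):
        out_row = []
        for v, inst_id in zip(feat_row, inst_row):
            gamma, beta = params[inst_id]
            out_row.append(gamma * (v - mean) * inv_std + beta)
        out.append(out_row)
    return out, params


# Toy example: two instances (ids 1 and 2) of the same hypothetical class "car".
feat = [[1.0, 2.0], [3.0, 4.0]]
inst_map = [[1, 1], [2, 2]]
inst_class = {1: "car", 2: "car"}
class_stats = {"car": (1.0, 0.1, 0.0, 0.1)}
out, params = instance_adaptive_norm(
    feat, inst_map, inst_class, class_stats, random.Random(0)
)
```

In this sketch both instances belong to the same class yet receive separately sampled parameters, which is the property the abstract describes as modulating each instance independently. A full model would presumably predict the per-class statistics with learned layers and operate on multi-channel feature maps.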