To address the problems caused by multi-scale variation and background noise in crowd counting, a multi-branch feature fusion network is proposed. At the front end, a bidirectional feature fusion path repeatedly extracts and fuses deep semantic information with shallow spatial detail. Position attention and channel attention mechanisms are employed to strengthen the network's ability to discriminate between crowd and background, yielding high-quality feature maps. At the back end, dense residual connections enhance the extraction of multi-scale information for continuous head counting and produce the final crowd density maps. To verify the effectiveness of the proposed model, comparative experiments were conducted on the ShanghaiTech, UCF_CC_50, and UCF_QNRF datasets. The experimental results demonstrate that the proposed network achieves better counting accuracy than conventional networks.
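
As an illustration of the density-map formulation the network targets, the following minimal sketch shows how a crowd count can be read off a density map by summing its pixels. It does not reproduce the proposed network: the function names, the fixed Gaussian kernel (`sigma`, `radius`), and the toy head coordinates are all assumptions made for this example, and the abstract does not state how ground-truth maps are generated.

```python
import math


def gaussian_density_map(heads, height, width, sigma=2.0, radius=6):
    """Build a density map whose pixel sum equals the number of annotated heads.

    `heads` is a list of (row, col) positions assumed to lie inside the image.
    The kernel parameters are illustrative assumptions, not values from the paper.
    """
    density = [[0.0] * width for _ in range(height)]
    for row, col in heads:
        # Unnormalized Gaussian weights for the in-bounds pixels around this head.
        weights = {}
        for r in range(max(0, row - radius), min(height, row + radius + 1)):
            for c in range(max(0, col - radius), min(width, col + radius + 1)):
                d2 = (r - row) ** 2 + (c - col) ** 2
                weights[(r, c)] = math.exp(-d2 / (2.0 * sigma ** 2))
        total = sum(weights.values())
        # Normalize so each head contributes exactly one unit of mass,
        # even when the kernel is clipped by the image border.
        for (r, c), w in weights.items():
            density[r][c] += w / total
    return density


def count_from_density(density):
    """Estimate the crowd count as the sum over all density-map pixels."""
    return sum(sum(row) for row in density)


# Toy example: three hypothetical head annotations in a 32x32 image.
heads = [(5, 5), (10, 20), (0, 0)]
density = gaussian_density_map(heads, height=32, width=32)
estimated_count = count_from_density(density)
```

Because each kernel is normalized to unit mass, the sum of the map recovers the number of annotated heads up to floating-point error. A predicted density map from a counting network would be summed in the same way to obtain the estimated count.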