When the traditional structure from motion (SfM) algorithm is used for 3D reconstruction from unmanned aerial vehicle (UAV) imagery, it relies mainly on the random sample consensus (RANSAC) algorithm to reduce feature-point mismatches and the impact of moving targets on the overall sparse point cloud. However, these problems lower the accuracy of RANSAC and increase the number of iterations it needs when solving camera poses. This article performs target detection with a deep learning single shot multibox detector (SSD) network. First, after feature points are extracted with the scale-invariant feature transform (SIFT), those that fall within detected regions of dynamic target categories are removed. Then, after K-nearest neighbor (KNN) brute-force matching, mismatches are removed, which reduces both the feature points lying on invalid moving targets and the mismatches between different categories. As a result, at the same confidence level, RANSAC needs fewer iterations to solve the camera pose, and both the brute-force feature matching time and the time the SfM algorithm takes to compute 3D points are reduced. Finally, the feasibility of the deep-learning-optimized 3D reconstruction algorithm was verified on 12 images from 2 scenes.
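The filtering step, and the reason it can lower the RANSAC iteration count, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The label set `DYNAMIC_CLASSES`, the detection format `(label, (x_min, y_min, x_max, y_max))`, and the function names are hypothetical, chosen only for illustration. The iteration estimate is the standard formula N = log(1 - p) / log(1 - w^s), where p is the confidence, w the inlier ratio, and s the minimal sample size; the value s = 8 used below is an assumption, since the abstract does not state which pose solver is used.

```python
import math

# Hypothetical set of labels treated as dynamic; the source does not list its categories.
DYNAMIC_CLASSES = {"car", "person"}


def in_box(point, box):
    """Return True if point (x, y) lies inside box (x_min, y_min, x_max, y_max)."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def drop_dynamic_keypoints(points, detections):
    """Keep only keypoints that lie outside every dynamic-class detection box.

    detections is a list of (label, box) pairs, an assumed detector output format.
    """
    dynamic_boxes = [box for label, box in detections if label in DYNAMIC_CLASSES]
    return [p for p in points if not any(in_box(p, b) for b in dynamic_boxes)]


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Standard estimate of RANSAC iterations: ceil(log(1 - p) / log(1 - w^s))."""
    if not 0.0 < inlier_ratio < 1.0:
        raise ValueError("inlier_ratio must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    all_inlier_prob = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - all_inlier_prob))


# Toy data: one keypoint falls on a detected car, one on a static building.
keypoints = [(10, 10), (50, 50), (120, 80)]
detections = [("car", (40, 40, 60, 60)), ("building", (100, 60, 140, 100))]
static_keypoints = drop_dynamic_keypoints(keypoints, detections)

# At the same confidence, a higher inlier ratio needs fewer iterations.
iters_before = ransac_iterations(inlier_ratio=0.5, sample_size=8)
iters_after = ransac_iterations(inlier_ratio=0.8, sample_size=8)
```

In this toy example only the keypoint inside the car box is discarded, and the keypoint on the building is kept because its category is not dynamic. The inlier ratios 0.5 and 0.8 are made-up values, not measurements; they only show that raising the inlier ratio at fixed confidence lowers the estimated iteration count.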