Motivated by the goal of enhancing the accuracy and robustness of visual inertial navigation systems (VINSs) across a wide spectrum of dynamic scenarios, protracted missions, and expansive navigation ranges, we designed a monocular visual inertial odometry (VIO) method augmented by planar environmental constraints. To attain efficient feature extraction and precise feature tracking, we extracted and tracked uniformly distributed features from accelerated segment test (FAST) feature points in video images, and subsequently removed outliers through a symmetric optical flow check. Additionally, we outlined the process of identifying coplanar feature points from the sparse feature set, enabling efficient plane detection and fitting. This approach constructed spatial geometric constraints on the three-dimensional coordinates of visual feature points without resorting to computationally expensive dense depth mapping. The heart of this method lay in the formulation of a comprehensive cost function that integrated the reprojection error of visual feature points, the coordinate constraints derived from coplanar feature points, and the inertial measurement unit (IMU) pre-integration error. These integrated measurements were then used to estimate the system states through nonlinear optimization. To validate the accuracy and effectiveness of the proposed approach, extensive experiments were conducted on publicly available datasets and large-scale outdoor scenes. The experimental results demonstrate that, compared to VINS-Mono and ORB-SLAM3, the proposed method achieves higher positioning accuracy. It delivers precise and stable navigation results even in challenging conditions, offering significant practical value to robotics and autonomous driving.
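The symmetric optical flow outlier removal mentioned above is commonly realized as a forward-backward consistency check: features tracked from frame A to frame B are re-tracked from B back to A, and tracks whose round trip does not return near the starting pixel are rejected. A minimal sketch of the filtering step (the tracker itself, e.g. pyramidal Lucas-Kanade, is assumed external; the function name and threshold are illustrative, not the authors' implementation):

```python
import numpy as np

def symmetric_flow_filter(pts0, pts0_back, thresh=1.0):
    """Forward-backward consistency check for feature tracks.

    pts0      : (N, 2) original feature locations in frame A.
    pts0_back : (N, 2) locations obtained by tracking A->B, then B->A.
    thresh    : maximum round-trip pixel error to accept a track (assumed value).

    Returns a boolean mask of tracks that pass the symmetry test.
    """
    round_trip_err = np.linalg.norm(pts0 - pts0_back, axis=1)
    return round_trip_err < thresh
```

Tracks failing the mask are discarded before triangulation, which keeps gross tracking failures out of the optimization.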
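Plane detection and fitting on a sparse feature set is typically done by hypothesizing planes from point triplets and scoring inliers, then refitting in a least-squares sense. A sketch under that standard RANSAC-plus-SVD formulation (thresholds and iteration counts are assumed values, not the paper's parameters):

```python
import numpy as np

def fit_plane_svd(pts):
    """Least-squares plane through 3D points.

    Returns (n, d) with unit normal n and offset d such that n . p + d = 0.
    The normal is the right singular vector of the smallest singular value
    of the mean-centered point matrix.
    """
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    n = vt[-1]
    d = -n @ centroid
    return n, d

def ransac_plane(pts, n_iters=200, tol=0.02, seed=None):
    """Detect the dominant plane among sparse 3D feature points.

    Samples random triplets, counts points within distance `tol` of each
    hypothesis, keeps the best inlier set, and refits on those inliers.
    Returns (n, d, inlier_mask).
    """
    rng = np.random.default_rng(seed)
    best_mask = None
    for _ in range(n_iters):
        sample = pts[rng.choice(len(pts), 3, replace=False)]
        n, d = fit_plane_svd(sample)
        mask = np.abs(pts @ n + d) < tol
        if best_mask is None or mask.sum() > best_mask.sum():
            best_mask = mask
    n, d = fit_plane_svd(pts[best_mask])
    return n, d, best_mask
```

The resulting plane parameters supply the point-to-plane coordinate constraints on coplanar features without any dense depth map.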
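The joint cost described above, combining reprojection, coplanarity, and IMU pre-integration residuals over the state vector, can be sketched as follows (all symbols here are illustrative; the residual definitions and covariances are those of a generic tightly coupled VIO, not reproduced from the paper):

```latex
\min_{\mathcal{X}} \;
\sum_{(i,j)\in\mathcal{V}}
  \bigl\| r_{\mathrm{proj}}(z_{ij},\mathcal{X}) \bigr\|^{2}_{\Sigma_{ij}}
\;+\;
\sum_{k\in\mathcal{B}}
  \bigl\| r_{\mathrm{imu}}(\Delta\tilde{z}_{k,k+1},\mathcal{X}) \bigr\|^{2}_{\Sigma_{k}}
\;+\;
\sum_{l\in\mathcal{P}}
  \bigl\| \mathbf{n}_{l}^{\top}\mathbf{p}_{l} + d_{l} \bigr\|^{2}_{\sigma_{l}}
```

Here $\mathcal{X}$ collects poses, velocities, biases, and feature coordinates; the last sum penalizes the distance of each coplanar feature point $\mathbf{p}_{l}$ to its fitted plane $(\mathbf{n}_{l}, d_{l})$, and each norm is a Mahalanobis norm under the corresponding measurement covariance.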