Abstract
A robust machine learning workflow using two well-established statistical clustering algorithms, K-Means Clustering (K-MC) and K-Nearest Neighbors (K-NN), was developed to improve estimated ultimate recovery (EUR) predictions for new wells in shale-gas reservoirs using the decline curve model developed by Duong (2011). Both algorithms handle large datasets with multiple well parameters at high computational efficiency. Out of a total of 55,623 dry-gas wells from seven shale-gas formations, 7,631 wells that met the criteria for Duong's decline curve model were selected for further analysis. K-MC and K-NN were applied to group wells with similar characteristics, using well location, well depth, well length, and production start year as parameters. The locations of the grouped wells show that clusters of similar wells identified by both methods are scattered over a large area. This demonstrates that grouping wells by physical proximity alone is not a reliable way to identify wells with similar characteristics. The results also suggest that the optimal clustering method depends strongly on the shale formation and on the method used for EUR prediction. Even with the limited number of well parameters available in the datasets used in this study, the machine learning clustering algorithms improved EUR prediction accuracy by around 20%. More complete well information would allow the algorithms to capture more of the well characteristics and thereby further increase the accuracy of EUR predictions.
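To make the grouping idea concrete, the following is a minimal sketch of a K-NN lookup over the four kinds of parameters named above (location, depth, length, start year). All well records and EUR values in it are hypothetical, invented for illustration only. The z-score scaling, the Euclidean distance, the choice of k = 3, and the averaging of neighbor EURs are assumptions of this sketch, not the procedure reported in the study.

```python
import math

# Hypothetical wells: (latitude, longitude, depth_ft, lateral_length_ft, start_year).
# These values are made up for illustration and are not taken from the study.
WELLS = [
    (32.10, -97.50, 7000, 4000, 2008),
    (32.90, -96.80, 7100, 4200, 2009),
    (32.15, -97.45, 11000, 9000, 2016),
    (32.85, -96.85, 11200, 9500, 2017),
    (32.50, -97.10, 6900, 3900, 2008),
    (32.55, -97.15, 10900, 9200, 2016),
]
# Hypothetical EUR per well (Bcf), in the same order as WELLS.
EUR_BCF = [2.0, 2.2, 6.0, 6.4, 1.9, 5.6]


def column_stats(rows):
    """Return per-column means and population standard deviations."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0  # guard against zero spread
        for c, m in zip(cols, means)
    ]
    return means, stds


def standardize(row, means, stds):
    """Convert one well record to z-scores so no single unit dominates the distance."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def nearest_wells(new_well, wells, k):
    """Indices of the k existing wells closest to new_well in standardized feature space."""
    means, stds = column_stats(wells)
    target = standardize(new_well, means, stds)
    dists = [
        (math.dist(target, standardize(w, means, stds)), i)
        for i, w in enumerate(wells)
    ]
    return [i for _, i in sorted(dists)[:k]]


# A hypothetical new well drilled next to wells 0 and 2, but deep, long, and recent.
new_well = (32.12, -97.48, 11100, 9300, 2017)
neighbours = nearest_wells(new_well, WELLS, k=3)
# Assumption of this sketch: use the mean EUR of the neighbors as the estimate.
predicted_eur = sum(EUR_BCF[i] for i in neighbours) / len(neighbours)
print(neighbours, round(predicted_eur, 2))
```

In this toy dataset the selected neighbors include a well on the far side of the area (index 3), while the geographically adjacent but shallow, short, older well (index 0) is left out. This mirrors the observation that wells with similar characteristics need not be close together.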