For women experiencing recurrent pregnancy loss (RPL), it is crucial not only to provide treatment but also to assess the risk of recurrence. This study aimed to develop a model that predicts subsequent early pregnancy loss (EPL) in women with RPL from preconception data.
Methods:
A prospective, dynamic population cohort study was conducted at the Second Hospital of Lanzhou University. From September 2019 to December 2022, 1050 non-pregnant women with RPL were enrolled. By December 2023, 605 women had a subsequent pregnancy outcome and were randomly divided into training and validation groups at a ratio of 3:1. In the training group, candidate variables associated with subsequent EPL were screened using univariable analysis. Least absolute shrinkage and selection operator (LASSO) regression and multivariable logistic regression were then used for variable selection. Prediction models were constructed using generalized linear models (GLM), gradient boosting machine (GBM), random forest (RF), and deep learning (DP). Models built on the variable sets selected by LASSO and by logistic regression were compared to identify the best-performing prediction model. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis (DCA). The best model was validated in the independent validation group, and a nomogram was developed based on the optimal predictive features.
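The random 3:1 split and the AUC evaluation described above can be illustrated with a minimal sketch. It is not the study's code: the function names `split_3_to_1` and `auc`, the seeds, and the simulated cohort are all hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) definition rather than any specific statistical package.

```python
import random

def split_3_to_1(records, seed=0):
    """Randomly split records into training (about 75%) and validation (about 25%) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 3 / 4)
    return shuffled[:cut], shuffled[cut:]

def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Simulated cohort (hypothetical data, not the study's): each record is
# (predicted_risk, outcome), with outcome = 1 standing in for EPL.
rng = random.Random(42)
cohort = []
for _ in range(605):
    risk = rng.random()
    outcome = 1 if rng.random() < risk else 0
    cohort.append((risk, outcome))

train, valid = split_3_to_1(cohort, seed=1)
valid_auc = auc([y for _, y in valid], [s for s, _ in valid])
print(len(train), len(valid), round(valid_auc, 3))
```

In a real analysis the scores would come from a model fitted on the training group only, so that the validation AUC reflects performance on data the model has not seen.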
Results:
In the training group, the GBM model demonstrated superior performance with the highest AUC (0.805). There was no significant difference in AUC between the 16-variable model from LASSO regression and the 9-variable model from logistic regression (AUC: 0.805 vs. 0.777, P = 0.1498). The 9-variable logistic regression model exhibited good discrimination in the validation group, with an AUC of 0.781 (95% CI: 0.702 to 0.843). DCA indicated that using the model to guide decisions provided a net clinical benefit. Calibration curves showed good agreement between predicted and observed risk, supported by a non-significant Hosmer–Lemeshow test (χ² = 7.427, P = 0.505).
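For readers unfamiliar with the Hosmer–Lemeshow test mentioned above, the following is a minimal sketch of how such a statistic is commonly computed. It is an illustration under stated assumptions, not the study's procedure: the function names are hypothetical, the number of risk groups and the degrees of freedom (groups minus 2) follow the usual convention rather than anything reported here, and the p-value uses a closed form that is exact only for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is exact only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def hosmer_lemeshow(labels, probs, groups=10):
    """Return (statistic, df, p_value) for a Hosmer-Lemeshow goodness-of-fit test.
    Cases are sorted by predicted probability and cut into equal-sized groups."""
    pairs = sorted(zip(probs, labels), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        # Skip degenerate groups to avoid dividing by zero.
        if 0 < expected < size:
            stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    df = groups - 2  # conventional choice; an assumption in this sketch
    return stat, df, chi2_sf_even_df(stat, df)

# Hypothetical, perfectly calibrated toy data: four risk levels, ten cases each,
# with the observed event count equal to the expected count in every group.
probs, labels = [], []
for p, events in [(0.2, 2), (0.4, 4), (0.6, 6), (0.8, 8)]:
    probs += [p] * 10
    labels += [1] * events + [0] * (10 - events)

stat, df, p_value = hosmer_lemeshow(labels, probs, groups=4)
print(stat, df, p_value)
```

A large p-value means the test found no evidence of miscalibration; it does not by itself prove that the model is well calibrated, which is why it is usually read alongside the calibration curve.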
Conclusions:
Gradient boosting machine (GBM) models can predict subsequent EPL in RPL patients from preconception data and may be of clinical value. Future prospective studies are warranted to validate the applicability of these findings.