I am a student working on a predictive model. While evaluating different models, I noticed that in some cases the reported AUC is around 0.75, yet the plotted ROC curve falls below the random-guess diagonal. I am unsure why this is happening.
My target variable is binary (0 or 1), and my dataset consists of 42 patients. I would appreciate any insights or explanations for this issue.
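For what it's worth, this is the kind of direct check I would use to confirm the AUC number itself, independent of any plotting code (a minimal sketch; `y_test` and `y_prob` are the same arrays as in the loop below, for one model):

```python
# Sanity check: recompute the AUC straight from labels and scores,
# bypassing the plotting pipeline entirely.
from sklearn.metrics import roc_auc_score

print("Direct AUC:", roc_auc_score(y_test, y_prob))
```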
Looking forward to your help!
This is my code:
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import PchipInterpolator
from scipy.special import expit
from sklearn.metrics import roc_curve, auc

for name, model in best_models.items():
    # ✅ Extract probability estimates
    if hasattr(model, "predict_proba"):
        y_prob = model.predict_proba(X_test)[:, 1]  # Class 1 probability
    elif hasattr(model, "decision_function"):
        # For binary problems, decision_function returns a 1-D array of raw
        # scores, so softmax(raw_scores)[:, 1] would fail; map the scores to
        # (0, 1) with the sigmoid instead (the ranking, and hence the AUC,
        # is unchanged)
        y_prob = expit(model.decision_function(X_test))
    else:
        continue  # Skip models that don't provide probability estimates

    # ✅ Compute ROC curve & AUC
    fpr, tpr, _ = roc_curve(y_test, y_prob)
    auc_score = auc(fpr, tpr)

    # ✅ Fix flipped AUC values (if needed)
    if auc_score < 0.5:
        y_prob = 1 - y_prob  # Flip probability assignments
        fpr, tpr, _ = roc_curve(y_test, y_prob)
        auc_score = auc(fpr, tpr)

    # ✅ Keep only strictly increasing FPR values
    # (PchipInterpolator requires strictly increasing x)
    unique_fpr, unique_indices = np.unique(fpr, return_index=True)
    unique_tpr = tpr[unique_indices]

    # ✅ Smooth the ROC curve
    smooth_fpr = np.linspace(0, 1, 300)
    interpolator = PchipInterpolator(unique_fpr, unique_tpr)
    smooth_tpr = interpolator(smooth_fpr)

    # ✅ Plot the ROC curve
    plt.figure(figsize=(7, 5))
    plt.plot(smooth_fpr, smooth_tpr, linewidth=2, color="blue",
             label=f"{name} (AUC = {auc_score:.2f})")
    plt.plot([0, 1], [0, 1], linestyle="--", color="gray",
             label="Random Guess (AUC = 0.50)")
    plt.plot([0, 0, 1], [0, 1, 1], linestyle="dashed", color="purple",
             label="Perfect Classifier")

    # ✅ Configure plot aesthetics
    plt.xlabel("False Positive Rate (FPR)")
    plt.ylabel("True Positive Rate (TPR)")
    plt.title(f"Smooth ROC Curve for {name}")
    plt.legend(loc="lower right")
    plt.grid(True)

    # ✅ Show the plot
    plt.show()
```
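In case it helps diagnose where the dip below the diagonal comes from, here is a minimal sketch that overlays the raw step ROC with the smoothed curve (reusing `fpr`, `tpr`, `smooth_fpr`, and `smooth_tpr` from inside the loop above, for one model); any difference between the two would point at the interpolation step:

```python
# Compare the raw (unsmoothed) step ROC with the PCHIP-smoothed curve.
# roc_curve output is a step function, so plot it with steps-post style.
import matplotlib.pyplot as plt

plt.figure(figsize=(7, 5))
plt.step(fpr, tpr, where="post", color="green", label="Raw ROC (step)")
plt.plot(smooth_fpr, smooth_tpr, color="blue", label="Smoothed ROC (PCHIP)")
plt.plot([0, 1], [0, 1], linestyle="--", color="gray", label="Random Guess")
plt.xlabel("False Positive Rate (FPR)")
plt.ylabel("True Positive Rate (TPR)")
plt.legend(loc="lower right")
plt.show()
```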