Improve confusion matrix readability on imbalanced data #687

Open
ZakariaRida96 wants to merge 3 commits into MAIF:master from ZakariaRida96:feature/issue-685

@ZakariaRida96
Collaborator

Description
This PR improves confusion matrix heatmap readability for highly imbalanced datasets (see #685).

Previously, the color scale was based on the maximum value, making smaller values hard to distinguish when one class dominates.
This update uses a quantile-based upper bound and clips extreme values to enhance contrast while keeping the visualization clear.
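
For reviewers, the scaling idea described above can be sketched roughly as follows. This is an illustrative NumPy sketch, not the code in this PR: `clip_for_color_scale` is a hypothetical helper name, and only the parameter names `quantile` and `use_quantile_scale` are taken from the description.

```python
import numpy as np


def clip_for_color_scale(cm, quantile=0.95, use_quantile_scale=True):
    """Hypothetical helper: return (matrix to render, upper bound of the color scale)."""
    cm = np.asarray(cm, dtype=float)
    if not use_quantile_scale:
        # Previous behavior: the color scale spans the full value range.
        return cm, float(cm.max())
    # Upper bound taken from the chosen quantile of all cell values.
    upper = float(np.quantile(cm, quantile))
    # Cells above the bound are clipped so they no longer dominate the scale.
    return np.minimum(cm, upper), upper


# Illustrative imbalanced confusion matrix: one class dominates.
cm = np.array([[9500, 12, 3], [20, 45, 5], [8, 4, 30]])
clipped, zmax = clip_for_color_scale(cm, quantile=0.90)
```

Only the dominant cell is clipped here; the smaller counts are unchanged and therefore occupy a larger share of the color range.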

Main changes:

  • Add quantile-based scaling to plot_confusion_matrix
  • Add optional parameters to plot_confusion_matrix, also exposed through SmartPlotter.confusion_matrix_plot:
    • quantile=0.95 by default
    • use_quantile_scale=True by default
  • Clip extreme confusion matrix values before rendering the heatmap, depending on the chosen quantile threshold
  • Keep backward compatibility by allowing the feature to be disabled
  • Add unit tests for quantile scaling and disabled scaling

Current behavior:

  • Large dominant values no longer flatten the full color scale
  • Smaller values become more visible in the plot
  • The rendered matrix is clipped using the chosen quantile threshold
  • Hover text shows both the real and the clipped counts when quantile scaling is enabled
  • The feature can be disabled to restore the previous full-range behavior
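
As a rough illustration of the hover behavior listed above, the sketch below builds one hover string per cell containing both counts. It is an assumption about the general shape only: `build_hover_text`, the label wording, and the numbers are hypothetical and not taken from this PR's code.

```python
def build_hover_text(real, clipped, labels):
    """Hypothetical helper: one hover string per cell with real and clipped counts."""
    hover = []
    for i, row in enumerate(real):
        hover_row = []
        for j, count in enumerate(row):
            hover_row.append(
                f"True: {labels[i]}<br>Predicted: {labels[j]}"
                f"<br>Count: {count}<br>Clipped count: {clipped[i][j]}"
            )
        hover.append(hover_row)
    return hover


# Illustrative numbers only: the dominant cell was clipped, the others were not.
real_counts = [[9500, 12], [20, 45]]
clipped_counts = [[1936, 12], [20, 45]]
hover = build_hover_text(real_counts, clipped_counts, labels=["a", "b"])
```

The `<br>` separators follow the line-break convention used in Plotly hover text; the rendered color would come from the clipped value while the real count stays visible on hover.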

Here is the feature applied to the example shown in the issue:

# Use the full value range for the color scale
plot_confusion_matrix(
    y_true=y_true_labels,
    y_pred=y_pred_labels,
    width=500,
    height=400,
    use_quantile_scale=False,
)

Screenshot 2026-04-10 at 15 20 06
# Use the 90th percentile as upper bound for the color scale
plot_confusion_matrix(
    y_true=y_true_labels,
    y_pred=y_pred_labels,
    width=500,
    height=400,
    quantile=0.90,
)

Screenshot 2026-04-10 at 15 20 38

@ZakariaRida96 ZakariaRida96 self-assigned this Apr 10, 2026
@ZakariaRida96 ZakariaRida96 marked this pull request as ready for review April 10, 2026 13:39