Skip to content

Fix XGBRegressor binary:logistic ONNX conversion producing wrong predictions#771

Open
JOSH1024 wants to merge 13 commits into
onnx:mainfrom
JOSH1024:test-pr758-fix
Open

Fix XGBRegressor binary:logistic ONNX conversion producing wrong predictions#771
JOSH1024 wants to merge 13 commits into
onnx:mainfrom
JOSH1024:test-pr758-fix

Conversation

@JOSH1024

@JOSH1024 JOSH1024 commented Jun 5, 2026

Copy link
Copy Markdown

Summary

Fixes #726.

XGBRegressor models using objective="binary:logistic" can produce incorrect ONNX predictions when tree outputs are non-zero (for example when subsample < 1.0).

The converter currently uses base_score directly as base_values in the TreeEnsembleRegressor, which mixes probability-space values with tree outputs that are represented in logit space.

This change:

  • Converts base_score from probability space to logit space for binary:logistic.
  • Emits an intermediate TreeEnsembleRegressor output.
  • Adds a Sigmoid node after the tree ensemble output.
  • Adds a regression test reproducing the issue reported in Converted XGBoost model outputs wrong results #726.

Reproduction

Using the reproducer from #726:

Before this change:

Expected: 0.64148873
ONNX:     0.69600594
Allclose: False

After this change:

Expected: 0.64148873
ONNX:     0.64148873
Allclose: True

Testing

Added regression test:

tests/xgboost/test_xgboost_issues.py::test_issue_726_binary_logistic_subsample

Executed:

pytest tests/xgboost/test_xgboost_issues.py -v

Result:

tests/xgboost/test_xgboost_issues.py::TestXGBoostIssues::test_issue_676 PASSED
tests/xgboost/test_xgboost_issues.py::TestXGBoostIssues::test_issue_726_binary_logistic_subsample PASSED

========================= 2 passed =========================

Also ran:

pytest tests/xgboost -v

Observed no additional failures relative to main.

Validation

  • Reproduced the issue locally.
  • Verified the generated ONNX graph changes from:
TreeEnsembleRegressor

to:

TreeEnsembleRegressor -> Sigmoid
  • Added a regression test covering the reported scenario.
  • Verified ONNX predictions now match XGBoost predictions for the reproducer.

Related: #758

@JOSH1024

JOSH1024 commented Jun 5, 2026

Copy link
Copy Markdown
Author

hi contributers
@vinitra, @wenbingl , @prasanthpul and @xadupre it would be wonderful if any of you could u pls review my pr ??

@JOSH1024

JOSH1024 commented Jun 9, 2026

Copy link
Copy Markdown
Author

@take-cheeze could u please review this pr ?

@take-cheeze

Copy link
Copy Markdown
Member

@JOSH1024 Sorry. Seems I don't have write access to this repository
@onnx/sig-converters May approve this PR to merge

@JOSH1024

JOSH1024 commented Jun 9, 2026

Copy link
Copy Markdown
Author

@take-cheeze thanks for letting me know

@JOSH1024

JOSH1024 commented Jun 9, 2026

Copy link
Copy Markdown
Author

@andife it would be wonderful if u could review this pr?

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes incorrect ONNX predictions for xgboost.XGBRegressor models using objective="binary:logistic" by aligning the ONNX graph with XGBoost’s logit-space accumulation + sigmoid behavior.

Changes:

  • Convert base_score from probability space to logit space for binary:logistic when populating TreeEnsembleRegressor.base_values.
  • Emit an intermediate TreeEnsembleRegressor output and apply a Sigmoid node to produce probability outputs matching XGBoost.
  • Add a regression test reproducing issue #726 with subsample < 1.0.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
tests/xgboost/test_xgboost_issues.py Adds regression test coverage for binary:logistic subsample scenario (but currently contains an invalid import that breaks tests).
onnxmltools/convert/xgboost/operator_converters/XGBoost.py Adjusts XGBRegressor conversion for binary:logistic to use logit base values and an explicit Sigmoid post-processing node.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/xgboost/test_xgboost_issues.py Outdated
Comment thread tests/xgboost/test_xgboost_issues.py Fixed
@JOSH1024

JOSH1024 commented Jun 9, 2026

Copy link
Copy Markdown
Author

@andife i have done the things which copilot said

@andife

andife commented Jun 9, 2026

Copy link
Copy Markdown
Member

could you fix dco

@JOSH1024

JOSH1024 commented Jun 9, 2026

Copy link
Copy Markdown
Author

yes done @andife

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread onnxmltools/convert/lightgbm/operator_converters/LightGbm.py
@JOSH1024

JOSH1024 commented Jun 10, 2026

Copy link
Copy Markdown
Author

@justinchuby i have updated is it ok now ?

@JOSH1024

Copy link
Copy Markdown
Author

@andife Could you please review and merge this PR if you think the changes look correct? Thanks!

@andife

andife commented Jun 10, 2026

Copy link
Copy Markdown
Member

@xadupre

@JOSH1024

Copy link
Copy Markdown
Author

@andife i am not understand why is ubuntu and windows checks are fail could u please help me ?

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comment thread onnxmltools/convert/xgboost/operator_converters/XGBoost.py Outdated
Comment thread tests/xgboost/test_xgboost_converters.py Outdated
Comment thread tests/xgboost/test_xgboost_issues.py
joshua added 4 commits June 10, 2026 23:07
Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
…ansform regression

Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
@JOSH1024

Copy link
Copy Markdown
Author

@andife could u check now ?

joshua added 2 commits June 11, 2026 00:28
…te_existing kwarg

Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
@JOSH1024

Copy link
Copy Markdown
Author

@andife could u run the checks?

joshua added 2 commits June 11, 2026 10:29
…core

Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
@JOSH1024

Copy link
Copy Markdown
Author

@andife could u re run the checks?

@JOSH1024

JOSH1024 commented Jun 11, 2026

Copy link
Copy Markdown
Author

@andife is this ok becasue those ubuntu and windows check fails are there in all the merged pr also ?? so if everything else looks good could u please approve it

@JOSH1024

Copy link
Copy Markdown
Author

@xadupre is this ok because those ubuntu and windows check fails are there in all the merged pr also ?? so if everything else looks good could u please approve it

The old guard skipped remapping any true/false child ID equal to 0,
intending to skip dummy placeholders on LEAF nodes. But it also
silently skipped valid branch pointers whose child happened to be
node 0 (e.g. root's left child after set_new_numbers). This caused
broken tree traversal in ONNX, producing wrong predictions.

Fix: skip only when the current node is itself a LEAF.
Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
@JOSH1024

Copy link
Copy Markdown
Author

The 2 failing XGBoost tests (test_xgb0_empty_tree_classifier, test_xgb_classifier_13) on windows-latest py==3.10 xgboost<2 are pre-existing failures unrelated to this change. They are skipped on xgboost>=2 and fail due to an existing compatibility issue between xgboost<2 and onnxruntime==1.16.3 on Windows.

@JOSH1024

Copy link
Copy Markdown
Author

@andife can u check now

@JOSH1024

Copy link
Copy Markdown
Author

@andife ??

@andife andife requested a review from xadupre June 12, 2026 05:06
@andife

andife commented Jun 12, 2026

Copy link
Copy Markdown
Member

@xadupre Can you review this PR?

@JOSH1024

Copy link
Copy Markdown
Author

@xadupre could u please look into this pr ??

@JOSH1024

Copy link
Copy Markdown
Author

@xadupre could u pls look into this ??

@JOSH1024

Copy link
Copy Markdown
Author

@xadupre can u look into this

@JOSH1024

Copy link
Copy Markdown
Author

@andife could u pls help me because there is no response from @xadupre

@andife

andife commented Jun 15, 2026

Copy link
Copy Markdown
Member

@onnx/sig-converters-approvers

@JOSH1024

Copy link
Copy Markdown
Author

@andife i have been requesting @xadupre and @onnx/sig-converters-approvers from last week and did'nt get any reply from their side so could u pls help me what should i do?

@justinchuby

Copy link
Copy Markdown
Member

I'll ping Xavier tomorrow

@JOSH1024

Copy link
Copy Markdown
Author

@justinchuby could u help me here also

@JOSH1024

Copy link
Copy Markdown
Author

@justinchuby ??

@JOSH1024 JOSH1024 force-pushed the test-pr758-fix branch 2 times, most recently from bb1cba6 to 1b0078c Compare June 18, 2026 09:44
…g, base_score logit transform, and TreeEnsembleClassifier label workarounds

Signed-off-by: joshua <joshua.abraham@multicorewareinc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Converted XGBoost model outputs wrong results

6 participants