Releases: ModelOriented/shapviz
CRAN release 0.10.3
CRAN release 0.10.2
Bug fixes
- sv_interaction(..., kind = "bar"): Bars of feature pairs appeared twice, see #178.
CRAN release 0.10.1
Maintenance
- Bump ggplot2 and patchwork dependencies.
CRAN release 0.10.0
Major improvements
- sv_dependence(): The new arguments ylim and share_y = FALSE allow control of the y-axis limits. They help to compare feature importance across multiple dependence plots (#172). Later, we might change the default to share_y = TRUE (as in Python's SHAP dependence plots) (#171).
- sv_interaction() has received a new visualization: kind = "bar" now shows mean absolute SHAP interactions/main effects as a barplot. Its appearance can be modified by the arguments fill and bar_width (#169).
- We are now (cautiously) collecting axes, axis titles, and color guides via {patchwork} (#171). This currently fails for sv_force().
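A minimal sketch of share_y in use, assuming {shapviz} >= 0.10.0 is installed. The SHAP matrix S and feature data X are simulated for illustration and do not come from a fitted model; the column names x1 and x2 are hypothetical.

```r
library(shapviz)

# Simulated inputs, for illustration only (not from a fitted model)
set.seed(1)
X <- data.frame(x1 = runif(100), x2 = runif(100))
S <- cbind(x1 = 2 * (X$x1 - 0.5), x2 = 0.2 * (X$x2 - 0.5))  # made-up SHAP matrix
shp <- shapviz(S, X = X, baseline = 0)

# One panel per feature. With share_y = TRUE, both panels use a common
# y-axis, so the weaker effect of x2 is visible at a glance.
p <- sv_dependence(shp, v = c("x1", "x2"), color_var = NULL, share_y = TRUE)
p
```

With the default share_y = FALSE, each panel would get its own y-axis range, which hides the difference in magnitude between the two features.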
Minor user-visible changes
Minor API changes
- In sv_dependence(), passing the same variable for v and color_var does not suppress the color axis anymore, except when interactions = TRUE (#171).
- sv_dependence() and sv_dependence2D() have received a seed = 1 argument used for jittering. This does not modify the global seed (#174).
Maintenance
CRAN release 0.9.7
Documentation
- H2O now supports passing background data for model-agnostic SHAP. This is now more visible in {shapviz}, see h2oai/h2o-3#16463.
- H2O random forests (regression and binary classification) now support TreeSHAP as well (#163).
Compatibility
CRAN release 0.9.6
Documentation
- Fixed a wrong link in a vignette (#158).
CRAN release 0.9.5
User-visible changes
- sv_waterfall() and sv_force(): The x label has been changed from "SHAP value" to "Prediction".
Documentation
- Add vignette for Tidymodels.
- Update vignettes.
- Update README.
CRAN release 0.9.4
API improvements
- Support both XGBoost 1.x.x as well as XGBoost 2.x.x, implemented in #144.
Other improvements
- New argument sort_features = TRUE in sv_importance() and sv_interaction(). Set to FALSE to show the features as they appear in your SHAP matrix. In that case, the plots will show the first max_display features, not the most important features. Implements #137.
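A minimal sketch of the difference, assuming {shapviz} >= 0.9.4 is installed. The SHAP matrix and feature data are simulated for illustration; the feature names a, b, and c are hypothetical.

```r
library(shapviz)

# Simulated inputs, for illustration only (not from a fitted model)
set.seed(1)
X <- data.frame(a = runif(50), b = runif(50), c = runif(50))
S <- cbind(a = 0.1 * X$a, b = 1.0 * X$b, c = 0.5 * X$c)  # made-up SHAP matrix
shp <- shapviz(S, X = X, baseline = 0)

# Default: features sorted by importance
p_sorted <- sv_importance(shp, kind = "bar")

# Keep the column order of the SHAP matrix; with max_display = 2 this
# shows the first two columns (a and b), not the two most important ones
p_as_is <- sv_importance(shp, kind = "bar", sort_features = FALSE, max_display = 2)
```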
Bug fixes
CRAN release 0.9.3
sv_dependence(): Control over automatic color feature selection
How is the color feature selected, anyway?
If no SHAP interaction values are available, by default, the color feature v' is selected by the heuristic potential_interaction(), which works as follows:
- If the feature v (the one on the x-axis) is numeric, it is binned into nbins bins.
- Per bin, the SHAP values of v are regressed onto v' and the R-squared is calculated. Rows with missing v' are discarded.
- The R-squared values are averaged over bins, weighted by the number of non-missing v' values.
This measures how much variability in the SHAP values of v is explained by v', after accounting for v.
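The steps above can be sketched in base R. This is an illustration of the described procedure, not the code of potential_interaction(): the function name interaction_strength and its arguments are hypothetical, v and v' are assumed to be numeric, v is assumed to have at least two distinct values, and bins that cannot be fitted are assumed to contribute an R-squared of 0.

```r
# Illustrative sketch only: NOT the shapviz implementation.
# s: SHAP values of v; v: feature on the x-axis; v_prime: candidate color feature.
interaction_strength <- function(s, v, v_prime, nbins = 5) {
  # Quantile bins of v; unique() guards against tied break points
  probs <- seq(0, 1, length.out = nbins + 1)
  breaks <- unique(stats::quantile(v, probs = probs, names = FALSE, na.rm = TRUE))
  bin <- cut(v, breaks = breaks, include.lowest = TRUE)

  # Per bin: R-squared of SHAP values regressed onto v', rows with missing v' dropped
  per_bin <- vapply(split(seq_along(s), bin), function(idx) {
    ok <- !is.na(v_prime[idx])
    y <- s[idx][ok]
    x <- v_prime[idx][ok]
    if (length(y) < 3 || stats::var(x) == 0 || stats::var(y) == 0) {
      return(c(r2 = 0, n = length(y)))  # degenerate bin (assumption: counts as 0)
    }
    # For a single numeric regressor, R-squared equals the squared correlation
    c(r2 = stats::cor(x, y)^2, n = length(y))
  }, c(r2 = 0, n = 0))

  # Average over bins, weighted by the number of non-missing v' values
  sum(per_bin["r2", ] * per_bin["n", ]) / sum(per_bin["n", ])
}

# Simulated example: the SHAP values of v depend on z within each bin, not on w
set.seed(1)
n <- 400
v <- runif(n)
z <- rnorm(n)
w <- rnorm(n)
s <- v * z + rnorm(n, sd = 0.05)

strength_z <- interaction_strength(s, v, z)
strength_w <- interaction_strength(s, v, w)
```

In this simulated setup, z should receive a clearly higher score than the unrelated feature w, so z would be the one picked as color feature.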
We have introduced four parameters to control the heuristic. Their defaults are in line with the old behaviour.
- nbins = NULL: Into how many quantile bins should a numeric v be binned? The default NULL equals the smaller of $n/20$ and $\sqrt n$ (rounded up), where $n$ is the sample size.
- color_num = TRUE: Should color features be converted to numeric, even if they are factors/characters? The default is TRUE.
- scale = FALSE: Should R-squared be multiplied with the sample variance of within-bin SHAP values? If TRUE, bins with stronger vertical scatter will get higher weight. The default is FALSE.
- adjusted = FALSE: Should adjusted R-squared be calculated?
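As a quick worked example of the default number of bins: the helper default_nbins below is hypothetical, and applying the rounding after taking the minimum is one reading of "rounded up".

```r
# Hypothetical helper mirroring the described default for nbins = NULL
default_nbins <- function(n) ceiling(min(n / 20, sqrt(n)))

default_nbins(100)    # n/20 = 5 is smaller than sqrt(n) = 10
default_nbins(1000)   # sqrt(n) is about 31.6, smaller than n/20 = 50
default_nbins(10000)  # sqrt(n) = 100 is smaller than n/20 = 500
```

For sample sizes below 400, n/20 is the binding term; above 400, the square root takes over.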
If SHAP interaction values are available, these parameters have no effect. In sv_dependence(), they are called ih_nbins etc.
This partly implements the ideas in #119 of Roel Verbelen, thanks a lot for your patient explanations!
Further plans?
We will continue to experiment with the defaults, which might change in the future. A good alternative to the current (naive) defaults could be:
- nbins = 7: Smaller than now to not overfit too strongly with factor/character color features.
- color_num = FALSE: To not naively integer encode factors/characters.
- scale = TRUE: To account for non-equal spread in bins.
- adjusted = TRUE: To not put too much weight on factors with many categories.
Other user-visible changes
- sv_dependence(): If color_var = "auto" (default) and no color feature seems to be relevant (SHAP interaction is NULL, or heuristic returns no positive value), there won't be any color scale. Furthermore, in some edge cases, a different color feature might be selected.
- mshapviz() objects can now be row-bound via rbind() or +. Implemented by @jmaspons in #110.
- mshapviz() is stricter when combining multiple "shapviz" objects. These now need to have identical column names, see #114.
Small changes
- The README is shorter and easier.
- Updated vignettes.
- print.shapviz() now shows the top two rows of the SHAP matrix.
- Re-activate all unit tests.
- Setting nthread = 1 in all calls to xgb.DMatrix() as suggested by @jmaspons in #109.
- Added "How to contribute" to README.
- permshap() connector is now part of {kernelshap} (#122).
Bug fixes
CRAN release 0.9.2
User-visible changes
- sv_importance() of a "mshapviz" object now returns a dodged barplot instead of separate barplots via {patchwork}. Use the new argument bar_type to switch to a stacked barplot (bar_type = "stack"), to "facets" (via {ggplot2}), or "separate" for the old behaviour.
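A minimal sketch of the new argument, assuming {shapviz} >= 0.9.2 is installed. The two "shapviz" objects are built from simulated SHAP matrices; the helper make_shp and the names model_a and model_b are hypothetical.

```r
library(shapviz)

# Simulated inputs, for illustration only (not from fitted models)
set.seed(1)
X <- data.frame(x1 = runif(50), x2 = runif(50))

# Hypothetical helper: builds a "shapviz" object with made-up SHAP values
make_shp <- function(w1, w2) {
  S <- cbind(x1 = w1 * (X$x1 - 0.5), x2 = w2 * (X$x2 - 0.5))
  shapviz(S, X = X, baseline = 0)
}

# Combine two "shapviz" objects with identical column names
mshp <- mshapviz(list(model_a = make_shp(1, 0.5), model_b = make_shp(0.3, 0.8)))

p_dodge <- sv_importance(mshp)                      # default: dodged bars
p_stack <- sv_importance(mshp, bar_type = "stack")  # stacked bars
```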
New features
- Added connector to permshap, a package calculating permutation SHAP values for regression and (probabilistic) classification.
Other changes
- Revised vignette on "mshapviz".
- Commenting out most unit tests as they would not pass timings measured on Debian.