Docs/api generator #224
Conversation
Pull request overview
This PR adds automated API documentation generation capabilities to the CARP-S project using the mkdocs-gen-files plugin, along with a new "Plot Demos" documentation page that showcases the analysis and plotting capabilities of the package.
Key Changes:
- Adds `gen-files` plugin to mkdocs.yaml to enable automatic API documentation generation
- Introduces `docs/api_generator.py` script that discovers Python modules and generates API documentation pages
- Adds new "Plot Demos" navigation entry and corresponding documentation page with code examples
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| mkdocs.yaml | Adds gen-files plugin configuration and new "Plot Demos" navigation entry |
| docs/plot_demo.md | New documentation page demonstrating plotting functions with code examples and image references |
| docs/api_generator.py | New script for automated API documentation generation from Python source files |
| docs/images/smac_benchmarking_containers.drawio | Formatting-only changes (line number addition) |
|  | ||
|
|
||
| ## Final Performance Boxplot | ||
| This is a boxplot together with a violin plot, showing the raw (but normalized) distribution of the results. The optimizers are sorted by their median value to match the critical difference assessment. |
The phrase "raw (but normalized)" is contradictory. Data cannot be both raw and normalized. Consider clarifying whether the boxplot shows raw values or normalized values, or explain what "raw (but normalized)" means in this context.
Suggested change:
This is a boxplot together with a violin plot, showing the distribution of normalized performance values. The optimizers are sorted by their median value to match the critical difference assessment.
```python
# NOTE: Given the current setup, we can only operate at a module level.
# Ideally we specify options (at least at a module level) and we render
# them into strings using a yaml parser. For now this is fine though
NO_INHERITS = ("sklearn.evaluation",)
```
The module `sklearn.evaluation` does not exist in the CARP-S codebase. Based on the directory structure, there is no `sklearn` module in the `carps` package. This configuration appears to be copied from another project and should either be removed or updated to reflect actual modules in this codebase that require special handling.
Suggested change:
```python
NO_INHERITS = ()
```
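On the NOTE above about rendering per-module options: a minimal sketch of how that could look once needed, assuming mkdocstrings-style `::: module` identifiers; the module name and option used here are illustrative, not taken from the actual script:

```python
import yaml

# Illustrative per-module mkdocstrings options, keyed by dotted module path.
MODULE_OPTIONS: dict[str, dict] = {
    "carps.analysis": {"inherited_members": False},
}


def render_identifier(module: str) -> str:
    """Render a `::: module` block, appending YAML-formatted options if configured."""
    lines = [f"::: {module}"]
    options = MODULE_OPTIONS.get(module)
    if options:
        rendered = yaml.safe_dump({"options": options}, sort_keys=False)
        lines += ["    " + line for line in rendered.splitlines()]
    return "\n".join(lines) + "\n"
```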
```python
import logging
from pathlib import Path

import mkdocs_gen_files

logger = logging.getLogger(__name__)
```
The logger is defined but never used. Consider removing the unused import and variable, or adding actual logging statements if logging is intended for debugging purposes.
Suggested change:
```python
from pathlib import Path

import mkdocs_gen_files
```
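For context, the usual mkdocs-gen-files recipe for such a generator looks roughly like the sketch below; the traversal of `carps/` and the `api/` page layout are assumptions for illustration, not necessarily the exact logic of `docs/api_generator.py`:

```python
from pathlib import Path

import mkdocs_gen_files

SRC = Path("carps")

# Emit one API page per module, rendered by mkdocstrings via the `:::` identifier.
for py_file in sorted(SRC.rglob("*.py")):
    module_parts = py_file.with_suffix("").parts  # e.g. ("carps", "analysis", "generate_report")
    if module_parts[-1] == "__init__":
        module_parts = module_parts[:-1]
    if not module_parts:
        continue
    identifier = ".".join(module_parts)
    doc_path = Path("api", *module_parts).with_suffix(".md")
    with mkdocs_gen_files.open(doc_path, "w") as fh:
        fh.write(f"::: {identifier}\n")
    # Make the "edit this page" link point back at the source file.
    mkdocs_gen_files.set_edit_path(doc_path, py_file)
```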
|  | ||
|
|
||
| ## Performance per Task | ||
| We visualize the final performance per task as a heatmap with tasks as rows and optimizers as columns. The cells display the raw final performance, averaged over seeds per task, and the colormap indicates how well an optimizer performed in comparison. The colormap is using the normalized performance values. | ||
|
|
||
| ```python | ||
| from carps.analysis.generate_report import plot_performance_per_task | ||
|
|
||
| resulting_files_performance_per_task = plot_performance_per_task(results, output_dir=figure_dir, replot=True, show_figure=True) | ||
| ``` | ||
|
|
||
|  | ||
|
|
||
| ## Final Performance Boxplot | ||
| This is a boxplot together with a violin plot, showing the raw (but normalized) distribution of the results. The optimizers are sorted by their median value to match the critical difference assessment. | ||
|
|
||
| ```python | ||
| from carps.analysis.generate_report import plot_finalperfboxplot | ||
|
|
||
| resulting_files_finalperfboxplot = plot_finalperfboxplot(results, output_dir=figure_dir, replot=True, show_figure=True) | ||
|
|
||
| ``` | ||
|
|
||
|  | ||
|
|
||
| ## Performance over Time | ||
| We can inspect the anytime performance in two ways. The first is visualizing the incumbent cost over iterations: Either aggregated and normalized (and interpolated), or per task. The caveat of the first method is that we cannot distinguish well between optimizers. Thus, we normally resort to the ranking over time as determined via statistical testing. | ||
|
|
||
| ### Aggregated over tasks, normalized | ||
| The plot shows the mean incumbent cost over iterations (both normalized and interpolated) with 95%-CI. Mostly indistinguishable thus not advised. | ||
|
|
||
| ```python | ||
| from carps.analysis.generate_report import plot_performance_over_time | ||
|
|
||
| %matplotlib inline | ||
|
|
||
| resulting_files_perfovertime = plot_performance_over_time( | ||
| results, output_dir=figure_dir, per_task=False, replot=True, show_figure=True | ||
| ) | ||
| ``` | ||
|
|
||
|  | ||
|
|
||
| ### Per Task | ||
| ⏳ This might take a while because this uses seaborn's grid plot. | ||
|
|
||
| ```python | ||
| from carps.analysis.generate_report import plot_performance_over_time | ||
|
|
||
| %matplotlib inline | ||
|
|
||
| resulting_files_perfovertime = plot_performance_over_time( | ||
| results, output_dir=figure_dir, per_task=True, replot=True, show_figure=True | ||
| ) | ||
| ``` | ||
|
|
||
|  | ||
|
|
||
| ### Ranking | ||
| The ranking is calculated per step in the same way as for the critical difference diagram, via statistical testing. | ||
|
|
||
| ⏳ This might take a while... | ||
|
|
||
| ```python | ||
| from carps.analysis.generate_report import plot_ranks_over_time | ||
|
|
||
| %matplotlib inline | ||
|
|
||
| resulting_files_rank_over_time = plot_ranks_over_time(results, output_dir=figure_dir, replot=True, show_figure=True) | ||
| ``` | ||
|
|
||
|  No newline at end of file |
The referenced image paths point to `images/plots/figures/`, but this directory does not appear to exist in the repository. These references will result in broken links in the documentation. Ensure the image files are included in the PR or update the paths to point to existing image locations.
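One way to make the paths resolve (a minimal sketch, assuming the plotting calls write their figures to a local output directory; the source directory name is illustrative) is to copy the generated figures into the documentation tree before building:

```python
import shutil
from pathlib import Path

# Illustrative locations: adjust to wherever the plot_* calls actually write.
figure_src = Path("figures")                    # the output_dir passed to the plotting functions
figure_dst = Path("docs/images/plots/figures")  # the directory referenced by plot_demo.md

figure_dst.mkdir(parents=True, exist_ok=True)
for figure in figure_src.glob("*.svg"):
    shutil.copy2(figure, figure_dst / figure.name)
```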
```python
from carps.analysis.generate_report import plot_performance_over_time

%matplotlib inline

resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=False, replot=True, show_figure=True
)
```
The code examples use `%matplotlib inline`, which is a Jupyter/IPython magic command. This will not work in regular Python scripts. Consider one of the following:
- Documenting that these examples are for Jupyter notebooks
- Removing the magic commands if the examples are meant for regular Python scripts
- Providing both notebook and script versions of the examples
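Following the last option, a plain-script variant of one example could simply drop the magic command (a sketch; it assumes `results` and `figure_dir` are already defined as in the surrounding examples and that an interactive matplotlib backend is available):

```python
# Script version of the notebook example: no IPython magic needed.
from carps.analysis.generate_report import plot_performance_over_time

resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=False, replot=True, show_figure=True
)
```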
```python
resulting_files_critical_difference = plot_critical_difference(results, output_dir=figure_dir, show_figure=True)
```

![svg](images/plots/figures/perf_over_time_cd.svg)
The code examples reference the undefined variables `results` and `figure_dir`. These variables need to be defined, or the documentation should explain how to obtain/define them before the code examples can be run. Consider adding a setup section that shows how to load or prepare these variables.
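A minimal setup section could look like the sketch below; how `results` is actually produced is not shown in this PR, so the CSV load here is only a placeholder for whatever loading step the package provides:

```python
from pathlib import Path

import pandas as pd

# Placeholder setup: `results` as a DataFrame of logged runs and `figure_dir`
# as the directory the plots are written to.
results = pd.read_csv("results.csv")  # hypothetical export of the run results
figure_dir = Path("figures")
figure_dir.mkdir(parents=True, exist_ok=True)
```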