
Docs/api generator #224

Open

sagar1sharma wants to merge 2 commits into development from docs/api_generator

Conversation

@sagar1sharma
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings December 4, 2025 06:43
Contributor

Copilot AI left a comment


Pull request overview

This PR adds automated API documentation generation to the CARP-S project using the mkdocs-gen-files plugin, along with a new "Plot Demos" documentation page that showcases the package's analysis and plotting capabilities.

Key Changes:

  • Adds the gen-files plugin to mkdocs.yaml to enable automatic API documentation generation
  • Introduces docs/api_generator.py, a script that discovers Python modules and generates API documentation pages (a minimal sketch of the typical recipe follows this list)
  • Adds a new "Plot Demos" navigation entry and a corresponding documentation page with code examples
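
The script itself is not quoted in this thread; the following is a minimal sketch of the standard mkdocs-gen-files + mkdocstrings recipe such a generator typically follows. The package root, page layout, and `api/` output directory are assumptions, not the PR's actual code.

```python
"""Hypothetical sketch of an API page generator (not the PR's actual script)."""
from pathlib import Path

import mkdocs_gen_files

nav = mkdocs_gen_files.Nav()

# Discover every module in the package and emit one stub page per module.
for path in sorted(Path("carps").rglob("*.py")):
    module_path = path.with_suffix("")
    doc_path = path.with_suffix(".md")
    full_doc_path = Path("api", doc_path)

    parts = tuple(module_path.parts)
    if parts[-1] == "__init__":
        # Package pages become index.md so they nest cleanly in the nav.
        parts = parts[:-1]
        doc_path = doc_path.with_name("index.md")
        full_doc_path = full_doc_path.with_name("index.md")
    elif parts[-1] == "__main__":
        continue

    nav[parts] = doc_path.as_posix()

    # The page body is a single mkdocstrings directive for the module.
    with mkdocs_gen_files.open(full_doc_path, "w") as fd:
        fd.write(f"::: {'.'.join(parts)}\n")

    # Make "edit this page" links point at the real source file.
    mkdocs_gen_files.set_edit_path(full_doc_path, path)

# Optional: a literate-nav summary file for the API section.
with mkdocs_gen_files.open("api/SUMMARY.md", "w") as nav_file:
    nav_file.writelines(nav.build_literate_nav())
```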

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| mkdocs.yaml | Adds gen-files plugin configuration and new "Plot Demos" navigation entry |
| docs/plot_demo.md | New documentation page demonstrating plotting functions with code examples and image references |
| docs/api_generator.py | New script for automated API documentation generation from Python source files |
| docs/images/smac_benchmarking_containers.drawio | Formatting-only changes (line number addition) |


Comment thread docs/plot_demo.md
![Performance per Task](images/plots/figures/performance_per_task.png)

## Final Performance Boxplot
This is a boxplot together with a violin plot, showing the raw (but normalized) distribution of the results. The optimizers are sorted by their median value to match the critical difference assessment.

Copilot AI Dec 4, 2025


The phrase "raw (but normalized)" is contradictory. Data cannot be both raw and normalized. Consider clarifying whether the boxplot shows raw values or normalized values, or explain what "raw (but normalized)" means in this context.

Suggested change:

- This is a boxplot together with a violin plot, showing the raw (but normalized) distribution of the results. The optimizers are sorted by their median value to match the critical difference assessment.
+ This is a boxplot together with a violin plot, showing the distribution of normalized performance values. The optimizers are sorted by their median value to match the critical difference assessment.

Comment thread docs/api_generator.py
# NOTE: Given the current setup, we can only operate at a module level.
# Ideally we specify options (at least at a module level) and we render
# them into strings using a yaml parser. For now this is fine though
NO_INHERITS = ("sklearn.evaluation",)

Copilot AI Dec 4, 2025


The module sklearn.evaluation does not exist in the CARP-S codebase. Based on the directory structure, there is no sklearn module in the carps package. This configuration appears to be copied from another project and should either be removed or updated to reflect actual modules in this codebase that require special handling.

Suggested change:

- NO_INHERITS = ("sklearn.evaluation",)
+ NO_INHERITS = ()
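
If the tuple is kept for real modules later, one hypothetical way the generator could consume it (assuming the mkdocstrings-python handler, which supports an `inherited_members` option) is to append an options block to the stub pages of matching modules. The module name below is a placeholder, not a confirmed entry:

```python
import mkdocs_gen_files

NO_INHERITS = ("carps.some.module",)  # hypothetical entry; none confirmed yet


def write_module_page(parts: tuple[str, ...], full_doc_path) -> None:
    """Emit a mkdocstrings stub, suppressing inherited members where configured."""
    ident = ".".join(parts)
    with mkdocs_gen_files.open(full_doc_path, "w") as fd:
        fd.write(f"::: {ident}\n")
        if any(ident == m or ident.startswith(f"{m}.") for m in NO_INHERITS):
            fd.write("    options:\n")
            fd.write("      inherited_members: false\n")
```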

Comment thread docs/api_generator.py
Comment on lines +7 to +13
import logging
from pathlib import Path

import mkdocs_gen_files

logger = logging.getLogger(__name__)


Copilot AI Dec 4, 2025


The logger is defined but never used. Consider removing this unused import and variable, or add actual logging statements if they are intended for debugging purposes.

Suggested change:

- import logging
- from pathlib import Path
-
- import mkdocs_gen_files
-
- logger = logging.getLogger(__name__)
+ from pathlib import Path
+
+ import mkdocs_gen_files

Comment thread docs/plot_demo.md
Comment on lines +22 to +93
![Critical Difference](images/plots/figures/critical_difference.png)

## Performance per Task
We visualize the final performance per task as a heatmap with tasks as rows and optimizers as columns. The cells display the raw final performance, averaged over seeds per task, and the colormap indicates how well an optimizer performed in comparison. The colormap is using the normalized performance values.

```python
from carps.analysis.generate_report import plot_performance_per_task

resulting_files_performance_per_task = plot_performance_per_task(results, output_dir=figure_dir, replot=True, show_figure=True)
```

![Performance per Task](images/plots/figures/performance_per_task.png)

## Final Performance Boxplot
This is a boxplot together with a violin plot, showing the raw (but normalized) distribution of the results. The optimizers are sorted by their median value to match the critical difference assessment.

```python
from carps.analysis.generate_report import plot_finalperfboxplot

resulting_files_finalperfboxplot = plot_finalperfboxplot(results, output_dir=figure_dir, replot=True, show_figure=True)

```

![Final Performance Boxplot](images/plots/figures/final_performance.png)

## Performance over Time
We can inspect the anytime performance in two ways. The first is visualizing the incumbent cost over iterations: Either aggregated and normalized (and interpolated), or per task. The caveat of the first method is that we cannot distinguish well between optimizers. Thus, we normally resort to the ranking over time as determined via statistical testing.

### Aggregated over tasks, normalized
The plot shows the mean incumbent cost over iterations (both normalized and interpolated) with 95%-CI. Mostly indistinguishable thus not advised.

```python
from carps.analysis.generate_report import plot_performance_over_time

%matplotlib inline

resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=False, replot=True, show_figure=True
)
```

![aggregated over tasks](images/plots/figures/aggregated_over_task.png)

### Per Task
⏳ This might take a while because this uses seaborn's grid plot.

```python
from carps.analysis.generate_report import plot_performance_over_time

%matplotlib inline

resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=True, replot=True, show_figure=True
)
```

![Per Task](images/plots/figures/per_task.png)

### Ranking
The ranking is calculated per step in the same way as for the critical difference diagram, via statistical testing.

⏳ This might take a while...

```python
from carps.analysis.generate_report import plot_ranks_over_time

%matplotlib inline

resulting_files_rank_over_time = plot_ranks_over_time(results, output_dir=figure_dir, replot=True, show_figure=True)
```

![Ranks Over Time](images/plots/figures/ranking.png)

Copilot AI Dec 4, 2025


The referenced image paths point to images/plots/figures/ but this directory does not appear to exist in the repository. These images will result in broken links in the documentation. Ensure the image files are included in the PR or update the paths to point to existing image locations.
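
One hedged way to keep the paths consistent is to point `figure_dir` at the directory the markdown image links reference and create it before plotting; whether the plot functions write directly into the directory they are given is an assumption the thread does not confirm:

```python
from pathlib import Path

# Match the directory referenced by the image links in docs/plot_demo.md,
# so generated figures land where mkdocs expects them at build time.
figure_dir = Path("docs/images/plots/figures")
figure_dir.mkdir(parents=True, exist_ok=True)
```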

Comment thread docs/plot_demo.md
Comment on lines +56 to +88
```python
from carps.analysis.generate_report import plot_performance_over_time

%matplotlib inline

resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=True, replot=True, show_figure=True
)
```

*(excerpt; the remainder of the quoted region duplicates the previous docs/plot_demo.md thread)*

Copilot AI Dec 4, 2025


The code examples use %matplotlib inline, which is a Jupyter/IPython magic command. This will not work in regular Python scripts. Consider either:

  1. Documenting that these examples are for Jupyter notebooks
  2. Removing the magic commands if the examples are meant for regular Python scripts
  3. Providing both notebook and script versions of the examples
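
Following option 2, a plain-script variant of one example might look like the sketch below; it assumes `results` and `figure_dir` are already defined (see the setup sketch in the last thread):

```python
import matplotlib.pyplot as plt

from carps.analysis.generate_report import plot_performance_over_time

# In a regular script no IPython magic is needed; an explicit plt.show()
# displays the figures in case show_figure=True does not already block.
resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=False, replot=True, show_figure=True
)
plt.show()
```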

Comment thread docs/plot_demo.md
Comment on lines +19 to +90
```python
resulting_files_critical_difference = plot_critical_difference(results, output_dir=figure_dir, show_figure=True)
```

*(excerpt; the remainder of the quoted +19 to +90 region duplicates the first docs/plot_demo.md thread above)*

Copilot AI Dec 4, 2025


The code examples reference undefined variables results and figure_dir. These variables need to be defined or the documentation should explain how to obtain/define them before the code examples can be run. Consider adding a setup section that shows how to load or prepare these variables.
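
A hedged sketch of such a setup section follows. The loader is hypothetical: the project may ship a dedicated results-gathering helper, in which case that should be used instead.

```python
from pathlib import Path

import pandas as pd

# Hypothetical: assumes run results were exported to a single CSV of logs.
# Replace with the project's actual results-loading utility if one exists.
results = pd.read_csv("runs/results.csv")

# Directory the demo figures are written to and referenced from.
figure_dir = Path("docs/images/plots/figures")
figure_dir.mkdir(parents=True, exist_ok=True)
```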
