
Docs/api generator #224

Open

sagar1sharma wants to merge 2 commits into development from docs/api_generator

Conversation

@sagar1sharma
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings December 4, 2025 06:43
Contributor

Copilot AI left a comment


Pull request overview

This PR adds automated API documentation generation to the CARP-S project using the mkdocs-gen-files plugin, along with a new "Plot Demos" documentation page that showcases the package's analysis and plotting capabilities.

Key Changes:

  • Adds the gen-files plugin to mkdocs.yaml to enable automatic API documentation generation
  • Introduces docs/api_generator.py, a script that discovers Python modules and generates API documentation pages (a minimal sketch of the typical recipe follows this list)
  • Adds a new "Plot Demos" navigation entry and a corresponding documentation page with code examples
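
The script itself is not quoted in this thread; the following is a minimal sketch of the standard mkdocs-gen-files + mkdocstrings recipe such a generator typically follows. The package root, page layout, and `api/` output directory are assumptions, not the PR's actual code.

```python
"""Hypothetical sketch of an API page generator (not the PR's actual script)."""
from pathlib import Path

import mkdocs_gen_files

nav = mkdocs_gen_files.Nav()

# Discover every module in the package and emit one stub page per module.
for path in sorted(Path("carps").rglob("*.py")):
    module_path = path.with_suffix("")
    doc_path = path.with_suffix(".md")
    full_doc_path = Path("api", doc_path)

    parts = tuple(module_path.parts)
    if parts[-1] == "__init__":
        # Package pages become index.md so they nest cleanly in the nav.
        parts = parts[:-1]
        doc_path = doc_path.with_name("index.md")
        full_doc_path = full_doc_path.with_name("index.md")
    elif parts[-1] == "__main__":
        continue

    nav[parts] = doc_path.as_posix()

    # The page body is a single mkdocstrings directive for the module.
    with mkdocs_gen_files.open(full_doc_path, "w") as fd:
        fd.write(f"::: {'.'.join(parts)}\n")

    # Make "edit this page" links point at the real source file.
    mkdocs_gen_files.set_edit_path(full_doc_path, path)

# Optional: a literate-nav summary file for the API section.
with mkdocs_gen_files.open("api/SUMMARY.md", "w") as nav_file:
    nav_file.writelines(nav.build_literate_nav())
```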

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| mkdocs.yaml | Adds gen-files plugin configuration and new "Plot Demos" navigation entry |
| docs/plot_demo.md | New documentation page demonstrating plotting functions with code examples and image references |
| docs/api_generator.py | New script for automated API documentation generation from Python source files |
| docs/images/smac_benchmarking_containers.drawio | Formatting-only changes (line number addition) |


Comment thread docs/plot_demo.md
![Performance per Task](images/plots/figures/performance_per_task.png)

## Final Performance Boxplot
This is a boxplot together with a violin plot, showing the raw (but normalized) distribution of the results. The optimizers are sorted by their median value to match the critical difference assessment.

Copilot AI Dec 4, 2025


The phrase "raw (but normalized)" is contradictory. Data cannot be both raw and normalized. Consider clarifying whether the boxplot shows raw values or normalized values, or explain what "raw (but normalized)" means in this context.

Suggested change:

- This is a boxplot together with a violin plot, showing the raw (but normalized) distribution of the results. The optimizers are sorted by their median value to match the critical difference assessment.
+ This is a boxplot together with a violin plot, showing the distribution of normalized performance values. The optimizers are sorted by their median value to match the critical difference assessment.

Comment thread docs/api_generator.py
# NOTE: Given the current setup, we can only operate at a module level.
# Ideally we specify options (at least at a module level) and we render
# them into strings using a yaml parser. For now this is fine though
NO_INHERITS = ("sklearn.evaluation",)

Copilot AI Dec 4, 2025


The module sklearn.evaluation does not exist in the CARP-S codebase. Based on the directory structure, there is no sklearn module in the carps package. This configuration appears to be copied from another project and should either be removed or updated to reflect actual modules in this codebase that require special handling.

Suggested change:

- NO_INHERITS = ("sklearn.evaluation",)
+ NO_INHERITS = ()
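
If the tuple is kept for real modules later, one hypothetical way the generator could consume it (assuming the mkdocstrings-python handler, which supports an `inherited_members` option) is to append an options block to the stub pages of matching modules. The module name below is a placeholder, not a confirmed entry:

```python
import mkdocs_gen_files

NO_INHERITS = ("carps.some.module",)  # hypothetical entry; none confirmed yet


def write_module_page(parts: tuple[str, ...], full_doc_path) -> None:
    """Emit a mkdocstrings stub, suppressing inherited members where configured."""
    ident = ".".join(parts)
    with mkdocs_gen_files.open(full_doc_path, "w") as fd:
        fd.write(f"::: {ident}\n")
        if any(ident == m or ident.startswith(f"{m}.") for m in NO_INHERITS):
            fd.write("    options:\n")
            fd.write("      inherited_members: false\n")
```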

Comment thread docs/api_generator.py
Comment on lines +7 to +13
import logging
from pathlib import Path

import mkdocs_gen_files

logger = logging.getLogger(__name__)


Copilot AI Dec 4, 2025


The logger is defined but never used. Consider removing this unused import and variable, or add actual logging statements if they are intended for debugging purposes.

Suggested change:

- import logging
- from pathlib import Path
-
- import mkdocs_gen_files
-
- logger = logging.getLogger(__name__)
+ from pathlib import Path
+
+ import mkdocs_gen_files

Comment thread docs/plot_demo.md
Comment on lines +22 to +93
![Critical Difference](images/plots/figures/critical_difference.png)

## Performance per Task
We visualize the final performance per task as a heatmap with tasks as rows and optimizers as columns. The cells display the raw final performance, averaged over seeds per task, and the colormap indicates how well an optimizer performed in comparison. The colormap is using the normalized performance values.

```python
from carps.analysis.generate_report import plot_performance_per_task

resulting_files_performance_per_task = plot_performance_per_task(results, output_dir=figure_dir, replot=True, show_figure=True)
```

![Performance per Task](images/plots/figures/performance_per_task.png)

## Final Performance Boxplot
This is a boxplot together with a violin plot, showing the raw (but normalized) distribution of the results. The optimizers are sorted by their median value to match the critical difference assessment.

```python
from carps.analysis.generate_report import plot_finalperfboxplot

resulting_files_finalperfboxplot = plot_finalperfboxplot(results, output_dir=figure_dir, replot=True, show_figure=True)

```

![Final Performance Boxplot](images/plots/figures/final_performance.png)

## Performance over Time
We can inspect the anytime performance in two ways. The first is visualizing the incumbent cost over iterations: Either aggregated and normalized (and interpolated), or per task. The caveat of the first method is that we cannot distinguish well between optimizers. Thus, we normally resort to the ranking over time as determined via statistical testing.

### Aggregated over tasks, normalized
The plot shows the mean incumbent cost over iterations (both normalized and interpolated) with 95%-CI. Mostly indistinguishable thus not advised.

```python
from carps.analysis.generate_report import plot_performance_over_time

%matplotlib inline

resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=False, replot=True, show_figure=True
)
```

![aggregated over tasks](images/plots/figures/aggregated_over_task.png)

### Per Task
⏳ This might take a while because this uses seaborn's grid plot.

```python
from carps.analysis.generate_report import plot_performance_over_time

%matplotlib inline

resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=True, replot=True, show_figure=True
)
```

![Per Task](images/plots/figures/per_task.png)

### Ranking
The ranking is calculated per step in the same way as for the critical difference diagram, via statistical testing.

⏳ This might take a while...

```python
from carps.analysis.generate_report import plot_ranks_over_time

%matplotlib inline

resulting_files_rank_over_time = plot_ranks_over_time(results, output_dir=figure_dir, replot=True, show_figure=True)
```

![Ranks Over Time](images/plots/figures/ranking.png)

Copilot AI Dec 4, 2025


The referenced image paths point to images/plots/figures/ but this directory does not appear to exist in the repository. These images will result in broken links in the documentation. Ensure the image files are included in the PR or update the paths to point to existing image locations.
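
One hedged way to keep the paths consistent is to point `figure_dir` at the directory the markdown image links reference and create it before plotting; whether the plot functions write directly into the directory they are given is an assumption the thread does not confirm:

```python
from pathlib import Path

# Match the directory referenced by the image links in docs/plot_demo.md,
# so generated figures land where mkdocs expects them at build time.
figure_dir = Path("docs/images/plots/figures")
figure_dir.mkdir(parents=True, exist_ok=True)
```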

Comment thread docs/plot_demo.md
Comment on lines +56 to +88
```python
from carps.analysis.generate_report import plot_performance_over_time

%matplotlib inline

resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=True, replot=True, show_figure=True
)
```

*(excerpt; the remainder of the quoted region duplicates the previous docs/plot_demo.md thread)*

Copilot AI Dec 4, 2025


The code examples use %matplotlib inline, which is a Jupyter/IPython magic command. This will not work in regular Python scripts. Consider either:

  1. Documenting that these examples are for Jupyter notebooks
  2. Removing the magic commands if the examples are meant for regular Python scripts
  3. Providing both notebook and script versions of the examples
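
Following option 2, a plain-script variant of one example might look like the sketch below; it assumes `results` and `figure_dir` are already defined (see the setup sketch in the last thread):

```python
import matplotlib.pyplot as plt

from carps.analysis.generate_report import plot_performance_over_time

# In a regular script no IPython magic is needed; an explicit plt.show()
# displays the figures in case show_figure=True does not already block.
resulting_files_perfovertime = plot_performance_over_time(
    results, output_dir=figure_dir, per_task=False, replot=True, show_figure=True
)
plt.show()
```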

Comment thread docs/plot_demo.md
Comment on lines +19 to +90
```python
resulting_files_critical_difference = plot_critical_difference(results, output_dir=figure_dir, show_figure=True)
```

*(excerpt; the remainder of the quoted +19 to +90 region duplicates the first docs/plot_demo.md thread above)*

Copilot AI Dec 4, 2025


The code examples reference undefined variables results and figure_dir. These variables need to be defined or the documentation should explain how to obtain/define them before the code examples can be run. Consider adding a setup section that shows how to load or prepare these variables.
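
A hedged sketch of such a setup section follows. The loader is hypothetical: the project may ship a dedicated results-gathering helper, in which case that should be used instead.

```python
from pathlib import Path

import pandas as pd

# Hypothetical: assumes run results were exported to a single CSV of logs.
# Replace with the project's actual results-loading utility if one exists.
results = pd.read_csv("runs/results.csv")

# Directory the demo figures are written to and referenced from.
figure_dir = Path("docs/images/plots/figures")
figure_dir.mkdir(parents=True, exist_ok=True)
```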
