Add 9 new Green AI patterns to the catalogue by russelltrow · Pull Request #407 · Green-Software-Foundation/patterns

russelltrow · 2026-06-16T11:20:12Z

Summary

This PR contributes 9 new green software patterns focused on AI and ML workloads, authored by Naveen Balani. The patterns cover the full AI lifecycle — from development decisions through to runtime operations — and are structured according to the GSF pattern template.

New patterns

Development

right-sized-energy-efficient-ai-models.md — Select and optimize AI models appropriately sized for the task to reduce compute, memory, and energy consumption during training and inference.
data-handling/optimize-data-storage-ai-training.md — Use efficient storage formats, compression, and indexing strategies for AI datasets and embeddings to reduce storage footprint, data transfer, and retrieval compute.
pre-trained-transfer-learning.md — Fine-tune existing pre-trained models instead of training from scratch to dramatically reduce the compute, energy, and time required for model development.
select-efficient-ml-frameworks-inference-runtimes.md — Choose ML frameworks and inference runtimes that best match your hardware and workload to reduce compute overhead and improve energy efficiency across training and production inference.
optimize-agent-orchestration-reduce-model-calls.md — Design agentic AI workflows to minimise redundant model invocations and unnecessary compute through caching, conditional logic, and efficient orchestration patterns.

Architecture

system-topology/run-ai-models-edge.md — Deploy AI inference on edge devices or local infrastructure to reduce data transfer, network energy use, and reliance on centralised cloud compute.
system-topology/efficient-hardware-ai-workloads.md — Match AI workloads to the most energy-efficient hardware accelerator or instance type to improve utilisation and reduce energy consumption per inference or training run.
system-topology/on-demand-execution-ai-agent-workloads.md — Trigger AI and agent workloads only when needed using serverless or event-driven platforms to eliminate idle compute and reduce unnecessary energy consumption.

Operations

operations/carbon-aware-ai-scheduling.md — Reduce the carbon impact of AI workloads by running them in cloud regions with lower grid carbon intensity and scheduling deferrable jobs during periods of high renewable energy availability.

Pattern structure

Each pattern follows the standard GSF template and includes:

## Description — problem context and motivation
## Solution — actionable guidance
## SCI Impact — mapping to the E, I, M, and R factors of the SCI equation
## Cost Impact — compute, infrastructure, and trade-off considerations
## Assumptions — preconditions for the pattern to apply
## Considerations — trade-offs and caveats
## References — citations and further reading

All patterns include a description field in YAML front matter for catalogue indexing, and personas are aligned to the official GSF persona list.

Note on patterns 4A and 4B

The ML frameworks and agent orchestration patterns were originally scoped as a single pattern. Following a review with Naveen Balani, they were split into two focused patterns — one covering execution engine selection (4A) and one covering workflow design for agentic systems (4B) — as the two decisions are made by different personas at different points in development.

Test plan

Review YAML front matter is valid for all 9 files
Confirm section order matches GSF template: Description → Solution → SCI Impact → Cost Impact → Assumptions → Considerations → References
Verify patterns render correctly in the Docusaurus site
Confirm lifecycle category assignments (Development / Architecture / Operations) are correct for each pattern
Confirm all personas are drawn from the official GSF persona list
Review SCI Impact mappings for accuracy against the SCI specification

🤖 Generated with Claude Code

Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Updated base URL format for Docusaurus configuration. Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Updated GH deploy workflow to allow manual runs Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Remove several legacy AI/architecture pages and replace them with reorganized, up-to-date guidance. Adds new system-topology patterns (efficient-hardware, on-demand-execution for agent workloads, run-ai-models-edge), new development docs (right-sized models, optimize data storage), and an operations doc for carbon-aware scheduling. Also updates pre-trained-transfer-learning metadata/content (author and expanded guidance). Consolidates and modernizes AI sustainability guidance and authorship (Naveen Balani).

Rename file from 'Use right-sized and energy-efficient AI models .md' to 'right-sized-energy-efficient-ai-models.md' to remove trailing space and normalize the filename to kebab-case. No content changes were made; this improves consistency and prevents issues with linking and tooling that don't handle spaces well.

…vements
- Add ## Cost Impact section to all 7 AI patterns (between SCI Impact and Assumptions)
- Fix stray trailing quote in pattern-02 h1 title
- Strengthen edge deployment assumption (memory/compute/power specifics)
- Strengthen transfer learning fine-tuning cost caveat
- Strengthen on-demand execution stateful workflow assumption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replace 'Enterprise Architect' with 'Solution Architect' in three AI patterns to match the personas defined at patterns.greensoftware.foundation/personas/ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Split the originally scoped Pattern 4 into two focused patterns per Naveen Balani's approval:
- 4A: Select efficient ML frameworks and inference runtimes (Development). Covers framework/runtime selection criteria, inference-optimised runtimes, hardware-specific optimisations, and benchmarking guidance.
- 4B: Optimize agent orchestration to reduce unnecessary model calls (Development). Covers caching, conditional logic, batching, early termination, and workflow profiling for agentic AI systems.

Both patterns follow the full GSF template including Cost Impact and description front matter fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

All other patterns in the repo use empty tags fields. Comma-separated string values are not valid YAML arrays and caused a ValidationError on deploy. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Five redirects pointed to old pattern paths that no longer exist:
- compress-ml-models-for-inference → right-sized-energy-efficient-ai-models
- energy-efficent-ai-edge → run-ai-models-edge
- efficent-format-for-model-training → optimize-data-storage-ai-training
- right-hardware-type → efficient-hardware-ai-workloads
- leverage-sustainable-regions → carbon-aware-ai-scheduling

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

russelltrow · 2026-06-16T14:06:45Z

@LiyaMath @franziska-warncke @navveenb here are the new AI patterns on the staging website:

https://russelltrow.github.io/gsf-patterns/personas/ai-ml-engineer/

caxenie · 2026-06-18T13:17:55Z

+
+AI workloads such as training, fine-tuning, and inference require significant compute resources. The type of hardware used, including CPUs, GPUs, TPUs, and specialized accelerators, has a direct impact on energy efficiency and performance.
+
+Different hardware options vary in their ability to execute AI workloads efficiently. Selecting appropriate hardware and compute resources improves utilization, reduces execution time, and lowers overall energy consumption.


I would also add that the orchestration is critical here. The suggested selection should not happen manually but should depend on a characterisation of the system heterogeneity.

I believe we can refine this here by looking at middleware systems that do the workload balancing, dispatching, and monitoring in a closed loop.

Excellent suggestion. We'll explicitly mention that orchestration layers can act as closed-loop resource controllers, continuously adjusting allocations based on workload requirements, utilization, and efficiency objectives to avoid over-provisioning.

caxenie · 2026-06-18T13:20:18Z

+
+## Solution
+
+- Choose hardware that is optimized for the specific workload, such as GPUs or TPUs for parallel processing tasks


Here we can become more fine-grained, basically saying that we can define time scales for each task to be executed on CPU / GPU / TPU or accelerators by offering a "catalogue" of possible solutions for the needed time scales.

Agreed, made the changes
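To make the "catalogue" idea above concrete, here is a minimal sketch, assuming each catalogue entry records an estimated latency and energy per task for one hardware type. The task name, hardware labels, and all numbers are hypothetical placeholders, not measurements, and `pick` is an illustrative helper rather than anything defined in the pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    hardware: str      # e.g. "cpu", "gpu", "npu" (labels are illustrative)
    latency_ms: float  # estimated time scale for one task execution
    energy_j: float    # estimated energy for one task execution


# Hypothetical catalogue entries; the numbers are made up for illustration.
CATALOGUE = {
    "image-classification": [
        Option("cpu", 120.0, 6.0),
        Option("gpu", 8.0, 2.4),
        Option("npu", 15.0, 0.9),
    ],
}


def pick(task, deadline_ms, catalogue=CATALOGUE):
    """Return the lowest-energy option that meets the deadline, or None."""
    candidates = [o for o in catalogue.get(task, []) if o.latency_ms <= deadline_ms]
    if not candidates:
        return None
    return min(candidates, key=lambda o: o.energy_j)
```

With these placeholder values, a 20 ms deadline selects the NPU entry, while a 10 ms deadline leaves only the GPU entry. In practice the catalogue would need to be populated from profiling on the actual hardware.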

caxenie · 2026-06-18T13:26:53Z

+## Solution
+
+- Choose hardware that is optimized for the specific workload, such as GPUs or TPUs for parallel processing tasks
+- Use specialized accelerators where available to improve efficiency


Let us build a catalogue for accelerator integration. We need to discuss this at the orchestration level (middleware) to enable monitoring and workload dispatching with judiciously allocated resources.

Great suggestion, updated the principle.

caxenie · 2026-06-25T04:40:20Z

+## Solution
+
+- Deploy models on edge devices or local infrastructure to reduce data transfer to centralized systems
+- Perform data preprocessing tasks such as filtering, cleansing, and feature generation locally


I would complete this with "problem/workload-specific preprocessing".

caxenie · 2026-06-25T04:41:33Z

+
+## SCI Impact
+
+**SCI = (E × I) + M per R**


I guess the formula changes here as we have edge and cloud measurements. Shall we split the contributions? We might, of course, run into the problem of having the cloud "absorb" the edge values.

The SCI formula should still work. However, in hybrid edge-cloud architectures, energy, carbon intensity, and embodied emissions should be measured across all participating components, including edge devices, networking, and centralized infrastructure, and aggregated when calculating SCI.
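As a rough illustration of that aggregation, the sketch below assumes every component (edge device, network, cloud) reports its own E, I, and M for the same functional unit R. The `Component` class, the function names, and the units chosen are assumptions made for this example; the per-component breakdown is one possible way to keep the edge share visible next to the cloud share.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    energy_kwh: float           # E: energy consumed by this component
    intensity_g_per_kwh: float  # I: carbon intensity of its electricity
    embodied_g: float           # M: embodied emissions attributed to it


def _emissions(c):
    # (E * I) + M for a single component, in gCO2e
    return c.energy_kwh * c.intensity_g_per_kwh + c.embodied_g


def sci(components, functional_units):
    """Sum (E * I) + M over all components, then divide by R."""
    if functional_units <= 0:
        raise ValueError("functional_units (R) must be positive")
    return sum(_emissions(c) for c in components) / functional_units


def sci_breakdown(components, functional_units):
    """Per-component contribution to the SCI score, keyed by component name."""
    if functional_units <= 0:
        raise ValueError("functional_units (R) must be positive")
    return {c.name: _emissions(c) / functional_units for c in components}
```

Reporting the breakdown alongside the total is a design choice for this sketch: the total matches the single SCI score, and the breakdown shows how much of it comes from each tier.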

caxenie · 2026-06-25T04:42:33Z

+
+## Cost Impact
+
+- **Cloud compute costs:** Reduced by moving inference to edge devices


I would also add the availability of the edge devices. Depending on the network connection type, the edge device availability and responsiveness would also play a role.

Great point, added

caxenie · 2026-06-25T04:44:11Z

+
+- **Cloud compute costs:** Reduced by moving inference to edge devices
+- **Network costs:** Lower data transfer to centralized systems
+- **Edge device costs:** Increased due to deploying hardware at the edge


I would add the edge device type to the cost impact. We can use von Neumann machines with a low price and high energy demand, or in-memory-compute non-von Neumann machines with a (relatively) higher price and ultra-low power consumption.

caxenie · 2026-06-25T04:45:41Z

+
+## Assumptions
+
+- Edge or local devices have sufficient memory, compute capacity, and power to run the target model without requiring additional optimization


In my experience, I have also made tradeoffs between the pre-processing costs on the near-data edge device and the actual ML model, which comes anyway quantised / compressed for inference.

Excellent point. Preprocessing itself consumes compute resources and should be evaluated alongside model execution when designing edge architectures.

caxenie · 2026-06-25T04:47:42Z

+## Assumptions
+
+- Edge or local devices have sufficient memory, compute capacity, and power to run the target model without requiring additional optimization
+- Workloads can be partitioned effectively between edge and cloud


This is an open question: how to quantify/predict workloads and dispatch the workload smartly between cloud and edge. We need to define metrics also for the partitioning.

Good point, I will extend the considerations section to highlight the need for metrics and evaluation criteria when deciding how workloads should be distributed.

caxenie · 2026-06-25T04:50:35Z

+Using efficient data storage and access patterns improves data retrieval performance and reduces the overall resource footprint of both training and runtime systems.
+
+## Solution
+


Would smart methods for serialization/deserialization also play a role here? Especially when talking about cloud-edge distribution?

Great suggestion. Will expand the solution section to include serialization efficiency as an important consideration.
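As a small illustration of why serialization format matters for embeddings moved between edge and cloud, the sketch below compares a JSON text encoding with a packed little-endian float32 encoding using only the Python standard library. The function names are illustrative, and float32 is an assumption here: it reduces precision compared with Python's 64-bit floats, which may or may not be acceptable for a given workload.

```python
import json
import struct


def to_json_bytes(vector):
    """Encode a list of floats as UTF-8 JSON text."""
    return json.dumps(vector).encode("utf-8")


def to_float32_bytes(vector):
    """Encode a list of floats as packed little-endian float32 (4 bytes each)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_float32_bytes(blob):
    """Decode packed little-endian float32 bytes back into a list of floats."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))
```

For a 256-element vector the packed form is always 1024 bytes, while the JSON form grows with the number of digits per value. Decoding the binary form also avoids text parsing, although the actual savings depend on the data and should be measured.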

caxenie · 2026-06-25T04:51:20Z

+
+## Considerations
+
+- Compatibility with existing tools and pipelines must be evaluated


Here the conversion tools might also introduce costs.

Good point, added

caxenie · 2026-06-25T04:57:25Z

+- Use caching mechanisms to avoid re-processing identical inputs or identical tool results
+- Implement conditional logic to skip unnecessary model calls when prior results can be reused
+- Prefer direct tool calls or API integrations over calling models to transform simple data
+- Use streaming and progressive results where possible instead of processing entire responses at once


Yes, this eventually includes harmonising the event-processing concepts. There are lessons learned from stream processing, and AI model serving in particular could benefit from them.

Good point, will broaden the guidance to acknowledge that streaming and event-driven processing patterns can reduce unnecessary computation and improve responsiveness.
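The caching bullet quoted above can be sketched as follows. This is a minimal illustration, assuming the model is reachable through some callable that takes a prompt and keyword parameters; `CachedModelClient` and `call_model` are hypothetical names, and the cache has no expiry, so it does not address the data-freshness concern by itself.

```python
import hashlib
import json


class CachedModelClient:
    """Wraps a model-calling function and reuses results for identical inputs."""

    def __init__(self, call_model):
        # call_model is a hypothetical callable: (prompt, **params) -> response
        self._call_model = call_model
        self._cache = {}
        self.calls_made = 0  # number of real model invocations

    def _key(self, prompt, params):
        # Stable key over prompt and parameters; params must be JSON-serializable.
        payload = json.dumps({"prompt": prompt, "params": params}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, prompt, **params):
        key = self._key(prompt, params)
        if key not in self._cache:
            # Cache miss: this is the only path that invokes the model.
            self.calls_made += 1
            self._cache[key] = self._call_model(prompt, **params)
        return self._cache[key]
```

Comparing `calls_made` against the number of `complete` calls gives a simple measure of how many invocations the cache avoided for a given workflow.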

caxenie · 2026-06-25T04:58:44Z

+
+## Assumptions
+
+- Workflows can be analyzed and profiled to identify inefficiencies


In my experience the state profiling is not that efficient. We need good calibration with deployed workflows, especially when considering the cloud-edge continuum

Agreed, will modify the wording to reflect this

caxenie · 2026-06-25T05:01:41Z

+- Caching strategies must account for data freshness and accuracy requirements
+- Some tasks genuinely require multiple model calls; avoid false economy measures
+- Agent design patterns vary (ReAct, Tree of Thought, etc.); optimization strategies differ by pattern
+- Monitoring and profiling agent execution requires observable logging and metrics


In this case, we need mechanisms like those in feedback control systems. Here, we would have a controller fed with the maximum resources and the values of continuously monitored resource consumption, with the capacity to orchestrate the execution of agents in an event-based manner. Since continuous monitoring would be costly, I would suggest event-based reactive control: the system would react only to relevant changes in the metrics. There is so much potential here!

Great suggestion. Closed-loop monitoring and event-driven adaptation can help optimize agent execution while avoiding the overhead of continuous monitoring. Will incorporate this as a consideration for advanced orchestration scenarios.
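One way to picture the event-based reactive control discussed above is a controller that ignores small fluctuations and only acts when resource usage has moved by more than a chosen delta since its last reaction. The sketch below is an assumption-laden illustration: the class name, the "reduce or restore one agent slot" policy, and the half-budget headroom rule are all invented for this example.

```python
class ReactiveController:
    """Event-based control: react only when usage moves by at least `delta`."""

    def __init__(self, max_resource, delta, max_agents):
        self.max_resource = max_resource  # resource budget given to the controller
        self.delta = delta                # minimum change that counts as an event
        self.max_agents = max_agents
        self.allowed_agents = max_agents  # current concurrency limit for agents
        self._last_seen = None            # usage value at the last reaction
        self.reactions = 0                # how many times the controller acted

    def observe(self, usage):
        # Skip samples that are close to the value we last reacted to.
        if self._last_seen is not None and abs(usage - self._last_seen) < self.delta:
            return self.allowed_agents
        self._last_seen = usage
        self.reactions += 1
        if usage > self.max_resource:
            # Over budget: allow one fewer concurrent agent (never below one).
            self.allowed_agents = max(1, self.allowed_agents - 1)
        elif usage < 0.5 * self.max_resource:
            # Assumption: below half the budget counts as headroom to scale back up.
            self.allowed_agents = min(self.max_agents, self.allowed_agents + 1)
        return self.allowed_agents
```

In this sketch the controller still receives every sample, so the saving is in how often it re-plans. A fuller design would push the delta check to the metric source so that unchanged values are never transmitted.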

caxenie · 2026-06-25T05:06:12Z

+- Pre-trained models may introduce biases or limitations from their original training data
+- Fine-tuning large foundation models can still require substantial compute resources comparable to training from scratch; evaluate the true cost-benefit of fine-tuning vs. full training for your use case
+- Licensing and usage restrictions of pre-trained models must be evaluated
+- Model suitability should be validated for the specific domain


Here it would be hard to find a diverse spectrum of domain-specific models, and then, to train them, we need more data and resources. This domain-specific validation would also have a price.

Great point. Domain-specific pre-trained models may not always be available, and adapting or validating them for specialized domains can require additional data, compute, and evaluation effort. We'll expand the considerations section to reflect these trade-offs.

caxenie · 2026-06-25T05:09:49Z

+
+AI and ML models vary significantly in size, architecture, complexity, and resource requirements. Larger models typically require more compute, memory, and storage, leading to higher energy consumption during both training and inference.
+
+Using models that are appropriately sized and architecturally efficient for the task avoids unnecessary resource usage. This includes selecting smaller or task-specific models, choosing energy-efficient architectures at equivalent capability levels, and applying optimization techniques to reduce model footprint without sacrificing required performance.


This might come as a measure competing with the re-use of larger models and retraining. Model choice is also a decision not always available to all teams bringing AI into production. Offering a model zoo is typically good, but choosing is still a problem. A problem/model catalogue would be great.

Good point, will clarify this

caxenie · 2026-06-25T05:11:21Z

+- Prefer optimized or distilled versions of larger models for fine-tuning and inference
+- Apply model compression techniques such as quantization, pruning, and knowledge distillation
+- Remove redundant or inactive parameters where possible
+- Evaluate model options based on both performance and energy efficiency before selection


Especially hard in edge deployment. The estimated values have some variance with respect to the deployed values on the hardware.

Agreed, will add this.

caxenie · 2026-06-25T05:15:01Z

+- Remove redundant or inactive parameters where possible
+- Evaluate model options based on both performance and energy efficiency before selection
+- Continuously evaluate newer model variants that offer improved efficiency
+- Avoid defaulting to the largest available model when simpler alternatives can achieve similar outcomes


This is problem-dependent and taps into more parameters of the problem: deployment hardware type and resources (i.e., DSP, GPU, NPU) available, type of data, and pre-processing. For instance, one can use for temporal data both feedforward and recurrent models, with the latter being more resource-efficient (i.e., fewer parameters) but harder to build.

Excellent point. Model suitability depends not only on task complexity but also on deployment hardware, data characteristics, preprocessing requirements, and operational constraints. Will incorporate these factors into the considerations section.

caxenie · 2026-06-25T05:16:03Z

+
+- **Compute costs:** Reduced due to smaller model sizes and faster inference
+- **Infrastructure costs:** Lower due to reduced memory and storage requirements
+- **Benchmarking overhead:** May add cost for performance testing across model variants


This should be a must! More work to be done here, but I would argue the overhead is needed to enable all the other cost savings.

Agreed, added the trade-off

caxenie · 2026-06-25T05:16:45Z

+## Assumptions
+
+- Smaller or optimized models can meet the functional requirements of the application
+- Model performance can be validated against acceptable thresholds


Acceptable thresholds are problem specific.

Good point. Performance thresholds vary by application and should be defined according to business and functional requirements.

caxenie · 2026-06-25T05:17:24Z

+
+- Smaller or optimized models can meet the functional requirements of the application
+- Model performance can be validated against acceptable thresholds
+- Efficiency improvements do not significantly degrade output quality


There is always a trade-off.

caxenie · 2026-06-25T05:19:11Z

+- Some complex tasks may require larger models
+- Over-optimization can degrade performance
+- Fine-tuning larger models may be necessary for complex domain-specific tasks
+- Periodic re-evaluation is needed as workloads and models evolve


This monitoring comes with overhead but can be integrated into the previously defined event-based platform to handle resource allocation, as profiling is the core of the lifecycle.

Yes, will add that monitoring approaches should balance observability benefits with resource consumption.

caxenie · 2026-06-25T05:23:50Z

+
+Different frameworks and runtimes vary significantly in their ability to leverage hardware capabilities, execute operations efficiently, and minimize computational overhead. Inefficient framework choices can lead to unnecessary compute consumption, poor hardware utilization, and increased energy expenditure for the same workload.
+
+Selecting efficient ML frameworks and inference runtimes improves model execution performance and reduces the carbon footprint of AI training and inference.


A very important point here is that the community is working towards harmonised and interoperable framework APIs, so that one can deploy the same model on various backends. This approach of streamlining model deployment to specialised hardware is already happening (see, from the neuromorphic accelerator community, snnTorch and the Neuromorphic Intermediate Representation). This would open the stage for truly heterogeneous systems with large benefits in performance and sustainability, once a workload orchestrator is in place.

See: https://neuroir.org/ and https://snntorch.readthedocs.io/en/latest/

Good point, will extend the pattern to highlight the importance of interoperable runtimes and portable model representations.

caxenie · 2026-06-25T06:05:12Z

+
+## Solution
+
+- Choose frameworks that efficiently utilize available hardware (GPUs, TPUs, specialized accelerators)


Here, the community is starting to open the way for new systems. One example of a heterogeneous orchestrator is the one from https://klepsydra.com/, which targets efficient workload dispatching, monitoring, and balancing on heterogeneous edge systems.

We will acknowledge that framework choices may influence workload portability and orchestration across diverse hardware platforms.

caxenie · 2026-06-25T06:12:03Z

+- Use optimized inference layers that reduce latency and compute overhead compared to training frameworks
+- Select frameworks with strong compiler optimization and memory management capabilities
+- Benchmark framework options under your actual workload conditions before committing to production
+- Keep frameworks and runtime dependencies updated to benefit from performance and efficiency improvements


Here, backwards compatibility and the option to also "retrofit" existing hardware in large heterogeneous systems are still open problems. Imagine we have existing systems that could deliver more performance in a more efficient way, but backwards compatibility and runtime dependencies create friction.

Great point, will add this consideration to ensure organizations evaluate migration friction and retrofit opportunities.

Enhanced the document on efficient hardware for AI workloads by adding details on workload profiling, hardware optimization, and orchestration systems. Updated trade-off considerations for specialized accelerators and added new references for benchmarking. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

Enhanced recommendations for event-driven execution and resource management. Expanded sections on cost impact, assumptions, and considerations for on-demand workloads. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

Enhanced the documentation on deploying AI models at the edge by adding details on workload classification, preprocessing tasks, and cost implications of edge devices. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

Expanded on data handling strategies for AI training, emphasizing efficient serialization, compatibility considerations, and the balance between compression and decompression costs. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

Enhanced recommendations for optimizing agent workflows by incorporating telemetry and event-driven processing. Updated considerations to emphasize the need for adaptive orchestration and careful evaluation of trade-offs. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

Updated the note on model suitability for specific domains to emphasize the need for additional resources and validation efforts. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

Updated evaluation criteria for model selection to include energy efficiency benchmarks and clarified assumptions regarding performance and quality thresholds. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

Updated considerations for selecting ML frameworks and inference runtimes, emphasizing compatibility and portability. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

russelltrow and others added 13 commits January 15, 2026 16:49

Update site URL in docusaurus.config.js

4d5e968

Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Update site URL and base URL in config

50a963c

Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Change baseUrl format in docusaurus.config.js

7b92df7

Updated base URL format for Docusaurus configuration. Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Update docusaurus.config.js

2010146

Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Update deploy.yml

35beaaf

Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Update docusaurus.config.js

59391bf

Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Update deploy.yml

d7f4835

Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Update deploy.yml

0c7c215

Updated GH deploy workflow to allow manual runs Signed-off-by: Russell Trow <russell@greensoftware.foundation>

Merge branch 'main' of https://github.com/russelltrow/gsf-patterns

a94192e

docs: add description field to front matter for all 7 AI patterns

2ece6eb

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

russelltrow requested review from LiyaMath and franziska-warncke June 16, 2026 11:22

russelltrow assigned navveenb Jun 16, 2026

russelltrow and others added 2 commits June 16, 2026 12:39

docs: align personas to official GSF persona list

0ab2884

Replace 'Enterprise Architect' with 'Solution Architect' in three AI patterns to match the personas defined at patterns.greensoftware.foundation/personas/ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

russelltrow changed the title ~~feat: add 7 AI sustainability patterns to the GSF catalogue~~ feat: add 9 AI sustainability patterns to the GSF catalogue Jun 16, 2026

russelltrow and others added 3 commits June 16, 2026 12:52

fix: clear tags front matter to fix Docusaurus YAML validation error

3317014

All other patterns in the repo use empty tags fields. Comma-separated string values are not valid YAML arrays and caused a ValidationError on deploy. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: add tags to all 9 AI patterns using correct YAML list format

f88c92a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

russelltrow mentioned this pull request Jun 16, 2026

Evolve Green AI patterns with the Green AI Committee #369

Open

6 tasks

russelltrow changed the title ~~feat: add 9 AI sustainability patterns to the GSF catalogue~~ Add 9 new AI sustainability patterns to the catalogue Jun 16, 2026

russelltrow changed the title ~~Add 9 new AI sustainability patterns to the catalogue~~ Add 9 new Green AI patterns to the catalogue Jun 16, 2026

russelltrow mentioned this pull request Jun 16, 2026

2026-06-18 Green Software Principles & Patterns agenda #406

Open

13 tasks

caxenie reviewed Jun 18, 2026


caxenie reviewed Jun 25, 2026


navveenb added 8 commits June 26, 2026 18:00

Revise model suitability considerations for domains

87b73ce

Updated the note on model suitability for specific domains to emphasize the need for additional resources and validation efforts. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

Revise guidelines for efficient ML framework selection

d9ad231

Updated considerations for selecting ML frameworks and inference runtimes, emphasizing compatibility and portability. Signed-off-by: Navveen Balani <88837066+navveenb@users.noreply.github.com>

