Adapt to plumpy's greenback async bridge #7206
Codecov Report
@@ Coverage Diff @@
## main #7206 +/- ##
==========================================
- Coverage 79.70% 79.70% -0.00%
==========================================
Files 565 565
Lines 43836 43867 +31
==========================================
+ Hits 34936 34959 +23
- Misses 8900 8908 +8
agoscinski
left a comment
Integration tests are typically in a separate tests directory since they are not unit tests. A separate tests directory, however, would require adapting all the GH workflows, which is a bit overkill. I would go for a directory tests/integration/notebook to separate them from the manager unit tests. In my experience with Jupyter notebook tests, starting each notebook is quite slow, so I would run them nightly.
# NOTE: We need to ensure the portal here only because
# our scheduler has only a sync interface and _get_jobs_from_scheduler is using that
# if we ever provide a fully async scheduler interface then we can remove this here
I don't fully understand. We need to call ensure_portal when we switch to a different task than the one in execute (because that one has an open portal), and when we require a nested sync->async call. So the scheduler is such a case? But then why do we need to open it in the transport? Aren't the two classes decoupled?
When a CalcJob polls the scheduler, the update gets scheduled as a new asyncio task (via this call_later), and this new task doesn't inherit the portal that the original process had from execute(). The main problem is that scheduler.get_jobs() is sync and internally calls transport.exec_command_wait(), for which it uses run_until_complete().
Once the scheduler interface is async as well, we can get rid of the ensure_portal here.
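For readers following along, the failure mode behind this can be reproduced with the standard library alone. The sketch below is only an illustration under stated assumptions: `fake_exec_command` and `sync_get_jobs` are hypothetical stand-ins (not AiiDA or plumpy APIs) for an async transport call and a synchronous scheduler call, and the sync function is scheduled on a running loop with `call_later`. Without a portal or a similar bridge, the nested `run_until_complete()` is rejected by asyncio.

```python
import asyncio


async def fake_exec_command() -> str:
    """Hypothetical stand-in for an async transport operation."""
    await asyncio.sleep(0)
    return "job list"


def sync_get_jobs(loop: asyncio.AbstractEventLoop, outcome: list) -> None:
    """Hypothetical stand-in for a sync scheduler call that needs an async result."""
    coro = fake_exec_command()
    try:
        # Plain asyncio refuses to re-enter a loop that is already running.
        outcome.append(loop.run_until_complete(coro))
    except RuntimeError as exc:
        coro.close()  # the coroutine never started; close it to avoid a warning
        outcome.append(str(exc))


async def main() -> list:
    loop = asyncio.get_running_loop()
    outcome: list = []
    # The sync function runs as a callback on the already-running loop.
    loop.call_later(0, sync_get_jobs, loop, outcome)
    await asyncio.sleep(0.05)  # give the callback time to run
    return outcome


result = asyncio.run(main())
print(result)
```

The callback records the error message instead of a job list, which is the situation that `ensure_portal()` in the transport is meant to avoid.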
def _install_greenback_portal(self) -> None:
    """Register an IPython input transformer that ensures a portal.
The docstring and the name are a bit inconsistent. Something like:
- def _install_greenback_portal(self) -> None:
-     """Register an IPython input transformer that ensures a portal.
+ def _setup_event_loop_in_ipython_environment(self) -> None:
+     """Set up the event loop in an IPython kernel environment.
The hacks we do only work for IPython, and the "register" docstring fits the _register_portal_transformer function better.
When running inside an environment with an already-running event loop
(e.g. a Jupyter notebook kernel), this patches the kernel's
``do_execute`` so that ``await greenback.ensure_portal()`` is called
I would just make it greenback-agnostic:
- ``do_execute`` so that ``await greenback.ensure_portal()`` is called
+ ``do_execute`` so that, before cell execution, a portal is opened to switch between tasks
:::

:::{important}
If you are running this tutorial in a Jupyter notebook, make sure to call `load_profile()` in a **separate cell** before executing any AiiDA engine processes (e.g. calculation functions or work chains).
- If you are running this tutorial in a Jupyter notebook, make sure to call `load_profile()` in a **separate cell** before executing any AiiDA engine processes (e.g. calculation functions or work chains).
+ If you are running this tutorial in a Jupyter notebook, make sure to call `load_profile()` in a **separate cell** before running any AiiDA engine processes (e.g. calculation functions or work chains).
agoscinski
left a comment
We also need to add a section in the howto docs/source/howto/interact.rst
https://aiida.readthedocs.io/projects/aiida-core/en/stable/howto/interact.html#how-to-interact-notebook
An issue was opened on plumpy to improve the error message:
@agoscinski do we dare?
Just one thing: the disk-objectstore version constraint in pyproject.toml sneaked into the second commit 6cbd336 rather than the first one. I ran the current version of the production tests (PR https://github.com/aiidateam/aiida-production-tests/pull/1) and they ran through without any issues or significant changes in performance.
All internal call sites (Runner.run_until_complete, FunctionProcess, WorkChain.run, TransportQueue, AsyncTransport) are updated to use plumpy's run_until_complete and run_with_portal helpers instead of direct loop.run_until_complete() calls. This requires plumpy~=0.26.0, which ships the greenback-based portal utilities. The key challenge that we had to solve is that AiiDA's engine is async, but users interact with it through synchronous entry points. Outside notebooks this works well via loop.run_until_complete(), but Jupyter kernels already run an event loop, making nested run_until_complete() calls impossible. Previously this was solved with nest_asyncio. After the replacement, IPythonKernel.do_execute is monkey-patched at profile load time to call ensure_portal() before each cell execution, establishing the greenback context that synchronous AiiDA calls rely on. This patch activates on the next cell after load_profile(), requiring users to call it in a separate cell. For non-notebook contexts (CLI, daemon, scripts), no patching occurs since there is no running event loop. Documentation is updated with instructions for running engine processes in notebooks, including the separate-cell requirement for load_profile(). An nbstripout pre-commit hook is added to keep notebook test fixtures clean. Integration tests using nbclient verify notebook workflows across same-cell, separate-cell, and magic-cell scenarios. Co-Authored-By: Alexander Goscinski <alex.goscinski@posteo.de>
…_event_loop() (aiidateam#7206) Use plumpy's get_or_create_event_loop() throughout the engine and tests to avoid DeprecationWarnings on Python 3.12+ where asyncio.get_event_loop() raises when no loop is running. Remove the set/reset_event_loop_policy() calls from Runner, as the new helper handles loop creation internally. Add an autouse _reset_runner fixture in tests/conftest.py to ensure a clean runner between tests. Co-Authored-By: Alexander Goscinski <alex.goscinski@posteo.de>
Bump disk-objectstore~=1.5.0 and remove now-unnecessary type: ignore comments on its callback and stream APIs. Co-Authored-By: Alexander Goscinski <alex.goscinski@posteo.de>
.. important::

    ``load_profile()`` must be called in a **separate cell** before any AiiDA engine processes can be executed.
@khsrali is this a new requirement or has this been the case even before?
It's new.
We really could not find a better way to do this.
The thing is, notebooks have their own running event loop. Since nest_asyncio is dropped, the only way for us to use their loop (since you can have only one running event loop per thread) is to open a greenback portal.
And that has to happen once the loop has started but before any engine calls.
The most practical place to put this logic was load_profile.
However, there's a technical issue with that: greenback portals are only usable once you are back in the async context. Basically, that means we either had to change it to something like await load_profile_async() (which defeats AiiDA's efforts to not expose async syntax to users) or register that call on each cell execution. After a lot of brainstorming, we decided to go with the second solution.
The interface remains the same, load_profile(), but greenback portals become usable from the next executed cell. A minimal "backward incompatible" price that we have to pay.
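To make the "next cell" behaviour concrete, here is a toy model using only the standard library. Everything in it is hypothetical: `ToyKernel`, `toy_ensure_portal`, `portal_is_open` and `toy_load_profile` merely imitate the shape of the mechanism described above and are not the real IPython, greenback or AiiDA APIs. The "portal" is modelled as a set of tasks, on the assumption that a portal belongs to the task that opened it.

```python
import asyncio


class ToyKernel:
    """Hypothetical stand-in for a notebook kernel that runs one cell per call."""

    async def do_execute(self, cell):
        return cell()


portal_tasks = set()  # tasks in which the toy "portal" has been opened


async def toy_ensure_portal() -> None:
    # Toy replacement for an async "ensure portal" call.
    portal_tasks.add(asyncio.current_task())


def portal_is_open() -> bool:
    return asyncio.current_task() in portal_tasks


def toy_load_profile() -> None:
    """Patch do_execute so that every later cell opens the portal first."""
    original = ToyKernel.do_execute
    if getattr(original, "_patched", False):
        return  # already patched, nothing to do

    async def patched(self, cell):
        await toy_ensure_portal()
        return await original(self, cell)

    patched._patched = True
    ToyKernel.do_execute = patched


def cell_one() -> bool:
    toy_load_profile()       # installs the patch ...
    return portal_is_open()  # ... but this cell already started without it


def cell_two() -> bool:
    return portal_is_open()


async def session():
    kernel = ToyKernel()
    first = await kernel.do_execute(cell_one)
    second = await kernel.do_execute(cell_two)
    return first, second


results = asyncio.run(session())
print(results)
```

In this model the cell that calls `toy_load_profile()` reports no portal, while the following cell does, which mirrors the separate-cell requirement.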
|
This will need to be carefully tested in various AiiDAlab apps since they
rely on Jupyter Notebooks.
|


Depends on aiidateam/plumpy#332
Alternative to #7188
Background
AiiDA's engine has a fundamental architectural tension: user-facing entry points (engine.run(), engine.submit(), calcfunctions, workfunctions) are synchronous, but the engine internals that drive them (the state machine, the daemon, the runner) are async. Whenever synchronous process code needs to call an async operation (running a nested process, polling a scheduler, performing a transport operation), it has historically called loop.run_until_complete(). This fails when the loop is already running (inside the daemon, inside Jupyter, etc.) with RuntimeError: This event loop is already running.
Until now, nest_asyncio solved this by monkey-patching asyncio internals to allow re-entrant run_until_complete() calls. While convenient, this approach has serious drawbacks:
What this PR does
This PR adapts aiida-core to use plumpy's new greenback-based async bridge (plumpy#332), replacing nest_asyncio entirely. The changes are organized in three commits:
Commit 1: Replace nest_asyncio with greenback support
All internal call sites (Runner.run_until_complete, FunctionProcess, WorkChain.run, TransportQueue, AsyncTransport) are updated to use plumpy's run_until_complete and run_with_portal helpers instead of direct loop.run_until_complete() calls. This requires plumpy~=0.26.0, which ships the greenback-based portal utilities.
The bridge in plumpy provides three functions:
The bridge is the single decision point: if the loop is already running (daemon, Jupyter), it uses greenback.await_() through the greenlet portal. If the loop is idle (CLI invocation), it falls back to native loop.run_until_complete(). No monkey-patching anywhere.
Transport/scheduler portal:
ensure_portal() is called in transport's request_transport context manager, because when a CalcJob polls the scheduler, the update is scheduled as a new asyncio task (via call_later). This new task doesn't inherit the portal from the original execute() call, and scheduler.get_jobs() is sync code that internally needs run_until_complete(). Once the scheduler interface is fully async, this can be removed (see #7222).
Jupyter notebook support:
Jupyter kernels have a permanently running event loop, so run_until_complete() can never be called natively. The bridge needs a greenback portal to be active. Since ensure_portal() is async, we can't call it from synchronous user code in a cell. Instead, when load_profile() detects an IPython kernel environment, it patches kernel.do_execute to call await ensure_portal() before each cell execution. This patch activates on the next cell after load_profile(), requiring users to call it in a separate cell before running any AiiDA engine processes, because the cell that installs the patch has already had its own do_execute called without the portal. For non-notebook contexts (CLI, daemon, scripts), no patching occurs.
Documentation is updated with notebook instructions, and integration tests using nbclient verify notebook workflows across same-cell, separate-cell, and magic-cell scenarios. An nbstripout pre-commit hook is added to keep notebook test fixtures clean.
Commit 2: Replace deprecated asyncio.get_event_loop() with plumpy.get_or_create_event_loop()
Separately from the re-entrancy problem, AiiDA passes the event loop reference throughout the codebase: the runner, the daemon, and the communicator all hold a reference. On Python 3.12+, asyncio.get_event_loop() emits a DeprecationWarning when no loop is running, and may return a different loop object (e.g. after Python creates a fresh one in a new thread context), causing callbacks to be scheduled on the wrong loop.
This commit replaces all asyncio.get_event_loop() calls with plumpy.get_or_create_event_loop(), which consistently returns the same cached loop instance. The set/reset_event_loop_policy() calls are removed from Runner, as the new helper handles loop creation internally. An autouse _reset_runner fixture is added to tests/conftest.py to ensure a clean runner between tests.
This is orthogonal to the greenback migration: even with greenback, AiiDA still needs a stable loop reference.
With this commit, the code base also becomes fully compatible with Python 3.14 (see also aiidateam/plumpy#336).
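As a rough illustration of what "returns the same cached loop instance" can look like, here is a minimal stdlib-only sketch. It is not plumpy's implementation: `get_or_create_event_loop_sketch` and `_cached_loop` are hypothetical names, and the policy of preferring a running loop and replacing a closed cached loop is an assumption made for this example.

```python
import asyncio
from typing import Optional

# Module-level cache; a hypothetical name used only in this sketch.
_cached_loop: Optional[asyncio.AbstractEventLoop] = None


def get_or_create_event_loop_sketch() -> asyncio.AbstractEventLoop:
    """Return the running loop if there is one, else a single cached loop."""
    global _cached_loop
    try:
        # Inside async code, always use the loop that is actually running.
        return asyncio.get_running_loop()
    except RuntimeError:
        pass
    # Outside async code, create the loop once and keep handing out the same one.
    if _cached_loop is None or _cached_loop.is_closed():
        _cached_loop = asyncio.new_event_loop()
    return _cached_loop


first = get_or_create_event_loop_sketch()
second = get_or_create_event_loop_sketch()
same_loop = first is second

# The cached loop is an ordinary loop and can drive coroutines.
answer = first.run_until_complete(asyncio.sleep(0, result=42))

# Once the cached loop is closed, the sketch creates a fresh one.
first.close()
replacement = get_or_create_event_loop_sketch()
replaced_after_close = replacement is not first
replacement.close()
```

Because every caller receives the same object, callbacks scheduled by the runner, the daemon and the communicator end up on one loop, which is the property the commit relies on.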
Commit 3: Bump disk-objectstore to ~=1.5.0
To pick up aiidateam/disk-objectstore#205, which adds no-op close() and flush() methods to PackedObjectReader, CallbackStreamWrapper, and ZlibLikeBaseStreamDecompresser.
This fixes flaky test failures discovered in this PR, where the async changes result in different test ordering (or timing), causing disk-objectstore to pack objects before certain tests read from the repository. When objects are packed (rather than loose), the returned stream is a PackedObjectReader, and in aiida-core TextIOWrapper is used as a context manager whose __exit__ calls close(), which propagates to the underlying stream. Without close() on packed readers, this raises AttributeError: 'PackedObjectReader' object has no attribute 'close'.
The methods are intentionally no-ops because these readers don't own the underlying file handles they read from. The bump also removes now-unnecessary type: ignore comments on disk-objectstore's callback and stream APIs.
How to test