
fix(library_tools): duplicate legacy library content synchronously to avoid race#38316

Open
kingoftech-v01 wants to merge 1 commit into openedx:master from kingoftech-v01:fix/bug-36544


@kingoftech-v01

Description

Duplicating a unit containing a LegacyLibraryContentBlock (randomized content / V1 library content) produces a duplicate block with no children: "There are no problems in the specified library of type any. Select another problem type." The bug only appears in production — CI's CELERY_ALWAYS_EAGER=True hides it.

User impact (Course Author): Duplicating a unit with randomized content now produces a functional duplicate with the same children as the source.

Root Cause

Race condition. duplicate_block() (cms/djangoapps/contentstore/utils.py:1149-1219) wraps the whole operation in store.bulk_operations(course_key). Inside, it calls store.create_item(...) and dest_block.studio_post_duplicate(store, source_item), which for LegacyLibraryContentBlock calls self.get_tools().trigger_duplication(...), which fires library_tasks.duplicate_children.delay(...).

bulk_operations on split modulestore buffers writes thread-locally and only flushes on exit. The Celery worker runs in a separate process — its store.get_item(BlockUsageLocator.from_string(dest_block_id)) executes before the parent request ends its bulk op, so the worker sees either ItemNotFoundError or an empty dest block. _sync_children then operates on a block with no children, and the duplicated library_content block ends up with len(children) == 0.
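The visibility problem described above can be illustrated with a toy model. This is only a sketch: ToyStore and read_from_worker are hypothetical names, not modulestore APIs, and a second thread stands in for the separate Celery worker process. In the toy the "worker" read always lands before the flush; in production the ordering is a race.

```python
import threading
from contextlib import contextmanager


class ToyStore:
    """Hypothetical stand-in for a store that buffers writes per thread."""

    def __init__(self):
        self._committed = {}             # visible to every thread
        self._local = threading.local()  # per-thread pending writes

    @contextmanager
    def bulk_operations(self):
        self._local.pending = {}
        try:
            yield
        finally:
            # Buffered writes become globally visible only on exit.
            self._committed.update(self._local.pending)
            self._local.pending = None

    def create_item(self, key, children):
        pending = getattr(self._local, "pending", None)
        target = pending if pending is not None else self._committed
        target[key] = list(children)

    def get_item(self, key):
        pending = getattr(self._local, "pending", None)
        if pending is not None and key in pending:
            return pending[key]
        return self._committed.get(key)  # None models "item not found"


def read_from_worker(store, key):
    """Read the key from another thread, standing in for a Celery worker."""
    result = []
    worker = threading.Thread(target=lambda: result.append(store.get_item(key)))
    worker.start()
    worker.join()
    return result[0]


store = ToyStore()
with store.bulk_operations():
    store.create_item("dest", ["problem_1", "problem_2"])
    seen_by_worker = read_from_worker(store, "dest")  # like .delay(): other context
    seen_in_process = store.get_item("dest")          # like .apply(): same thread

after_commit = read_from_worker(store, "dest")
```

Inside the bulk operation only the calling thread sees the new block, which is why running the task in-process avoids the empty read.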

The identical race was fixed for sync_from_library in commit 2b47b8a379 (PR #34030, TNL-11339, 2024-01-09) by switching to .apply(). trigger_duplication was overlooked.

Typo (same file, line 124): The cross-learning-context guard compares source_block to itself, so the condition is always False and the guard is silently disabled. It has been broken since the original commit e800ae7622 (2023-11-20).
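A minimal sketch of why the self-comparison disables the guard. ToyBlock, guard_with_typo, and guard_fixed are hypothetical names for illustration, and the flat context_key attribute stands in for scope_ids.usage_id.context_key on the real blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyBlock:
    """Hypothetical stand-in for a block; only the context key matters here."""
    context_key: str


def guard_with_typo(source_block, dest_block):
    # Compares the source to itself, so the result is always False.
    return source_block.context_key != source_block.context_key


def guard_fixed(source_block, dest_block):
    # Compares source to destination, so a cross-context copy is detected.
    return source_block.context_key != dest_block.context_key


course_a = ToyBlock("course-v1:Org+A+2024")
course_b = ToyBlock("course-v1:Org+B+2024")

typo_result = guard_with_typo(course_a, course_b)
fixed_result = guard_fixed(course_a, course_b)
```

With the typo, the check cannot tell the two courses apart; with the corrected comparison it can.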

Fix

Two surgical changes in xmodule/library_tools.py:

  1. Fix the self-comparison typo: replace source_block.scope_ids.usage_id.context_key != source_block.scope_ids.usage_id.context_key with source_block.scope_ids.usage_id.context_key != dest_block.scope_ids.usage_id.context_key.
  2. Switch library_tasks.duplicate_children.delay(...) to library_tasks.duplicate_children.apply(kwargs=...) so the task runs in-process. This mirrors the sync_from_library.apply(...) call just above and is the same fix applied to that sibling task in feat: make course -> lib import synchronous #34030.

A TODO comment points at #36544 and #34029 for future readers.

Supporting Information

Testing Instructions

  1. Create a content library with at least one problem.
  2. In a course, add a Randomized Content Block pointing at that library.
  3. Sync the library and verify the block has children.
  4. Duplicate the parent unit.
  5. Before the fix: the duplicated randomized block displays "There are no problems in the specified library of type any."
  6. After the fix: the duplicated randomized block has the same children as the source.

Three new tests in xmodule/tests/test_library_tools.py:

  • test_unit_trigger_duplication_does_not_enqueue_async_task — Unit: asserts .apply() called, not .delay()
  • test_integration_trigger_duplication_inside_bulk_operations — Integration: asserts duplicate has children inside an outer bulk_operations
  • test_bug_36544_regression_cross_context_guard — Regression: asserts the typo fix rejects cross-course duplication
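The shape of the first (unit) test can be sketched with unittest.mock. This is not the PR's test code: the trigger_duplication function below is a hypothetical stand-in for the fixed call site, and the keyword names source_block_id and dest_block_id are assumptions made for illustration.

```python
from unittest import mock


def trigger_duplication(task, source_id, dest_id):
    """Hypothetical stand-in for the fixed call site: run the task in-process."""
    task.apply(kwargs={"source_block_id": source_id, "dest_block_id": dest_id})


# A Mock stands in for the Celery task object so no broker is needed.
task = mock.Mock()
trigger_duplication(task, "src-block", "dest-block")

# The synchronous entry point was used exactly once with the expected kwargs...
task.apply.assert_called_once_with(
    kwargs={"source_block_id": "src-block", "dest_block_id": "dest-block"}
)
# ...and nothing was enqueued asynchronously.
task.delay.assert_not_called()
```

Asserting on both methods matters: checking only that apply was called would still pass if a stray delay call remained.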

Deadline

None

Other Information

  • No database migrations
  • No new dependencies
  • No deprecations
  • V2 libraries unaffected (studio_post_duplicate gates on is_migrated_to_v2)
  • .apply() blocks the CMS request thread for the duration of the sync, matching the already-shipped sync_from_library.apply() path. Duplicate is a user action; blocking until completion is the correct semantics.

Closes #36544

… avoid race

LegacyLibraryToolsService.trigger_duplication enqueued duplicate_children
via Celery .delay(), but its only caller runs inside store.bulk_operations()
in contentstore.utils.duplicate_block. With a real (non-eager) worker, the
Celery task executes in another process before the bulk op commits, so
store.get_item(dest_block_id) either raises ItemNotFoundError or returns a
dest block with no children. The duplicated library_content block then
renders the validation warning "There are no problems in the specified
library of type any."

Switch the call to duplicate_children.apply(kwargs=...), mirroring the
identical fix applied to sync_from_library in openedx#34029 / TNL-11339. The task
now runs in the calling thread and observes the in-flight bulk operation.

Also fix a self-comparison typo that made the cross-learning-context guard
in trigger_duplication a silent no-op.

Closes openedx#36544
@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Apr 10, 2026
@openedx-webhooks

Thanks for the pull request, @kingoftech-v01!

This repository is currently maintained by @openedx/wg-maintenance-openedx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Submit a signed contributor agreement (CLA)

⚠️ We ask all contributors to the Open edX project to submit a signed contributor agreement or indicate their institutional affiliation.
Please see the CONTRIBUTING file for more information.

If you've signed an agreement in the past, you may need to re-sign.
See The New Home of the Open edX Codebase for details.

Once you've signed the CLA, please allow 1 business day for it to be processed.
After this time, you can re-run the CLA check by adding a comment below that you have signed it.
If the CLA check continues to fail, you can tag the @openedx/cla-problems team in a comment for further assistance.

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Details
Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.


Labels

open-source-contribution PR author is not from Axim or 2U

Projects

Status: Needs Triage

2 participants