Signed-off-by: Saifuddin Kaijar <saifuddin.kaijar@amd.com>
Pull request overview
This PR implements a retry mechanism for handling full hardware queues in the AMD XDNA driver. When command submission fails due to a full queue, the driver now waits (using IRQ-driven events) for slots to become available rather than immediately failing. The implementation includes a 5-second timeout and handles both single commands and command chains with partial submission support.
Key changes:
- Introduces IRQ-driven wait mechanism with 5-second timeout for queue slot availability
- Splits queue slot reservation and commit into separate phases to enable parallel packet preparation
- Adds partial submission support for command chains to resume from interruption point
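The reserve/commit split described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct model_queue` and the `model_*` functions are hypothetical names, not the driver's actual code, and only the `reserved_write_index` field name comes from the PR. The idea is that a producer first claims a slot (or gets `-EBUSY` when the ring is full), fills in its packet, and only then publishes it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace model of a ring queue; not the driver's real layout. */
struct model_queue {
	uint64_t read_index;           /* advanced by the consumer (hardware) */
	uint64_t write_index;          /* slots already visible to the consumer */
	uint64_t reserved_write_index; /* slots handed out to producers so far */
	uint32_t capacity;             /* total number of slots in the ring */
};

/* Phase 1: claim a slot, or report -EBUSY when every slot is in use. */
static int model_reserve(struct model_queue *q, uint64_t *slot)
{
	if (q->reserved_write_index - q->read_index >= q->capacity)
		return -EBUSY;
	*slot = q->reserved_write_index++;
	return 0;
}

/* Phase 2: publish the oldest reserved slot once its packet is prepared. */
static void model_commit(struct model_queue *q)
{
	/* A commit is only valid if a reservation is still outstanding. */
	assert(q->write_index < q->reserved_write_index);
	q->write_index++;
}

/* Consumer side: retire one committed slot, which frees space in the ring. */
static int model_complete(struct model_queue *q)
{
	if (q->read_index == q->write_index)
		return -EAGAIN; /* nothing committed yet */
	q->read_index++;
	return 0;
}
```

Because fullness is measured against `reserved_write_index` rather than `write_index`, a reserved but uncommitted slot still counts as occupied, so two producers cannot be handed the same slot while one is still preparing its packet.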
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/driver/amdxdna/ve2_of.h | Adds VE2_RETRY_TIMEOUT_MS constant (5000ms) for retry timeout |
| src/driver/amdxdna/ve2_hwctx.c | Core implementation: adds wait functions, refactors queue operations into reserve/commit phases, implements retry loops for single/chain commands |
| src/driver/amdxdna/ve2_host_queue.h | Adds reserved_write_index field to track in-flight slot reservations |
| src/driver/amdxdna/amdxdna_dpt.c | Removes extraneous blank line |
    }

    amdxdna_gem_put_obj(abo);
    (*submitted_count)++;
The submitted_count is incremented after a successful command submission, but the amdxdna_gem_put_obj(abo) call on line 775 happens before checking the return value. If ret is -EBUSY, the function returns early on line 780, but submitted_count remains at 0 even though resources were acquired and released. This causes the loop in ve2_submit_cmd_chain to not advance start_idx (line 817), resulting in an infinite retry loop attempting to submit the same command repeatedly.
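To make the concern concrete, here is a minimal userspace sketch of a chain-submission loop that resumes from the interruption point. All names (`struct model_hw`, `model_submit_from`, `model_wait_for_slot`, `model_submit_chain`) are hypothetical and are not taken from `ve2_hwctx.c`; the wait function is a simple stand-in for the IRQ-driven wait with timeout. The sketch assumes the loop advances `start_idx` by the number of commands that were actually queued and that a wait which frees nothing returns an error, so a pass with zero progress cannot spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware queue state. */
struct model_hw {
	unsigned int free_slots;     /* slots currently available */
	unsigned int drain_per_wait; /* slots freed per wait; 0 models a stuck queue */
};

/* Queue commands [start, count) until the queue fills; report how many went in. */
static int model_submit_from(struct model_hw *hw, size_t start, size_t count,
			     size_t *submitted)
{
	*submitted = 0;
	for (size_t i = start; i < count; i++) {
		if (hw->free_slots == 0)
			return -EBUSY; /* caller resumes at start + *submitted */
		hw->free_slots--;
		(*submitted)++;
	}
	return 0;
}

/* Stand-in for the timed wait: fails with -ETIMEDOUT if no slot was freed. */
static int model_wait_for_slot(struct model_hw *hw)
{
	if (hw->drain_per_wait == 0)
		return -ETIMEDOUT;
	hw->free_slots += hw->drain_per_wait;
	return 0;
}

/* Submit a whole chain, retrying from the first command not yet queued. */
static int model_submit_chain(struct model_hw *hw, size_t count)
{
	size_t start_idx = 0;

	while (start_idx < count) {
		size_t submitted = 0;
		int ret = model_submit_from(hw, start_idx, count, &submitted);

		start_idx += submitted; /* record partial progress before retrying */
		assert(start_idx <= count);
		if (ret == 0)
			break;
		if (ret != -EBUSY)
			return ret;

		ret = model_wait_for_slot(hw); /* bounded wait, never a busy spin */
		if (ret)
			return ret;
	}
	return 0;
}
```

In this model a pass that queues nothing still terminates, because the only way back into the loop is through a wait that either frees at least one slot or returns `-ETIMEDOUT`.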
Signed-off-by: Saifuddin Kaijar <saifuddin.kaijar@amd.com>
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
fb6a5c9 to 95244d4