Added the ve2 driver tracing support#1251
Conversation
Signed-off-by: Saifuddin Kaijar <saifuddin.kaijar@amd.com>
There was a problem hiding this comment.
Pull request overview
This PR adds VE2 driver tracing hooks and introduces a per-hardware-context ID to correlate trace events across hwctx lifecycle, command submission, scheduling, and IRQ completion paths.
Changes:
- Added a new generic trace event (
__amdxdna_trace_point) and wrapper macro to record profiling points with function name + 3 integer args. - Introduced hwctx ID allocation via an
xarrayin the VE2 device handle, and stored an ID inamdxdna_ctx_priv. - Instrumented VE2 management/scheduler, IRQ handler, and HSA queue submission/commit/wait paths with new tracepoints.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/driver/amdxdna/ve2_of.h | Adds hwctx ID field and device-level XArray state for ID allocation. |
| src/driver/amdxdna/ve2_of.c | Initializes/destroys the hwctx ID XArray during device init/fini. |
| src/driver/amdxdna/ve2_mgmt.c | Adds tracepoints around partition init, scheduling, scheduler work, and IRQ completion. |
| src/driver/amdxdna/ve2_hwctx.c | Allocates/releases hwctx IDs; adds tracepoints around queue write/read index updates, submit, wait, init/fini. |
| src/driver/amdxdna/amdxdna_trace.h | Adds new trace event + macro wrapper capturing __func__. |
Comments suppressed due to low confidence (3)
src/driver/amdxdna/ve2_of.h:90
struct amdxdna_ctx_priv::idis declared asu64, but IDs are allocated viaxa_alloc_cyclic()using a(u32 *)&priv->idcast. This relies on type-punning and only initializes the low 32 bits; it’s safer and clearer to store the ID asu32(or allocate into a temporaryu32and then assign to theu64).
struct amdxdna_ctx_priv {
u64 id; /* Unique incrementing hwctx ID */
u32 start_col;
u32 num_col;
u32 mem_bitmap;
u32 state;
src/driver/amdxdna/ve2_hwctx.c:1428
- After successfully allocating an ID with
xa_alloc_cyclic(), later failures (e.g.,ve2_xrs_request()or host queue creation) jump tocleanup_privand freeprivwithout removing the entry fromhwctx_ids. This leaks the ID and leaves a stale pointer in the XArray; add a cleanup path that callsxa_erase()(orxa_release()) before freeingprivon all post-allocation error paths.
/* Allocate unique ID using XArray (thread-safe, supports ID recycling) */
ret = xa_alloc_cyclic(&xdna->dev_handle->hwctx_ids, (u32 *)&priv->id, priv,
XA_LIMIT(1, U32_MAX), &xdna->dev_handle->next_hwctx_id, GFP_KERNEL);
if (ret < 0) {
XDNA_ERR(xdna, "Failed to allocate hwctx ID, ret=%d", ret);
goto cleanup_priv;
}
trace_amdxdna_trace_point("XRT_PROFILING_TRACE_ENTER",
client->pid, 0, priv->id, 0);
XDNA_DBG(xdna, "hwctx init: enter hwctx=%p pid=%d", hwctx, client->pid);
init_waitqueue_head(&priv->waitq);
ret = ve2_xrs_request(xdna, hwctx);
if (ret) {
XDNA_ERR(xdna, "XRS resource request failed, ret=%d", ret);
goto cleanup_priv;
}
src/driver/amdxdna/ve2_hwctx.c:1000
check_read_index()introducesstatic u64 read_index_value, which is shared across all callers/contexts and can be concurrently modified, producing incorrect comparisons/logs. This value should be a plain local variable (stack) instead ofstatic.
static inline bool check_read_index(struct amdxdna_ctx *hwctx,
u64 seq, u32 print_interval)
{
struct amdxdna_ctx_priv *priv_ctx = hwctx->priv;
static u64 counter;
static u64 read_index_value;
u64 *read_index;
if (!hwctx || !priv_ctx || !priv_ctx->hwctx_hsa_queue.hsa_queue_p)
return false;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Saif <saifuddin.kaijar@amd.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.
Comments suppressed due to low confidence (1)
src/driver/amdxdna/ve2_hwctx.c:1466
xa_alloc_cyclic()insertsprivintoxdna->dev_handle->hwctx_idsbefore the rest of hwctx initialization. If any later step fails (ve2_xrs_request(),ve2_create_host_queue(), etc.), the cleanup path freesprivbut never removes the XArray entry, leaving a stale pointer and leaking the ID. Track whether allocation succeeded andxa_erase()the ID on all failure paths (or delay allocation until after the last failing step).
/* Allocate unique ID using XArray (thread-safe, supports ID recycling) */
ret = xa_alloc_cyclic(&xdna->dev_handle->hwctx_ids, (u32 *)&priv->id, priv,
XA_LIMIT(1, U32_MAX), &xdna->dev_handle->next_hwctx_id, GFP_KERNEL);
if (ret < 0) {
XDNA_ERR(xdna, "Failed to allocate hwctx ID, ret=%d", ret);
goto cleanup_priv;
}
trace_amdxdna_trace_point("XRT_PROFILING_TRACE_ENTER",
client->pid, 0, priv->id, 0);
XDNA_DBG(xdna, "hwctx init: enter hwctx=%p pid=%d", hwctx, client->pid);
init_waitqueue_head(&priv->waitq);
ret = ve2_xrs_request(xdna, hwctx);
if (ret) {
XDNA_ERR(xdna, "XRS resource request failed, ret=%d", ret);
goto cleanup_priv;
}
/* Auto-select memory bitmap based on start_col */
ve2_auto_select_mem_bitmap(xdna, hwctx);
/* One host_queue entry per hwctx */
ret = ve2_create_host_queue(xdna, hwctx, &priv->hwctx_hsa_queue);
if (ret) {
XDNA_ERR(xdna, "Failed to create host queue, ret=%d", ret);
goto cleanup_xrs;
}
if (enable_polling) {
XDNA_DBG(xdna, "Running in timer mode");
timer_setup(&priv->event_timer, timeout_cb, 0);
mod_timer(&priv->event_timer, jiffies + CTX_TIMER);
} else {
XDNA_DBG(xdna, "Running in interrupt mode");
}
if (verbosity >= VERBOSITY_LEVEL_DBG)
ve2_clear_firmware_status(xdna, hwctx);
mutex_init(&priv->privctx_lock);
priv->state = AMDXDNA_HWCTX_STATE_IDLE;
XDNA_DBG(xdna, "hwctx init: ready hwctx=%p start_col=%u pid=%d",
hwctx, priv->start_col, hwctx->client->pid);
trace_amdxdna_trace_point("XRT_PROFILING_TRACE_EXIT",
hwctx->client->pid, priv->start_col, priv->id, 0);
return 0;
cleanup_xrs:
/* Releases XRS and partition (ve2_mgmt_destroy_partition calls ve2_xrs_release). */
ve2_mgmt_destroy_partition(hwctx);
cleanup_priv:
kfree(hwctx->priv);
hwctx->priv = NULL;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Saif <saifuddin.kaijar@amd.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
src/driver/amdxdna/ve2_hwctx.c:1471
- On error paths after a successful
xa_alloc_cyclic()(e.g., failures inve2_xrs_request()/ve2_create_host_queue()), cleanup falls through tocleanup_privand freeshwctx->privwithout removing the entry fromxdna->dev_handle->hwctx_ids. This leaves a dangling pointer in the XArray and leaks the allocated ID. Ensure the allocated ID isxa_erase()d on all failure paths before freeingpriv.
cleanup_xrs:
/* Releases XRS and partition (ve2_mgmt_destroy_partition calls ve2_xrs_release). */
ve2_mgmt_destroy_partition(hwctx);
cleanup_priv:
kfree(hwctx->priv);
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Saif <saifuddin.kaijar@amd.com>
No description provided.