From 28edff18d961ecf18a4aa7e703bb49aeffeb448f Mon Sep 17 00:00:00 2001
From: Flavien Solt
Date: Sat, 9 May 2026 00:19:18 +0800
Subject: [PATCH] fix(load_unit): move flush_i branch after ldbuf_w to prevent phantom writeback
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In ldbuf_comb, the flush_i "set-all-flushed" branch appeared before the
ldbuf_w "allocate-slot" branch. When both are asserted in the same cycle
(ldbuf_w = data_req & data_gnt, flush_i = flush_ex_o from the
controller), the allocation write wins for the new slot, because within
a procedural block the lexically last blocking assignment takes effect:

    if (flush_i) ldbuf_flushed_d = '1;            // (a)
    ...
    if (ldbuf_w) ldbuf_flushed_d[windex] = 1'b0;  // (b) overrides (a) for windex

Result: flushed_q[windex]=0 and valid_q[windex]=1, i.e. a slot that
survived the flush. When data_rvalid arrives N cycles later, the FSM is
in IDLE (kill_req=0) and ldbuf_flushed_q[slot]=0, so valid_o=1. The
flushed load's result is forwarded to scoreboard writeback with a
potentially recycled trans_id, corrupting an unrelated instruction's
result.

Trigger: an exception or fence commits (flush_ex_o=1 → flush_i=1) while
the load unit is in WAIT_GNT and the dcache arbiter grants in the same
cycle. A concrete sequence: an older store takes a store page fault at
the commit head while a younger speculative load is held in WAIT_GNT
waiting for the dcache arbiter. If the arbiter grants on the same cycle
the fault commits, flush_i and data_gnt are high simultaneously.

Fix: move if (flush_i) to after if (ldbuf_w) so that the flush
unconditionally wins for every slot, including the one just allocated.

Verified with a standalone Verilator simulation of the isolated
ldbuf_comb / ldbuf_ff logic.
Pre-fix:  valid_o=1 on rvalid (BUG-CONFIRMED).
Post-fix: valid_o=0 (correctly suppressed).
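
The ordering issue described in this message can be reproduced with a
minimal sketch along the following lines. This is not the CVA6 source
and not the testbench used for verification: the module name
ldbuf_model, the FLUSH_LAST and NUM_SLOTS parameters, the reset values,
and the testbench tb are all assumptions made for illustration. It
models only the flushed/valid bookkeeping, leaving out the read port,
the FSM, and valid_o.

```systemverilog
// Hypothetical, simplified model of the ldbuf_comb / ldbuf_ff pair.
// FLUSH_LAST selects where the flush_i branch sits relative to ldbuf_w.
module ldbuf_model #(
    parameter bit          FLUSH_LAST = 1'b1,  // 1: post-fix order, 0: pre-fix order
    parameter int unsigned NUM_SLOTS  = 2      // assumed buffer depth
) (
    input  logic                         clk_i,
    input  logic                         rst_ni,
    input  logic                         flush_i,
    input  logic                         ldbuf_w,       // slot allocation (data_req & data_gnt)
    input  logic [$clog2(NUM_SLOTS)-1:0] ldbuf_windex,
    output logic [NUM_SLOTS-1:0]         ldbuf_flushed_q,
    output logic [NUM_SLOTS-1:0]         ldbuf_valid_q
);
  logic [NUM_SLOTS-1:0] ldbuf_flushed_d, ldbuf_valid_d;

  always_comb begin : ldbuf_comb
    ldbuf_flushed_d = ldbuf_flushed_q;
    ldbuf_valid_d   = ldbuf_valid_q;
    // Pre-fix position: a later write to [windex] can override this.
    if (!FLUSH_LAST && flush_i) ldbuf_flushed_d = '1;
    if (ldbuf_w) begin
      ldbuf_flushed_d[ldbuf_windex] = 1'b0;
      ldbuf_valid_d[ldbuf_windex]   = 1'b1;
    end
    // Post-fix position: last assignment, so the flush wins for all slots.
    if (FLUSH_LAST && flush_i) ldbuf_flushed_d = '1;
  end

  always_ff @(posedge clk_i or negedge rst_ni) begin : ldbuf_ff
    if (!rst_ni) begin
      ldbuf_flushed_q <= '0;  // reset values are an assumption
      ldbuf_valid_q   <= '0;
    end else begin
      ldbuf_flushed_q <= ldbuf_flushed_d;
      ldbuf_valid_q   <= ldbuf_valid_d;
    end
  end
endmodule

// Hypothetical testbench: asserts flush_i and ldbuf_w in the same cycle
// and prints the flushed flag of the newly allocated slot 0.
module tb;
  logic       clk = 1'b0, rst_n = 1'b0, flush = 1'b0, w = 1'b0;
  logic [0:0] windex = '0;
  logic [1:0] flushed_pre, valid_pre, flushed_post, valid_post;

  ldbuf_model #(.FLUSH_LAST(1'b0)) pre_fix (
      .clk_i(clk), .rst_ni(rst_n), .flush_i(flush), .ldbuf_w(w),
      .ldbuf_windex(windex),
      .ldbuf_flushed_q(flushed_pre), .ldbuf_valid_q(valid_pre));

  ldbuf_model #(.FLUSH_LAST(1'b1)) post_fix (
      .clk_i(clk), .rst_ni(rst_n), .flush_i(flush), .ldbuf_w(w),
      .ldbuf_windex(windex),
      .ldbuf_flushed_q(flushed_post), .ldbuf_valid_q(valid_post));

  always #5 clk = ~clk;

  initial begin
    @(negedge clk) rst_n = 1'b1;                    // release reset
    @(negedge clk) begin flush = 1'b1; w = 1'b1; end  // same-cycle flush + allocate
    @(negedge clk) begin flush = 1'b0; w = 1'b0; end
    $display("pre-fix : valid[0]=%b flushed[0]=%b", valid_pre[0], flushed_pre[0]);
    $display("post-fix: valid[0]=%b flushed[0]=%b", valid_post[0], flushed_post[0]);
    $finish;
  end
endmodule
```

Under these assumptions, the pre-fix instance should leave slot 0 valid
with its flushed flag cleared, while the post-fix instance should leave
slot 0 valid with its flushed flag set, which is the condition that
suppresses the later writeback.
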
---
 core/load_unit.sv | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/core/load_unit.sv b/core/load_unit.sv
index d4167b001c..d4931ac875 100644
--- a/core/load_unit.sv
+++ b/core/load_unit.sv
@@ -156,10 +156,6 @@ module load_unit
     ldbuf_flushed_d = ldbuf_flushed_q;
     ldbuf_valid_d = ldbuf_valid_q;
 
-    // In case of flush, raise the flushed flag in all slots.
-    if (flush_i) begin
-      ldbuf_flushed_d = '1;
-    end
     // Free read entry (in the case of fall-through mode, free the entry
     // only if there is no pending load)
     if (ldbuf_r && (!LDBUF_FALLTHROUGH || !ldbuf_w)) begin
@@ -170,6 +166,15 @@ module load_unit
       ldbuf_flushed_d[ldbuf_windex] = 1'b0;
       ldbuf_valid_d[ldbuf_windex] = 1'b1;
     end
+    // In case of flush, raise the flushed flag in all slots, including any
+    // slot just allocated by ldbuf_w in the same cycle. This branch must
+    // appear after ldbuf_w so that flush_i wins the lexical priority contest
+    // for the newly written slot; if it appeared before, a simultaneous
+    // data_gnt would clear flushed_d[windex] back to 0 and the slot would
+    // survive the flush, later producing a phantom valid_o writeback.
+    if (flush_i) begin
+      ldbuf_flushed_d = '1;
+    end
   end
 
   always_ff @(posedge clk_i or negedge rst_ni) begin : ldbuf_ff