ringbuf: add zero-copy read with deferred consumer advancement#1968
ringbuf: add zero-copy read with deferred consumer advancement#1968orishuss wants to merge 1 commit intocilium:mainfrom
Conversation
2c99f76 to
da08000
Compare
da08000 to
609e92e
Compare
|
|
||
| err := rr.readRecordZeroCopy(rec) | ||
| if err != nil { | ||
| return err |
There was a problem hiding this comment.
If we get an error here and don't call rr.advance() don't we get stuck? What is the idea for recovery?
There was a problem hiding this comment.
good point - an error after isDiscard=true records will not advance. fixing
| // to the next record. | ||
| atomic.StoreUintptr(rr.cons_pos, cons) | ||
| rr.pendingCons = cons | ||
| rr.hasPending = true |
There was a problem hiding this comment.
What clears hasPending if we never reach a non-discard record?
There was a problem hiding this comment.
In the event that all records in readRecordZeroCopy are isDiscard=true, advance must stil be called by the user. the user's readWait still waits for a record, so it should still get it eventually and the user may successsfully get a non-discard record.
However, even if an error is returned to the user - for example a timeout - Commit must still be called. This is a very important detail that must be added to the documentation.
At first when I read your comment I wanted to address it by simply advancing the ring buffer for isDiscard=true records and for errors, but this can't be done, because there may be user pointer to the internal ring buffer, which wasn't freed (Commit wasn't called).
ringbuf/reader.go
Outdated
|
|
||
| // Set by readRecordZeroCopy to prevent readRecord from reusing | ||
| // RawSample as a write destination (it points into read-only mmap memory). | ||
| zeroCopy bool |
There was a problem hiding this comment.
Embedding call-scoped state in a user-visible type creates hidden coupling. A Record reused across two different Reader instances carries a zeroCopy flag from reader A into reader B, silently changing allocation behavior. The pending-commit state belongs to the reader or ring, not the record.
There was a problem hiding this comment.
This was exactly my intention.
If a record points to kernel memory, it must not be written to. This is a property of the buf [] (record).
Indeed, if someone uses a raw record to read without copying, and then uses the same record on another ring buffer, I would want an allocation, because the old memory isn't writable.
This field should be renamed - to something like kernelBacked.
| } | ||
|
|
||
| // Commits the pending consumer position from readRecordZeroCopy calls. | ||
| func (rr *ringReader) advance() { |
There was a problem hiding this comment.
After atomic.StoreUintptr() returns, the Kernel may overwrite the mmap'd region immediately. Any RawSample slice handed out by ReadZeroCopy is now a dangling view into potentially live Kernel memory. The documentation warns callers, but nothing prevents silent use-after-commit — Go's GC keeps the mapping alive while rec.RawSample is reachable, so there's no crash, just corrupt data.
There was a problem hiding this comment.
Yes :)
This will be a problem in every implementation of such a feature.
This may be "hidden" because of the function name. May be ReadZeroCopy should be ReadUnsafe?
|
Okay, here's my ideal scenario as someone watching this from a distance (and this now being the 3rd proposal 🙂):
@florianl @orishuss @jschwinger233 WDYT? |
d7805dc to
1c6b21c
Compare
|
Hi @ti-mo, I like your idea. A separation between Does this seem sufficient, or would you still like a separation to two readers? |
Signed-off-by: Ori Shussman <orishuss@gmail.com>
1c6b21c to
d1f0ef5
Compare
|
About concurrency, I can think of 2 ways forward:
|
Summary
Adds
ReadZeroCopyandCommittoringbuf.Reader, enabling zero-copy reads from the mmap'd ring buffer with batched consumer position advancement.Motivation
ReadIntocopies every record from mmap into a user buffer and advances the consumer position per-record via atomic store. In high-throughput scenarios (100K+ events/sec), the memmove and per-record atomics become a measurable CPU cost.This change allows callers to:
CommitChanges
Reader.ReadZeroCopy(rec *Record) error— setsrec.RawSampleto a slice of the mmap'd ring, does not advance consumer positionReader.Commit()— single atomic store to release all consumed spacereadRecordrefactored to delegate toreadRecordZeroCopy+ copy + advance (no duplication)ReadIntoandReadZeroCopyshare poll logic viareadWaithelperringReaderdelegation (same as Linux)API
Existing
Read/ReadIntosemantics are unchanged.Related
This PR differs by deferring consumer advancement, eliminating per-record atomic stores in addition to the copy.