feat(clp-s::log_converter): Add max-log-event-size argument; Make default max log event size 512 MiB to match clp-s (fixes #2176).#2193
Conversation
…MiB to match clp-s.
WalkthroughAdds a configurable maximum log event/buffer size: new CLI option Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/core/src/clp_s/log_converter/LogConverter.cpp (1)
138-143:⚠️ Potential issue | 🟠 MajorClamp buffer growth to the configured ceiling.
Many valid
--max-log-event-sizevalues are unreachable here. With a 96 MiB limit, the buffer still tops out at 64 MiB because the next growth step jumps to 128 MiB and returnsresult_out_of_range.2 * m_buffer.size()can also wrap before the comparison when the configured ceiling is very large. Grow to the smaller of2xandm_max_buffer_size, and only fail once the buffer is already at the cap.Suggested fix
- size_t const new_size{2 * m_buffer.size()}; - if (new_size > m_max_buffer_size) { + auto const current_size{m_buffer.size()}; + if (current_size >= m_max_buffer_size) { return std::errc::result_out_of_range; } + size_t const new_size{ + current_size > m_max_buffer_size - current_size ? m_max_buffer_size + : current_size + current_size + }; ystdlib::containers::Array<char> new_buffer(new_size);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp_s/log_converter/LogConverter.cpp` around lines 138 - 143, The current growth logic computes new_size as 2 * m_buffer.size() which can overflow and skips reachable sizes below m_max_buffer_size; change it to first check if m_buffer.size() >= m_max_buffer_size and return std::errc::result_out_of_range only when already at the cap, otherwise compute new_size safely as the smaller of 2*m_buffer.size() and m_max_buffer_size (avoid overflow by testing m_buffer.size() > m_max_buffer_size/2 and using m_max_buffer_size in that case), then allocate the new ystdlib::containers::Array<char> with that new_size and memcpy m_num_bytes_buffered bytes as before (symbols: m_buffer, m_max_buffer_size, m_num_bytes_buffered).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/log_converter/LogConverter.hpp`:
- Around line 20-22: The constructor for LogConverter currently always
initializes m_buffer with cDefaultBufferSize which allows records larger than a
configured max; update the LogConverter(size_t max_buffer_size) constructor to
enforce the limit by sizing m_buffer from the provided max_buffer_size (or the
lesser of max_buffer_size and cDefaultBufferSize) and/or validate/reject
max_buffer_size below cDefaultBufferSize by setting m_max_buffer_size and
throwing or asserting; specifically change the initialization logic that sets
m_buffer(cDefaultBufferSize) and adjust handling of m_max_buffer_size so the
initial buffer cannot exceed the configured max and small limits are enforced at
construction time.
---
Outside diff comments:
In `@components/core/src/clp_s/log_converter/LogConverter.cpp`:
- Around line 138-143: The current growth logic computes new_size as 2 *
m_buffer.size() which can overflow and skips reachable sizes below
m_max_buffer_size; change it to first check if m_buffer.size() >=
m_max_buffer_size and return std::errc::result_out_of_range only when already at
the cap, otherwise compute new_size safely as the smaller of 2*m_buffer.size()
and m_max_buffer_size (avoid overflow by testing m_buffer.size() >
m_max_buffer_size/2 and using m_max_buffer_size in that case), then allocate the
new ystdlib::containers::Array<char> with that new_size and memcpy
m_num_bytes_buffered bytes as before (symbols: m_buffer, m_max_buffer_size,
m_num_bytes_buffered).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9137d52b-35d4-4a05-b0f6-1f558fd702ab
📒 Files selected for processing (5)
components/core/src/clp_s/log_converter/CommandLineArguments.cppcomponents/core/src/clp_s/log_converter/CommandLineArguments.hppcomponents/core/src/clp_s/log_converter/LogConverter.cppcomponents/core/src/clp_s/log_converter/LogConverter.hppcomponents/core/src/clp_s/log_converter/log_converter.cpp
components/core/src/clp_s/log_converter/CommandLineArguments.cpp
Outdated
Show resolved
Hide resolved
| explicit LogConverter(size_t max_buffer_size) | ||
| : m_buffer(cDefaultBufferSize), | ||
| m_max_buffer_size{max_buffer_size} {} |
There was a problem hiding this comment.
Shall we check that the max buffer size is at least cDefaultBufferSize?
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/log_converter/CommandLineArguments.cpp`:
- Around line 198-200: The check against m_max_log_event_size should explicitly
test for zero because it's declared as size_t (unsigned); in
CommandLineArguments.cpp replace the condition "m_max_log_event_size <= 0" with
an explicit "m_max_log_event_size == 0" to make intent clear and avoid
misleading unsigned comparisons, leaving the thrown std::invalid_argument
message and surrounding logic in the constructor/validation unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 79f20945-6939-4893-b130-7970f4f07c05
📒 Files selected for processing (1)
components/core/src/clp_s/log_converter/CommandLineArguments.cpp
LinZhihao-723
left a comment
There was a problem hiding this comment.
For the PR title, how about:
feat(clp-s::log_converter): Add `max-log-event-size` argument; Make default max log event size 512 MiB to match `clp-s` (fixes #2176).
I think this PR is adding a feature, not strictly fixing an issue...
max-log-event-size argument; make default max log event size 512 MiB to match clp-s (fixes #2176).max-log-event-size argument; Make default max log event size 512 MiB to match clp-s (fixes #2176).
…MiB to match clp-s.
Description
This PR brings
log-converterin line withclp-sfor maximum log event size. We add a command line option tolog-converterto make it configurable, and we set the default to 512MiB in order to matchclp-s.It may be worth adding a per-compression-job config option to allow users to raise this limit for datasets they know will have large log lines, but that should be part of a separate PR.
Checklist
breaking change.
Validation performed
log-converternow accepts these records by default.--max-log-event-sizeto a value lower than 64MiB results in the two generated files failing conversion, as expected.Summary by CodeRabbit
New Features
Behavior Change