feat(clp-s::ffi::sfa): Add ClpArchiveDecoder for iterating log events from decompression and search.#2163
feat(clp-s::ffi::sfa): Add ClpArchiveDecoder for iterating log events from decompression and search.#2163Bill-hbrhbr wants to merge 3 commits intoy-scope:mainfrom
ClpArchiveDecoder for iterating log events from decompression and search.#2163Conversation
WalkthroughIntroduces a new Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant ClpArchiveReader
participant ClpArchiveDecoder
participant SchemaReader as SchemaReader(s)
participant LogEventStorage as m_log_events
Client->>ClpArchiveReader: decode_all()
ClpArchiveReader->>ClpArchiveDecoder: create(reader)
ClpArchiveDecoder->>ClpArchiveReader: Read all tables
ClpArchiveDecoder->>SchemaReader: Determine log ordering availability
alt Log Ordering Available
SchemaReader-->>ClpArchiveDecoder: has_log_order = true
else No Log Ordering
SchemaReader-->>ClpArchiveDecoder: has_log_order = false
end
ClpArchiveDecoder-->>ClpArchiveReader: Return Result<Decoder>
Client->>ClpArchiveDecoder: collect_log_events()
loop While events remain
ClpArchiveDecoder->>ClpArchiveDecoder: get_next_log_event()
alt has_log_order
ClpArchiveDecoder->>SchemaReader: decode_next_log_event_in_order()
else
ClpArchiveDecoder->>SchemaReader: decode_next_log_event()
end
SchemaReader->>ClpArchiveDecoder: LogEvent
ClpArchiveDecoder->>LogEventStorage: Append event
end
ClpArchiveDecoder-->>Client: Return span<LogEvent const>
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/ffi/sfa/ClpArchiveDecoder.cpp`:
- Around line 96-107: In ClpArchiveDecoder::append_next_log_event, avoid copying
the local message when adding to m_log_events by moving it into the LogEvent;
update the emplace_back call that currently passes message as an lvalue to use
std::move(message) so the temporary local string is moved into the new LogEvent
(leave timestamp and log_event_idx unchanged).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4fee67aa-6cc1-4d53-b5e5-ad265d62d50b
📒 Files selected for processing (8)
components/core/src/clp_s/ffi/CMakeLists.txtcomponents/core/src/clp_s/ffi/sfa/ClpArchiveDecoder.cppcomponents/core/src/clp_s/ffi/sfa/ClpArchiveDecoder.hppcomponents/core/src/clp_s/ffi/sfa/ClpArchiveReader.cppcomponents/core/src/clp_s/ffi/sfa/ClpArchiveReader.hppcomponents/core/src/clp_s/ffi/sfa/LogEvent.hppcomponents/core/src/clp_s/ffi/sfa/SfaErrorCode.cppcomponents/core/src/clp_s/ffi/sfa/SfaErrorCode.hpp
| int64_t log_event_idx{0}; | ||
|
|
||
| if (table.get_next_message_with_metadata(message, timestamp, log_event_idx)) { | ||
| m_log_events.emplace_back(message, timestamp, log_event_idx); |
There was a problem hiding this comment.
| m_log_events.emplace_back(message, timestamp, log_event_idx); | |
| m_log_events.emplace_back(std::move(message), timestamp, log_event_idx); |
There was a problem hiding this comment.
Some comments.
One high level question:
It might be best to transform the decoder to an iterator. That way, get_next_log_event doesn't need to copy and return a LogEvent from the internal vector.
The behaviour of collect_log_events is also a concern. I haven't looked into it in detail, but it appears that if a user calls get_next_log_event first and then collect_log_events, and get_next_log_event fails, collect_log_events might return a span of LogEvents that silently skips the failed event.
With an iterator, the user could simply collect LogEvents into a vector themselves, avoiding this issue.
| * error code indicating the failure: | ||
| * - Forwards `ClpArchiveDecoder::create`'s return values on failure. | ||
| */ | ||
| [[nodiscard]] auto decode_all() -> ystdlib::error_handling::Result<ClpArchiveDecoder>; |
There was a problem hiding this comment.
- Is this function const?
- Is the argument of create const?
[[nodiscard]] static auto create(ClpArchiveReader const& reader) - Can we return
ClpArchiveDecoder{
m_archive_reader->read_all_tables(),
m_archive_reader->has_log_order()
}
inside auto decode_all() directly to avoid friend class ClpArchiveDecoder?
| } catch (std::bad_alloc const&) { | ||
| SPDLOG_ERROR("Failed to create ClpArchiveDecoder: out of memory."); | ||
| return SfaErrorCode{SfaErrorCodeEnum::NoMemory}; | ||
| } catch (std::exception const& ex) { |
There was a problem hiding this comment.
Not sure if catching bad_alloc here and in other functions, and having a dedicated SfaErrorCodeEnum::NoMemory is necessarily.
|
|
||
| #include <ystdlib/error_handling/Result.hpp> | ||
|
|
||
| #include "ClpArchiveDecoder.hpp" |
There was a problem hiding this comment.
should this be a forward declaration instead?
| namespace clp_s::ffi::sfa { | ||
| class LogEvent { | ||
| public: | ||
| LogEvent(std::string message, int64_t timestamp, int64_t log_event_idx) |
There was a problem hiding this comment.
| LogEvent(std::string message, int64_t timestamp, int64_t log_event_idx) | |
| explicit LogEvent(std::string message, int64_t timestamp, int64_t log_event_idx) |
| * Error code enum for SFA API operations. | ||
| */ | ||
| enum class SfaErrorCodeEnum : uint8_t { | ||
| Failure, |
There was a problem hiding this comment.
How about
| Failure, | |
| DecodeFailure, |
| std::shared_ptr<clp_s::SchemaReader> next_table; | ||
| int64_t next_log_event_idx{0}; | ||
| bool found_next_table{false}; | ||
|
|
||
| for (auto const& table : m_tables) { | ||
| if (table->done()) { | ||
| continue; | ||
| } | ||
|
|
||
| auto const table_log_event_idx{table->get_next_log_event_idx()}; | ||
| if (false == found_next_table || table_log_event_idx < next_log_event_idx) { | ||
| next_table = table; | ||
| next_log_event_idx = table_log_event_idx; | ||
| found_next_table = true; | ||
| } | ||
| } | ||
|
|
||
| if (false == found_next_table) { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
| std::shared_ptr<clp_s::SchemaReader> next_table; | |
| int64_t next_log_event_idx{0}; | |
| bool found_next_table{false}; | |
| for (auto const& table : m_tables) { | |
| if (table->done()) { | |
| continue; | |
| } | |
| auto const table_log_event_idx{table->get_next_log_event_idx()}; | |
| if (false == found_next_table || table_log_event_idx < next_log_event_idx) { | |
| next_table = table; | |
| next_log_event_idx = table_log_event_idx; | |
| found_next_table = true; | |
| } | |
| } | |
| if (false == found_next_table) { | |
| return false; | |
| } | |
| std::shared_ptr<clp_s::SchemaReader> next_table; | |
| int64_t next_log_event_idx{INT64_MAX}; | |
| for (auto const& table : m_tables) { | |
| if (table->done()) { | |
| continue; | |
| } | |
| auto const table_log_event_idx{table->get_next_log_event_idx()}; | |
| if (table_log_event_idx < next_log_event_idx) { | |
| next_table = table; | |
| next_log_event_idx = table_log_event_idx; | |
| } | |
| } | |
| if (nullptr == next_table) { | |
| return false; | |
| } |
- Does removing
found_next_tablemake this function more understandable? - The function looks correct to me but I'm not fully confident. Best to ask @gibber9809 to take a look.
Description
This PR adds
ClpArchiveDecoderas the utility class for iterating log events decoded from archives. The decoder is created fromClpArchiveReader, and supports both in-order and out-of-order decoding.Once log events are decoded, they are cached inside the decoder to avoid repeating decoding work.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes