feat(api-server): Add GET /usage/compression endpoint. #2197
jackluo923 wants to merge 32 commits into y-scope:main
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough: Adds a GET /usage/compression endpoint with query parsing/validation, new request/response types and validation, a Client method to run an aggregate compression-usage SQL query, a ClientError::InvalidInput variant, case-insensitive CompressionJobStatus parsing, dependency updates, and a composite DB index on (start_time, status).
Sequence Diagram

```mermaid
sequenceDiagram
    actor Client
    participant Handler as HTTP Handler<br/>(routes.rs)
    participant Validator as Params Validator<br/>(client.rs)
    participant APIClient as DB Client Method<br/>(client.rs)
    participant DB as SQL Database
    Client->>Handler: GET /usage/compression?begin=&end=&job_status=&limit=
    Handler->>Validator: Parse to CompressionUsageParams
    Validator->>Validator: Validate timestamps, limit, parse job_status tokens
    alt Validation Fails
        Validator-->>Handler: ClientError::InvalidInput
        Handler-->>Client: 400 Bad Request
    else Validation Passes
        Handler->>APIClient: get_compression_usage(validated_params)
        APIClient->>DB: Execute aggregate query with dynamic IN (...) and time bounds
        DB-->>APIClient: Vec<CompressionUsage>
        APIClient-->>Handler: Query results
        Handler-->>Client: 200 OK + JSON
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/api-server/src/routes.rs`:
- Around line 271-298: The parsing currently allows duplicate statuses which
leads to redundant SQL placeholders; update validate_compression_usage_params to
deduplicate the resolved statuses before returning (e.g., after mapping with
resolve_job_status in validate_compression_usage_params, filter out repeats
while preserving order by tracking seen statuses in a local HashSet or Vec),
ensuring DEFAULT_JOB_STATUSES remains the fallback and that the existing
empty-check and error paths (and references to resolve_job_status and
DEFAULT_JOB_STATUSES) are preserved.
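The suggested deduplication can be sketched with a small stdlib-only helper (the function name and the integer status codes are illustrative, not the PR's actual identifiers):

```rust
use std::collections::HashSet;

// Hypothetical helper: drop repeated status codes while preserving the
// caller's order, so the generated IN (...) clause gets no redundant
// placeholders.
fn dedup_statuses(statuses: &[i32]) -> Vec<i32> {
    let mut seen = HashSet::new();
    statuses.iter().copied().filter(|s| seen.insert(*s)).collect()
}

fn main() {
    // The second occurrences of 1 and 2 are dropped; order is preserved.
    assert_eq!(dedup_statuses(&[1, 2, 2, 1, 3]), vec![1, 2, 3]);
}
```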
In `@components/clp-py-utils/clp_py_utils/initialize-orchestration-db.py`:
- Around line 81-82: The composite index `JOB_START_TIME_STATUS` on (start_time,
status) is correct for get_compression_usage's query pattern but CREATE TABLE IF
NOT EXISTS only adds it for new DBs; add a migration step in
initialize-orchestration-db.py to ensure existing deployments get the index:
detect if the table exists and, if so, run an ALTER TABLE to add INDEX
`JOB_START_TIME_STATUS` (`start_time`, `status`) USING BTREE (guarded to avoid
duplicate-creation errors), or add an explicit migration script that executes
that ALTER TABLE; reference the table creation block and the index name
`JOB_START_TIME_STATUS` when implementing this change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f41ac76a-326c-4e97-af35-71e7afa2beef
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (4)
- components/api-server/Cargo.toml
- components/api-server/src/client.rs
- components/api-server/src/routes.rs
- components/clp-py-utils/clp_py_utils/initialize-orchestration-db.py
Returns per-job compression usage (sizes, duration, task stats) within a configurable epoch-millisecond time window, filtered by job status.
- Comma-separated `job_status` query param (e.g. SUCCEEDED,FAILED,KILLED)
- Defaults to all terminal states (SUCCEEDED, FAILED, KILLED)
- Parameter validation extracted into shared helper for reuse in tests
- Composite index on (start_time, status) for efficient time-range queries
- OpenAPI schema registration for CompressionUsageRow
- Unit tests for validation logic (timestamp ordering, status resolution, edge cases like trailing commas and whitespace)
d5a0f8f to 1685f8a (Compare)
Actionable comments posted: 2
♻️ Duplicate comments (1)
components/clp-py-utils/clp_py_utils/initialize-orchestration-db.py (1)
81-82: ⚠️ Potential issue | 🟠 Major: Backfill this index for upgraded databases.
`CREATE TABLE IF NOT EXISTS` only applies this index on fresh installs. Existing `compression_jobs` tables keep the old definition, so `/usage/compression` can still miss `JOB_START_TIME_STATUS` and fall back to much slower scans until a separate migration adds it. Please add a guarded `ALTER TABLE ... ADD INDEX` path or an explicit migration.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/clp-py-utils/clp_py_utils/initialize-orchestration-db.py` around lines 81 - 82, The new index JOB_START_TIME_STATUS on compression_jobs is only created for fresh installs via CREATE TABLE IF NOT EXISTS, so upgraded DBs can miss it; add a guarded ALTER TABLE path or migration that ensures the index exists: detect whether compression_jobs has an index named JOB_START_TIME_STATUS (query information_schema.STATISTICS or use a safe "ALTER TABLE compression_jobs ADD INDEX JOB_START_TIME_STATUS (start_time, status)" wrapped in an existence check) and create it if missing, placing this logic alongside the existing CREATE TABLE IF NOT EXISTS code (the same initialization routine that references compression_jobs) so upgrades and fresh installs both end up with the index.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/api-server/src/client.rs`:
- Around line 636-640: The SQL currently uses an inner JOIN between
compression_jobs (alias j) and compression_tasks (alias t) which drops jobs that
have no task rows; change the join to a LEFT JOIN (i.e., replace "JOIN
compression_tasks t ON t.job_id = j.id" with "LEFT JOIN compression_tasks t ON
t.job_id = j.id") so jobs with zero tasks or no persisted tasks are retained;
keep the existing SUM(t.duration) aggregation as-is (it will yield NULL for
no-task jobs) so it continues to map to the Option<BigDecimal> return type in
client.rs.
In `@components/api-server/src/routes.rs`:
- Around line 252-265: The endpoint allows unbounded time windows which can
trigger huge materialization in Client::get_compression_usage (which uses
fetch_all); add validation in the request path (e.g., in CompressionUsageParams
deserialization/validation or at the start of the compression_usage handler) to
reject windows larger than a safe maximum (e.g., max_window_ms constant) or
require/implement pagination (limit + cursor) and return a 400 with a clear
message when the window exceeds the limit; update the handler to enforce this
check before forwarding to Client::get_compression_usage so expensive fetch_all
calls cannot be driven by a single request.
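Such a guard could look like the following sketch; the 30-day cap, the constant name, and the `String` error type are assumptions for illustration, not the PR's actual values:

```rust
// Hypothetical cap on the queryable window; the real endpoint would pick
// its own maximum and map the error to a 400 response.
const MAX_WINDOW_MS: i64 = 30 * 24 * 60 * 60 * 1000; // 30 days

fn validate_window(begin_ms: i64, end_ms: i64) -> Result<(), String> {
    if begin_ms > end_ms {
        return Err("begin_timestamp must be <= end_timestamp".to_string());
    }
    if end_ms - begin_ms > MAX_WINDOW_MS {
        return Err(format!("time window exceeds {MAX_WINDOW_MS} ms"));
    }
    Ok(())
}

fn main() {
    assert!(validate_window(0, MAX_WINDOW_MS).is_ok());
    assert!(validate_window(0, MAX_WINDOW_MS + 1).is_err());
    assert!(validate_window(200, 100).is_err());
}
```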
---
Duplicate comments:
In `@components/clp-py-utils/clp_py_utils/initialize-orchestration-db.py`:
- Around line 81-82: The new index JOB_START_TIME_STATUS on compression_jobs is
only created for fresh installs via CREATE TABLE IF NOT EXISTS, so upgraded DBs
can miss it; add a guarded ALTER TABLE path or migration that ensures the index
exists: detect whether compression_jobs has an index named JOB_START_TIME_STATUS
(query information_schema.STATISTICS or use a safe "ALTER TABLE compression_jobs
ADD INDEX JOB_START_TIME_STATUS (start_time, status)" wrapped in an existence
check) and create it if missing, placing this logic alongside the existing
CREATE TABLE IF NOT EXISTS code (the same initialization routine that references
compression_jobs) so upgrades and fresh installs both end up with the index.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 93573497-43a1-4b75-b6a9-07782e7cb234
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (4)
- components/api-server/Cargo.toml
- components/api-server/src/client.rs
- components/api-server/src/routes.rs
- components/clp-py-utils/clp_py_utils/initialize-orchestration-db.py
components/api-server/src/client.rs (Outdated)

```rust
rows.iter()
    .map(|r| {
        Ok(CompressionUsageRow {
```
Instead of using try_get to map MySqlQueryResult to CompressionUsageRow, it might be easier to derive sqlx::FromRow in CompressionUsageRow, and use sqlx::query_as::<_, CompressionUsageRow>("SELECT ...") directly.
Nice suggestion; switched to `sqlx::FromRow`.
components/api-server/src/routes.rs (Outdated)

```rust
/// Validates compression usage parameters and resolves the requested job
/// statuses into their integer codes.
fn validate_compression_usage_params(
```
- The conventional way to validate a serde struct is to create an `UncheckedCompressionUsageQuery` and `impl TryFrom<UncheckedCompressionUsageQuery> for CompressionUsageQuery`, but this will also show `UncheckedCompressionUsageQuery` as the schema name in the openapi docs https://petstore.swagger.io/?url=https://docs.yscope.com/clp/main/_static/generated/api-server-openapi.json#/default/health.
  How about moving this validate function to a method of `CompressionUsageParams`, and adding a comment to explain why:

  ```rust
  impl CompressionUsageParams {
      // explain why not having an UncheckedCompressionUsage
      pub fn validate() -> Result<(), ClientError> {
          // ...
      }
  }
  ```

- If you adopt `CompressionJobStatus`, then the validation of job status can be removed / simplified to only checking `empty`.
Already adopted `CompressionJobStatus`.
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
♻️ Duplicate comments (1)
components/api-server/src/client.rs (1)
51-79: 🧹 Nitpick | 🔵 Trivial: Consider appending 'Z' to the ISO 8601 date format to indicate UTC.
The doc comments correctly state times are in UTC, but the actual `DATE_FORMAT` in the SQL query (lines 630-631) produces `'%Y-%m-%dT%H:%i:%s'`, which lacks the 'Z' suffix. API consumers may assume local time without an explicit timezone indicator. Since the struct documents these as UTC, the format should reflect that.
Suggested SQL format change:

```diff
- DATE_FORMAT(j.creation_time, '%Y-%m-%dT%H:%i:%s') AS creation_time, \
- DATE_FORMAT(j.start_time, '%Y-%m-%dT%H:%i:%s') AS start_time, \
+ CONCAT(DATE_FORMAT(j.creation_time, '%Y-%m-%dT%H:%i:%s'), 'Z') AS creation_time, \
+ CONCAT(DATE_FORMAT(j.start_time, '%Y-%m-%dT%H:%i:%s'), 'Z') AS start_time, \
```

Alternatively, return epoch milliseconds as discussed in the past review comment, which avoids timezone ambiguity entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/api-server/src/client.rs` around lines 51 - 79, The struct CompressionUsage documents creation_time and start_time as ISO 8601 UTC but the SQL DATE_FORMAT used to populate those fields omits the trailing 'Z'; update the query that builds creation_time and start_time so the string explicitly indicates UTC (e.g., append 'Z' to the DATE_FORMAT output) or switch to returning epoch milliseconds instead (so CompressionUsage.creation_time and CompressionUsage.start_time are either ISO strings with 'Z' appended or epoch ms numbers consistently), and ensure any code expecting those fields (the query that maps to CompressionUsage) is updated to match the chosen format.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@components/api-server/src/client.rs`:
- Around line 51-79: The struct CompressionUsage documents creation_time and
start_time as ISO 8601 UTC but the SQL DATE_FORMAT used to populate those fields
omits the trailing 'Z'; update the query that builds creation_time and
start_time so the string explicitly indicates UTC (e.g., append 'Z' to the
DATE_FORMAT output) or switch to returning epoch milliseconds instead (so
CompressionUsage.creation_time and CompressionUsage.start_time are either ISO
strings with 'Z' appended or epoch ms numbers consistently), and ensure any code
expecting those fields (the query that maps to CompressionUsage) is updated to
match the chosen format.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 32b4e73a-bd16-4ce4-819d-d2d2adccadbf
📒 Files selected for processing (2)
- components/api-server/src/client.rs
- components/api-server/src/routes.rs
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/api-server/src/client.rs`:
- Around line 649-697: The public method get_compression_usage currently allows
an empty job_statuses slice which produces an invalid SQL "IN ()" clause; add an
early input validation at the start of get_compression_usage that returns
Err(ClientError::InvalidInput(...)) when job_statuses.is_empty() to reject the
caller input (so callers get a structured error instead of a DB error), before
constructing placeholders and binding values; reference the
get_compression_usage function and ensure the error variant used matches other
client validations (e.g. ClientError::InvalidInput with a clear message like
"job_statuses cannot be empty").
In `@components/api-server/src/routes.rs`:
- Around line 334-559: Add regression tests exercising
CompressionUsageParams::validate for invalid limits by extending the existing
test module that uses test_app() and the "/usage/compression" route: add two new
#[tokio::test] functions (e.g., reject_zero_limit and reject_negative_limit)
that call app.oneshot with query strings containing
begin_timestamp=0&end_timestamp=100&limit=0 and limit=-1 respectively, assert
the response status is StatusCode::BAD_REQUEST, and assert the response body
contains the validation error text referencing limit (e.g., "limit must be > 0"
or similar) so the new guard in CompressionUsageParams::validate cannot regress
silently.
In `@components/clp-rust-utils/Cargo.toml`:
- Line 19: Update and commit the Rust lockfile to reflect the manifest changes:
regenerate Cargo.lock locally (e.g., run cargo fetch or cargo build) so the
change to the strum dependency line 'strum = { version = "0.28.0", features =
["derive"] }' in components/clp-rust-utils/Cargo.toml and the added
dev-dependencies in components/api-server/Cargo.toml are captured, then add and
commit the updated Cargo.lock to this PR so the CI task deps:lock:check-rust
(which uses cargo fetch --locked) will pass.
In `@tools/yscope-dev-utils`:
- Line 1: The submodule tools/yscope-dev-utils points to a non-existent commit
(6ec46856cfda8cbfb65d9386affbe3bc429186cb) and
tools/scripts/deps-download/init.sh still references the old SHA
(38bf51effc500a528c37052345c141a65ea88447); either push the missing commit to
the remote or update the submodule reference to a valid commit SHA in the
repository (fix the entry for tools/yscope-dev-utils in the gitlink), and ensure
tools/scripts/deps-download/init.sh is updated to match that same valid SHA so
both the submodule pointer and init.sh reference the same existing commit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 185885cf-8999-4d5e-9129-4b3ba9999004
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (7)
- components/api-server/Cargo.toml
- components/api-server/src/client.rs
- components/api-server/src/error.rs
- components/api-server/src/routes.rs
- components/clp-rust-utils/Cargo.toml
- components/clp-rust-utils/src/job_config/compression.rs
- tools/yscope-dev-utils
d064a79 to 55b298f (Compare)
- Return epoch-millisecond timestamps instead of ISO 8601 strings
- Add `limit` query parameter (default 1000) to cap result rows
- Remove `bigdecimal` dependency; use `f64` with `ROUND()` in SQL
- Use `sqlx::FromRow` derive on `CompressionUsage` (removes manual `try_get` mapping) and `query_as` replacing manual row iteration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…s parsing

Replace custom `JOB_STATUS_LOOKUP` table and `resolve_job_status()` with strum's `EnumString` derive on `CompressionJobStatus`. This gives us case-insensitive `FromStr` parsing (e.g. "succeeded", "SUCCEEDED" both work) and eliminates a manually-maintained string-to-int mapping that could drift out of sync with the enum. `Pending` is excluded from parsing via `#[strum(disabled)]` since it has no `start_time`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
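A hand-rolled, stdlib-only sketch of the behavior strum's `EnumString` derive with `#[strum(ascii_case_insensitive)]` provides (variant set abridged; `Pending` omitted to mirror `#[strum(disabled)]`):

```rust
use std::str::FromStr;

#[derive(Debug, PartialEq, Clone, Copy)]
enum CompressionJobStatus {
    Succeeded,
    Failed,
    Killed,
}

// Case-insensitive parsing, equivalent in spirit to the derived FromStr.
impl FromStr for CompressionJobStatus {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "succeeded" => Ok(Self::Succeeded),
            "failed" => Ok(Self::Failed),
            "killed" => Ok(Self::Killed),
            other => Err(format!("Unknown job_status: {other}")),
        }
    }
}

fn main() {
    assert_eq!("SUCCEEDED".parse::<CompressionJobStatus>().unwrap(), CompressionJobStatus::Succeeded);
    assert_eq!("succeeded".parse::<CompressionJobStatus>().unwrap(), CompressionJobStatus::Succeeded);
    assert!("unknown".parse::<CompressionJobStatus>().is_err());
}
```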
…lient

Move `CompressionUsageParams`, `default_limit`, and the validation logic from `routes.rs` into `client.rs` so that `Client` owns its input types and validation. Any future caller of `Client` (CLI, library, etc.) gets the same validated interface without depending on the HTTP layer.
- Add `ClientError::InvalidInput` for validation errors
- `CompressionUsageParams::validate()` replaces standalone function
- `HandlerError` maps `InvalidInput` to `BadRequest` in routes.rs
- Uses `validate()` method rather than the `TryFrom<Unchecked>` pattern to avoid leaking an internal name into the `OpenAPI` schema

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3495161 to 5e0b6a1 (Compare)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the intermediate `job_status_clause` variable and inline
`AND j.status IN ({placeholders})` directly into the SQL. The dynamic
placeholder generation is still needed for the variable-length IN
clause.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
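The variable-length placeholder generation described above can be sketched as follows (SQL abridged and illustrative, not the PR's exact query):

```rust
// Build "?, ?, ?" for an n-element IN clause.
fn in_clause_placeholders(n: usize) -> String {
    vec!["?"; n].join(", ")
}

fn main() {
    let sql = format!(
        "SELECT j.id FROM compression_jobs j WHERE j.status IN ({})",
        in_clause_placeholders(3)
    );
    assert!(sql.ends_with("IN (?, ?, ?)"));
}
```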
Add reject_zero_limit and reject_negative_limit tests to ensure CompressionUsageParams::validate correctly rejects non-positive limit values. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When job_statuses is empty, omit the IN clause entirely instead of producing invalid SQL (`AND j.status IN ()`). An empty slice now means "no filter" — return all statuses. The HTTP layer's validate() still rejects explicitly empty job_status inputs (e.g. `job_status=,,,`) as malformed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use #[rustfmt::skip] on the SQL format!() call so the project's `format_strings = true` setting doesn't collapse the query into a single-line blob. Each SQL clause is now on its own line for readability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
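A minimal illustration of that layout (the query text is abridged, not the PR's exact SQL); `#[rustfmt::skip]` keeps rustfmt's `format_strings = true` setting from collapsing the continuation lines:

```rust
#[rustfmt::skip]
fn usage_query(placeholders: &str) -> String {
    // Backslash continuations keep the literal a single line at runtime
    // while the source stays one SQL clause per line.
    format!(
        "SELECT j.id, MAX(j.status) AS status \
         FROM compression_jobs j \
         LEFT JOIN compression_tasks t ON t.job_id = j.id \
         WHERE j.start_time >= ? AND j.start_time <= ? \
         AND j.status IN ({placeholders}) \
         GROUP BY j.id"
    )
}

fn main() {
    let sql = usage_query("?, ?");
    assert!(sql.starts_with("SELECT j.id"));
    assert!(sql.contains("IN (?, ?)"));
}
```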
Group only by j.id (primary key) and wrap other non-aggregated columns in ANY_VALUE(). Since all selected columns are functionally dependent on j.id, this is correct under ONLY_FULL_GROUP_BY and avoids a lengthy GROUP BY list. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tate

Make `get_compression_usage` accept a `&ValidatedCompressionUsageParams` instead of individual parameters. The validated struct is only constructable through `CompressionUsageParams::validate()`, so callers cannot bypass validation at the type level. Also clarify the `limit` parameter description in the OpenAPI schema to state "Must be > 0; defaults to 1000."

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`CREATE TABLE IF NOT EXISTS` skips entirely when the table already exists, so the `JOB_START_TIME_STATUS` index added in this PR would never be created on existing deployments. Add an `ALTER TABLE ADD INDEX` that tolerates "Duplicate key name" errors, making it safe to run on both fresh and existing databases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/api-server/src/client.rs`:
- Around line 91-108: Validation currently enforces that
ValidatedCompressionUsageParams.job_statuses is non-empty, so the empty-branch
in get_compression_usage is dead code; remove the conditional that special-cases
an empty job_statuses and simplify the SQL construction to always use the IN
clause with the job_statuses vector (refer to the job_statuses local variable
and get_compression_usage function), ensuring any helper that builds the
parameterized IN list uses job_statuses directly and no longer needs an "omit IN
clause if empty" path.
In `@components/clp-py-utils/clp_py_utils/initialize-orchestration-db.py`:
- Around line 98-100: Replace the brittle string match in the exception handler
that checks "Duplicate key name" with a numeric MySQL error-code check: import
mysql.connector.errorcode (or errorcode) and change the condition to verify
hasattr(err, "errno") and err.errno == errorcode.ER_DUP_KEYNAME before
swallowing the exception; otherwise re-raise. Locate the except block in
initialize-orchestration-db.py (the except Exception as err handler) and apply
the same pattern used elsewhere (e.g., sql_adapter.py) to ensure robust error
detection across MySQL/MariaDB variants.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: fe109b88-109b-40fd-97d1-c8a3b227a79e
⛔ Files ignored due to path filters (2)
- `Cargo.lock` is excluded by `!**/*.lock`
- `docs/src/_static/generated/api-server-openapi.json` is excluded by `!**/generated/**`
📒 Files selected for processing (7)
- components/api-server/Cargo.toml
- components/api-server/src/client.rs
- components/api-server/src/error.rs
- components/api-server/src/routes.rs
- components/clp-py-utils/clp_py_utils/initialize-orchestration-db.py
- components/clp-rust-utils/Cargo.toml
- components/clp-rust-utils/src/job_config/compression.rs
hoophalab left a comment:
LGTM in general. Validated that the endpoint works correctly after removing ANY_VALUE in group by.
components/api-server/src/client.rs (Outdated)

```rust
impl CompressionUsageParams {
    /// Validates the parameters and resolves the requested job statuses into
    /// [`CompressionJobStatus`] variants.
    ///
    /// We use a `validate()` method rather than the conventional
    /// `TryFrom<UncheckedX>` pattern because this struct also serves as the
    /// `OpenAPI` query-parameter schema. Introducing an `Unchecked` wrapper would
    /// leak an internal implementation detail into the generated API docs.
    ///
    /// # Errors
    ///
    /// Returns [`ClientError::InvalidInput`] if:
    /// - `begin_timestamp > end_timestamp`
    /// - `limit <= 0`
    /// - `job_status` contains unrecognized or empty values
    pub fn validate(&self) -> Result<ValidatedCompressionUsageParams, ClientError> {
```
Suggested change:

```diff
-impl CompressionUsageParams {
+impl TryFrom<CompressionUsageParams> for ValidatedCompressionUsageParams {
+    type Error = ClientError;
     /// Validates the parameters and resolves the requested job statuses into
     /// [`CompressionJobStatus`] variants.
     ///
     /// We use a `validate()` method rather than the conventional
     /// `TryFrom<UncheckedX>` pattern because this struct also serves as the
     /// `OpenAPI` query-parameter schema. Introducing an `Unchecked` wrapper would
     /// leak an internal implementation detail into the generated API docs.
     ///
     /// # Errors
     ///
     /// Returns [`ClientError::InvalidInput`] if:
     /// - `begin_timestamp > end_timestamp`
     /// - `limit <= 0`
     /// - `job_status` contains unrecognized or empty values
-    pub fn validate(&self) -> Result<ValidatedCompressionUsageParams, ClientError> {
+    pub fn try_from(value: CompressionUsageParams) -> Result<ValidatedCompressionUsageParams, Self::Error> {
```
AI might misunderstand my comments. If we already create a `ValidatedCompressionUsageParams`, then instead of impl-ing a `validate` method, it might be best to `impl TryFrom<CompressionUsageParams> for ValidatedCompressionUsageParams { ... }`.
Will fix it manually. The doc is stale and the call site needs to be updated.
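The conversion direction the reviewer suggests can be sketched with plain types (fields abridged; the error type is simplified to `String` in place of `ClientError`):

```rust
struct CompressionUsageParams {
    begin_timestamp: i64,
    end_timestamp: i64,
    limit: i64,
}

struct ValidatedCompressionUsageParams {
    begin_timestamp: i64,
    end_timestamp: i64,
    limit: i64,
}

// Validation lives in the conversion, so a ValidatedCompressionUsageParams
// can only be obtained by passing the checks.
impl TryFrom<CompressionUsageParams> for ValidatedCompressionUsageParams {
    type Error = String;

    fn try_from(p: CompressionUsageParams) -> Result<Self, Self::Error> {
        if p.begin_timestamp > p.end_timestamp {
            return Err("begin_timestamp must be <= end_timestamp".into());
        }
        if p.limit <= 0 {
            return Err("limit must be > 0".into());
        }
        Ok(Self {
            begin_timestamp: p.begin_timestamp,
            end_timestamp: p.end_timestamp,
            limit: p.limit,
        })
    }
}

fn main() {
    let bad = CompressionUsageParams { begin_timestamp: 200, end_timestamp: 100, limit: 10 };
    assert!(ValidatedCompressionUsageParams::try_from(bad).is_err());
    let ok = CompressionUsageParams { begin_timestamp: 0, end_timestamp: 100, limit: 10 };
    assert!(ValidatedCompressionUsageParams::try_from(ok).is_ok());
}
```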
components/api-server/src/client.rs (Outdated)

```rust
    // task creation) since they consumed no compute resources.
    "SELECT \
        j.id, \
        ANY_VALUE(j.status) AS status, \
```
Got this error in the log. Not sure whether I misconfigured anything:

```
{"timestamp":"2026-04-12T13:37:11.319164Z","level":"ERROR","fields":{"message":"Failed to fetch compression usage: Sql(Database(MySqlDatabaseError { code: Some(\"42000\"), number: 1305, message: \"FUNCTION clp-db.ANY_VALUE does not exist\" }))"},"filename":"components/api-server/src/routes.rs","line_number":284}
```

We could just remove `ANY_VALUE` if there are no serious compatibility concerns with other databases.
Switched to `MAX()`, which is equivalent to `ANY_VALUE()` on functionally dependent columns and works across MySQL and MariaDB.
```rust
#[repr(i32)]
#[strum(ascii_case_insensitive)]
pub enum CompressionJobStatus {
    #[strum(disabled)]
```
Why disable this? Querying pending jobs is still valid.
You are right: when it is disabled, it disables more than just the default (the only place it should be excluded). If users really want to fetch the resource usage for pending tasks, they should be able to.
```rust
}

#[cfg(test)]
mod tests {
```
Those tests have become meaningless because we've switched to strum.
you are right, removed.
components/api-server/src/client.rs (Outdated)

```rust
    ANY_VALUE(CAST(UNIX_TIMESTAMP(j.creation_time) * 1000 AS SIGNED)) AS creation_time, \
    ANY_VALUE(CAST(UNIX_TIMESTAMP(j.start_time) * 1000 AS SIGNED)) AS start_time, \
    ANY_VALUE(ROUND(j.duration, 3)) AS duration, \
```
- Can we remove `ROUND` and just return the full-precision float value?
- We should be able to bind a `chrono::DateTime<?>` directly and avoid `FROM_UNIXTIME(? / 1000.0)`. We can modify the types in `struct ValidatedCompressionUsageParams` and `struct CompressionUsageParams` using `#[serde(with = "chrono::serde::ts_milliseconds")]`.
- Since we are already using `query_as`, can we change the time fields in `struct CompressionUsage` to type chrono, and set up `#[serde(with = "chrono::serde::ts_milliseconds")]`?
Remove tests that merely verify strum's EnumString derive behavior (valid parsing, case insensitivity, invalid rejection). Keep the test asserting Pending is parseable but excluded from DEFAULT_JOB_STATUSES.
… millis API.

- Convert `ValidatedCompressionUsageParams` timestamps to `DateTime<Utc>`, validated via `timestamp_millis_opt()` for range checking
- Use `DateTime<Utc>` in `CompressionUsage` with `chrono::serde::ts_milliseconds` so JSON serialization remains i64 epoch millis
- Simplify SQL: remove `FROM_UNIXTIME(? / 1000.0)`, `CAST(UNIX_TIMESTAMP(...))`, and `ROUND()` — sqlx binds `DateTime<Utc>` directly and selects raw columns
- Replace MySQL-only `ANY_VALUE()` with cross-compatible `MAX()`
- Add `#[schema(value_type = i64)]` so OpenAPI docs show i64, not datetime
- Add chrono dependency and sqlx chrono feature
Validation already guarantees `job_statuses` is non-empty, so the `if params.job_statuses.is_empty()` branch in `get_compression_usage` was unreachable. Remove it for clarity.
- Remove comment about empty job_statuses branch (removed in prior commit) - Add PENDING = 0 to CompressionUsage.status docstring
Raise an error when duplicates are detected. The migration is already applied.
…sage params. Returns a descriptive error (e.g. "Duplicate job_status: Succeeded") instead of silently deduplicating or accepting duplicate statuses.
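The duplicate check this commit describes can be sketched as follows (hypothetical helper; the real code works on parsed `CompressionJobStatus` values rather than raw strings):

```rust
use std::collections::HashSet;

// Reject repeated statuses with a descriptive error instead of silently
// deduplicating them.
fn check_duplicates(statuses: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for s in statuses {
        if !seen.insert(s.to_ascii_lowercase()) {
            return Err(format!("Duplicate job_status: {s}"));
        }
    }
    Ok(())
}

fn main() {
    assert!(check_duplicates(&["SUCCEEDED", "FAILED"]).is_ok());
    assert_eq!(
        check_duplicates(&["SUCCEEDED", "succeeded"]).unwrap_err(),
        "Duplicate job_status: succeeded"
    );
}
```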
hoophalab left a comment:
LGTM
- Need to rerun `task codegen:openapi` and `task lint:fix-rust`
- How about this title -- adding backticks and a period:
  feat(api-server): Add GET `/usage/compression` endpoint.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/api-server/src/routes.rs`:
- Around line 336-601: Add a test that exercises the duplicate-job-status
validation path: create a new async test function named
reject_duplicate_job_status in the existing tests module that uses test_app()
and sends a GET to
"/usage/compression?begin_timestamp=0&end_timestamp=100&job_status=SUCCEEDED,SUCCEEDED",
then assert the response status is StatusCode::BAD_REQUEST and that the response
body contains "Duplicate job_status". This ensures the validation logic in
client.rs (the duplicate-status check around lines where Duplicate job_status is
produced) and the conversion via ValidatedCompressionUsageParams/try_from are
covered to prevent regressions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 11bc3b4f-a97e-41db-a41f-70d60de723ba
⛔ Files ignored due to path filters (2)
- `Cargo.lock` is excluded by `!**/*.lock`
- `docs/src/_static/generated/api-server-openapi.json` is excluded by `!**/generated/**`
📒 Files selected for processing (1)
components/api-server/src/routes.rs
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/api-server/src/routes.rs`:
- Around line 372-624: Add a new Tokio test (e.g.
reject_out_of_range_epoch_millis) in the same test module that sends a request
to /usage/compression with a begin_timestamp (or end_timestamp) value outside
Chrono's representable epoch-millis range to exercise
ValidatedCompressionUsageParams::try_from's out-of-range branch; assert the
response status is StatusCode::BAD_REQUEST and that the body contains the
validation error text so the HTTP mapping for out-of-range timestamps is
covered.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b966e771-b415-49af-98c7-1cc780adedf7
📒 Files selected for processing (1)
components/api-server/src/routes.rs
```rust
    #[tokio::test]
    async fn reject_begin_greater_than_end() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/usage/compression?begin_timestamp=200&end_timestamp=100")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
        let body = get_body(response).await;
        assert!(body.contains("begin_timestamp must be <= end_timestamp"));
    }

    #[tokio::test]
    async fn reject_unknown_job_status() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri(
                        "/usage/compression?begin_timestamp=0&end_timestamp=100&job_status=UNKNOWN",
                    )
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
        let body = get_body(response).await;
        assert!(body.contains("Unknown job_status: UNKNOWN"));
    }

    #[tokio::test]
    async fn accept_lowercase_job_status() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri(
                        "/usage/compression?begin_timestamp=0&end_timestamp=100&\
                         job_status=succeeded",
                    )
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::OK);
        let body = get_body(response).await;
        assert_eq!(body, "[2]"); // succeeded → 2
    }

    #[tokio::test]
    async fn accept_valid_params_with_defaults() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/usage/compression?begin_timestamp=0&end_timestamp=100")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::OK);
        let body = get_body(response).await;
        assert_eq!(body, "[2,3,4]"); // SUCCEEDED, FAILED, KILLED
    }

    #[tokio::test]
    async fn accept_comma_separated_job_status() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri(
                        "/usage/compression?begin_timestamp=0&end_timestamp=100&\
                         job_status=SUCCEEDED,RUNNING",
                    )
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::OK);
        let body = get_body(response).await;
        assert_eq!(body, "[2,1]"); // SUCCEEDED, RUNNING
    }

    #[tokio::test]
    async fn accept_spaces_around_commas() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri(
                        "/usage/compression?begin_timestamp=0&end_timestamp=100&\
                         job_status=SUCCEEDED%2C+FAILED",
                    )
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::OK);
        let body = get_body(response).await;
        assert_eq!(body, "[2,3]"); // SUCCEEDED, FAILED
    }

    #[tokio::test]
    async fn accept_trailing_comma() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri(
                        "/usage/compression?begin_timestamp=0&end_timestamp=100&\
                         job_status=SUCCEEDED,",
                    )
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::OK);
        let body = get_body(response).await;
        assert_eq!(body, "[2]"); // trailing comma ignored
    }

    #[tokio::test]
    async fn accept_single_job_status() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/usage/compression?begin_timestamp=0&end_timestamp=100&job_status=KILLED")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::OK);
        let body = get_body(response).await;
        assert_eq!(body, "[4]");
    }

    #[tokio::test]
    async fn reject_missing_begin_timestamp() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/usage/compression?end_timestamp=100")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
        let body = get_body(response).await;
        assert!(
            body.contains("begin_timestamp") || body.contains("deserialize"),
            "expected error about missing begin_timestamp, got: {body}"
        );
    }

    #[tokio::test]
    async fn reject_empty_job_status() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/usage/compression?begin_timestamp=0&end_timestamp=100&job_status=")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
        let body = get_body(response).await;
        assert!(
            body.contains("at least one valid status"),
            "expected error about empty job_status, got: {body}"
        );
    }

    #[tokio::test]
    async fn reject_zero_limit() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/usage/compression?begin_timestamp=0&end_timestamp=100&limit=0")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
        let body = get_body(response).await;
        assert!(
            body.contains("limit must be > 0"),
            "expected error about limit, got: {body}"
        );
    }

    #[tokio::test]
    async fn reject_negative_limit() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri("/usage/compression?begin_timestamp=0&end_timestamp=100&limit=-1")
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
        let body = get_body(response).await;
        assert!(
            body.contains("limit must be > 0"),
            "expected error about limit, got: {body}"
        );
    }

    #[tokio::test]
    async fn reject_duplicate_job_status() {
        let app = test_app();
        let response = app
            .oneshot(
                Request::builder()
                    .uri(
                        "/usage/compression?begin_timestamp=0&end_timestamp=100&\
                         job_status=SUCCEEDED,SUCCEEDED",
                    )
                    .body(Body::empty())
                    .unwrap(),
            )
            .await
            .unwrap();
        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
        let body = get_body(response).await;
        assert!(
            body.contains("Duplicate job_status"),
            "expected error about duplicate job_status, got: {body}"
        );
    }
}
```
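The parsing behaviour these tests pin down (case-insensitive tokens, whitespace around commas, ignored trailing commas, rejected duplicates and unknowns) can be sketched in std-only Rust. This is a hypothetical stand-in, not the actual strum-based implementation in `client.rs`:

```rust
// Illustrative sketch of job_status token parsing. `parse_job_status` and this
// local enum are hypothetical names; the real code derives strum::EnumString.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CompressionJobStatus {
    Running,
    Succeeded,
    Failed,
    Killed,
}

fn parse_job_status(raw: &str) -> Result<Vec<CompressionJobStatus>, String> {
    let mut statuses = Vec::new();
    for token in raw.split(',') {
        let token = token.trim();
        if token.is_empty() {
            continue; // trailing commas / empty segments are ignored
        }
        // Case-insensitive match on the token.
        let status = match token.to_ascii_lowercase().as_str() {
            "running" => CompressionJobStatus::Running,
            "succeeded" => CompressionJobStatus::Succeeded,
            "failed" => CompressionJobStatus::Failed,
            "killed" => CompressionJobStatus::Killed,
            _ => return Err(format!("Unknown job_status: {token}")),
        };
        if statuses.contains(&status) {
            return Err(format!("Duplicate job_status: {token}"));
        }
        statuses.push(status);
    }
    if statuses.is_empty() {
        return Err("job_status must contain at least one valid status".to_string());
    }
    Ok(statuses)
}

fn main() {
    assert_eq!(
        parse_job_status("SUCCEEDED, failed,").unwrap(),
        vec![CompressionJobStatus::Succeeded, CompressionJobStatus::Failed]
    );
    assert!(parse_job_status("SUCCEEDED,succeeded").is_err()); // duplicate
    assert!(parse_job_status("UNKNOWN").is_err());
    assert!(parse_job_status("").is_err());
}
```

Deduplicating on the parsed enum rather than the raw token is what makes `SUCCEEDED,succeeded` a duplicate despite the differing case.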
🧹 Nitpick | 🔵 Trivial
Add a regression test for out-of-range epoch millis.
ValidatedCompressionUsageParams::try_from also rejects timestamps outside Chrono’s representable range, but this test module never exercises that 400 path. Covering it here would keep the validation-to-HTTP mapping from regressing silently.
Proposed test

```diff
+    #[tokio::test]
+    async fn reject_out_of_range_timestamp() {
+        let app = test_app();
+        let response = app
+            .oneshot(
+                Request::builder()
+                    .uri(
+                        "/usage/compression?begin_timestamp=9223372036854775807&\
+                         end_timestamp=9223372036854775807",
+                    )
+                    .body(Body::empty())
+                    .unwrap(),
+            )
+            .await
+            .unwrap();
+        assert_eq!(response.status(), StatusCode::BAD_REQUEST);
+        let body = get_body(response).await;
+        assert!(
+            body.contains("timestamp out of representable range"),
+            "expected error about timestamp range, got: {body}"
+        );
+    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/api-server/src/routes.rs` around lines 372 - 624, Add a new Tokio
test (e.g. reject_out_of_range_epoch_millis) in the same test module that sends
a request to /usage/compression with a begin_timestamp (or end_timestamp) value
outside Chrono's representable epoch-millis range to exercise
ValidatedCompressionUsageParams::try_from's out-of-range branch; assert the
response status is StatusCode::BAD_REQUEST and that the body contains the
validation error text so the HTTP mapping for out-of-range timestamps is
covered.
No need to cover this since it's impossible in production; we won't live long enough for it to overflow.
@jackluo923 That's a fair point — i64::MAX milliseconds from the Unix epoch is around year 292,278,994, so the overflow branch is purely theoretical. Happy to drop the suggestion.
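A quick integer-arithmetic sanity check of that figure, assuming an average Gregorian year of 31,556,952 seconds (`approx_year_for_epoch_millis` is an illustrative helper, not project code):

```rust
// Back-of-envelope check: i64::MAX milliseconds from the Unix epoch lands
// around year 292 million, so the overflow branch is purely theoretical.
fn approx_year_for_epoch_millis(millis: i64) -> i64 {
    // 31_556_952 s = average Gregorian year (365.2425 days).
    1970 + millis / 1000 / 31_556_952
}

fn main() {
    // Matches the year cited in the discussion above.
    assert_eq!(approx_year_for_epoch_millis(i64::MAX), 292_278_994);
    assert_eq!(approx_year_for_epoch_millis(0), 1970);
    println!("i64::MAX ms is approximately year {}", approx_year_for_epoch_millis(i64::MAX));
}
```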
🧠 Learnings used
Learnt from: Nathan903
Repo: y-scope/clp PR: 2079
File: components/api-server/src/telemetry.rs:8-8
Timestamp: 2026-03-06T17:46:23.008Z
Learning: Assume the repository tracks the latest stable Rust toolchain (no pinned MSRV). Do not rely on nightly-only features in code; prefer stable APIs such as Duration::from_hours (stable since Rust 1.91). When reviewing code in this area, ensure it aligns with the repo’s clippy configuration: clippy runs on nightly with -D clippy::pedantic and -D clippy::duration_suboptimal_units, so prefer idiomatic duration usage that satisfies these lints and avoids discouraged duration literals.
```rust
    /// Sum of all task durations (CPU-seconds across all parallel workers).
    /// `None` if all task duration values are NULL in the database.
    pub tasks_duration: Option<f64>,
}
```
Sorry, I forgot that the docstrings in this struct are used in the OpenAPI document. We need to make them user-oriented:

- We probably shouldn't reference a `.py` file in `status`'s docstring. Instead, how about changing its type from `i32` to `CompressionJobStatus`, deriving `ToSchema` for `CompressionJobStatus`, and adding docstrings in `CompressionJobStatus`? OpenAPI should be able to reference `enum CompressionJobStatus` directly.
- There is another dev-oriented docstring ("Mirror of ...") on top of `CompressionJobStatus`. We should be able to hide it from OpenAPI by replacing `///` with `//`.
- It might be worth renaming `status` as `job_status` to keep consistent with `struct CompressionUsageParams`. `sqlx::rename` should allow us to use different struct field names and column names.
- It might be worthwhile to rename `start_time` as `time_begin_millisecs` to keep consistent with `QueryConfig`. Also `creation_time` -> `time_creation_millisecs`.
- Recent log ingestor changes also include units in the API field names. So, it's best to rename:
  - `(un)?compressed_size` as `(un)?compressed_size_bytes` or `(un)?compressed_bytes`
  - `.*duration` as `.*duration_secs`
- Can we avoid exposing database details such as `WHERE clause` and "NULL value in database", since this struct is user-oriented?
- Can we change `/// A single row returned by the compression usage query (one row per job).` to something like `/// Resource usage statistics for the compression job with the specified id.`?
components/api-server/src/client.rs
Outdated
```rust
    ///
    /// Note: `Pending` is intentionally excluded because pending jobs have no
    /// `start_time` and are therefore excluded by the time-range WHERE clause.
```
Doesn't this NOTE duplicate "Covers all terminal states that consumed compute resources since PENDING is a non-terminal state"?
components/api-server/src/client.rs
Outdated
```rust
    /// End of usage window (epoch milliseconds, inclusive).
    pub end_timestamp: i64,
```
- Thinking about this again: is it better to make `end_timestamp` exclusive? `end` in `vec[start: end]` is exclusive.
- To keep consistent with the query API, can we rename `begin_timestamp` as `time_range_begin_millisecs` and `end_timestamp` as `time_range_end_millisecs`?
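The exclusive-end convention being proposed matches Rust's half-open ranges; a minimal sketch of the window check (`in_window` is an illustrative helper, not project code):

```rust
// Half-open window [begin, end): begin is inclusive, end is exclusive,
// mirroring how `end` is exclusive in slice indexing.
fn in_window(t: i64, begin: i64, end: i64) -> bool {
    begin <= t && t < end
}

fn main() {
    assert!(in_window(0, 0, 100)); // begin is inclusive
    assert!(in_window(99, 0, 100));
    assert!(!in_window(100, 0, 100)); // end is exclusive

    // Same convention as range indexing:
    let v = [10, 20, 30];
    assert_eq!(&v[0..2], &[10, 20]);
}
```

One practical benefit: adjacent windows `[0, 100)` and `[100, 200)` tile the timeline without double-counting a job whose timestamp is exactly 100.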
```rust
    pub job_status: Option<String>,
    /// Maximum number of jobs to return. Must be > 0; defaults to 1000.
    #[serde(default = "default_limit")]
    pub limit: i64,
```
```diff
-    pub limit: i64,
+    pub limit: u64,
```
Can we change `i64` to `u32` since it must be > 0? Also, using `u64` in Rust doesn't actually give us performance benefits on 64-bit systems.
Let's keep it as `i64` since it produces a better error message:

"invalid type: expected u32, found -1" (with `u32`)

vs

"limit must be > 0" (with `i64` plus explicit validation)
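The trade-off being argued can be sketched as follows: with `i64`, deserialization succeeds and an explicit check produces the domain-specific message, whereas `u32` would fail earlier with a generic type error. `validate_limit` is a hypothetical helper, not the actual api-server code:

```rust
// Explicit validation on a signed limit yields a domain-specific error
// instead of serde's generic "invalid type" message.
fn validate_limit(limit: i64) -> Result<i64, String> {
    if limit <= 0 {
        Err("limit must be > 0".to_string())
    } else {
        Ok(limit)
    }
}

fn main() {
    assert_eq!(validate_limit(-1).unwrap_err(), "limit must be > 0");
    assert_eq!(validate_limit(0).unwrap_err(), "limit must be > 0");
    assert_eq!(validate_limit(1000).unwrap(), 1000);
}
```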
…ted.
- Change `status: i32` to `job_status: CompressionJobStatus` with typed enum
- Rename fields to include units: `_millisecs`, `_secs`, `_bytes`
- Rewrite docstrings to remove DB internals/Python file references
- Add `ToSchema` derive and user-facing docstrings to `CompressionJobStatus`
- Register `CompressionJobStatus` in OpenAPI schema components
Force-pushed from 9d5801a to 99ea8f5
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/api-server/src/client.rs`:
- Around line 57-66: ValidatedCompressionUsageParams currently exposes its
invariants by making fields pub (begin_timestamp, end_timestamp, job_statuses,
limit), allowing callers to construct invalid instances; change those fields to
private (remove pub) and add minimal read-only accessor methods (e.g.,
begin_timestamp(), end_timestamp(), job_statuses(), limit()) used by routes.rs,
keep the TryFrom<CompressionUsageParams> impl as the only constructor that
enforces validation, and update any call sites (such as
Client::get_compression_usage and routes.rs) to use the new accessors.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2784a491-b5e4-4eb8-8bb2-dc7564fde8d3
⛔ Files ignored due to path filters (1)
- `docs/src/_static/generated/api-server-openapi.json` is excluded by `!**/generated/**`
📒 Files selected for processing (3)
- components/api-server/src/client.rs
- components/api-server/src/routes.rs
- components/clp-rust-utils/src/job_config/compression.rs
```rust
/// Validated parameters produced by [`TryFrom<CompressionUsageParams>`].
///
/// This type is intentionally non-constructable outside of the `TryFrom` impl
/// so that callers of [`Client::get_compression_usage`] cannot bypass validation.
pub struct ValidatedCompressionUsageParams {
    pub begin_timestamp: DateTime<Utc>,
    pub end_timestamp: DateTime<Utc>,
    pub job_statuses: Vec<CompressionJobStatus>,
    pub limit: i64,
}
```
🧩 Analysis chain

🏁 Script executed:

```shell
cat -n components/api-server/src/client.rs | sed -n '50,75p'
```

Repository: y-scope/clp
Length of output: 1000

🏁 Script executed:

```shell
# Search for ValidatedCompressionUsageParams struct definition and usage
rg "ValidatedCompressionUsageParams" -A 10 -B 5 components/api-server/src/
```

Repository: y-scope/clp
Length of output: 6974

🏁 Script executed:

```shell
# Check if TryFrom impl exists for CompressionUsageParams
rg "impl.*TryFrom.*CompressionUsageParams" -A 20 components/api-server/src/
```

Repository: y-scope/clp
Length of output: 1737

🏁 Script executed:

```shell
# Check how routes.rs accesses fields from ValidatedCompressionUsageParams
rg "ValidatedCompressionUsageParams" components/api-server/src/routes.rs -B 2 -A 2
```

Repository: y-scope/clp
Length of output: 770

🏁 Script executed:

```shell
# Search for direct construction of ValidatedCompressionUsageParams (struct literals with {})
rg "ValidatedCompressionUsageParams\s*\{" components/api-server/src/
```

Repository: y-scope/clp
Length of output: 246

🏁 Script executed:

```shell
# Also search for any .try_from patterns to ensure all conversions go through the impl
rg "try_from.*CompressionUsageParams" components/api-server/src/ -A 2
```

Repository: y-scope/clp
Length of output: 347

🏁 Script executed:

```shell
# Check if there are any tests or external code that might construct ValidatedCompressionUsageParams
rg "ValidatedCompressionUsageParams" components/api-server/src/ -B 3 -A 3 | grep -E "(test|::new|::default|\{)"
```

Repository: y-scope/clp
Length of output: 723

🏁 Script executed:

```shell
# Verify the TryFrom validation actually checks job_statuses is not empty
rg "impl TryFrom" components/api-server/src/client.rs -A 60 | head -80
```

Repository: y-scope/clp
Length of output: 2658
ValidatedCompressionUsageParams public fields undermine validation enforcement.
The doc comment declares this type "non-constructable outside of the TryFrom impl" to prevent bypass of validation, but lines 62–65 expose all fields as pub. This allows callers to construct invalid instances directly—for example, with backwards timestamps, non-positive limit, or empty job_statuses—defeating the invariant.
Suggested direction
```diff
 pub struct ValidatedCompressionUsageParams {
-    pub begin_timestamp: DateTime<Utc>,
-    pub end_timestamp: DateTime<Utc>,
-    pub job_statuses: Vec<CompressionJobStatus>,
-    pub limit: i64,
+    begin_timestamp: DateTime<Utc>,
+    end_timestamp: DateTime<Utc>,
+    job_statuses: Vec<CompressionJobStatus>,
+    limit: i64,
 }
```

Then expose only accessor methods for the fields that routes.rs actually needs.
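The pattern the review suggests (private fields, `TryFrom` as the sole constructor, read-only accessors) looks roughly like this std-only sketch; the type and field names are illustrative, with `i64` standing in for `DateTime<Utc>` so the example stays dependency-free:

```rust
// Hypothetical raw query params, as deserialized from the request.
pub struct RawParams {
    pub begin_timestamp: i64,
    pub end_timestamp: i64,
    pub limit: i64,
}

// Private fields: the only way to obtain a value is through TryFrom,
// so every instance is guaranteed to satisfy the invariants.
pub struct ValidatedParams {
    begin_timestamp: i64,
    end_timestamp: i64,
    limit: i64,
}

impl TryFrom<RawParams> for ValidatedParams {
    type Error = String;

    fn try_from(raw: RawParams) -> Result<Self, Self::Error> {
        if raw.begin_timestamp > raw.end_timestamp {
            return Err("begin_timestamp must be <= end_timestamp".to_string());
        }
        if raw.limit <= 0 {
            return Err("limit must be > 0".to_string());
        }
        Ok(Self {
            begin_timestamp: raw.begin_timestamp,
            end_timestamp: raw.end_timestamp,
            limit: raw.limit,
        })
    }
}

impl ValidatedParams {
    // Read-only accessors for callers such as the route handler.
    pub fn begin_timestamp(&self) -> i64 { self.begin_timestamp }
    pub fn end_timestamp(&self) -> i64 { self.end_timestamp }
    pub fn limit(&self) -> i64 { self.limit }
}

fn main() {
    let ok = ValidatedParams::try_from(RawParams {
        begin_timestamp: 0,
        end_timestamp: 100,
        limit: 1000,
    })
    .unwrap();
    assert_eq!(ok.limit(), 1000);

    // Backwards window is rejected at construction time.
    assert!(ValidatedParams::try_from(RawParams {
        begin_timestamp: 200,
        end_timestamp: 100,
        limit: 1000,
    })
    .is_err());
}
```

Outside this module, `ValidatedParams { .. }` struct literals won't compile because the fields are private, which is exactly the enforcement the doc comment promises.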
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/api-server/src/client.rs` around lines 57 - 66,
ValidatedCompressionUsageParams currently exposes its invariants by making
fields pub (begin_timestamp, end_timestamp, job_statuses, limit), allowing
callers to construct invalid instances; change those fields to private (remove
pub) and add minimal read-only accessor methods (e.g., begin_timestamp(),
end_timestamp(), job_statuses(), limit()) used by routes.rs, keep the
TryFrom<CompressionUsageParams> impl as the only constructor that enforces
validation, and update any call sites (such as Client::get_compression_usage and
routes.rs) to use the new accessors.
Fixed the docstring instead.
@jackluo923 Understood — updating the docstring to accurately reflect that the fields are public (rather than restricting access) is a valid choice. The mismatched claim between the documentation and reality was the key issue flagged. Thanks for the clarification!
🧠 Learnings used
Learnt from: sitaowang1998
Repo: y-scope/clp PR: 1417
File: components/clp-py-utils/clp_py_utils/serialization_utils.py:6-13
Timestamp: 2025-10-18T15:38:01.210Z
Learning: In the y-scope/clp repository, parameter descriptions in docstrings are intentionally omitted when the parameter name is self-explanatory. This is the project's preferred documentation style.
Learnt from: goynam
Repo: y-scope/clp PR: 2178
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:0-0
Timestamp: 2026-04-13T18:49:55.259Z
Learning: In the query scheduler (`components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py`), `job["job_config"]` is serialized as plain msgpack (no brotli compression), written by the Rust API server via `rmp_serde`. It is not subject to brotli decompression failures. Query configs are also inherently small (query string, timestamps, a few flags), making VARBINARY(60000) truncation extremely unlikely in practice. The MEDIUMBLOB schema change in PR `#2178` further prevents future truncation for that column.
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: Nathan903
Repo: y-scope/clp PR: 2079
File: components/api-server/src/telemetry.rs:8-8
Timestamp: 2026-03-06T17:46:23.008Z
Learning: Assume the repository tracks the latest stable Rust toolchain (no pinned MSRV). Do not rely on nightly-only features in code; prefer stable APIs such as Duration::from_hours (stable since Rust 1.91). When reviewing code in this area, ensure it aligns with the repo’s clippy configuration: clippy runs on nightly with -D clippy::pedantic and -D clippy::duration_suboptimal_units, so prefer idiomatic duration usage that satisfies these lints and avoids discouraged duration literals.
…ake end exclusive.
- Rename `begin_timestamp` -> `time_range_begin_millisecs`
- Rename `end_timestamp` -> `time_range_end_millisecs`
- Change end boundary from inclusive (<=) to exclusive (<)
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
Description
Adds a `GET /usage/compression` endpoint to the api-server that returns per-job compression usage data (sizes, durations, task stats) within a configurable epoch-millisecond time window. Intended for reporting, dashboarding, and chargeback use cases.

Query parameters:
- `begin_timestamp` (required): start of usage window, epoch ms inclusive
- `end_timestamp` (required): end of usage window, epoch ms inclusive
- `job_status` (optional): comma-separated, case-insensitive list of statuses (`succeeded`, `FAILED`, etc.). Defaults to all terminal states (`SUCCEEDED`, `FAILED`, `KILLED`).
- `limit` (optional): maximum number of jobs to return. Defaults to 1000.

Key changes:
- `CompressionUsageParams` and validation live in `client.rs` (not the HTTP layer) so any future caller of `Client` gets the same validated interface
- `strum::EnumString` on `CompressionJobStatus` (case-insensitive, `Pending` excluded via `#[strum(disabled)]`)
- `get_compression_usage` accepts `&[CompressionJobStatus]`; an empty slice means "no status filter" (IN clause omitted)
- `sqlx::FromRow` derive on `CompressionUsage` eliminates manual `try_get` mapping; `query_as` replaces manual row iteration
- `f64` with `ROUND()` in SQL instead of `BigDecimal`
- Composite index on `compression_jobs(start_time, status)` for efficient time-range queries

Checklist
Validation performed
- `task lint:check-rust` (fmt + clippy) passes
- `task codegen:openapi`: generated OpenAPI JSON committed
- `cargo test -p api-server`: 15/15 tests pass
New Features
Bug Fixes
Chores
Tests