SED-4789 Implement live reporting in the Node.js agent #661
Code Review
This pull request introduces live reporting capabilities for the Node.js agent, enabling running keywords to stream performance measures, metrics, and file uploads to the controller during execution. Key feedback from the review focuses on improving robustness and error handling: wrapping the initialization of liveReporting in a try-catch block and checking its existence before closing in agent-fork.js, adding an error listener to the HTTP response stream in BatchingRestPoster, using Math.floor instead of the modulo operator to prevent floating-point precision issues during metric bucketing, and addressing potential stack corruption in LiveMeasures during concurrent asynchronous operations.
const outputBuilder = new OutputBuilder();
const liveReporting = createLiveReporting(properties);
try {
To prevent any unexpected errors during createLiveReporting initialization from crashing the fork process, declare liveReporting using let and initialize it inside the try block. This ensures any initialization errors are caught and reported gracefully via outputBuilder.fail.
const outputBuilder = new OutputBuilder();
let liveReporting;
try {
  liveReporting = createLiveReporting(properties);
Done — liveReporting is now declared with let and initialized as the first statement inside the try, so any initialization error is reported via outputBuilder.fail instead of escaping the message handler.
try {
  await liveReporting.close();
} catch (e) {
  console.log("[Agent fork] Error while closing live reporting", e);
}
Add a safety check to ensure liveReporting is defined before attempting to call close(), preventing a TypeError if initialization failed or was skipped.
try {
  if (liveReporting) {
    await liveReporting.close();
  }
} catch (e) {
  console.log("[Agent fork] Error while closing live reporting", e);
}
References
- In the agent forker process (agent-fork.js), use console.log for logging as the main logger utility is not available.
Done — added the if (liveReporting) guard before await liveReporting.close().
(resp) => {
  const status = resp.statusCode;
  // Always drain the response so the socket can be reused/released.
  resp.on('data', () => {});
  resp.on('end', () => {
    if (status !== 204) {
      logger.error(`Error while reporting ${items.length} ${this.label}(s). The live reporting service returned ${status}`);
    }
    resolve();
  });
}
In Node.js, the response object (IncomingMessage) can emit an 'error' event (e.g., if the connection is severed mid-response). To prevent potential unhandled exceptions from crashing the process, it is recommended to attach an error listener to the resp stream.
(resp) => {
  const status = resp.statusCode;
  resp.on('error', (err) => {
    logger.error(`Response error while reporting ${items.length} ${this.label}(s):`, err);
  });
  // Always drain the response so the socket can be reused/released.
  resp.on('data', () => {});
  resp.on('end', () => {
    if (status !== 204) {
      logger.error(`Error while reporting ${items.length} ${this.label}(s). The live reporting service returned ${status}`);
    }
    resolve();
  });
}
Done — added a resp.on('error') handler that logs and settles the flush promise. (This code now lives in live-reporting/shared.js after the file split discussed below.)
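For readers following the thread, here is a minimal sketch of what "logs and settles the flush promise" could look like. It is not the code from shared.js: `watchResponse`, `log`, and `settle` are hypothetical names, and the response is simulated with a plain EventEmitter so the example runs without a network. The guard ensures the promise is settled only once, whichever of 'error' or 'end' fires first.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper (not the agent's code): attaches 'error', 'data' and 'end'
// listeners to a response-like stream and calls `settle` at most once.
function watchResponse(resp, log, settle) {
  let settled = false;
  const settleOnce = () => {
    if (!settled) {
      settled = true;
      settle();
    }
  };
  resp.on('error', (err) => {
    log(`Response error: ${err.message}`);
    settleOnce();
  });
  resp.on('data', () => {}); // drain so the socket can be reused/released
  resp.on('end', () => {
    if (resp.statusCode !== 204) {
      log(`Unexpected status ${resp.statusCode}`);
    }
    settleOnce();
  });
}

// Simulate a response whose connection breaks, followed by a late 'end'.
const logs = [];
let settleCount = 0;
const fakeResp = new EventEmitter();
fakeResp.statusCode = 204;
watchResponse(fakeResp, (msg) => logs.push(msg), () => { settleCount += 1; });
fakeResp.emit('error', new Error('socket hang up'));
fakeResp.emit('end');
```

Without the 'error' listener, the emit on the simulated stream would throw, which mirrors the unhandled-exception risk raised in the comment above.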
const bucket = value - (value % this._percentilePrecision);
this._distribution.set(bucket, (this._distribution.get(bucket) || 0) + 1);
Using the modulo operator % on floating-point numbers in JavaScript can lead to precision issues (e.g., 0.3 % 0.1 results in 0.09999999999999998), which can cause incorrect or fragmented bucket keys in the distribution map. Consider using Math.floor for more robust bucketing.
- const bucket = value - (value % this._percentilePrecision);
- this._distribution.set(bucket, (this._distribution.get(bucket) || 0) + 1);
+ const bucket = Math.floor(value / this._percentilePrecision) * this._percentilePrecision;
+ this._distribution.set(bucket, (this._distribution.get(bucket) || 0) + 1);
Done — bucketing now uses Math.floor(value / precision) * precision.
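A small sketch of the Math.floor bucketing, for illustration only. `addToDistribution` is a hypothetical stand-in for the metrics code, and the precision of 10 and the sample values are assumptions. The sketch sticks to integers on purpose: with a fractional precision, multiplying the floored quotient back by the precision can itself introduce rounding in the bucket key.

```javascript
// Hypothetical stand-in for the distribution bookkeeping in the metrics module.
function addToDistribution(distribution, value, precision) {
  // Floor to the lower bound of the bucket that contains `value`.
  const bucket = Math.floor(value / precision) * precision;
  distribution.set(bucket, (distribution.get(bucket) || 0) + 1);
  return bucket;
}

const distribution = new Map();
for (const value of [12, 17, 19, 20, 34]) {
  addToDistribution(distribution, value, 10);
}

// Bucket lower bound -> number of values that fell into it.
const buckets = [...distribution.entries()];
```

Here 12, 17 and 19 share bucket 10, while 20 and 34 land in buckets 20 and 30 respectively.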
class LiveMeasures {
  constructor(destination) {
    this.destination = destination;
    this._stack = [];
  }
The LiveMeasures class uses a shared array stack (this._stack) to track nested measures. In an asynchronous environment like Node.js, if a keyword executes concurrent overlapping asynchronous operations (e.g., using Promise.all), their stack operations will interleave and corrupt the stack. Consider documenting this limitation, or using AsyncLocalStorage to scope the measure stack to the asynchronous execution context.
Documented the limitation on LiveMeasures: startMeasure/stopMeasure share one stack and are intended for sequential or properly nested measures within a single flow; concurrent overlapping async measures should use addMeasure, which records a fully-formed measure atomically. I kept this approach rather than AsyncLocalStorage to stay aligned with the Java LiveMeasures stack semantics.
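To make the documented limitation concrete, here is a simplified sketch of how a shared stack mis-attributes measures when two flows overlap. `SketchMeasures` is a hypothetical stand-in, not the agent's LiveMeasures class; timing and the real `addMeasure` signature are left out. The interleaving that two tasks under Promise.all could produce is replayed synchronously so the outcome is deterministic.

```javascript
// Hypothetical, simplified stand-in for LiveMeasures (names only, no timing).
class SketchMeasures {
  constructor() {
    this._stack = [];
    this.reported = [];
  }
  startMeasure(name) {
    this._stack.push(name);
  }
  // Pops whatever is on top of the shared stack, regardless of which flow calls it.
  stopMeasure() {
    const name = this._stack.pop();
    this.reported.push(name);
    return name;
  }
  // Records a complete measure in one call, without touching the stack.
  addMeasure(name) {
    this.reported.push(name);
  }
}

const measures = new SketchMeasures();
// Replay of an interleaving that two overlapping async tasks could produce:
measures.startMeasure('taskA');
measures.startMeasure('taskB');
const stoppedByA = measures.stopMeasure(); // taskA finishes first but pops 'taskB'
const stoppedByB = measures.stopMeasure(); // taskB then pops 'taskA'
```

Each task ends up closing the other's measure, which is why overlapping flows are pointed to addMeasure instead.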
david-stephan left a comment
@jeromecomte all Gemini comments look legit. I added one point of concern about multithreading. I went a bit fast on the file streaming code; you may ask Christoph to check it in more detail if you'd like.
flush() {
  if (this.buffer.length === 0) return;
  const batch = this.buffer;
  this.buffer = [];
I didn't check the Java implementation, but with multithreading and no synchronization between add and flush we can lose data (i.e. a 2nd thread runs line 151 (this.buffer.push(item);) while the 1st thread is between L159-160).
As discussed, since L158-L160 contain no async operation, they are guaranteed to run atomically on the same Node.js event loop.
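A minimal sketch of the buffer swap under discussion, assuming a simplified batcher. `SketchBatcher` and its `send` callback are hypothetical; only the first statements of flush() come from the diff above. Because there is no await between reading the buffer and replacing it, no other callback can run in between, and an item added afterwards lands in the fresh buffer.

```javascript
// Hypothetical, simplified batcher; `send` stands in for the REST call.
class SketchBatcher {
  constructor(send) {
    this.buffer = [];
    this.send = send;
  }
  add(item) {
    this.buffer.push(item);
  }
  flush() {
    if (this.buffer.length === 0) return;
    // No await between these two statements, so they cannot be interleaved.
    const batch = this.buffer;
    this.buffer = [];
    this.send(batch);
  }
}

const sent = [];
const batcher = new SketchBatcher((batch) => sent.push(batch));
batcher.add('m1');
batcher.add('m2');
batcher.flush();
batcher.add('m3'); // goes to the new buffer, not to the batch already handed off
```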
fs.mkdirSync(agentForkerLibPath, { recursive: true });
fs.copyFileSync(path.resolve(__dirname, 'agent-fork.js'), path.join(agentForkerLibPath, 'agent-fork.js'));
fs.copyFileSync(path.join(__dirname, 'output.js'), path.join(agentForkerLibPath, 'output.js'));
fs.copyFileSync(path.join(__dirname, 'live-reporting.js'), path.join(agentForkerLibPath, 'live-reporting.js'));
I assume this is the reason for the "big" live-reporting.js file, but for code review and maintainability it would have been nicer to split the code.
Agree with this. I would also prefer to split the code into dedicated files. In theory we could already do it and simply add more files here, but I would prefer to have a better solution than the copy first.
Done — live-reporting.js is now split into a live-reporting/ folder: index.js (container + factory + public API), shared.js (logger, reporting-URL resolution, the batching REST poster), measures.js, metrics.js, and file-uploads.js. To avoid maintaining a per-file copy list in agent.js, the agent now copies the whole folder recursively (fs.cpSync(..., { recursive: true })), so adding new modules no longer requires touching agent.js.
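As an illustration of the recursive copy, here is a sketch that runs against a temporary directory. The paths and the `fork-lib` folder name are assumptions made for the example and do not reflect the agent's actual layout; the file names are taken from the split described above. fs.cpSync is only available in reasonably recent Node.js versions.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Temporary source folder standing in for live-reporting/.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cp-sketch-'));
const sourceDir = path.join(workDir, 'live-reporting');
fs.mkdirSync(sourceDir);
for (const file of ['index.js', 'shared.js', 'measures.js']) {
  fs.writeFileSync(path.join(sourceDir, file), '// placeholder\n');
}

// Hypothetical target, standing in for the agent forker lib path.
const forkLibDir = path.join(workDir, 'fork-lib');
fs.mkdirSync(forkLibDir, { recursive: true });
const targetDir = path.join(forkLibDir, 'live-reporting');

// One recursive copy replaces a per-file copyFileSync list.
fs.cpSync(sourceDir, targetDir, { recursive: true });

const copied = fs.readdirSync(targetDir).sort();
```

Any module later added to the source folder is picked up by the same call, which is the maintenance benefit mentioned in the reply.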
Thanks for the review @david-stephan! All points addressed (replied inline as well).
All tests and eslint pass.