Skip to content

SED-4789 Implement live reporting in the Node.js agent#661

Open
jeromecomte wants to merge 3 commits into
masterfrom
SED-4789-implement-live-reporting-in-the-node-js-agent
Open

SED-4789 Implement live reporting in the Node.js agent#661
jeromecomte wants to merge 3 commits into
masterfrom
SED-4789-implement-live-reporting-in-the-node-js-agent

Conversation

@jeromecomte

Copy link
Copy Markdown
Contributor

No description provided.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces live reporting capabilities for the Node.js agent, enabling running keywords to stream performance measures, metrics, and file uploads to the controller during execution. Key feedback from the review focuses on improving robustness and error handling: wrapping the initialization of liveReporting in a try-catch block and checking its existence before closing in agent-fork.js, adding an error listener to the HTTP response stream in BatchingRestPoster, using Math.floor instead of the modulo operator to prevent floating-point precision issues during metric bucketing, and addressing potential stack corruption in LiveMeasures during concurrent asynchronous operations.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines 37 to 39
const outputBuilder = new OutputBuilder();
const liveReporting = createLiveReporting(properties);
try {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

To prevent any unexpected errors during createLiveReporting initialization from crashing the fork process, declare liveReporting using let and initialize it inside the try block. This ensures any initialization errors are caught and reported gracefully via outputBuilder.fail.

    const outputBuilder = new OutputBuilder();
    let liveReporting;
    try {
      liveReporting = createLiveReporting(properties);

@jeromecomte jeromecomte Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — liveReporting is now declared with let and initialized as the first statement inside the try, so any initialization error is reported via outputBuilder.fail instead of escaping the message handler.

Comment on lines +88 to +92
try {
await liveReporting.close();
} catch (e) {
console.log("[Agent fork] Error while closing live reporting", e);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Add a safety check to ensure liveReporting is defined before attempting to call close(), preventing a TypeError if initialization failed or was skipped.

      try {
        if (liveReporting) {
          await liveReporting.close();
        }
      } catch (e) {
        console.log("[Agent fork] Error while closing live reporting", e);
      }
References
  1. In the agent forker process (agent-fork.js), use console.log for logging as the main logger utility is not available.

@jeromecomte jeromecomte Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — added the if (liveReporting) guard before await liveReporting.close().

Comment on lines +187 to +197
(resp) => {
const status = resp.statusCode;
// Always drain the response so the socket can be reused/released.
resp.on('data', () => {});
resp.on('end', () => {
if (status !== 204) {
logger.error(`Error while reporting ${items.length} ${this.label}(s). The live reporting service returned ${status}`);
}
resolve();
});
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In Node.js, the response object (IncomingMessage) can emit an 'error' event (e.g., if the connection is severed mid-response). To prevent potential unhandled exceptions from crashing the process, it is recommended to attach an error listener to the resp stream.

        (resp) => {
          const status = resp.statusCode;
          resp.on('error', (err) => {
            logger.error(`Response error while reporting ${items.length} ${this.label}(s):`, err);
          });
          // Always drain the response so the socket can be reused/released.
          resp.on('data', () => {});
          resp.on('end', () => {
            if (status !== 204) {
              logger.error(`Error while reporting ${items.length} ${this.label}(s). The live reporting service returned ${status}`);
            }
            resolve();
          });
        }

@jeromecomte jeromecomte Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — added a resp.on('error') handler that logs and settles the flush promise. (This code now lives in live-reporting/shared.js after the file split discussed below.)

Comment on lines +412 to +413
const bucket = value - (value % this._percentilePrecision);
this._distribution.set(bucket, (this._distribution.get(bucket) || 0) + 1);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using the modulo operator % on floating-point numbers in JavaScript can lead to precision issues (e.g., 0.3 % 0.1 results in 0.09999999999999998), which can cause incorrect or fragmented bucket keys in the distribution map. Consider using Math.floor for more robust bucketing.

Suggested change
const bucket = value - (value % this._percentilePrecision);
this._distribution.set(bucket, (this._distribution.get(bucket) || 0) + 1);
const bucket = Math.floor(value / this._percentilePrecision) * this._percentilePrecision;
this._distribution.set(bucket, (this._distribution.get(bucket) || 0) + 1);

@jeromecomte jeromecomte Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — bucketing now uses Math.floor(value / precision) * precision.

Comment on lines +242 to +246
class LiveMeasures {
constructor(destination) {
this.destination = destination;
this._stack = [];
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The LiveMeasures class uses a shared array stack (this._stack) to track nested measures. In an asynchronous environment like Node.js, if a keyword executes concurrent overlapping asynchronous operations (e.g., using Promise.all), their stack operations will interleave and corrupt the stack. Consider documenting this limitation, or using AsyncLocalStorage to scope the measure stack to the asynchronous execution context.

@jeromecomte jeromecomte Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented the limitation on LiveMeasures: startMeasure/stopMeasure share one stack and are intended for sequential or properly nested measures within a single flow; concurrent overlapping async measures should use addMeasure, which records a fully-formed measure atomically. I kept this approach rather than AsyncLocalStorage to stay aligned with the Java LiveMeasures stack semantics.

@david-stephan david-stephan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeromecomte all gemeni comments look legit. I added one point of concern about multithreading. I went a big fast on the file streaming code. you may ask Christoph to check it in more details if you'd like.

flush() {
if (this.buffer.length === 0) return;
const batch = this.buffer;
this.buffer = [];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't check the Java implementation, but with multithreading and no synchronization between add and flush we can loose data. (i.e. a 2nd thread run line 151 (this.buffer.push(item);), while the 1st thread is in between L159-160

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, as l158 - l160 contain no async operation, they are guarantied to run atomically on the same Node.js event loop

fs.mkdirSync(agentForkerLibPath, { recursive: true });
fs.copyFileSync(path.resolve(__dirname, 'agent-fork.js'), path.join(agentForkerLibPath, 'agent-fork.js'));
fs.copyFileSync(path.join(__dirname, 'output.js'), path.join(agentForkerLibPath, 'output.js'));
fs.copyFileSync(path.join(__dirname, 'live-reporting.js'), path.join(agentForkerLibPath, 'live-reporting.js'));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is the reason of the "big" live-reporting.js file, but for code review and maintainability it would have been nicer to split the code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this. I would also prefer to split the code in dedicated files. In theory we could already do it and simply add more files here but I would prefer to have a better solution than the copy first

@jeromecomte jeromecomte Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — live-reporting.js is now split into a live-reporting/ folder: index.js (container + factory + public API), shared.js (logger, reporting-URL resolution, the batching REST poster), measures.js, metrics.js, and file-uploads.js. To avoid maintaining a per-file copy list in agent.js, the agent now copies the whole folder recursively (fs.cpSync(..., { recursive: true })), so adding new modules no longer requires touching agent.js.

@jeromecomte

Copy link
Copy Markdown
Contributor Author

Thanks for the review @david-stephan! All points addressed (replied inline as well):

  • Initialize liveReporting inside the try + guard close() — done
  • Response error listener on the batching REST poster — done
  • Math.floor bucketing for the metric distribution — done
  • Documented the startMeasure/stopMeasure stack concurrency limitation (use addMeasure for concurrent async operations), matching the Java LiveMeasures semantics — done
  • add/flush data loss: the buffer swap is synchronous, so it runs atomically on the single-threaded event loop (discussed in thread) — no change needed
  • Maintainability: split live-reporting.js into a live-reporting/ folder (index, shared, measures, metrics, file-uploads); agent.js now copies the folder recursively into the fork libs, so future modules don't need a new copy line

All tests and eslint pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants