Skip to content

OEV-1179 Multiplex to Nova RPC as a secondary#410

Closed
cll-gg wants to merge 36 commits intodevelopfrom
OEV-1179-multiplex
Closed

OEV-1179 Multiplex to Nova RPC as a secondary#410
cll-gg wants to merge 36 commits intodevelopfrom
OEV-1179-multiplex

Conversation

@cll-gg
Copy link
Copy Markdown
Contributor

@cll-gg cll-gg commented Apr 7, 2026

What

Send FB requests to Nova as well, treating it as fire-and-forget.

Why

To improve SVR.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 7, 2026

⚠️ API Diff Results - github.com/smartcontractkit/chainlink-evm

⚠️ Breaking Changes (7)

pkg/config.TransactionManagerV2 (1)
  • CustomURLSecondary — ➕ Added
pkg/txm/clientwrappers/dualbroadcast (5)
  • FlashbotsClientRPC — 🗑️ Removed

  • NewFlashbotsClient — Type changed:

func(
  github.com/smartcontractkit/chainlink-common/pkg/logger.Logger, 
  - FlashbotsClientRPC, 
  + ofaRPCClient, 
  github.com/smartcontractkit/chainlink-evm/pkg/keys.MessageSigner, 
  *net/url.URL, 
  FlashbotsTxStore, 
  *bool, 
  + ofaMetrics
)
*FlashbotsClient
  • Privacy — 🗑️ Removed

  • RefundConfig — 🗑️ Removed

  • SelectClient — Type changed:

func(
  github.com/smartcontractkit/chainlink-common/pkg/logger.Logger, 
  github.com/smartcontractkit/chainlink-evm/pkg/client.Client, 
  github.com/smartcontractkit/chainlink-evm/pkg/keys.ChainStore, 
  *net/url.URL, 
  + *net/url.URL, 
  *math/big.Int, 
  github.com/smartcontractkit/chainlink-evm/pkg/txm.TxStore, 
  bool, 
  *bool, 
  *time.Duration
)
(github.com/smartcontractkit/chainlink-evm/pkg/txm.Client, github.com/smartcontractkit/chainlink-evm/pkg/txm.ErrorHandler, error)
pkg/txm/clientwrappers/dualbroadcast.(*FlashbotsClient) (1)
  • SendBundle — 🗑️ Removed

✅ Compatible Changes (1)

pkg/config/toml.TransactionManagerV2Config (1)
  • CustomURLSecondary — ➕ Added

📄 View full apidiff report

Comment thread pkg/txm/clientwrappers/dualbroadcast/flashbots_client.go Fixed
Comment thread pkg/txm/clientwrappers/dualbroadcast/multiplex_client.go Fixed
Comment thread pkg/txm/clientwrappers/dualbroadcast/flashbots_client.go Fixed
Copy link
Copy Markdown
Contributor

@dimriou dimriou Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these changes needed? chain_client should be irrelevant to this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was surprised as well. The test failed here but it seems the fix hasn't fully helped. I'll have another look

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted for now

Enabled = false
BlockTime = '42s'
CustomURL = 'http://txs.org'
CustomURLSecondary = 'http://txs-secondary.org'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on the fence with this one. What happens if we want to add another OFA? Maybe it makes sense to create a config that has all the secondary urls and it's a slice of URLs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I was on the same fence 🙂
I'm not sure what's best, so I thought keeping it simplest is ok for now and if we introduce another url we can change it to a slice.

FetchUnconfirmedTransactions(context.Context, common.Address) ([]*types.Transaction, error)
}

type FlashbotsClientRPC interface {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's common pattern to name the local interfaces with the prefix of the component they're declared for. Otherwise you run the risk of conflicting names. For example if you add another client with a different interface in this package you'd have to figure out another vague name to not conflict.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, didn't know that pattern.
Then I propose naming it ofaClientRPC or ofaRPC and reusing it in both flashbots + nova client. Agree?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we rename FlashbotsClient to OFAClient or something similar, then yes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to ofaRpcClient

Comment thread pkg/txm/clientwrappers/dualbroadcast/flashbots_client.go
go func() {
// Use a background context so the secondary isn't cancelled when the primary returns.
// Each client is responsible for its own timeout (e.g. novaClient has novaRPCTimeout).
if err := m.secondary.SendTransaction(context.Background(), tx, attempt); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing context.Background() here is unsafe even if the client handles it under the hood. We should instead pass the parent context so in case of a cancel it gets propagated.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point on the cancel!
The problem is if the primary client runs into a timeout, then the secondary gets canceled as well, while we actually want to make it fire-and-forget. I'll have a look what we can best do here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case the context comes straight from the txm, which doesn't have a timeout unless Close() is called so it should be ok, but let's double check.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now, using the main context

}

func (m *multiplexClient) SendTransaction(ctx context.Context, tx *types.Transaction, attempt *types.Attempt) error {
go func() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should properly handle the lifecycle of this goroutine, otherwise we run the risk of memory leaks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I'll add that

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better, but we still need a waitgroup to properly handle this go routine, otherwise we would cause a potential memory leak. We should discuss if it's acceptable to wait for this call to return and not make it completely fire and forget. If we can't wait and we need to make it completely fire and forget, then we'd have to take another approach by closing the client, but I'd try to avoid it unless necessary.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this AI generated? Feels like it replicates unnecessary information that will be hard to maintain if we make changes and will quickly get out of date. There is already a TXM doc, that we should have deleted that looks like this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, AI generated indeed. I assume that any agent that updates the code will also update this readme. But happy to put this context elsewhere, do you have a good idea?

(Is this the TXM doc you mean --> https://github.com/smartcontractkit/chainlink-evm/blob/develop/pkg/txm/docs/TRANSACTION_MANAGER_V2.md?)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would put the details in the description of the multiplex_cient. But I would definitely remove the configs, especially since they are in a Core node format (yes the txm file is the one you linked).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on the configs, they don't make sense in this context.
I'll put some description in multiplex_client instead, good idea, closer to the code 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two clients share a large amount of overlapping logic, minus the signing. To make this less DRY we should combine them under one client and instead inject signing where needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will do that!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've kept the fallback for sending to the public mempool logic in flashbots_client for now, seeing as we want to use nova_client as secondary. If nova_client would also do this we would run the risk of sending the tx to the public mempool twice. Not a big issue but still not great. Let me know what you think!

Copy link
Copy Markdown
Contributor

@dimriou dimriou Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need some product input as to whether we plan on using nova client as a standalone client. It feels like we can avoid future changes if we can decide on that now.

Copy link
Copy Markdown
Contributor

@dimriou dimriou Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are metrics needed for Titan? Seems like we never needed them for Flashbots.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's because there's been only 1 client so far. Now that we add another one it's handy to be able to differentiate them.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I'm against of adding metrics but I don't want us to fall into the trap of adding them and then never use them. Core had this problem. It's a simple change but it consumes certain resources and it's yet another thing in the codebase. Personally, I would add them in a follow up PR in case there is a need, but we can also discuss with the team if there is a use case i.e. monitoring Nova endpoint.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'll deploy it to Staging and validate if we need them there

Comment thread pkg/txm/txm.go Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for “dual broadcast” multiplexing so Flashbots (primary) broadcasts can also be sent to a Nova RPC endpoint (secondary) in a fire-and-forget manner, with supporting config/docs and OFA metrics.

Changes:

  • Extend TransactionManagerV2 config with CustomURLSecondary and wire it into txmgr builder + docs.
  • Refactor dualbroadcast client selection to optionally wrap a primary client with a multiplexing client that also sends to Nova.
  • Introduce a shared OFA HTTP JSON-RPC client + unified OFA send metrics; add Nova client implementation and tests.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg/txmgr/builder.go Passes primary + secondary custom URLs into dualbroadcast client selection.
pkg/txm/txm.go Adjusts nonce initialization log level to Info.
pkg/txm/clientwrappers/dualbroadcast/selector_test.go Adds unit tests for SelectClient behavior with/without secondary and URL redaction.
pkg/txm/clientwrappers/dualbroadcast/selector.go Adds secondary URL handling, multiplex wrapping, and URL redaction helper.
pkg/txm/clientwrappers/dualbroadcast/ofa_metrics.go Adds OFA send status/latency metrics.
pkg/txm/clientwrappers/dualbroadcast/ofa_client.go Adds shared OFA HTTP JSON-RPC client abstraction with pluggable auth.
pkg/txm/clientwrappers/dualbroadcast/nova_client_test.go Adds Nova client tests for dual-broadcast send + nonce reads.
pkg/txm/clientwrappers/dualbroadcast/nova_client.go Implements Nova client on top of the shared OFA client.
pkg/txm/clientwrappers/dualbroadcast/multiplex_client_test.go Adds tests asserting fire-and-forget behavior and timeouts for secondary send.
pkg/txm/clientwrappers/dualbroadcast/multiplex_client.go Implements multiplexing txm.Client that sends to primary + secondary.
pkg/txm/clientwrappers/dualbroadcast/flashbots_client_test.go Updates Flashbots tests for new metrics/shared client refactor.
pkg/txm/clientwrappers/dualbroadcast/flashbots_client.go Refactors Flashbots client to use shared OFA client + auth adapter.
pkg/config/toml/testdata/config-full.toml Adds CustomURLSecondary to full config testdata.
pkg/config/toml/docs.toml Documents CustomURLSecondary.
pkg/config/toml/config_test.go Updates config defaults/docs tests to include CustomURLSecondary.
pkg/config/toml/config.go Adds CustomURLSecondary field + validation and propagation.
pkg/config/config.go Extends TransactionManagerV2 interface with CustomURLSecondary().
pkg/config/chain_scoped_transactions.go Implements CustomURLSecondary() for chain-scoped TMv2 config.
CONFIG.md Updates generated config docs to include CustomURLSecondary.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

type postResponse struct {
Result json.RawMessage `json:"result,omitempty"`
Error postError
return base.String() + "?" + params
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flashbotsHTTPAuth.requestURL unconditionally appends "?" and the raw params string to base.String(). If the configured Flashbots URL already contains a query string (e.g., an API key), this will produce an invalid URL with multiple '?' and may drop/override existing query values. Consider constructing the request URL by copying the base url.URL and merging query parameters (and returning base.String() unchanged when there are no extra params).

Suggested change
return base.String() + "?" + params
if params == "" {
return base.String()
}
extraParams, err := url.ParseQuery(params)
if err != nil {
return base.String()
}
requestURL := *base
query := requestURL.Query()
for key, values := range extraParams {
query.Del(key)
for _, value := range values {
query.Add(key, value)
}
}
requestURL.RawQuery = query.Encode()
return requestURL.String()

Copilot uses AI. Check for mistakes.
Comment on lines +32 to +35
// secondary must be a Nova RPC endpoint
if !strings.Contains(secondaryURL.String(), "novarpc") {
return nil, nil, fmt.Errorf("secondary URL must be a Nova RPC endpoint, got: %s", redactURL(secondaryURL))
}
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Secondary Nova validation uses strings.Contains on secondaryURL.String(), so a non-Nova endpoint can pass the check if "novarpc" appears in the path/query/userinfo (and a real Nova endpoint could fail if the hostname doesn’t contain that substring). To enforce “Nova endpoint” reliably, validate against secondaryURL.Hostname() (and/or a configured allowlist / suffix match) instead of the full URL string.

Copilot uses AI. Check for mistakes.
Comment on lines +41 to +53
go func() {
secondaryCtx, cancel := context.WithTimeout(ctx, m.secondarySendTimeout)
defer cancel()

if err := m.secondary.SendTransaction(secondaryCtx, tx, attempt); err != nil {
// Ignore errors from the secondary backend; it's fire-and-forget.
m.lggr.Errorw("Secondary backend send failed", "err", err, "transactionLifecycleID", tx.GetTransactionLifecycleID(m.lggr))
}
}()

primaryCtx, cancel := context.WithTimeout(ctx, rpcTimeout)
defer cancel()
return m.primary.SendTransaction(primaryCtx, tx, attempt)
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiplexClient.SendTransaction starts the secondary send concurrently using the same *types.Transaction and *types.Attempt pointers that are passed to the primary. This can race or send the wrong payload if the primary client mutates the attempt/tx during SendTransaction (e.g., MetaClient updates attempt.SignedTransaction in SendOperation). Consider either (a) restricting multiplexing to OFA clients that don’t mutate inputs (Flashbots/Nova), or (b) snapshotting/copying the data needed by the secondary before calling the primary (or running the secondary after the primary has finalized any mutations).

Suggested change
go func() {
secondaryCtx, cancel := context.WithTimeout(ctx, m.secondarySendTimeout)
defer cancel()
if err := m.secondary.SendTransaction(secondaryCtx, tx, attempt); err != nil {
// Ignore errors from the secondary backend; it's fire-and-forget.
m.lggr.Errorw("Secondary backend send failed", "err", err, "transactionLifecycleID", tx.GetTransactionLifecycleID(m.lggr))
}
}()
primaryCtx, cancel := context.WithTimeout(ctx, rpcTimeout)
defer cancel()
return m.primary.SendTransaction(primaryCtx, tx, attempt)
transactionLifecycleID := tx.GetTransactionLifecycleID(m.lggr)
primaryCtx, cancel := context.WithTimeout(ctx, rpcTimeout)
defer cancel()
err := m.primary.SendTransaction(primaryCtx, tx, attempt)
go func(transactionLifecycleID string) {
secondaryCtx, cancel := context.WithTimeout(ctx, m.secondarySendTimeout)
defer cancel()
if secondaryErr := m.secondary.SendTransaction(secondaryCtx, tx, attempt); secondaryErr != nil {
// Ignore errors from the secondary backend; it's fire-and-forget.
m.lggr.Errorw("Secondary backend send failed", "err", secondaryErr, "transactionLifecycleID", transactionLifecycleID)
}
}(transactionLifecycleID)
return err

Copilot uses AI. Check for mistakes.

func (m *multiplexClient) SendTransaction(ctx context.Context, tx *types.Transaction, attempt *types.Attempt) error {
go func() {
secondaryCtx, cancel := context.WithTimeout(ctx, m.secondarySendTimeout)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this derive from context.Background() instead of ctx? If the TXM caller passes a per-call context and it gets cancelled right after primary.SendTransaction returns, the secondary could get cancelled as well?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing context.Background() here may cause potential memory leaks. In principle, the TXM caller always uses the app's context, which won't be canceled unless the app gets closed, in which case the call should be cancelled as well. It should be each client's responsibility to add their own timeout, by calling WithTimeout, but they should also be aware if the app is still running or not.

@dimriou
Copy link
Copy Markdown
Contributor

dimriou commented Apr 20, 2026

I think we're introducing too many clients with this approach. Originally, the idea was to add only one multiplexing client and then repurpose/rename flashbots client and use it for both nova & flashbots, perhaps via an enum. Then, specifically in case of the flashbots client, we would have a helper signature method to sign the necessary fields. Clientwrappers are already wrappers on top of existing client wrappers so I'd like us to be mindful of the level of encapsulation we use.

@cll-gg
Copy link
Copy Markdown
Contributor Author

cll-gg commented Apr 27, 2026

Closing in favor of #445

@cll-gg cll-gg closed this Apr 27, 2026
@cll-gg cll-gg deleted the OEV-1179-multiplex branch April 27, 2026 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants