fix(libp2p): plumb advertise address to swarm.add_external_address#4261
Draft
fix(libp2p): plumb advertise address to swarm.add_external_address#4261
Conversation
- The CLI flag --libp2p-advertise-address (env ESPRESSO_NODE_LIBP2P_ADVERTISE_ADDRESS)
was only used in the orchestrator-bootstrap branch (lib.rs:446) and never reached
the libp2p Swarm in the persistence-loaded or peer-fetch branches. The flag's
docstring promised that it "advertises to other nodes" but in production paths
it was a silent no-op.
- Effect: validators behind NAT, K8s NodePort, or Docker bridge networking can only
be reached by peers they have already established outbound connections to. Identify
announces their bind-derived addresses (loopback / VPC-internal IPs), so peers that
learn about them via DHT lookup, or that lose and try to re-establish a connection,
cannot dial them. Coverage of the validator set is incomplete and degrades over
time as connections drop. Observed-address from existing connections is not
propagated by identify to third parties, and AutoNAT cannot self-confirm the
public address because the bind-derived addresses it offers as probe candidates
are not reachable.
- Fix: thread Vec<Multiaddr> through Libp2pNetwork::from_config (libp2p_network.rs)
into NetworkNodeConfig.announce_addresses (config.rs), and call
Swarm::add_external_address for each in NetworkNode::new (node.rs). The existing
SwarmEvent::ExternalAddrConfirmed handler at node.rs already populates the DHT
self-record once the address is confirmed.
- Also drop the default_value = "localhost:1769" on the CLI flag. Loopback was
actively wrong as a default once the flag is wired through to identify; absent
is now valid (no advertise) and bootstrapping the orchestrator without it
errors explicitly with a clear message.
- Drive-by: fix a "fales" typo in a comment and whitelist the bimap crate name
in .typos.toml so the pre-commit spell-check passes.
- References:
rust-libp2p Swarm::add_external_address (libp2p-swarm 0.47):
https://docs.rs/libp2p/0.56.0/libp2p/swarm/struct.Swarm.html#method.add_external_address
"Add a confirmed external address for the local node. This function should
only be called with addresses that are guaranteed to be reachable."
libp2p Identify spec - the listenAddrs field that peers receive during
identify exchange:
https://github.com/libp2p/specs/blob/master/identify/README.md
Without add_external_address, only listen addresses are populated, which for
bind=0.0.0.0 means loopback and any local interface IPs.
Contributor
There was a problem hiding this comment.
Code Review
This pull request updates the Libp2p networking configuration to support announcing external addresses, which is necessary for nodes operating behind NAT, K8s NodePort, or Docker bridges. The changes include updating the libp2p_advertise_address to be optional, introducing an announce_addresses field in the network configuration, and ensuring these addresses are added to the swarm as external addresses. Additionally, a minor typo in a comment was corrected. As there were no review comments provided, I have no feedback to offer.
Contributor
Nextest failures (1) in this run
See the step summary for flaky tests and slowest tests. |
- process-compose.yaml: espresso-node-4 had no libp2p env vars and was relying on the removed default localhost:1769. Set explicit BIND/ADVERTISE for it, matching nodes 0-3, using the existing ESPRESSO_DEMO_NODE_LIBP2P_PORT_4. - crates/espresso/node/src/run.rs: test_startup_before_orchestrator did not pass --libp2p-advertise-address, so init returned the new orchestrator- bootstrap error before the API server came up. Reserve a third ephemeral port and pass it. - crates/espresso/node/src/lib.rs + crates/cliquenet/src/addr.rs: warn loudly when the advertise address resolves to loopback, the unspecified address, or the literal "localhost". This catches the misconfiguration at startup rather than letting it manifest as silent unreachability for hours.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The CLI flag
--libp2p-advertise-address(envESPRESSO_NODE_LIBP2P_ADVERTISE_ADDRESS)was only used in the orchestrator-bootstrap branch (lib.rs:446) and never reached
the libp2p Swarm in the persistence-loaded or peer-fetch branches. The flag's
docstring promised that it "advertises to other nodes" but in production paths
it was a silent no-op.
Effect: validators behind NAT, K8s NodePort, or Docker bridge networking can only
be reached by peers they have already established outbound connections to. Identify
announces their bind-derived addresses (loopback / VPC-internal IPs), so peers that
learn about them via DHT lookup, or that lose and try to re-establish a connection,
cannot dial them. Coverage of the validator set is incomplete and degrades over
time as connections drop. Observed-address from existing connections is not
propagated by identify to third parties, and AutoNAT cannot self-confirm the
public address because the bind-derived addresses it offers as probe candidates
are not reachable.
Fix: thread Vec through Libp2pNetwork::from_config (libp2p_network.rs)
into NetworkNodeConfig.announce_addresses (config.rs), and call
Swarm::add_external_address for each in NetworkNode::new (node.rs). The existing
SwarmEvent::ExternalAddrConfirmed handler at node.rs already populates the DHT
self-record once the address is confirmed.
Also drop the default_value = "localhost:1769" on the CLI flag. Loopback was
actively wrong as a default once the flag is wired through to identify; absent
is now valid (no advertise) and bootstrapping the orchestrator without it
errors explicitly with a clear message.
Drive-by: fix a "fales" typo in a comment and whitelist the bimap crate name
in .typos.toml so the pre-commit spell-check passes.
References:
rust-libp2p Swarm::add_external_address (libp2p-swarm 0.47):
https://docs.rs/libp2p/0.56.0/libp2p/swarm/struct.Swarm.html#method.add_external_address
"Add a confirmed external address for the local node. This function should
only be called with addresses that are guaranteed to be reachable."
libp2p Identify spec - the listenAddrs field that peers receive during
identify exchange:
https://github.com/libp2p/specs/blob/master/identify/README.md
Without add_external_address, only listen addresses are populated, which for
bind=0.0.0.0 means loopback and any local interface IPs.