32 commits
ee68434
add a comment
ProjectsByJackHe Feb 18, 2026
68a7afe
update code to account for cibir
ProjectsByJackHe Feb 18, 2026
382fba7
clog fixes
ProjectsByJackHe Feb 18, 2026
c1c1445
update docs, update tests
ProjectsByJackHe Feb 19, 2026
df75326
rename reserveauxtcpsock and add explicit check for error code
ProjectsByJackHe Feb 21, 2026
a952c04
always false in kernel mode...
ProjectsByJackHe Feb 21, 2026
a0b20b3
XDP and CIBIR
ProjectsByJackHe Feb 21, 2026
d2bd384
add logs
ProjectsByJackHe Feb 28, 2026
c2ac95b
more logs
ProjectsByJackHe Feb 28, 2026
7f6d88e
more logs
ProjectsByJackHe Feb 28, 2026
b442e8a
return error if xdp not available
ProjectsByJackHe Mar 12, 2026
323d3f0
update docs on proper behavior
ProjectsByJackHe Mar 12, 2026
f55cc25
fix test
ProjectsByJackHe Mar 12, 2026
a70a8c2
get rid of bad artifact
ProjectsByJackHe Mar 12, 2026
cabc5a8
fix nit / clog
ProjectsByJackHe Mar 12, 2026
1557c28
address feedback
ProjectsByJackHe Mar 17, 2026
24e1390
clog
ProjectsByJackHe Mar 17, 2026
b43c894
fix inline def
ProjectsByJackHe Mar 18, 2026
8da485c
clog
ProjectsByJackHe Mar 19, 2026
2f06dad
update to use ifdef
ProjectsByJackHe Mar 21, 2026
a76904d
do not reserve any sockets at all
ProjectsByJackHe Mar 23, 2026
347356e
give it a non-0 local address
ProjectsByJackHe Mar 24, 2026
70a6e9e
proper port reservations
ProjectsByJackHe Mar 24, 2026
e4b5ecc
fix conn pool test case
ProjectsByJackHe Mar 24, 2026
cf4fd8b
fix tests
ProjectsByJackHe Mar 24, 2026
da3dfd6
put ifdef in handshaketest directly
ProjectsByJackHe Mar 26, 2026
4c9753c
update docs, add dbg assert, and fix logic
ProjectsByJackHe Mar 27, 2026
ced6a30
update warning logs
ProjectsByJackHe Mar 27, 2026
8a16a62
Merge branch 'main' into jackhe/sql-cibir-fix-sock-reservation
ProjectsByJackHe Apr 6, 2026
808ea0c
more crisp behavior
ProjectsByJackHe Apr 6, 2026
8a6eef8
Merge branch 'main' into jackhe/sql-cibir-fix-sock-reservation
ProjectsByJackHe Apr 7, 2026
ad2070a
update cibir docs
ProjectsByJackHe Apr 7, 2026
26 changes: 26 additions & 0 deletions docs/CIBIR.md
@@ -0,0 +1,26 @@
# CIBIR

## What is it

See [XDP](./XDP.md) first to understand the context.

Without CIBIR, MsQuic programs XDP to filter and demux packets based on address and port number.
With CIBIR, XDP instead filters and demuxes packets based on address, port number, and QUIC connection ID.

CIBIR allows two or more separate server processes to share a single
port on the same machine, as long as their CIBIR IDs are different.
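
As a rough illustration of the extra demux step, the sketch below checks whether a packet's destination connection ID carries a given CIBIR ID. It assumes the ID is a byte string expected at a fixed offset inside the connection ID (inferred from the "len" and "offset" fields in the log messages of this change). `CibirIdMatches` and its parameters are hypothetical names, not MsQuic or XDP APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper (not an MsQuic or XDP API): returns true when the
// CIBIR ID bytes appear in the destination connection ID at CibirOffset.
static bool
CibirIdMatches(
    const uint8_t* DestCid,
    size_t DestCidLength,
    const uint8_t* CibirId,
    size_t CibirIdLength,
    size_t CibirOffset)
{
    // Reject an empty ID, or an ID that would run past the end of the CID.
    if (CibirIdLength == 0 ||
        CibirOffset > DestCidLength ||
        CibirIdLength > DestCidLength - CibirOffset) {
        return false;
    }
    return memcmp(DestCid + CibirOffset, CibirId, CibirIdLength) == 0;
}
```

Under this assumption, two processes listening on the same address and port stay separate as long as each one only claims packets whose connection ID matches its own CIBIR ID.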

## CIBIR port sharing logic
- Applications must provide a well-known local port for server sockets when using CIBIR and XDP.
- **IMPORTANT:** MsQuic will **NOT** reserve an OS port for server sockets when both CIBIR and XDP are enabled and available.
> Client sockets can never share ports, so MsQuic will reserve an OS port in that scenario.
- The responsibility for bookkeeping shared ports and ensuring robust protection for them is delegated to the application.
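
The rules above can be sketched as a small decision function. This is a simplified illustration, not MsQuic's actual code: `EXAMPLE_SOCKET_INFO` and `ShouldReserveOsPort` are hypothetical names, QTIP is ignored, and the case where CIBIR is configured but XDP is unavailable is not specified in this document (the sketch simply falls back to reserving).

```c
#include <stdbool.h>

// Hypothetical description of one socket; these fields are illustrative
// and do not mirror MsQuic's internal structures.
typedef struct EXAMPLE_SOCKET_INFO {
    bool IsServer;      // server (listener) socket vs. client socket
    bool CibirEnabled;  // a CIBIR ID was configured
    bool XdpAvailable;  // XDP is enabled and available
} EXAMPLE_SOCKET_INFO;

// Returns true when an OS port should be reserved for the socket.
static bool
ShouldReserveOsPort(const EXAMPLE_SOCKET_INFO* Info)
{
    if (!Info->IsServer) {
        // Client sockets can never share ports, so they always reserve.
        return true;
    }
    // Server sockets skip the reservation only when CIBIR and XDP are both
    // in effect. Assumption: every other combination reserves a port.
    return !(Info->CibirEnabled && Info->XdpAvailable);
}
```

When the function returns false, nothing at the OS level protects the port, which is why the application has to do its own bookkeeping.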


## Port protection recommendation for shared ports

MsQuic strongly recommends applications leverage the Windows [persistent port reservations API](https://learn.microsoft.com/en-us/windows/win32/api/iphlpapi/nf-iphlpapi-createpersistentudpportreservation) to secure shared CIBIR ports prior to serving multi-process CIBIR traffic on a shared port.
- Creating the persistent reservation is a one-time setup step performed by a system admin.
> Registry keys are a good option for bookkeeping persistent port reservations.
- Persistent port reservations survive reboots, allowing for robust protection in the event of crashes.
- A persistent reservation takes critical ports out of the ephemeral port pool, so an unrelated application process won't accidentally be assigned an ephemeral port that collides with a CIBIR port.
2 changes: 1 addition & 1 deletion docs/Settings.md
@@ -169,7 +169,7 @@ These parameters are accessed by calling [GetParam](./api/GetParam.md) or [SetPa
|-------------------------------------------|---------------------------|-----------|-----------------------------------------------------------|
| `QUIC_PARAM_LISTENER_LOCAL_ADDRESS`<br> 0 | QUIC_ADDR | Get-only | Get the full address tuple the server is listening on. |
| `QUIC_PARAM_LISTENER_STATS`<br> 1 | QUIC_LISTENER_STATISTICS | Get-only | Get statistics specific to this Listener instance. |
| `QUIC_PARAM_LISTENER_CIBIR_ID`<br> 2 | uint8_t[] | Both | The CIBIR well-known idenfitier. |
| `QUIC_PARAM_LISTENER_CIBIR_ID`<br> 2 | uint8_t[] | Both | Sets a [CIBIR](./CIBIR.md) (CID-Based Identification and Routing) well-known identifier. |
| `QUIC_PARAM_DOS_MODE_EVENTS`<br> 2 | BOOLEAN | Both | The Listener opted in for DoS Mode event. |
| `QUIC_PARAM_LISTENER_PARTITION_INDEX`<br> (preview) | uint16_t | Both | The partition to use for listener callback events and incoming connections. |

130 changes: 130 additions & 0 deletions docs/XDP.md
@@ -0,0 +1,130 @@
# MsQuic over XDP

To avoid confusion, "XDP" refers to [XDP-for-windows](https://github.com/microsoft/xdp-for-windows). Linux XDP has been experimented
with in the past and showed some promise for running MsQuic, but it is NOT a stable, actively maintained datapath today.

## What is XDP

XDP enables received packets to completely bypass the OS networking stack.

Applications can subscribe to XDP ring buffers to post packets to send,
and process packets that are received through AF_XDP sockets.

Additionally, applications can program XDP to determine the
logic for which packets to filter for, and what to do with them.

For instance: "drop all packets with a UDP header and destination port
42."
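
To make the quoted rule concrete, here is a plain C predicate that expresses the same decision. It is an illustration only and does not use the XDP-for-windows API: the `EXAMPLE_*` types and `ExampleFilter` are hypothetical, and a real XDP program is configured through XDP's own rule structures rather than a callback like this.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_IPPROTO_UDP 17  // IANA-assigned protocol number for UDP

typedef enum EXAMPLE_ACTION {
    EXAMPLE_PASS,  // hand the packet to the normal OS networking stack
    EXAMPLE_DROP   // discard the packet
} EXAMPLE_ACTION;

// Hypothetical, already-parsed view of a packet's headers.
typedef struct EXAMPLE_PACKET {
    uint8_t IpProtocol;        // protocol field from the IP header
    uint16_t DestinationPort;  // destination port, host byte order
} EXAMPLE_PACKET;

// "Drop all packets with a UDP header and destination port 42."
static EXAMPLE_ACTION
ExampleFilter(const EXAMPLE_PACKET* Packet)
{
    if (Packet->IpProtocol == EXAMPLE_IPPROTO_UDP &&
        Packet->DestinationPort == 42) {
        return EXAMPLE_DROP;
    }
    return EXAMPLE_PASS;
}
```

Any packet that does not match the rule falls through to the regular stack, which is the behavior the rest of this document relies on.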

## Port reservation logic

The type of logic MsQuic programs into XDP looks like:
"redirect all packets with a destination port X to an AF_XDP socket."

This runs into the issue of **packet stealing.** If an unrelated process
binds an OS socket to the same port MsQuic used to program XDP, XDP will steal
that traffic from underneath it.

This is why, by default, MsQuic creates an OS UDP socket on the same port as the AF_XDP
socket: reserving the port lets MsQuic play nicely with the rest of the stack.

There are *exceptions* to this port reservation.

- Sometimes, MsQuic may create a TCP OS socket instead, or both TCP and UDP (see [QTIP](./QTIP.md)).
- Sometimes, MsQuic may NOT create any OS sockets at all (see [CIBIR](./CIBIR.md)).


## MsQuic over XDP general architecture

```mermaid
flowchart TB

%% =========================
%% NIC + RSS
%% =========================
NIC["NIC interface"]

RSS1["RSS queue"]
RSS2["RSS queue"]

NIC --> RSS1
NIC --> RSS2

%% =========================
%% XDP FILTER ENGINE
%% =========================
subgraph XDP_ENGINE["XDP FILTER ENGINE"]

XDP_PROG1["XDP::XDP program"]
XDP_PROG2["XDP::XDP program"]

XDP_RULES["XDP::XDP RULES"]

AFXDP1["AF_XDP Socket"]
AFXDP2["AF_XDP Socket"]

RSS1 -->|packet data| XDP_PROG1
RSS2 -->|packet data| XDP_PROG2

XDP_PROG1 --> XDP_RULES
XDP_PROG2 --> XDP_RULES

XDP_RULES --> AFXDP1
XDP_RULES --> AFXDP2

end

%% =========================
%% PACKET DEMUX
%% =========================
DEMUX["Packet DE-MUX logic"]

AFXDP1 --> DEMUX
AFXDP2 --> DEMUX

%% =========================
%% CXPLAT SOCKET POOL
%% =========================
subgraph CXPLAT_POOL["CXPLAT SOCKET POOL HASH TABLE"]

CX1["CXPLAT Socket"]
CX2["CXPLAT Socket"]
CX3["CXPLAT Socket"]
CX4["CXPLAT Socket"]

end

DEMUX --> CX1
DEMUX --> CX2
DEMUX --> CX3
DEMUX --> CX4

%% =========================
%% FIND BINDING LOGIC
%% =========================
BIND["FIND BINDING LOGIC"]

CX1 --> BIND
CX2 --> BIND
CX3 --> BIND
CX4 --> BIND

%% =========================
%% MSQUIC OBJECTS
%% =========================
subgraph MSQUIC_OBJECTS["MSQUIC OBJECTS"]

CONN1["Connection"]
CONN2["Connection"]
CONN3["Connection"]
LIST1["Listener"]
LIST2["Listener"]

end

BIND --> CONN1
BIND --> CONN2
BIND --> CONN3
BIND --> LIST1
BIND --> LIST2
```
9 changes: 6 additions & 3 deletions src/core/connection.c
@@ -6721,11 +6721,14 @@ QuicConnParamSet(
memcpy(Connection->CibirId + 1, Buffer, BufferLength);

QuicTraceLogConnInfo(
CibirIdSet,
CibirIdSetInfo,
Connection,
"CIBIR ID set (len %hhu, offset %hhu)",
"CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Connection->CibirId[0],
Connection->CibirId[1]);
Connection->CibirId[1],
(unsigned long long)QuicCibirIdToUint64(
Connection->CibirId + 2,
Connection->CibirId[0]));

return QUIC_STATUS_SUCCESS;
}
18 changes: 12 additions & 6 deletions src/core/listener.c
@@ -835,11 +835,14 @@ QuicListenerAcceptConnection(

if (Connection->CibirId[0] != 0) {
QuicTraceLogConnInfo(
CibirIdSet,
CibirIdSetInfo,
Connection,
"CIBIR ID set (len %hhu, offset %hhu)",
"CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Connection->CibirId[0],
Connection->CibirId[1]);
Connection->CibirId[1],
(unsigned long long)QuicCibirIdToUint64(
Connection->CibirId + 2,
Connection->CibirId[0]));
}

if (!QuicConnGenerateNewSourceCid(Connection, TRUE)) {
@@ -885,11 +888,14 @@ QuicListenerParamSet(
memcpy(Listener->CibirId + 1, Buffer, BufferLength);

QuicTraceLogVerbose(
ListenerCibirIdSet,
"[list][%p] CIBIR ID set (len %hhu, offset %hhu)",
ListenerCibirIdSetInfo,
"[list][%p] CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Listener,
Listener->CibirId[0],
Listener->CibirId[1]);
Listener->CibirId[1],
(unsigned long long)QuicCibirIdToUint64(
Listener->CibirId + 2,
Listener->CibirId[0]));

return QUIC_STATUS_SUCCESS;
}
22 changes: 14 additions & 8 deletions src/generated/linux/connection.c.clog.h
@@ -853,21 +853,27 @@ tracepoint(CLOG_CONNECTION_C, LocalInterfaceSet , arg1, arg3);\


/*----------------------------------------------------------
// Decoder Ring for CibirIdSet
// [conn][%p] CIBIR ID set (len %hhu, offset %hhu)
// Decoder Ring for CibirIdSetInfo
// [conn][%p] CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)
// QuicTraceLogConnInfo(
CibirIdSet,
CibirIdSetInfo,
Connection,
"CIBIR ID set (len %hhu, offset %hhu)",
"CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Connection->CibirId[0],
Connection->CibirId[1]);
Connection->CibirId[1],
(unsigned long long)QuicCibirIdToUint64(
Connection->CibirId + 2,
Connection->CibirId[0]));
// arg1 = arg1 = Connection = arg1
// arg3 = arg3 = Connection->CibirId[0] = arg3
// arg4 = arg4 = Connection->CibirId[1] = arg4
// arg5 = arg5 = (unsigned long long)QuicCibirIdToUint64(
Connection->CibirId + 2,
Connection->CibirId[0]) = arg5
----------------------------------------------------------*/
#ifndef _clog_5_ARGS_TRACE_CibirIdSet
#define _clog_5_ARGS_TRACE_CibirIdSet(uniqueId, arg1, encoded_arg_string, arg3, arg4)\
tracepoint(CLOG_CONNECTION_C, CibirIdSet , arg1, arg3, arg4);\
#ifndef _clog_6_ARGS_TRACE_CibirIdSetInfo
#define _clog_6_ARGS_TRACE_CibirIdSetInfo(uniqueId, arg1, encoded_arg_string, arg3, arg4, arg5)\
tracepoint(CLOG_CONNECTION_C, CibirIdSetInfo , arg1, arg3, arg4, arg5);\

#endif

22 changes: 15 additions & 7 deletions src/generated/linux/connection.c.clog.h.lttng.h
@@ -912,27 +912,35 @@ TRACEPOINT_EVENT(CLOG_CONNECTION_C, LocalInterfaceSet,


/*----------------------------------------------------------
// Decoder Ring for CibirIdSet
// [conn][%p] CIBIR ID set (len %hhu, offset %hhu)
// Decoder Ring for CibirIdSetInfo
// [conn][%p] CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)
// QuicTraceLogConnInfo(
CibirIdSet,
CibirIdSetInfo,
Connection,
"CIBIR ID set (len %hhu, offset %hhu)",
"CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Connection->CibirId[0],
Connection->CibirId[1]);
Connection->CibirId[1],
(unsigned long long)QuicCibirIdToUint64(
Connection->CibirId + 2,
Connection->CibirId[0]));
// arg1 = arg1 = Connection = arg1
// arg3 = arg3 = Connection->CibirId[0] = arg3
// arg4 = arg4 = Connection->CibirId[1] = arg4
// arg5 = arg5 = (unsigned long long)QuicCibirIdToUint64(
Connection->CibirId + 2,
Connection->CibirId[0]) = arg5
----------------------------------------------------------*/
TRACEPOINT_EVENT(CLOG_CONNECTION_C, CibirIdSet,
TRACEPOINT_EVENT(CLOG_CONNECTION_C, CibirIdSetInfo,
TP_ARGS(
const void *, arg1,
unsigned char, arg3,
unsigned char, arg4),
unsigned char, arg4,
unsigned long long, arg5),
TP_FIELDS(
ctf_integer_hex(uint64_t, arg1, (uint64_t)arg1)
ctf_integer(unsigned char, arg3, arg3)
ctf_integer(unsigned char, arg4, arg4)
ctf_integer(uint64_t, arg5, arg5)
)
)

46 changes: 46 additions & 0 deletions src/generated/linux/datapath_winuser.c.clog.h
@@ -159,6 +159,52 @@ tracepoint(CLOG_DATAPATH_WINUSER_C, DatapathTestSetIpv6TrafficClassFailed , arg2



/*----------------------------------------------------------
// Decoder Ring for DatapathCibirWarning
// [data][%p] CIBIR detected, %s
// QuicTraceLogWarning(
DatapathCibirWarning,
"[data][%p] CIBIR detected, %s",
Socket,
"Skipping OS port reservation for this server socket.");
// arg2 = arg2 = Socket = arg2
// arg3 = arg3 = "Skipping OS port reservation for this server socket." = arg3
----------------------------------------------------------*/
#ifndef _clog_4_ARGS_TRACE_DatapathCibirWarning
#define _clog_4_ARGS_TRACE_DatapathCibirWarning(uniqueId, encoded_arg_string, arg2, arg3)\
tracepoint(CLOG_DATAPATH_WINUSER_C, DatapathCibirWarning , arg2, arg3);\

#endif




/*----------------------------------------------------------
// Decoder Ring for DatapathCibirIdUsed
// [data][%p] Using CIBIR ID (len %hhu, id 0x%llx)
// QuicTraceLogWarning(
DatapathCibirIdUsed,
"[data][%p] Using CIBIR ID (len %hhu, id 0x%llx)",
Socket,
Config->CibirIdLength,
(unsigned long long)QuicCibirIdToUint64(
Config->CibirId,
Config->CibirIdLength));
// arg2 = arg2 = Socket = arg2
// arg3 = arg3 = Config->CibirIdLength = arg3
// arg4 = arg4 = (unsigned long long)QuicCibirIdToUint64(
Config->CibirId,
Config->CibirIdLength) = arg4
----------------------------------------------------------*/
#ifndef _clog_5_ARGS_TRACE_DatapathCibirIdUsed
#define _clog_5_ARGS_TRACE_DatapathCibirIdUsed(uniqueId, encoded_arg_string, arg2, arg3, arg4)\
tracepoint(CLOG_DATAPATH_WINUSER_C, DatapathCibirIdUsed , arg2, arg3, arg4);\

#endif




/*----------------------------------------------------------
// Decoder Ring for DatapathRecvEmpty
// [data][%p] Dropping datagram with empty payload.