Changes from 17 commits
Commits
32 commits
ee68434
add a comment
ProjectsByJackHe Feb 18, 2026
68a7afe
update code to account for cibir
ProjectsByJackHe Feb 18, 2026
382fba7
clog fixes
ProjectsByJackHe Feb 18, 2026
c1c1445
update docs, update tests
ProjectsByJackHe Feb 19, 2026
df75326
rename reserveauxtcpsock and add explicit check for error code
ProjectsByJackHe Feb 21, 2026
a952c04
always false in kernel mode...
ProjectsByJackHe Feb 21, 2026
a0b20b3
XDP and CIBIR
ProjectsByJackHe Feb 21, 2026
d2bd384
add logs
ProjectsByJackHe Feb 28, 2026
c2ac95b
more logs
ProjectsByJackHe Feb 28, 2026
7f6d88e
more logs
ProjectsByJackHe Feb 28, 2026
b442e8a
return error if xdp not available
ProjectsByJackHe Mar 12, 2026
323d3f0
update docs on proper behavior
ProjectsByJackHe Mar 12, 2026
f55cc25
fix test
ProjectsByJackHe Mar 12, 2026
a70a8c2
get rid of bad artifact
ProjectsByJackHe Mar 12, 2026
cabc5a8
fix nit / clog
ProjectsByJackHe Mar 12, 2026
1557c28
address feedback
ProjectsByJackHe Mar 17, 2026
24e1390
clog
ProjectsByJackHe Mar 17, 2026
b43c894
fix inline def
ProjectsByJackHe Mar 18, 2026
8da485c
clog
ProjectsByJackHe Mar 19, 2026
2f06dad
update to use ifdef
ProjectsByJackHe Mar 21, 2026
a76904d
do not reserve any sockets at all
ProjectsByJackHe Mar 23, 2026
347356e
give it a non-0 local address
ProjectsByJackHe Mar 24, 2026
70a6e9e
proper port reservations
ProjectsByJackHe Mar 24, 2026
e4b5ecc
fix conn pool test case
ProjectsByJackHe Mar 24, 2026
cf4fd8b
fix tests
ProjectsByJackHe Mar 24, 2026
da3dfd6
put ifdef in handshaketest directly
ProjectsByJackHe Mar 26, 2026
4c9753c
update docs, add dbg assert, and fix logic
ProjectsByJackHe Mar 27, 2026
ced6a30
update warning logs
ProjectsByJackHe Mar 27, 2026
8a16a62
Merge branch 'main' into jackhe/sql-cibir-fix-sock-reservation
ProjectsByJackHe Apr 6, 2026
808ea0c
more crisp behavior
ProjectsByJackHe Apr 6, 2026
8a6eef8
Merge branch 'main' into jackhe/sql-cibir-fix-sock-reservation
ProjectsByJackHe Apr 7, 2026
ad2070a
update cibir docs
ProjectsByJackHe Apr 7, 2026
28 changes: 28 additions & 0 deletions docs/CIBIR.md
@@ -0,0 +1,28 @@
# CIBIR

## What is it

See [XDP](./XDP.md) first to understand the context.

When CIBIR is used, XDP is no longer programmed to filter packets by port number.
Instead, packets are filtered and de-muxed based on their QUIC connection ID (CID).

CIBIR (CID-Based Identification and Routing) is a prefix that XDP matches against
the QUIC CID of every packet. Packets whose CID contains a prefix equal to the CIBIR value are matched and filtered.

CIBIR also allows two or more separate server processes to share a single
port. As long as each process uses a different CIBIR configuration, XDP can
de-mux received packets and dispatch them to the right process.
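
The de-mux described above can be sketched roughly as follows. This is an illustration only, not the XDP-for-windows matcher: `CibirRule`, `CidMatchesCibir`, `DemuxByCibir` and `OwnerIndex` are hypothetical names, and the sketch assumes the CIBIR value sits at the very start of the CID (the offset that appears in the MsQuic trace lines is ignored here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical rule: one CIBIR value and the process it steers packets to. */
typedef struct CibirRule {
    const uint8_t* Id;   /* CIBIR bytes configured by a process */
    size_t IdLength;     /* number of bytes in Id */
    int OwnerIndex;      /* illustrative handle for the owning process */
} CibirRule;

/* Returns nonzero when the CID begins with the rule's CIBIR bytes.
 * Assumption: the prefix starts at offset 0 of the CID. */
int CidMatchesCibir(const uint8_t* Cid, size_t CidLength, const CibirRule* Rule)
{
    assert(Cid != NULL && Rule != NULL);
    if (Rule->IdLength == 0 || Rule->IdLength > CidLength) {
        return 0; /* an empty or over-long prefix can never match */
    }
    return memcmp(Cid, Rule->Id, Rule->IdLength) == 0;
}

/* Returns the owner of the first matching rule, or -1 when nothing matches. */
int DemuxByCibir(
    const uint8_t* Cid, size_t CidLength,
    const CibirRule* Rules, size_t RuleCount)
{
    for (size_t i = 0; i < RuleCount; i++) {
        if (CidMatchesCibir(Cid, CidLength, &Rules[i])) {
            return Rules[i].OwnerIndex;
        }
    }
    return -1;
}
```

Two processes sharing a port would each register a rule with a different `Id`; a packet whose CID matches neither rule is simply not claimed by either process in this sketch.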

## Port reservation

The first process that uses CIBIR still needs to reserve the OS ports, so that
non-CIBIR applications do not have their traffic stolen. Subsequent processes
using CIBIR skip reserving OS socket ports.

CIBIR usage is controlled by setting the `QUIC_PARAM_LISTENER_CIBIR_ID` parameter via SetParam.

Setting CIBIR does two things:
1. XDP steers packets to the correct process/listener by matching the CIBIR prefix against the QUIC connection ID in each packet.

2. If a port collision occurs when reserving OS UDP/TCP sockets, MsQuic continues initializing the datapath if XDP is available/enabled. If XDP is not available/enabled, MsQuic returns a `QUIC_STATUS_INVALID_STATE` socket error.
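
The second rule can be read as a small decision table. The sketch below only models the CIBIR-enabled case described above; the enum, its values and `CibirResolveReservation` are hypothetical names chosen for illustration, and `INIT_FAIL_INVALID_STATE` merely stands in for `QUIC_STATUS_INVALID_STATE`. It is not taken from the MsQuic datapath code.

```c
#include <stdbool.h>

/* Hypothetical outcomes of datapath initialization when CIBIR is set. */
typedef enum DatapathInitResult {
    INIT_CONTINUE_WITH_OS_SOCKET, /* no collision: the OS port was reserved */
    INIT_CONTINUE_XDP_ONLY,       /* collision, but XDP can still steer by CIBIR */
    INIT_FAIL_INVALID_STATE       /* stand-in for QUIC_STATUS_INVALID_STATE */
} DatapathInitResult;

/* Models the documented behavior for a listener that has a CIBIR ID set.
 * PortCollision: reserving the OS UDP/TCP socket failed because the port is taken.
 * XdpAvailable:  XDP is available/enabled for this datapath. */
DatapathInitResult CibirResolveReservation(bool PortCollision, bool XdpAvailable)
{
    if (!PortCollision) {
        return INIT_CONTINUE_WITH_OS_SOCKET;
    }
    /* Collision: only XDP can deliver packets without owning the OS port. */
    return XdpAvailable ? INIT_CONTINUE_XDP_ONLY : INIT_FAIL_INVALID_STATE;
}
```

What happens on a collision when no CIBIR ID is set is deliberately left out, because the text above does not describe that case.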
2 changes: 1 addition & 1 deletion docs/Settings.md
@@ -169,7 +169,7 @@ These parameters are accessed by calling [GetParam](./api/GetParam.md) or [SetPa
|-------------------------------------------|---------------------------|-----------|-----------------------------------------------------------|
| `QUIC_PARAM_LISTENER_LOCAL_ADDRESS`<br> 0 | QUIC_ADDR | Get-only | Get the full address tuple the server is listening on. |
| `QUIC_PARAM_LISTENER_STATS`<br> 1 | QUIC_LISTENER_STATISTICS | Get-only | Get statistics specific to this Listener instance. |
-| `QUIC_PARAM_LISTENER_CIBIR_ID`<br> 2 | uint8_t[] | Both | The CIBIR well-known idenfitier. |
+| `QUIC_PARAM_LISTENER_CIBIR_ID`<br> 2 | uint8_t[] | Both | Sets a [CIBIR](./CIBIR.md) (CID-Based Identification and Routing) well-known identifier. |
| `QUIC_PARAM_DOS_MODE_EVENTS`<br> 2 | BOOLEAN | Both | The Listener opted in for DoS Mode event. |
| `QUIC_PARAM_LISTENER_PARTITION_INDEX`<br> (preview) | uint16_t | Both | The partition to use for listener callback events and incoming connections. |

130 changes: 130 additions & 0 deletions docs/XDP.md
@@ -0,0 +1,130 @@
# MsQuic over XDP

To avoid confusion, "XDP" here refers to [XDP-for-windows](https://github.com/microsoft/xdp-for-windows). Linux XDP has been experimented
with in the past and showed some promise for running MsQuic, but it is NOT a stable, actively maintained datapath today.

## What is XDP

XDP enables received packets to completely bypass the OS networking stack.

Applications can subscribe to XDP ring buffers to post packets to send
and to process packets received through AF_XDP sockets.

Additionally, applications can program XDP with rules that determine
which packets to match and what to do with them.

For instance: "drop all packets with a UDP header and destination port 42."
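
A rule like the one quoted above can be pictured as a match/action pair that is evaluated per packet. The following is a minimal sketch under that assumption; `FilterRule`, `PacketInfo`, `RuleAction` and `EvaluateRules` are hypothetical names and do not correspond to the actual XDP-for-windows rule structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IP_PROTOCOL_UDP 17 /* IANA protocol number for UDP */

/* Hypothetical actions a rule can request for a matching packet. */
typedef enum RuleAction {
    RULE_ACTION_PASS,     /* hand the packet to the regular OS stack */
    RULE_ACTION_DROP,     /* discard the packet */
    RULE_ACTION_REDIRECT  /* deliver the packet to an AF_XDP socket */
} RuleAction;

/* The header fields this sketch matches on. */
typedef struct PacketInfo {
    uint8_t IpProtocol;
    uint16_t DestPort;
} PacketInfo;

/* One match/action rule: protocol + destination port -> action. */
typedef struct FilterRule {
    uint8_t IpProtocol;
    uint16_t DestPort;
    RuleAction Action;
} FilterRule;

/* Returns the action of the first matching rule; unmatched packets pass through. */
RuleAction EvaluateRules(const PacketInfo* Packet, const FilterRule* Rules, size_t RuleCount)
{
    assert(Packet != NULL);
    for (size_t i = 0; i < RuleCount; i++) {
        if (Rules[i].IpProtocol == Packet->IpProtocol &&
            Rules[i].DestPort == Packet->DestPort) {
            return Rules[i].Action;
        }
    }
    return RULE_ACTION_PASS;
}
```

With a single rule `{IP_PROTOCOL_UDP, 42, RULE_ACTION_DROP}`, a UDP packet to port 42 is dropped while every other packet continues to the OS stack.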

## Port reservation logic

The type of logic MsQuic programs into XDP looks like:
"redirect all packets with a destination port X to an AF_XDP socket."

This runs into the issue of **packet stealing.** If an unrelated process
binds an OS socket to the same port that MsQuic programmed into XDP, XDP will steal
that process's traffic from underneath it.

This is why MsQuic normally creates an OS UDP socket on the same port as the AF_XDP
socket: it reserves the port and plays nicely with the rest of the stack.

There are *exceptions* to this port reservation.

- Sometimes, MsQuic may create a TCP OS socket instead, or both TCP and UDP (see [QTIP](./QTIP.md)).
- Sometimes, MsQuic may NOT create any OS sockets at all (see [CIBIR](./CIBIR.md)).
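
The effect of reserving a port with an OS socket can be demonstrated with plain sockets. The sketch below uses POSIX sockets purely for illustration (MsQuic over XDP runs on Windows and uses its own datapath code); `ReserveUdpPort` is a hypothetical helper, not an MsQuic function. It shows that once one socket holds a UDP port, a second bind to the same port fails, which is what keeps unrelated processes off a port whose traffic XDP redirects.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Binds a UDP socket to Port on all IPv4 addresses (0 lets the OS pick one).
 * Returns the socket descriptor, or -1 with errno set on failure.
 * On success, *BoundPort receives the bound port in host byte order. */
int ReserveUdpPort(uint16_t Port, uint16_t* BoundPort)
{
    int Fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (Fd < 0) {
        return -1;
    }

    struct sockaddr_in Addr;
    memset(&Addr, 0, sizeof(Addr));
    Addr.sin_family = AF_INET;
    Addr.sin_addr.s_addr = htonl(INADDR_ANY);
    Addr.sin_port = htons(Port);

    socklen_t Length = sizeof(Addr);
    if (bind(Fd, (struct sockaddr*)&Addr, sizeof(Addr)) != 0 ||
        getsockname(Fd, (struct sockaddr*)&Addr, &Length) != 0) {
        int Saved = errno; /* preserve the failure reason across close() */
        close(Fd);
        errno = Saved;
        return -1;
    }

    *BoundPort = ntohs(Addr.sin_port);
    return Fd;
}
```

Calling `ReserveUdpPort(0, &Port)` and then `ReserveUdpPort(Port, &Other)` while the first descriptor is still open should make the second call fail with `EADDRINUSE`; closing the first descriptor releases the reservation.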


## MsQuic over XDP general architecture

```mermaid
flowchart TB

%% =========================
%% NIC + RSS
%% =========================
NIC["NIC interface"]

RSS1["RSS queue"]
RSS2["RSS queue"]

NIC --> RSS1
NIC --> RSS2

%% =========================
%% XDP FILTER ENGINE
%% =========================
subgraph XDP_ENGINE["XDP FILTER ENGINE"]

XDP_PROG1["XDP::XDP program"]
XDP_PROG2["XDP::XDP program"]

XDP_RULES["XDP::XDP RULES"]

AFXDP1["AF_XDP Socket"]
AFXDP2["AF_XDP Socket"]

RSS1 -->|packet data| XDP_PROG1
RSS2 -->|packet data| XDP_PROG2

XDP_PROG1 --> XDP_RULES
XDP_PROG2 --> XDP_RULES

XDP_RULES --> AFXDP1
XDP_RULES --> AFXDP2

end

%% =========================
%% PACKET DEMUX
%% =========================
DEMUX["Packet DE-MUX logic"]

AFXDP1 --> DEMUX
AFXDP2 --> DEMUX

%% =========================
%% CXPLAT SOCKET POOL
%% =========================
subgraph CXPLAT_POOL["CXPLAT SOCKET POOL HASH TABLE"]

CX1["CXPLAT Socket"]
CX2["CXPLAT Socket"]
CX3["CXPLAT Socket"]
CX4["CXPLAT Socket"]

end

DEMUX --> CX1
DEMUX --> CX2
DEMUX --> CX3
DEMUX --> CX4

%% =========================
%% FIND BINDING LOGIC
%% =========================
BIND["FIND BINDING LOGIC"]

CX1 --> BIND
CX2 --> BIND
CX3 --> BIND
CX4 --> BIND

%% =========================
%% MSQUIC OBJECTS
%% =========================
subgraph MSQUIC_OBJECTS["MSQUIC OBJECTS"]

CONN1["Connection"]
CONN2["Connection"]
CONN3["Connection"]
LIST1["Listener"]
LIST2["Listener"]

end

BIND --> CONN1
BIND --> CONN2
BIND --> CONN3
BIND --> LIST1
BIND --> LIST2
```
9 changes: 6 additions & 3 deletions src/core/connection.c
@@ -6709,11 +6709,14 @@ QuicConnParamSet(
memcpy(Connection->CibirId + 1, Buffer, BufferLength);

QuicTraceLogConnInfo(
-CibirIdSet,
+CibirIdSetInfo,
Connection,
-"CIBIR ID set (len %hhu, offset %hhu)",
+"CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Connection->CibirId[0],
-Connection->CibirId[1]);
+Connection->CibirId[1],
+(unsigned long long)QuicCibirIdToUint64(
+Connection->CibirId + 2,
+Connection->CibirId[0]));

return QUIC_STATUS_SUCCESS;
}
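
The definition of `QuicCibirIdToUint64` is not part of the hunks shown here. Judging only from the call site and the format string, `CibirId[0]` holds the ID length, `CibirId[1]` the offset, and the ID bytes start at `CibirId + 2`. A helper with that shape could look like the sketch below; `CibirIdToUint64Sketch` is a hypothetical name, and both the byte order and the truncation to 8 bytes are assumptions rather than the PR's actual behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Folds up to the first 8 ID bytes into an integer for logging,
 * most significant byte first (assumed byte order). */
uint64_t CibirIdToUint64Sketch(const uint8_t* Id, uint8_t Length)
{
    assert(Id != NULL || Length == 0);
    uint64_t Value = 0;
    uint8_t Count = Length < 8 ? Length : 8; /* a uint64_t holds at most 8 bytes */
    for (uint8_t i = 0; i < Count; i++) {
        Value = (Value << 8) | Id[i];
    }
    return Value;
}
```

Under these assumptions, the ID bytes `12 34 AB` would be logged as `id 0x1234ab` by the `0x%llx` specifier.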
18 changes: 12 additions & 6 deletions src/core/listener.c
@@ -835,11 +835,14 @@ QuicListenerAcceptConnection(

if (Connection->CibirId[0] != 0) {
QuicTraceLogConnInfo(
-CibirIdSet,
+CibirIdSetInfo,
Connection,
-"CIBIR ID set (len %hhu, offset %hhu)",
+"CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Connection->CibirId[0],
-Connection->CibirId[1]);
+Connection->CibirId[1],
+(unsigned long long)QuicCibirIdToUint64(
+Connection->CibirId + 2,
+Connection->CibirId[0]));
}

if (!QuicConnGenerateNewSourceCid(Connection, TRUE)) {
@@ -885,11 +888,14 @@ QuicListenerParamSet(
memcpy(Listener->CibirId + 1, Buffer, BufferLength);

QuicTraceLogVerbose(
-ListenerCibirIdSet,
-"[list][%p] CIBIR ID set (len %hhu, offset %hhu)",
+ListenerCibirIdSetInfo,
+"[list][%p] CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Listener,
Listener->CibirId[0],
-Listener->CibirId[1]);
+Listener->CibirId[1],
+(unsigned long long)QuicCibirIdToUint64(
+Listener->CibirId + 2,
+Listener->CibirId[0]));

return QUIC_STATUS_SUCCESS;
}
22 changes: 14 additions & 8 deletions src/generated/linux/connection.c.clog.h
@@ -853,21 +853,27 @@ tracepoint(CLOG_CONNECTION_C, LocalInterfaceSet , arg1, arg3);\


/*----------------------------------------------------------
-// Decoder Ring for CibirIdSet
-// [conn][%p] CIBIR ID set (len %hhu, offset %hhu)
+// Decoder Ring for CibirIdSetInfo
+// [conn][%p] CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)
// QuicTraceLogConnInfo(
-CibirIdSet,
+CibirIdSetInfo,
Connection,
-"CIBIR ID set (len %hhu, offset %hhu)",
+"CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Connection->CibirId[0],
-Connection->CibirId[1]);
+Connection->CibirId[1],
+(unsigned long long)QuicCibirIdToUint64(
+Connection->CibirId + 2,
+Connection->CibirId[0]));
// arg1 = arg1 = Connection = arg1
// arg3 = arg3 = Connection->CibirId[0] = arg3
// arg4 = arg4 = Connection->CibirId[1] = arg4
+// arg5 = arg5 = (unsigned long long)QuicCibirIdToUint64(
+Connection->CibirId + 2,
+Connection->CibirId[0]) = arg5
----------------------------------------------------------*/
-#ifndef _clog_5_ARGS_TRACE_CibirIdSet
-#define _clog_5_ARGS_TRACE_CibirIdSet(uniqueId, arg1, encoded_arg_string, arg3, arg4)\
-tracepoint(CLOG_CONNECTION_C, CibirIdSet , arg1, arg3, arg4);\
+#ifndef _clog_6_ARGS_TRACE_CibirIdSetInfo
+#define _clog_6_ARGS_TRACE_CibirIdSetInfo(uniqueId, arg1, encoded_arg_string, arg3, arg4, arg5)\
+tracepoint(CLOG_CONNECTION_C, CibirIdSetInfo , arg1, arg3, arg4, arg5);\

#endif

22 changes: 15 additions & 7 deletions src/generated/linux/connection.c.clog.h.lttng.h
@@ -912,27 +912,35 @@ TRACEPOINT_EVENT(CLOG_CONNECTION_C, LocalInterfaceSet,


/*----------------------------------------------------------
-// Decoder Ring for CibirIdSet
-// [conn][%p] CIBIR ID set (len %hhu, offset %hhu)
+// Decoder Ring for CibirIdSetInfo
+// [conn][%p] CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)
// QuicTraceLogConnInfo(
-CibirIdSet,
+CibirIdSetInfo,
Connection,
-"CIBIR ID set (len %hhu, offset %hhu)",
+"CIBIR ID set (len %hhu, offset %hhu, id 0x%llx)",
Connection->CibirId[0],
-Connection->CibirId[1]);
+Connection->CibirId[1],
+(unsigned long long)QuicCibirIdToUint64(
+Connection->CibirId + 2,
+Connection->CibirId[0]));
// arg1 = arg1 = Connection = arg1
// arg3 = arg3 = Connection->CibirId[0] = arg3
// arg4 = arg4 = Connection->CibirId[1] = arg4
+// arg5 = arg5 = (unsigned long long)QuicCibirIdToUint64(
+Connection->CibirId + 2,
+Connection->CibirId[0]) = arg5
----------------------------------------------------------*/
-TRACEPOINT_EVENT(CLOG_CONNECTION_C, CibirIdSet,
+TRACEPOINT_EVENT(CLOG_CONNECTION_C, CibirIdSetInfo,
TP_ARGS(
const void *, arg1,
unsigned char, arg3,
-unsigned char, arg4),
+unsigned char, arg4,
+unsigned long long, arg5),
TP_FIELDS(
ctf_integer_hex(uint64_t, arg1, (uint64_t)arg1)
ctf_integer(unsigned char, arg3, arg3)
ctf_integer(unsigned char, arg4, arg4)
+ctf_integer(uint64_t, arg5, arg5)
)
)

51 changes: 0 additions & 51 deletions src/generated/linux/datapath_linux.c.clog.h

This file was deleted.
