Skip to content

Commit 1f8a1f7

Browse files
wip
1 parent 176ebe9 commit 1f8a1f7

17 files changed

Lines changed: 2067 additions & 163 deletions

File tree

specification/Cyphal_Specification.tex

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -81,12 +81,14 @@ \section*{Limitation of liability}
8181
\end{titlepage}
8282

8383
\tableofcontents
84-
\listoftables
85-
\listoffigures
84+
\clearpage\listoftables
85+
\BeginRightColumn\listoffigures
8686

8787
\mainmatter
8888

8989
\input{introduction/introduction.tex}
90+
\input{session/session.tex}
9091
\input{transport/transport.tex}
92+
\input{appendices/appendices.tex}
9193

9294
\end{document}
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
\appendix
2+
3+
\input{appendices/eviction.tex}
4+
\input{appendices/core_tla.tex}
5+
\input{appendices/proof.tex}
6+
\input{appendices/rapidhash.tex}
Lines changed: 256 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,256 @@
1+
\chapter{\texorpdfstring{TLA\textsuperscript{+}}{TLA+} model of the allocation CRDT}\label{sec:appendix_core_tla}
2+
3+
This appendix reproduces the complete contents of
4+
\texttt{model/tlaplus/Core.tla} from the Cy reference implementation.
5+
Core.tla is the primary source of truth for the semantics of the topic
6+
allocation CRDT described in section~\ref{sec:session_crdt}; where this
7+
specification and Core.tla disagree, Core.tla is authoritative.
8+
The \verb|Check_| statements have been removed from this transcript.
9+
10+
\begin{minted}[fontsize=\scriptsize]{text}
11+
------------------------------ MODULE Core ------------------------------
12+
\* Key operators defined on Cyphal named topics. This is the core part of the formal protocol specification.
13+
\* This is pure TLA+ without PlusCal.
14+
\*
15+
\* The decentralized topic subject-ID allocation protocol in Cyphal solves the problem of allocating unique subject-IDs
16+
\* per topic name. The objective is to find a bijective configuration that is conflict-free
17+
\* (one topic name per subject-ID) and non-divergent (one subject-ID per topic name).
18+
\*
19+
\* There is no central coordinator. Nodes eventually find the consensus by periodically broadcasting gossip messages.
20+
\* Each gossip message carries a single topic->subject allocation entry (nodes apply simple rules to choose which entry
21+
\* to gossip next). Every node only keeps the entries that it needs (publishers/subscribers that the application needs
22+
\* to use); no node is required to keep the full set of entries. Every node must be *eventually* able to receive and
23+
\* send gossip messages for the protocol to make progress; transient inability to do so will not break the protocol
24+
\* but may delay the consensus (resource-limited nodes may choose to drop some gossip messages to reduce processing
25+
\* burden; this does not violate the core assumptions of the protocol as long as such nodes are able to *eventually*
26+
\* receive each live allocation entry).
27+
\*
28+
\* The allocation protocol does not operate on topic names directly, substituting them with numerical hashes instead.
29+
\* The application layer is expected to attach names to hashes but this is of no relevance to the core protocol.
30+
\* From the standpoint of the core allocation protocol, "topic name" and "topic hash" can be used interchangeably.
31+
\*
32+
\* Each allocation entry contains three fields: the topic hash, the topic eviction count, and the topic age.
33+
\* A pair of entries can be compared to each other to determine if the pair constitutes a collision
34+
\* (different names, same subject) or a divergence (same name, different subjects);
35+
\* other comparison outcomes are of no interest.
36+
\*
37+
\* The eviction count is used as a Lamport clock and hence implementations ensure that it cannot overflow. Since
38+
\* evictions are static on a quiescent network, even relatively narrow representations (e.g., 32 bit) may suffice.
39+
\*
40+
\* The subject-ID assigned to a topic is defined as some function of its hash and eviction count. One possible way
41+
\* to define the function is, assuming 64-bit unsigned arithmetics:
42+
\*
43+
\* subject_id = 8192 + ((hash + evictions**2) % 8380403)
44+
\*
45+
\* For the background on the subject of open addressing schemes and why the specific values were chosen this way, see:
46+
\* - https://en.wikipedia.org/wiki/Quadratic_probing
47+
\* - https://github.com/OpenCyphal-Garage/cy/issues/12
48+
\*
49+
\* The eviction count of a topic is incremented whenever the topic is involved in a collision and loses arbitration.
50+
\* Arbitration is defined as two functions, defined here as LeftWinsCollision and LeftWinsDivergence.
51+
\* The topic age is incremented monotonically as some kind of real or logical clock, such that each node that holds
52+
\* this topic is expected to increment its age at approximately the same rate (the protocol does not require the
53+
\* increment rate to match accurately, but better rate matching results in faster worst-case convergence).
54+
\* The age is used to determine which topic wins arbitration in the event of a collision or a divergence,
55+
\* with the core idea being that older topics win arbitration to avoid disturbances to established networks.
56+
\*
57+
\* The topic-subject allocation table described above is a kind of CRDT (conflict-free replicated data type).
58+
\* A CRDT only guarantees eventual convergence while the system is quiescent, which seemingly contradicts the
59+
\* use of the age counter that increments continuously. To work around this, the protocol uses the binary
60+
\* logarithm of topic ages for topic comparison instead of the actual age value, which naturally slows down
61+
\* the increment rate over time, approaching quiescence.
62+
\*
63+
\* When a node creates a new topic (i.e., when the local application creates a new subscription or publication),
64+
\* its age and eviction counter are set to zero; the initial subject-ID is obtained using the function above.
65+
\* The topic is commissioned to use immediately without the need to wait for the network to converge,
66+
\* despite the fact that it is known that the initial allocation may be conflicting or divergent.
67+
\* To avoid the risk of data misinterpretation should a collision occur, the transport layer carries the
68+
\* topic hash per transfer, and the transfer is rejected should the hash be incorrect.
69+
\* Each local topic is reconfigured as necessary should the allocation protocol require so, transparently for
70+
\* the application.
71+
\*
72+
\* Upon reception of a gossip, the age of the matching local topic, if there is one, is updated as:
73+
\* local_age = max(local_age, 2**remote_age_log2).
74+
\*
75+
\* Pavel Kirienko <pavel@opencyphal.org>, MIT license
76+
77+
EXTENDS Utils
78+
LOCAL INSTANCE Integers
79+
LOCAL INSTANCE TLC
80+
LOCAL INSTANCE Sequences
81+
LOCAL INSTANCE FiniteSets
82+
83+
\* Computes floor(log2(x)), with the special case floor(log2(0)) == -1
84+
RECURSIVE Log2Floor(_)
85+
Log2Floor(x) == IF x > 1 THEN 1 + Log2Floor(x \div 2) ELSE x - 1
86+
87+
\* Special case: Pow2(-1) = 0
88+
Pow2(x) == IF x > 1 THEN 2^x ELSE x + 1
89+
90+
FloorToPow2(x) == Pow2(Log2Floor(x))
91+
92+
\***********************************************************************************************************************
93+
\* Subject-ID mapping function. The ring size is the total number of distinct subject-IDs.
94+
\* TODO: Switch to quadratic probing: https://github.com/OpenCyphal-Garage/cy/issues/12
95+
SubjectIDRing == 10
96+
SubjectID(hash, evictions) == (hash + evictions) % SubjectIDRing
97+
98+
\***********************************************************************************************************************
99+
\* COLLISION is when topics with different hashes land on the same subject-ID (see the subject_id mapping).
100+
LeftWinsCollision(a, b) == IF Log2Floor(a.age) # Log2Floor(b.age)
101+
THEN Log2Floor(a.age) > Log2Floor(b.age)
102+
ELSE a.hash < b.hash
103+
104+
\* DIVERGENCE is when topics with the same hash are found to have different eviction counters:
105+
\* the eviction counters have to match, so we need to decide which one is the correct one.
106+
LeftWinsDivergence(a, b) == \/ Log2Floor(a.age) > Log2Floor(b.age)
107+
\/ (Log2Floor(a.age) = Log2Floor(b.age) /\ a.evictions > b.evictions)
108+
109+
\***********************************************************************************************************************
110+
\* The result is Nothing if no such topic exists.
111+
\* Topics are stored in sets because all operations on them are ordering-invariant,
112+
\* which is a basic prerequisite for CRDT operations.
113+
GetByHash(hash, topics) == Get(topics, LAMBDA x: x.hash = hash)
114+
GetBySubjectID(subject_id, topics) == Get(topics, LAMBDA x: SubjectID(x.hash, x.evictions) = subject_id)
115+
116+
\***********************************************************************************************************************
117+
\* A set of topics without the specified one. Same set if the specified topic is not a member.
118+
RemoveTopic(hash, topics) == { t \in topics : t.hash # hash }
119+
120+
\* Add or replace a topic into a set of topics.
121+
\* Use this to mutate a topic state in a set.
122+
ReplaceTopic(new, topics) == {new} \cup RemoveTopic(new.hash, topics)
123+
124+
\* A set of topics extended with the specified one, and the existing topics possibly altered.
125+
\* Uniqueness is guaranteed; if the topic is in the set already, it will be modified.
126+
\* This can also be used to model state update of the local topic table.
127+
RECURSIVE AllocateTopic(_, _)
128+
AllocateTopic(t, topics) ==
129+
LET ts == RemoveTopic(t.hash, topics)
130+
x == GetBySubjectID(SubjectID(t.hash, t.evictions), ts)
131+
Evicted(z) == [hash |-> z.hash, evictions |-> 1 + z.evictions, age |-> z.age]
132+
IN IF x = Nothing THEN {t} \cup ts
133+
ELSE IF LeftWinsCollision(t, x) THEN AllocateTopic(Evicted(x), {t} \cup ts)
134+
ELSE AllocateTopic(Evicted(t), ts) \* Retry with evictions+1
135+
136+
\***********************************************************************************************************************
137+
\* Constructs a conflict-free topic set. This is meant for constructing the initial node state.
138+
\* Each hash can occur at most once.
139+
RECURSIVE AllocateTopics(_, _)
140+
AllocateTopics(new, topics) ==
141+
IF Cardinality(new) = 0 THEN topics
142+
ELSE LET t == CHOOSE x \in new : TRUE
143+
IN AllocateTopics(new \ {t}, AllocateTopic(t, topics))
144+
145+
\***********************************************************************************************************************
146+
\* Implementation of the divergence resolution rule with CRDT age merge.
147+
LOCAL AcceptGossip_Divergence(remote, topics) ==
148+
LET hash == remote.hash
149+
local == GetByHash(hash, topics)
150+
\* The serialization protocol floors the remote age to pow2 before transmission.
151+
new_age == Max({IF local # Nothing THEN local.age ELSE 0, FloorToPow2(remote.age)})
152+
IN
153+
IF local # Nothing THEN
154+
IF local.evictions # remote.evictions /\ LeftWinsDivergence(remote, local)
155+
THEN AllocateTopic([hash|->hash, evictions|->remote.evictions, age|->new_age], topics)
156+
ELSE {[hash|->hash, evictions|-> local.evictions, age|->new_age]} \cup RemoveTopic(hash, topics)
157+
ELSE topics
158+
159+
\* Implementation of the collision resolution rule.
160+
LOCAL AcceptGossip_Collision(remote, topics) ==
161+
LET local == GetBySubjectID(SubjectID(remote.hash, remote.evictions), topics)
162+
IN IF local # Nothing /\ LeftWinsCollision(remote, local)
163+
THEN AllocateTopic([hash |-> local.hash, evictions |-> local.evictions + 1, age |-> local.age], topics)
164+
ELSE topics
165+
166+
\* An updated sequence of topics based on a received gossip message.
167+
\* The sequence may be modified if the gossiped entry is observed to diverge or collide with a local entry,
168+
\* and the affected local entry loses arbitration to the gossiped one.
169+
\* The naive implementation is as simple as:
170+
\*
171+
\* AcceptGossip_Collision(remote, AcceptGossip_Divergence(remote, topics))
172+
\*
173+
\* which is correct and converges eventually; however, it is not optimal because it may cause a local entry
174+
\* to lose a collision arbitration to a divergent remote entry, which will cause unnecessary eviction counter
175+
\* increments and thus delay convergence. To avoid this, we ignore collisions if the remote entry is seen to be
176+
\* divergent from a local entry and the local entry wins the divergence arbitration.
177+
AcceptGossip(remote, topics) ==
178+
IF GetByHash(remote.hash, topics) # Nothing \* If the topic exists locally
179+
THEN AcceptGossip_Divergence(remote, topics) \* Divergence resolver will ensure collision-freedom as well
180+
ELSE AcceptGossip_Collision(remote, topics) \* We don't know this topic, just ensure collision-freedom
181+
182+
\***********************************************************************************************************************
183+
\* Divergent allocation detector operating on a function node_id -> {topic_0, topic_1, ..., topic_n}
184+
\* A divergent allocation occurs when topic records with the same hash have distinct eviction counters.
185+
\* If no divergences are found, the result is an empty set.
186+
FindDivergent(node_topics) ==
187+
LET hashes == { t.hash : t \in UNION Range(node_topics) }
188+
flat == { [hash |-> t.hash, evictions |-> t.evictions] : t \in UNION Range(node_topics) }
189+
IN { h \in hashes : Cardinality({ s \in flat : s.hash = h }) > 1 }
190+
191+
\***********************************************************************************************************************
192+
\* Allocation collision detector operating on a function node_id -> {topic_0, topic_1, ..., topic_n}
193+
\* A collision occurs when topic records with distinct hashes have identical subject-ID.
194+
\* If no collisions are found, the result is an empty set.
195+
FindCollisions(node_topics) ==
196+
LET flat == { [hash |-> t.hash, subject_id |-> SubjectID(t.hash, t.evictions)] : t \in UNION Range(node_topics) }
197+
subject_ids == { t.subject_id : t \in flat }
198+
IN { s \in subject_ids : Cardinality({ t \in flat : t.subject_id = s }) > 1 }
199+
200+
\***********************************************************************************************************************
201+
\* Whether the network has converged to a stable state.
202+
\* Once a stable state is reached, the topics that are involved in it will no longer be altered as long as
203+
\* no older topics are introduced to the network (such as in the event of network departitioning).
204+
\* Appearance of young topics will not affect the existing entries by design.
205+
Converged(node_topics) ==
206+
/\ FindDivergent(node_topics) = {}
207+
/\ FindCollisions(node_topics) = {}
208+
209+
========================================================================================================================
210+
\end{minted}
211+
212+
\begin{minted}[fontsize=\scriptsize]{text}
213+
------------------------------ MODULE Utils ------------------------------
214+
\* General-purpose utilities not specific to this problem.
215+
\* There are a few utilities here copied over from https://github.com/tlaplus/CommunityModules (MIT license)
216+
217+
LOCAL INSTANCE Integers
218+
LOCAL INSTANCE Naturals
219+
LOCAL INSTANCE TLC
220+
LOCAL INSTANCE Sequences
221+
LOCAL INSTANCE FiniteSets
222+
223+
CONSTANT Nothing
224+
225+
\* From https://github.com/tlaplus/CommunityModules (MIT license)
226+
PrintVal(id, exp) == Print(<<id, exp>>, TRUE)
227+
IsInjective(f) == \A a,b \in DOMAIN f : f[a] = f[b] => a = b
228+
SeqToSet(s) == {s[i] : i \in DOMAIN s}
229+
SetToSeq(S) == CHOOSE f \in [1..Cardinality(S) -> S] : IsInjective(f)
230+
SetToSeqs(S) == LET D == 1..Cardinality(S) IN { f \in [D -> S] : \A i,j \in D : i # j => f[i] # f[j] }
231+
Range(f) == { f[x] : x \in DOMAIN f }
232+
233+
Min(S) == CHOOSE s \in S : \A t \in S : s <= t
234+
Max(S) == CHOOSE s \in S : \A t \in S : s >= t
235+
236+
\* The first element in a sequence where test(e) is TRUE; Nothing if the test is FALSE for all elements.
237+
FirstMatch(haystack, test(_)) ==
238+
LET i == CHOOSE i \in 1..(Len(haystack)+1): (i > Len(haystack)) \/ test(haystack[i])
239+
IN IF i > Len(haystack) THEN Nothing ELSE haystack[i]
240+
241+
\* The set element where test(e) is TRUE; Nothing if the test is false for all elements.
242+
Get(haystack, test(_)) == LET matches == { x \in haystack : test(x) }
243+
IN IF matches = {} THEN Nothing ELSE CHOOSE x \in matches : TRUE
244+
245+
\* A sequence that contains zero needles.
246+
SeqWithout(haystack, needle) == SelectSeq(haystack, LAMBDA x: x # needle)
247+
248+
\* <<1, 2, 3>> ==> <<2, 3, 1>>
249+
SeqRotLeft(seq) == IF Len(seq) > 0 THEN Tail(seq) \o <<Head(seq)>> ELSE seq
250+
251+
\* Converts {<<k1, v1>>, <<k2, v2>>, ...} into a function [k |-> v]. Keys must be unique.
252+
FunFromTupleSet(S) == [
253+
k \in {p[1] : p \in S} |-> CHOOSE v \in {r[2] : r \in S} : <<k, v>> \in S
254+
]
255+
========================================================================================================================
256+
\end{minted}
Lines changed: 88 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,88 @@
1+
\chapter{Eviction counter recovery}\label{sec:appendix_eviction_solver}
2+
3+
Given a topic hash, a subject-ID, and the subject-ID modulus, the eviction
4+
counter of a non-pinned topic can be reconstructed in constant time.
5+
This works because the subject-ID mapping
6+
(section~\ref{sec:session_subject_id_mapping})
7+
uses quadratic probing modulo a prime $M$ with $M \bmod 4 = 3$, which makes
8+
the probing sequence invertible via the closed-form square root
9+
$r = d^{(M+1)/4} \bmod M$ for quadratic residues $d$.
10+
11+
The recovery is useful for diagnostic tools -- for example, to cross-check a
12+
running network against the mapping formula, or to display a topic's
13+
eviction history in a monitoring dashboard -- but it is not required for
14+
protocol participation. Any implementation that participates in the
15+
consensus protocol already has the eviction counter on hand.
16+
17+
\begin{remark}
18+
The solver below has been exhaustively brute-force verified for the
19+
moduli $57203$, $131071$, and $8388607$.
20+
Implementations that use it for diagnostics should treat a
21+
\texttt{UINT32\_MAX} return value as \emph{the subject-ID was not
22+
produced from the given hash by the mapping function}, which is a valid
23+
outcome when the diagnostic tool is probing an unrelated subject.
24+
The solver assumes that the eviction counter does not exceed
25+
$\lfloor M / 2 \rfloor$; above that bound the probing sequence starts
26+
to repeat and the recovery is no longer unique.
27+
\end{remark}
28+
29+
\begin{minted}[fontsize=\scriptsize]{cpp}
30+
// Pavel Kirienko <pavel@opencyphal.org>, MIT license
31+
32+
// a**e mod m
33+
static uint32_t pow_mod_u32(uint32_t a, uint32_t e, const uint32_t m)
34+
{
35+
uint32_t r = 1 % m;
36+
a %= m;
37+
while (e) {
38+
if (e & 1U) {
39+
r = (uint32_t)(((uint64_t)r * (uint64_t)a) % m);
40+
}
41+
a = (uint32_t)(((uint64_t)a * (uint64_t)a) % m);
42+
e >>= 1U;
43+
}
44+
return r;
45+
}
46+
47+
// Legendre symbol: a^((p-1)/2) mod p is 1 for residues, p-1 for non-residues, 0 for a==0.
48+
static bool is_quadratic_residue_prime(const uint32_t a, const uint32_t p)
49+
{
50+
return (a == 0) || (pow_mod_u32(a, (p - 1U) / 2U, p) == 1U);
51+
}
52+
53+
// Derives evictions from a non-pinned subject-ID. For pinned subject-IDs returns 0.
54+
// Assumes subject_id_modulus is a prime with (subject_id_modulus % 4) == 3.
55+
// Returns UINT32_MAX if the subject-ID was obtained using distinct parameters.
56+
// If evictions > floor(modulus/2), the subject-ID sequence repeats, leading to non-unique solutions;
57+
// the solver assumes that evictions stay below that bound. Complexity is O(1).
58+
static uint32_t topic_evictions_from_subject_id(const uint64_t hash,
59+
const uint32_t subject_id,
60+
const uint32_t subject_id_modulus)
61+
{
62+
const uint32_t p = subject_id_modulus;
63+
assert((p > 3) && ((p & 3U) == 3U));
64+
if (subject_id <= 8191U) {
65+
return 0; // Pinned subject; eviction count is not meaningful here.
66+
}
67+
const uint32_t base = subject_id - 8192U;
68+
if (base >= p) {
69+
return UINT32_MAX;
70+
}
71+
const uint32_t delta = (uint32_t)(((uint64_t)base + (uint64_t)p - (hash % p)) % p);
72+
if (!is_quadratic_residue_prime(delta, p)) {
73+
return UINT32_MAX;
74+
}
75+
// sqrt(delta) mod p (since p % 4 == 3): r = delta^((p+1)/4) mod p
76+
const uint32_t r1 = (delta == 0U) ? 0U : pow_mod_u32(delta, (p + 1U) / 4U, p);
77+
const uint32_t r2 = (r1 == 0U) ? 0U : (p - r1);
78+
uint32_t best = UINT32_MAX;
79+
const uint32_t roots[2] = { r1, r2 };
80+
for (unsigned i = 0; i < 2; i++) {
81+
const uint64_t s = roots[i];
82+
if ((s <= (p / 2U)) && (s < best)) {
83+
best = (uint32_t)s;
84+
}
85+
}
86+
return best;
87+
}
88+
\end{minted}

0 commit comments

Comments
 (0)