Skip to content

Review and update pattern subsumption and exhaustion#1691

Open
BillWagner wants to merge 4 commits into
dotnet:draft-v8from
BillWagner:subsumption-exhaustion
Open

Review and update pattern subsumption and exhaustion#1691
BillWagner wants to merge 4 commits into
dotnet:draft-v8from
BillWagner:subsumption-exhaustion

Conversation

@BillWagner

@BillWagner BillWagner commented May 13, 2026

Copy link
Copy Markdown
Member

Fixes #1650

This should be reviewed commit by commit

The first commit takes a minimal approach to describing the new rules for subsumption and exhaustiveness. That keeps the rules at a high level, and avoids any discussion of the particular algorithm to determine either subsumption or exhaustiveness in property patterns and other recursive patterns.

The second commit takes a more maximal approach to the rules, but still tries to avoid the actual algorithm used by a well known compiler. I'm less than satisfied, as it makes more assumptions, but still does a fair amount of handwaving. (Fourth commit fixes a lint issue).

My recommendation is to revert the final commit and then merge.

After the last meeting, we agreed to update the PR with a new commit to define the terms of subsumption and exhaustiveness, and limit the requirement to produce the diagnostics when either non-exhaustive patterns or subsumed branches are detected.

@BillWagner BillWagner requested review from Nigel-Ecma and jskeet May 13, 2026 20:20

@Nigel-Ecma Nigel-Ecma left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second sentence of §11.3 starts “Informally” but I don't think the following text is informal at all, it simply states the definition of subsumption (in terms of patterns not Engish!)

That aside before I can aye/nay this I need some clarification of what §11.3 (and from that §11.4) are trying to say.

You could reduce §11.3 to:

In a switch statement, it is an error if a case’s pattern is subsumed by the preceding set of cases, this means that any input value would have been matched by one of the previous cases.

That’s a concise high-level definition.

The use of “unguarded cases” in the original is an indication that subsumption for patterns wass understood to be either unsolvable, or deemed potentially too expensive, in general to determine. Given that is the goal here to:

  • encourage error messages when an implementation can prove subsumption; and
  • require error messages is a set of known provable cases

or something else?

If the above is the goal then material in the informative note regarding “precise extent” is surely normative? (The bit about “applies recursively” is informative as it is a direct consequence of the preceding bullet.)

@BillWagner BillWagner force-pushed the subsumption-exhaustion branch from 42a615f to 035b5ca Compare June 3, 2026 12:18
@BillWagner

BillWagner commented Jun 3, 2026

Copy link
Copy Markdown
Member Author

I haven't made edits for the review comments yet. We should be sure to agree on what the scope of the spec should be for this area before going forward. I'd like us to get to consensus on that discussion in our next meeting.

I'm writing a summary of the exhaustiveness and subsumption algorithm used by a certain compiler to help us discuss what we want the standard to say about these areas of pattern matching. A more complete discussion is in the repo for that compiler. My purpose here is to provide enough of a framework for our discussion.

The algorithm determines the set of all possible input values for the pattern input expression. For example, if the input expression is of type bool, the set of possible input values is [true, false]. If the input expression is of type int, the set of possible input values is [ Int32.MinValue .. Int32.MaxValue], inclusive. For an input expression of tuple type (int, int), the set of possible values is the cross product of [ Int32.MinValue .. Int32.MaxValue], inclusive. Some types of input expressions are effectively unbounded, like string. While machine limits mean that only a finite number of strings are possible, realistically, the set is unbounded. When the input pattern is unbounded, the only realistic guarantee for exhaustiveness is to have a _ or var pattern for "everything else".

Note that some input expression types which might appear unbounded actually aren't because an ordering function applies to the type. For example, an input expression of type double is exhaustive on the following cases:

var output = someDouble switch
{
    < 0 => "negative",
    0 => "zero",
    > 0 => "positive",
    double.NaN => "Not a number",
};

As this compiler examines each switch arm, it determine if any of the set of input values matches that pattern. If none of the input values in the set matches, that switch are is considered to be subsumed. If any of the input values matches, those values are removed from the input set as "handled" before proceeding to the next arm. Consider this example:

bool b1 = SomeTruthiness();
bool b2 = SomeTruthiness();

var output = (b1, b2) switch
{
    (true, true) => "very truthy",
    (false, false) => "much false",
    _ => "mixed",
};

At the beginning of the switch, the set of values is:

(true, true)
(true, false)
(false, true)
(false, false)

After the first arm, the remaining set is:

(true, false)
(false, true)
(false, false)

After the second arm, it's:

(true, false)
(false, true)

The third arm is a catch-all, and handles the rest.

The same analysis works on the double example:

  • Before any switch arms, the input set is all possible values of the type double.
  • After the first arm, the remaining input set is [0, > 0, NaN]
  • After the second arm, the remaining input set is [> 0, NaN ]
  • After the third arm, the remaining input set is NaN.

The fourth arm makes the set exhaustive. After the sixth arm, the remaining input set is empty.

A modification to the first example shows how this process works for subsumption:

var output = someDouble switch
{
    double.NegativeInfinity => "negative infinity",
    < 0 => "negative",
    0 => "zero",
    > 0 => "positive",
    double.PositiveInfinity => "positive infinity",
    double.NaN => "Not a number",
};

In this version, the analysis is:

  • Before any switch arms, the input set is all possible values of the type double.
  • After the first arm, the remaining input is all possible values of the type double except negative infinity.
  • After the second arm, the remaining input set is [0, > 0, NaN]
  • After the third arm, the remaining input set is [> 0, NaN ]
  • After the fourth arm, the remaining input set is NaN.
  • The fifth arm fails subsumption because double.PositiveInfinity isn't in the remaining set.
  • The sixth arm makes the set exhaustive.

I've only described the process in terms of a single pattern. More complex patterns (those involving recursive patterns, or logical patterns) becomes more complicated to analyze. That's important for the questions below in that this overview isn't nearly complete enough to normatively describe the algorithm.

The discussion questions we should cover:

  • Should the standard make very vague statements about exhaustiveness and subsumption? This is the minimum amount of normative language.
  • Can we normatively describe the situations where the set of input values is bounded vs. unbounded? I'm not certain because of recursive patterns, tuples as input expression, and all the logical patterns. Maybe that's OK if we lean on "known bounded" and "other".
  • Do we describe the process above as an informative "one way to analyze this" or not? It is specific to a current implementation. It is (at least to me) very informative on how the process works

@jskeet

jskeet commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

I'd suggest we try to establish an approach which may not be as exhaustive as we want it to be eventually, but allows us to get there over time. Ideally, let's start with just simple cases - I'd suggest at least bool and maybe int as a reasonably simple starting point.

Possibly we should even do this in two stages even for C# 8:

  • An initial PR which handles bool but in a way which invites further options
  • A second PR which adds int, as an example of how another option can be added

I think it might be worth having a separate numbered subsection for each "kind" of subsumption (so that it's easy to refer to that specific kind, and it's easy when reading to know that you've read everything relevant to that kind). That'll look a bit odd when we've only got bool, but will become more reasonable when we add int.

We can then have an umbrella issue with "things we want to add" (other integer types, double, tuples, recursive matching), and tick them off over time.

How does that sound in terms of a general principle?

(I don't yet have a firm opinion on normative vs informative. I think it wouldn't be unreasonable to be normative with an increasing set of kinds of subsumption, explicitly allowing an implementation to implement additional unspecified kinds. We don't really need to worry about other implementations becoming non-compliant IMO - but we do want to make sure that everything that's normative is "reasonable".)

@jskeet jskeet added the meeting: discuss This issue should be discussed at the next TC49-TG2 meeting label Jun 4, 2026
@jskeet

jskeet commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

(I've read through the changes in the commits, but I don't think I'm in the right headspace to approve or request changes; discussion at the meeting may well help there.)

This first commit makes minimal adjustments to the subsumption and exhaustiveness sections. This keeps the rules fairly high level, and provides more leeway for an implementation to devise any algorithm for managing recursive patterns.
Provide more extensive rule definitions for recursive patterns on subsumption and exhaustiveness.

This version leans more closely to the implementation, but also doesn't dive deep enough to provide more clarity on pattern forms where exhaustiveness or subsumption might not be detected.

I'm tempted to use the first commit, but I want people to see both.
Use a minimal definition for subsumption and exhaustiveness.
@BillWagner BillWagner force-pushed the subsumption-exhaustion branch from 035b5ca to fef79e0 Compare June 11, 2026 21:25
@BillWagner

Copy link
Copy Markdown
Member Author

@jskeet @Nigel-Ecma I've made the updates to remove discussion of which patterns could correctly catch exhaustiveness.

@Nigel-Ecma Nigel-Ecma left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good. Some suggestions & questions.

Comment thread standard/patterns.md
> ```
>
> Arms 1 and 2 have non-constant guards and so are not *unguarded*; only arm 3 is *unguarded* with pattern `int i`, which does not subsume the final `_` arm because it does not match a non-`int` value such as `null`. *end example*
> *Note*: This means that any input value would have been matched by one of the previous cases or arms. *end note*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the detail removed this line should no longer be a note. Line 450 introduces the use of set, so maybe something along the lines of:

Suggested change
> *Note*: This means that any input value would have been matched by one of the previous cases or arms. *end note*
A pattern `P` is *subsumed* by set of unguarded patterns `Q` if any input value matched by `P` is matched by one of the members of `Q`.

This maybe should become before 450 – i.e. short definition of subsumed followed by the error cases?

Can a pattern only be subsumed if it is itself unguarded? I.e. should it be “An unguarded pattern P is…”?

Comment thread standard/patterns.md
> ```
>
> *end example*
A set of patterns is exhaustive if, for every possible input value, some pattern in the set is applicable. When an implementation detects that a set of patterns is not exhaustive, it shall issue a warning.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now an incredibly short section, and we're not planning on adding more detail. 11.3 is really short now too.

Is there a reasonable way of combining them, or do we want to avoid confusing the two terms?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

meeting: discuss This issue should be discussed at the next TC49-TG2 meeting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Review pattern subsumption and exhaustiveness

3 participants