Commit 9c071c5

[Spark Runner] Spark 4 EncoderFactory: stable constructor lookup + document trait setter
Replace getConstructors()[0], whose ordering is JVM-defined and not stable, with a helper that picks the widest public constructor. The downstream switch already dispatches on parameter count to pick the right argument shape per Spark version, so this change only makes the choice deterministic.

Also document the org$apache$spark...$_setter_$isStruct_$eq method: it is the synthetic setter the Scala compiler emits for trait val fields, and it is required when implementing AgnosticEncoders.StructEncoder from Java.

Both issues were flagged by Gemini Code Assist on PR apache#38255.
1 parent 8bebced commit 9c071c5
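
The constructor selection described in the commit message can be sketched in isolation. The example below is a minimal illustration, not the code from this commit: `Shape` is a hypothetical stand-in for a Spark expression class whose constructor arity differs between versions, and `widestConstructor` and `newShape` are names invented for this sketch. It picks the public constructor with the most parameters regardless of the order `getConstructors()` returns them in, then dispatches on the parameter count to choose the argument list.

```java
import java.lang.reflect.Constructor;

public class WidestConstructorSketch {

  // Hypothetical stand-in for a class whose constructor shape varies by library version.
  public static class Shape {
    public final int arity;

    public Shape(String a) {
      this.arity = 1;
    }

    public Shape(String a, String b, String c) {
      this.arity = 3;
    }

    public Shape(String a, String b) {
      this.arity = 2;
    }
  }

  // Returns the public constructor with the most parameters, independent of array order.
  @SuppressWarnings("unchecked")
  public static <T> Constructor<T> widestConstructor(Class<T> cls) {
    Constructor<?>[] ctors = cls.getConstructors();
    if (ctors.length == 0) {
      throw new IllegalStateException("No public constructor on " + cls.getName());
    }
    Constructor<?> widest = ctors[0];
    for (Constructor<?> candidate : ctors) {
      if (candidate.getParameterCount() > widest.getParameterCount()) {
        widest = candidate;
      }
    }
    return (Constructor<T>) widest;
  }

  // Dispatches on parameter count to build the argument list the chosen constructor expects.
  public static Shape newShape(String value) throws ReflectiveOperationException {
    Constructor<Shape> ctor = widestConstructor(Shape.class);
    switch (ctor.getParameterCount()) {
      case 3:
        return ctor.newInstance(value, value, value);
      case 2:
        return ctor.newInstance(value, value);
      default:
        throw new IllegalStateException(
            "Unsupported constructor arity: " + ctor.getParameterCount());
    }
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    System.out.println(widestConstructor(Shape.class).getParameterCount()); // prints 3
    System.out.println(newShape("x").arity); // prints 3
  }
}
```

Unlike the helper in the diff, this sketch fails with an explicit message when a class has no public constructors. If two constructors tie for the widest arity, the result still depends on array order, so the approach assumes the widest constructor is unique.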

1 file changed

Lines changed: 29 additions & 5 deletions


runners/spark/4/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/helpers/EncoderFactory.java

@@ -44,15 +44,29 @@
 import scala.reflect.ClassTag;
 
 public class EncoderFactory {
-  // default constructor to reflectively create static invoke expressions
+  // Resolve the Scala case-class primary constructor (the one with the most parameters).
+  // Constructor ordering returned by Class.getConstructors() is JVM-defined and not stable
+  // across Spark versions, so we pick the widest constructor explicitly and then dispatch on
+  // parameter count below to pick the right argument shape per Spark version.
   private static final Constructor<StaticInvoke> STATIC_INVOKE_CONSTRUCTOR =
-      (Constructor<StaticInvoke>) StaticInvoke.class.getConstructors()[0];
+      primaryConstructor(StaticInvoke.class);
 
-  private static final Constructor<Invoke> INVOKE_CONSTRUCTOR =
-      (Constructor<Invoke>) Invoke.class.getConstructors()[0];
+  private static final Constructor<Invoke> INVOKE_CONSTRUCTOR = primaryConstructor(Invoke.class);
 
   private static final Constructor<NewInstance> NEW_INSTANCE_CONSTRUCTOR =
-      (Constructor<NewInstance>) NewInstance.class.getConstructors()[0];
+      primaryConstructor(NewInstance.class);
+
+  @SuppressWarnings("unchecked")
+  private static <T> Constructor<T> primaryConstructor(Class<T> cls) {
+    Constructor<?>[] ctors = cls.getConstructors();
+    Constructor<?> widest = ctors[0];
+    for (int i = 1; i < ctors.length; i++) {
+      if (ctors[i].getParameterCount() > widest.getParameterCount()) {
+        widest = ctors[i];
+      }
+    }
+    return (Constructor<T>) widest;
+  }
 
   @SuppressWarnings({"nullness", "unchecked"})
   static <T> ExpressionEncoder<T> create(
@@ -142,6 +156,16 @@ public boolean isStruct() {
       return true;
     }
 
+    /**
+     * Setter required by the Scala compiler when implementing the {@link
+     * AgnosticEncoders.StructEncoder} trait from Java. Scala traits with concrete {@code val}
+     * fields generate a synthetic mangled setter ({@code <trait>$_setter_<field>_$eq}) that the
+     * trait's initializer invokes on subclasses. Java cannot declare {@code val} fields, so we
+     * implement {@link #isStruct()} directly above and accept-but-ignore the trait setter here. The
+     * mangled name is brittle and tied to Spark's Scala source layout — if Spark removes the {@code
+     * isStruct} field from {@code StructEncoder}, this method becomes dead code; if Spark renames
+     * it, compilation will fail and the new mangled name must be substituted.
+     */
     @Override
     public void
         org$apache$spark$sql$catalyst$encoders$AgnosticEncoders$StructEncoder$_setter_$isStruct_$eq(
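
The trait setter documented in the Javadoc above can be illustrated with a pure-Java stand-in. This is a rough sketch under stated assumptions, not Spark's or the Scala compiler's actual output: `StructLike`, its mangled-style method name `demo$StructLike$_setter_$isStruct_$eq`, and the `init` helper are all hypothetical and written by hand to loosely mimic the interface shape a trait with a concrete `val` compiles to. It shows why a Java implementer must supply both the accessor and the setter, and why ignoring the setter's argument is safe when the accessor returns a constant.

```java
public class TraitSetterSketch {

  // Hypothetical, hand-written approximation of the interface emitted for a Scala trait
  // that declares `val isStruct = true`.
  public interface StructLike {
    // Accessor for the trait field.
    boolean isStruct();

    // Mangled-style setter; '$' is a legal character in Java identifiers.
    void demo$StructLike$_setter_$isStruct_$eq(boolean value);

    // Mimics a trait initializer, which pushes the field's initial value through the setter.
    static void init(StructLike self) {
      self.demo$StructLike$_setter_$isStruct_$eq(true);
    }
  }

  // A Java implementer has no backing field for the val, so it answers the accessor
  // directly and accepts but ignores the value passed to the setter.
  public static class JavaStruct implements StructLike {
    public JavaStruct() {
      StructLike.init(this);
    }

    @Override
    public boolean isStruct() {
      return true;
    }

    @Override
    public void demo$StructLike$_setter_$isStruct_$eq(boolean value) {
      // Intentionally empty: isStruct() is constant for this implementation.
    }
  }

  public static void main(String[] args) {
    System.out.println(new JavaStruct().isStruct()); // prints true
  }
}
```

Omitting the setter from `JavaStruct` would leave an abstract interface method unimplemented and the class would not compile, which mirrors the Javadoc's point that a rename on the Spark side surfaces as a compilation failure.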
