Summary
Every attribute read in AttributesImpl enters a per-element monitor, and EdgeImpl.getWeight() adds a second monitor on top of it. Because getWeight() and getAttribute() are the hottest read paths in the library (weighted degree, weighted PageRank, modularity, layout, shortest paths), this synchronization is paid millions of times per analytics pass. In an isolated microbenchmark, removing the monitor cuts attribute-read time by ~3.5×. The lock exists only to guard a rare array-resize on column addition, which can be made safe without locking readers.
This is a single, self-contained change with what looks like a large upside and low risk, so I wanted to raise it before opening a PR.
Environment
org.gephi:graphstore 0.8.7-SNAPSHOT, commit 5028d21
- Build target JDK 17; measured on OpenJDK 21
- Relevant because biased locking is disabled (JDK 15) / removed (JDK 18), so uncontended
synchronized is no longer near-free
Where
AttributesImpl.getAttribute(int) (src/main/java/org/gephi/graph/impl/AttributesImpl.java:70):
public Object getAttribute(int index) {
Object res = null;
synchronized (this) { // entered on every read
if (index < attributes.length) res = attributes[index];
}
return res;
}
EdgeImpl.getWeight() (EdgeImpl.java:79) then wraps that already-synchronized call in a second monitor on a different object (the edge), so a single weight read enters two monitors:
public double getWeight() {
synchronized (this) { // monitor #1: the EdgeImpl
Object weightObject = attributes.getAttribute(EDGE_WEIGHT_INDEX); // monitor #2: the AttributesImpl
if (weightObject instanceof Double) {
return (Double) weightObject;
} else {
return getWeight(graphStore.getView());
}
}
}
The only writer that mutates shared state under this monitor is the resize in setAttribute(int, Object) (AttributesImpl.java:124), which swaps the attributes array reference when a newly added column's index exceeds the current array length. Column additions are rare and already happen under the graph write lock; readers are paying a monitor on every access to protect against an event that essentially never occurs during an algorithm run.
Why it matters
getWeight() is read in the inner loop of most weighted algorithms, and getAttribute() underlies every column read (sizes, partitions, rankings, filters). The cost is incurred even single-threaded, since the monitor is unconditional.
Evidence
Isolated microbenchmark: 5M elements, sum the weight across all of them per pass, warmed up. Single-file source mode (java Bench.java) on OpenJDK 21, ParallelGC. Not JMH, so treat as directional — but the effect size is large and the mechanism is well understood.
| Variant |
ms / 5M reads |
synchronized + boxed Double (current shape) |
~116–136 |
boxed, no synchronized |
~32 |
primitive field, but keep synchronized |
~101–104 |
primitive field, no synchronized |
~27 |
The monitor alone accounts for roughly 75 ms of the 116 ms; the rest is boxing (tracked separately). Removing the monitor is a ~3.5× improvement on attribute reads in isolation, before counting the second monitor in getWeight() that the harness does not even model.
Benchmark harness
public class Bench {
static final int N = 5_000_000, REPS = 6;
static final class E {
final Object[] attrs; final double prim;
E(double w){ attrs = new Object[]{ Double.valueOf(w) }; prim = w; }
double syncBoxed(){ Object r; synchronized(this){ r = attrs[0]; } return (Double) r; }
double boxedNoSync(){ return (Double) attrs[0]; }
double primSync(){ double v; synchronized(this){ v = prim; } return v; }
double prim(){ return prim; }
}
public static void main(String[] a){
E[] e = new E[N];
for (int i = 0; i < N; i++) e[i] = new E(i * 0.5);
double sink = 0;
for (int r = 0; r < REPS; r++){
long t0 = System.nanoTime(); double s0 = 0; for (int i=0;i<N;i++) s0 += e[i].syncBoxed();
long t1 = System.nanoTime(); double s1 = 0; for (int i=0;i<N;i++) s1 += e[i].boxedNoSync();
long t2 = System.nanoTime(); double s2 = 0; for (int i=0;i<N;i++) s2 += e[i].primSync();
long t3 = System.nanoTime(); double s3 = 0; for (int i=0;i<N;i++) s3 += e[i].prim();
long t4 = System.nanoTime();
sink += s0+s1+s2+s3;
if (r >= REPS-2)
System.out.printf("syncBoxed %.1f | boxedNoSync %.1f | primSync %.1f | prim %.1f (ms)%n",
(t1-t0)/1e6,(t2-t1)/1e6,(t3-t2)/1e6,(t4-t3)/1e6);
}
if (sink == Double.NaN) System.out.println(sink);
}
}
Proposed fix
Make the read path lock-free by publishing the array through a volatile reference and reading it once into a local; keep writes serialized (under the existing monitor or the graph write lock).
protected volatile Object[] attributes;
public Object getAttribute(int index) {
Object[] a = attributes; // single volatile read
return index < a.length ? a[index] : null;
}
And drop the redundant outer monitor in EdgeImpl.getWeight() so the static-weight path is fully lock-free.
Why this is safe
A resize copies the existing values into a strictly larger array and then swaps the reference. With a volatile field, a reader observes either the old array or the new one — never a torn reference — and the value at index is valid in both (the new array is a superset copy). This is the same read/write consistency the graph's ReentrantReadWriteLock already governs at the structural level; the per-element monitor was only ever protecting the reference swap, which volatile publication handles.
Writers (setAttribute and the time-attribute variants) keep their synchronization to serialize concurrent array swaps among themselves.
Scope / risk
- Touches
AttributesImpl reads and the resize swap, plus the getWeight() wrapper. Behavior is unchanged for single-threaded callers; concurrent callers gain lock-free reads with identical consistency guarantees.
- Existing tests around concurrent attribute access should cover this; happy to add a stress test that adds a column while readers iterate.
Related findings from the same review (separate concerns; I can file individually if useful): edge weight stored as a boxed Double in the generic Object[] rather than a primitive field, and the primary adjacency index using Long2ObjectOpenCustomHashMap<int[]> (one int[] allocation per edge) where a primitive-valued map would do when parallel edges are disabled. Both are mainly memory/GC wins on large graphs.
Summary
Every attribute read in
AttributesImplenters a per-element monitor, andEdgeImpl.getWeight()adds a second monitor on top of it. BecausegetWeight()andgetAttribute()are the hottest read paths in the library (weighted degree, weighted PageRank, modularity, layout, shortest paths), this synchronization is paid millions of times per analytics pass. In an isolated microbenchmark, removing the monitor cuts attribute-read time by ~3.5×. The lock exists only to guard a rare array-resize on column addition, which can be made safe without locking readers.This is a single, self-contained change with what looks like a large upside and low risk, so I wanted to raise it before opening a PR.
Environment
org.gephi:graphstore0.8.7-SNAPSHOT, commit5028d21synchronizedis no longer near-freeWhere
AttributesImpl.getAttribute(int)(src/main/java/org/gephi/graph/impl/AttributesImpl.java:70):EdgeImpl.getWeight()(EdgeImpl.java:79) then wraps that already-synchronized call in a second monitor on a different object (the edge), so a single weight read enters two monitors:The only writer that mutates shared state under this monitor is the resize in
setAttribute(int, Object)(AttributesImpl.java:124), which swaps theattributesarray reference when a newly added column's index exceeds the current array length. Column additions are rare and already happen under the graph write lock; readers are paying a monitor on every access to protect against an event that essentially never occurs during an algorithm run.Why it matters
getWeight()is read in the inner loop of most weighted algorithms, andgetAttribute()underlies every column read (sizes, partitions, rankings, filters). The cost is incurred even single-threaded, since the monitor is unconditional.Evidence
Isolated microbenchmark: 5M elements, sum the weight across all of them per pass, warmed up. Single-file source mode (
java Bench.java) on OpenJDK 21, ParallelGC. Not JMH, so treat as directional — but the effect size is large and the mechanism is well understood.synchronized+ boxedDouble(current shape)synchronizedsynchronizedsynchronizedThe monitor alone accounts for roughly 75 ms of the 116 ms; the rest is boxing (tracked separately). Removing the monitor is a ~3.5× improvement on attribute reads in isolation, before counting the second monitor in
getWeight()that the harness does not even model.Benchmark harness
Proposed fix
Make the read path lock-free by publishing the array through a
volatilereference and reading it once into a local; keep writes serialized (under the existing monitor or the graph write lock).And drop the redundant outer monitor in
EdgeImpl.getWeight()so the static-weight path is fully lock-free.Why this is safe
A resize copies the existing values into a strictly larger array and then swaps the reference. With a
volatilefield, a reader observes either the old array or the new one — never a torn reference — and the value atindexis valid in both (the new array is a superset copy). This is the same read/write consistency the graph'sReentrantReadWriteLockalready governs at the structural level; the per-element monitor was only ever protecting the reference swap, whichvolatilepublication handles.Writers (
setAttributeand the time-attribute variants) keep their synchronization to serialize concurrent array swaps among themselves.Scope / risk
AttributesImplreads and the resize swap, plus thegetWeight()wrapper. Behavior is unchanged for single-threaded callers; concurrent callers gain lock-free reads with identical consistency guarantees.Related findings from the same review (separate concerns; I can file individually if useful): edge weight stored as a boxed
Doublein the genericObject[]rather than a primitive field, and the primary adjacency index usingLong2ObjectOpenCustomHashMap<int[]>(oneint[]allocation per edge) where a primitive-valued map would do when parallel edges are disabled. Both are mainly memory/GC wins on large graphs.