29 changes: 27 additions & 2 deletions std/hash/hash.go
@@ -6,6 +6,7 @@ package hash

import (
"github.com/consensys/gnark/frontend"
"github.com/consensys/gnark/std/lookup/logderivlookup"
"github.com/consensys/gnark/std/math/uints"
)

@@ -112,8 +113,11 @@ type merkleDamgardHasher struct {
api frontend.API
}

// NewMerkleDamgardHasher transforms a 2-1 one-way function into a hash
// initialState is a value whose preimage is not known
// NewMerkleDamgardHasher range-extends a 2-1 one-way hash compression function into a hash by way of the Merkle-Damgård construction.
// Parameters:
// - api: constraint builder
// - f: 2-1 hash compression (one-way) function
// - initialState: the initialization vector (IV) in the Merkle-Damgård chain. It must be a value whose preimage is not known.
func NewMerkleDamgardHasher(api frontend.API, f Compressor, initialState frontend.Variable) FieldHasher {
return &merkleDamgardHasher{
state: initialState,
@@ -136,3 +140,24 @@ func (h *merkleDamgardHasher) Write(data ...frontend.Variable) {
func (h *merkleDamgardHasher) Sum() frontend.Variable {
return h.state
}

// SumMerkleDamgardDynamicLength computes the Merkle-Damgård hash of the input data, truncated at the given length.
// Parameters:
// - api: constraint builder
// - f: 2-1 hash compression (one-way) function
// - initialState: the initialization vector (IV) in the Merkle-Damgård chain. It must be a value whose preimage is not known.
// - length: length of the prefix of data to be hashed. The verifier will not accept a value outside the range {0, 1, ..., len(data)}.
// The gnark prover will refuse to attempt to generate such an unsuccessful proof.
// - data: the values a prefix of which is to be hashed.
func SumMerkleDamgardDynamicLength(api frontend.API, f Compressor, initialState frontend.Variable, length frontend.Variable, data []frontend.Variable) frontend.Variable {
Collaborator

Doesn't this function signature allow for length extension attacks? I.e. let's say:

  • IV = 0
  • f_0 = compress(IV, msg_0)
  • f_1 = compress(f_0, msg_1)

Then, when we do either:
SumMerkleDamgardDynamicLength(api, Compressor, IV, 2, [msg_0, msg_1])
or
SumMerkleDamgardDynamicLength(api, Compressor, f_0, 1, [msg_1, msg_xxx])
we would get the same result? And considering that all of these are user inputs, it could imo lead to collisions.

We also provide a variable-length mode for binary hashes (see the BinaryFixedLengthHasher interface). But there, depending on the underlying hash function (sha256, ripemd, sha3), the length is appended to the input, avoiding collision problems.
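
A minimal plain-Go sketch of the concern above, run outside any circuit. The `compress` function is a toy stand-in built on SHA-256 (it is not MiMC or Poseidon2), and `sumDynamicLength` is a hypothetical helper that only mirrors the prefix-hashing semantics of SumMerkleDamgardDynamicLength with `uint64` values instead of `frontend.Variable`.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// compress is a toy 2-to-1 compression function standing in for the
// Compressor in the PR (assumption: it is NOT MiMC or Poseidon2).
func compress(state, block uint64) uint64 {
	var buf [16]byte
	binary.BigEndian.PutUint64(buf[:8], state)
	binary.BigEndian.PutUint64(buf[8:], block)
	sum := sha256.Sum256(buf[:])
	return binary.BigEndian.Uint64(sum[:8])
}

// sumDynamicLength hashes data[:length] starting from iv and returns the
// chain state, without mixing the length into the digest.
func sumDynamicLength(iv uint64, length int, data []uint64) uint64 {
	state := iv
	for _, v := range data[:length] {
		state = compress(state, v)
	}
	return state
}

func main() {
	const iv, msg0, msg1, msgX = 0, 11, 22, 33
	f0 := compress(iv, msg0)

	a := sumDynamicLength(iv, 2, []uint64{msg0, msg1})
	b := sumDynamicLength(f0, 1, []uint64{msg1, msgX})
	fmt.Println(a == b) // true: both equal compress(f0, msg1)
}
```

The two calls agree because nothing in the digest records where the chain started or how many blocks were absorbed.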

Contributor Author

I wouldn't say that's a collision, since Merkle-Damgard chains initialized with different IVs are considered different hash functions, and you can always design another hash function H' that could at some point produce the same output as a given hash function H.

As for length extension attacks: yes, Merkle-Damgard is generally prone to them, but afaik that attack is not in scope for our stack since we're not privacy focused. (The "attacker" in a length extension attack obtains a valid hash for a longer string without being supposed to know the preimage of the original hash.)

Your proposed solution for preventing length extension attacks is valid imo (or we could just use Sponge instead of MD for range extension), but I wanted something that would match the current MiMC and Poseidon2 implementations that we have, i.e. MiMC(a, b) = DynamicLengthMiMC(2, a, b, c).
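
The compatibility property stated above can be illustrated with a plain-Go sketch. It assumes a toy SHA-256-based `compress` (not MiMC or Poseidon2) and a hypothetical `prefixStates` helper that builds, as an ordinary slice, the same table of intermediate states that the PR inserts into a logderivlookup table.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// compress is a toy 2-to-1 compression function (assumption: a stand-in
// for the real Compressor, chosen only so the example runs anywhere).
func compress(state, block uint64) uint64 {
	var buf [16]byte
	binary.BigEndian.PutUint64(buf[:8], state)
	binary.BigEndian.PutUint64(buf[8:], block)
	sum := sha256.Sum256(buf[:])
	return binary.BigEndian.Uint64(sum[:8])
}

// prefixStates returns [iv, H(data[:1]), ..., H(data[:n])]. Indexing this
// table by length selects the digest of the corresponding prefix.
func prefixStates(iv uint64, data []uint64) []uint64 {
	states := make([]uint64, 0, len(data)+1)
	state := iv
	states = append(states, state)
	for _, v := range data {
		state = compress(state, v)
		states = append(states, state)
	}
	return states
}

func main() {
	const iv, a, b, c = 0, 1, 2, 3
	exact := prefixStates(iv, []uint64{a, b})     // hash of exactly [a, b]
	padded := prefixStates(iv, []uint64{a, b, c}) // same data plus an unused element

	fmt.Println(exact[2] == padded[2]) // true: the trailing element is ignored
	fmt.Println(padded[0] == iv)       // true: length 0 yields the IV itself
}
```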

Collaborator

Yes, MD hashes with different IVs are essentially different hash functions (we can consider it a keyed hash function). But usually this means that the IV (or the first message to MD) is fixed and constant. In this implementation it is provided as an input, so for me it completely changes the model.

Imo we didn't consider length extension attacks previously because all the in-circuit hash functions return the hash of all written variables, so it is not possible to mount a length-extension attack. But now that the length is also provided as a variable, length-extension attacks return to scope, at least for me.

It may be that none of these attacks are applicable in the specific use case you need, but it is also very easy to use it in a way that allows length extension attacks, and anything exposed in std/hash/ should have a safe-by-default design imo.

Collaborator

I re-reviewed your implementation. I think the attack becomes less obvious now. If we don't use State/SetState and the IV provided to NewMerkleDamgardHasher is well defined, then imo it's quite good. There is still the issue that we don't write the input length into the stream, but I think it is good enough once that is documented nicely.
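
For illustration only, here is one way "writing the input length into the stream" could look in plain Go. The finalization step (absorbing the length as a last block) is an assumption of this sketch and is not what the PR implements; `compress` is again a toy SHA-256-based stand-in and `sumWithLength` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// compress is a toy 2-to-1 compression function (assumption: not the
// MiMC or Poseidon2 compressor used in gnark).
func compress(state, block uint64) uint64 {
	var buf [16]byte
	binary.BigEndian.PutUint64(buf[:8], state)
	binary.BigEndian.PutUint64(buf[8:], block)
	sum := sha256.Sum256(buf[:])
	return binary.BigEndian.Uint64(sum[:8])
}

// sumWithLength hashes data[:length] starting from iv and then absorbs the
// length itself as a final block (hypothetical finalization step).
func sumWithLength(iv uint64, length int, data []uint64) uint64 {
	state := iv
	for _, v := range data[:length] {
		state = compress(state, v)
	}
	return compress(state, uint64(length))
}

func main() {
	const iv, msg0, msg1, msgX = 0, 11, 22, 33
	f0 := compress(iv, msg0)

	a := sumWithLength(iv, 2, []uint64{msg0, msg1})
	b := sumWithLength(f0, 1, []uint64{msg1, msgX})
	// Both chains reach the same state but absorb different lengths (2 vs 1),
	// so the digests are expected to differ.
	fmt.Println(a != b)
}
```

The trade-off is that a digest finalized this way no longer equals the plain chain state, so it would not match the existing fixed-length MiMC and Poseidon2 outputs mentioned earlier in the thread.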

resT := logderivlookup.New(api)
Collaborator

And imo this method diverges from the rest of the APIs we use for hashers: we have otherwise defined FieldHasher for algebraic hashes, and BinaryHasher/BinaryFixedLengthHasher for binary hashes (the ...FixedLengthHasher providing the same functionality, i.e. allowing the length of the input to be hashed to be set dynamically).

I recommend following a similar pattern for consistency: create a FieldFixedLengthHasher interface

type FieldFixedLengthHasher interface {
    FieldHasher

    FixedLengthSum(length frontend.Variable) frontend.Variable
}

and then make the MiMC and Poseidon2 packages return this instead of FieldHasher.
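
A rough plain-Go analog of the suggested shape, assuming `uint64` values in place of `frontend.Variable` and a toy SHA-256-based `compress`. The lowercase interface and type names are hypothetical; the real gnark interfaces may carry additional methods.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// compress is a toy 2-to-1 compression function (assumption).
func compress(state, block uint64) uint64 {
	var buf [16]byte
	binary.BigEndian.PutUint64(buf[:8], state)
	binary.BigEndian.PutUint64(buf[8:], block)
	sum := sha256.Sum256(buf[:])
	return binary.BigEndian.Uint64(sum[:8])
}

// fieldHasher and fieldFixedLengthHasher are simplified, hypothetical
// analogs of the interfaces discussed in the review.
type fieldHasher interface {
	Write(data ...uint64)
	Sum() uint64
}

type fieldFixedLengthHasher interface {
	fieldHasher
	FixedLengthSum(length int) uint64
}

// mdHasher keeps every intermediate state so that any prefix digest can be
// selected after all data has been written.
type mdHasher struct {
	states []uint64 // states[i] is the digest of the first i written values
}

var _ fieldFixedLengthHasher = (*mdHasher)(nil)

func newMDHasher(iv uint64) *mdHasher { return &mdHasher{states: []uint64{iv}} }

func (h *mdHasher) Write(data ...uint64) {
	for _, v := range data {
		h.states = append(h.states, compress(h.states[len(h.states)-1], v))
	}
}

func (h *mdHasher) Sum() uint64 { return h.states[len(h.states)-1] }

// FixedLengthSum returns the digest of the first length written values.
// It panics if length is outside 0..number of written values.
func (h *mdHasher) FixedLengthSum(length int) uint64 { return h.states[length] }

func main() {
	h := newMDHasher(0)
	h.Write(1, 2, 3)

	short := newMDHasher(0)
	short.Write(1, 2)

	fmt.Println(h.FixedLengthSum(2) == short.Sum()) // true
}
```

With this shape the IV is fixed when the hasher is constructed, and callers keep using Write as usual; only the final call changes.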

Contributor Author

@Tabaie Tabaie Jan 13, 2026

Ah, you're suggesting we take the stateful hasher, Write to it normally, and at the end call some other FixedLengthSum function instead of the regular Sum (wouldn't DynamicLengthSum be a better name?).
That does seem much cleaner.

Collaborator

Exactly, that's the idea. We already have the exact same interface for binary hashes (sha2, sha3) (see https://pkg.go.dev/github.com/consensys/gnark@v0.14.0/std/hash#BinaryFixedLengthHasher) and imo it works quite well.

Perhaps DynamicLengthSum would indeed be better, but at least for me consistency is more important. If one interface names it FixedLengthSum and the other DynamicLengthSum, but they are essentially the same thing, then it just confuses the users.

state := initialState

resT.Insert(state)
for _, v := range data {
state = f.Compress(state, v)
resT.Insert(state)
}

return resT.Lookup(length)[0]
}
43 changes: 35 additions & 8 deletions std/hash/poseidon2/poseidon2_test.go
@@ -1,41 +1,68 @@
package poseidon2

import (
"fmt"
"testing"

"github.com/consensys/gnark-crypto/ecc"
"github.com/consensys/gnark-crypto/ecc/bls12-377/fr/poseidon2"
"github.com/consensys/gnark-crypto/ecc/bls12-377/fr"
gcPoseidon2 "github.com/consensys/gnark-crypto/ecc/bls12-377/fr/poseidon2"
"github.com/consensys/gnark/frontend"
"github.com/consensys/gnark/std/hash"
"github.com/consensys/gnark/std/permutation/poseidon2"
"github.com/consensys/gnark/test"
)

type Poseidon2Circuit struct {
Input []frontend.Variable
Expected frontend.Variable `gnark:",public"`
Expected []frontend.Variable `gnark:",public"` // Expected[i] = H(Input[:i+1])
}

func (c *Poseidon2Circuit) Define(api frontend.API) error {
if len(c.Input) != len(c.Expected) {
return fmt.Errorf("length mismatch")
}
hsh, err := NewMerkleDamgardHasher(api)
if err != nil {
return err
}
hsh.Write(c.Input...)
api.AssertIsEqual(hsh.Sum(), c.Expected)

compressor, err := poseidon2.NewPoseidon2(api)
if err != nil {
return err
}

for i := range c.Input {
hsh.Write(c.Input[i])
api.AssertIsEqual(c.Expected[i], hsh.Sum())
api.AssertIsEqual(c.Expected[i], hash.SumMerkleDamgardDynamicLength(api, compressor, 0, i+1, c.Input))
}
Collaborator

I'd also add a test where you read the hash digest not from the end, but from earlier, à la varlen.SumWithLength(2) at the end.


return nil
}

func TestPoseidon2Hash(t *testing.T) {
assert := test.NewAssert(t)

var buf [fr.Bytes]byte
const nbInputs = 5
// prepare expected output
h := poseidon2.NewMerkleDamgardHasher()
h := gcPoseidon2.NewMerkleDamgardHasher()
expected := make([]frontend.Variable, nbInputs)
circInput := make([]frontend.Variable, nbInputs)
for i := range nbInputs {
_, err := h.Write([]byte{byte(i)})
buf[fr.Bytes-1] = byte(i)
_, err := h.Write(buf[:])
assert.NoError(err)
circInput[i] = i
expected[i] = h.Sum(nil)
}
res := h.Sum(nil)
assert.CheckCircuit(&Poseidon2Circuit{Input: make([]frontend.Variable, nbInputs)}, test.WithValidAssignment(&Poseidon2Circuit{Input: circInput, Expected: res}), test.WithCurves(ecc.BLS12_377)) // we have parametrized currently only for BLS12-377
assert.CheckCircuit(
&Poseidon2Circuit{
Input: make([]frontend.Variable, nbInputs),
Expected: make([]frontend.Variable, nbInputs),
}, test.WithValidAssignment(&Poseidon2Circuit{
Input: circInput,
Expected: expected,
}), test.WithCurves(ecc.BLS12_377)) // we have parametrized currently only for BLS12-377
}