[Sigsum-general] Re: We include the checksum in tree_leaf. But what is it used for?

29 Sep 2022

      On Thu, Sep 29, 2022 at 02:46:43PM +0200, Niels Möller wrote:
...
rgdd--- via Sigsum-general sigsum-general@lists.sigsum.org writes:
...
Top-posting because I think all questions are quite related.  Ultimately, what
we need is for each signature that appears in a leaf to be verifiable without
any further information.  Otherwise, a monitor cannot distinguish between a leaf
that was fabricated by the log and an actual signature operation by someone.
Thanks, I was missing that use case for verifying a leaf signature based
only on the leaf and the public key. So a monitor is expected to be
configured with the actual public key to look for, rather than just its
key hash.
Yep!
...
...

Removing checksum: this leaves a monitor with a 64-byte Ed25519 signature.
In other words, the bytes that were signed would be missing completely.
Permitting SHA-512 while storing SHA-256(M) as checksum: this leaves a
monitor with a 64-byte Ed25519 signature and a 32-byte checksum that is
unrelated to signature verification.

I see one problem with this, though. The monitor can't simply use the
ssh-keygen command to verify the signature, since that command expects
to get the *message* as input, not the hash thereof. Which kind of
defeats the idea of piggybacking on ssh tools.
I disagree.  The value of piggy-backing on SSH tooling is for the signer
who can access their private key with good solutions that already exist.
Note that the verifier will never get sufficient amounts of verification
by only using ssh-keygen.  For example, ssh-keygen does not "speak"
transparency log proofs, transparency log policies, etc.  Sigsum needs
to provide such tools and libraries, and verifiers need to rely on them.
...
To make it possible to verify the signature based on only public key and
leaf, and sticking to black-box usage of ssh-style signatures, I think
we need one more level of hashing.
message               ; submitted to the log
checksum = H(message) ; to be published by log
signature = ssh-style signature on checksum (i.e., M = checksum in the
  signature format spec).
The ssh-style signature will internally compute H(H(message)) when
formatting the data passed to the ed25519 signature primitives.
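
To make the H(H(message)) step concrete, here is a rough Python sketch of how
the data handed to the Ed25519 primitive could be assembled when M = checksum.
The field layout follows the signed-data blob in OpenSSH's PROTOCOL.sshsig as I
understand it; the namespace value and the helper names are placeholders chosen
for illustration, not something defined in this thread or by Sigsum.

```python
import hashlib
import struct


def ssh_string(b: bytes) -> bytes:
    """Encode bytes as an SSH 'string': 4-byte big-endian length, then data."""
    return struct.pack(">I", len(b)) + b


def sshsig_signed_data(m: bytes, namespace: bytes, hash_alg: str = "sha512") -> bytes:
    """Build the blob that an ssh-style signature actually signs.

    Note that the signature layer hashes M itself; the caller does not
    control this step.
    """
    h = hashlib.new(hash_alg, m).digest()
    return (
        b"SSHSIG"                        # magic preamble
        + ssh_string(namespace)          # namespace (placeholder value below)
        + ssh_string(b"")                # reserved, empty
        + ssh_string(hash_alg.encode())  # name of the hash applied to M
        + ssh_string(h)                  # H(M)
    )


# message is what gets submitted to the log, checksum is what the log publishes.
message = hashlib.sha256(b"example data").digest()
checksum = hashlib.sha256(message).digest()

# With M = checksum, the blob ends in SHA512(SHA256(message)).
blob = sshsig_signed_data(checksum, b"example-namespace")
```

Since checksum is part of the leaf, a monitor holding the public key could
rebuild this blob from the leaf alone, which is the property being asked for.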
And in the typical case that message = H(data) of some data not revealed
to the log, we will end up with H^3(data). Which certainly looks like
overdoing it, but the nice thing is that each level of hashing is owned
by its own layer, and they're not interacting. It will, e.g., work
perfectly fine with
message=SHA3(data)         ; application layer
Note that message must be exactly 32 bytes in Sigsum.  So, you wouldn't
be able to use SHA3 here (and you probably shouldn't; that means you
rely on two hash functions to be collision resistant instead of one).
...
checksum = SHA256(message) ; sigsum layer
hash = SHA512(checksum)    ; ssh signature layer
And in this model, the only purpose of the sigsum layer hash, as I
understand it, is to avoid log poisoning.
Yes, the sigsum layer hash is to avoid poisoning; and the primary purpose
of an application layer hash is to limit what a log learns about
application messages.  (Small messages are of course also nice though.)
Note that we wouldn't need a separate "sigsum-layer hash" if we were OK
accepting arbitrary-sized application messages (which we are not).
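
The three layers discussed above can be sketched in Python as follows. This is
only an illustration of the proposal under discussion: the function names are
made up, and SHA-256 at the application layer is an assumption (the thread only
requires that message be exactly 32 bytes).

```python
import hashlib

MESSAGE_SIZE = 32  # Sigsum requires the submitted message to be exactly 32 bytes


def application_layer(data: bytes) -> bytes:
    """Hash the application data so the log learns little about it."""
    return hashlib.sha256(data).digest()


def sigsum_layer(message: bytes) -> bytes:
    """Compute the checksum that the log publishes instead of the message."""
    if len(message) != MESSAGE_SIZE:
        raise ValueError("message must be exactly 32 bytes")
    return hashlib.sha256(message).digest()


def ssh_signature_layer(checksum: bytes) -> bytes:
    """Hash applied inside the ssh-style signature before Ed25519 signing."""
    return hashlib.sha512(checksum).digest()


data = b"some release artifact"
message = application_layer(data)       # submitted to the log
checksum = sigsum_layer(message)        # published in the leaf
digest = ssh_signature_layer(checksum)  # embedded in the signed blob
```

Each function only sees the output of the layer above it, which is the sense in
which the layers do not interact.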
...
And the hashing in the ssh
layer serves no purpose for us, but it's the way that signature
operation is defined (likely because it makes it easier to sign large
files, something that we don't need).
I see your point if it is a desired property to verify leaf signatures
in isolation with ssh-keygen.  Would you say that the complexity is
decreased, about the same, or increased if this change was proposed?
-Rasmus
