On Thu, Sep 29, 2022 at 02:46:43PM +0200, Niels Möller wrote:
rgdd--- via Sigsum-general sigsum-general@lists.sigsum.org writes:
Top-posting because I think all questions are quite related. Ultimately, what we need is for each signature that appears in a leaf to be verifiable without any further information. Otherwise, a monitor cannot distinguish between a leaf that was fabricated by the log and an actual signature operation by someone.
Thanks, I was missing that use case for verifying a leaf signature based only on the leaf and the public key. So a monitor is expected to be configured with the actual public key to look for, rather than just its key hash.
Yep!
- Removing checksum: this leaves a monitor with a 64-byte Ed25519 signature. In other words, the bytes that were signed would be missing completely.
- Permitting SHA-512 while storing SHA-256(M) as checksum: this leaves a monitor with a 64-byte Ed25519 signature and a 32-byte checksum that is unrelated to signature verification.
I see one problem with this, though. The monitor can't simply use the ssh-keygen command to verify the signature, since that command expects to get the *message* as input, not the hash thereof. Which kind of defeats the idea of piggybacking on ssh tools.
I disagree. The value of piggy-backing on SSH tooling is for the signer who can access their private key with good solutions that already exist.
Note that the verifier will never get sufficient amounts of verification by only using ssh-keygen. For example, ssh-keygen does not "speak" transparency log proofs, transparency log policies, etc. Sigsum needs to provide such tools and libraries, and verifiers need to rely on them.
To make it possible to verify the signature based on only public key and leaf, and sticking to black-box usage of ssh-style signatures, I think we need one more level of hashing.
message ; submitted to the log
checksum = H(message) ; to be published by log
signature = ssh-style signature on checksum (i.e., M = checksum in the signature format spec).
The ssh-style signature will internally compute H(H(message)) when formatting the data passed to the ed25519 signature primitives.
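A minimal sketch of what the ssh layer would end up feeding to Ed25519 under this proposal, assuming the signed-data layout from OpenSSH's PROTOCOL.sshsig (magic "SSHSIG", then namespace, reserved, hash algorithm and H(M) as length-prefixed strings). The namespace value and the 32-byte placeholder message are made up for illustration, and no actual signing is done here:

```python
import hashlib
import struct


def ssh_string(b: bytes) -> bytes:
    # SSH wire "string": 4-byte big-endian length followed by the bytes.
    return struct.pack(">I", len(b)) + b


def sshsig_signed_data(m: bytes, namespace: bytes, hash_alg: str = "sha512") -> bytes:
    # Blob that an ssh-style signature passes to the signature primitive.
    # Note that it contains H(m), not m itself.
    digest = hashlib.new(hash_alg, m).digest()
    return (
        b"SSHSIG"
        + ssh_string(namespace)
        + ssh_string(b"")                 # reserved, empty
        + ssh_string(hash_alg.encode())
        + ssh_string(digest)
    )


# Hypothetical values, for illustration only.
namespace = b"example-namespace"          # assumption, not a Sigsum-defined namespace
message = bytes(32)                       # what is submitted to the log
checksum = hashlib.sha256(message).digest()  # what the log publishes

# M = checksum, so the blob embeds SHA512(SHA256(message)).
to_sign = sshsig_signed_data(checksum, namespace)
```

With M = checksum, a monitor holding only the leaf (checksum, signature) and the public key can rebuild `to_sign` and check the signature, without ever seeing message.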
And in the typical case that message = H(data) of some data not revealed to the log, we will end up with H^3(data). Which certainly looks like overdoing it, but the nice thing is that each level of hashing is owned by its own layer, and they're not interacting. It will, e.g., work perfectly fine with
message=SHA3(data) ; application layer
Note that message must be exactly 32 bytes in Sigsum. So, you wouldn't be able to use SHA3 here (and you probably shouldn't; that means you rely on two hash functions to be collision resistant instead of one).
checksum = SHA256(message) ; sigsum layer
hash = SHA512(checksum) ; ssh signature layer
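The three layers can be sketched as follows, using SHA-256 at the application layer so that message meets the 32-byte requirement. The function name and the sample data are hypothetical, and the length check is only an illustration of the constraint, not Sigsum's actual code:

```python
import hashlib


def sigsum_checksum(message: bytes) -> bytes:
    # Sigsum layer: the message must be exactly 32 bytes.
    if len(message) != 32:
        raise ValueError("message must be exactly 32 bytes")
    return hashlib.sha256(message).digest()


data = b"example application data"             # never revealed to the log
message = hashlib.sha256(data).digest()        # application layer
checksum = sigsum_checksum(message)            # sigsum layer
ssh_hash = hashlib.sha512(checksum).digest()   # ssh signature layer
```

A 64-byte application-layer digest passed as message would be rejected by the length check above.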
And in this model, the only purpose of the sigsum layer hash, as I understand it, is to avoid log poisoning.
Yes, the sigsum layer hash is to avoid poisoning; and the primary purpose of an application layer hash is to limit what a log learns about application messages. (Small messages are of course also nice though.)
Note that we wouldn't need a separate "sigsum-layer hash" if we were OK accepting arbitrary-sized application messages (which we are not).
And the hashing in the ssh layer serves no purpose for us, but it's the way that signature operation is defined (likely because it makes it easier to sign large files, something that we don't need).
I see your point if it is a desired property to verify leaf signatures in isolation with ssh-keygen. Would you say that the complexity is decreased, about the same, or increased if this change was proposed?
-Rasmus