Hi,
I said during the witnessing breakout at tdev that I'm afraid that the origin line as id is coming back to bite us. Let me expand a little on the problem I see.
For a start, when adding keys for logs or witnesses to the trust policy for log users, it's clear that care and due diligence are required. That's natural, and I don't think having origin lines or other non-cryptographic ids in the picture is much of a problem there.
However, I think it's desirable that logs and witnesses are decoupled. When an operator of a log or witness gets an email "please add my new shiny witness/log to your config", I think it's rather important that no subtle or complicated due diligence is required, and that there are no severe or surprising consequences for temporarily adding a malicious entry.
The main problem is for witness operators, in particular when trying to run a witness on a device like the tkey with minimal configuration.
For a sigsum log, with origin line based on keyhash, the only irreversible consequence when a witness operator adds a new log to the witness config is a commitment to storing a record (on the order of 100 bytes) for that log for the entire lifetime of the witness. If the log causes other operational problems, it could be rate limited or completely removed from the config, and that's it. The tkey app could happily accept any add-checkpoint request + log pubkey from the host, verify everything, and store the (pubkey, treehead) on success. A compromised host could fill up the tkey storage, preventing the witness from adding new logs later on, but that's about the worst it could do.
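To illustrate why no external authority is needed in this case, here is a minimal sketch of the check a witness could do. It assumes the origin line is "sigsum.org/v1/tree/" followed by the hex-encoded SHA-256 hash of the log's 32-byte Ed25519 public key; the function names are made up for this example, and signature verification is left out.

```python
import hashlib

# Assumed prefix for sigsum checkpoint origin lines.
SIGSUM_ORIGIN_PREFIX = "sigsum.org/v1/tree/"


def sigsum_origin(pubkey: bytes) -> str:
    """Derive the origin line from a log public key (hypothetical helper)."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte Ed25519 public key")
    # Key hash assumed to be SHA-256 over the raw public key bytes.
    return SIGSUM_ORIGIN_PREFIX + hashlib.sha256(pubkey).hexdigest()


def origin_matches_key(origin: str, pubkey: bytes) -> bool:
    """True if the origin line is the one derived from this key."""
    return origin == sigsum_origin(pubkey)
```

Since the origin is a function of the key, a host that supplies some other key for a given origin line is rejected without the witness knowing anything else about the log.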
On the other hand, if the host provides a pubkey and a checkpoint with an arbitrary origin line, that mapping needs to be authenticated. If not, an attacker could make its own keypair for anyone else's origin line, push a tree head with new leaves to the witness, and the witness would then refuse cosigning genuine tree heads for that origin.
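The squatting problem can be shown with a toy model of a witness that accepts whatever (origin, pubkey) pair it sees first. This is only a sketch: the class, the origin "example.org/log" and the key values are invented, and signature and consistency proof checks are assumed to have passed already.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWitness:
    # Maps origin line -> (public key, last cosigned tree size).
    logs: dict = field(default_factory=dict)

    def add_checkpoint(self, origin: str, pubkey: bytes, size: int) -> bool:
        """Return True if the witness would cosign this tree head."""
        known = self.logs.get(origin)
        if known is None:
            # Unauthenticated first use: whoever arrives first owns the origin.
            self.logs[origin] = (pubkey, size)
            return True
        known_key, known_size = known
        if pubkey != known_key or size < known_size:
            # Different key, or a tree head older than the stored one.
            return False
        self.logs[origin] = (pubkey, size)
        return True


witness = ToyWitness()
# The attacker registers its own key for someone else's origin line.
squatted = witness.add_checkpoint("example.org/log", b"attacker-key", 10)
# The genuine log now shows up with its real key and is refused.
genuine = witness.add_checkpoint("example.org/log", b"genuine-key", 5)
```

After these two calls, squatted is True and genuine is False: the witness is stuck on the attacker's view of that origin.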
If we envision that people will start lots of application specific logs, and want them cosigned by public witnesses (e.g., consider a thousand github projects doing "serverless logs" in their github actions (if that makes sense, I'm not that familiar with github)), it's clear that adding logs to a witness config must be easy.
The witness needs an authority mapping origin lines to keys. In the tkey case, the simple solution would be to have that authority sign some kind of certs defining this binding, and embed the CA pubkey in the tkey app binary. (For key rotation, one could consider accepting certs signed by the previous key rather than the CA, but the CA is still needed for new logs). And then origin line owners should demand transparency for those certs, and we're down the rabbit hole.
Finally, on the log side, for a non-sigsum log that publishes cosignatures as checkpoints, we have a related issue with the witness key names. I think that's less severe: if a bad witness is added, the log might publish cosignatures where the pubkey doesn't belong to the "proper" owner of a key name, which will look invalid to users that have the right keys for that name, but that is a recoverable problem; once the problem is pointed out, the log can just drop that witness, and it will not appear on later checkpoints.
So what would be my takeaways, accepting that the "ship has sailed" on radical changes, and this isn't the right time for cosignature/v2?
1. We should define an origin line scheme similar to sigsum.org/v1/tree for use by non-sigsum logs that don't need key rotation. And strongly recommend that logs that don't have an urgent need for key rotation use that scheme. The advantage for logs that follow this scheme is that it's very cheap for witnesses to add their logs, since no due diligence on who's the proper "owner" of that origin line is needed.
We will in effect get two classes of logs: Those with self-authenticating origin lines, and those that require additional data or context to establish an authentic mapping between name and keys. (Again, this is an issue in the context of making witness operation easy; defining a proper trust policy will always require more information about a log than just what its key is).
2. Witness operators need to document what procedures they use to validate origin lines of logs they are asked to witness, and how they validate new public keys to be added for that origin line.
3. It would be nice if those operating logs identified by arbitrary origin lines, like "go.sum database tree", outline what procedures they'd like a witness to take before accepting a new public key for that log. For example, say Debian wants to run a witness for the go checksum database and patch go tooling to require that as an additional witness in the trust policy, how will they get authentic information about key rotation events? (Making a replica of the log and witnessing that, instead of witnessing the upstream log, as was suggested at the breakout, doesn't seem like an attractive alternative to me. With a new origin line, witness cosignatures on the upstream log won't verify, so they are effectively lost. And one would still need that authentic key for the upstream log, when building the replica. But maybe that alternative could be fleshed out).
4. How should we act when we find that a witness used a "wrong" or unexpected public key to verify a checkpoint for some origin? The witness clearly can't cosign any later tree heads for the proper view of that log, and it should be removed from policies involving that log. But perhaps we should make it very clear that this is a possible result of an honest mistake, and not hold that against the witness operator's reputation?
5. We could think of alternative ways to do key rotation, e.g., starting a new log under a fresh key, and adding a special leaf at index 0 including the signed tree head of the old log + needed metadata. And possibly with a corresponding forward link in the tombstone message of the old log.
/nisse