On Fri, Feb 02, 2024 at 01:40:21PM -0800, Hayden Blauzvern wrote:
Thanks Rasmus! I've cc'd the list and added Bob who's interested in this topic too.
Great, and hi Bob! Happy to have you CC:ed here as well :).
What submit latency are you willing to accept? I'm asking because
whether you need ~1s or ~10s will influence the options.
I'd like to keep this latency as low as possible. It would be a breaking change across the ecosystem if we upped latency to ~10s, since I assume clients have not configured their timeouts to expect latency that high. That's not to say we couldn't make this change, as we could provide a different API; I'd just like to explore a low-latency option initially.
FWIW, we haven't made any detailed analysis of why the KISS approach that I referred to as ~10s couldn't be, e.g., ~3s. So if you'd like to weigh that into your exploration, it might be worth thinking about. But it makes sense to try and minimize latency given your current design!
I.e., the log can keep track of a witness' latest state X, then provide
to the witness a new checkpoint Y and a consistency proof that is valid from X -> Y. If all goes well, the witness returns its cosignature. If they are out of sync, the log needs to try again with the right state.
Assuming that all witnesses are responsive and maintain the same state, this could work. Keeping track of N different witnesses is doable, but I think it's likely they would get out of sync, e.g., a request to cosign a checkpoint times out but the witness still verifies and persists the checkpoint. This isn't a blocker, though; it's just an extra call if needed.
I think that's a reasonable (and crucial) assumption. I would not recommend putting a witness into a trust policy that doesn't have a convincing plan for how to stay online and responsive most of the time.
Pretty sure you will run into some interesting implementation details here, though, if you go for the lowest-latency option, as no one has dogfooded such an implementation of the protocol yet. It is a bit more involved than the KISS approach (so you're trading latency vs. complexity here).
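To make the per-witness bookkeeping concrete, here is a rough Go sketch of what the log side of that lowest-latency variant could look like. All type and function names are made up for illustration; this is not Sigsum's or Sigstore's actual API.

    package sketch

    import (
        "context"
        "errors"
    )

    // Checkpoint is a signed tree head that the log wants cosigned.
    type Checkpoint struct {
        Size     uint64
        RootHash [32]byte
    }

    // ErrOutOfSync is what a witness answers when the "old size" in a request
    // does not match the tree size the witness has persisted.
    var ErrOutOfSync = errors.New("witness is at a different tree size")

    // Witness is the log's client for a single witness endpoint.
    type Witness interface {
        // AddCheckpoint asks the witness to cosign cp, given a consistency
        // proof from oldSize to cp.Size. On a size mismatch it returns
        // ErrOutOfSync together with the size the witness actually has.
        AddCheckpoint(ctx context.Context, oldSize uint64, proof [][]byte, cp Checkpoint) (cosig []byte, witnessSize uint64, err error)
    }

    // ConsistencyProver stands in for the log's own storage layer, which can
    // produce Merkle consistency proofs between two tree sizes.
    type ConsistencyProver interface {
        ProveConsistency(from, to uint64) ([][]byte, error)
    }

    // Cosign obtains one cosignature on cp. lastSize is the tree size the log
    // believes this witness is at; the size to remember for next time is
    // returned. A single retry covers the case where an earlier request timed
    // out on the log's side but was still verified and persisted by the witness.
    func Cosign(ctx context.Context, log ConsistencyProver, w Witness, lastSize uint64, cp Checkpoint) ([]byte, uint64, error) {
        proof, err := log.ProveConsistency(lastSize, cp.Size)
        if err != nil {
            return nil, lastSize, err
        }
        cosig, witnessSize, err := w.AddCheckpoint(ctx, lastSize, proof, cp)
        if errors.Is(err, ErrOutOfSync) {
            // Our cached view was stale; retry from the size the witness reported.
            if proof, err = log.ProveConsistency(witnessSize, cp.Size); err != nil {
                return nil, lastSize, err
            }
            cosig, _, err = w.AddCheckpoint(ctx, witnessSize, proof, cp)
        }
        if err != nil {
            return nil, lastSize, err
        }
        return cosig, cp.Size, nil
    }

The single retry in Cosign is the "extra call" mentioned above for the timed-out-but-persisted case.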
The current plan for Sigsum is to accept up to T seconds of logging
latency, where T is on the order of 5-10s. Every T seconds the log selects the current checkpoint, then collects as many cosignatures as possible before making the result available and starting all over again.
This seems like the most sensible approach, assuming that the latency can be accepted by the ecosystem. Batching entries is something we've discussed before; there are other performance benefits besides witnessing.
Yeah, and it kinda makes sense not to push more complicated low-latency solutions on use cases that work without them. I think most use cases can tolerate a little bit of latency, but there are of course some exceptions.
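For concreteness, the T-second loop could look roughly like the sketch below (continuing the made-up types from the earlier sketch; again, not actual Sigsum code).

    package sketch

    import (
        "context"
        "time"
    )

    // CollectCosignatures asks every configured witness to cosign cp (e.g. by
    // calling Cosign from the earlier sketch per witness) and returns whatever
    // cosignatures arrived before ctx expired.
    type CollectCosignatures func(ctx context.Context, cp Checkpoint) [][]byte

    // Publish makes a cosigned checkpoint available to the log's clients.
    type Publish func(cp Checkpoint, cosigs [][]byte)

    // RunBatched is the KISS loop: every T, take the current tree head, spend
    // most of the interval gathering cosignatures, publish the result, and
    // start over. Submitters therefore see up to ~T of extra latency before
    // their entry is covered by a witnessed checkpoint.
    func RunBatched(ctx context.Context, T time.Duration, currentCheckpoint func() Checkpoint, collect CollectCosignatures, publish Publish) {
        ticker := time.NewTicker(T)
        defer ticker.Stop()
        for {
            select {
            case <-ctx.Done():
                return
            case <-ticker.C:
                cp := currentCheckpoint()
                // Leave a little headroom so publishing still happens within the interval.
                cctx, cancel := context.WithTimeout(ctx, T-T/10)
                cosigs := collect(cctx, cp)
                cancel()
                publish(cp, cosigs)
            }
        }
    }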
An alternative implementation of the same witness protocol would be as follows: always be in the process of creating the next witnessed checkpoint. I.e., as soon as one witnessed checkpoint is finalized, start all over again, because the log's tree has already moved forward. To keep the latency down, only collect the minimum number of cosignatures needed to satisfy all trust policies that the log's users depend on.
This makes sense, though I think adding some latency as suggested above makes this more straightforward. One detail, which may not be relevant depending on your order of operations, is that we just need to confirm that the inclusion proof returned will be based on the cosigned checkpoint. Currently, our workflow is to first request an inclusion proof for the latest tree head and then sign the tree head.
You could also consider making the submitter fetch the proof using a separate API based on the cosigned checkpoint that they are happy with.
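One way that could look from the submitter's side is sketched below, just as an illustration with made-up names and API shape, not a definitive design.

    package sketch

    import (
        "context"
        "fmt"
    )

    // LogClient is a hypothetical read API split in two: one endpoint for the
    // latest cosigned checkpoint, and a separate one for an inclusion proof
    // against an explicitly named tree size.
    type LogClient interface {
        GetCosignedCheckpoint(ctx context.Context) (Checkpoint, [][]byte, error)
        GetInclusionProof(ctx context.Context, treeSize uint64, leafHash [32]byte) ([][]byte, error)
    }

    // TrustPolicy decides whether a set of cosignatures is sufficient,
    // e.g. "any k of these n witnesses".
    type TrustPolicy interface {
        Satisfied(cp Checkpoint, cosigs [][]byte) bool
    }

    // ProofForCosignedCheckpoint first picks a cosigned checkpoint the
    // submitter is happy with, and only then asks for an inclusion proof
    // against exactly that tree size, so the proof verifies against the
    // cosigned root rather than against some newer tree head. In practice
    // the submitter would poll until a satisfying checkpoint that covers
    // its entry shows up.
    func ProofForCosignedCheckpoint(ctx context.Context, c LogClient, policy TrustPolicy, leafHash [32]byte) (Checkpoint, [][]byte, error) {
        cp, cosigs, err := c.GetCosignedCheckpoint(ctx)
        if err != nil {
            return Checkpoint{}, nil, err
        }
        if !policy.Satisfied(cp, cosigs) {
            return Checkpoint{}, nil, fmt.Errorf("checkpoint at size %d does not satisfy the trust policy", cp.Size)
        }
        proof, err := c.GetInclusionProof(ctx, cp.Size, leafHash)
        if err != nil {
            return Checkpoint{}, nil, err
        }
        // The caller verifies proof against cp.RootHash before trusting inclusion.
        return cp, proof, nil
    }

Splitting the read APIs like this sidesteps the ordering concern above, since the proof is only ever requested against a tree head that is already cosigned.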
-Rasmus