15 Hash-based signatures

Hash-based signatures are signature schemes that only use hash functions in their construction (no LWE or similar). Using only assumptions about hash functions is beneficial because assumptions like collision-resistance are less likely to get broken than more complicated structured assumptions like LWE. A representative of a hash-based signature is the Lamport signature, outlined below:

Definition 15.1

Given an honest signature \(\sigma\) for some \(h\), only one value from each \(sk\)-column is revealed. To produce a signature for different \(h'\neq h\), the adversary needs at least one entry in a different row. Consequently, an adversary cannot produce signature for \(h'\). So we expect the Lamport signature scheme to be at least one-time secure:

Theorem

: If \(H\) is one-way and collision-resistant, then Lamport-signatures are EF-OT-CMA secure.

: Since \(H\) is collision-resistant, it suffices to show that given a signature for some \(h\), one cannot find signature for \(h^*\neq h\).

Assume an adversary \(A\) has a signature \(\sigma\) for \(h\). We want to construct another signature \(\sigma^*\) for an \(h^*\neq h\). \(h^*\) and \(h\) must differ in at least one index \(i\), at which \(A\) get value \(x_{ih_i}\) form \(\sigma\) but not \(x_{ih^*_i}\). If \(A\) were to construct a signature \(\sigma^*\) for \(h^*\), it would find \(x_{ih^*_i}\) given only \(H(ih^*_i)\), which is can obtain from \(pk\). This is not possible by the one-wayness assumption of \(H\), therefore, \(A\) cannot produce \(\sigma^*\) for \(m^*\). \(\hfill\blacksquare\)

The argument above does not hold if \(A\) sees signatures for two messages \(m\) and \(m'\). There could be many indices where \(h=H(m)\) and \(h'=H(m')\) differ, and for each of these, we know both elements of a column in \(sk\). This means that the Lamport signature insecure when several messages are signed, making it useful for one-time signatures only.

15.1 Tree-based signatures

The main idea behind this kind of signature is that we build a tree of public keys. The KeyGen algorithm begins with a \(pk_\lambda\) in the root node, where \(\lambda\) is the empty word. Each parent node with public key \(pk_x\), where \(x\) is a bitstring, branches into two child nodes with public keys \(pk_{i0}\) and \(pk_{i1}\) for the left and right branches, respectively, as illustrated below for a 4-level tree:

Basic \(pk\)-tree

In addition to this \(pk\)-tree, KeyGen also generates an \(sk\)-tree with each node containing an \(sk_x\) for its corresponding bitstring \(x\). \(pk_\lambda\) is then transmitted to a Verify algorithm, which given another \(pk_x\), should be able to tell whether \(pk_x\) belongs to the \(pk\)-tree generated by \(pk_\lambda\) or not. To do this, we add to each node (except those in the lowest level) a signature \(\sigma_x=\textrm{Sign}_{\textrm{OT}}(sk_x, (pk_{x0}, pk_{x1}))\), as shown below (here \(\textrm{Sign}_{\textrm{OT}}\) is an arbitrary one-time signature XXXX):

\(pk\)-tree with signatures

Suppose we want to authenticate some \(pk_x\) from the lowest level in the tree. To do this, we need send \(pk_{x'}\) and \(\sigma_{x'}\) of all the parent nodes of \(x\) since the root nodes and their immediate neigbors. This way, Verifier can use \(\sigma_{x'}\) to verify the authenticity of \(pk_{x'0}\) and \(pk_{x'1}\) all the way from \(\lambda\) to \(x\):

Authentication path

Note that even though the size of the tree grows as \(\mathcal{O}(2^n)\), the size of the authentication path grows only as \(\mathcal{O}(n)\).

We can sign a message \(m\) as follows:

This allows the verifier to check \(pk_a\), and whether \(m\) was signed with respect to \(pk_a\). Note that no \(sk_{x'}\) is used twice in this process unless the same \(a\) is picked twice, which happens with probability \(\approx 2^{-n}\).

We can try using this to build a multi-time signature:

Notes: For a tree of depth \(n\), up to \(2^n\) messages can be signed, \(\sigma\) contains \(\mathcal{O}(n)\) \(pk\)’s and signatures, so signing and verification are efficient, but \(pk_n\)-generation produces \(\Omega(2^n)\) values, making key generation exponentially inefficient.

One possible way around this could be to compute the \(pk_x\), \(sk_x\), \(\sigma_x\) on the fly, as needed. The problem is that the signing algorithm would need to remember which ones have already been produced, otherwise it would reuse \(sk_x\) on a different message. Instead, we can use pseudorandomness to generate all nodes consistently.

We will use a pseudorandom function (PRF) \(F(k_F, x)\). The definition of a PRF is analogous to a PRP. We pick \(k_F\) during key generation, and to generate some \(pk_x\), do: \[ (pk_x, sk_x)\leftarrow\textrm{KeyGen}_\textrm{OT}(; F(k_F, x))\]

This will generate our final tree-based scheme:

Theorem

: If (\(\textrm{KeyGen}_{\textrm{OT}}\), \(\textrm{Sign}_{\textrm{OT}}\), \(\textrm{Verify}_{\textrm{OT}}\)) is EF-OT-CMA secure and \(\textrm{Sign}_{\textrm{OT}}\) is deterministic, then (\(\textrm{KeyGen}_{\textrm{MT}}\), \(\textrm{Sign}_{\textrm{MT}}\), \(\textrm{Verify}_{\textrm{MT}}\)) is EF-CMA secure.

Some pros of tree-based signatures are their security (only using hash-assumptions), and small \(pk\) and \(sk\), but the signatures produced are quite big for practical purposes (especially using the unoptimized form we described).