From MAC to Wabisabi

From MAC to Wabisabi

Thanks to nothingmuch for answering several questions about the mechanics of Wabisabi.

Preamble - big-and-randomized

First, assume we think it's valuable to have big coinjoins with random amounts for all the inputs and outputs, and probably for this one specific reason: we want to make payments from coinjoins and, possibly, to make payments within coinjoins (the latter is literally what is meant by payjoin, so that is part of this discussion, note however we'd be talking about payjoin batched together with other sub-transactions, so a lot of earlier analysis of payjoin doesn't apply).

Second let's partially address why this is, at least superficially, a bad idea, even a terrible one: previous discussion of the subset sum problem pointed out that some of the time (being deliberately vague about how much of the time!) a coinjoin with non-equal amounts can be easily analyzed to find the sub-transactions which are really happening, removing any privacy boost. So that's not good.

Then, let's mention, without writing an essay about it (though it's a fascinating topic), that there are surprising outcomes of scaling the number of inputs and outputs (or just "coins") in such a model. Due the combinatorial nature of the subset sum problem (or more generally the "knapsack problem"), even having numbers like 50-100 on the input and output side (remember: these may be batched payments! not like separately created coinjoins, extra to payment transactions) can lead to a ridiculous combinatorial blowup making calculation of subsets near impossible. To illustrate: he set of subsets of a set is called the "power-set" and its size is \(2^N\) where \(N\) is the number of elements of the set; but the number of partitions of a set is found using Bell's number \(B_N\), which scales (or doesn't!) even faster than exponential (i.e. faster than \(a^N\) where \(a\) is constant, here \(a\) is a function of \(N\), although it's pretty complex). \(B_2=2, B_{10} \simeq 115000\), while \(B_{100}\) has 116 digits, in decimal. So it's easy to see than even at 50 inputs and 50 outputs, the enumeration process by brute force is not possible.

This point is expanded on in some detail in the cashfusion writeup.

However the point is definitely controversial, basically because brute force is not the only way to approach an attempt to deanonymize a coinjoin. A simple thing like a rounded value (0.25000000 BTC) substantially (sort of exponentially) reduces the search space.

Those who completely dismiss this approach based on the idea "well sure, it's worst case impossible to analyze, but not average, typical case!" should notice a really key subtlety though: the claim is not only that such constructions are computationally impractical to analyze - it's also that they have multiple, and in fact a huge number of mathematically valid solutions, at least when we scale up to very large numbers of utxos. Moreover this paper ("Knapsack") from 2017 tries to construct a framework for deliberately creating such obfuscation (with the same goal - unequal sized coinjoins, allowing payment).

This deserves more argument and discussion, see e.g. this discussion on the bitcoin mailing list from February this year. But we are going to move on to other elements of this story.

Wabisabi, from the ground up.

The paper is here for those who don't need context. I suspect that group is quite small though!

With respect the previous section, the plan for Wabisabi is not to use unconstrained random input and output sizes, as I understand it from the developers, but to use sophisticated ideas based partly on the "Knapsack" style approach mentioned above, but this is a topic for another blog post.

But let's say we buy into the basic idea: a large coinjoin, we'll say for simplicity, 50 inputs and 50 outputs, where there may be a complex ruleset about the values of those inputs and outputs, which we aren't specifying here. Some users will just be mixing and some may be paying someone for a specific good or service, with the output. Probably rarer, but particularly cool, will be if Alice is paying Bob but Bob also participates in the coinjoin, i.e. he is also contributing input utxos to the coinjoin, but gets more out and Alice gets less, effecting a payjoin.

Scenario #1 : Server as coordinator, meaning a server-defined schedule, and no privacy for users w.r.t. server

If we don't care if the server knows everything, each user can just securely connect and pass (set of inputs, set of outputs); they can be random amounts as per the preamble, and the server will accept if the inputs are verifiable on the blockchain, and if the total payment balances. This would be tricky for the payjoin style payments as that means interaction between the users, but in principle that could work too.

Note how this is hardly a new idea, even the earliest implementation SharedCoin did something similar to this (it's a long story! but let's say).

However this SPOF scenario seems unacceptable to anybody. The server could be keeping a record of every linkage, ever, of the coinjoins created in this system. Ultimately this level of centralization breaks, anyway, via external pressure or otherwise.

Scenario #2: Taker as coordinator, meaning taker chooses time of event, and privacy only for takers, not for makers

The description of what is done is exactly as above, except substitute Taker for server. The outcome is practically different: at least one user (who likely pays for it) gets a privacy guarantee. How is this different from Joinmarket today? First, it hasn't been considered seriously to use randomized amounts; second, 50 party joins have not been at all practical (until recently it was not very practical, due to low participation rate (unless you chose a narrow range of amount), however that has increased; but, the IRC message channel used is not really able to handle the traffic needed for 50 party joins, see this earlier blog post for some thoughts on that). But if you take away those issues, this scenario is possible.

But notice something - exactly what makes this new "random payments, large numbers of counterparties" paradigm attractive is the possibility of multiple payments going on at once - and that's counter to Joinmarket's original concept of "there is a guy paying for the privilege of controlling everything". More on this later.

Scenario #3: Current Wasabi, Chaumian coinjoin

I have only passing familiarity with the technical underpinnings of Wasabi as is, but essentially it is based on blinded signatures of a coinjoin output (see Chaumian coinjoin for a pretty intuitive diagrammatic explanation). This fairly simple cryptographic primitive (the blinded signature) is in itself enough, because Wasabi currently is only blinding the specific set of outputs (utxos-to-be) which all have equal size and are indistinguishable. As long as the Wasabi coordinating server is prevented from knowing those linkages, due to the blinding, then the later full construction of the transaction will not expose ownership of the equal-sized outputs ("coinjoin outputs"). On the other hand, let's not have too simple of a mental model of Wasabi - it's crucial in this that the users make separate network connections (effectively, have separate pseudonyms) for when they present their cleartext outputs, and when they earlier presented their inputs (and change)); otherwise the cryptography would be sidestepped and the server would know all the mappings.

Can you get the same protection, i.e. keeping the linkages private from the server, in a big-and-randomized model, using current Wasabi blind signatures?

It's easy to see the problem: when the user comes along with a new identity and says "here are my outputs: 0.29511342 BTC and 0.01112222 BTC' the server has no way of knowing that these amounts correspond to anything in the inputs. If the blind signature is being used as a token to say "I am entitled to add outputs to this coinjoin", fine, but in this scenario: a token of what, exactly?

The difference is clear: in equal-output coinjoin there is only one kind of thing you could be entitled to: a single output of the prescribed amount ("denomination"); typically it's things like 0.1BTC.

Here, if we were to preserve the tokenization approach, we'd have to have a more sophisticated object ... something similar to a supermarket gift card : it gives you the right to have a certain amount of stuff, restricted perhaps in time and space, but quantified. It's something that's issued to you, which you can use under the given conditions, but which does not have your name attached. I realise the analogy is a bit of a stretch, but you can see that gift cards have divisibility, which is crucial here in our big-and-randomized model. They usually also have anonymity which is clearly necessary.

What we need here is a homomorphic anonymous credential with attributes:

  • homomorphic - here it means we could linearly split and combine credentials. Take a credential for 10 and turn it into two credentials for 3 and 7, for example.
  • anonymous - if the credential presented could be linked to the one issued earlier, the coordinating entity can see all the linkages in the coinjoin
  • credential - this term is used in cryptography for any of a number of schemes that give rights to holders. The rights are usually with respect to some centralized entity, usually holding a private key that allows them to create such credentials (modulo a nuance about who is verifying them; we'll get in to that).
  • attributes - a credential could in its simplest form be simply binary: you are allowed to do X if you have the credential, and not otherwise. But sometimes attaching metadata inside the credential (think of e.g. a signature from a server that proved you should have access, but also that you are a level 3 user not a level 1 user, by including the level in the message that was signed).

What the Wabisabi paper, and this article, do not cover

What follows is a detailed review of the crypto constructions leading to the possibility of building a coinjoin system, with such a credential system. A full protocol however must cover other things:

  • The rules for transaction construction
  • Valid choices of amounts for inputs and outputs

This is not covered here, other than some general thoughts as outlined above.

Signatures, keyed MACs and credentials

Digital signatures are probably very familiar to any reader of this blog, and there is a detailed discussion of some fundamentals in this post. MACs, or Message Authentication Codes can be thought of as the symmetric crypto equivalent. In symmetric crypto, there is only a secret key, no public key, and that means there is no such thing as "public verification". The owner or owners of such a secret key can create a (probably unique; this is a nuance of the theory) "tag" on a message, which only a holder of the same key can verify.

On its face, such tagging might seem pointless without public verifiability, but the classic use case is for point to point communications over the public internet, in which both endpoints of the communication hold the secret key; by tagging messages in this way, integrity is assured, and the message is authenticated as coming from the intended source. Such secret keys can be pre-shared over a public communication channel using techniques like ECDH.

Creating a MAC

A simple and currently very common way of making a MAC is to use a cryptographic hash function as a PRF: just hash the key and message together (HMAC is a bit more complicated than this, but that's the basic idea): H(message || key).

At first sight it may seem weird that I'm talking about this construct - how is this related to credentials?

The most natural way to create a credential of the type described above, is to use a signature, which simply signs the rights of the holder. That's effectively what Wasabi's original design ("Chaumian coinjoin") does, but with the crucial extra feature that the signature is blind, so that the credential's redemption is not linked to its creation. Early ecash designs (indeed, from David Chaum as well as others) were heavily sophisticated variants of that basic idea. Just as original Wasabi uses fixed denominations, so did those ecash designs.

This is where we get some interesting twists, which bring in MAC as an alternative to signatures, here.

First, traditionally, MACs were preferable to signatures for performance reasons: they use hash functions, not expensive crypto math operations like RSA or - less expensive but still a lot more so than hashes - elliptic curve calculations. This is less a consideration today, but still relevant. Second, the more restrictive model of the MAC w.r.t. verification does create a different effect: such MACs are repudiable, whereas digital signatures are not repudiable (if you think about it, this is the same property as transferrability, which is of course a key property of signatures).

This plain vanilla style of MAC though (hash based), trades off functionality in favour of performance - hashes like SHA256 are intrinsically black boxy and not "algebraic". They are functions which do not allow composition; as I've had occasion to remark many times before, there is no such formula as \(H(a+b) = H(a) + H(b)\) for these traditional hash functions.

Algebraic MACs

The other approach to building a MAC might be to use discrete log or elliptic curve hardness assumptions, for example in the crudest case take \(\textrm{MAC}_k(m) = m^k\ \textrm{mod}\ p\) for the discrete log case. Comparing the two approaches, Dodis et al in Mesage Authentication, Revisited have this to say:

The former have the speed advantage, but cannot be reduced to simple number-theoretic hardness assumptions (such as the DDH assumption for NR-PRF), and are not friendly to efficient zero-knowledge proofs about authenticated messages and/or their tags, which are needed in some important applications, such as compact e-cash [12]. On the other hand, the latter are comparably inefficient due to their reliance on number theory.

Here NR-PRF refers to the Naor-Reingold. construction for a pseudorandom function.

The point about zero knowledge proofs is the trump card though: in building something like a anonymous credential with attributes, you are perforce required to be able to make attestations, using such proofs, in zero knowledge.

Security notions needed for algebraic MACs used for anonymous credentials

MACs generally want to have something called UF-CMA (unforgeability under chosen message attack) ; something we already discussed for signatures here. There are several nuances that are MAC-specific but we won't delve into too much detail (I recommend Section 6.1, 6.2 of Boneh and Shoup for an excellent rigorous description); the bottom line is that MACs must not be forgeable by non-key holders, just like signatures.

For our use case (and some others), such a MAC will also need to have a kind of "hiding" property : indistinguishability (under chosen message attack, or IND-CMA) - the tags output should not allow an attacker to guess anything about the message being tagged.

So concretely how can we use simple discrete log to build a MAC? Let's use an elliptic curve group of the type we're familiar with, generator \(G\), order \(p\). We'll try the simplest versions first and see what we need to do to make it secure:

Algebraic MAC attempt 1
  • Keygen: choose a scalar \(k\) at random
  • Tag: given a message \(m\), set the tag to \(T = mkG\).
  • Verify: not a relevant definition for such a deterministic MAC; we didn't add randomness so it's the same calculation as "Tag".

Note how this "determinism" is the same for familiar existing MAC functions like HMAC. Since it's the same information needed (the secret key \(k\)) and the same calculation, the distinction is not interesting. Shortly we'll be looking at probabilistic MACs.

Attempt 1 clearly fails, and here's one reason why: if the attacker gets to query the algorithm and ask for any MAC it likes it can choose to ask for the MAC of the message 1. That MAC is \(kG = K\). It can then take that curve point and create forgeries on any message \(m'\) it likes: \(m'K\). Secondly this kind of deterministic MAC clearly can't have the kind of hiding property we want, since it's like a commitment without any blinding factor: if you guess the value of \(m\) correctly, you can verify your guess. Thirdly, extend the above case of message '1' and we can see that it's non-resistant to forgery more generally: whenever you know the message that was tagged, you can take the output tag given by the signer, \(T = mkG\) and multiplicatively tweak the message to \(m_2 = a \times m\) by just outputting \(aT\) as the new tag. So this is very insecure.

Algebraic MAC attempt 2
  • Keygen: choose a scalar \(k\) at random
  • Tag: choose a curve point \(U\) at random, and message \(m\), output \((U, T=mkU)\)
  • Verify: Given \((U, T)\) and message \(m\), check if \(T == mkU\)

This addresses the second part of our complaint with Attempt 1, by making the MAC "probabilistic". Each new MAC is generated with a fresh random curve point (or equivalently a scalar \(u\)).

Unfortunately, Attempt 2 fails just as Attempt 1 did, when it comes to preventing forgeries (bearing in mind the previous sentence), because we can still tweak created tags in the same way. Perhaps slightly less obvious is that we can not just tweak \(T\) but also \(U\). (But it's important to bear in mind that our security game is also concerned with whether an attacker can do something clever with re-used values of that U.)

Algebraic MAC attempt 3
  • Keygen: choose a scalar \(k\) at random
  • Tag: choose a curve point \(U\) at random, and message \(m\), output \((U, T=(m+k)U)\)
  • Verify: Given \((U, T)\) and message \(m\), check if \(T == (m+k)U\)

This prevents the multiplicative tweaking which killed our first two attempts; even supposing the attacker has a given \(T\) on a given, known message \(m\), multiplying \(T\) by any constant \(a\), will create \(T^{*} = aT = (am+ak)U\) which is not a tag on any message he can state (it is a tag on \(am + (a-1)k\) but he doesn't know \(k\) , so even if he knows, or guesses, \(m\), he is stuck). However this construction still allows trivial forgery, (and fundamentally for the same reason: the additive homomorphism of the group). Here, because the key is "additively separate" from the message, you can just insert new messages using addition. If you happen to know \(m\) and you want a tag on \(m_2\) instead, just make \(T_2 = T + m_2U - mU\) (see previous note : reusing \(U\) is in-scope for our attacker).

So if we review these first 3 attempts, it's fairly clear what's going on; it's a paradigm we've seen before in the Schnorr protocol. If you only add a random secret, you allow additive forgery, while if you only multiply a random secret, you allow multiplicative forgery, but if we add both ...

Algebraic MAC attempt 4
  • Keygen: choose two scalars \(k_1, k_2\) at random
  • Tag: choose a curve point \(U\) at random, and message \(m\), output \((U, T = (mk_1 + k_2)U)\)
  • Verify: Given \((U, T)\) and message \(m\), check if \(T == (mk_1 + k_2)U\)

To expand on the Schnorr analogy, it's as if one of the keys were the randomizing nonce, and the other were the private key (the analogy is not exact, to be clear). Now neither the additive nor the multiplicative tweak gives the attacker a way to forge new tags on messages that the genuine key holder never created.

The construction in attempt 4 is one of several elucidated by Dodis et al in their 2012 paper "Symmetric Key Authorization, revisited". They identify exponentiation in a group of prime order as an example of a "weak PRF", and moreover, specifically a key-homomorphic weak PRF, and build the above construction in abstract from such a function. Then they prove by quite sophisticated arguments (see 4.3 of the full paper), that this construction has "suf-CMA" (or "suf-CMVA" with a transformation) where the "s" refers to selective security. This is a weaker notion of security; the idea is that we only defend against the attacker who has to choose the message he will forge on, before he gets to query the signer/tagger to see a bunch of other messages. Their proof strategy is basically to show that with clever use of linear transformations you can reduce the security argument to that of the underlying weak PRF; its randomness gives you both the unforgeability and the hiding (indistinguishability) properties that we want.

MAC-GGM - a vector of messages; different security arguments

In 2013 Zaverucha, Chase and Meiklejohn described, in this paper (which we will sometimes abbreviate to CMZ13), a small but meaningful finesse on the above construction from 2012, which they call "MAC-GGM" (they also describe MAC-DDH in the same paper, which we won't cover here):

  • Instead of a MAC on a single message, the MAC is designed to support multiple distinct messages, and this is specifically to allow the credentials we'll describe next, to support attributes.
  • The paper gives an argument in the so-called "generic group model" (GGM) that this construction has the full UF-CMVA security property (the original argument for only selective security is not really OK in any scenario where users can query verification on tags).

As is probably obvious, in this new construction, the tag is calculated by: \(T = (U, (k_1m_1 + k_2m_2 + \ldots + k_nm_n + k_0)U)\); it's easy to see that the "multiplicative and additive" arguments mentioned above still apply (note the presence of \(k_0\)) (although the security argument is very different, see Appendix A of the paper). This looks superficially similar to a vector Pedersen commitment of the form seen in constructions like Bulletproofs (only superficially: here also, the vector is blinded, but at the level of scalars; we don't use different base points).

The main reason this is even interesting is how it naturally supports selective revelation - zero knowledge proofs that certain of these messages have particular values or perhaps are in a range. Other previous MAC constructions couldn't do this in any reasonable way (although there was a big literature of achieving similar properties using (blind) signatures).

Key-Verified Anonymous Credentials (KVAC)

Now we have the theoretical basis, we can construct a credential system with two of the properties we want - anonymity, and attributes. And that's what the meat of the Chase et al. paper does. It describes a credential system, using MAC-GGM as a primitive. The functionality of this credential system can be boiled down to:

  • Keygen: generate secret keys and public parameters for the protocol instance (called iparams, short for issuer parameters). These parameters include public commitments to the secret keys.
  • Blind Issuance: a user can request and the issuer can provide credentials on a set of attributes (\(m_i\)) in the above, where some of the attributes are allowed to be hidden from the issuer.
  • Show-Verification: a user can prove to the issuer (or, any other holder of the secret key material), in zero knowledge, that they possess a credential whose attributes satisfy a specific set of constraints.

How does issuance work?

Because we've laid the foundations, this is pretty easy to describe, albeit the concrete steps of mathematically creating the credential, isn't.

Without any blinding:

(From here the private key set of the issuer is denoted with \(x\) rather than \(k\).

Issuance - We issue a credential consisting of a MAC-GGM style of MAC, combined with a proof of its validity. Form, on a set of messages \(m_i\):

\((U, x_1m_1+x_2m_2 + \ldots +x_nm_n + x_0)U, \pi)\) - the proof \(\pi\) exists because the credential must be accompanied by a proof that it is correctly formed with respect to the issuer parameters that were decided at the start of the protocol, but without revealing the issuer's secret key material.

Show/Verify - this is where it gets interesting. The user does not just "present his MAC" as that would violate our intention to make the credentials anonymous. Instead, he presents commitments to his MAC along with a zero knowledge proof of correct formation. He presents \((U, {C_{m_i}}^{n}_{i=1}, C_{U'},\pi)\). Taking those elements in order:

  • \(U\)- this is the base point of the MAC which was issued as credential, but it will have been rerandomised as \(U=aU_0\) for some \(a\). (There is a point of confusion in the paper here; in Appendix E the detailed treatment correctly notes that \(U, U'\) must be re-randomised by multiplication with a random scalar, in order to prevent trivial linkability between the Issue and Show/Verify steps, but this is not mentioned in Section 4.2).
  • \({C_{m_i}}^{n}_{i=1}\) - these are Pedersen commitments to the individual attribute messages (note - the plaintext messages can be sent instead for those messages which are not hidden/encrypted, to save communication - we will talk about hidden attributes next). The blinding value for each commitment is \(z_i\). (So \(C_{m_i} = m_iU + z_iH\)).
  • \(C_{U'}\) is a single Pedersen commitment to the second element of the tag. The blinding value is \(r\). (So \(C_{U'} = U' + rH\)).
  • \(\pi\) - as mentioned, we need a zero knowledge proof of correct formation - this consists of a proof that the commitments \(C_{m_i}\) and \(C_{U'}\), when combined with the secret keys that only the verifier holds, will give the same outpoint group element \(V\) from the calculation \(x_0U + \sum_{i} x_i C_{m_i} - C_{U'} = V\) as the prover obtained from the calculation with public issuer parameters \(X_i\), i.e. \(V = \sum_{i} z_i X_i - rG\).

That last point is very tricky so I'm going to expand on it. What makes this credential construction special is its requirement to hide something from both sides - the user wants to hide the attributes \(m_i\) (in general if not always) from the issuer, and the issuer of course wants to hide the secret keys from the user.

This is dealt with algebraically by using something similar to ECDH keys, where \(S = pqG = pQ = qP\), i.e. both sides have their own secret they keep from each other, but still create a shared secret. The variable \(V\) represents this, but to keep the following simple we'll imagine \(n=1\), i.e. that there's only one message/attribute.

On the user side, we are summing elements of the form \(z_i X_i = z_i x_i H\), but the blinding terms in the message commitments \(C_{m_i}\) are also \(z_i H\), so that they can be converted into part of a term \(x_iC_{m_i}\) that the issuer can verify using the secret keys. The remaining term in the commitments \(C_{m_i}\) is \(m_iU\) which is converted into part of \(U'\) by the same multiplication by the secret key: \(x_i C_{m_i} = x_i m_i U = U' - x_0 U\). This equality is worked through in detail in the paper, but notice that basically, the group homomorphism can be used to allow the issuer to verify, using his own secret values, what the user constructed as message commitments, with his secret values.

Side note: what are these mysterious "zero knowledge proofs"?

The proof systems used for these kind of statements are all variants of the basic Schnorr protocol + Fiat-Shamir transform that I explained in great detail here (Section 3), though I also strongly recommend Boneh and Shoup Chapter 19 for more rigorous treatments. Note that very often we are using the "AND of sigma protocols" paradigm, in which multiple statements are proved concurrently, and this is achieved by committing to all the statements in the first step, including all those commitments in the hash challenge preimage before constructing the response. As well as the aforementioned links, you can see a good simple example of this paradigm in Appendix E of CMZ13, albeit there are two serious errors in the description of the verification algorithm, as I explained on stackexchange here.

With blinding of attributes:

These attributes can be encrypted using El Gamal encryption in such a way that a credential can still be issued without revealing (some of) them. The mechanics of El Gamal are about as simple as an encryption scheme gets:

  • Key: a normal public/private key pair from the group, say \(P=pG\)
  • Encrypt: take as message point \(M\), create new randomness r, output \((rG, rP+M)\)(remember, asymmetric encryption, so not necessarily private key holder)
  • Decrypt: take ciphertext \((c_1, c_2)\) as per above and note \(p(rG) = r(pG) = rP\) (the Diffie Hellman shared secret primitive), so that \(c_2 - pc_1 = M\)

Note that this system encrypts and decrypts group elements \(M\) rather than scalars, \(m\). In cases where it's the latter that needs to be encrypted, sometimes this hiccup can be circumvented with a ZKP of the underlying message scalar. However the next paper, CPZ19, addresses this point (see next section) along with a lot of other things.

Now, how could this El Gamal scheme be used to aid getting credentials on hidden attributes?

(EC) El Gamal encryption has an additive homomorphism (this was noted in one of my earlier blog posts here. (additive here means for elliptic curve point addition): \(E(A) + E(B) = E(A+B)\) (the notation is very poor here: encryptions have attached randomness, but anyway). Whenever you have this additive homomorphism, you also have the scalar multiply, trivially: \(aE(A) = E(aA)\). We can leverage this to pass the encryption "through" the MAC procedure.

The user would give the El Gamal encryption of one or more attributes \(m_i\) to the issuer to be tagged. They would give \((P, (r_1 G, m_1 G + r_1 P))\) as the two elements of the ciphertext, where we stick with just one attribute index for simplicity. The issuer would pick \(U\) here as \(bG\) where \(b\) is a random scalar (we'll see why this is needed rather than NUMS in a moment), and create an encrypted tag on the encrypted attribute message: \(E(U') = E(x_0 U + m_1 x_1 U) = E(x_0 U) + x_1 E(m_1 U) = E(x_0 U) + x_1 b E(m_1 G)\) and \(E(m_1G)\) is exactly what the user gave them. Thus they can easily create an encryption of \(U'\) on the message \(m_1\) without ever seeing \(m_1\). This encryption must then be blinded, but that's easy, by adding extra randomness in the form of an encryption of 0. Again a ZKP will need to be attached when the issuer returns this encryption to the user, but if it is valid, the user can know that when he decrypts \(E(U')\) to \(U'\), he will have a valid credential (tag) \((U, U')\) for his attribute/message.

Chase-Perrin-Zaverucha 2019

The scope of this paper (CPZ19 for short), which is intended to provide a credential system for the Signal messenger system, is much larger, but part of it is creating a more powerful credential design (albeit an inheritor; the security proof for CPZ19 uses a reduction to CMZ13) than that found in CMZ13. These credentials support both scalar attributes and group elements as attributes - the latter can be appealing for general purposes of creating efficient ZKPs (or a more elementary aspect of the same thing: easier El-Gamal encryption of the type described in the previous section - indeed having attributes encrypted in this way is a fundamental part of their design).

The credential construction looks much more complicated as presented since it uses different NUMS base points for multiple different components: the secret key elements \((x_0, x_1)\) as before for the basic idea of the Dodis et al algebraic MAC, but there are then base points for each of a vector of secret keys, one per attribute (and more, see paper for full setup). Notably the construction of the MAC tag itself, looks quite different:

\((t, U, (W + (x_0 +x_1 t)U + \sum_{i=1}^n y_i M_i )\)

here \(t\) and \(U\) are generated by the issuer at random, while the \(y_i\) are the aforementioned vector of secret keys corresponding to each message/attribute.

While the construction is significantly more complex, the basic principle of how crendentials are issued, and then show/verified, is essentially the same, and that encludes the idea of encrypting credentials using El Gamal. The same construct carries over as was described under "with blinding of attributes", but the authors have a slightly different approach in mind:

Given an existing credential/MAC, you can quite elegantly prove that an ElGamal encryption of a specific attribute is in fact attested to by the MAC, using again a ZKP about a relationship between the MAC and the encryption. However the authors do caution:

"We caveat that this is only a promising direction for a new (public-key) verifiable encryption scheme, since the above basic Elgamal scheme is not CCA secure, and we have not carefully analyzed its security."

(here CCA means "chosen ciphertext attack"; security under this condition is the main goal of provably secure encryption schemes).

At the beginning of this section I mentioned that the security argument for this flavor of algebraic MAC is based on a reduction to the case of CMZ13 above, which was proven UF-CMVA secure in the generic group model. However this reduction only produces SUF-CMVA, which is to say 'selective security' - here, we only consider an attacker who specifies the message \(m^{*}\) in advance of their message tagging and verification queries. I'm not sure if this is sufficient.

Wabisabi: credentials on amounts with splitting

Wabisabi uses basically exactly the CPZ19 construction for its credentials. The main "twist" is a simplification: only 'value' (value in BTC) attributes are needed, and they are of course integer values. These credentials will allow a coinjoin participant to follow the workflow mentioned at the start of this article:

  • As one pseudonym, register 1 or more inputs and request N credentials for the input, with the values of each credential hidden, but accompanied with a zero knowledge proof that the sum of those values is as it should be (the input's value).
  • As another pseudonym, present the credentials and the intended coinjoin outputs, with a proof that the sum of the redeemed credentials tallies up to the total of the outputs.

Serial numbers are also used as part of the credential, to prevent double spend of the same credential (remember, the credentials are specifically designed to be unlinkable).

However the paper is careful to build up to what it calls a "unified registration protocol" where it generalises the whole process of both creating and redeeming these credentials, and makes the interaction more efficient.

Range proofs

Any former student of the ideas behind Confidential Transactions will find this part obvious. Simply presenting commitments to integer amounts (in satoshis, say) doesn't provide the intended security: since in modular arithmetic, a very large integer is mathematically equivalent to a small negative integer, it would be easy for users to cheat the amounts they get out by requesting commitments on (effectively) negative amounts. The way round this is again a ZKP, of a particular flavor known as a range proof: you prove that the integer \(a\) is between say 1 and \(2^{32}\) or whatever suits. This can be done e.g. with Bulletproofs but also the range proof can be embedded as another statement in the overall ZKP provided by the user.

A relevant question, though, and one worth pondering: are the range proofs actually necessary (ZmnSCPxj has raised this in review, and it occurred to me too)? As discussed in the next section, there isn't a risk of funds loss in the basic coinjoin construct, with or without this extra crypto magic of credentials. So a malicious user constructing credentials in invalid negative amounts is not going to be able to claim more money, but this does represent a DOS vector, one that is usually addressed just with the requirement of users to provide and sign off on a valid utxo.

However there still may be further room for thought here; the range proof could be provided as part of a blame phase of a protocol, and avoided in the happy path of correct coinjoins being presented for signing. Apparently the authors have considered this.

Final thoughts on the security and functionality proposed in Wabisabi

This article has just been a survey of some of the technical (cryptographic) underpinnings; the paper itself is specifically more about the theoretical construct, and not a fully fleshed out system spec as would be needed for a full software instantiation.

  • How secure is it?

As the paper notes in the final section 5, we should not forget the fundamental security inherent in Coinjoin, however it is coordinated: users only sign what does not rob them of money, and a single transaction does not suffer from anything related to blockchain mechanics (delays, reorgs etc). So what risks exist will be around DOS (inconvenience, lost time) and much more importantly, privacy loss:

  • How strong are the privacy guarantees?

First, to state the obvious, there is a dependency on discrete log hardness, but that's just at basis, more exactly, there is a DDH hardness assumption (see 6.2 of CPZ19) underlying the security of this MAC construction. As mentioned in the previous bullet point, this is effectively only relevant to the privacy of the users w.r.t. the issuer (here the coinjoin coordinator) of the credentials, although nominally a breakage of that security (assume in the worst case, ability forge MACs arbitrarily) would "allow the user to forge credentials for arbitrary bitcoin amounts", but that is a DOS vector only as it creates invalid coinjoins that won't be signed.

  • How much defence against the issuer is there, i.e. is trust in the coordinator required for privacy?

This is actually a fairly tricky point. Restricting the coordinator's ability to tag (pun intended) or selectively censor is quite critical, and non trivial.

The splitting into multiple credentials helps; it is less easy for the malicious coordinator to figure out how to jam individual participants if they are going through multiple rounds of credential issuance and redemption. From conversations with nothingmuch it appears that quite a lot of thought is being put into this aspect of the protocol; those interested may want to read this protocol spec document for the latest. Also along the same lines, the paper notes:

A malicious coordinator may also tag users by providing them with different issuer parameters. When registering inputs a proof of ownership must be provided. If signatures are used, by covering the issuer parameters and a unique round identifier these proofs allow other participants to verify that everyone was given the same parameters.

Basically what is going on here is that there is a kind of "public" aspect to input registration; users sign the issuer parameters for the round, and then these signatures, at a certain point in the negotiation, are broadcast to all participants (with encryption), so that a malicious coordinator can be prevented from tagging users by giving them all different round issuer parameters.