Subject: Re: [pqc-forum] Structural analysis of McEliece asymptotically better than generic decoding From: "D. J. Bernstein" Date: Fri, 16 Aug 2024 21:49:30 +0200 To: pqc-forum@list.nist.gov Message-ID: <20240816194930.1058322.qmail@cr.yp.to> This is a message from the Classic McEliece team. We appreciate the extensive community interest in all aspects of the McEliece cryptosystem. https://classic.mceliece.org/papers.html tracks a wide range of papers, and we encourage further research. Confidence in the security of the cryptosystem is founded upon the long, continuing, publicly documented series of failures to break the cryptosystem. This message analyzes what the claims in https://ia.cr/2024/1193 mean for Classic McEliece, assuming the claims are correct. The short summary is that the paper's syzygy distinguisher does not affect any of the Classic McEliece security claims. The Classic McEliece security analysis avoids relying on the indistinguishability property targeted by 2024/1193. Furthermore, the distinguisher has vastly higher cost than any of the security levels claimed by Classic McEliece, so it would still not affect the system even if it could somehow be converted into a key-recovery attack. The paper's main claims are nevertheless of potential interest for computational complexity theory. The claims rely on conjectures formulated in the paper, and we encourage further investigation of whether the distinguisher works. The rest of this message is organized as follows: (1) This distinguisher has cost far above 2^256. (2) Classic McEliece avoids relying on this indistinguishability property. (3) Notes on the conjectures in 2024/1193. (4) Further comments. (5) Summary of impact. 1. This distinguisher has cost far above 2^256 The syzygy distinguisher is, according to the estimates in 2024/1193, much slower than a ludicrously expensive key-recovery attack already explained in the Classic McEliece documentation. Specifically, an attacker can recover private keys in at most 2^256 key-generation operations by searching through all 256-bit seeds. This attack is noted on, e.g., page 24 of the Classic McEliece security guide, https://classic.mceliece.org/mceliece-security-20221023.pdf. A side effect of recovering a private key is that the public key is distinguished from a uniform random matrix (with probability extremely close to 1, since private keys are much shorter than public keys). A much more important side effect of recovering a private key is being able to break every ciphertext encapsulated to the public key. The reason this attack is not of concern is that searching through guesses for 256-bit secrets is infeasible, even with a future quantum computer. As page 3 of the security guide puts it, "guessing 256-bit secrets" is "not a real-world threat". For comparison, page 26 of paper 2024/1193 considers our smallest proposed parameters (3488,64), and conjectures that distinguishing public keys from uniform random matrices has "complexity around 2^529" (using linear algebra acting on vectors of length "around 2^223"). Whether or not this conjecture is correct, complexity 2^529 is much slower than searching through 2^256 seeds. There are many previous papers on infeasible methods to detect the structure of McEliece keys. Section 3.1 of the security guide says the following: For all parameters of interest, the fastest attacks known against the OW-CPA problem treat G as a general matrix with no evident structure. ... There are also much slower “structural attacks” known that try to exploit the secret structure of G, for example by recovering Goppa-code parameters for G. Distinguishers costing far above 2^256, whether or not they recover keys, are consistent with this statement and with the rest of the security analysis. Section 3.6 of the security guide mentions various other examples of very slow key-recovery attacks. The "fastest attacks" comment is referring to ISD, which takes about 2^150 bit operations for (3488,64). For precise quantification and high assurance regarding the number of bit operations used by many ISD variants, see the CryptAttackTester paper at Crypto 2024. 2. Classic McEliece avoids relying on this indistinguishability property Beyond being extremely slow, the syzygy distinguisher is attacking a problem that appears in some papers but that the Classic McEliece security analysis explicitly avoids relying on. The security guide organizes the security analysis as follows: * The Classic McEliece security goal is IND-CCA2. See Section 1. (We recommend using separate modules for generic transformations that aim at anything beyond an IND-CCA2 KEM; see Section 6.) * IND-CCA2 attacks much faster than QROM IND-CCA2 attacks would be surprising for the selected hash function. See Section 5.3.3. This lets reviewers focus on QROM IND-CCA2 attacks. * QROM IND-CCA2 security follows tightly from the hypothesis of OW-CPA security of the underlying PKE. See Section 5. After checking this, reviewers can focus on OW-CPA security. * OW-CPA security of the underlying PKE follows tightly from the hypothesis of OW-CPA security of the original McEliece PKE. See Section 4. This lets reviewers focus on McEliece OW-CPA security. * Section 3 reviews the state of the art in McEliece OW-CPA attacks. This provides a clear structure for Classic McEliece security reviews. The only ways that something can possibly go wrong are some sort of disaster involving SHAKE256, a mistake in the tight reductions, or a better OW-CPA attack against the original McEliece system. The literature sometimes mentions the following trivial implication: McEliece OW-CPA security follows from OW-CPA security of a uniform random matrix if the McEliece public key is indistinguishable from a uniform random matrix. Of course, finding a distinguishing attack faster than the target OW-CPA security level would render this implication vacuous. The Classic McEliece security guide notes this motivation for studying distinguishing attacks, but it also notes that this is "not a two-way implication---perhaps there are distinguishers even if OW-CPA is secure". Even fast high-probability distinguishers would not be relevant to any of the Classic McEliece security claims. As a (real) pre-quantum analogy for such (hypothetical) distinguishers, consider the following trivial implication: if RSA public keys are indistinguishable from uniform random integers of the same size then OW-CPA for an RSA public key follows from OW-CPA for the same system with a uniform random integer. This is vacuous, since RSA keys are efficiently distinguishable from uniform. OW-CPA is nevertheless a reasonable pre-quantum hypothesis for RSA. The distinguisher doesn't threaten any of the security properties that NIST claims for RSA. In other words, for an algorithm to have an effect on the Classic McEliece security analysis, it would have to be much faster than the 2024/1193 distinguisher _and_ would have to be a key-recovery attack or something else that breaks OW-CPA. 3. Notes on the conjectures in 2024/1193 Paper 2024/1193 claims that its distinguisher is "subexponential in the error-correcting capability, hence better than that of generic decoding algorithms". Quantitatively, with the usual choice of parameters (n,t) where t is proportional to n/log n, the cost of ISD is exponential in n/log n (for almost all matrices, under amply tested assumptions), while this paper claims a distinguishing cost exponential in n (log log n)^3 / (log n)^2. (Note that exponential in n (log log n)^3 / (log n)^2 is slightly subexponential in n/log n. Once n is large enough, the (log log n)^3 in the numerator is outweighed by the extra log n in the denominator.) If this claim is correct then presumably a wider range of algebraic attacks should also try iterating XL to compute higher-order syzygies. However, the question of whether the claim is correct needs further investigation. The largest distinguishing experiments reported in the paper, at the bottom of page 25, are for t = 3 and n below field size 64. The paper reports that the distinguisher becomes less reliable for n below 56 and completely unreliable for n below 50. The paper explains this as being correlated with n crossing a line in "Heuristic 1". It is surprising that the abstract does not mention that the paper's claims depend on a new heuristic formulated in the paper. The evidence presented for the heuristic consists of other experiments for n = 56 reported on page 34. Those experiments also detect failures. Heuristic 1 says "high probability" without quantification; there is no evidence of how the failure probability scales with n and t. We encourage the author to carry out a wider range of experiments. 4. Further comments Four specific comments in the paper might seem to indicate that the paper's claims contradict Classic McEliece security claims, so we address those comments here. 4a. The paper claims that "a security proof" for the McEliece cryptosystem relies on (1) the assumption of indistinguishability from uniform random matrices and (2) the hardness of decoding random codes, which the paper says "stands on a firm theoretical ground: the decoding problem for generic linear codes is known to be NP-hard". It is true that the literature sometimes assumes indistinguishability, and sometimes claims connections between NP-hardness and security. However, the Classic McEliece security analysis does not claim any such connections, and does not assume indistinguishability. The analysis is instead explicitly founded upon a simpler assumption, OW-CPA, which is a major target of McEliece cryptanalysis. 4b. The paper claims to disprove the belief that "we could possibly have reached the intrinsic complexity of cryptanalysis of this system", and claims that this is "the first time an analysis of the McEliece cryptosystem breaks the exponential barrier". As a counterexample, "sloppy Alice" attacks from the turn of the century take polynomial time and are analyses of the McEliece cryptosystem. NTS-KEM and PALOMA are two examples of newer McEliece-based KEMs that had IND-CCA2 efficiently broken using variants of these attacks. The way that Classic McEliece obtains QROM IND-CCA2 security from just OW-CPA security is an important, explicit feature of the design of Classic McEliece, and also means that a hypothetical fast distinguisher would still not qualify as a break of Classic McEliece. It is important to be clear about which problems are being attacked. 4c. The paper claims that "previous distinguishers" have "strong regime limitations", making those distinguishers inapplicable to "the codes used in Classic McEliece". The documentation already notes various examples of attacks finding the codes used in Classic McEliece. Again, these key-recovery attacks imply distinguishers and, more importantly, OW-CPA attacks. The reason that Classic McEliece dismisses these attacks is explicitly that the attacks are much slower than the state-of-the-art OW-CPA attacks for "all parameters of interest". 4d. The paper says "Observe [5, §3.4] that the security parameter in the Classic McEliece system, based on the complexity of generic decoding algorithms, is linear in t ... our complexity is subexponential in the security parameter". Here [5] is the security guide. Presumably the intention here was to say "claimed security level" rather than "security parameter". (There is no security parameter in Classic McEliece. The Classic McEliece security analysis maps _forward_ from the selected choices of parameters n, t, etc. to security levels; the Classic McEliece rationale says that mapping backwards from security parameters is not robust.) However, Section 3.4 is explicitly on "Asymptotic costs of information-set decoding", and Section 3 is explicitly on "OW-CPA security", not on broader notions of security. 5. Summary of impact The claims in 2024/1193 do not contradict any of the Classic McEliece security claims. There are two independent reasons for this, one quantitative and one structural: * The costs conjectured in 2024/1193 are much more expensive than ISD for all proposed Classic McEliece parameters. * The costs are for an algorithm that is merely a distinguisher, not an OW-CPA attack. The targeted indistinguishability assumption is not used in the Classic McEliece security analysis; it is even explicitly disclaimed by the Classic McEliece security analysis. As noted above, even a fast distinguisher wouldn't violate the Classic McEliece security claims. A distinguisher above the target OW-CPA security level is even weaker: it doesn't even render vacuous the aforementioned trivial implication. A hypothetical extension of the distinguisher to an OW-CPA attack would certainly not make it faster. ---D. J. Bernstein (on behalf of the Classic McEliece team)