Subject: Re: [pqc-forum] ROUND 3 OFFICIAL COMMENT: Classic McEliece
From: "D. J. Bernstein"
Date: Thu, 19 Nov 2020 13:37:46 +0100
To: pqc-comments@nist.gov
Cc: pqc-forum
Message-ID: <20201119123746.95733.qmail@cr.yp.to>

Kirk Fleming writes:
> mceliece-3488-064 152.51 34.68 143
> mceliece-4608-096 194.36 35.66 207
> mceliece-6688-128 270.46 37.48 272
> mceliece-6960-119 271.18 47.58 272
> mceliece-8192-128 306.63 67.64 272

We agree that the second column is sometimes less than the fourth column. However, the second column ignores the huge overheads measured by the third column. As stated in the submission:

   A closer look shows that the attack in [11] is bottlenecked by random access to a huge array (much larger than the public key being attacked), and that subsequent ISD variants use even more memory. The same amount of hardware allows much more parallelism in attacking, e.g., AES-256.

The submission goes on to state expectations regarding (e.g.) 6960119 being more expensive to break than AES-256. The numbers you quote (e.g., 2^271.18 operations on 2^47.58 bits of RAM) are consistent with these expectations.

The literature shows that optimized hardware for AES attacks has similar efficiency to multipliers. The multipliers in an Intel CPU core carry out millions of bit operations in the same time that it takes the core to retrieve data from (say) 6GB of RAM, never mind RAM contention across cores. GPUs and TPUs show the costs of RAM even more clearly. On a larger scale, decades of supercomputing literature consistently fit a square-root scaling of RAM costs, and science-fiction 3D computers (which seem much harder to build than quantum computers) would still have cube-root costs.
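To illustrate the arithmetic (and only the arithmetic: the cost model below is a deliberately crude assumption, not the submission's analysis and not a NIST-defined metric), here is a short Python sketch that redoes the quoted numbers under a hypothetical rule that one random access to M bits of RAM costs M^(1/2) bit operations, or M^(1/3) for the science-fiction 3D case:

# Illustration only. Crude assumptions: the quoted "operations" (second
# column, log2) are dominated by random accesses to the quoted RAM
# (third column, log2 bits), and one access to M bits costs M^(1/2)
# bit operations (square-root scaling) or M^(1/3) (hypothetical 3D
# hardware). Neither rule is a metric that NIST has defined.

rows = [
    ("mceliece-3488-064", 152.51, 34.68, 143),
    ("mceliece-4608-096", 194.36, 35.66, 207),
    ("mceliece-6688-128", 270.46, 37.48, 272),
    ("mceliece-6960-119", 271.18, 47.58, 272),
    ("mceliece-8192-128", 306.63, 67.64, 272),
]

for name, logops, logmem, target in rows:
    sqrt_cost = logops + logmem / 2   # log2 cost under square-root scaling
    cube_cost = logops + logmem / 3   # log2 cost under cube-root scaling
    print("%s: about 2^%.2f (sqrt), 2^%.2f (cube), vs 2^%d"
          % (name, sqrt_cost, cube_cost, target))

Under the square-root rule every row lands above the fourth column (for example, 271.18 + 47.58/2 is roughly 295); under the cube-root rule every row except 4608096 does. What, if anything, this says about a "category" depends entirely on the cost metric.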
> I am filing this comment to request that the Classic McEliece submitters
> justify the claimed security categories for their parameters.

Can you please clarify how you think that the numbers already in the literature don't already justify the assignments in the submission?

Our best guess is that you're assuming that NIST requires >=2^272 operations in a specified metric for "category 5", and that 2^271.18 is an evaluation in this metric. But the assumption here isn't correct. NIST has not issued clear definitions of the cost metrics for its "categories". If at some point NIST does issue clear, stable definitions of its cost metrics, then it should also allow all submitters to set their "category" assignments accordingly.

Note that one can make any cryptosystem sound much easier to break by choosing a cost metric that assigns unrealistically low cost to operations used in attacks; RAM access is just one example. If the same operations are not bottlenecks in attacks against AES-m or SHA-n then this can reverse comparisons to AES-m or SHA-n respectively.

> The Round 3 submission does not include any concrete analysis of the
> security provided by the proposed parameter sets.

Not true. For example, the stated expectation to be "more expensive to break than AES-256" is concrete, and the rationale is stated. A more detailed analysis depends on the cost metric.

> There is a lot of hype about the asymptotic result from Canto Torres
> and Sendrier

The graph in https://video.cr.yp.to/2020/0813/video.html shows that asymptotic lattice security levels against attacks known 10 years ago were 42% higher than lattice security levels against attacks known today (for otherwise unbroken lattice systems), while for McEliece's system the change is 0%. Any thorough evaluation of security risks will take these facts into account, along with other risk indicators.

> but this is ultimately irrelevant for any finite choice of parameters.

The submission already says, at the top of the "Information-set decoding, concretely" section:

   We emphasize that o(1) does not mean 0: it means something that converges to 0 as n->infty. More detailed attack-cost evaluation is therefore required for any particular parameters.

The section goes on to summarize what the literature says about concrete cost analyses, and ends with the AES-256 comparison reviewed above.

> The submission argues that any reduction in classical security from
> improved ISD attacks will be offset by the increased memory
> requirements. While this may be true for some ISD variants such as
> BJMM, they provide no analysis to back this up.

Again, here's the critical quote from the submission:

   A closer look shows that the attack in [11] is bottlenecked by random access to a huge array (much larger than the public key being attacked), and that subsequent ISD variants use even more memory. The same amount of hardware allows much more parallelism in attacking, e.g., AES-256.

The exact impact depends on the cost metric.

> It also ignores other
> ISD variants such as Stern that need significantly less memory.

No, the submission doesn't ignore this. The "attack in [11]" (see quote above) is a Stern-type attack. The submission's comment that "subsequent ISD variants use even more memory" (see quote above) is pretty much the same as your comment that Stern-type attacks "need significantly less memory".

> - The mceliece-4608-096 parameters are about 13 bits below Category 3 with
> an attack that needs slightly over 6 GiB of memory.

Evaluating such a claim would require a clear, stable definition of "Category 3". If NIST settles on a definition to be used in NISTPQC then submitters will be able to make "category" assignments accordingly. See above regarding the actual costs of accessing 6GB of RAM.

> If Classic McEliece is to be standardized as a conservative option
> then the parameter sets that are standardized with it should also be
> chosen conservatively. The NTS-KEM parameters were. Three out of the
> five Classic McEliece parameters were not.

Please clarify. Are you saying anything here beyond the 2^194 vs. 2^207, 2^270 vs. 2^272, and 2^271 vs. 2^272 discussed above?

If you're concerned about 2^270 RAM operations vs. 2^272 bit operations then you should be far more concerned about NIST dropping the targets from 2^256 to 2^128:

* Multi-target attacks turn 2^128 into something already feasible for large-scale attackers today. ECC is an exception to this (provably, via a tight self-reduction), but codes and lattices and AES aren't exceptions. See https://blog.cr.yp.to/20151120-batchattacks.html, and the arithmetic sketched after this list.

* Both round-1 Classic McEliece options were at the 2^256 level. However, NIST then praised NTS-KEM for offering lower-security options, and asked Classic McEliece for "parameter sets for other security categories". From this perspective, NTS-KEM and NIST were far less conservative than Classic McEliece was.
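Here is the multi-target arithmetic as a short Python sketch. The 2^40 target count is a hypothetical number chosen for illustration; it is not a figure from the blog post or from any measured deployment, and the brute-force model is deliberately naive:

# Illustration only: naive multi-target brute force against 128-bit keys.
# Assumption: one key guess can be tested against all T targets at
# essentially the cost of a single guess (e.g., via a table of targets),
# so finding *some* key out of T takes about 2^128/T guesses.

keybits = 128
log2_targets = 40   # hypothetical: 2^40 targets

log2_single = keybits                 # expected guesses for one specific key
log2_any = keybits - log2_targets     # expected guesses for any one of the keys

print("one specific key: about 2^%d guesses" % log2_single)
print("any one of 2^%d keys: about 2^%d guesses" % (log2_targets, log2_any))

The exact numbers are beside the point; the gap between the single-key cost and the any-of-many-keys cost is what the blog post cited above is about.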
Note that general trends in networking and in CPUs are making the high-security Classic McEliece parameter sets affordable for more and more users.

There's also a tremendous amount to say about the dangers of unnecessary complexity. One aspect of this is specifying more parameter sets than necessary.

Of course, NIST's request for lower-security parameter sets was an offer that Classic McEliece couldn't refuse, but we continue to recommend the 2^256 parameter sets. Within those parameter sets:

* The 6960119 parameter set goes back to 2008, maximizing security for 1MB keys. This has held up just fine, and can reasonably be viewed as the default choice.

* The new 6688128 parameter set is a slight variant that requires n and t to be multiples of 32, which _might_ be slightly better for formal verification, but the jury is still out on this. (Components are being verified---see https://cr.yp.to/papers.html#controlbits for the latest news---but this work hasn't yet reached the components that could interact with the multiple-of-32 question.)

* The 8192128 parameter set is bigger, but the 6960119 and 6688128 parameter sets include an extra defense explained in the submission. People paranoid enough to imagine 2^306 vs. 2^270 making a difference should also be paranoid enough to imagine this defense making a difference.

One can easily write down even larger parameter sets. Also, the ISD attack analyses in the literature are precise and straightforward, and it should be easy to evaluate the security of any desired parameters in any well-defined metric. However, the right order of events is, first, for NIST to fully define the cost metric to be used for "categories", and, second, for all submission teams to evaluate costs in this metric.

---Dan (on behalf of the Classic McEliece team)