Subject: Re: [pqc-forum] ROUND 2 OFFICIAL COMMENT: Classic McEliece From: Ruben Niederhagen Date: Wed, 27 May 2020 17:07:20 +0200 To: James Howe , pqc-forum Message-ID: <9b6fd844-a4b0-9827-cd61-e18d4164d85e@sit.fraunhofer.de> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 Dear James, thank you for your interest! The submission document [3] reports "measurements of a separate FPGA implementation of the core mathematical functions (not including, e.g., hashing)". Page 34 of [3] says that encapsulation, decapsulation, and key generation are fast in hardware, and then says that the "hardware speeds of key generation and decoding are already demonstrated by our FPGA implementation". Decoding is a major subroutine in decapsulation. A natural next step is to build a complete implementation of the KEM. There is plenty of literature on hardware implementations of SHAKE. We expect that the total hardware resources for the KEM will be not much more than the core mathematical operations, and that the additional cost in terms of latency due to the KEM construction will be similar to other KEM constructions. On 18/05/2020 17:39, James Howe wrote: > I wanted to see if you could add any more insights into the hardware > designs quoted in your latest specifications [3]. The paper by Wang et al. > [1], cited a lot in the specifications, is for Niederreiter with Goppa > codes as an encryption scheme, so I'm interested to see the differences > between this and the McEliece KEM submission. I am aware of only two references to [1] in the submission document; neither of them is part of the specification. As the submission document explains, the FPGA implementation covers only the core mathematical functions. For example, the implementation includes decryption, but it does not include the hashing required for full decapsulation. > One of my main questions is, as seen in a lot of hardware/software > implementations before, how much more SHAKE would cost you in terms of area > and/or performance. Please see the literature on SHAKE implementations. The focus in [1] and [2] is on the core mathematical computations. > Your specifications [3] seems to suggest [1] and/or [2] > implement the McEliece KEM, No, [3] does not suggest this. It states that the FPGA implementation is "of the core mathematical functions (not including, e.g., hashing)". > as on page 34 of [3] you state that > "encapsulation and decapsulation are reasonably fast in software, and > impressively fast in hardware, due to the simple nature of the objects > (binary vectors) and operations (such as binary matrix-vector > multiplications)". This is correct. The next two sentences say "Key generation is also quite fast in hardware. The hardware speeds of key generation and decoding are already demonstrated by our FPGA implementation." > So maybe I'm missing something? The submission document [3] includes a specification. It also includes a performance analysis. The performance claims are clear and specific. [1] and [2] demonstrate the achievable FPGA performance for the core mathematical operations. > Perhaps you have a link to a paper > describing your hardware design [2] that's more up-to-date than [1]? > Because typically it's also interesting to see how much certain > components/modules consume, what are the bottlenecks, throughput > performances, and even % of area used on the device. Please see [1], [2], and the SHAKE literature for analysis of the main components. > Looking at [1], which seems to be a basis for the McEliece KEM hardware > designs, it seems to be CPA secure instead of CCA? Fig. 2 in [1] doesn't > have a re-encryption phase and interestingly on page 15 of [1] they state > that "note that we are using two simple 64-bit Xorshift PRNGs in our design > to enable deterministic testing. For real deployment, these PRNGs must be > replaced with a cryptographically secure random-number generator, e.g., > [8]". Checking [2] I also see the same 64-bit XOR shift PRNG verilog file. > I assume you'd replace this with SHAKE256 to comply with the specifications > and security estimates? Do you have any insights how this would affect > area/performance? I do not know what "the McEliece KEM hardware design" is referring to. [1] and [2] provide an FPGA implementation of the Niederreiter cryptosystem (as the paper "FPGA-based Niederreiter Cryptosystem using Binary Goppa Codes" clearly states) including key generation, encryption and decryption. This is not the same as the Classic McEliece KEM, but it uses the same core mathematical operations. The performance of those operations is reported in the Classic McEliece performance analysis. As the quotes from [1] indicate, [1] describes a CPA secure version of the Niederreiter PKE. We also have a CCA secure version of the decryption operation that we can make available. The simple PRNG indeed must be replaced by a secure PRNG as we point out in [1]. Using e.g. SHAKE in key generation for a PRNG construction, the overhead in time and area compared to the entire key generation is small; parallelism can be exploited to reduce the impact on time, e.g., by performing matrix row reduction during generating the next private key candidate. > Anyway, some more clarifications on this would be helpful :) I hope this helped to clear up some misunderstandings. Best regards Ruben (on behalf of the Classic McEliece team) > [1] https://eprint.iacr.org/2017/1180.pdf > [2] http://caslab.csl.yale.edu/code/niederreiter/ > [3] https://classic.mceliece.org/nist/mceliece-20190331.pdf