Caution
This ML-KEM implementation conforms to the ML-KEM draft standard https://doi.org/10.6028/NIST.FIPS.203.ipd, and I also try to keep it free of timing leakage, using dudect-based tests (see https://github.com/oreparaz/dudect). Be informed, however, that this implementation has not yet been audited. If you consider using it in production, please be careful!
ML-KEM (formerly known as Kyber)
Module-Lattice-Based Key-Encapsulation Mechanism Standard by NIST.
Motivation
ML-KEM is being standardized by NIST as a post-quantum secure key encapsulation mechanism (KEM), which can be used for key establishment between two parties communicating over an insecure channel.
ML-KEM offers an IND-CCA-secure Key Encapsulation Mechanism - its security is based on the hardness of solving the learning-with-errors (LWE) problem in module (i.e. structured) lattices.
ML-KEM is built on top of IND-CPA-secure K-PKE, a public key encryption scheme. With K-PKE, two communicating parties each generate a key pair and publish only their public keys to each other. Either party can then encrypt a fixed-length (32-byte) message using the peer's public key. The cipher text can be decrypted with the corresponding secret key (which is private to the key pair owner), recovering the 32-byte message. A slightly tweaked Fujisaki–Okamoto (FO) transform is then applied to IND-CPA-secure K-PKE, giving us the IND-CCA-secure ML-KEM construction. Using the KEM, two parties interested in establishing a secure communication channel over a public and insecure channel can agree on a 32-byte shared secret key. They can then use this 32-byte shared secret key in any symmetric key primitive, either for encrypting their communication (which is much faster) or for deriving new/longer keys.
| Algorithm | Input | Output |
|---|---|---|
| KeyGen | - | Public Key and Secret Key |
| Encapsulation | Public Key | Cipher Text and 32B Shared Secret |
| Decapsulation | Secret Key and Cipher Text | 32B Shared Secret |
Here I'm maintaining ml-kem - a C++20 header-only constexpr library, implementing ML-KEM and supporting the ML-KEM-{512, 768, 1024} parameter sets, as defined in table 2 of the ML-KEM draft standard. It's pretty easy to use, see usage.
Note
Find ML-KEM draft standard @ https://doi.org/10.6028/NIST.FIPS.203.ipd - this is the document that I followed when implementing ML-KEM. I suggest you go through the specification to get an in-depth understanding of the scheme.
Prerequisites
- A C++ compiler with C++20 standard library support, such as clang++ or g++.
$ clang++ --version
Ubuntu clang version 17.0.6 (9ubuntu1)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
- Build tools such as make and cmake.
$ make --version
GNU Make 4.3
$ cmake --version
cmake version 3.25.1
- For testing the ML-KEM implementation, you need to globally install the google-test library and headers. Follow the guide @ https://github.com/google/googletest/tree/main/googletest#standalone-cmake-project, if you don't have it installed.
- For benchmarking the ML-KEM implementation, you'll need to have the google-benchmark header and library globally installed. I found the guide @ https://github.com/google/benchmark#installation helpful.
Note
If you are on a machine running the GNU/Linux kernel and you want to obtain CPU cycle counts for ML-KEM routines, you should consider building the google-benchmark library with libPFM support, following https://gist.github.com/itzmeanjan/05dc3e946f635d00c5e0b21aae6203a7, a step-by-step guide. Find more about libPFM @ https://perfmon2.sourceforge.net.
Tip
Git submodule based dependencies will normally be imported automatically, but in case that doesn't work, you can manually initialize and update them by issuing
$ git submodule update --init
from inside the root of this repository.
Testing
To test functional correctness of this implementation and its conformance with the ML-KEM draft standard, issue one of the make commands shown below.
Note
Known Answer Test (KAT) files living in this directory are generated by following (reproducible) steps, described in https://gist.github.com/itzmeanjan/c8f5bc9640d0f0bdd2437dfe364d7710.
make -j # Run tests without any sort of sanitizers
make asan_test -j # Run tests with AddressSanitizer enabled
make ubsan_test -j # Run tests with UndefinedBehaviourSanitizer enabled
PASSED TESTS (15/15):
2 ms: build/test.out ML_KEM.ML_KEM_1024_KeygenEncapsDecaps
3 ms: build/test.out ML_KEM.ML_KEM_512_KeygenEncapsDecaps
3 ms: build/test.out ML_KEM.ML_KEM_1024_EncapsFailureDueToNonReducedPubKey
3 ms: build/test.out ML_KEM.ML_KEM_1024_DecapsFailureDueToBitFlippedCipherText
3 ms: build/test.out ML_KEM.ML_KEM_512_DecapsFailureDueToBitFlippedCipherText
3 ms: build/test.out ML_KEM.ML_KEM_768_KeygenEncapsDecaps
3 ms: build/test.out ML_KEM.PolynomialSerialization
4 ms: build/test.out ML_KEM.ML_KEM_512_EncapsFailureDueToNonReducedPubKey
4 ms: build/test.out ML_KEM.ML_KEM_768_DecapsFailureDueToBitFlippedCipherText
4 ms: build/test.out ML_KEM.ML_KEM_768_EncapsFailureDueToNonReducedPubKey
27 ms: build/test.out ML_KEM.ML_KEM_512_KnownAnswerTests
45 ms: build/test.out ML_KEM.ML_KEM_768_KnownAnswerTests
60 ms: build/test.out ML_KEM.ML_KEM_1024_KnownAnswerTests
243 ms: build/test.out ML_KEM.CompressDecompressZq
304 ms: build/test.out ML_KEM.ArithmeticOverZq
In case you're interested in running timing leakage tests using dudect, execute the following.
Note
dudect is integrated into this library implementation of ML-KEM to find any sort of timing leakage. It checks constant-timeness of all vital functions, including the Fujisaki-Okamoto transform used in the decapsulation step. It doesn't check constant-timeness of the function which samples public matrix A, because that fails the check anyway, due to its use of uniform rejection sampling. As matrix A is public, it's not critical for that function to be strictly constant-time.
# Can only be built and run on an x86_64 machine.
make dudect_test_build -j
# Before running the constant-time tests, it's a good idea to put all CPU cores on "performance" mode.
# You may find guide @ https://github.com/google/benchmark/blob/main/docs/reducing_variance.md helpful.
timeout 10m taskset -c 0 ./build/dudect/test_ml_kem_512.out
timeout 10m taskset -c 0 ./build/dudect/test_ml_kem_768.out
timeout 10m taskset -c 0 ./build/dudect/test_ml_kem_1024.out
Tip
dudect documentation says if the t statistic is < 10, we're probably good, yes probably. You may want to read the dudect documentation @ https://github.com/oreparaz/dudect. Also you might find the original paper @ https://ia.cr/2016/1123 interesting.
...
meas: 58.90 M, max t: +2.61, max tau: 3.40e-04, (5/tau)^2: 2.16e+08. For the moment, maybe constant time.
meas: 58.99 M, max t: +2.65, max tau: 3.45e-04, (5/tau)^2: 2.10e+08. For the moment, maybe constant time.
meas: 59.07 M, max t: +2.65, max tau: 3.44e-04, (5/tau)^2: 2.11e+08. For the moment, maybe constant time.
meas: 59.16 M, max t: +2.63, max tau: 3.42e-04, (5/tau)^2: 2.13e+08. For the moment, maybe constant time.
meas: 59.25 M, max t: +2.68, max tau: 3.49e-04, (5/tau)^2: 2.06e+08. For the moment, maybe constant time.
meas: 59.33 M, max t: +2.65, max tau: 3.44e-04, (5/tau)^2: 2.12e+08. For the moment, maybe constant time.
meas: 59.42 M, max t: +2.75, max tau: 3.57e-04, (5/tau)^2: 1.96e+08. For the moment, maybe constant time.
meas: 59.50 M, max t: +2.72, max tau: 3.53e-04, (5/tau)^2: 2.01e+08. For the moment, maybe constant time.
meas: 59.59 M, max t: +2.68, max tau: 3.47e-04, (5/tau)^2: 2.08e+08. For the moment, maybe constant time.
meas: 59.66 M, max t: +2.70, max tau: 3.50e-04, (5/tau)^2: 2.04e+08. For the moment, maybe constant time.
meas: 59.74 M, max t: +2.70, max tau: 3.50e-04, (5/tau)^2: 2.05e+08. For the moment, maybe constant time.
meas: 59.82 M, max t: +2.72, max tau: 3.51e-04, (5/tau)^2: 2.03e+08. For the moment, maybe constant time.
meas: 59.89 M, max t: +2.72, max tau: 3.51e-04, (5/tau)^2: 2.03e+08. For the moment, maybe constant time.
meas: 59.97 M, max t: +2.64, max tau: 3.41e-04, (5/tau)^2: 2.14e+08. For the moment, maybe constant time.
Benchmarking
To benchmark ML-KEM public functions such as keygen, encaps and decaps, for the various suggested parameter sets, issue one of the following commands.
make benchmark -j # If you haven't built google-benchmark library with libPFM support.
make perf -j # If you have built google-benchmark library with libPFM support.
Caution
When benchmarking, ensure that you've disabled CPU frequency scaling, by following guide @ https://github.com/google/benchmark/blob/main/docs/reducing_variance.md.
Note
make perf was issued when collecting the first set of benchmarks below (the one running ./build/perf.out). Notice the CYCLES column, denoting the cost of executing ML-KEM functions in terms of CPU cycles. Follow https://github.com/google/benchmark/blob/main/docs/perf_counters.md for more details.
On 12th Gen Intel(R) Core(TM) i7-1260P
Compiled with gcc (Ubuntu 14-20240412-0ubuntu1) 14.0.1 20240412.
$ uname -srm
Linux 6.8.0-35-generic x86_64
2024-06-18T21:12:04+04:00
Running ./build/perf.out
Run on (16 X 842.086 MHz CPU s)
CPU Caches:
L1 Data 48 KiB (x8)
L1 Instruction 32 KiB (x8)
L2 Unified 1280 KiB (x8)
L3 Unified 18432 KiB (x1)
Load Average: 0.59, 0.65, 0.66
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations CYCLES items_per_second
------------------------------------------------------------------------------------------------
ml_kem_1024/keygen_mean 37.7 us 37.7 us 10 168.625k 26.5586k/s
ml_kem_1024/keygen_median 37.8 us 37.8 us 10 168.466k 26.4937k/s
ml_kem_1024/keygen_stddev 0.867 us 0.856 us 10 883.281 605.108/s
ml_kem_1024/keygen_cv 2.30 % 2.27 % 10 0.52% 2.28%
ml_kem_1024/keygen_min 36.5 us 36.5 us 10 167.909k 25.8962k/s
ml_kem_1024/keygen_max 38.7 us 38.6 us 10 171.052k 27.3982k/s
ml_kem_512/decap_mean 20.4 us 20.4 us 10 92.5549k 49.0213k/s
ml_kem_512/decap_median 20.3 us 20.3 us 10 92.4039k 49.1818k/s
ml_kem_512/decap_stddev 0.258 us 0.252 us 10 577.305 600.776/s
ml_kem_512/decap_cv 1.26 % 1.23 % 10 0.62% 1.23%
ml_kem_512/decap_min 20.0 us 20.0 us 10 92.1723k 47.8732k/s
ml_kem_512/decap_max 20.9 us 20.9 us 10 94.1701k 49.888k/s
ml_kem_512/encap_mean 16.4 us 16.4 us 10 72.6916k 60.9038k/s
ml_kem_512/encap_median 16.4 us 16.4 us 10 72.6753k 60.8974k/s
ml_kem_512/encap_stddev 0.253 us 0.250 us 10 97.0585 935.823/s
ml_kem_512/encap_cv 1.54 % 1.53 % 10 0.13% 1.54%
ml_kem_512/encap_min 15.9 us 15.9 us 10 72.5484k 59.7296k/s
ml_kem_512/encap_max 16.8 us 16.7 us 10 72.8346k 62.8025k/s
ml_kem_768/decap_mean 33.0 us 33.0 us 10 148.191k 30.3166k/s
ml_kem_768/decap_median 33.1 us 33.1 us 10 148.138k 30.1903k/s
ml_kem_768/decap_stddev 0.518 us 0.509 us 10 212.758 473.277/s
ml_kem_768/decap_cv 1.57 % 1.54 % 10 0.14% 1.56%
ml_kem_768/decap_min 32.1 us 32.1 us 10 147.836k 29.7687k/s
ml_kem_768/decap_max 33.6 us 33.6 us 10 148.61k 31.1568k/s
ml_kem_512/keygen_mean 14.6 us 14.6 us 10 63.4765k 68.3813k/s
ml_kem_512/keygen_median 14.8 us 14.8 us 10 63.4589k 67.7965k/s
ml_kem_512/keygen_stddev 0.241 us 0.240 us 10 60.264 1.14394k/s
ml_kem_512/keygen_cv 1.65 % 1.64 % 10 0.09% 1.67%
ml_kem_512/keygen_min 14.1 us 14.1 us 10 63.3859k 67.5222k/s
ml_kem_512/keygen_max 14.8 us 14.8 us 10 63.5564k 71.0285k/s
ml_kem_1024/decap_mean 49.3 us 49.3 us 10 216.516k 20.2885k/s
ml_kem_1024/decap_median 49.5 us 49.4 us 10 216.383k 20.2235k/s
ml_kem_1024/decap_stddev 0.649 us 0.634 us 10 346.756 261.841/s
ml_kem_1024/decap_cv 1.32 % 1.29 % 10 0.16% 1.29%
ml_kem_1024/decap_min 48.3 us 48.3 us 10 216.031k 19.967k/s
ml_kem_1024/decap_max 50.1 us 50.1 us 10 217.187k 20.6884k/s
ml_kem_1024/encap_mean 41.8 us 41.8 us 10 183.083k 23.9532k/s
ml_kem_1024/encap_median 41.8 us 41.8 us 10 183.077k 23.9381k/s
ml_kem_1024/encap_stddev 0.563 us 0.551 us 10 218.08 315.804/s
ml_kem_1024/encap_cv 1.35 % 1.32 % 10 0.12% 1.32%
ml_kem_1024/encap_min 41.0 us 41.0 us 10 182.737k 23.5351k/s
ml_kem_1024/encap_max 42.6 us 42.5 us 10 183.483k 24.4145k/s
ml_kem_768/encap_mean 27.4 us 27.4 us 10 121.805k 36.5012k/s
ml_kem_768/encap_median 27.4 us 27.4 us 10 121.632k 36.553k/s
ml_kem_768/encap_stddev 0.692 us 0.687 us 10 644.207 909.698/s
ml_kem_768/encap_cv 2.52 % 2.50 % 10 0.53% 2.49%
ml_kem_768/encap_min 26.5 us 26.5 us 10 121.249k 35.0289k/s
ml_kem_768/encap_max 28.6 us 28.5 us 10 123.228k 37.7644k/s
ml_kem_768/keygen_mean 25.0 us 25.0 us 10 110.546k 40.0317k/s
ml_kem_768/keygen_median 25.0 us 25.0 us 10 110.151k 40.0223k/s
ml_kem_768/keygen_stddev 0.855 us 0.854 us 10 861.179 1.36001k/s
ml_kem_768/keygen_cv 3.42 % 3.41 % 10 0.78% 3.40%
ml_kem_768/keygen_min 24.1 us 24.1 us 10 109.801k 38.1413k/s
ml_kem_768/keygen_max 26.2 us 26.2 us 10 112.141k 41.5697k/s
On ARM Cortex-A72 i.e. Raspberry Pi 4B
Compiled with gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0.
$ uname -srm
Linux 6.8.0-1005-raspi aarch64
2024-06-18T21:49:48+04:00
Running ./build/bench.out
Run on (4 X 1800 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x4)
L1 Instruction 48 KiB (x4)
L2 Unified 1024 KiB (x1)
Load Average: 3.51, 3.90, 2.28
-------------------------------------------------------------------------------------
Benchmark Time CPU Iterations items_per_second
-------------------------------------------------------------------------------------
ml_kem_1024/decap_mean 258 us 258 us 10 3.87579k/s
ml_kem_1024/decap_median 258 us 258 us 10 3.88038k/s
ml_kem_1024/decap_stddev 0.963 us 0.959 us 10 14.346/s
ml_kem_1024/decap_cv 0.37 % 0.37 % 10 0.37%
ml_kem_1024/decap_min 257 us 257 us 10 3.84585k/s
ml_kem_1024/decap_max 260 us 260 us 10 3.89065k/s
ml_kem_768/decap_mean 174 us 174 us 10 5.7436k/s
ml_kem_768/decap_median 174 us 174 us 10 5.74181k/s
ml_kem_768/decap_stddev 0.323 us 0.324 us 10 10.6771/s
ml_kem_768/decap_cv 0.19 % 0.19 % 10 0.19%
ml_kem_768/decap_min 174 us 174 us 10 5.72691k/s
ml_kem_768/decap_max 175 us 175 us 10 5.75986k/s
ml_kem_768/keygen_mean 119 us 119 us 10 8.40489k/s
ml_kem_768/keygen_median 119 us 119 us 10 8.4065k/s
ml_kem_768/keygen_stddev 0.217 us 0.237 us 10 16.7154/s
ml_kem_768/keygen_cv 0.18 % 0.20 % 10 0.20%
ml_kem_768/keygen_min 119 us 119 us 10 8.37403k/s
ml_kem_768/keygen_max 119 us 119 us 10 8.43292k/s
ml_kem_1024/encap_mean 216 us 216 us 10 4.6302k/s
ml_kem_1024/encap_median 216 us 216 us 10 4.63436k/s
ml_kem_1024/encap_stddev 1.03 us 1.02 us 10 21.7423/s
ml_kem_1024/encap_cv 0.48 % 0.47 % 10 0.47%
ml_kem_1024/encap_min 215 us 215 us 10 4.59301k/s
ml_kem_1024/encap_max 218 us 218 us 10 4.65477k/s
ml_kem_512/decap_mean 109 us 109 us 10 9.21521k/s
ml_kem_512/decap_median 108 us 108 us 10 9.22127k/s
ml_kem_512/decap_stddev 0.248 us 0.243 us 10 20.5809/s
ml_kem_512/decap_cv 0.23 % 0.22 % 10 0.22%
ml_kem_512/decap_min 108 us 108 us 10 9.17837k/s
ml_kem_512/decap_max 109 us 109 us 10 9.24305k/s
ml_kem_768/encap_mean 140 us 140 us 10 7.12907k/s
ml_kem_768/encap_median 140 us 140 us 10 7.13583k/s
ml_kem_768/encap_stddev 0.597 us 0.596 us 10 30.1105/s
ml_kem_768/encap_cv 0.43 % 0.42 % 10 0.42%
ml_kem_768/encap_min 140 us 140 us 10 7.05566k/s
ml_kem_768/encap_max 142 us 142 us 10 7.16165k/s
ml_kem_1024/keygen_mean 188 us 188 us 10 5.32413k/s
ml_kem_1024/keygen_median 188 us 188 us 10 5.32187k/s
ml_kem_1024/keygen_stddev 0.537 us 0.534 us 10 15.1453/s
ml_kem_1024/keygen_cv 0.29 % 0.28 % 10 0.28%
ml_kem_1024/keygen_min 187 us 187 us 10 5.29511k/s
ml_kem_1024/keygen_max 189 us 189 us 10 5.34655k/s
ml_kem_512/encap_mean 83.7 us 83.7 us 10 11.9524k/s
ml_kem_512/encap_median 83.5 us 83.5 us 10 11.9776k/s
ml_kem_512/encap_stddev 0.421 us 0.420 us 10 59.8055/s
ml_kem_512/encap_cv 0.50 % 0.50 % 10 0.50%
ml_kem_512/encap_min 83.2 us 83.2 us 10 11.8419k/s
ml_kem_512/encap_max 84.4 us 84.4 us 10 12.0191k/s
ml_kem_512/keygen_mean 69.2 us 69.2 us 10 14.4436k/s
ml_kem_512/keygen_median 69.2 us 69.2 us 10 14.4496k/s
ml_kem_512/keygen_stddev 0.267 us 0.269 us 10 55.9869/s
ml_kem_512/keygen_cv 0.39 % 0.39 % 10 0.39%
ml_kem_512/keygen_min 68.9 us 68.9 us 10 14.3569k/s
ml_kem_512/keygen_max 69.7 us 69.7 us 10 14.5198k/s
On Apple M1 Max
Compiled with Apple clang version 15.0.0 (clang-1500.3.9.4).
$ uname -srm
Darwin 23.5.0 arm64
2024-06-18T21:24:57+04:00
Running ./build/bench.out
Run on (10 X 24 MHz CPU s)
CPU Caches:
L1 Data 64 KiB
L1 Instruction 128 KiB
L2 Unified 4096 KiB (x10)
Load Average: 2.12, 4.39, 7.54
-------------------------------------------------------------------------------------
Benchmark Time CPU Iterations items_per_second
-------------------------------------------------------------------------------------
ml_kem_768/keygen_mean 20.7 us 20.7 us 10 48.4041k/s
ml_kem_768/keygen_median 20.7 us 20.7 us 10 48.4089k/s
ml_kem_768/keygen_stddev 0.031 us 0.029 us 10 68.1992/s
ml_kem_768/keygen_cv 0.15 % 0.14 % 10 0.14%
ml_kem_768/keygen_min 20.6 us 20.6 us 10 48.2768k/s
ml_kem_768/keygen_max 20.7 us 20.7 us 10 48.5023k/s
ml_kem_1024/keygen_mean 32.5 us 32.5 us 10 30.8076k/s
ml_kem_1024/keygen_median 32.4 us 32.4 us 10 30.8861k/s
ml_kem_1024/keygen_stddev 0.159 us 0.161 us 10 152.372/s
ml_kem_1024/keygen_cv 0.49 % 0.50 % 10 0.49%
ml_kem_1024/keygen_min 32.4 us 32.3 us 10 30.5386k/s
ml_kem_1024/keygen_max 32.8 us 32.7 us 10 30.9448k/s
ml_kem_768/encap_mean 22.7 us 22.7 us 10 44.144k/s
ml_kem_768/encap_median 22.7 us 22.7 us 10 44.1494k/s
ml_kem_768/encap_stddev 0.037 us 0.037 us 10 72.779/s
ml_kem_768/encap_cv 0.16 % 0.16 % 10 0.16%
ml_kem_768/encap_min 22.6 us 22.6 us 10 43.9993k/s
ml_kem_768/encap_max 22.8 us 22.7 us 10 44.26k/s
ml_kem_768/decap_mean 26.7 us 26.6 us 10 37.5449k/s
ml_kem_768/decap_median 26.6 us 26.6 us 10 37.5935k/s
ml_kem_768/decap_stddev 0.108 us 0.098 us 10 137.284/s
ml_kem_768/decap_cv 0.40 % 0.37 % 10 0.37%
ml_kem_768/decap_min 26.6 us 26.5 us 10 37.2779k/s
ml_kem_768/decap_max 26.9 us 26.8 us 10 37.6739k/s
ml_kem_512/keygen_mean 12.1 us 12.1 us 10 82.8747k/s
ml_kem_512/keygen_median 12.1 us 12.1 us 10 82.9135k/s
ml_kem_512/keygen_stddev 0.016 us 0.018 us 10 120.443/s
ml_kem_512/keygen_cv 0.13 % 0.15 % 10 0.15%
ml_kem_512/keygen_min 12.1 us 12.0 us 10 82.7218k/s
ml_kem_512/keygen_max 12.1 us 12.1 us 10 83.0684k/s
ml_kem_512/encap_mean 13.4 us 13.4 us 10 74.4965k/s
ml_kem_512/encap_median 13.4 us 13.4 us 10 74.512k/s
ml_kem_512/encap_stddev 0.016 us 0.016 us 10 88.0048/s
ml_kem_512/encap_cv 0.12 % 0.12 % 10 0.12%
ml_kem_512/encap_min 13.4 us 13.4 us 10 74.3506k/s
ml_kem_512/encap_max 13.5 us 13.4 us 10 74.6472k/s
ml_kem_1024/encap_mean 35.5 us 35.4 us 10 28.2336k/s
ml_kem_1024/encap_median 35.5 us 35.4 us 10 28.209k/s
ml_kem_1024/encap_stddev 0.133 us 0.134 us 10 106.629/s
ml_kem_1024/encap_cv 0.38 % 0.38 % 10 0.38%
ml_kem_1024/encap_min 35.3 us 35.2 us 10 28.0729k/s
ml_kem_1024/encap_max 35.6 us 35.6 us 10 28.3909k/s
ml_kem_1024/decap_mean 40.4 us 40.3 us 10 24.8064k/s
ml_kem_1024/decap_median 40.4 us 40.3 us 10 24.8086k/s
ml_kem_1024/decap_stddev 0.066 us 0.070 us 10 42.8027/s
ml_kem_1024/decap_cv 0.16 % 0.17 % 10 0.17%
ml_kem_1024/decap_min 40.3 us 40.2 us 10 24.734k/s
ml_kem_1024/decap_max 40.5 us 40.4 us 10 24.8586k/s
ml_kem_512/decap_mean 16.4 us 16.3 us 10 61.1867k/s
ml_kem_512/decap_median 16.4 us 16.3 us 10 61.1979k/s
ml_kem_512/decap_stddev 0.024 us 0.022 us 10 81.9971/s
ml_kem_512/decap_cv 0.15 % 0.13 % 10 0.13%
ml_kem_512/decap_min 16.3 us 16.3 us 10 61.0308k/s
ml_kem_512/decap_max 16.4 us 16.4 us 10 61.308k/s
Usage
ml-kem is written as a header-only C++20 constexpr library, mainly targeting 64-bit desktop/server grade platforms, and it's pretty easy to get started with. All you need to do is the following.
- Clone the ml-kem repository.
cd
# Multi-step cloning and importing of submodules
git clone https://github.com/itzmeanjan/ml-kem.git && pushd ml-kem && git submodule update --init && popd
# Or do single step cloning and importing of submodules
git clone https://github.com/itzmeanjan/ml-kem.git --recurse-submodules
# Or clone and then run tests, which will automatically bring in dependencies
git clone https://github.com/itzmeanjan/ml-kem.git && pushd ml-kem && make -j && popd
- Write your program, including the proper header files (based on which variant of ML-KEM you want to use, see the include directory), which contain declarations (and definitions) of all required ML-KEM routines and constants (such as byte length of public/private key, cipher text etc.).
// main.cpp
#include "ml_kem/ml_kem_512.hpp"
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
int
main()
{
std::array<uint8_t, ml_kem_512::SEED_D_BYTE_LEN> d{};
std::array<uint8_t, ml_kem_512::SEED_Z_BYTE_LEN> z{};
std::array<uint8_t, ml_kem_512::PKEY_BYTE_LEN> pkey{};
std::array<uint8_t, ml_kem_512::SKEY_BYTE_LEN> skey{};
std::array<uint8_t, ml_kem_512::SEED_M_BYTE_LEN> m{};
std::array<uint8_t, ml_kem_512::CIPHER_TEXT_BYTE_LEN> cipher{};
std::array<uint8_t, ml_kem_512::SHARED_SECRET_BYTE_LEN> sender_key{};
std::array<uint8_t, ml_kem_512::SHARED_SECRET_BYTE_LEN> receiver_key{};
// Be careful !
//
// Read API documentation in include/ml_kem/internals/rng/prng.hpp
ml_kem_prng::prng_t<128> prng;
prng.read(d);
prng.read(z);
prng.read(m);
ml_kem_512::keygen(d, z, pkey, skey);
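// Key Encapsulation might fail, if input public key is malformed.
// Calling encapsulate() outside of assert() ensures that it still runs when NDEBUG is defined.
const auto is_encapsulated = ml_kem_512::encapsulate(m, pkey, cipher, sender_key);
assert(is_encapsulated);
(void)is_encapsulated;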
ml_kem_512::decapsulate(skey, cipher, receiver_key);
assert(sender_key == receiver_key);
return 0;
}
- When compiling your program, let your compiler know where it can find the ml-kem, sha3 and subtle headers, which include their definitions too (all of them are header-only libraries).
# Assuming `ml-kem` was cloned just under $HOME
ML_KEM_HEADERS=~/ml-kem/include
SHA3_HEADERS=~/ml-kem/sha3/include
SUBTLE_HEADERS=~/ml-kem/subtle/include
g++ -std=c++20 -Wall -Wextra -pedantic -O3 -march=native -I $ML_KEM_HEADERS -I $SHA3_HEADERS -I $SUBTLE_HEADERS main.cpp
| ML-KEM Variant | Namespace | Header |
|---|---|---|
| ML-KEM-512 Routines | ml_kem_512:: | include/ml_kem/ml_kem_512.hpp |
| ML-KEM-768 Routines | ml_kem_768:: | include/ml_kem/ml_kem_768.hpp |
| ML-KEM-1024 Routines | ml_kem_1024:: | include/ml_kem/ml_kem_1024.hpp |
Note
ML-KEM parameter sets are taken from table 2 of ML-KEM draft standard @ https://doi.org/10.6028/NIST.FIPS.203.ipd.
All the functions in this ML-KEM header-only library are implemented as constexpr functions. Hence you should be able to evaluate ML-KEM key generation, encapsulation or decapsulation at compile-time itself, given that all inputs are known at compile-time. I present you with the following demonstration program, which generates an ML-KEM-512 keypair and encapsulates a message, producing an ML-KEM-512 cipher text and a fixed size shared secret, given seed_{d, z, m} as input - all at program compile-time. Notice the static assertion.
// compile-time-ml-kem-512.cpp
//
// Compile and run this program with
// $ g++ -std=c++20 -Wall -Wextra -pedantic -I include -I sha3/include -I subtle/include compile-time-ml-kem-512.cpp && ./a.out
// or
// $ clang++ -std=c++20 -Wall -Wextra -pedantic -fconstexpr-steps=4000000 -I include -I sha3/include -I subtle/include compile-time-ml-kem-512.cpp && ./a.out
#include "ml_kem/ml_kem_512.hpp"
// Compile-time evaluation of ML-KEM-512 key generation and encapsulation, using NIST official KAT no. (1).
constexpr auto
eval_ml_kem_512_encaps() -> auto
{
using seed_t = std::array<uint8_t, ml_kem_512::SEED_D_BYTE_LEN>;
// 7c9935a0b07694aa0c6d10e4db6b1add2fd81a25ccb148032dcd739936737f2d
constexpr seed_t seed_d = { 124, 153, 53, 160, 176, 118, 148, 170, 12, 109, 16, 228, 219, 107, 26, 221, 47, 216, 26, 37, 204, 177, 72, 3, 45, 205, 115, 153, 54, 115, 127, 45 };
// b505d7cfad1b497499323c8686325e4792f267aafa3f87ca60d01cb54f29202a
constexpr seed_t seed_z = {181, 5, 215, 207, 173, 27, 73, 116, 153, 50, 60, 134, 134, 50, 94, 71, 146, 242, 103, 170, 250, 63, 135, 202, 96, 208, 28, 181, 79, 41, 32, 42};
// eb4a7c66ef4eba2ddb38c88d8bc706b1d639002198172a7b1942eca8f6c001ba
constexpr seed_t seed_m = {235, 74, 124, 102, 239, 78, 186, 45, 219, 56, 200, 141, 139, 199, 6, 177, 214, 57, 0, 33, 152, 23, 42, 123, 25, 66, 236, 168, 246, 192, 1, 186};
std::array<uint8_t, ml_kem_512::PKEY_BYTE_LEN> pubkey{};
std::array<uint8_t, ml_kem_512::SKEY_BYTE_LEN> seckey{};
std::array<uint8_t, ml_kem_512::CIPHER_TEXT_BYTE_LEN> cipher{};
std::array<uint8_t, ml_kem_512::SHARED_SECRET_BYTE_LEN> shared_secret{};
ml_kem_512::keygen(seed_d, seed_z, pubkey, seckey);
(void)ml_kem_512::encapsulate(seed_m, pubkey, cipher, shared_secret);
return shared_secret;
}
int
main()
{
// This step is being evaluated at compile-time, thanks to the fact that my ML-KEM implementation is `constexpr`.
static constexpr auto computed_shared_secret = eval_ml_kem_512_encaps();
// 500c4424107df96b01749b95f47a14eea871c3742606e15d2b6c91d207d85965
constexpr std::array<uint8_t, ml_kem_512::SHARED_SECRET_BYTE_LEN> expected_shared_secret = { 80, 12, 68, 36, 16, 125, 249, 107, 1, 116, 155, 149, 244, 122, 20, 238, 168, 113, 195, 116, 38, 6, 225, 93, 43, 108, 145, 210, 7, 216, 89, 101 };
// Notice static_assert, yay !
static_assert(computed_shared_secret == expected_shared_secret, "Must be able to compute shared secret at compile-time !");
return 0;
}
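The seed arrays in the program above are written out as decimal bytes next to their hex form. As a convenience, a small constexpr helper can do that conversion at compile-time instead. The sketch below is an assumption of mine, not something that ships with ml-kem: `hex_digit` and `from_hex` are hypothetical names. A malformed input makes constant evaluation fail, so the mistake surfaces as a compile-time error.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not part of ml-kem): converts a single hex character to its 4-bit value.
constexpr uint8_t
hex_digit(const char c)
{
  if (c >= '0' && c <= '9') return static_cast<uint8_t>(c - '0');
  if (c >= 'a' && c <= 'f') return static_cast<uint8_t>(c - 'a' + 10);
  if (c >= 'A' && c <= 'F') return static_cast<uint8_t>(c - 'A' + 10);
  throw "invalid hex digit"; // Reaching this during constant evaluation is a compile-time error.
}

// Hypothetical helper (not part of ml-kem): decodes a hex string of exactly 2*N characters into N bytes.
template<std::size_t N>
constexpr std::array<uint8_t, N>
from_hex(const std::string_view hex)
{
  if (hex.size() != 2 * N) throw "unexpected hex string length";

  std::array<uint8_t, N> bytes{};
  for (std::size_t i = 0; i < N; i++) {
    bytes[i] = static_cast<uint8_t>((hex_digit(hex[2 * i]) << 4) | hex_digit(hex[2 * i + 1]));
  }
  return bytes;
}

// Same value as seed_d in the demonstration program above.
constexpr auto seed_d_from_hex = from_hex<32>("7c9935a0b07694aa0c6d10e4db6b1add2fd81a25ccb148032dcd739936737f2d");
static_assert(seed_d_from_hex[0] == 124 && seed_d_from_hex[31] == 45, "Hex string must decode at compile-time");
```

Since the result is a std::array of the expected length, it should be usable wherever the hand-written byte arrays are used.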
See example program, where I show how to use the ML-KEM-768 API.
g++ -std=c++20 -Wall -Wextra -pedantic -O3 -march=native -I ./include -I ./sha3/include -I ./subtle/include/ examples/ml_kem_768.cpp && ./a.out
ML-KEM-768
Pubkey : 6653a1f5242faad7b37863433dc56538957f3c412102a17d28bc328c4781c566331f8c0b77093baef24a58d6312ddc719ac67ac2874f3adc8a3e6530adbc14cc069159a99e56277895c17c04da1644db23a6e9c16f31c21959400a8abd483a3fcfc0c5fd759917322a66a2aa77a6956f3b8387443640746b0ac8a282521dd784332d56aa745898c3fcc60a56a0716931bbe69b26c4514d529c79979355c8b40eb97fe7c485ceaa45d610145b4bce7da6343db46b6bf42182931a3ed98bafb66614e024cf8c9e51a90b1fc3702a2b4fe3b0c537fa9a1680b4d2f2044c557b1819300a6225be6c234d07d06a702eeb7110ef05c8973b0cab182efbb9ba07811b87b24e2a652cb428240c53423efcaf201973bf3342e86a8d477191d3544217f143586ba351fb7729ac8a51a51c8ab719fd3568c615a7a438b8967301754cac96a8552af82d8ce8840da56cb7481ad54581904c0d390732eceb23df4483cd7593d949bb0c985042f71018862b0d126702a7b55c8c7d9d44cfea157d4013c57ceb18bcfc2c95d8bb8d6178e0ac738ffcc1b3fca525a7ae83652e0b75836fe6c77d182626a8ca85262a17bc60645105a503b2f0f707e765552d49979273b0cb5870124933c6557ef795b36bf093f6c35ef722c9b2854999d20b5fd23dc6d2381ef38bcf547e37faa8ccda3dea2409deda7992a1951849ce3e7b11f3f98cb0d2283e458af854af9d74c57516a924e74222e9bcac529e88a02913d9ae29ec3cc42269d08ca1bf13a941f95d0bb05da9ac4a1ea2bb86b4c631853ec5f2129834a70ba923c8f1bc3fd5cad8692ef4401417b362f0e729497633794abc21e95a59319403002085b113a7b0544165210c1726346a088a933347a265ba055429e637fc40b111d38446461d77546166f923e5249427e5c62092b6ee2a2b585273c3d545b99673419194e54978d71ca606e238a053988db999904207b8326c5b27a38966c4d99460386c453d12821602b444320da205c980da3ab9d3461d405601a7226c143cf4492b8bf4c63a949a8ad81224c71005abcf6afb4ba7ba94ee437079494f9d78c69ac950711765a7e50ab42ba6bf64a5a7d30e64d14fb845ab37b4cc5b099c44e3cf4f9bc61f3640b5b98560474f9f1054dce9be10db77dea35a2375c66d0a26d7f73e7385b03ca8194dbaaf601184f826bd0a86b1d023ae9548b6d4602cd25c3f46e1c66dca6fd183007d043f6f3a6f5a56089180744d579bfe3cc65f003d91084adac4c0ac5811ea8a3e5aa6500eac125a423000297a7975585a083bcfc807b7722b5a0438c1b11b62abcb9a4623f8090c451690455acf97814074c0b6d6d19180f08afbd5ba0e259ee910a61f684d14e7996e47ba5a19994a13a642a1bb563411979bc
af7b302d70ba750a89867653b93596d03b260c7c0a024949bb7b1b110dc8267d5c5305390da26c7e6296add6ba7533b92540c28b337c5b392a6024c57cc09b899ec72e1466de604aa0c909baeb0b0078324e810481f760ae694b5e88cb034bca48bf70881047c7b6ab7ce04b96ad2bd0142b387bc1824a22742c7ce18ebac7744a616a5631e40ab817426c6130ece8641661ab863c44c2adb64029990aea24c94bec0ad7bfba46cd1894775ac6549b1a63446cade59357e125c589a73
Seckey : 7fc5b38c54324b214d9db93e4069c03a097931156a9e18291ac9b0ac787dfcd189a5971b3dd79fab1b5bacbb35a8d33d08faa3493c8d26886c0f487bee757b16837799912bbd99200502227ff510a4c39d924434af8233220084888b134fb608b9d424c66b8ec5c34069327c5ac591ee0b774ef42f922aceeae21d44738b74e644c5dc0fff4c8ed2cb25f495126840442370159a7c5bb6631c4b7aa4d0445a1fa63c104b492e986e1109bcab369fa0889ca41202d22c8511b8a35719163336a87ea672798168038439ab5645acf813f773a0a575169250c2a2062618aa3b77d015750abaf6a707019945e590beaf6cc499b8305ef66583e42c24f9b2d720ac286a8b513261d36a4c34eb7a6a012488aa3e58553e5245ac7e7089d327568ec66e9bccc96fa5ce0bcc8b1fda3343857adc564d4a631e695b7ecea241947561519b48a75b4f89ab992f201d3f5c958254c9c8653ee2d08756388152db8284d0a48db1a7b96282e4b168cf16a51362534c4c90ca21c28f5b76a12c6042f61a7de52c2bf8901f3738cec81c1ac75399b63320217ce8833e06c0b022b5883415009e08428cc07570660cd14a818cc2ad1a00aeb011b9622a48dde6632277b784d28651c17438d8457e67a6dbc016e387cdabf15e3c456c17c1b3825376bdb3c3f418ac251c96605b208f35c0027559405240bc419303ea9d1c9a36c10ccc0b443782e1c1e860ad0af914ab7a53cfaa0d7ef1100be74da5b43fcfe4bc60e7397cf009ddda713eb7610a15716e9b9e7ac55269c9494bc910a3753c0a0942ec654f1500826f900bb4776dde03166ef376dd67cc385406d2f4047ffc819b2657e596c964189727f3b8a9705f4dc011fda20c73846f62146085547d872519636697e16c7e89d741c917974e490812482ab704099e003acab375f1376536b85c5c12196b2a7514461a2fc2032bb65164f6cbe16c8fb6754a3cd75bb26635a1f5b96d113dfb1a3b301438608c39be68adb5b359d00b0b39a77f7167a18c75532b18798aea17d6ea64bf798c3ec0438a1c94d3445f906985f3791a3f4a98df13cd0af24d8feba0dbeb08b42635327c31cff5b6da791ad1c33055b18fa6d8b5c7f454c39a70d1e86f96660cea737e7ca52014a6c662e346a09bc25e4027602b4472dc93ee06c91a698be675bb39d2319e91a083b92e4b7b0bcc38771977c300e889ae60bb38b97ca68836d0bcabde4b6a0d70042ee67e051a1503927c422593df1668bb789d7060cc6b09d0a891384d5bbe90b480ba93afa99a079ea85e28a0236437a0ba397705356b0f90757e754f34fb5a591b3888eb5d719b79f462abab333ed078bb90c4b11dec276713c4cb06bca9d98068f48b90802451f69d2acb08abe66c61f6081f008d04fa7
bad176da4201c9c2a3b3b29b3e7734b1961b6e2ea8d963350d8301a2dc48ce2e056156275cae8c23665b47a5a49c1abc5a01c98cee22a9dd7584c15b304e5c61cd449453208417075a6c3b999f626795379f1d556da168832a47ac638ce89a59c5a9c0a9746ad219142cf782280311b9cea87598c46ba673ac30a281a052dcd710e1bac328d822c19db8ee7e38925c7378431937b812e03382963b9c26653a1f5242faad7b37863433dc56538957f3c412102a17d28bc328c4781c566331f8c0b77093baef24a58d6312ddc719ac67ac2874f3adc8a3e6530adbc14cc069159a99e56277895c17c04da1644db23a6e9c16f31c21959400a8abd483a3fcfc0c5fd759917322a66a2aa77a6956f3b8387443640746b0ac8a282521dd784332d56aa745898c3fcc60a56a0716931bbe69b26c4514d529c79979355c8b40eb97fe7c485ceaa45d610145b4bce7da6343db46b6bf42182931a3ed98bafb66614e024cf8c9e51a90b1fc3702a2b4fe3b0c537fa9a1680b4d2f2044c557b1819300a6225be6c234d07d06a702eeb7110ef05c8973b0cab182efbb9ba07811b87b24e2a652cb428240c53423efcaf201973bf3342e86a8d477191d3544217f143586ba351fb7729ac8a51a51c8ab719fd3568c615a7a438b8967301754cac96a8552af82d8ce8840da56cb7481ad54581904c0d390732eceb23df4483cd7593d949bb0c985042f71018862b0d126702a7b55c8c7d9d44cfea157d4013c57ceb18bcfc2c95d8bb8d6178e0ac738ffcc1b3fca525a7ae83652e0b75836fe6c77d182626a8ca85262a17bc60645105a503b2f0f707e765552d49979273b0cb5870124933c6557ef795b36bf093f6c35ef722c9b2854999d20b5fd23dc6d2381ef38bcf547e37faa8ccda3dea2409deda7992a1951849ce3e7b11f3f98cb0d2283e458af854af9d74c57516a924e74222e9bcac529e88a02913d9ae29ec3cc42269d08ca1bf13a941f95d0bb05da9ac4a1ea2bb86b4c631853ec5f2129834a70ba923c8f1bc3fd5cad8692ef4401417b362f0e729497633794abc21e95a59319403002085b113a7b0544165210c1726346a088a933347a265ba055429e637fc40b111d38446461d77546166f923e5249427e5c62092b6ee2a2b585273c3d545b99673419194e54978d71ca606e238a053988db999904207b8326c5b27a38966c4d99460386c453d12821602b444320da205c980da3ab9d3461d405601a7226c143cf4492b8bf4c63a949a8ad81224c71005abcf6afb4ba7ba94ee437079494f9d78c69ac950711765a7e50ab42ba6bf64a5a7d30e64d14fb845ab37b4cc5b099c44e3cf4f9bc61f3640b5b98560474f9f1054dce9be10db77dea35a2375c66d0a26d7f73e7385b03ca8194dbaaf6
01184f826bd0a86b1d023ae9548b6d4602cd25c3f46e1c66dca6fd183007d043f6f3a6f5a56089180744d579bfe3cc65f003d91084adac4c0ac5811ea8a3e5aa6500eac125a423000297a7975585a083bcfc807b7722b5a0438c1b11b62abcb9a4623f8090c451690455acf97814074c0b6d6d19180f08afbd5ba0e259ee910a61f684d14e7996e47ba5a19994a13a642a1bb563411979bcaf7b302d70ba750a89867653b93596d03b260c7c0a024949bb7b1b110dc8267d5c5305390da26c7e6296add6ba7533b92540c28b337c5b392a6024c57cc09b899ec72e1466de604aa0c909baeb0b0078324e810481f760ae694b5e88cb034bca48bf70881047c7b6ab7ce04b96ad2bd0142b387bc1824a22742c7ce18ebac7744a616a5631e40ab817426c6130ece8641661ab863c44c2adb64029990aea24c94bec0ad7bfba46cd1894775ac6549b1a63446cade59357e125c589a73dd18d5e8aad6acb35a89e0958c3ae122197bb6fed165733ca120172d11335a4d60d73fb91d0ffac552692219ef3082477a0f6399aa5dce8a72fd0afaa3b627c9
Encapsulated ? : true
Cipher : 1d04afad6cf4058acb290f72298587c8afb9e022fc0a4b3e1aa5fdc79cfbe44e7781317adbc1f92fd01a6ad3840386710a369276c50671d2b58272505793736bb9d0e8883c200270ddae19fbc86af41aba366b4ddfd67f8771905b3fccca6da805a1e13a9e697500779cfe52484811e906042fa6e6e93ef641e5e7a46c39969c4683ee7cb440fc4cc452dab5215d6ec32a36fa0e8d7501b5d7dcc9dbfb51cbb1c036b052a7354544a6707099ded7b5e5c5024e2a6f356b2d300585128a30d7b964842d5c06659990c85468b42f5f2b46c39b4fa740a3f7006da01ffa09fb2fd6b5b0e9174bd7a801972b647df2825842b8ad146220a1ddcc9eee6967954e8d960bbf5ea8a74ae0306061c44e2995eb451171bd3eb4679579922e48e713ad40cddcd14343dc57a181e3067f1b01895122a447cf002b600c96a30c5f809efcc459cebc8723ca5b5147d2f9d09186f31bba013f19e63294cc5a57c0184b838cb9d51c62e0303c9a029cf6a5c489ccb43bd0bdd4da61f147d6ef9c2b95a758d0c2b9a9265e7cf4255989c07799940c517ecd527cca2acf62d104e2d45a176e35852d81f42397c93d3b2b1c7fde3cc6f4cd5d6c166f7312e34f690a07ecbaac69a045358564142422b45c58784cd5d2d69d9084b7e9f33176893bee2f1589725ed1a443f4b9095e97294f740e8471f468a51db85cc66176af022db77314579776b69eaa8594dbed5d0e0b549675e12c742913da76e3de732c24f7811d8ee32ade2ac1bcb8763c0e898a67695aaab9478c80dc29cc3ae9f1c4b63c116bda64e1e8727881ebe4c1db30219a87d7ff8805675b56a4907d9408bb96438a5182c66a47739f8b12cd5241b5f4e995f4f1fc85041eeaeb158d7ea9c1601a9b3849c6977137a0e82afb72b16748efa456fbae5b28ed82107d79dec3da87d0c0261267a3dbe9dcedb374d96fc00b7478b30f917b2312e7e79133923c2d9aba394bfcdbd00539f7d2d4fdecf9821fdb4c15f253e5ad80d10e360fb84b45e01415a4d5759cd5000ea5c4e80f60a887f9e8ad35ef7cabab83eeb59bf81b3bb10b440707e877c558ca9c80df8d3d8741b838ddf9a5e0e7826a1f6ee0c4f2241687ab0573b18814d21a668861962400148b45a24fdfeb3638a1f16b7c344b088cfffc851317753c1e0602bb0cbfb5357132baf29d6123862eb8b29229a5fd9b173ad4c1b098d11ff23f6ee1c7d357235e647dd99451162cfbed33b7d05df5578859538a9edbeae2cf8ac0903c36e7db352c147c11725a3c5c611b149a4c87e24589d9e31d30a9a8b2cdd863b8dd3ab8c90cde061426a2afedb4aff424cde10e70f1e38207d0fc8be467b4f063739d920bb1906144a704c7ba5be6645899270e5da6380dabfb16e7f906a1f4845
01005cb383692e054533697a63c8a2f8e1b891b37d5b23afef1de8f9a257f7c9577466fbd87223c5773795ac23ab4cfc0043a965e8695e764174bdc1c778d3d1d6e2a65d9cb7a4b1eb31ca818b0c8abe779fd61a34ee78cfc49fd7682
Shared secret : ee30e0696c36480afb066fa2971535f195a30ce08aacc3dfc182ed0947a44f3a
Caution
Before you consider using the Pseudo Random Number Generator which comes with this library implementation, I strongly advise you to go through include/ml_kem/internals/rng/prng.hpp.
Note
Looking at the API documentation in the header files can give you a good idea of how to use the ML-KEM API. Note, this library doesn't expose any raw pointer based interface, rather everything is wrapped under statically defined std::span - which one can easily create from std::{array, vector}. I opt for using statically defined std::span based function interfaces because we always know, at compile-time, how many bytes the seeds/ keys/ cipher-texts/ shared-secrets are, for various different ML-KEM parameters. This gives much better type safety and compile-time error reporting.