Commit Graph

77 Commits

Author SHA1 Message Date
release-bot
c6f6e61d60 Bump rust crates' version
icicle-babybear@2.3.1
icicle-bls12-377@2.3.1
icicle-bls12-381@2.3.1
icicle-bn254@2.3.1
icicle-bw6-761@2.3.1
icicle-core@2.3.1
icicle-cuda-runtime@2.3.1
icicle-grumpkin@2.3.1
icicle-hash@2.3.1
icicle-stark252@2.3.1

Generated by cargo-workspaces
2024-05-20 13:43:32 +00:00
Leon Hibnik
db298aefc1 [HOTFIX] rust msm benchmarks (#521)
## Describe the changes

removes unused host to device copy, adds minimum limit to run MSM
benchmarks
2024-05-20 13:51:53 +03:00
yshekel
19a9b76d64 fix: cmake set_gpu_env() and windows build (#520) 2024-05-20 13:05:45 +03:00
release-bot
76a82bf88e Bump rust crates' version
icicle-babybear@2.3.0
icicle-bls12-377@2.3.0
icicle-bls12-381@2.3.0
icicle-bn254@2.3.0
icicle-bw6-761@2.3.0
icicle-core@2.3.0
icicle-cuda-runtime@2.3.0
icicle-grumpkin@2.3.0
icicle-hash@2.3.0
icicle-stark252@2.3.0

Generated by cargo-workspaces
2024-05-17 04:42:17 +00:00
yshekel
9c1afe8a44 Polynomial API views replaced by evaluation on rou domain (#514)
- removed poly API to access view of evaluations. This is a problematic API since it cannot handle small domains and, for large domains, requires the polynomial to use more memory than it needs to.
- added evaluate_on_rou_domain() API instead that supports any power-of-two domain size.
- the new API can compute to HOST or DEVICE memory
- Rust wrapper for evaluate_on_rou_domain()
- updated documentation: overview and Rust wrappers
- faster division by vanishing poly for common case where numerator is 2N and vanishing poly is of degree N.
- allow division a/b where deg(a)<deg(b) instead of throwing an error.
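
To illustrate what evaluating on a roots-of-unity domain means, here is a minimal host-side sketch over a toy field. It is not the ICICLE implementation: the modulus 17, the generator 3, and the name `evaluate_on_rou_domain_naive` are assumptions made for this example, and it uses plain Horner evaluation rather than an NTT.

```rust
/// Toy prime modulus (assumption): 17 - 1 = 16, so domains of size 1, 2, 4, 8, 16 exist.
const P: u64 = 17;
/// 3 generates the multiplicative group modulo 17.
const GENERATOR: u64 = 3;

/// Computes base^exp mod P by square-and-multiply.
fn mod_pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Hypothetical helper: evaluates the polynomial with coefficients `coeffs`
/// (lowest degree first) at w^0, w^1, ..., w^(domain_size - 1), where w is a
/// primitive `domain_size`-th root of unity. The domain may be smaller or
/// larger than the number of coefficients.
fn evaluate_on_rou_domain_naive(coeffs: &[u64], domain_size: u64) -> Vec<u64> {
    assert!(domain_size.is_power_of_two() && domain_size <= P - 1);
    let omega = mod_pow(GENERATOR, (P - 1) / domain_size);
    (0..domain_size)
        .map(|i| {
            let x = mod_pow(omega, i);
            // Horner's rule, starting from the highest-degree coefficient.
            coeffs.iter().rev().fold(0u64, |acc, &c| (acc * x + c) % P)
        })
        .collect()
}
```

For example, `evaluate_on_rou_domain_naive(&[1, 2, 3], 2)` evaluates a degree-2 polynomial on a domain of size 2, the kind of small-domain case the removed view API could not handle.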
2024-05-15 14:06:23 +03:00
release-bot
940b283c47 Bump rust crates' version
icicle-babybear@2.2.0
icicle-bls12-377@2.2.0
icicle-bls12-381@2.2.0
icicle-bn254@2.2.0
icicle-bw6-761@2.2.0
icicle-core@2.2.0
icicle-cuda-runtime@2.2.0
icicle-grumpkin@2.2.0
icicle-hash@2.2.0
icicle-stark252@2.2.0

Generated by cargo-workspaces
2024-05-09 12:27:17 +00:00
ChickenLover
9da52bc09f Feat/roman/poseidon2 (#510)
# This PR

1. Adds C++ API
2. Renames a lot of API functions
3. Adds inplace poseidon2
4. Makes input const at all poseidon functions
5. Adds benchmark for poseidon2
2024-05-09 19:19:55 +07:00
VitaliiH
49079d0d2a rust ecntt hotfix (#509)
## Describe the changes

This PR fixes Rust ECNTT benches and tests


---------

Co-authored-by: VitaliiH <Vitaliy@ingo>
2024-05-09 11:21:21 +03:00
ChickenLover
094683d291 Feat/roman/poseidon2 (#507)
This PR adds support for poseidon2 permutation function as described in
https://eprint.iacr.org/2023/323.pdf

Reference implementations used (and compared against):
https://github.com/HorizenLabs/poseidon2/tree/main
https://github.com/Plonky3/Plonky3/tree/main

Tasks:

- [x] Remove commented code and prints
- [ ] Add doc-comments to functions and structs
- [x] Fix possible issue with Plonky3 imports
- [x] Update NTT/Plonky3 test
- [x] Add Plonky3-bn254 test (impossible)
2024-05-09 15:13:43 +07:00
nonam3e
c30e333819 keccak docs (#508)
This PR adds keccak docs

---------

Co-authored-by: Leon Hibnik <107353745+LeonHibnik@users.noreply.github.com>
2024-05-08 23:18:59 +03:00
VitaliiH
34f0212c0d rust classic benches with Criterion for ecntt/msm/ntt (#499)
Rust idiomatic benches for EC NTT, NTT, MSM
2024-05-05 10:28:41 +02:00
release-bot
f6758f3447 Bump rust crates' version
icicle-babybear@2.1.0
icicle-bls12-377@2.1.0
icicle-bls12-381@2.1.0
icicle-bn254@2.1.0
icicle-bw6-761@2.1.0
icicle-core@2.1.0
icicle-cuda-runtime@2.1.0
icicle-grumpkin@2.1.0
icicle-hash@2.1.0
icicle-stark252@2.1.0

Generated by cargo-workspaces
2024-05-01 20:11:42 +00:00
nonam3e
e2ad621f97 Nonam3e/golang/keccak (#496)
## Describe the changes

This PR adds keccak bindings + passes cfg as reference in keccak cuda functions
2024-05-01 14:08:33 +03:00
PatStiles
bdc3da98d6 FEAT(stark252 field): Adds Stark252 curve (#494)
## Describe the changes

Adds support for the stark252 base field.
2024-05-01 14:08:05 +03:00
yshekel
36e288c1fa fix: bug regarding MixedRadix coset (I)NTT for NM/MN ordering (#497)
The bug is in how the twiddles array is indexed when multiplied by a mixed
(M) vector to implement (I)NTT on cosets.
The fix is to use the DIF-digit-reverse to compute the index of the element in the
natural (N) vector that moved to index 'i' in the M vector. This is
emulating a DIT-digit-reverse (which is mixing like a DIF-compute)
reorder of the twiddles array and element-wise multiplication without
reordering the twiddles memory.
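
The indexing idea can be sketched on the host with a simplified stand-in. The example below uses radix-2 bit reversal instead of the mixed-radix digit reversal, a toy modulus of 17, and hypothetical function names, so it is not the kernel code. Note that bit reversal is its own inverse, so the DIF/DIT distinction described above disappears in this simplified version.

```rust
/// Toy prime modulus (assumption for this sketch).
const P: u64 = 17;

/// Computes base^exp mod P by square-and-multiply.
fn mod_pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Reverses the lowest `log_n` bits of `i`.
fn bit_reverse(i: usize, log_n: u32) -> usize {
    if log_n == 0 {
        return 0;
    }
    i.reverse_bits() >> (usize::BITS - log_n)
}

/// Reference path: scale element i by g^i in natural order, then reorder.
fn scale_then_reorder(natural: &[u64], g: u64, log_n: u32) -> Vec<u64> {
    let scaled: Vec<u64> = natural
        .iter()
        .enumerate()
        .map(|(i, &v)| v % P * mod_pow(g, i as u64) % P)
        .collect();
    (0..scaled.len())
        .map(|i| scaled[bit_reverse(i, log_n)])
        .collect()
}

/// Scales a vector that is already reordered. Position i holds the element
/// whose natural index is bit_reverse(i), so it is multiplied by
/// g^bit_reverse(i) rather than g^i. The coset powers are never reordered in memory.
fn scale_reordered(mixed: &[u64], g: u64, log_n: u32) -> Vec<u64> {
    mixed
        .iter()
        .enumerate()
        .map(|(i, &v)| v % P * mod_pow(g, bit_reverse(i, log_n) as u64) % P)
        .collect()
}
```

Both paths produce the same vector; the second one only changes how the multiplier index is computed.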
2024-04-25 18:09:27 +03:00
release-bot
14b39b57cc Bump rust crates' version
icicle-babybear@2.0.1
icicle-bls12-377@2.0.1
icicle-bls12-381@2.0.1
icicle-bn254@2.0.1
icicle-bw6-761@2.0.1
icicle-core@2.0.1
icicle-cuda-runtime@2.0.1
icicle-grumpkin@2.0.1
icicle-hash@2.0.1

Generated by cargo-workspaces
2024-04-24 07:13:05 +00:00
release-bot
ff374fcac7 Bump rust crates' version
icicle-babybear@2.0.0
icicle-bls12-377@2.0.0
icicle-bls12-381@2.0.0
icicle-bn254@2.0.0
icicle-bw6-761@2.0.0
icicle-core@2.0.0
icicle-cuda-runtime@2.0.0
icicle-grumpkin@2.0.0
icicle-hash@2.0.0

Generated by cargo-workspaces
2024-04-23 02:30:18 +00:00
ChickenLover
7265d18d48 ICICLE V2 Release (#492)
This PR introduces major updates for ICICLE Core, Rust and Golang
bindings

---------

Co-authored-by: Yuval Shekel <yshekel@gmail.com>
Co-authored-by: DmytroTym <dmytrotym1@gmail.com>
Co-authored-by: Otsar <122266060+Otsar-Raikou@users.noreply.github.com>
Co-authored-by: VitaliiH <vhnatyk@gmail.com>
Co-authored-by: release-bot <release-bot@ingonyama.com>
Co-authored-by: Stas <spolonsky@icloud.com>
Co-authored-by: Jeremy Felder <jeremy.felder1@gmail.com>
Co-authored-by: ImmanuelSegol <3ditds@gmail.com>
Co-authored-by: JimmyHongjichuan <45908291+JimmyHongjichuan@users.noreply.github.com>
Co-authored-by: pierre <pierreuu@gmail.com>
Co-authored-by: Leon Hibnik <107353745+LeonHibnik@users.noreply.github.com>
Co-authored-by: nonam3e <timur@ingonyama.com>
Co-authored-by: Vlad <88586482+vladfdp@users.noreply.github.com>
Co-authored-by: LeonHibnik <leon@ingonyama.com>
Co-authored-by: nonam3e <71525212+nonam3e@users.noreply.github.com>
Co-authored-by: vladfdp <vlad.heintz@gmail.com>
2024-04-23 05:26:40 +03:00
release-bot
a1dc0539ce Bump rust crates' version
icicle-bls12-377@1.10.1
icicle-bls12-381@1.10.1
icicle-bn254@1.10.1
icicle-bw6-761@1.10.1
icicle-core@1.10.1
icicle-cuda-runtime@1.10.1
icicle-grumpkin@1.10.1

Generated by cargo-workspaces
2024-04-11 07:56:32 +00:00
release-bot
8498a962f9 Bump rust crates' version
icicle-bls12-377@1.10.0
icicle-bls12-381@1.10.0
icicle-bn254@1.10.0
icicle-bw6-761@1.10.0
icicle-core@1.10.0
icicle-cuda-runtime@1.10.0
icicle-grumpkin@1.10.0

Generated by cargo-workspaces
2024-04-09 10:02:34 +00:00
Leon Hibnik
a7b0dc40c1 [FEAT] ReleaseDomain API (#465)
## Describe the changes

This PR adds an NTT ReleaseDomain API in Golang and Rust

## Linked Issues

Resolves #

---------

Co-authored-by: Yuval Shekel <yshekel@gmail.com>
2024-04-09 12:58:19 +03:00
Vlad
4a35eece51 transpose kernel in vec_ops and rust binding (#462)
## Describe the changes

This PR adds an extern C link to the transpose kernel, now in
vec_ops.cu.
It also adds a Rust binding and updates the test check_ntt_batch to use
the new transpose function.
The test passes.
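
For reference, the operation a transpose performs on a row-major matrix can be written on the host as below. This is a plain CPU sketch with a hypothetical name and signature, not the CUDA kernel or the Rust binding added here.

```rust
/// Transposes a row-major `rows x cols` matrix into a row-major
/// `cols x rows` matrix. Hypothetical host-side reference only.
fn transpose(input: &[u64], rows: usize, cols: usize) -> Vec<u64> {
    assert_eq!(input.len(), rows * cols);
    let mut output = vec![0u64; input.len()];
    for r in 0..rows {
        for c in 0..cols {
            // Element (r, c) of the input becomes element (c, r) of the output.
            output[c * rows + r] = input[r * cols + c];
        }
    }
    output
}
```

A reference like this can be handy for checking device results in tests on small inputs.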

## Linked Issues

Resolves #

---------

Co-authored-by: LeonHibnik <leon@ingonyama.com>
2024-04-09 08:47:33 +03:00
VitaliiH
4c9b3c00a5 Devmode to Reduce compilation time (including G2 and ECNTT) (#395)
devmode to reduce compilation time
2024-04-09 06:09:04 +02:00
DmytroTym
b93b1d0aaf NTT inplace in Rust (#453)
## Describe the changes

Due to Rust's ownership rules, we can't run NTT inplace using the
[`ntt`](https://github.com/ingonyama-zk/icicle/blob/v1.9.1/wrappers/rust/icicle-core/src/ntt/mod.rs#L139)
function, and the need for a separate inplace function has come up a couple
of times.
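
The ownership constraint can be shown with a toy example. The functions below are hypothetical stand-ins (they double each element and are not NTTs), and their signatures are assumptions chosen only to mirror the "separate input and output" versus "single in/out buffer" shapes.

```rust
/// Out-of-place stand-in: reads `input`, writes `output`.
fn double_all(input: &[u64], output: &mut [u64]) {
    assert_eq!(input.len(), output.len());
    for (out, &x) in output.iter_mut().zip(input) {
        *out = 2 * x;
    }
}

/// In-place stand-in: one exclusive borrow serves as both input and output.
fn double_all_inplace(inout: &mut [u64]) {
    for x in inout.iter_mut() {
        *x *= 2;
    }
}

// The borrow checker rejects the call below, because `data` cannot be
// borrowed as shared and as mutable at the same time:
//
//     double_all(&data, &mut data);
```

This is why an out-of-place signature cannot simply be reused for the inplace case in safe Rust.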

Incidentally, an issue with the radix-2 NTT was found when it was run inplace:
`__syncthreads()` was used in the reverse order kernel as if it were a global
barrier for all blocks rather than a block-local one. This caused a data race,
which this PR fixes.
2024-04-08 10:04:04 +03:00
release-bot
25ac705c3b Bump rust crates' version
icicle-bls12-377@1.9.1
icicle-bls12-381@1.9.1
icicle-bn254@1.9.1
icicle-bw6-761@1.9.1
icicle-core@1.9.1
icicle-cuda-runtime@1.9.1
icicle-grumpkin@1.9.1

Generated by cargo-workspaces
2024-03-27 19:00:07 +00:00
release-bot
a1ff989740 Bump rust crates' version
icicle-bls12-377@1.9.0
icicle-bls12-381@1.9.0
icicle-bn254@1.9.0
icicle-bw6-761@1.9.0
icicle-core@1.9.0
icicle-cuda-runtime@1.9.0
icicle-grumpkin@1.9.0

Generated by cargo-workspaces
2024-03-21 07:11:47 +00:00
release-bot
b6b5011a47 Bump rust crates' version
icicle-bls12-377@1.8.0
icicle-bls12-381@1.8.0
icicle-bn254@1.8.0
icicle-bw6-761@1.8.0
icicle-core@1.8.0
icicle-cuda-runtime@1.8.0
icicle-grumpkin@1.8.0

Generated by cargo-workspaces
2024-03-13 21:38:17 +00:00
DmytroTym
7ac463c3d9 MSM pre-computation (#427)
## Brief description

This PR adds pre-computation to the MSM, for some theory see
[this](https://youtu.be/KAWlySN7Hm8?si=XeR-htjbnK_ySbUo&t=1734) timecode
of Niall Emmart's talk.
In terms of public APIs, one method is added. It does the
pre-computation on-device leaving resulting data on-device as well. No
extra structures are added, only `precompute_factor` from `MSMConfig` is
now activated.

## Performance

While performance gains are for now often limited by our inflexibility
in choice of `c` (for example, very large MSMs get basically no speedup
from pre-compute because currently `c` cannot be larger than 16),
there's still a number of MSM sizes which get noticeable improvement:

| Pre-computation factor | bn254 size `2^20` MSM, ms. | bn254 size `2^12` MSM, size `2^10` batch, ms. | bls12-381 size `2^20` MSM, ms. | bls12-381 size `2^12` MSM, size `2^10` batch, ms. |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| 1  | 14.1  | 82.8  | 25.5  | 136.7  |
| 2  | 11.8  | 76.6  | 20.3  | 123.8  |
| 4  | 10.9  | 73.8  | 18.1  | 117.8  |
| 8  | 10.6  | 73.7  | 17.2  | 116.0  |

Here for example pre-computation factor = 4 means that alongside each
original base point, we pre-compute and pass into the MSM 3 of its
"shifted" versions. Pre-computation factor = 1 means no pre-computation.
GPU used for benchmarks is a 3090Ti.
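
The idea of "shifted" base points can be sketched with integers standing in for curve points. Everything in the example is an assumption made for illustration: the additive group modulo 1,000,003 replaces the elliptic curve group, scalars are 16 bits wide, and the function names are hypothetical. It is not the on-device implementation.

```rust
/// Toy modulus: the additive group Z_P stands in for curve points (assumption).
const P: u64 = 1_000_003;
/// Toy scalar width in bits (assumption).
const SCALAR_BITS: u32 = 16;

/// Naive MSM: sum of scalars[i] * bases[i] in Z_P.
fn msm(scalars: &[u64], bases: &[u64]) -> u64 {
    scalars
        .iter()
        .zip(bases)
        .fold(0u64, |acc, (&s, &b)| (acc + s % P * (b % P)) % P)
}

/// For each base B, emits `factor` values: B, 2^w * B, 2^(2w) * B, ...
/// where w = SCALAR_BITS / factor. With factor = 4 that is the original
/// base plus 3 shifted versions.
fn precompute_bases(bases: &[u64], factor: u32) -> Vec<u64> {
    assert!(factor > 0 && SCALAR_BITS % factor == 0);
    let window = SCALAR_BITS / factor;
    let mut out = Vec::with_capacity(bases.len() * factor as usize);
    for &b in bases {
        let mut shifted = b % P;
        for _ in 0..factor {
            out.push(shifted);
            // "Shift" the point by multiplying it by 2^window.
            shifted = shifted * ((1u64 << window) % P) % P;
        }
    }
    out
}

/// Splits each scalar into `factor` chunks of w bits, lowest chunk first,
/// matching the layout produced by `precompute_bases`.
fn split_scalars(scalars: &[u64], factor: u32) -> Vec<u64> {
    assert!(factor > 0 && SCALAR_BITS % factor == 0);
    let window = SCALAR_BITS / factor;
    let mask = (1u64 << window) - 1;
    let mut out = Vec::with_capacity(scalars.len() * factor as usize);
    for &s in scalars {
        assert!(s < (1u64 << SCALAR_BITS));
        for j in 0..factor {
            out.push((s >> (j * window)) & mask);
        }
    }
    out
}
```

Running `msm` on the split scalars and the precomputed bases gives the same result as the original MSM, but over `factor` times as many bases with scalars that are `factor` times shorter.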

## TODOs and open questions

- Golang APIs are missing;
- I mentioned that to utilise pre-compute to its full potential we need
arbitrary choice of `c`. One issue with this is that pre-compute will
become dependent on `c`. For now this is not the case as `c` can only be
a power of 2 and powers of 2 can always share the same pre-computation.
So apparently we need to make `c` a parameter of the precompute function
to future-proof it from a breaking change. This is pretty unnatural and
counterintuitive as `c` is typically chosen in runtime after pre-compute
is done but I don't really see another way, pls let me know if you do.
UPD: `c` is added into pre-compute function, for now it's unused and
it's documented how it will change in the future.

Resolves https://github.com/ingonyama-zk/icicle/issues/147
Co-authored with @ChickenLover

---------

Co-authored-by: ChickenLover <romangg81@gmail.com>
Co-authored-by: nonam3e <timur@ingonyama.com>
Co-authored-by: nonam3e <71525212+nonam3e@users.noreply.github.com>
Co-authored-by: LeonHibnik <leon@ingonyama.com>
2024-03-13 23:25:16 +02:00
HadarIngonyama
287f53ff16 NTT columns batch (#424)
This PR adds the columns batch feature - enabling batch NTT computation
to be performed directly on the columns of a matrix without having to
transpose it beforehand, as requested in issue #264.
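
As a rough host-side picture of what a columns batch does, the sketch below transforms each column of a row-major matrix by reading it with a stride, without transposing first. It uses a naive O(n^2) DFT over a toy field (modulus 17, generator 3) and hypothetical function names; it is not the CUDA implementation.

```rust
/// Toy prime modulus and a generator of its multiplicative group (assumptions).
const P: u64 = 17;
const GENERATOR: u64 = 3;

/// Computes base^exp mod P by square-and-multiply.
fn mod_pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Naive DFT over Z_17 of `n` elements read from `data`, starting at
/// `offset` and stepping by `stride`.
fn dft_strided(data: &[u64], offset: usize, stride: usize, n: usize) -> Vec<u64> {
    assert!(n.is_power_of_two() && (n as u64) <= P - 1);
    let omega = mod_pow(GENERATOR, (P - 1) / (n as u64));
    (0..n)
        .map(|k| {
            (0..n).fold(0u64, |acc, i| {
                let x = data[offset + i * stride] % P;
                (acc + x * mod_pow(omega, (i * k) as u64)) % P
            })
        })
        .collect()
}

/// Transforms every column of a row-major `n x batch` matrix and writes the
/// results back in the same layout. Column `col` lives at indices
/// col, col + batch, col + 2 * batch, ...
fn ntt_columns_batch(matrix: &[u64], n: usize, batch: usize) -> Vec<u64> {
    assert_eq!(matrix.len(), n * batch);
    let mut out = vec![0u64; matrix.len()];
    for col in 0..batch {
        let transformed = dft_strided(matrix, col, batch, n);
        for (row, value) in transformed.into_iter().enumerate() {
            out[row * batch + col] = value;
        }
    }
    out
}
```

The strided read is the only difference from a rows batch, where each input would be contiguous (`stride = 1`).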

Also, some small fixes were made to the reordering kernels and some
unnecessary parameters were removed from function interfaces.

---------

Co-authored-by: DmytroTym <dmytrotym1@gmail.com>
2024-03-13 18:46:47 +02:00
DmytroTym
0e84fb4b76 feat: add warmup for CudaStream (#422)
## Describe the changes

Add a non-blocking `warmup` function to `CudaStream` 

> when you run the benchmark (e.g. the msm example you have) the first
instance is always slow, with a constant overhead of 200~300ms cuda
stream warmup. and I want to get rid of that in my application by
warming it up in parallel while my host do something else.
2024-03-07 19:11:34 +02:00
DmytroTym
4a65758408 Merge branch 'main' into feat/warmup 2024-03-06 22:08:45 +02:00
Jeremy Felder
1abd2ef9c9 Bump rust crates' version
icicle-bls12-377@1.7.0
icicle-bls12-381@1.7.0
icicle-bn254@1.7.0
icicle-bw6-761@1.7.0
icicle-core@1.7.0
icicle-cuda-runtime@1.7.0
icicle-grumpkin@1.7.0

Generated by cargo-workspaces
2024-03-06 22:05:10 +02:00
Jeremy Felder
9d402df0cf Release flow CI (#423)
## Describe the changes

This PR:
- Moves common crate attributes to the workspace Cargo.toml. 
- Adds a manual release flow for bumping, tagging, and draft release
2024-03-06 21:41:48 +02:00
DmytroTym
7185657ff7 Warmup function 2024-03-06 18:13:23 +02:00
Alex Xiong
b22aa02e91 fix: cargo fmt 2024-03-06 13:10:12 +00:00
Alex Xiong
b108c71bdd feat: add rust api for cudaFreeAsync 2024-03-06 12:44:43 +00:00
ChickenLover
9fc083916d Small features (#415)
This PR is a compilation of small improvements

- Lock bindgen version for `icicle-cuda-runtime`
- Add an error message when trying to build on Mac (or any non windows/linux machine)
- Add documentation and template files for adding new curve
- Add documentation on _params.cuh contents
- Add the script to bump all the rust crates versions to the same version

Resolves #313
2024-03-06 13:48:34 +02:00
Jeremy Felder
40309329fb Migrate docs website + improved docs (#389) (#403)
migrate docs website + improved docs (#389)

* Update README.md (#385)

* refactor

* refactor

* refactor

* rename task

* update codespell

* multi gpu docs (#391)

* Refactor

* refactor

* fix typo

* Apply suggestions from code review



* refactor

* refactor

---------

Co-authored-by: ImmanuelSegol <3ditds@gmail.com>
Co-authored-by: DmytroTym <dmytrotym1@gmail.com>
Co-authored-by: ChickenLover <Romangg81@gmail.com>
2024-02-28 14:40:04 +02:00
Jeremy Felder
e8cd2d7a98 GoLang bindings for v1.x (#386) 2024-02-22 20:52:48 +02:00
yshekel
275b2f4958 feature: mixed-radix NTT fast twiddles mode (#382)
- this mode allocates an additional 4N twiddle factors to achieve faster computation
- enabled by flag for initDomain(). Defaults to false.

Co-authored-by: hadaringonyama <hadar@ingonyama.com>
2024-02-22 00:02:02 +02:00
nonam3e
4b221e9665 Grumpkin curve implementation (#379) 2024-02-21 23:20:28 +07:00
ChickenLover
f9755980f0 add vector operations bindings to Rust (#384)
* add vector operations bindings to Rust
2024-02-21 21:17:10 +07:00
ImmanuelSegol
275eaa9904 bump version 2024-02-15 19:36:18 +00:00
DmytroTym
a91397e2c1 MSM improvements (#372)
* Improved MSM

* Zero point handling in large buckets

* Fixed affine zero point conversion for arkworks

* cargo fmt

* Addressed comments

* MSM comments

* All zero scalars case handled

* clang format
2024-02-15 20:02:10 +02:00
ChickenLover
fd08925ed4 merge WIP 2024-02-15 14:57:09 +07:00
VitaliiH
774250926c multi card support (#356)
multi-GPU support
2024-02-14 22:29:30 +01:00
yshekel
a02459c64d Mixed-radix NTT support all orderings (#371)
- Mixed-radix NTT orderings support
- radix-2 small refactor: split core logic to function and renamed ct_butterfly to dit
- testing both radix2 and mixed-radix algs for all ntt tests
2024-02-13 15:49:24 +02:00
yshekel
e16ce1026d Mixed-radix NTT batch support (#367)
Co-authored-by: hadaringonyama <hadar@ingonyama.com>
2024-02-12 14:50:22 +02:00
ChickenLover
8c1750ea97 Feat/roman/display functions (#366)
* fix display and debug traits

* leave only one impl for printing scalars
2024-02-09 14:40:07 +07:00
yshekel
382bec4ad3 Mixed-radix NTT algorithm
Co-authored-by: hadaringonyama <hadar@ingonyama.com>
2024-02-08 20:43:12 +00:00