Compare commits

..

20 Commits

Author SHA1 Message Date
Ethan-000
621676bd41 feat: add num_bits() function (#570)
## Describe the changes

adding a `num_bits()` function similar to
dcf73a5f96/ff/src/biginteger/mod.rs (L482)

this could be useful for small field optimizations

## Linked Issues

Resolves #
2024-08-07 09:37:16 +03:00
Otsar
badb8c5d68 Removed ZK containers from docs sidebar (#571)
## Describe the changes

This PR...

## Linked Issues

Resolves #
2024-08-04 18:38:37 +03:00
Otsar
1300434bbe Removed ZK containers from docs sidebar 2024-08-04 11:14:06 +03:00
yshekel
6a67893773 remove the recommnedation to use zk-contariners in examples (#569) 2024-08-01 14:58:02 +03:00
ChickenLover
0cb0b49be9 Add Sha3 (#560)
## Describe the changes

This PR...

## Linked Issues

Resolves #
2024-07-28 15:31:28 +07:00
Vlad
8411ed1451 Feat/vlad/refactor from affine (#554)
## Describe the changes

This PR refactors the different affine to projective conversion
functions using the C function

also small bug fix for ProjectiveToAffine() function in Go

## Linked Issues

Resolves #
2024-07-22 10:37:24 +02:00
omahs
53f34aade5 Fix typos (#558)
Fix typos
2024-07-18 11:58:04 +03:00
release-bot
aacec3f72f Bump rust crates' version
icicle-babybear@2.8.0
icicle-bls12-377@2.8.0
icicle-bls12-381@2.8.0
icicle-bn254@2.8.0
icicle-bw6-761@2.8.0
icicle-core@2.8.0
icicle-cuda-runtime@2.8.0
icicle-grumpkin@2.8.0
icicle-hash@2.8.0
icicle-m31@2.8.0
icicle-stark252@2.8.0

Generated by cargo-workspaces
2024-07-16 13:57:56 +00:00
ChickenLover
a8fa05d0e3 Feat/roman/hash docs (#556)
## Describe the changes

This PR...

## Linked Issues

Resolves #

---------

Co-authored-by: Jeremy Felder <jeremy.felder1@gmail.com>
2024-07-16 16:39:35 +03:00
Vlad
877018c84c more go fmt 2024-07-15 16:55:40 +02:00
Vlad
91ac666e06 Merge branch 'feat/vlad/refactor-from-affine' of github.com:ingonyama-zk/icicle into feat/vlad/refactor-from-affine 2024-07-15 16:48:25 +02:00
Vlad
46e6c20440 go fmt 2024-07-15 16:47:40 +02:00
Vlad
e4eda8938d go fmt 2024-07-05 21:29:44 +02:00
Vlad
fb707d5350 Merge branch 'main' into feat/vlad/refactor-from-affine 2024-07-05 15:40:34 +02:00
Vlad
6336e74d5a refactor from_affine with C link 2024-07-04 11:03:58 +02:00
Vlad
279cdc66e0 generated go files 2024-07-03 10:41:32 +02:00
Vlad
81644fc28c use zero method of projective in toProjective
Co-authored-by: Jeremy Felder <jeremy.felder1@gmail.com>
2024-07-03 10:37:02 +02:00
Vlad
17732ea013 use zero method of projective in fromAffine
Co-authored-by: Jeremy Felder <jeremy.felder1@gmail.com>
2024-07-03 10:36:14 +02:00
Vlad
9e057c835d fixed to_projective in rust 2024-07-03 09:18:41 +02:00
Vlad
f08b5bb49d fixed fromAffine and toProj in golang 2024-07-03 09:07:43 +02:00
59 changed files with 650 additions and 628 deletions

View File

@@ -1,6 +1,6 @@
# Contributor's Guide
We welcome all contributions with open arms. At Ingonyama we take a village approach, believing it takes many hands and minds to build a ecosystem.
We welcome all contributions with open arms. At Ingonyama we take a village approach, believing it takes many hands and minds to build an ecosystem.
## Contributing to ICICLE
@@ -14,9 +14,9 @@ We welcome all contributions with open arms. At Ingonyama we take a village appr
When opening a [pull request](https://github.com/ingonyama-zk/icicle/pulls) please keep the following in mind.
- `Clear Purpose` - The pull request should solve a single issue and be clean of any unrelated changes.
- `Clear description` - If the pull request is for a new feature describe what you built, why you added it and how its best that we test it. For bug fixes please describe the issue and the solution.
- `Clear description` - If the pull request is for a new feature describe what you built, why you added it and how it's best that we test it. For bug fixes please describe the issue and the solution.
- `Consistent style` - Rust and Golang code should be linted by the official linters (golang fmt and rust fmt) and maintain a proper style. For CUDA and C++ code we use [`clang-format`](https://github.com/ingonyama-zk/icicle/blob/main/.clang-format), [here](https://github.com/ingonyama-zk/icicle/blob/605c25f9d22135c54ac49683b710fe2ce06e2300/.github/workflows/main-format.yml#L46) you can see how we run it.
- `Minimal Tests` - please add test which cover basic usage of your changes .
- `Minimal Tests` - please add test which cover basic usage of your changes.
## Questions?

View File

@@ -47,7 +47,7 @@ type NTTConfig[T any] struct {
- **`areInputsOnDevice`**: Indicates if input scalars are located on the device.
- **`areOutputsOnDevice`**: Indicates if results are stored on the device.
- **`IsAsync`**: Controls whether the NTT operation runs asynchronously.
- **`NttAlgorithm`**: Explicitly select the NTT algorithm. ECNTT supports running on `Radix2` algoruithm.
- **`NttAlgorithm`**: Explicitly select the NTT algorithm. ECNTT supports running on `Radix2` algorithm.
### Default Configuration

View File

@@ -139,7 +139,7 @@ cfg.Ctx.IsBigTriangle = true
Toggling between MSM modes occurs automatically based on the number of results you are expecting from the `MSM` function.
The number of results is interpreted from the size of `var out core.DeviceSlice`. Thus its important when allocating memory for `var out core.DeviceSlice` to make sure that you are allocating `<number of results> X <size of a single point>`.
The number of results is interpreted from the size of `var out core.DeviceSlice`. Thus it's important when allocating memory for `var out core.DeviceSlice` to make sure that you are allocating `<number of results> X <size of a single point>`.
```go
...
@@ -168,7 +168,7 @@ import (
)
```
This package include `G2Projective` and `G2Affine` points as well as a `G2Msm` method.
This package includes `G2Projective` and `G2Affine` points as well as a `G2Msm` method.
```go
package main

View File

@@ -171,7 +171,7 @@ Polynomial& add_monomial_inplace(Coeff monomial_coeff, uint64_t monomial = 0);
Polynomial& sub_monomial_inplace(Coeff monomial_coeff, uint64_t monomial = 0);
```
The ability to add or subtract monomials directly and in-place is an efficient way to manipualte polynomials.
The ability to add or subtract monomials directly and in-place is an efficient way to manipulate polynomials.
Example:

View File

@@ -12,11 +12,68 @@ At its core, Keccak consists of a permutation function operating on a state arra
- **Chi:** This step applies a nonlinear mixing operation to each lane of the state array.
- **Iota:** This step introduces a round constant to the state array.
## Keccak vs Sha3
There exists a [confusion](https://www.cybertest.com/blog/keccak-vs-sha3) between what is called `Keccak` and `Sha3`. In ICICLE we support both. `Keccak256` relates to the old hash function used in Ethereum, and `Sha3-256` relates to the modern hash function.
## Using Keccak
ICICLE Keccak supports batch hashing, which can be utilized for constructing a merkle tree.
ICICLE Keccak supports batch hashing, which can be utilized for constructing a merkle tree or running multiple hashes in parallel.
### Supported Bindings
- [Golang](https://github.com/ingonyama-zk/icicle/tree/main/wrappers/golang/hash/keccak)
- [Rust](https://github.com/ingonyama-zk/icicle/tree/main/wrappers/rust/icicle-hash)
- [Rust](https://github.com/ingonyama-zk/icicle/tree/main/wrappers/rust/icicle-hash)
### Example usage
This is an example of running 1024 Keccak-256 hashes in parallel, where input strings are of size 136 bytes:
```rust
use icicle_core::hash::HashConfig;
use icicle_cuda_runtime::memory::HostSlice;
use icicle_hash::keccak::keccak256;
let config = HashConfig::default();
let input_block_len = 136;
let number_of_hashes = 1024;
let preimages = vec![1u8; number_of_hashes * input_block_len];
let mut digests = vec![0u8; number_of_hashes * 32];
let preimages_slice = HostSlice::from_slice(&preimages);
let digests_slice = HostSlice::from_mut_slice(&mut digests);
keccak256(
preimages_slice,
input_block_len as u32,
number_of_hashes as u32,
digests_slice,
&config,
)
.unwrap();
```
### Merkle Tree
You can build a keccak merkle tree using the corresponding functions:
```rust
use icicle_core::tree::{merkle_tree_digests_len, TreeBuilderConfig};
use icicle_cuda_runtime::memory::HostSlice;
use icicle_hash::keccak::build_keccak256_merkle_tree;
let mut config = TreeBuilderConfig::default();
config.arity = 2;
let height = 22;
let input_block_len = 136;
let leaves = vec![1u8; (1 << height) * input_block_len];
let mut digests = vec![0u64; merkle_tree_digests_len((height + 1) as u32, 2, 1)];
let leaves_slice = HostSlice::from_slice(&leaves);
let digests_slice = HostSlice::from_mut_slice(&mut digests);
build_keccak256_merkle_tree(leaves_slice, digests_slice, height, input_block_len, &config).unwrap();
```
In the example above, a binary tree of height 22 is being built. Each leaf is considered to be a 136 byte long array. The leaves and digests are aligned in a flat array. You can also use keccak512 in `build_keccak512_merkle_tree` function.
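To put the sizes in that example into perspective, here is a small back-of-the-envelope sketch in Rust. It only works out the tree geometry; the exact digests length comes from `merkle_tree_digests_len` and also depends on the digest size and how many rows are kept.
```rust
fn main() {
    let height = 22u32;             // as in the example above: 2^22 leaves
    let arity: u64 = 2;             // binary tree
    let input_block_len: u64 = 136; // bytes per leaf

    let num_leaves = arity.pow(height);              // 4,194,304 leaves
    let leaves_bytes = num_leaves * input_block_len; // flat leaves array, ~570 MB

    // Total node count of a full binary tree with that many leaves,
    // if every row were kept.
    let total_nodes = 2 * num_leaves - 1;

    println!("{num_leaves} leaves, {leaves_bytes} bytes of leaves, {total_nodes} nodes");
}
```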

View File

@@ -1,6 +1,6 @@
# MSM - Multi scalar multiplication
MSM stands for Multi scalar multiplication, its defined as:
MSM stands for Multi scalar multiplication, it's defined as:
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>M</mi>
@@ -43,7 +43,7 @@ $a_0, \ldots, a_n$ - Scalars
$MSM(a, G) \in G$ - a single EC (elliptic curve) point
In words, MSM is the sum of scalar and EC point multiplications. We can see from this definition that the core operations occurring are Modular Multiplication and Elliptic curve point addition. Its obvious that multiplication can be computed in parallel and then the products summed, making MSM inherently parallelizable.
In words, MSM is the sum of scalar and EC point multiplications. We can see from this definition that the core operations occurring are Modular Multiplication and Elliptic curve point addition. It's obvious that multiplication can be computed in parallel and then the products summed, making MSM inherently parallelizable.
Accelerating MSM is crucial to a ZK protocol's performance due to the [large percent of run time](https://hackmd.io/@0xMonia/SkQ6-oRz3#Hardware-acceleration-in-action) they take when generating proofs.
@@ -131,7 +131,7 @@ Large buckets exist in two cases:
2. When `c` does not divide the scalar bit-size.
`large_bucket_factor` that is equal to 10 yields good results for most cases, but it's best to fine tune this parameter per `c` and per scalar distribution.
The two most important parameters for performance are `c` and the `precompute_factor`. They affect the number of EC additions as well as the memory size. When the points are not known in advance we cannot use precomputation. In this case the best `c` value is usually around $log_2(msmSize) - 4$. However, in most protocols the points are known in advanced and precomputation can be used unless limited by memory. Usually it's best to use maximum precomputation (such that we end up with only a single bucket module) combined we a `c` value around $log_2(msmSize) - 1$.
The two most important parameters for performance are `c` and the `precompute_factor`. They affect the number of EC additions as well as the memory size. When the points are not known in advance we cannot use precomputation. In this case the best `c` value is usually around $log_2(msmSize) - 4$. However, in most protocols the points are known in advance and precomputation can be used unless limited by memory. Usually it's best to use maximum precomputation (such that we end up with only a single bucket module) combined with a `c` value around $log_2(msmSize) - 1$.
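As a quick illustration of those rules of thumb (a standalone sketch; real tuning still depends on the scalar distribution and available memory), the suggested `c` values follow directly from the MSM size:
```rust
fn main() {
    let msm_size: u32 = 1 << 20; // e.g. 2^20 points
    let log_size = 32 - (msm_size - 1).leading_zeros(); // ceil(log2(msm_size))

    let c_no_precompute = log_size.saturating_sub(4);  // points not known in advance
    let c_max_precompute = log_size.saturating_sub(1); // maximum precomputation

    println!("c ~ {c_no_precompute} without precompute, ~ {c_max_precompute} with max precompute");
}
```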
## Memory usage estimation

View File

@@ -56,7 +56,7 @@ Choosing an algorithm is heavily dependent on your use case. For example Cooley-
NTT also supports two different modes `Batch NTT` and `Single NTT`
Deciding weather to use `batch NTT` vs `single NTT` is highly dependent on your application and use case.
Deciding whether to use `batch NTT` vs `single NTT` is highly dependent on your application and use case.
#### Single NTT

View File

@@ -1,6 +1,6 @@
# Poseidon
[Poseidon](https://eprint.iacr.org/2019/458.pdf) is a popular hash in the ZK ecosystem primarily because its optimized to work over large prime fields, a common setting for ZK proofs, thereby minimizing the number of multiplicative operations required.
[Poseidon](https://eprint.iacr.org/2019/458.pdf) is a popular hash in the ZK ecosystem primarily because it's optimized to work over large prime fields, a common setting for ZK proofs, thereby minimizing the number of multiplicative operations required.
Poseidon has also been specifically designed to be efficient when implemented within ZK circuits, Poseidon uses far less constraints compared to other hash functions like Keccak or SHA-256 in the context of ZK circuits.
@@ -42,7 +42,7 @@ To generate a secure hash output, the algorithm goes through a series of "full r
**Linear Transformation and Round Constants:** A linear transformation is performed and round constants are added. The linear transformation in partial rounds can be designed to be less computationally intensive (this is done by using a sparse matrix) than in full rounds, further optimizing the function's efficiency.
The user of Poseidon can often choose how many partial or full rounds he wishes to apply; more full rounds will increase security but degrade performance. The choice and balance is highly dependent on the use case.
The user of Poseidon can often choose how many partial or full rounds he wishes to apply; more full rounds will increase security but degrade performance. The choice and balance are highly dependent on the use case.
## Using Poseidon
@@ -53,13 +53,14 @@ So for Poseidon of arity 2 and input of size 1024 * 2, we would expect 1024 elem
### Supported Bindings
[`Go`](https://github.com/ingonyama-zk/icicle/blob/main/wrappers/golang/curves/bn254/poseidon/poseidon.go)
[`Rust`](https://github.com/ingonyama-zk/icicle/tree/main/wrappers/rust/icicle-core/src/poseidon)
### Constants
Poseidon is extremely customizable and using different constants will produce different hashes, security levels and performance results.
We support pre-calculated and optimized constants for each of the [supported curves](../core#supported-curves-and-operations).The constants can be found [here](https://github.com/ingonyama-zk/icicle/tree/main/icicle/include/poseidon/constants) and are labeled clearly per curve `<curve_name>_poseidon.h`.
We support pre-calculated and optimized constants for each of the [supported curves](../core#supported-curves-and-operations). The constants can be found [here](https://github.com/ingonyama-zk/icicle/tree/main/icicle/include/poseidon/constants) and are labeled clearly per curve `<curve_name>_poseidon.h`.
If you wish to generate your own constants you can use our python script which can be found [here](https://github.com/ingonyama-zk/icicle/tree/main/icicle/include/poseidon/constants/generate_parameters.py).
@@ -91,8 +92,6 @@ primitive_element = 7 # bls12-381
# primitive_element = 15 # bw6-761
```
We only support `alpha = 5` so if you want to use another alpha for S-box please reach out on discord or open a github issue.
### Rust API
This is the most basic way to use the Poseidon API.
@@ -101,71 +100,58 @@ This is the most basic way to use the Poseidon API.
let test_size = 1 << 10;
let arity = 2u32;
let ctx = get_default_device_context();
let constants = load_optimized_poseidon_constants::<F>(arity, &ctx).unwrap();
let config = PoseidonConfig::default();
let poseidon = Poseidon::load(arity, &ctx).unwrap();
let config = HashConfig::default();
let inputs = vec![F::one(); test_size * arity as usize];
let outputs = vec![F::zero(); test_size];
let mut input_slice = HostOrDeviceSlice::on_host(inputs);
let mut output_slice = HostOrDeviceSlice::on_host(outputs);
poseidon_hash_many::<F>(
poseidon.hash_many::<F>(
&mut input_slice,
&mut output_slice,
test_size as u32,
arity as u32,
&constants,
1, // Output length
&config,
)
.unwrap();
```
The `PoseidonConfig::default()` can be modified, by default the inputs and outputs are set to be on `Host` for example.
The `HashConfig` can be modified, by default the inputs and outputs are set to be on `Host` for example.
```rust
impl<'a> Default for PoseidonConfig<'a> {
impl<'a> Default for HashConfig<'a> {
fn default() -> Self {
let ctx = get_default_device_context();
Self {
ctx,
are_inputs_on_device: false,
are_outputs_on_device: false,
input_is_a_state: false,
aligned: false,
loop_state: false,
is_async: false,
}
}
}
```
In the example above `load_optimized_poseidon_constants::<F>(arity, &ctx).unwrap();` is used which will load the correct constants based on arity and curve. Its possible to [generate](#constants) your own constants and load them.
In the example above `Poseidon::load(arity, &ctx).unwrap();` is used which will load the correct constants based on arity and curve. It's possible to [generate](#constants) your own constants and load them.
```rust
let ctx = get_default_device_context();
let cargo_manifest_dir = env!("CARGO_MANIFEST_DIR");
let constants_file = PathBuf::from(cargo_manifest_dir)
.join("tests")
.join(format!("{}_constants.bin", field_prefix));
let mut constants_buf = vec![];
File::open(constants_file)
.unwrap()
.read_to_end(&mut constants_buf)
.unwrap();
let mut custom_constants = vec![];
for chunk in constants_buf.chunks(field_bytes) {
custom_constants.push(F::from_bytes_le(chunk));
}
let custom_constants = create_optimized_poseidon_constants::<F>(
arity as u32,
&ctx,
full_rounds_half,
partial_rounds,
&mut custom_constants,
)
.unwrap();
let custom_poseidon = Poseidon::new(
arity, // The arity of poseidon hash. The width will be equal to arity + 1
alpha, // The S-box power
full_rounds_half,
partial_rounds,
round_constants,
mds_matrix,
non_sparse_matrix,
sparse_matrices,
domain_tag,
ctx,
)
.unwrap();
```
## The Tree Builder
@@ -175,21 +161,34 @@ The tree builder allows you to build Merkle trees using Poseidon.
You can define both the tree's `height` and its `arity`. The tree `height` determines the number of layers in the tree, including the root and the leaf layer. The `arity` determines how many children each internal node can have.
```rust
let height = 20;
let arity = 2;
let leaves = vec![F::one(); 1 << (height - 1)];
let mut digests = vec![F::zero(); merkle_tree_digests_len(height, arity)];
let mut leaves_slice = HostOrDeviceSlice::on_host(leaves);
let ctx = get_default_device_context();
let constants = load_optimized_poseidon_constants::<F>(arity, &ctx).unwrap()
use icicle_bn254::tree::Bn254TreeBuilder;
use icicle_bn254::poseidon::Poseidon;
let mut config = TreeBuilderConfig::default();
config.keep_rows = 1;
build_poseidon_merkle_tree::<F>(&mut leaves_slice, &mut digests, height, arity, &constants, &config).unwrap();
let arity = 2;
config.arity = arity as u32;
let input_block_len = arity;
let leaves = vec![F::one(); (1 << height) * arity];
let mut digests = vec![F::zero(); merkle_tree_digests_len((height + 1) as u32, arity as u32, 1)];
println!("Root: {:?}", digests[0..1][0]);
let leaves_slice = HostSlice::from_slice(&leaves);
let digests_slice = HostSlice::from_mut_slice(&mut digests);
let ctx = device_context::DeviceContext::default();
let hash = Poseidon::load(2, &ctx).unwrap();
let mut config = TreeBuilderConfig::default();
config.keep_rows = 5;
Bn254TreeBuilder::build_merkle_tree(
leaves_slice,
digests_slice,
height,
input_block_len,
&hash,
&hash,
&config,
)
.unwrap();
```
Similar to Poseidon, you can also configure the Tree Builder `TreeBuilderConfig::default()`
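For reference, a minimal sketch of adjusting that configuration, using only the fields that already appear in the example above:
```rust
use icicle_core::tree::TreeBuilderConfig;

let mut config = TreeBuilderConfig::default();
config.arity = 2;     // tree arity used in the example above
config.keep_rows = 5; // limits how many digest rows are written out, as above
```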

View File

@@ -0,0 +1,88 @@
# Poseidon2
[Poseidon2](https://eprint.iacr.org/2023/323) is a recently released optimized version of Poseidon1. The two versions differ in two crucial points. First, Poseidon is a sponge hash function, while Poseidon2 can be either a sponge or a compression function depending on the use case. Secondly, Poseidon2 is instantiated by new and more efficient linear layers with respect to Poseidon. These changes decrease the number of multiplications in the linear layer by up to 90% and the number of constraints in Plonk circuits by up to 70%. This makes Poseidon2 currently the fastest arithmetization-oriented hash function without lookups.
## Using Poseidon2
ICICLE Poseidon2 is implemented for GPU and parallelization is performed for each state.
We calculate multiple hash-sums over multiple pre-images in parallel, rather than going block by block over the input vector.
For example, for Poseidon2 of width 16, input rate 8, output elements 8 and input of size 1024 * 8, we would expect 1024 * 8 elements of output. Which means each input block would be of size 8, resulting in 1024 Poseidon2 hashes being performed.
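As a sanity check of that arithmetic (a standalone sketch, unrelated to the Poseidon2 API itself):
```rust
fn main() {
    let rate = 8usize;       // input elements absorbed per hash
    let output_len = 8usize; // output elements produced per hash
    let total_input = 1024 * rate;

    let number_of_hashes = total_input / rate;        // 1024 independent hashes
    let total_output = number_of_hashes * output_len; // 1024 * 8 output elements

    assert_eq!(number_of_hashes, 1024);
    assert_eq!(total_output, 1024 * 8);
}
```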
### Supported Bindings
[`Rust`](https://github.com/ingonyama-zk/icicle/tree/main/wrappers/rust/icicle-core/src/poseidon2)
### Constants
Poseidon2 is also extremely customizable and using different constants will produce different hashes, security levels and performance results.
We support pre-calculated constants for each of the [supported curves](../core#supported-curves-and-operations). The constants can be found [here](https://github.com/ingonyama-zk/icicle/tree/main/icicle/include/poseidon2/constants) and are labeled clearly per curve `<curve_name>_poseidon2.h`.
You can also use your own set of constants as shown [here](https://github.com/ingonyama-zk/icicle/blob/main/wrappers/rust/icicle-fields/icicle-babybear/src/poseidon2/mod.rs#L290)
### Rust API
This is the most basic way to use the Poseidon2 API.
```rust
let test_size = 1 << 10;
let width = 16;
let rate = 8;
let ctx = get_default_device_context();
let poseidon = Poseidon2::load(width, rate, MdsType::Default, DiffusionStrategy::Default, &ctx).unwrap();
let config = HashConfig::default();
let inputs = vec![F::one(); test_size * rate as usize];
let outputs = vec![F::zero(); test_size];
let mut input_slice = HostOrDeviceSlice::on_host(inputs);
let mut output_slice = HostOrDeviceSlice::on_host(outputs);
poseidon.hash_many::<F>(
&mut input_slice,
&mut output_slice,
test_size as u32,
rate as u32,
8, // Output length
&config,
)
.unwrap();
```
In the example above `Poseidon2::load(width, rate, MdsType::Default, DiffusionStrategy::Default, &ctx).unwrap();` is used to load the correct constants based on width and curve. Here, the default MDS matrices and diffusion are used. If you want to get a Plonky3 compliant version, set them to `MdsType::Plonky` and `DiffusionStrategy::Montgomery` respectively.
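For example, a minimal sketch of loading the Plonky3-compatible variant, reusing the names from the snippet above (setup and imports as in that snippet):
```rust
let ctx = get_default_device_context();
let poseidon_plonky3 =
    Poseidon2::load(width, rate, MdsType::Plonky, DiffusionStrategy::Montgomery, &ctx).unwrap();
```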
## The Tree Builder
Similar to Poseidon1, you can use Poseidon2 in a tree builder.
```rust
use icicle_bn254::tree::Bn254TreeBuilder;
use icicle_bn254::poseidon2::Poseidon2;
let mut config = TreeBuilderConfig::default();
let arity = 2;
config.arity = arity as u32;
let input_block_len = arity;
let leaves = vec![F::one(); (1 << height) * arity];
let mut digests = vec![F::zero(); merkle_tree_digests_len((height + 1) as u32, arity as u32, 1)];
let leaves_slice = HostSlice::from_slice(&leaves);
let digests_slice = HostSlice::from_mut_slice(&mut digests);
let ctx = device_context::DeviceContext::default();
let hash = Poseidon2::load(arity, arity, MdsType::Default, DiffusionStrategy::Default, &ctx).unwrap();
let mut config = TreeBuilderConfig::default();
config.keep_rows = 5;
Bn254TreeBuilder::build_merkle_tree(
leaves_slice,
digests_slice,
height,
input_block_len,
&hash,
&hash,
&config,
)
.unwrap();
```

docs/package-lock.json generated
View File

@@ -3680,6 +3680,8 @@
"version": "8.12.0",
"resolved": "https://registry.npmjs.org/ajv/-/ajv-8.12.0.tgz",
"integrity": "sha512-sRu1kpcO9yLtYxBKvqfTeh9KzZEwO3STyX1HT+4CaDzC6HpTGYhIhPIzj9XuKU7KYDwnaeh5hcOwjy1QuJzBPA==",
"optional": true,
"peer": true,
"dependencies": {
"fast-deep-equal": "^3.1.1",
"json-schema-traverse": "^1.0.0",
@@ -3694,7 +3696,9 @@
"node_modules/ajv-formats/node_modules/json-schema-traverse": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/json-schema-traverse/-/json-schema-traverse-1.0.0.tgz",
"integrity": "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug=="
"integrity": "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug==",
"optional": true,
"peer": true
},
"node_modules/ajv-keywords": {
"version": "3.5.2",
@@ -16340,14 +16344,13 @@
"version": "2.1.1",
"resolved": "https://registry.npmjs.org/ajv-formats/-/ajv-formats-2.1.1.tgz",
"integrity": "sha512-Wx0Kx52hxE7C18hkMEggYlEifqWZtYaRgouJor+WMdPnQyEK13vgEWyVNup7SoeeoLMsr4kf5h6dOW11I15MUA==",
"requires": {
"ajv": "^8.0.0"
},
"requires": {},
"dependencies": {
"ajv": {
"version": "8.12.0",
"resolved": "https://registry.npmjs.org/ajv/-/ajv-8.12.0.tgz",
"version": "https://registry.npmjs.org/ajv/-/ajv-8.12.0.tgz",
"integrity": "sha512-sRu1kpcO9yLtYxBKvqfTeh9KzZEwO3STyX1HT+4CaDzC6HpTGYhIhPIzj9XuKU7KYDwnaeh5hcOwjy1QuJzBPA==",
"optional": true,
"peer": true,
"requires": {
"fast-deep-equal": "^3.1.1",
"json-schema-traverse": "^1.0.0",
@@ -16358,7 +16361,9 @@
"json-schema-traverse": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/json-schema-traverse/-/json-schema-traverse-1.0.0.tgz",
"integrity": "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug=="
"integrity": "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug==",
"optional": true,
"peer": true
}
}
},

View File

@@ -53,6 +53,11 @@ module.exports = {
label: "Poseidon Hash",
id: "icicle/primitives/poseidon",
},
{
type: "doc",
label: "Poseidon2 Hash",
id: "icicle/primitives/poseidon2",
},
],
},
{
@@ -186,11 +191,6 @@ module.exports = {
},
]
},
{
type: "doc",
label: "ZK Containers",
id: "ZKContainers",
},
{
type: "doc",
label: "Ingonyama Grant program",

View File

@@ -19,5 +19,5 @@ add_executable(
example.cu
)
target_include_directories(example PRIVATE "../../../icicle/include")
target_link_libraries(example ${CMAKE_SOURCE_DIR}/build/icicle/lib/libingo_curve_bn254.a ${CMAKE_SOURCE_DIR}/build/icicle/lib/libingo_curve_bls12_377.a)
target_link_libraries(example ${CMAKE_SOURCE_DIR}/build/icicle/lib/libingo_curve_bn254.a)
set_target_properties(example PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

View File

@@ -1,9 +1,5 @@
# Icicle example: Muli-Scalar Multiplication (MSM)
## Best-Practices
We recommend to run our examples in [ZK-containers](../../ZK-containers.md) to save your time and mental energy.
## Key-Takeaway
`Icicle` provides CUDA C++ template function `MSM` to accelerate [Multi-Scalar Multiplication](https://github.com/ingonyama-zk/ingopedia/blob/master/src/msm.md).

View File

@@ -10,9 +10,6 @@ mkdir -p build/icicle
cmake -S ../../../icicle/ -B build/icicle -DCMAKE_BUILD_TYPE=Release -DCURVE=bn254 -DG2=ON
cmake --build build/icicle
cmake -S ../../../icicle/ -B build/icicle -DCMAKE_BUILD_TYPE=Release -DCURVE=bls12_377 -DG2=ON
cmake --build build/icicle
# Configure and build the example application
cmake -S . -B build/example
cmake --build build/example

View File

@@ -3,23 +3,9 @@
#include <iomanip>
#include "api/bn254.h"
#include "api/bls12_377.h"
using namespace bn254;
// using namespace bn254;
typedef bn254::scalar_t scalar_bn254;
typedef bn254::affine_t affine_bn254;
typedef bn254::g2_affine_t g2_affine_bn254;
typedef bn254::projective_t projective_bn254;
typedef bn254::g2_projective_t g2_projective_bn254;
typedef bls12_377::scalar_t scalar_bls12377;
typedef bls12_377::affine_t affine_bls12377;
typedef bls12_377::g2_affine_t g2_affine_bls12377;
typedef bls12_377::projective_t projective_bls12377;
typedef bls12_377::g2_projective_t g2_projective_bls12377;
int msm_bn254(int argc, char* argv[])
int main(int argc, char* argv[])
{
std::cout << "Icicle example: Muli-Scalar Multiplication (MSM)" << std::endl;
std::cout << "Example parameters" << std::endl;
@@ -32,11 +18,11 @@ int msm_bn254(int argc, char* argv[])
std::cout << "Part I: use G1 points" << std::endl;
std::cout << "Generating random inputs on-host" << std::endl;
scalar_bn254* scalars = new scalar_bn254[N];
affine_bn254* points = new affine_bn254[N];
projective_bn254 result;
scalar_bn254::rand_host_many(scalars, N);
projective_bn254::rand_host_many_affine(points, N);
scalar_t* scalars = new scalar_t[N];
affine_t* points = new affine_t[N];
projective_t result;
scalar_t::rand_host_many(scalars, N);
projective_t::rand_host_many_affine(points, N);
std::cout << "Using default MSM configuration with on-host inputs" << std::endl;
device_context::DeviceContext ctx = device_context::get_default_device_context();
@@ -62,17 +48,17 @@ int msm_bn254(int argc, char* argv[])
cudaStream_t stream = config.ctx.stream;
// Execute the MSM kernel
bn254_msm_cuda(scalars, points, msm_size, config, &result);
std::cout << projective_bn254::to_affine(result) << std::endl;
std::cout << projective_t::to_affine(result) << std::endl;
std::cout << "Copying inputs on-device" << std::endl;
scalar_bn254* scalars_d;
affine_bn254* points_d;
projective_bn254* result_d;
cudaMalloc(&scalars_d, sizeof(scalar_bn254) * N);
cudaMalloc(&points_d, sizeof(affine_bn254) * N);
cudaMalloc(&result_d, sizeof(projective_bn254));
cudaMemcpy(scalars_d, scalars, sizeof(scalar_bn254) * N, cudaMemcpyHostToDevice);
cudaMemcpy(points_d, points, sizeof(affine_bn254) * N, cudaMemcpyHostToDevice);
scalar_t* scalars_d;
affine_t* points_d;
projective_t* result_d;
cudaMalloc(&scalars_d, sizeof(scalar_t) * N);
cudaMalloc(&points_d, sizeof(affine_t) * N);
cudaMalloc(&result_d, sizeof(projective_t));
cudaMemcpy(scalars_d, scalars, sizeof(scalar_t) * N, cudaMemcpyHostToDevice);
cudaMemcpy(points_d, points, sizeof(affine_t) * N, cudaMemcpyHostToDevice);
std::cout << "Reconfiguring MSM to use on-device inputs" << std::endl;
config.are_results_on_device = true;
@@ -84,9 +70,9 @@ int msm_bn254(int argc, char* argv[])
bn254_msm_cuda(scalars_d, points_d, msm_size, config, result_d);
// Copy the result back to the host
cudaMemcpy(&result, result_d, sizeof(projective_bn254), cudaMemcpyDeviceToHost);
cudaMemcpy(&result, result_d, sizeof(projective_t), cudaMemcpyDeviceToHost);
// Print the result
std::cout << projective_bn254::to_affine(result) << std::endl;
std::cout << projective_t::to_affine(result) << std::endl;
// Free the device memory
cudaFree(scalars_d);
cudaFree(points_d);
@@ -98,25 +84,25 @@ int msm_bn254(int argc, char* argv[])
std::cout << "Generating random inputs on-host" << std::endl;
// use the same scalars
g2_affine_bn254* g2_points = new g2_affine_bn254[N];
g2_projective_bn254::rand_host_many_affine(g2_points, N);
g2_affine_t* g2_points = new g2_affine_t[N];
g2_projective_t::rand_host_many_affine(g2_points, N);
std::cout << "Reconfiguring MSM to use on-host inputs" << std::endl;
config.are_results_on_device = false;
config.are_scalars_on_device = false;
config.are_points_on_device = false;
g2_projective_bn254 g2_result;
g2_projective_t g2_result;
bn254_g2_msm_cuda(scalars, g2_points, msm_size, config, &g2_result);
std::cout << g2_projective_bn254::to_affine(g2_result) << std::endl;
std::cout << g2_projective_t::to_affine(g2_result) << std::endl;
std::cout << "Copying inputs on-device" << std::endl;
g2_affine_bn254* g2_points_d;
g2_projective_bn254* g2_result_d;
cudaMalloc(&scalars_d, sizeof(scalar_bn254) * N);
cudaMalloc(&g2_points_d, sizeof(g2_affine_bn254) * N);
cudaMalloc(&g2_result_d, sizeof(g2_projective_bn254));
cudaMemcpy(scalars_d, scalars, sizeof(scalar_bn254) * N, cudaMemcpyHostToDevice);
cudaMemcpy(g2_points_d, g2_points, sizeof(g2_affine_bn254) * N, cudaMemcpyHostToDevice);
g2_affine_t* g2_points_d;
g2_projective_t* g2_result_d;
cudaMalloc(&scalars_d, sizeof(scalar_t) * N);
cudaMalloc(&g2_points_d, sizeof(g2_affine_t) * N);
cudaMalloc(&g2_result_d, sizeof(g2_projective_t));
cudaMemcpy(scalars_d, scalars, sizeof(scalar_t) * N, cudaMemcpyHostToDevice);
cudaMemcpy(g2_points_d, g2_points, sizeof(g2_affine_t) * N, cudaMemcpyHostToDevice);
std::cout << "Reconfiguring MSM to use on-device inputs" << std::endl;
config.are_results_on_device = true;
@@ -125,140 +111,14 @@ int msm_bn254(int argc, char* argv[])
std::cout << "Running MSM kernel with on-device inputs" << std::endl;
bn254_g2_msm_cuda(scalars_d, g2_points_d, msm_size, config, g2_result_d);
cudaMemcpy(&g2_result, g2_result_d, sizeof(g2_projective_bn254), cudaMemcpyDeviceToHost);
std::cout << g2_projective_bn254::to_affine(g2_result) << std::endl;
cudaMemcpy(&g2_result, g2_result_d, sizeof(g2_projective_t), cudaMemcpyDeviceToHost);
std::cout << g2_projective_t::to_affine(g2_result) << std::endl;
cudaFree(scalars_d);
cudaFree(g2_points_d);
cudaFree(g2_result_d);
delete[] g2_points;
delete[] scalars;
return 0;
}
int msm_bls12_377(int argc, char* argv[])
{
std::cout << "Icicle example: Muli-Scalar Multiplication (MSM)" << std::endl;
std::cout << "Example parameters" << std::endl;
int batch_size = 1;
std::cout << "Batch size: " << batch_size << std::endl;
unsigned msm_size = 1048576;
std::cout << "MSM size: " << msm_size << std::endl;
int N = batch_size * msm_size;
std::cout << "Part I: use G1 points" << std::endl;
std::cout << "Generating random inputs on-host" << std::endl;
scalar_bls12377* scalars = new scalar_bls12377[N];
affine_bls12377* points = new affine_bls12377[N];
projective_bls12377 result;
scalar_bls12377::rand_host_many(scalars, N);
projective_bls12377::rand_host_many_affine(points, N);
std::cout << "Using default MSM configuration with on-host inputs" << std::endl;
device_context::DeviceContext ctx = device_context::get_default_device_context();
msm::MSMConfig config = {
ctx, // ctx
0, // points_size
1, // precompute_factor
0, // c
0, // bitsize
10, // large_bucket_factor
1, // batch_size
false, // are_scalars_on_device
false, // are_scalars_montgomery_form
false, // are_points_on_device
false, // are_points_montgomery_form
false, // are_results_on_device
false, // is_big_triangle
false, // is_async
};
config.batch_size = batch_size;
std::cout << "Running MSM kernel with on-host inputs" << std::endl;
cudaStream_t stream = config.ctx.stream;
cudaStreamCreate(&stream);
// Execute the MSM kernel
bls12_377_msm_cuda(scalars, points, msm_size, config, &result);
std::cout << projective_bls12377::to_affine(result) << std::endl;
std::cout << "Copying inputs on-device" << std::endl;
scalar_bls12377* scalars_d_bls;
affine_bls12377* points_d_bls;
projective_bls12377* result_d_bls;
cudaMalloc(&scalars_d_bls, sizeof(scalar_bls12377) * N);
cudaMalloc(&points_d_bls, sizeof(affine_bls12377) * N);
cudaMalloc(&result_d_bls, sizeof(projective_bls12377));
cudaMemcpy(scalars_d_bls, scalars, sizeof(scalar_bls12377) * N, cudaMemcpyHostToDevice);
cudaMemcpy(points_d_bls, points, sizeof(affine_bls12377) * N, cudaMemcpyHostToDevice);
std::cout << "Reconfiguring MSM to use on-device inputs" << std::endl;
config.are_results_on_device = true;
config.are_scalars_on_device = true;
config.are_points_on_device = true;
std::cout << "Running MSM kernel with on-device inputs" << std::endl;
// Execute the MSM kernel
bls12_377_msm_cuda(scalars_d_bls, points_d_bls, msm_size, config, result_d_bls);
// Copy the result back to the host
cudaMemcpy(&result, result_d_bls, sizeof(projective_bls12377), cudaMemcpyDeviceToHost);
// Print the result
std::cout << projective_bls12377::to_affine(result) << std::endl;
// Free the device memory
cudaFree(scalars_d_bls);
cudaFree(points_d_bls);
cudaFree(result_d_bls);
// Free the host memory, keep scalars for G2 example
delete[] points;
std::cout << "Part II: use G2 points" << std::endl;
std::cout << "Generating random inputs on-host" << std::endl;
// use the same scalars
g2_affine_bls12377* g2_points = new g2_affine_bls12377[N];
g2_projective_bls12377::rand_host_many_affine(g2_points, N);
std::cout << "Reconfiguring MSM to use on-host inputs" << std::endl;
config.are_results_on_device = false;
config.are_scalars_on_device = false;
config.are_points_on_device = false;
g2_projective_bls12377 g2_result;
bls12_377_g2_msm_cuda(scalars, g2_points, msm_size, config, &g2_result);
std::cout << g2_projective_bls12377::to_affine(g2_result) << std::endl;
std::cout << "Copying inputs on-device" << std::endl;
g2_affine_bls12377* g2_points_d;
g2_projective_bls12377* g2_result_d;
cudaMalloc(&scalars_d_bls, sizeof(scalar_bls12377) * N);
cudaMalloc(&g2_points_d, sizeof(g2_affine_bls12377) * N);
cudaMalloc(&g2_result_d, sizeof(g2_projective_bls12377));
cudaMemcpy(scalars_d_bls, scalars, sizeof(scalar_bls12377) * N, cudaMemcpyHostToDevice);
cudaMemcpy(g2_points_d, g2_points, sizeof(g2_affine_bls12377) * N, cudaMemcpyHostToDevice);
std::cout << "Reconfiguring MSM to use on-device inputs" << std::endl;
config.are_results_on_device = true;
config.are_scalars_on_device = true;
config.are_points_on_device = true;
std::cout << "Running MSM kernel with on-device inputs" << std::endl;
bls12_377_g2_msm_cuda(scalars_d_bls, g2_points_d, msm_size, config, g2_result_d);
cudaMemcpy(&g2_result, g2_result_d, sizeof(g2_projective_bn254), cudaMemcpyDeviceToHost);
std::cout << g2_projective_bls12377::to_affine(g2_result) << std::endl;
cudaFree(scalars_d_bls);
cudaFree(g2_points_d);
cudaFree(g2_result_d);
delete[] g2_points;
delete[] scalars;
return 0;
}
int main(int argc, char* argv[])
{
std::cout << "Starting BN254 MSM" << std::endl;
msm_bn254(argc, argv);
std::cout << "Starting BLS12-377 MSM" << std::endl;
msm_bls12_377(argc, argv);
cudaStreamDestroy(stream);
return 0;
}

View File

@@ -18,7 +18,7 @@ add_executable(
example.cu
)
target_include_directories(example PRIVATE "../../../icicle/include")
target_link_libraries(example ${CMAKE_SOURCE_DIR}/build/icicle/lib/libingo_field_bn254.a ${CMAKE_SOURCE_DIR}/build/icicle/lib/libingo_field_bls12_377.a)
target_link_libraries(example ${CMAKE_SOURCE_DIR}/build/icicle/lib/libingo_field_bn254.a)
find_library(NVML_LIBRARY nvidia-ml PATHS /usr/local/cuda/targets/x86_64-linux/lib/stubs/ )
target_link_libraries(example ${NVML_LIBRARY})
set_target_properties(example PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

View File

@@ -1,9 +1,5 @@
# Icicle example: Multiplication
## Best-Practices
We recommend to run our examples in [ZK-containers](../../ZK-containers.md) to save your time and mental energy.
## Key-Takeaway
`Icicle` accelerates multiplication operation `*` using [Karatsuba algorithm](https://en.wikipedia.org/wiki/Karatsuba_algorithm)

View File

@@ -7,13 +7,9 @@ mkdir -p build/example
mkdir -p build/icicle
# Configure and build Icicle
cmake -S ../../../icicle/ -B build/icicle -DMSM=OFF -DCMAKE_BUILD_TYPE=Release -DCURVE=bls12_377
cmake --build build/icicle
rm build/icicle/CMakeCache.txt
cmake -S ../../../icicle/ -B build/icicle -DMSM=OFF -DCMAKE_BUILD_TYPE=Release -DCURVE=bn254
cmake --build build/icicle
# Configure and build the example application
cmake -S . -B build/example
cmake --build build/example

View File

@@ -4,17 +4,14 @@
#include <nvml.h>
#include "api/bn254.h"
#include "api/bls12_377.h"
#include "vec_ops/vec_ops.cuh"
using namespace vec_ops;
// using namespace bn254;
typedef bn254::scalar_t T;
using namespace bn254;
typedef bls12_377::scalar_t T_bls;
typedef scalar_t T;
int vector_mult_bn254(T* vec_b, T* vec_a, T* vec_result, size_t n_elments, device_context::DeviceContext ctx)
int vector_mult(T* vec_b, T* vec_a, T* vec_result, size_t n_elments, device_context::DeviceContext ctx)
{
vec_ops::VecOpsConfig config = vec_ops::DefaultVecOpsConfig();
config.is_a_on_device = true;
@@ -28,24 +25,10 @@ int vector_mult_bn254(T* vec_b, T* vec_a, T* vec_result, size_t n_elments, devic
return 0;
}
int vector_mult_bls12377(T_bls* vec_b, T_bls* vec_a, T_bls* vec_result, size_t n_elments, device_context::DeviceContext ctx)
{
vec_ops::VecOpsConfig config = vec_ops::DefaultVecOpsConfig();
config.is_a_on_device = true;
config.is_b_on_device = true;
config.is_result_on_device = true;
cudaError_t err = bls12_377_mul_cuda(vec_a, vec_b, n_elments, config, vec_result);
if (err != cudaSuccess) {
std::cerr << "Failed to multiply vectors - " << cudaGetErrorString(err) << std::endl;
return 0;
}
return 0;
}
int main(int argc, char** argv)
{
const unsigned vector_size = 1 << 15;
const unsigned repetitions = 1 ;
const unsigned repetitions = 1 << 15;
cudaError_t err;
nvmlInit();
@@ -78,53 +61,41 @@ int main(int argc, char** argv)
// host data
T* host_in1 = (T*)malloc(vector_size * sizeof(T));
T* host_in2 = (T*)malloc(vector_size * sizeof(T));
T_bls* host_in1_bls12377 = (T_bls*)malloc(vector_size * sizeof(T_bls));
T_bls* host_in2_bls12377 = (T_bls*)malloc(vector_size * sizeof(T_bls));
std::cout << "Initializing vectors with random data" << std::endl;
T::rand_host_many(host_in1, vector_size);
T::rand_host_many(host_in2, vector_size);
T_bls::rand_host_many(host_in1_bls12377, vector_size);
T_bls::rand_host_many(host_in2_bls12377, vector_size);
// device data
device_context::DeviceContext ctx = device_context::get_default_device_context();
T* device_in1_bn254;
T* device_in2_bn254;
T* device_out_bn254;
T_bls* device_in1_bls12377;
T_bls* device_in2_bls12377;
T_bls* device_out_bls12377;
T* device_in1;
T* device_in2;
T* device_out;
err = cudaMalloc((void**)&device_in1_bn254, vector_size * sizeof(T));
err = cudaMalloc((void**)&device_in1_bls12377, vector_size * sizeof(T_bls));
err = cudaMalloc((void**)&device_in1, vector_size * sizeof(T));
if (err != cudaSuccess) {
std::cerr << "Failed to allocate device memory - " << cudaGetErrorString(err) << std::endl;
return 0;
}
err = cudaMalloc((void**)&device_in2_bn254, vector_size * sizeof(T));
err = cudaMalloc((void**)&device_in2_bls12377, vector_size * sizeof(T_bls));
err = cudaMalloc((void**)&device_in2, vector_size * sizeof(T));
if (err != cudaSuccess) {
std::cerr << "Failed to allocate device memory - " << cudaGetErrorString(err) << std::endl;
return 0;
}
err = cudaMalloc((void**)&device_out_bn254, vector_size * sizeof(T));
err = cudaMalloc((void**)&device_out_bls12377, vector_size * sizeof(T_bls));
err = cudaMalloc((void**)&device_out, vector_size * sizeof(T));
if (err != cudaSuccess) {
std::cerr << "Failed to allocate device memory - " << cudaGetErrorString(err) << std::endl;
return 0;
}
// copy from host to device
err = cudaMemcpy(device_in1_bn254, host_in1, vector_size * sizeof(T), cudaMemcpyHostToDevice);
err = cudaMemcpy(device_in1_bls12377, host_in1_bls12377, vector_size * sizeof(T_bls), cudaMemcpyHostToDevice);
err = cudaMemcpy(device_in1, host_in1, vector_size * sizeof(T), cudaMemcpyHostToDevice);
if (err != cudaSuccess) {
std::cerr << "Failed to copy data from host to device - " << cudaGetErrorString(err) << std::endl;
return 0;
}
err = cudaMemcpy(device_in2_bn254, host_in2, vector_size * sizeof(T), cudaMemcpyHostToDevice);
err = cudaMemcpy(device_in2_bls12377, host_in2_bls12377, vector_size * sizeof(T_bls), cudaMemcpyHostToDevice);
err = cudaMemcpy(device_in2, host_in2, vector_size * sizeof(T), cudaMemcpyHostToDevice);
if (err != cudaSuccess) {
std::cerr << "Failed to copy data from host to device - " << cudaGetErrorString(err) << std::endl;
return 0;
@@ -133,10 +104,7 @@ int main(int argc, char** argv)
std::cout << "Starting warm-up" << std::endl;
// Warm-up loop
for (int i = 0; i < repetitions; i++) {
std::cout << "bn254 mult" << std::endl;
vector_mult_bn254(device_in1_bn254, device_in2_bn254, device_out_bn254, vector_size, ctx);
std::cout << "bls12-377 mult" << std::endl;
vector_mult_bls12377(device_in1_bls12377, device_in2_bls12377, device_out_bls12377, vector_size, ctx);
vector_mult(device_in1, device_in2, device_out, vector_size, ctx);
}
std::cout << "Starting benchmarking" << std::endl;
@@ -154,7 +122,7 @@ int main(int argc, char** argv)
auto start_time = std::chrono::high_resolution_clock::now();
// Benchmark loop
for (int i = 0; i < repetitions; i++) {
vector_mult_bn254(device_in1_bn254, device_in2_bn254, device_out_bn254, vector_size, ctx);
vector_mult(device_in1, device_in2, device_out, vector_size, ctx);
}
auto end_time = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end_time - start_time);
@@ -178,7 +146,7 @@ int main(int argc, char** argv)
// Optional: validate multiplication
T* host_out = (T*)malloc(vector_size * sizeof(T));
cudaMemcpy(host_out, device_out_bn254, vector_size * sizeof(T), cudaMemcpyDeviceToHost);
cudaMemcpy(host_out, device_out, vector_size * sizeof(T), cudaMemcpyDeviceToHost);
// validate multiplication here...
@@ -186,9 +154,9 @@ int main(int argc, char** argv)
free(host_in1);
free(host_in2);
free(host_out);
cudaFree(device_in1_bn254);
cudaFree(device_in2_bn254);
cudaFree(device_out_bn254);
cudaFree(device_in1);
cudaFree(device_in2);
cudaFree(device_out);
nvmlShutdown();
return 0;
}

View File

@@ -1,9 +1,5 @@
# Icicle example: Number-Theoretical Transform (NTT)
## Best-Practices
We recommend to run our examples in [ZK-containers](../../ZK-containers.md) to save your time and mental energy.
## Key-Takeaway
`Icicle` provides CUDA C++ template function NTT for [Number Theoretical Transform](https://github.com/ingonyama-zk/ingopedia/blob/master/src/fft.md), also known as Discrete Fourier Transform.

View File

@@ -1,9 +1,5 @@
# ICICLE example: Pedersen Commitment
## Best-Practices
We recommend to run our examples in [ZK-containers](../../ZK-containers.md) to save your time and mental energy.
## Key-Takeaway
A Pedersen Commitment is a cryptographic primitive to commit to a value or a vector of values while keeping it hidden, yet enabling the committer to reveal the value later. It provides both hiding (the commitment does not reveal any information about the value) and binding properties (once a value is committed, it cannot be changed without detection).

View File

@@ -1,9 +1,5 @@
# ICICLE examples: computations with polynomials
## Best-Practices
We recommend to run our examples in [ZK-containers](../../ZK-containers.md) to save your time and mental energy.
## Key-Takeaway
Polynomials are crucial for Zero-Knowledge Proofs (ZKPs): they enable efficient representation and verification of computational statements, facilitate privacy-preserving protocols, and support complex mathematical operations essential for constructing and verifying proofs without revealing underlying data. Polynomial API is documented [here](https://dev.ingonyama.com/icicle/polynomials/overview)

View File

@@ -1,9 +1,5 @@
# Icicle example: build a Merkle tree using Poseidon hash
## Best-Practices
We recommend to run our examples in [ZK-containers](../../ZK-containers.md) to save your time and mental energy.
## Key-Takeaway
`Icicle` provides CUDA C++ template `poseidon_hash` to accelerate the popular [Poseidon hash function](https://www.poseidon-hash.info/).

View File

@@ -2,10 +2,6 @@
`ICICLE` provides Rust bindings to CUDA-accelerated C++ implementation of [Multi-Scalar Multiplication](https://github.com/ingonyama-zk/ingopedia/blob/master/src/msm.md).
## Best Practices
In order to save time and setting up prerequisites manually, we recommend running this example in our [ZKContainer](../../ZKContainer.md).
## Usage
```rust

View File

@@ -4,10 +4,6 @@
`ICICLE` provides Rust bindings to CUDA-accelerated C++ implementation of [Number Theoretic Transform](https://github.com/ingonyama-zk/ingopedia/blob/master/src/fft.md).
## Best Practices
In order to save time and setting up prerequisites manually, we recommend running this example in our [ZKContainer](../../ZKContainer.md).
## Usage
```rust

View File

@@ -124,6 +124,19 @@ public:
*/
static constexpr HOST_DEVICE_INLINE unsigned num_of_reductions() { return CONFIG::num_of_reductions; }
// count number of bits of the field element without leading zeros.
static constexpr HOST_DEVICE_INLINE unsigned num_bits(const Field& x)
{
size_t size = sizeof(x.limbs_storage.limbs[0]) * 8;
unsigned ret = size * TLC;
for (unsigned i = TLC; i-- > 0;) {
int leading = __clz(x.limbs_storage.limbs[i]);
ret -= leading;
if (leading != size) { break; }
}
return ret;
}
static constexpr unsigned slack_bits = 32 * TLC - NBITS;
struct Wide {
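For intuition, here is a host-side Rust sketch of the same counting strategy used by `num_bits()` above (and by the arkworks `num_bits()` the commit description references); it is an illustration only, not the library code:
```rust
// Scan limbs from the most significant down, subtracting leading zeros
// until the first non-zero limb is found.
fn num_bits(limbs: &[u32]) -> u32 {
    let limb_bits = u32::BITS; // 32 bits per limb
    let mut bits = limb_bits * limbs.len() as u32;
    for &limb in limbs.iter().rev() {
        let leading = limb.leading_zeros();
        bits -= leading;
        if leading != limb_bits {
            break; // found the top set bit
        }
    }
    bits
}

fn main() {
    assert_eq!(num_bits(&[0b101, 0]), 3); // value 5 -> 3 significant bits
    assert_eq!(num_bits(&[0, 1]), 33);    // value 2^32 -> 33 significant bits
    assert_eq!(num_bits(&[0, 0]), 0);     // zero -> 0 bits
}
```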

View File

@@ -11,9 +11,25 @@
using namespace hash;
namespace keccak {
// Input rate in bytes
const int KECCAK_256_RATE = 136;
const int KECCAK_512_RATE = 72;
// Digest size in u64
const int KECCAK_256_DIGEST = 4;
const int KECCAK_512_DIGEST = 8;
// Number of state elements in u64
const int KECCAK_STATE_SIZE = 25;
const int KECCAK_PADDING_CONST = 1;
const int SHA3_PADDING_CONST = 6;
class Keccak : public Hasher<uint8_t, uint64_t>
{
public:
const int PADDING_CONST;
cudaError_t run_hash_many_kernel(
const uint8_t* input,
uint64_t* output,
@@ -22,7 +38,34 @@ namespace keccak {
unsigned int output_len,
const device_context::DeviceContext& ctx) const override;
Keccak(unsigned int rate) : Hasher<uint8_t, uint64_t>(25, 25, rate, 0) {}
Keccak(unsigned int rate, unsigned int padding_const)
: Hasher<uint8_t, uint64_t>(KECCAK_STATE_SIZE, KECCAK_STATE_SIZE, rate, 0), PADDING_CONST(padding_const)
{
}
};
class Keccak256 : public Keccak
{
public:
Keccak256() : Keccak(KECCAK_256_RATE, KECCAK_PADDING_CONST) {}
};
class Keccak512 : public Keccak
{
public:
Keccak512() : Keccak(KECCAK_512_RATE, KECCAK_PADDING_CONST) {}
};
class Sha3_256 : public Keccak
{
public:
Sha3_256() : Keccak(KECCAK_256_RATE, SHA3_PADDING_CONST) {}
};
class Sha3_512 : public Keccak
{
public:
Sha3_512() : Keccak(KECCAK_512_RATE, SHA3_PADDING_CONST) {}
};
} // namespace keccak

View File

@@ -20,6 +20,11 @@ extern "C" void CONCAT_EXPAND(CURVE, to_affine)(projective_t* point, affine_t* p
*point_out = projective_t::to_affine(*point);
}
extern "C" void CONCAT_EXPAND(CURVE, from_affine)(affine_t* point, projective_t* point_out)
{
*point_out = projective_t::from_affine(*point);
}
extern "C" void CONCAT_EXPAND(CURVE, generate_projective_points)(projective_t* points, int size)
{
projective_t::rand_host_many(points, size);

View File

@@ -20,6 +20,11 @@ extern "C" void CONCAT_EXPAND(CURVE, g2_to_affine)(g2_projective_t* point, g2_af
*point_out = g2_projective_t::to_affine(*point);
}
extern "C" void CONCAT_EXPAND(CURVE, g2_from_affine)(g2_affine_t* point, g2_projective_t* point_out)
{
*point_out = g2_projective_t::from_affine(*point);
}
extern "C" void CONCAT_EXPAND(CURVE, g2_generate_projective_points)(g2_projective_t* points, int size)
{
g2_projective_t::rand_host_many(points, size);

View File

@@ -11,13 +11,29 @@ namespace keccak {
extern "C" cudaError_t
keccak256_cuda(uint8_t* input, int input_block_size, int number_of_blocks, uint8_t* output, HashConfig& config)
{
return Keccak(136).hash_many(input, (uint64_t*)output, number_of_blocks, input_block_size, 4, config);
return Keccak256().hash_many(
input, (uint64_t*)output, number_of_blocks, input_block_size, KECCAK_256_DIGEST, config);
}
extern "C" cudaError_t
keccak512_cuda(uint8_t* input, int input_block_size, int number_of_blocks, uint8_t* output, HashConfig& config)
{
return Keccak(72).hash_many(input, (uint64_t*)output, number_of_blocks, input_block_size, 8, config);
return Keccak512().hash_many(
input, (uint64_t*)output, number_of_blocks, input_block_size, KECCAK_512_DIGEST, config);
}
extern "C" cudaError_t
sha3_256_cuda(uint8_t* input, int input_block_size, int number_of_blocks, uint8_t* output, HashConfig& config)
{
return Sha3_256().hash_many(
input, (uint64_t*)output, number_of_blocks, input_block_size, KECCAK_256_DIGEST, config);
}
extern "C" cudaError_t
sha3_512_cuda(uint8_t* input, int input_block_size, int number_of_blocks, uint8_t* output, HashConfig& config)
{
return Sha3_512().hash_many(
input, (uint64_t*)output, number_of_blocks, input_block_size, KECCAK_512_DIGEST, config);
}
extern "C" cudaError_t build_keccak256_merkle_tree_cuda(
@@ -27,7 +43,7 @@ namespace keccak {
unsigned int input_block_len,
const merkle_tree::TreeBuilderConfig& tree_config)
{
Keccak keccak(136);
Keccak256 keccak;
return merkle_tree::build_merkle_tree<uint8_t, uint64_t>(
leaves, digests, height, input_block_len, keccak, keccak, tree_config);
}
@@ -39,7 +55,31 @@ namespace keccak {
unsigned int input_block_len,
const merkle_tree::TreeBuilderConfig& tree_config)
{
Keccak keccak(72);
Keccak512 keccak;
return merkle_tree::build_merkle_tree<uint8_t, uint64_t>(
leaves, digests, height, input_block_len, keccak, keccak, tree_config);
}
extern "C" cudaError_t build_sha3_256_merkle_tree_cuda(
const uint8_t* leaves,
uint64_t* digests,
unsigned int height,
unsigned int input_block_len,
const merkle_tree::TreeBuilderConfig& tree_config)
{
Sha3_256 keccak;
return merkle_tree::build_merkle_tree<uint8_t, uint64_t>(
leaves, digests, height, input_block_len, keccak, keccak, tree_config);
}
extern "C" cudaError_t build_sha3_512_merkle_tree_cuda(
const uint8_t* leaves,
uint64_t* digests,
unsigned int height,
unsigned int input_block_len,
const merkle_tree::TreeBuilderConfig& tree_config)
{
Sha3_512 keccak;
return merkle_tree::build_merkle_tree<uint8_t, uint64_t>(
leaves, digests, height, input_block_len, keccak, keccak, tree_config);
}

View File

@@ -161,7 +161,7 @@ namespace keccak {
0x8000000000008002, 0x8000000000000080, 0x000000000000800a, 0x800000008000000a,
0x8000000080008081, 0x8000000000008080, 0x0000000080000001, 0x8000000080008008};
__device__ void keccakf(u64 s[25])
__device__ void keccakf(u64 s[KECCAK_STATE_SIZE])
{
u64 t0, t1, t2, t3, t4;
@@ -180,15 +180,20 @@ namespace keccak {
}
template <const int R>
__global__ void
keccak_hash_blocks(const uint8_t* input, int input_block_size, int output_len, int number_of_blocks, uint64_t* output)
__global__ void keccak_hash_blocks(
const uint8_t* input,
int input_block_size,
int output_len,
int number_of_blocks,
uint64_t* output,
int padding_const)
{
int sid = (blockIdx.x * blockDim.x) + threadIdx.x;
if (sid >= number_of_blocks) { return; }
const uint8_t* b_input = input + sid * input_block_size;
uint64_t* b_output = output + sid * output_len;
uint64_t state[25] = {}; // Initialize with zeroes
uint64_t state[KECCAK_STATE_SIZE] = {}; // Initialize with zeroes
int input_len = input_block_size;
@@ -209,7 +214,7 @@ namespace keccak {
}
// pad 10*1
last_block[input_len] = 1;
last_block[input_len] = padding_const;
for (int i = 0; i < R - input_len - 1; i++) {
last_block[input_len + i + 1] = 0;
}
@@ -238,13 +243,13 @@ namespace keccak {
int number_of_gpu_blocks = (number_of_states - 1) / number_of_threads + 1;
switch (rate) {
case 136:
keccak_hash_blocks<136><<<number_of_gpu_blocks, number_of_threads, 0, ctx.stream>>>(
input, input_len, output_len, number_of_states, output);
case KECCAK_256_RATE:
keccak_hash_blocks<KECCAK_256_RATE><<<number_of_gpu_blocks, number_of_threads, 0, ctx.stream>>>(
input, input_len, output_len, number_of_states, output, PADDING_CONST);
break;
case 72:
keccak_hash_blocks<72><<<number_of_gpu_blocks, number_of_threads, 0, ctx.stream>>>(
input, input_len, output_len, number_of_states, output);
case KECCAK_512_RATE:
keccak_hash_blocks<KECCAK_512_RATE><<<number_of_gpu_blocks, number_of_threads, 0, ctx.stream>>>(
input, input_len, output_len, number_of_states, output, PADDING_CONST);
break;
default:
THROW_ICICLE_ERR(IcicleError_t::InvalidArgument, "KeccakHash: #rate must be one of [136, 72]");
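For reference, a rough Rust sketch of the domain-separated pad10*1 padding that the `padding_const` argument selects: 0x01 for the legacy Keccak variants and 0x06 for SHA-3. The final 0x80 bit on the last byte of the rate block follows the Keccak specification and is assumed here rather than taken from the kernel above.
```rust
fn pad_block(msg_tail: &[u8], rate: usize, padding_const: u8) -> Vec<u8> {
    assert!(msg_tail.len() < rate);
    let mut block = vec![0u8; rate];
    block[..msg_tail.len()].copy_from_slice(msg_tail);
    block[msg_tail.len()] = padding_const; // 0x01 (Keccak) or 0x06 (SHA-3)
    block[rate - 1] |= 0x80;               // closing bit of pad10*1
    block
}

fn main() {
    let keccak_block = pad_block(b"abc", 136, 0x01);
    let sha3_block = pad_block(b"abc", 136, 0x06);
    assert_ne!(keccak_block, sha3_block); // only the padding byte differs
}
```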

View File

@@ -129,8 +129,9 @@ namespace merkle_tree {
while (number_of_states > 0) {
CHK_IF_RETURN(compression.run_hash_many_kernel(
(L*)prev_layer, next_layer, number_of_states, tree_config.digest_elements * tree_config.arity,
tree_config.digest_elements, hash_config.ctx));
(L*)prev_layer, next_layer, number_of_states,
tree_config.digest_elements * tree_config.arity * (sizeof(D) / sizeof(L)), tree_config.digest_elements,
hash_config.ctx));
if (!keep_rows || subtree_height < keep_rows) {
D* digests_with_offset =
@@ -298,8 +299,9 @@ namespace merkle_tree {
size_t segment_offset = start_segment_offset;
while (number_of_states > 0) {
CHK_IF_RETURN(compression.run_hash_many_kernel(
(L*)prev_layer, next_layer, number_of_states, tree_config.digest_elements * tree_config.arity,
tree_config.digest_elements, tree_config.ctx));
(L*)prev_layer, next_layer, number_of_states,
tree_config.digest_elements * tree_config.arity * (sizeof(D) / sizeof(L)), tree_config.digest_elements,
tree_config.ctx));
if (!tree_config.keep_rows || cap_height < tree_config.keep_rows + (int)caps_mode) {
D* digests_with_offset = digests + segment_offset;
CHK_IF_RETURN(cudaMemcpyAsync(
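To see what this change does in numbers, a Rust sketch of the scaling it introduces (assuming the Keccak case above: `L = u8` leaves, `D = u64` digests, 4 digest elements, arity 2):
```rust
use std::mem::size_of;

// The compression hash consumes the child digests as elements of the leaf
// type L, so the input length is scaled by sizeof(D) / sizeof(L).
fn compression_input_len<L, D>(digest_elements: usize, arity: usize) -> usize {
    digest_elements * arity * (size_of::<D>() / size_of::<L>())
}

fn main() {
    let len = compression_input_len::<u8, u64>(4, 2);
    assert_eq!(len, 64); // two 32-byte child digests = 64 bytes of input
}
```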

View File

@@ -28,21 +28,6 @@ func (p *MockProjective) FromLimbs(x, y, z []uint32) MockProjective {
return *p
}
func (p *MockProjective) FromAffine(a MockAffine) MockProjective {
z := MockBaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
return *p
}
type MockAffine struct {
X, Y MockBaseField
}
@@ -68,18 +53,3 @@ func (a *MockAffine) FromLimbs(x, y []uint32) MockAffine {
return *a
}
func (a MockAffine) ToProjective() MockProjective {
var z MockBaseField
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p MockProjective
return p.Zero()
}
return MockProjective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
}

View File

@@ -40,17 +40,10 @@ func (p *Projective) FromLimbs(x, y, z []uint32) Projective {
}
func (p *Projective) FromAffine(a Affine) Projective {
z := BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.bls12_377_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *Projective) ProjectiveToAffine() Affine {
var a Affine
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.bls12_377_to_affine(cP, cA)
return a
}
@@ -111,18 +104,12 @@ func (a *Affine) FromLimbs(x, y []uint32) Affine {
}
func (a Affine) ToProjective() Projective {
var z BaseField
var p Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p Projective
return p.Zero()
}
return Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
C.bls12_377_from_affine(cA, cP)
return p
}
func AffineFromProjective(p *Projective) Affine {
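The same refactor recurs for every curve and G2 group in the hunks below: the hand-rolled zero check and field copy on the Go side are dropped, and `FromAffine`/`ToProjective` simply reinterpret the Go structs as `affine_t`/`projective_t` and let the curve's C `from_affine` decide how the point at infinity maps, so Go and the C/CUDA backend share one code path. A self-contained cgo sketch of that pattern (a toy C function, not the real ICICLE bindings):

```go
package main

/*
typedef struct { unsigned long long x, y; }    affine_t;
typedef struct { unsigned long long x, y, z; } projective_t;

// Toy stand-in for <curve>_from_affine: the C side owns the identity-point
// rule, so the Go wrapper stays branch-free.
static void demo_from_affine(const affine_t* a, projective_t* p) {
	if (a->x == 0 && a->y == 0) { p->x = 0; p->y = 1; p->z = 0; return; }
	p->x = a->x; p->y = a->y; p->z = 1;
}
*/
import "C"

import (
	"fmt"
	"unsafe"
)

type Affine struct{ X, Y uint64 }
type Projective struct{ X, Y, Z uint64 }

// ToProjective forwards raw struct memory to C, just like the wrappers above
// do with (*C.affine_t)(unsafe.Pointer(&a)).
func (a Affine) ToProjective() Projective {
	var p Projective
	cA := (*C.affine_t)(unsafe.Pointer(&a))
	cP := (*C.projective_t)(unsafe.Pointer(&p))
	C.demo_from_affine(cA, cP)
	return p
}

func main() {
	fmt.Println(Affine{}.ToProjective())           // identity resolved in C: {0 1 0}
	fmt.Println(Affine{X: 3, Y: 5}.ToProjective()) // {3 5 1}
}
```

Note also the one-character fix in `ProjectiveToAffine` in the same hunk: the receiver `p` already has pointer type, so the old `unsafe.Pointer(&p)` handed C a pointer to the Go pointer rather than to the point itself; the new `unsafe.Pointer(p)` passes the point's memory directly.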


@@ -40,17 +40,10 @@ func (p *G2Projective) FromLimbs(x, y, z []uint32) G2Projective {
}
func (p *G2Projective) FromAffine(a G2Affine) G2Projective {
z := G2BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(p))
C.bls12_377_g2_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *G2Projective) ProjectiveToAffine() G2Affine {
var a G2Affine
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(&p))
cP := (*C.g2_projective_t)(unsafe.Pointer(p))
C.bls12_377_g2_to_affine(cP, cA)
return a
}
@@ -111,18 +104,12 @@ func (a *G2Affine) FromLimbs(x, y []uint32) G2Affine {
}
func (a G2Affine) ToProjective() G2Projective {
var z G2BaseField
var p G2Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p G2Projective
return p.Zero()
}
return G2Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(&p))
C.bls12_377_g2_from_affine(cA, cP)
return p
}
func G2AffineFromProjective(p *G2Projective) G2Affine {


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool bls12_377_g2_eq(g2_projective_t* point1, g2_projective_t* point2);
void bls12_377_g2_to_affine(g2_projective_t* point, g2_affine_t* point_out);
void bls12_377_g2_from_affine(g2_affine_t* point, g2_projective_t* point_out);
void bls12_377_g2_generate_projective_points(g2_projective_t* points, int size);
void bls12_377_g2_generate_affine_points(g2_affine_t* points, int size);
cudaError_t bls12_377_g2_affine_convert_montgomery(g2_affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool bls12_377_eq(projective_t* point1, projective_t* point2);
void bls12_377_to_affine(projective_t* point, affine_t* point_out);
void bls12_377_from_affine(affine_t* point, projective_t* point_out);
void bls12_377_generate_projective_points(projective_t* points, int size);
void bls12_377_generate_affine_points(affine_t* points, int size);
cudaError_t bls12_377_affine_convert_montgomery(affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -40,17 +40,10 @@ func (p *Projective) FromLimbs(x, y, z []uint32) Projective {
}
func (p *Projective) FromAffine(a Affine) Projective {
z := BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.bls12_381_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *Projective) ProjectiveToAffine() Affine {
var a Affine
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.bls12_381_to_affine(cP, cA)
return a
}
@@ -111,18 +104,12 @@ func (a *Affine) FromLimbs(x, y []uint32) Affine {
}
func (a Affine) ToProjective() Projective {
var z BaseField
var p Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p Projective
return p.Zero()
}
return Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
C.bls12_381_from_affine(cA, cP)
return p
}
func AffineFromProjective(p *Projective) Affine {


@@ -40,17 +40,10 @@ func (p *G2Projective) FromLimbs(x, y, z []uint32) G2Projective {
}
func (p *G2Projective) FromAffine(a G2Affine) G2Projective {
z := G2BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(p))
C.bls12_381_g2_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *G2Projective) ProjectiveToAffine() G2Affine {
var a G2Affine
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(&p))
cP := (*C.g2_projective_t)(unsafe.Pointer(p))
C.bls12_381_g2_to_affine(cP, cA)
return a
}
@@ -111,18 +104,13 @@ func (a *G2Affine) FromLimbs(x, y []uint32) G2Affine {
}
func (a G2Affine) ToProjective() G2Projective {
var z G2BaseField
var p G2Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p G2Projective
return p.Zero()
}
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(&p))
C.bls12_381_g2_from_affine(cA, cP)
return p
return G2Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
}
func G2AffineFromProjective(p *G2Projective) G2Affine {


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool bls12_381_g2_eq(g2_projective_t* point1, g2_projective_t* point2);
void bls12_381_g2_to_affine(g2_projective_t* point, g2_affine_t* point_out);
void bls12_381_g2_from_affine(g2_affine_t* point, g2_projective_t* point_out);
void bls12_381_g2_generate_projective_points(g2_projective_t* points, int size);
void bls12_381_g2_generate_affine_points(g2_affine_t* points, int size);
cudaError_t bls12_381_g2_affine_convert_montgomery(g2_affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool bls12_381_eq(projective_t* point1, projective_t* point2);
void bls12_381_to_affine(projective_t* point, affine_t* point_out);
void bls12_381_from_affine(affine_t* point, projective_t* point_out);
void bls12_381_generate_projective_points(projective_t* points, int size);
void bls12_381_generate_affine_points(affine_t* points, int size);
cudaError_t bls12_381_affine_convert_montgomery(affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -40,17 +40,10 @@ func (p *Projective) FromLimbs(x, y, z []uint32) Projective {
}
func (p *Projective) FromAffine(a Affine) Projective {
z := BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.bn254_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *Projective) ProjectiveToAffine() Affine {
var a Affine
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.bn254_to_affine(cP, cA)
return a
}
@@ -111,18 +104,13 @@ func (a *Affine) FromLimbs(x, y []uint32) Affine {
}
func (a Affine) ToProjective() Projective {
var z BaseField
var p Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p Projective
return p.Zero()
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
C.bn254_from_affine(cA, cP)
return p
return Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
}
func AffineFromProjective(p *Projective) Affine {


@@ -40,17 +40,10 @@ func (p *G2Projective) FromLimbs(x, y, z []uint32) G2Projective {
}
func (p *G2Projective) FromAffine(a G2Affine) G2Projective {
z := G2BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(p))
C.bn254_g2_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *G2Projective) ProjectiveToAffine() G2Affine {
var a G2Affine
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(&p))
cP := (*C.g2_projective_t)(unsafe.Pointer(p))
C.bn254_g2_to_affine(cP, cA)
return a
}
@@ -111,18 +104,12 @@ func (a *G2Affine) FromLimbs(x, y []uint32) G2Affine {
}
func (a G2Affine) ToProjective() G2Projective {
var z G2BaseField
var p G2Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p G2Projective
return p.Zero()
}
return G2Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(&p))
C.bn254_g2_from_affine(cA, cP)
return p
}
func G2AffineFromProjective(p *G2Projective) G2Affine {


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool bn254_g2_eq(g2_projective_t* point1, g2_projective_t* point2);
void bn254_g2_to_affine(g2_projective_t* point, g2_affine_t* point_out);
void bn254_g2_from_affine(g2_affine_t* point, g2_projective_t* point_out);
void bn254_g2_generate_projective_points(g2_projective_t* points, int size);
void bn254_g2_generate_affine_points(g2_affine_t* points, int size);
cudaError_t bn254_g2_affine_convert_montgomery(g2_affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool bn254_eq(projective_t* point1, projective_t* point2);
void bn254_to_affine(projective_t* point, affine_t* point_out);
void bn254_from_affine(affine_t* point, projective_t* point_out);
void bn254_generate_projective_points(projective_t* points, int size);
void bn254_generate_affine_points(affine_t* points, int size);
cudaError_t bn254_affine_convert_montgomery(affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -40,17 +40,10 @@ func (p *Projective) FromLimbs(x, y, z []uint32) Projective {
}
func (p *Projective) FromAffine(a Affine) Projective {
z := BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.bw6_761_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *Projective) ProjectiveToAffine() Affine {
var a Affine
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.bw6_761_to_affine(cP, cA)
return a
}
@@ -111,18 +104,13 @@ func (a *Affine) FromLimbs(x, y []uint32) Affine {
}
func (a Affine) ToProjective() Projective {
var z BaseField
var p Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p Projective
return p.Zero()
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
C.bw6_761_from_affine(cA, cP)
return p
return Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
}
func AffineFromProjective(p *Projective) Affine {


@@ -40,17 +40,10 @@ func (p *G2Projective) FromLimbs(x, y, z []uint32) G2Projective {
}
func (p *G2Projective) FromAffine(a G2Affine) G2Projective {
z := G2BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(p))
C.bw6_761_g2_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *G2Projective) ProjectiveToAffine() G2Affine {
var a G2Affine
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(&p))
cP := (*C.g2_projective_t)(unsafe.Pointer(p))
C.bw6_761_g2_to_affine(cP, cA)
return a
}
@@ -111,18 +104,12 @@ func (a *G2Affine) FromLimbs(x, y []uint32) G2Affine {
}
func (a G2Affine) ToProjective() G2Projective {
var z G2BaseField
var p G2Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p G2Projective
return p.Zero()
}
return G2Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
cA := (*C.g2_affine_t)(unsafe.Pointer(&a))
cP := (*C.g2_projective_t)(unsafe.Pointer(&p))
C.bw6_761_g2_from_affine(cA, cP)
return p
}
func G2AffineFromProjective(p *G2Projective) G2Affine {


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool bw6_761_g2_eq(g2_projective_t* point1, g2_projective_t* point2);
void bw6_761_g2_to_affine(g2_projective_t* point, g2_affine_t* point_out);
void bw6_761_g2_from_affine(g2_affine_t* point, g2_projective_t* point_out);
void bw6_761_g2_generate_projective_points(g2_projective_t* points, int size);
void bw6_761_g2_generate_affine_points(g2_affine_t* points, int size);
cudaError_t bw6_761_g2_affine_convert_montgomery(g2_affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool bw6_761_eq(projective_t* point1, projective_t* point2);
void bw6_761_to_affine(projective_t* point, affine_t* point_out);
void bw6_761_from_affine(affine_t* point, projective_t* point_out);
void bw6_761_generate_projective_points(projective_t* points, int size);
void bw6_761_generate_affine_points(affine_t* points, int size);
cudaError_t bw6_761_affine_convert_montgomery(affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -40,17 +40,10 @@ func (p *Projective) FromLimbs(x, y, z []uint32) Projective {
}
func (p *Projective) FromAffine(a Affine) Projective {
z := BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
} else {
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.grumpkin_from_affine(cA, cP)
return *p
}
@@ -65,7 +58,7 @@ func (p *Projective) ProjectiveToAffine() Affine {
var a Affine
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
cP := (*C.projective_t)(unsafe.Pointer(p))
C.grumpkin_to_affine(cP, cA)
return a
}
@@ -111,18 +104,13 @@ func (a *Affine) FromLimbs(x, y []uint32) Affine {
}
func (a Affine) ToProjective() Projective {
var z BaseField
var p Projective
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p Projective
return p.Zero()
}
cA := (*C.affine_t)(unsafe.Pointer(&a))
cP := (*C.projective_t)(unsafe.Pointer(&p))
C.grumpkin_from_affine(cA, cP)
return p
return Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
}
func AffineFromProjective(p *Projective) Affine {


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool grumpkin_eq(projective_t* point1, projective_t* point2);
void grumpkin_to_affine(projective_t* point, affine_t* point_out);
void grumpkin_from_affine(affine_t* point, projective_t* point_out);
void grumpkin_generate_projective_points(projective_t* points, int size);
void grumpkin_generate_affine_points(affine_t* points, int size);
cudaError_t grumpkin_affine_convert_montgomery(affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -39,21 +39,17 @@ func (p *{{.CurvePrefix}}Projective) FromLimbs(x, y, z []uint32) {{.CurvePrefix}
return *p
}
{{if ne .CurvePrefix "Mock"}}
func (p *{{.CurvePrefix}}Projective) FromAffine(a {{.CurvePrefix}}Affine) {{.CurvePrefix}}Projective {
z := {{.CurvePrefix}}BaseField{}
z.One()
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
p.Zero()
}else{
p.X = a.X
p.Y = a.Y
p.Z = z.One()
}
cA := (*C.{{toCName .CurvePrefix}}affine_t)(unsafe.Pointer(&a))
cP := (*C.{{toCName .CurvePrefix}}projective_t)(unsafe.Pointer(p))
C.{{.Curve}}{{toCNameBackwards .CurvePrefix}}_from_affine(cA, cP)
return *p
}
{{if ne .CurvePrefix "Mock"}}
func (p {{.CurvePrefix}}Projective) ProjectiveEq(p2 *{{.CurvePrefix}}Projective) bool {
cP := (*C.{{toCName .CurvePrefix}}projective_t)(unsafe.Pointer(&p))
cP2 := (*C.{{toCName .CurvePrefix}}projective_t)(unsafe.Pointer(&p2))
@@ -65,7 +61,7 @@ func (p *{{.CurvePrefix}}Projective) ProjectiveToAffine() {{.CurvePrefix}}Affine
var a {{.CurvePrefix}}Affine
cA := (*C.{{toCName .CurvePrefix}}affine_t)(unsafe.Pointer(&a))
cP := (*C.{{toCName .CurvePrefix}}projective_t)(unsafe.Pointer(&p))
cP := (*C.{{toCName .CurvePrefix}}projective_t)(unsafe.Pointer(p))
C.{{.Curve}}{{toCNameBackwards .CurvePrefix}}_to_affine(cP, cA)
return a
}
@@ -110,21 +106,17 @@ func (a *{{.CurvePrefix}}Affine) FromLimbs(x, y []uint32) {{.CurvePrefix}}Affine
return *a
}
func (a {{.CurvePrefix}}Affine) ToProjective() {{.CurvePrefix}}Projective {
var z {{.CurvePrefix}}BaseField
if (a.X == z.Zero()) && (a.Y == z.Zero()) {
var p {{.CurvePrefix}}Projective
return p.Zero()
}
return {{.CurvePrefix}}Projective{
X: a.X,
Y: a.Y,
Z: z.One(),
}
}
{{if ne .CurvePrefix "Mock"}}
func (a {{.CurvePrefix}}Affine) ToProjective() {{.CurvePrefix}}Projective {
var p {{.CurvePrefix}}Projective
cA := (*C.{{toCName .CurvePrefix}}affine_t)(unsafe.Pointer(&a))
cP := (*C.{{toCName .CurvePrefix}}projective_t)(unsafe.Pointer(&p))
C.{{.Curve}}{{toCNameBackwards .CurvePrefix}}_from_affine(cA, cP)
return p
}
func {{.CurvePrefix}}AffineFromProjective(p *{{.CurvePrefix}}Projective) {{.CurvePrefix}}Affine {
return p.ProjectiveToAffine()
}
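The per-curve wrapper files above are not edited by hand; they are rendered from this text/template, with `{{.CurvePrefix}}` empty for G1 and `G2` for the G2 variants, and helpers such as `toCName`/`toCNameBackwards` producing the `g2_`/`_g2` pieces of the C identifiers. A minimal sketch of that expansion, where the helper implementations are guesses chosen only to reproduce the identifiers visible in the generated files (the real generator may differ):

```go
// Codegen sketch under assumptions: toCName and toCNameBackwards below are
// hypothetical, picked to yield e.g. bls12_377_from_affine(affine_t*, ...)
// for G1 and bls12_377_g2_from_affine(g2_affine_t*, ...) for G2.
package main

import (
	"os"
	"strings"
	"text/template"
)

var funcs = template.FuncMap{
	// "G2" -> "g2_", "" -> ""   (prefix for C type names)
	"toCName": func(p string) string {
		if p == "" {
			return ""
		}
		return strings.ToLower(p) + "_"
	},
	// "G2" -> "_g2", "" -> ""   (infix for C function names)
	"toCNameBackwards": func(p string) string {
		if p == "" {
			return ""
		}
		return "_" + strings.ToLower(p)
	},
}

const decl = `void {{.Curve}}{{toCNameBackwards .CurvePrefix}}_from_affine({{toCName .CurvePrefix}}affine_t* point, {{toCName .CurvePrefix}}projective_t* point_out);
`

func main() {
	t := template.Must(template.New("decl").Funcs(funcs).Parse(decl))
	for _, d := range []struct{ Curve, CurvePrefix string }{
		{"bls12_377", ""},
		{"bls12_377", "G2"},
	} {
		_ = t.Execute(os.Stdout, d) // prints the G1 and G2 declarations
	}
}
```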


@@ -14,6 +14,7 @@ typedef struct DeviceContext DeviceContext;
bool {{.Curve}}{{toCNameBackwards .CurvePrefix}}_eq({{toCName .CurvePrefix}}projective_t* point1, {{toCName .CurvePrefix}}projective_t* point2);
void {{.Curve}}{{toCNameBackwards .CurvePrefix}}_to_affine({{toCName .CurvePrefix}}projective_t* point, {{toCName .CurvePrefix}}affine_t* point_out);
void {{.Curve}}{{toCNameBackwards .CurvePrefix}}_from_affine({{toCName .CurvePrefix}}affine_t* point, {{toCName .CurvePrefix}}projective_t* point_out);
void {{.Curve}}{{toCNameBackwards .CurvePrefix}}_generate_projective_points({{toCName .CurvePrefix}}projective_t* points, int size);
void {{.Curve}}{{toCNameBackwards .CurvePrefix}}_generate_affine_points({{toCName .CurvePrefix}}affine_t* points, int size);
cudaError_t {{.Curve}}{{toCNameBackwards .CurvePrefix}}_affine_convert_montgomery({{toCName .CurvePrefix}}affine_t* points, size_t n, bool is_into, DeviceContext* ctx);


@@ -18,7 +18,7 @@ exclude = [
]
[workspace.package]
version = "2.7.1"
version = "2.8.0"
edition = "2021"
authors = [ "Ingonyama" ]
homepage = "https://www.ingonyama.com"


@@ -22,6 +22,8 @@ pub trait Curve: Debug + PartialEq + Copy + Clone {
#[doc(hidden)]
fn to_affine(point: *const Projective<Self>, point_aff: *mut Affine<Self>);
#[doc(hidden)]
fn from_affine(point: *const Affine<Self>, point_proj: *mut Projective<Self>);
#[doc(hidden)]
fn generate_random_projective_points(size: usize) -> Vec<Projective<Self>>;
#[doc(hidden)]
fn generate_random_affine_points(size: usize) -> Vec<Affine<Self>>;
@@ -79,27 +81,17 @@ impl<C: Curve> Affine<C> {
}
pub fn to_projective(&self) -> Projective<C> {
if *self == Self::zero() {
return Projective::<C>::zero();
}
Projective {
x: self.x,
y: self.y,
z: C::BaseField::one(),
}
let mut proj = Projective::<C>::zero();
C::from_affine(self as *const Self, &mut proj as *mut Projective<C>);
proj
}
}
impl<C: Curve> From<Affine<C>> for Projective<C> {
fn from(item: Affine<C>) -> Self {
if item == (Affine::<C>::zero()) {
return Self::zero();
}
Self {
x: item.x,
y: item.y,
z: C::BaseField::one(),
}
let mut proj = Self::zero();
C::from_affine(&item as *const Affine<C>, &mut proj as *mut Self);
proj
}
}
@@ -282,6 +274,8 @@ macro_rules! impl_curve {
pub(crate) fn eq(point1: *const $projective_type, point2: *const $projective_type) -> bool;
#[link_name = concat!($curve_prefix, "_to_affine")]
pub(crate) fn proj_to_affine(point: *const $projective_type, point_out: *mut $affine_type);
#[link_name = concat!($curve_prefix, "_from_affine")]
pub(crate) fn proj_from_affine(point: *const $affine_type, point_out: *mut $projective_type);
#[link_name = concat!($curve_prefix, "_generate_projective_points")]
pub(crate) fn generate_projective_points(points: *mut $projective_type, size: usize);
#[link_name = concat!($curve_prefix, "_generate_affine_points")]
@@ -315,6 +309,10 @@ macro_rules! impl_curve {
unsafe { $curve_prefix_ident::proj_to_affine(point, point_out) };
}
fn from_affine(point: *const $affine_type, point_out: *mut $projective_type) {
unsafe { $curve_prefix_ident::proj_from_affine(point, point_out) };
}
fn generate_random_projective_points(size: usize) -> Vec<$projective_type> {
let mut res = vec![$projective_type::zero(); size];
unsafe {


@@ -24,6 +24,6 @@ pub(crate) mod tests {
let ctx = device_context::DeviceContext::default();
let sponge = Poseidon::load(2, &ctx).unwrap();
check_build_field_merkle_tree::<_, _, Bls12_377TreeBuilder>(25, 2, &sponge, &sponge, ScalarField::zero());
check_build_field_merkle_tree::<_, _, Bls12_377TreeBuilder>(18, 2, &sponge, &sponge, ScalarField::zero());
}
}


@@ -24,6 +24,6 @@ pub(crate) mod tests {
let ctx = device_context::DeviceContext::default();
let sponge = Poseidon::load(2, &ctx).unwrap();
check_build_field_merkle_tree::<_, _, Bls12_381TreeBuilder>(25, 2, &sponge, &sponge, ScalarField::zero());
check_build_field_merkle_tree::<_, _, Bls12_381TreeBuilder>(18, 2, &sponge, &sponge, ScalarField::zero());
}
}


@@ -25,6 +25,22 @@ extern "C" {
config: &HashConfig,
) -> CudaError;
pub(crate) fn sha3_256_cuda(
input: *const u8,
input_block_size: u32,
number_of_blocks: u32,
output: *mut u8,
config: &HashConfig,
) -> CudaError;
pub(crate) fn sha3_512_cuda(
input: *const u8,
input_block_size: u32,
number_of_blocks: u32,
output: *mut u8,
config: &HashConfig,
) -> CudaError;
pub(crate) fn build_keccak256_merkle_tree_cuda(
leaves: *const u8,
digests: *mut u64,
@@ -40,6 +56,22 @@ extern "C" {
input_block_len: u32,
config: &TreeBuilderConfig,
) -> CudaError;
pub(crate) fn build_sha3_256_merkle_tree_cuda(
leaves: *const u8,
digests: *mut u64,
height: u32,
input_block_len: u32,
config: &TreeBuilderConfig,
) -> CudaError;
pub(crate) fn build_sha3_512_merkle_tree_cuda(
leaves: *const u8,
digests: *mut u64,
height: u32,
input_block_len: u32,
config: &TreeBuilderConfig,
) -> CudaError;
}
pub fn keccak256(
@@ -86,6 +118,50 @@ pub fn keccak512(
}
}
pub fn sha3_256(
input: &(impl HostOrDeviceSlice<u8> + ?Sized),
input_block_size: u32,
number_of_blocks: u32,
output: &mut (impl HostOrDeviceSlice<u8> + ?Sized),
config: &HashConfig,
) -> IcicleResult<()> {
let mut local_cfg = config.clone();
local_cfg.are_inputs_on_device = input.is_on_device();
local_cfg.are_outputs_on_device = output.is_on_device();
unsafe {
sha3_256_cuda(
input.as_ptr(),
input_block_size,
number_of_blocks,
output.as_mut_ptr(),
&local_cfg,
)
.wrap()
}
}
pub fn sha3_512(
input: &(impl HostOrDeviceSlice<u8> + ?Sized),
input_block_size: u32,
number_of_blocks: u32,
output: &mut (impl HostOrDeviceSlice<u8> + ?Sized),
config: &HashConfig,
) -> IcicleResult<()> {
let mut local_cfg = config.clone();
local_cfg.are_inputs_on_device = input.is_on_device();
local_cfg.are_outputs_on_device = output.is_on_device();
unsafe {
sha3_512_cuda(
input.as_ptr(),
input_block_size,
number_of_blocks,
output.as_mut_ptr(),
&local_cfg,
)
.wrap()
}
}
pub fn build_keccak256_merkle_tree(
leaves: &(impl HostOrDeviceSlice<u8> + ?Sized),
digests: &mut (impl HostOrDeviceSlice<u64> + ?Sized),
@@ -123,3 +199,41 @@ pub fn build_keccak512_merkle_tree(
.wrap()
}
}
pub fn build_sha3_256_merkle_tree(
leaves: &(impl HostOrDeviceSlice<u8> + ?Sized),
digests: &mut (impl HostOrDeviceSlice<u64> + ?Sized),
height: usize,
input_block_len: usize,
config: &TreeBuilderConfig,
) -> IcicleResult<()> {
unsafe {
build_sha3_256_merkle_tree_cuda(
leaves.as_ptr(),
digests.as_mut_ptr(),
height as u32,
input_block_len as u32,
config,
)
.wrap()
}
}
pub fn build_sha3_512_merkle_tree(
leaves: &(impl HostOrDeviceSlice<u8> + ?Sized),
digests: &mut (impl HostOrDeviceSlice<u64> + ?Sized),
height: usize,
input_block_len: usize,
config: &TreeBuilderConfig,
) -> IcicleResult<()> {
unsafe {
build_sha3_512_merkle_tree_cuda(
leaves.as_ptr(),
digests.as_mut_ptr(),
height as u32,
input_block_len as u32,
config,
)
.wrap()
}
}
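The new `sha3_256`/`sha3_512` entry points and merkle-tree builders mirror the existing keccak256/keccak512 wrappers one-to-one, presumably differing only in the padding constant threaded through the kernel earlier in this changeset. When wiring these up it can be handy to have a host-side reference to compare GPU digests against; a small Go check using golang.org/x/crypto/sha3 (not part of this changeset) that produces both variants:

```go
// Host-side reference digests: x/crypto/sha3 exposes both NIST SHA3-256
// (pad byte 0x06) and the legacy 0x01-padded Keccak-256.
package main

import (
	"fmt"

	"golang.org/x/crypto/sha3"
)

func main() {
	msg := make([]byte, 136) // one full rate-136 block of 0x01 bytes
	for i := range msg {
		msg[i] = 1
	}

	sha := sha3.New256()                // SHA3-256: domain byte 0x06
	keccak := sha3.NewLegacyKeccak256() // Keccak-256: domain byte 0x01
	sha.Write(msg)
	keccak.Write(msg)

	fmt.Printf("sha3-256:   %x\n", sha.Sum(nil))    // 32-byte digest
	fmt.Printf("keccak-256: %x\n", keccak.Sum(nil)) // 32-byte digest, different value
}
```

Both 256-bit variants emit 32-byte digests, which is presumably why the test buffer in the hunk below shrinks from 64 to 32 bytes per hash.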


@@ -15,7 +15,7 @@ pub(crate) mod tests {
let number_of_hashes = 1024;
let preimages = vec![1u8; number_of_hashes * input_block_len];
let mut digests = vec![0u8; number_of_hashes * 64];
let mut digests = vec![0u8; number_of_hashes * 32];
let preimages_slice = HostSlice::from_slice(&preimages);
let digests_slice = HostSlice::from_mut_slice(&mut digests);