mirror of
https://github.com/zkemail/zk-regex.git
synced 2026-01-10 06:07:58 -05:00
feat: added detailed readme for compiler
70
README.md
Normal file
@@ -0,0 +1,70 @@
# ZK-Regex: Verifiable Regular Expressions in Arithmetic Circuits

`zk-regex` enables proving regular expression matching within zero-knowledge circuits. It compiles standard regex patterns into circuit-friendly Non-deterministic Finite Automata (NFAs) and generates corresponding circuit code for **[Circom](https://docs.circom.io/)** and **[Noir](https://noir-lang.org/)** proving systems.

This allows developers to build ZK applications that can verifiably process or validate text based on complex patterns without revealing the text itself.

## Key Features

- **Regex Compilation:** Converts standard regular expression syntax into NFAs optimized for ZK circuits.
- **Circuit Generation:** Automatically generates verifiable circuit code for:
  - [Circom](https://docs.circom.io/)
  - [Noir](https://noir-lang.org/)
- **Helper Libraries:** Provides supporting libraries and circuit templates for easier integration into Circom and Noir projects.
- **Underlying Tech:** Leverages the robust Thompson NFA construction via the Rust [`regex-automata`](https://github.com/rust-lang/regex/tree/master/regex-automata) crate.

## Project Structure

The project is organized into the following packages:

- **`compiler/`**: The core Rust library responsible for parsing regex patterns, building NFAs, and generating circuit code. See [compiler/README.md](./compiler/README.md) for API details and usage.
- **`circom/`**: Contains Circom templates and helper circuits required to use the generated regex verification circuits within a Circom project. See [circom/README.md](./circom/README.md) for integration details.
- **`noir/`**: Contains Noir contracts/libraries required to use the generated regex verification logic within a Noir project. See [noir/README.md](./noir/README.md) for integration details.

## High-Level Workflow

1. **Define Regex:** Start with your regular expression, either as a single raw pattern or, as shown here, as a decomposed config that the compiler combines into one pattern.
   ```json
   {
     "parts": [
       { "Pattern": "(?:\r\n|^)subject:" },
       { "PublicPattern": ["[a-z]+", 128] },
       { "Pattern": "\r\n" }
     ]
   }
   ```
2. **Compile & Generate Circuit:** Use the `zk-regex-compiler` library to compile the pattern and generate circuit code for your chosen framework (Circom or Noir).

   ```rust
   // Simplified example - see compiler/README.md for full usage.
   // Assumes these items are exported at the crate root.
   use zk_regex_compiler::{gen_from_decomposed, DecomposedRegexConfig, ProvingFramework, RegexPart};

   // `parts` must be mutable so that the pushes below compile.
   let mut parts = Vec::new();
   parts.push(RegexPart::Pattern("(?:\\r\\n|^)subject:".to_string()));
   parts.push(RegexPart::PublicPattern(("([a-z]+)".to_string(), 128)));
   parts.push(RegexPart::Pattern("\r\n".to_string()));
   let decomposed_config = DecomposedRegexConfig { parts };

   // Pass the config (not the moved `parts` vector) to the generator.
   let (nfa, circom_code) = gen_from_decomposed(decomposed_config, "MyRegex", ProvingFramework::Circom)?;
   // Save or use circom_code
   ```

3. **Integrate Circuit:** Include the generated code and the corresponding helper library ([`zk-regex-circom`](./circom/README.md) or [`zk-regex-noir`](./noir/README.md)) in your ZK project.
4. **Generate Inputs:** Use the `zk-regex-compiler`'s [`gen_circuit_inputs`](./compiler/README.md#gen_circuit_inputsnfa-nfagraph-input-str-max_haystack_len-usize-max_match_len-usize-proving_framework-provingframework---resultproverinputs-compilererror) function to prepare the private and public inputs for your prover based on the text you want to match.
5. **Prove & Verify:** Run your ZK proving system using the generated inputs and circuit. The proof demonstrates that the (private) text matches the (public) regex pattern.
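
To make the decomposed format from step 1 more concrete, here is a minimal, self-contained sketch of how an ordered list of parts could be joined into a single pattern. It is an illustration only: `Part` and `combine` are hypothetical stand-ins, not the crate's `RegexPart` or its real logic, and wrapping each public part in a capture group is an assumption of the sketch (the snippets above write the public part both with and without parentheses, so the compiler's actual behavior should be checked in the source).

```rust
/// Hypothetical stand-in for one part of a decomposed regex.
enum Part {
    /// Matched but not revealed.
    Pattern(String),
    /// Matched and revealed, with a maximum byte length.
    PublicPattern(String, usize),
}

/// Joins the parts in order. Public parts are wrapped in a capture group
/// (an assumption of this sketch) and their length limits are collected.
fn combine(parts: &[Part]) -> (String, Vec<usize>) {
    let mut pattern = String::new();
    let mut max_bytes = Vec::new();
    for part in parts {
        match part {
            Part::Pattern(p) => pattern.push_str(p),
            Part::PublicPattern(p, max_len) => {
                pattern.push('(');
                pattern.push_str(p);
                pattern.push(')');
                max_bytes.push(*max_len);
            }
        }
    }
    (pattern, max_bytes)
}
```

Under these assumptions, the collected limits play the same role as the `max_bytes` argument that `gen_from_raw` takes for a raw pattern: one entry per capture group.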

## Installation

Installation details depend on which part of the project you need:

- **Compiler:** If using the compiler directly in a Rust project, add it to your `Cargo.toml`. See [compiler/README.md](./compiler/README.md).
- **Circom Helpers:** See [circom/README.md](./circom/README.md) for instructions on integrating the Circom templates.
- **Noir Helpers:** See [noir/README.md](./noir/README.md) for instructions on adding the Noir library dependency.

## Contributing

Contributions are welcome! Please follow standard Rust development practices. Open an issue to discuss major changes before submitting a pull request.

## License

This project is licensed under the [Specify License Here - e.g., MIT License or Apache 2.0].
@@ -1,15 +1,3 @@
-# circom
+# zk-Regex Circom
 
-To install dependencies:
-
-```bash
-bun install
-```
-
-To run:
-
-```bash
-bun run index.ts
-```
-
-This project was created using `bun init` in bun v1.2.5. [Bun](https://bun.sh) is a fast all-in-one JavaScript runtime.
+This package provides the necessary Circom templates to integrate regex verification logic generated by the `zk-regex-compiler` into your Circom projects.
139
compiler/README.md
Normal file
@@ -0,0 +1,139 @@
# ZK-Regex Compiler

This package contains the core Rust library for compiling regular expressions into circuit-friendly Non-deterministic Finite Automata (NFAs) and generating circuit code for Circom and Noir.

It uses the [`regex-automata`](https://github.com/rust-lang/regex/tree/master/regex-automata) crate to parse regex patterns and construct Thompson NFAs, which are then processed to create structures suitable for arithmetic circuits.
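
One way to picture what "suitable for arithmetic circuits" can mean: instead of searching for a match, a circuit can check a claimed walk through the automaton one step at a time. The sketch below is a toy illustration in plain Rust, not the compiler's actual data structures or algorithm; `Transition`, `toy_table`, and `check_path` are hypothetical names, and the transition table for `a+b` is written by hand.

```rust
/// One allowed move of a toy automaton: in state `from`, reading `byte`, go to `to`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Transition {
    from: usize,
    byte: u8,
    to: usize,
}

/// Hand-written transition table for the pattern `a+b` (hypothetical, for illustration).
/// State 0 is the start state and state 2 is the only accepting state.
fn toy_table() -> Vec<Transition> {
    vec![
        Transition { from: 0, byte: b'a', to: 1 },
        Transition { from: 1, byte: b'a', to: 1 },
        Transition { from: 1, byte: b'b', to: 2 },
    ]
}

/// Checks a claimed path: `states[i]` is the state before reading `input[i]`
/// and `states[i + 1]` is the state after it. Every step must appear in the table,
/// the path must start in state 0, and it must end in `accept`.
fn check_path(table: &[Transition], input: &[u8], states: &[usize], accept: usize) -> bool {
    if states.len() != input.len() + 1 || states.first() != Some(&0) {
        return false;
    }
    let steps_ok = input.iter().enumerate().all(|(i, &byte)| {
        table.contains(&Transition { from: states[i], byte, to: states[i + 1] })
    });
    steps_ok && states.last() == Some(&accept)
}
```

Each step is a fixed-size membership check, which is the kind of uniform, branch-free work that maps naturally onto constraints; the search for a valid path happens outside the circuit.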

## Core API

The main functions are exposed through [`lib.rs`](./src/lib.rs):

- **`compile(pattern: &str) -> Result<NFAGraph, CompilerError>`**

  - Parses the input regex `pattern` string.
  - Builds an internal NFA representation ([`NFAGraph`](./src/types.rs)).
  - Returns the `NFAGraph` or a [`CompilerError::RegexCompilation`](./src/error.rs) if the pattern is invalid.

- **`gen_from_raw(pattern: &str, max_bytes: Option<Vec<usize>>, template_name: &str, proving_framework: ProvingFramework) -> Result<(NFAGraph, String), CompilerError>`**

  - Compiles a raw regex `pattern` string directly into circuit code.
  - `max_bytes`: Optional vector specifying the maximum byte length of each capture group. The behavior for `None` is not documented here (defaults may apply, or capture groups may not be handled specially), so check the source before relying on it.
  - `template_name`: A name used for the main template/contract in the generated code (e.g., Circom template name).
  - `proving_framework`: Specifies the target output ([`ProvingFramework::Circom`](./src/types.rs#L23) or [`ProvingFramework::Noir`](./src/types.rs#L23)).
  - Returns a tuple containing the compiled [`NFAGraph`](./src/nfa/mod.rs#L32) and the generated circuit code as a `String`, or a [`CompilerError`](./src/error.rs#L5).

- **`gen_from_decomposed(config: DecomposedRegexConfig, template_name: &str, proving_framework: ProvingFramework) -> Result<(NFAGraph, String), CompilerError>`**

  - Constructs a regex pattern by combining parts defined in the `config` (of type [`DecomposedRegexConfig`](./src/types.rs#L15)).
  - Generates circuit code similarly to `gen_from_raw`.
  - Useful for building complex regex patterns programmatically.
  - Returns a tuple containing the compiled [`NFAGraph`](./src/nfa/mod.rs#L32) and the generated circuit code as a `String`, or a [`CompilerError`](./src/error.rs#L5).
  - _(Note: See [`DecomposedRegexConfig`](./src/types.rs#L15) for the structure of the config.)_

- **`gen_circuit_inputs(nfa: &NFAGraph, input: &str, max_haystack_len: usize, max_match_len: usize, proving_framework: ProvingFramework) -> Result<ProverInputs, CompilerError>`**

  - Generates the necessary inputs for the prover based on the compiled [`nfa`](./src/nfa/mod.rs#L32), the `input` string to match against, and circuit constraints.
  - `max_haystack_len`: The maximum length of the input string allowed by the circuit.
  - `max_match_len`: The maximum length of the regex match allowed by the circuit.
  - `proving_framework`: Specifies for which framework ([`Circom`](./src/types.rs#L23) or [`Noir`](./src/types.rs#L23)) the inputs should be formatted.
  - Returns a [`ProverInputs`](./src/types.rs#L33) struct (containing formatted public and private inputs) or a [`CompilerError::CircuitInputsGeneration`](./src/error.rs).
  - _(Note: See [`ProverInputs`](./src/types.rs#L33) for the structure of the inputs for each framework.)_

## Usage Examples (Rust)

Add this crate to your `Cargo.toml`:

```toml
[dependencies]
zk-regex-compiler = { git = "https://github.com/zkemail/zk-regex", package = "compiler" }
```

**Example 1: Compile a simple regex to NFA**

```rust
use zk_regex_compiler::{compile, CompilerError};

fn main() -> Result<(), CompilerError> {
    let pattern = r"^a+b*$";
    let nfa = compile(pattern)?;
    println!("Successfully compiled regex to NFA with {} states.", nfa.states().len());
    // You can now inspect the nfa graph structure
    Ok(())
}
```

**Example 2: Generate Circom Code**

```rust
use zk_regex_compiler::{gen_from_raw, ProvingFramework, CompilerError};

fn main() -> Result<(), CompilerError> {
    let pattern = r"(a|b){2,3}";
    let template_name = "ABRegex";
    // The NFA is returned alongside the code; it is not used in this example.
    let (_nfa, circom_code) = gen_from_raw(pattern, None, template_name, ProvingFramework::Circom)?;

    println!("Generated Circom Code:\n{}", circom_code);
    // Save circom_code to a .circom file or use it directly
    Ok(())
}
```
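
For the "save to a file" step mentioned in the comment above, a small helper using only the standard library could look like the following. This is a sketch, not part of the crate: `save_circuit` is a hypothetical name, and the directory layout and file extension are left to the caller.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes generated circuit source to `<dir>/<name>.<ext>` and returns the path.
/// `name` and `ext` are chosen by the caller, e.g. ("ABRegex", "circom").
fn save_circuit(dir: &Path, name: &str, ext: &str, code: &str) -> io::Result<PathBuf> {
    // Create the output directory if it does not exist yet.
    fs::create_dir_all(dir)?;
    let path = dir.join(format!("{name}.{ext}"));
    fs::write(&path, code)?;
    Ok(path)
}
```

The same helper would cover the Noir case by passing `"nr"` as the extension.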

**Example 3: Generate Noir Code**

```rust
use zk_regex_compiler::{gen_from_raw, ProvingFramework, CompilerError};

fn main() -> Result<(), CompilerError> {
    let pattern = r"\d{3}-\d{3}-\d{4}"; // Example: Phone number
    let template_name = "PhoneRegex";
    // The NFA is returned alongside the code; it is not used in this example.
    let (_nfa, noir_code) = gen_from_raw(pattern, None, template_name, ProvingFramework::Noir)?;

    println!("Generated Noir Code:\n{}", noir_code);
    // Save noir_code to a .nr file or integrate into a Noir project
    Ok(())
}
```

**Example 4: Generate Circuit Inputs**

```rust
use zk_regex_compiler::{compile, gen_circuit_inputs, ProvingFramework, CompilerError};

fn main() -> Result<(), CompilerError> {
    let pattern = r"abc";
    let nfa = compile(pattern)?;

    let input_str = "test abc test";
    let max_haystack_len = 64; // Must match circuit parameter
    let max_match_len = 16; // Must match circuit parameter

    // Generate inputs for Circom
    let circom_inputs = gen_circuit_inputs(&nfa, input_str, max_haystack_len, max_match_len, ProvingFramework::Circom)?;
    // Debug output only; format/serialize ProverInputs before handing it to a prover
    println!("Circom Inputs: {:?}", circom_inputs);

    // Generate inputs for Noir
    let noir_inputs = gen_circuit_inputs(&nfa, input_str, max_haystack_len, max_match_len, ProvingFramework::Noir)?;
    // Debug output only; format/serialize ProverInputs before handing it to a prover
    println!("Noir Inputs: {:?}", noir_inputs);

    Ok(())
}
```
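
The "must match circuit parameter" comments reflect that a circuit has a fixed input size, so shorter inputs have to be brought up to that size and longer ones rejected. A rough sketch of what that implies for the haystack follows; zero-padding is an assumption of this sketch, and `pad_haystack` is a hypothetical helper rather than a function of the crate.

```rust
/// Pads `input` with zero bytes up to `max_haystack_len`.
/// Returns an error message if the input is longer than the circuit allows.
fn pad_haystack(input: &str, max_haystack_len: usize) -> Result<Vec<u8>, String> {
    let bytes = input.as_bytes();
    if bytes.len() > max_haystack_len {
        return Err(format!(
            "input is {} bytes but the circuit allows at most {}",
            bytes.len(),
            max_haystack_len
        ));
    }
    let mut padded = bytes.to_vec();
    // Fill the remainder with zeros so the vector has exactly the circuit's size.
    padded.resize(max_haystack_len, 0);
    Ok(padded)
}
```

With the values from Example 4, the 13-byte input would occupy the first 13 slots of a 64-byte haystack.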

## Error Handling

The library uses the [`CompilerError`](./src/error.rs) enum to report issues:

- `RegexCompilation(String)`: An error occurred during regex parsing or NFA construction (from [`regex-automata`](https://github.com/rust-lang/regex/tree/master/regex-automata)).
- `CircuitGeneration(String)`: An error occurred during the generation of Circom or Noir code.
- `CircuitInputsGeneration(String)`: An error occurred while generating prover inputs for a given string.

Match on the enum variants to handle errors appropriately.
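
As a sketch of what matching on the variants might look like, the block below defines a local stand-in enum with the three variants listed above so that it compiles on its own. The real `CompilerError` may carry additional variants or different payloads; `describe` is a hypothetical helper.

```rust
/// Local stand-in that mirrors the three variants listed above.
/// In real code, import `CompilerError` from the crate instead.
#[derive(Debug)]
enum CompilerError {
    RegexCompilation(String),
    CircuitGeneration(String),
    CircuitInputsGeneration(String),
}

/// Turns an error into a user-facing hint, one arm per variant.
fn describe(err: &CompilerError) -> String {
    match err {
        CompilerError::RegexCompilation(msg) => format!("invalid regex: {msg}"),
        CompilerError::CircuitGeneration(msg) => format!("code generation failed: {msg}"),
        CompilerError::CircuitInputsGeneration(msg) => {
            format!("could not build prover inputs: {msg}")
        }
    }
}
```

Keeping one arm per variant, without a catch-all, means the compiler flags this function if a variant is added to the local enum later.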

## Building & Testing

Navigate to the `compiler/` directory and use standard Cargo commands:

```bash
cargo build --release
cargo test
```
3
noir/README.md
Normal file
@@ -0,0 +1,3 @@
# zk-Regex Noir Library

This package provides the necessary Noir libraries/contracts and helper functions to integrate regex verification logic generated by the `zk-regex-compiler` into your Noir projects.