docs: re-write documentation

This commit is contained in:
aquint-zama
2022-06-01 15:41:37 +02:00
committed by Umut
parent 546ed48765
commit 35e46aca69
53 changed files with 1505 additions and 1673 deletions

View File

@@ -1,45 +1,14 @@
# Compilation Pipeline In Depth
# Compilation
## What is **concrete-numpy**?
**concrete-numpy** is a convenient Python package, made on top of the **Concrete Compiler** and **Concrete Library**, for developing homomorphic applications. One of its essential functionalities is to transform Python functions to their `MLIR` equivalent. Unfortunately, not all Python functions can be converted due to the limits of the current product (we are in the alpha stage), or sometimes due to inherent restrictions of FHE itself. However, you can already build interesting and impressive use cases, and more will be available in further versions of the framework.
## How can I use it?
```python
# Import necessary Concrete components
import concrete.numpy as cnp
# Define the function to homomorphize
def f(x, y):
    return (2 * x) + y
# Create a Compiler
compiler = cnp.Compiler(f, {"x": "encrypted", "y": "encrypted"})
# Compile to a Circuit using an inputset
inputset = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)]
circuit = compiler.compile(inputset)
# Make homomorphic inference
circuit.encrypt_run_decrypt(1, 0)
```
## Overview of the numpy compilation process
The compilation journey begins with tracing to get an easy to understand and manipulate representation of the function. We call this representation `Computation Graph` which is basically a Directed Acyclic Graph (DAG) containing nodes representing the computations done in the function. Working with graphs is good because they have been studied extensively over the years and there are a lot of algorithms to manipulate them. Internally, we use [networkx](https://networkx.org) which is an excellent graph library for Python.
The compilation journey begins with tracing to get an easy-to-manipulate representation of the function. We call this representation a `Computation Graph`, which is basically a Directed Acyclic Graph (DAG) containing nodes representing the computations done in the function. Working with graphs is good because they have been studied extensively over the years and there are a lot of algorithms to manipulate them. Internally, we use [networkx](https://networkx.org), which is an excellent graph library for Python.
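To make this concrete, here is a minimal sketch (an illustration only, not the internal representation used by **concrete-numpy**) of how `(2 * x) + 3` could be stored as a networkx DAG with labeled edges:
```python
import networkx as nx

# An illustrative computation graph for (2 * x) + 3; node and edge attributes
# here are made up for the example, not concrete-numpy internals.
graph = nx.DiGraph()
graph.add_node("x", kind="input")
graph.add_node("2", kind="constant", value=2)
graph.add_node("3", kind="constant", value=3)
graph.add_node("*", kind="operation")
graph.add_node("+", kind="operation")

# Edge labels record operand order, which matters for non-commutative operations.
graph.add_edge("2", "*", input_idx=0)
graph.add_edge("x", "*", input_idx=1)
graph.add_edge("*", "+", input_idx=0)
graph.add_edge("3", "+", input_idx=1)

assert nx.is_directed_acyclic_graph(graph)
print(list(nx.topological_sort(graph)))  # e.g., ['x', '2', '3', '*', '+']
```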
The next step in the compilation is transforming the computation graph. There are many transformations we perform, and they will be discussed in their own sections. In any case, the result of transformations is just another computation graph.
After transformations are applied, we need to determine the bounds (i.e., the minimum and the maximum values) of each intermediate node. This is required because FHE currently allows a limited precision for computations. Bound measurement is our way to know what is the needed precision for the function.
After transformations are applied, we need to determine the bounds (i.e., the minimum and the maximum values) of each intermediate node. This is required because FHE currently allows a limited precision for computations. Bound measurement is our way to know what is the required precision for the function.
The final step is to transform the computation graph to equivalent `MLIR` code. How this is done will be explained in detail in its own chapter.
Once the MLIR is prepared, the rest of the stack, which you can learn more about [here](http://docs.zama.ai/), takes over and completes the compilation process.
Here is the visual representation of the pipeline:
![Frontend Flow](../_static/compilation-pipeline/frontend_flow.svg)
Once the MLIR is generated, we send it to the **Concrete Compiler**, and it completes the compilation process.
## Tracing
@@ -52,13 +21,11 @@ def f(x):
the goal of tracing is to create the following computation graph without needing any change from the user.
![](../_static/compilation-pipeline/two_x_plus_three.png)
![](../\_static/compilation-pipeline/two\_x\_plus\_three.png)
(Note that the edge labels are for non-commutative operations. To give an example, a subtraction node represents `(predecessor with edge label 0) - (predecessor with edge label 1)`)
To do this, we make use of Tracers, which are objects that record the operation performed during their creation. We create a `Tracer` for each argument of the function and call the function with those tracers. Tracers make use of operator overloading feature of Python to achieve their goal.
Here is an example:
To do this, we make use of `Tracer`s, which are objects that record the operation performed during their creation. We create a `Tracer` for each argument of the function and call the function with those tracers. `Tracer`s make use of the operator overloading feature of Python to achieve their goal:
@@ -70,11 +37,11 @@ y = Tracer(computation=Input("y"))
```
def f(x, y):
    return x + (2 * y)

x = Tracer(computation=Input("x"))
y = Tracer(computation=Input("y"))
resulting_tracer = f(x, y)
```
`2 * y` will be performed first, and `*` is overloaded for `Tracer` to return another tracer: `Tracer(computation=Multiply(Constant(2), self.computation))` which is equal to: `Tracer(computation=Multiply(Constant(2), Input("y")))`
`2 * y` will be performed first, and `*` is overloaded for `Tracer` to return another tracer: `Tracer(computation=Multiply(Constant(2), self.computation))`, which is equal to `Tracer(computation=Multiply(Constant(2), Input("y")))`
`x + (2 * y)` will be performed next, and `+` is overloaded for `Tracer` to return another tracer: `Tracer(computation=Add(self.computation, (2 * y).computation))` which is equal to: `Tracer(computation=Add(Input("x"), Multiply(Constant(2), Input("y"))))`
`x + (2 * y)` will be performed next, and `+` is overloaded for `Tracer` to return another tracer: `Tracer(computation=Add(self.computation, (2 * y).computation))`, which is equal to `Tracer(computation=Add(Input("x"), Multiply(Constant(2), Input("y"))))`
In the end, we will have output Tracers that can be used to create the computation graph. The implementation is a bit more complex than this, but the idea is the same.
In the end, we will have output tracers that can be used to create the computation graph. The implementation is a bit more complex than this, but the idea is the same.
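Here is a heavily simplified, self-contained sketch of that idea. The classes below are illustrative only, not the actual **concrete-numpy** internals:
```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Input:
    name: str


@dataclass
class Constant:
    value: Any


@dataclass
class Add:
    lhs: Any
    rhs: Any


@dataclass
class Multiply:
    lhs: Any
    rhs: Any


class Tracer:
    def __init__(self, computation):
        self.computation = computation

    @staticmethod
    def _lift(value):
        # wrap plain Python values into tracers holding constants
        return value if isinstance(value, Tracer) else Tracer(Constant(value))

    def __add__(self, other):
        return Tracer(Add(self.computation, Tracer._lift(other).computation))

    def __radd__(self, other):
        return Tracer(Add(Tracer._lift(other).computation, self.computation))

    def __mul__(self, other):
        return Tracer(Multiply(self.computation, Tracer._lift(other).computation))

    def __rmul__(self, other):
        return Tracer(Multiply(Tracer._lift(other).computation, self.computation))


def f(x, y):
    return x + (2 * y)


x = Tracer(computation=Input("x"))
y = Tracer(computation=Input("y"))
resulting_tracer = f(x, y)

print(resulting_tracer.computation)
# Add(lhs=Input(name='x'), rhs=Multiply(lhs=Constant(value=2), rhs=Input(name='y')))
```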
Tracing is also responsible for indicating whether the values in the node would be encrypted or not, and the rule for that is if a node has an encrypted predecessor, it is encrypted as well.
@@ -86,33 +53,33 @@ With the current version of **Concrete Numpy**, floating point inputs and floati
Let's take a closer look at the transforms we can currently perform.
### Fusing floating point operations
### Fusing.
We have allocated a whole new chapter to explaining float fusing. You can find it [here](float-fusing.md).
We have allocated a whole new chapter to explaining fusing. You can find it after this chapter.
## Bounds measurement
Given a computation graph, the goal of the bounds measurement step is to assign the minimal data type to each node in the graph.
Let's say we have an encrypted input that is always between `0` and `10`, we should assign the type `Encrypted<uint4>` to node of this input as `Encrypted<uint4>` is the minimal encrypted integer that supports all the values between `0` and `10`.
Let's say we have an encrypted input that is always between `0` and `10`. We should assign the type `Encrypted<uint4>` to the node of this input as `Encrypted<uint4>` is the minimal encrypted integer that supports all values between `0` and `10`.
If there were negative values in the range, we could have used `intX` instead of `uintX`.
Bounds measurement is necessary because FHE supports limited precision, and we don't want unexpected behaviour during evaluation of the compiled functions.
Bounds measurement is necessary because FHE supports limited precision, and we don't want unexpected behaviour while evaluating the compiled functions.
Let's take a closer look at how we perform bounds measurement.
### Inputset evaluation
### Inputset evaluation.
This is a simple approach that requires an inputset to be provided by the user.
The inputset is not to be confused with the dataset which is classical in ML, as it doesn't require labels. Rather, it is a set of values which are typical inputs of the function.
The inputset is not to be confused with the dataset, which is classical in ML, as it doesn't require labels. Rather, it is a set of values which are typical inputs of the function.
The idea is to evaluate each input in the inputset and record the result of each operation in the computation graph. Then we compare the evaluation results with the current minimum/maximum values of each node and update the minimum/maximum accordingly. After the entire inputset is evaluated, we assign a data type to each node using the minimum and the maximum value it contains.
The idea is to evaluate each input in the inputset and record the result of each operation in the computation graph. Then we compare the evaluation results with the current minimum/maximum values of each node and update the minimum/maximum accordingly. After the entire inputset is evaluated, we assign a data type to each node using the minimum and the maximum values it contains.
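Here is a minimal sketch of that idea for the graph of `(2 * x) + 3` (hand-written for illustration; the actual implementation walks the computation graph generically):
```python
def measure_bounds(inputset):
    """Record the minimum/maximum value of each node of (2 * x) + 3."""
    bounds = {}  # node label -> (minimum, maximum)

    def record(name, value):
        low, high = bounds.get(name, (value, value))
        bounds[name] = (min(low, value), max(high, value))

    for x in inputset:
        record("x", x)
        record("2", 2)
        record("*", 2 * x)
        record("3", 3)
        record("+", (2 * x) + 3)

    return bounds


print(measure_bounds([2, 3, 1]))
# {'x': (1, 3), '2': (2, 2), '*': (2, 6), '3': (3, 3), '+': (5, 9)}
```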
Here is an example, given this computation graph where `x` is encrypted:
![](../_static/compilation-pipeline/two_x_plus_three.png)
![](../\_static/compilation-pipeline/two\_x\_plus\_three.png)
and this inputset:
@@ -178,156 +145,4 @@ Assigned Data Types:
## MLIR conversion
The actual compilation will be done by the **Concrete** compiler, which is expecting an MLIR input. The MLIR conversion goes from a computation graph to its MLIR equivalent. You can read more about it [here](mlir.md)
## Example walkthrough #1
### Function to homomorphize
```
def f(x):
    return (2 * x) + 3
```
### Parameters
```
x = "encrypted"
```
#### Corresponding computation graph
![](../_static/compilation-pipeline/two_x_plus_three.png)
### Topological transforms
#### Fusing floating point operations
This transform isn't applied since the computation doesn't involve any floating point operations.
### Bounds measurement using \[2, 3, 1] as inputset (same settings as above)
Data Types:
* `x`: Encrypted<**uint2**>
* `2`: Clear<**uint2**>
* `*`: Encrypted<**uint3**>
* `3`: Clear<**uint2**>
* `+`: Encrypted<**uint4**>
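As a sanity check, here is a small sketch (not the library's code) showing how these unsigned bit widths follow from the maximum value recorded for each node with the inputset `[2, 3, 1]`:
```python
import math

def unsigned_bit_width(maximum):
    # number of bits needed to represent every integer in [0, maximum]
    return max(int(math.ceil(math.log2(maximum + 1))), 1)

# maximum values observed for each node with the inputset [2, 3, 1]
maximums = {"x": 3, "2": 2, "*": 6, "3": 3, "+": 9}
for node, maximum in maximums.items():
    print(f"{node}: uint{unsigned_bit_width(maximum)}")
# x: uint2, 2: uint2, *: uint3, 3: uint2, +: uint4
```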
### MLIR lowering
```
module {
func @main(%arg0: !FHE.eint<4>) -> !FHE.eint<4> {
%c3_i5 = constant 3 : i5
%c2_i5 = constant 2 : i5
%0 = "FHE.mul_eint_int"(%arg0, %c2_i5) : (!FHE.eint<4>, i5) -> !FHE.eint<4>
%1 = "FHE.add_eint_int"(%0, %c3_i5) : (!FHE.eint<4>, i5) -> !FHE.eint<4>
return %1 : !FHE.eint<4>
}
}
```
## Example walkthrough #2
### Function to homomorphize
```
def f(x, y):
    return (42 - x) + (y * 2)
```
### Parameters
```
x = "encrypted"
y = "encrypted"
```
#### Corresponding computation graph
![](../_static/compilation-pipeline/forty_two_minus_x_plus_y_times_two.png)
### Topological transforms
#### Fusing floating point operations
This transform isn't applied since the computation doesn't involve any floating point operations.
### Bounds measurement using \[(6, 0), (5, 1), (3, 0), (4, 1)] as inputset
Evaluation Result of `(6, 0)`:
* `42`: 42
* `x`: 6
* `y`: 0
* `2`: 2
* `-`: 36
* `*`: 0
* `+`: 36
Evaluation Result of `(5, 1)`:
* `42`: 42
* `x`: 5
* `y`: 1
* `2`: 2
* `-`: 37
* `*`: 2
* `+`: 39
Evaluation Result of `(3, 0)`:
* `42`: 42
* `x`: 3
* `y`: 0
* `2`: 2
* `-`: 39
* `*`: 0
* `+`: 39
Evaluation Result of `(4, 1)`:
* `42`: 42
* `x`: 4
* `y`: 1
* `2`: 2
* `-`: 38
* `*`: 2
* `+`: 40
Bounds:
* `42`: \[42, 42]
* `x`: \[3, 6]
* `y`: \[0, 1]
* `2`: \[2, 2]
* `-`: \[36, 39]
* `*`: \[0, 2]
* `+`: \[36, 40]
Data Types:
* `42`: Clear<**uint6**>
* `x`: Encrypted<**uint3**>
* `y`: Encrypted<**uint1**>
* `2`: Clear<**uint2**>
* `-`: Encrypted<**uint6**>
* `*`: Encrypted<**uint2**>
* `+`: Encrypted<**uint6**>
### MLIR lowering
```
module {
func @main(%arg0: !FHE.eint<6>, %arg1: !FHE.eint<6>) -> !FHE.eint<6> {
%c42_i7 = constant 42 : i7
%c2_i7 = constant 2 : i7
%0 = "FHE.sub_int_eint"(%c42_i7, %arg0) : (i7, !FHE.eint<6>) -> !FHE.eint<6>
%1 = "FHE.mul_eint_int"(%arg1, %c2_i7) : (!FHE.eint<6>, i7) -> !FHE.eint<6>
%2 = "FHE.add_eint"(%0, %1) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
return %2 : !FHE.eint<6>
}
}
```
The actual compilation will be done by the **Concrete Compiler**, which is expecting an MLIR input. The MLIR conversion goes from a computation graph to its MLIR equivalent. You can read more about it [here](mlir.md).

View File

@@ -1,64 +1,60 @@
# Contribute
# Contributing
{% hint style="info" %}
There are two ways to contribute to **Concrete Numpy** or to **Concrete** tools in general:
{% hint style='info' %}
There are two ways to contribute to **concrete-numpy** or to **Concrete** tools in general:
- you can open issues to report bugs and typos and to suggest ideas
- you can ask to become an official contributor by emailing hello@zama.ai. Only approved contributors can send pull requests, so please make sure to get in touch before you do!
* You can open issues to report bugs and typos and to suggest ideas.
* You can ask to become an official contributor by emailing hello@zama.ai. Only approved contributors can send pull requests, so please make sure to get in touch before you do!
{% endhint %}
Let's go over some other important things that you need to be careful about.
Now, let's go over some other important items that you need to know.
## Creating a new branch
We are using a consistent branch naming scheme, and you are expected to follow it as well. Here is the format and some examples.
We are using a consistent branch naming scheme, and you are expected to follow it as well. Here is the format:
```shell
git checkout -b {feat|fix|refactor|test|benchmark|doc|style|chore}/short-description_$issue_id
git checkout -b {feat|fix|refactor|test|benchmark|doc|style|chore}/short-description
```
e.g.
...and some examples:
```shell
git checkout -b feat/explicit-tlu_11
git checkout -b fix/tracing_indexing_42
git checkout -b feat/direct-tlu
git checkout -b fix/tracing-indexing
```
## Before committing
### Conformance
### Conformance.
Each commit to **concrete-numpy** should conform to the standards decided by the team. Conformance can be checked using the following command.
Each commit to **Concrete Numpy** should conform to the standards decided by the team. Conformance can be checked using the following command:
```shell
make pcc
```
### pytest
### Testing.
Of course, tests must pass as well.
On top of conformance, all tests must pass with 100% code coverage across the codebase:
```shell
make pytest
```
### Coverage
{% hint style="info" %}
There may be cases where covering 100% of the code is not possible (e.g., exceptions that cannot be triggered in normal execution circumstances). In those cases, you may be allowed to disable coverage for some specific lines. This should be the exception rather than the rule. Reviewers may ask why some lines are not covered and, if it appears they can be covered, then the PR won't be accepted in that state.
{% endhint %}
The last requirement is to make sure you get 100 percent code coverage. The `make pytest` command checks that by default and will fail with a coverage report at the end should some lines of your code not be executed during testing.
## Committing
If your coverage is below 100 percent, you should write more tests and then create the pull request. If you ignore this warning and create the PR, GitHub actions will fail and your PR will not be merged anyway.
There may be cases where covering your code is not possible (e.g., exceptions that cannot be triggered in normal execution circumstances). In those cases, you may be allowed to disable coverage for some specific lines. This should be the exception rather than the rule, and reviewers will ask why some lines are not covered; if it appears they can be covered, then the PR won't be accepted in that state.
## Commiting
We are using a consistent commit naming scheme, and you are expected to follow it as well (the CI will make sure you do). The accepted format can be printed to your terminal by running:
We are using a consistent commit naming scheme, and you are expected to follow it as well. Again, here is the accepted format:
```shell
make show_scope
```
e.g.
...and some examples:
```shell
git commit -m "feat: implement bounds checking"
@@ -66,15 +62,15 @@ git commit -m "feat(debugging): add an helper function to draw intermediate repr
git commit -m "fix(tracing): fix a bug that crashed pytorch tracer"
```
To learn more about conventional commits, check [this](https://www.conventionalcommits.org/en/v1.0.0/) page. Just a reminder that commit messages are checked in the conformance step, and rejected if they don't follow the rules.
To learn more about conventional commits, check [this](https://www.conventionalcommits.org/en/v1.0.0/) page.
## Before creating pull request
## Before creating a pull request
{% hint style='tip' %}
We remind that only official contributors can send pull requests. To become such an official contributor, please email hello@zama.ai.
{% hint style="info" %}
We remind you that only official contributors can send pull requests. To become an official contributor, please email hello@zama.ai.
{% endhint %}
You should rebase on top of `main` branch before you create your pull request. We don't allow merge commits, so rebasing on `main` before pushing gives you the best chance of avoiding having to rewrite parts of your PR later if some conflicts arise with other PRs being merged. After you commit your changes to your new branch, you can use the following commands to rebase:
You should rebase on top of the `main` branch before you create your pull request. We don't allow merge commits, so rebasing on `main` before pushing gives you the best chance of avoiding rewriting parts of your PR later if conflicts arise with other PRs being merged. After you commit your changes to your new branch, you can use the following commands to rebase:
```shell
# fetch the list of active remote branches
```

View File

@@ -1,54 +1,45 @@
# Docker
# Docker Setup
## Setting up docker and X forwarding
## Installation
Before you start this section, go ahead and install docker. You can follow [this](https://docs.docker.com/engine/install/) official guide for that.
Before you start this section, go ahead and install Docker. You can follow [this](https://docs.docker.com/engine/install/) official guide if you need help.
### Linux
## X forwarding
### Linux.
You can use the `xhost` command:
```shell
xhost +localhost
```
### Mac OS
### macOS.
To be able to use X forwarding on Mac OS:
- Install XQuartz
- Open XQuartz.app application, make sure in the application parameters that `authorize network connections` are set (currently in the Security settings)
- Open a new terminal within XQuartz.app and type:
To use X forwarding on Mac OS:
* Install XQuartz
* Open the XQuartz.app application and make sure that `authorize network connections` is set in the application parameters (currently in the Security settings)
* Open a new terminal within XQuartz.app and type:
```shell
xhost +127.0.0.1
```
and now, the X server should be all set in docker (in the regular terminal).
The X server should be all set for Docker in the regular terminal.
### Windows
## Building
Install Xming and use Xlaunch:
- Multiple Windows, Display number: 0
- `Start no client`
- **IMPORTANT**: Check `No Access Control`
- You can save this configuration to re-launch easily, then click finish.
## Logging in and building the image
The Docker image of **Concrete-Numpy** is based on another Docker image provided by the compiler team. Once you have access to this repository, you should be able to launch the commands to build the dev Docker image with `make docker_build`.
Upon joining the team, you need to log in using the following command:
You can use the dedicated target in the Makefile to build the docker image:
```shell
docker login ghcr.io
make docker_build
```
This command will ask for a username and a password. For the username, just enter your GitHub username. For the password, you should create a personal access token from [here](https://github.com/settings/tokens), selecting the `read:packages` permission. Just paste the generated access token as your password, and you are good to go.
## Starting
Once you do that, you can get inside the docker environment using the following command:
You can use the dedicated target in the Makefile to start the docker session:
```shell
make docker_build_and_start
# or equivalently but shorter
make docker_bas
make docker_start
```
After you finish your work, you can leave the docker by using the `exit` command or by pressing `CTRL + D`.

View File

@@ -1,86 +0,0 @@
# Fusing Floating Point Operations
## Why is it needed?
The current compiler stack only supports integers with 7 bits or less. But it's not uncommon to have numpy code using floating point numbers.
We added fusing of floating point operations to make tracing numpy functions somewhat user-friendly and to allow in-line quantization in the numpy code, e.g.:
<!--pytest-codeblocks:skip-->
```python
import numpy
def quantized_sin(x):
    # from a 7 bit unsigned integer x, compute z in the [0; 2 * pi] range
    z = 2 * numpy.pi * x * (1 / 127)
    # quantize over 6 bits and offset to be >= 0, round and convert to integers in range [0; 63]
    quantized_sin = numpy.rint(31 * numpy.sin(z) + 31).astype(numpy.int64)
    # output quantized_sin and a further offset result
    return quantized_sin, quantized_sin + 32
```
This function `quantized_sin` is not strictly supported as-is by the compiler, as there are floating point intermediate values. However, when looking at the function globally, we can see we have a single integer input and a single integer output. As we know the input range, we can compute a table to represent the whole computation for each input value, which can later be lowered to a PBS (programmable bootstrapping) in the FHE world.
Any computation where there is a single variable integer input and a single integer output can be replaced by an equivalent table lookup.
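For instance, assuming `x` is a 7-bit unsigned integer, the whole float subgraph of `quantized_sin` could be replaced by a precomputed 128-entry table. The sketch below illustrates the idea; it is not the compiler's actual lowering:
```python
import numpy

# Precompute the float subgraph of quantized_sin for every possible 7-bit input.
xs = numpy.arange(128)
z = 2 * numpy.pi * xs * (1 / 127)
TABLE = numpy.rint(31 * numpy.sin(z) + 31).astype(numpy.int64)

def quantized_sin_lookup(x):
    # equivalent integer-only computation: a single table lookup
    return TABLE[x], TABLE[x] + 32

print(quantized_sin_lookup(64))  # same results as the original float version
```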
The `quantized_sin` graph of operations:
![](../_static/float_fusing_example/before.png)
The float subgraph that was detected:
![](../_static/float_fusing_example/subgraph.png)
The simplified graph of operations with the float subgraph condensed in a `GenericFunction` node:
![](../_static/float_fusing_example/after.png)
## How is it done in **Concrete Numpy**?
The first step consists of detecting where we go from floating point computation back to integers. This allows the identification of the potential terminal node of the float subgraph we are going to fuse.
From the terminal node, we go back up through the nodes until we find nodes that go from integers to floats. If we find a single node, then we have a fusable subgraph that we replace with an equivalent `GenericFunction` node, and we stop the search for fusable subgraphs for the terminal node being considered. If we find more than one such node, we try to find a single common ancestor that would go from integers to floats. We repeat the process as long as there are potential ancestor nodes, stopping if we find a suitable float subgraph with a single integer input and a single integer output.
Here is an example benefiting from the expanded search:
<!--pytest-codeblocks:skip-->
```python
import numpy

def fusable_with_bigger_search(x, y):
    """fusable with bigger search"""
    x = x + 1
    x_1 = x.astype(numpy.int64)
    x_1 = x_1 + 1.5
    x_2 = x.astype(numpy.int64)
    x_2 = x_2 + 3.4
    add = x_1 + x_2
    add_int = add.astype(numpy.int64)
    return add_int + y
```
The `fusable_with_bigger_search` graph of operations:
![](../_static/float_fusing_example/before_bigger_search.png)
The float subgraph that was detected:
![](../_static/float_fusing_example/subgraph_bigger_search.png)
The simplified graph of operations with the float subgraph condensed in a `GenericFunction` node:
![](../_static/float_fusing_example/after_bigger_search.png)
An example of a non-fusable computation with that technique is:
<!--pytest-codeblocks:skip-->
```python
import numpy
def non_fusable(x, y):
    x_1 = x + 1.5  # x_1 is now float
    y_1 = y + 3.4  # y_1 is now float
    add = x_1 + y_1
    add_int = add.astype(numpy.int64)
    return add_int
```
From `add_int` you will find two `Add` nodes going from int to float (`x_1` and `y_1`), which we cannot represent with a single-input table lookup. The Kolmogorov–Arnold representation theorem states that every multivariate continuous function can be represented as a superposition of continuous functions of one variable ([from Wikipedia](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Arnold\_representation\_theorem)), so the above case could be handled in future versions of **Concrete** tools.

37
docs/dev/fusing.md Normal file
View File

@@ -0,0 +1,37 @@
# Fusing
Fusing is the act of combining multiple nodes into a single node, which is converted to a table lookup.
## How is it done?
Code related to fusing is in the `concrete/numpy/compilation/utils.py` file.
Fusing can be performed using the `fuse` function. Within `fuse`:
1. We loop until there are no more subgraphs to fuse. Within each iteration:
   1. We find a subgraph to fuse:
      1. We search for a terminal node that is appropriate for fusing.
      2. We crawl backwards to find the closest integer nodes to this node.
      3. If there is a single such node, we return the subgraph from this node to the terminal node.
      4. Otherwise, we try to find the lowest common ancestor (lca) of this list of nodes (see the sketch after this list).
      5. If the lca doesn't exist, we say this particular terminal node is not fusable, and we go back to search for another subgraph.
      6. Otherwise, we use this lca as the input of the subgraph and continue with `subgraph` node creation below.
   2. We convert the subgraph into a `subgraph` node, checking the fusability status of the nodes of the subgraph in this step.
   3. We substitute the `subgraph` node into the original graph.
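To illustrate the lowest common ancestor step, here is a minimal, hypothetical example using networkx (the actual `fuse` implementation is more involved). Both float branches originate from the single integer node `x`, so `x` is a valid lca:
```python
import networkx as nx

# Hypothetical graph for np.rint(np.sin(x) + (x + 1.5)): edges go from a node
# to the operations that consume it.
graph = nx.DiGraph()
graph.add_edges_from([
    ("x", "sin"),        # float branch 1 starts at the integer node "x"
    ("x", "add_1.5"),    # float branch 2 also starts at "x"
    ("sin", "add"),
    ("add_1.5", "add"),
    ("add", "rint"),     # terminal node going back to integers
])

print(nx.lowest_common_ancestor(graph, "sin", "add_1.5"))  # x
```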
## Limitations
With the current implementation, we cannot fuse subgraphs that depend on multiple encrypted values where those values don't have a common lca (e.g., `np.round(np.sin(x) + np.cos(y))`).
{% hint style="info" %}
The [Kolmogorov–Arnold representation theorem](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Arnold\_representation\_theorem) states that every multivariate continuous function can be represented as a superposition of continuous functions of one variable. Therefore, the case above could be handled in future versions of **Concrete Numpy**.
{% endhint %}

View File

@@ -1,17 +1,17 @@
# MLIR
The MLIR project is a sub-project of the LLVM project. It's designed to simplify building domain-specific compilers such as ours: Concrete Compiler.
The MLIR project is a sub-project of the LLVM project. It's designed to simplify building domain-specific compilers such as our **Concrete Compiler**.
Concrete Compiler accepts MLIR as input and emits compiled assembly code for the target architecture.
**Concrete Compiler** accepts MLIR as an input and emits compiled assembly code for a target architecture.
Concrete NumPy does the MLIR generation from the computation graph. Code related to this conversion is in `concrete/numpy/mlir` folder.
**Concrete Numpy** performs the MLIR generation from the computation graph. Code related to this conversion is in the `concrete/numpy/mlir` folder.
The conversion can be performed using `convert` method of `GraphConverter` class.
The conversion can be performed using the `convert` method of the `GraphConverter` class.
Within `convert` method of `GraphConverter`:
Within the `convert` method of `GraphConverter`:
* MLIR compatibility of the graph is checked
* Bit-width constraints are checked
* Negative lookup tables are offsetted
* Computation graph is traversed and each node is converted to their corresponding MLIR representation using `NodeConverter` class
* String representation of resulting MLIR is returned
* MLIR compatibility of the graph is checked;
* bit width constraints are checked;
* negative lookup tables are offset;
* the computation graph is traversed and each node is converted to their corresponding MLIR representation using the `NodeConverter` class;
* and string representation of the resulting MLIR is returned.

View File

@@ -1,171 +1,93 @@
# Project Setup
{% hint style='info' %}
It is strongly recommended to use the development docker (see the [docker](./docker.md) guide). However you can setup the project on bare macOS and Linux provided you install the required dependencies (check Dockerfile.env for the required binary packages like make).
The project targets Python 3.8 through 3.9 inclusive.
{% hint style="info" %}
It is **strongly** recommended to use the development Docker image. However, you can set the project up on bare Linux or macOS as long as you have the required dependencies. You can see the required dependencies in `Dockerfile.dev` under the `docker` directory.
{% endhint %}
## Installing Python
## Installing `Python`
**concrete-numpy** is a `Python` library, so `Python` should be installed to develop **concrete-numpy**. `v3.8` and `v3.9` are the only supported versions.
**Concrete Numpy** is a `Python` library, so `Python` should be installed to develop it. `v3.8` and `v3.9` are, currently, the only supported versions.
You can follow [this](https://realpython.com/installing-python/) guide to install it (alternatively you can google `how to install python 3.8 (or 3.9)`).
You probably have Python already, but in case you don't, or in case you have an unsupported version, you can google `how to install python 3.8` and follow one of the results.
## Installing Poetry
## Installing `Poetry`
`Poetry` is our package manager. It drastically simplifies dependency and environment management.
You can follow [this](https://python-poetry.org/docs/#installation) official guide to install it.
{% hint style='danger' %}
As there is no `concrete-compiler` package for Windows, only the dev dependencies can be installed. This requires poetry >= 1.2.
## Installing `make`
At the time of writing (January 2022), there is only an alpha version of poetry 1.2 that you can install. In the meantime we recommend following [this link to setup the docker environment](./docker.md) on Windows.
{% endhint %}
`make` is used to launch various commands such as formatting and testing.
## Installing make
On Linux, you can install `make` using the package manager of your distribution.
The dev tools use `make` to launch the various commands.
On Linux you can install `make` from your distribution's preferred package manager.
On Mac OS you can install a more recent version of `make` via brew:
On macOS, you can install `gmake` via brew:
```shell
# check for gmake
which gmake
# If you don't have it, it will error out, install gmake
brew install make
# recheck, now you should have gmake
which gmake
```
It is possible to install `gmake` as `make`; check this [StackOverflow post](https://stackoverflow.com/questions/38901894/how-can-i-install-a-newer-version-of-make-on-mac-os) for more info.
On Windows check [this GitHub gist](https://gist.github.com/evanwill/0207876c3243bbb6863e65ec5dc3f058#make).
{% hint style='tip' %}
In the following sections, be sure to use the proper `make` tool for your system: `make`, `gmake`, or other.
{% hint style="info" %}
In the following sections, be sure to use the proper `make` tool for your system (i.e., `make`, `gmake`, etc).
{% endhint %}
## Cloning repository
## Cloning the repository
Now, it's time to get the source code of **concrete-numpy**.
Now, it's time to get the source code of **Concrete Numpy**.
Clone the code repository using the link for your favourite communication protocol (ssh or https).
Clone the git repository from GitHub using the protocol of your choice (ssh or https).
## Setting up environment on your host OS
## Setting up the environment
We are going to make use of virtual environments. This helps to keep the project isolated from other `Python` projects in the system. The following commands will create a new virtual environment under the project directory and install dependencies to it.
Virtual environments are utilized to keep the project isolated from other `Python` projects in the system.
{% hint style='danger' %}
The following command will not work on Windows if you don't have poetry >= 1.2. As poetry 1.2 is still in alpha we recommend following [this link to setup the docker environment](./docker.md) instead.
{% endhint %}
To create a new virtual environment and install dependencies, use the command:
```shell
cd concrete-numpy
make setup_env
```
## Activating the environment
Finally, all we need to do is to activate the newly created environment using the following command.
### macOS or Linux
To activate the newly created environment, use:
```shell
source .venv/bin/activate
```
### Windows
## Syncing the environment
From time to time, new dependencies will be added to the project and old ones will be removed.
The command below will make sure the project has the proper environment, so run it regularly.
```shell
source .venv/Scripts/activate
make sync_env
```
## Setting up environment on docker
## Troubleshooting
The docker automatically creates and sources a venv in ~/dev_venv/
### In native setups.
The venv persists thanks to volumes. We also create a volume for ~/.cache to speed up later reinstallations. You can check which docker volumes exist with:
```shell
docker volume ls
```
You can still run all `make` commands inside the docker (to update the venv, for example). Be mindful of the current venv being used (the name in parentheses at the beginning of your command prompt).
```shell
# Here we have dev_venv sourced
(dev_venv) dev_user@8e299b32283c:/src$ make setup_env
```
## Leaving the environment
After your work is done, you can simply run the following command to leave the environment.
If you are having issues in a native setup, you can try to re-create your environment like this:
```shell
deactivate
```
## Syncing environment with the latest changes
From time to time, new dependencies will be added to project or the old ones will be removed. The command below will make sure the project has the proper environment. So run it regularly!
```shell
make sync_env
```
## Troubleshooting your environment
### In your OS
If you are having issues, consider using the dev docker exclusively (unless you are working on OS specific bug fixes or features).
Here are the steps you can take on your OS to try and fix issues:
```shell
# Try to install the env normally
make setup_env
# If you are still having issues, sync the environment
make sync_env
# If you are still having issues on your OS delete the venv:
rm -rf .venv
# And re-run the env setup
make setup_env
source .venv/bin/activate
```
At this point you should consider using docker as nobody will have the exact same setup as you, unless you need to develop on your OS directly, in which case you can ask us for help but may not get a solution right away.
If the problem persists, you should consider using Docker. If you are working on a platform specific feature and Docker is not an option, you should create an issue so that we can take a look at your problem.
### In docker
### In docker setups.
Here are the steps you can take in your docker to try and fix issues:
If you are having issues in a docker setup, you can try to re-build the docker image:
```shell
# Try to install the env normally
make setup_env
# If you are still having issues, sync the environment
make sync_env
# If you are still having issues in docker delete the venv:
rm -rf ~/dev_venv/*
# Disconnect from the docker
exit
# And relaunch, the venv will be reinstalled
make docker_start
# If you are still out of luck, force a rebuild which will also delete the volumes
make docker_rebuild
# And start the docker which will reinstall the venv
make docker_start
```
If the problem persists at this point, you should consider asking for help. We're here and ready to assist!
If the problem persists, you should contact us for help.

View File

@@ -1,9 +1,17 @@
# Creating A Release On GitHub
# Release process
## Release Candidate cycle
## Release candidate cycle
Before settling for a final release, we go through a Release Candidate (RC) cycle. The idea is that once the code base and documentations look ready for a release you create an RC Release by opening an issue with the release template [here](https://github.com/zama-ai/concrete-numpy-internal/issues/new?assignees=&labels=&template=release.md), starting with version `vX.Y.Zrc1` and then with versions `vX.Y.Zrc2`, `vX.Y.Zrc3`...
Throughout the quarter, many release candidates are released to a private package repository. At the end of the quarter, we take the latest release candidate and release it on PyPI without the `rcX` tag.
## Proper release
## Release flow
Once the last RC is deemed ready, open an issue with the release template using the last RC version from which you remove the `rc?` part (i.e. `v12.67.19` if your last RC version was `v12.67.19-rc4`) on [github](https://github.com/zama-ai/concrete-numpy-internal/issues/new?assignees=&labels=&template=release.md).
* Check out the commit that you want to include in the release (this commit and everything before it will be in the release)
* Run `make release`
* Wait for CI to complete
* Check out the `chore/version` branch
* Run `VERSION=a.b.c-rcX make set_version` with the appropriate version
* Push the branch to origin
* Create a PR to merge it into `main`
* Wait for CI to finish and get approval in the meantime
* Merge the version update into `main`

View File

@@ -2,31 +2,25 @@
## Terminology
In this section we will go over some terms that we use throughout the project.
Some terms used throughout the project include:
- intermediate representation
- a data structure to represent a computation
- basically a computation graph in which nodes are either inputs, constants, or operations on other nodes
- tracing
- it is the technique to take a python function from a user and generate intermediate representation corresponding to it in a painless way for the user
- bounds
- before intermediate representation is converted to MLIR, we need to know which node will output which type (e.g., uint3 vs uint5)
- there are several ways to do this but the simplest one is to evaluate the intermediate representation with some combinations of inputs and remember the maximum and the minimum values for each node, which is what we call bounds, and bounds can be used to determine the appropriate type for each node
- circuit
- it is the result of compilation
- it is made of the computation graph and the compiler engine
- it has methods for printing, visualizing, and evaluating
* computation graph - a data structure to represent a computation. This is basically a directed acyclic graph in which nodes are either inputs, constants or operations on other nodes.
* tracing - the technique that takes a Python function from the user and generates the corresponding computation graph in an easy to read format.
* bounds - before a computation graph is converted to MLIR, we need to know which node will output which type (e.g., uint3 vs uint5). The simplest way to do this is to evaluate the computation graph with different inputs and remember the minimum and maximum values for each node, which is what we call bounds, and to use those bounds to determine the appropriate type for each node.
* circuit - the result of compilation. A circuit is made of the client and server components and has methods for everything from printing and drawing to evaluation.
## Module structure
In this section, we will discuss the module structure of **concrete-numpy** briefly. You are encouraged to check individual `.py` files to learn more!
In this section, we will discuss the module structure of **Concrete Numpy** briefly. You are encouraged to check individual `.py` files to learn more.
- concrete
- numpy
- dtypes: data type specifications
- values: value specifications (i.e., data type + shape + encryption status)
- representation: representation of computation
- tracing: tracing of python functions
- extensions: custom functionality which is not available in numpy (e.g., conv2d)
- mlir: mlir conversion
- compilation: compilation from python functions to circuits
* concrete
  * numpy
    * dtypes - data type specifications
    * values - value specifications (i.e., data type + shape + encryption status)
    * representation - representation of computation
    * tracing - tracing of Python functions
    * extensions - custom functionality which is not available in NumPy (e.g., direct table lookups)
    * mlir - MLIR conversion
    * compilation - compilation from a Python function to a circuit, client/server architecture
  * onnx
    * convolution - custom convolution operations that follow the behavior of ONNX