Circom Circuits Library for Machine Learning
Disclaimer: This package is not affiliated with circom, circomlib, or iden3.
Description
- This repository contains a library of circuit templates.
- You can read more about the circom language on the circom documentation webpage.
Organisation
This repository contains the following folders:
- `circuits`: the circuit templates, implemented in the circom language.
- `models`: sample model notebooks (e.g. `models/mnist_poly.ipynb`).
- `test`: tests.
Polynomial activation:
Inspired by Ali, R. E., So, J., & Avestimehr, A. S. (2020), `circuits/Poly.circom` has been added as a template implementing `f(x) = x**2 + x` as an alternative activation layer to ReLU. The non-polynomial nature of ReLU results in a large number of constraints per layer. Replacing ReLU with the polynomial activation `f(n, x) = x**2 + n*x` drastically decreases the number of constraints, with a slight performance tradeoff. A parameter `n` is required when declaring the component to adjust for the scaling of floating-point weights and biases into integers. See below for more information.
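As a rough illustration in Python (the function names below are purely illustrative, not part of this library), the idea is simply to swap the non-polynomial ReLU for a quadratic polynomial:

```python
import numpy as np

def relu(x):
    # Standard ReLU: non-polynomial, so it is expensive to express as arithmetic constraints.
    return np.maximum(x, 0)

def poly_activation(x, n=1):
    # Polynomial activation f(n, x) = x**2 + n*x.
    # With n = 1 this is the f(x) = x**2 + x of Ali, So & Avestimehr (2020);
    # in the circuit, n absorbs the integer scaling of the weights and biases.
    return x**2 + n * x

x = np.linspace(-2.0, 2.0, 9)
print(relu(x))
print(poly_activation(x))
```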
Weights and biases scaling:
- Circom only accepts integers as signals, but TensorFlow weights and biases are floating-point numbers.
- In order to simulate a neural network in Circom, weights must be scaled up by `10**m` times. The larger `m` is, the higher the precision.
- Subsequently, biases (if any) must be scaled up by `10**(2*m)` times or even more to maintain the correct output of the network, as in the sketch below.
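A minimal sketch of this preprocessing step, assuming a plain NumPy workflow (the helper `scale_params` is hypothetical and not part of this repository):

```python
import numpy as np

def scale_params(weights, biases, m):
    # Weights are scaled by 10**m; biases by 10**(2*m) so that w*x + b stays
    # consistent when the circuit inputs are themselves scaled by 10**m.
    w_int = np.round(np.asarray(weights) * 10**m).astype(np.int64)
    b_int = np.round(np.asarray(biases) * 10**(2 * m)).astype(np.int64)
    return w_int, b_int

w_int, b_int = scale_params([[0.123, -0.456]], [0.789], m=6)
print(w_int)  # weights as integers, scaled by 10**6
print(b_int)  # biases as integers, scaled by 10**12
```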
An example is provided below.
Scaling example: mnist_poly
In `models/mnist_poly.ipynb`, a sample model of Conv2d-Poly-Dense layers was trained on the MNIST dataset. After training, the weights and biases must be properly scaled before being fed into the circuit:
- Pixel values range from 0 to 255. In order for the polynomial activation approximation to work, these input values were scaled down to 0.000-0.255 during model training, but the original integer values were scaled by `10**6` times as input to the circuit, i.e. an overall scaling of `10**9` times.
- Weights in the `Conv2d` layer were scaled by `10**9` times for higher precision. Subsequently, biases in the same layer must be scaled by `(10**9)*(10**9) = 10**18` times.
- The linear term in the polynomial activation layer also needs to be adjusted by `10**18` times in order to match the scaling of the quadratic term. Hence the activation was performed as `f(x) = x**2 + (10**18)*x`.
- Weights in the `Dense` layer were scaled by `10**9` times, again for precision.
- Biases in the `Dense` layer were omitted for simplicity, since the `ArgMax` layer is not affected by them. However, if the biases were to be included (for example as an intermediate layer in a deeper network), they would have to be scaled by `(10**9)**5 = 10**45` times to adjust correctly. The full scaling chain is recapped in the sketch below.
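For bookkeeping, the cumulative factors can be traced in a few lines of plain Python (illustrative only):

```python
# Cumulative scaling through the Conv2d-Poly-Dense pipeline of mnist_poly.
input_scale = 10**3 * 10**6                 # 0..255 -> 0.000..0.255 in training, then *10**6 => 10**9 overall
conv_w_scale = 10**9                        # Conv2d weights
conv_out_scale = input_scale * conv_w_scale # 10**18, so Conv2d biases need 10**18 too

poly_out_scale = conv_out_scale**2          # quadratic term: (10**18)**2 = 10**36
poly_n = conv_out_scale                     # linear term multiplied by n = 10**18 to match

dense_w_scale = 10**9
dense_out_scale = poly_out_scale * dense_w_scale   # 10**45, the scale Dense biases would need

print(conv_out_scale == 10**18, poly_n == 10**18, dense_out_scale == (10**9)**5)  # True True True
```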
We can easily see that a deeper network has to sacrifice precision, because Circom works over a finite field modulo a prime p of around 254 bits. Since log10(2**254) ≈ 76, we need to make sure the total scaling does not aggregate to exceed 10**76 (or even less) times. On average, each layer of a network with l layers should therefore be scaled by no more than about 10**(76//l) times.
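The available headroom can be checked with a quick calculation (plain Python):

```python
import math

p_bits = 254                              # circom's default field prime (BN254) is ~254 bits
budget = int(math.log10(2**p_bits))       # ~76 decimal digits of total headroom
layers = 5                                # example depth
per_layer_exponent = budget // layers
print(budget, per_layer_exponent)         # 76 15 -> at most ~10**15 of scaling per layer
```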
Circuits to be added:
- max/sum-pooling