From 20fe843a49db8582e93ea199060e12f49c3fb134 Mon Sep 17 00:00:00 2001
From: jfrery <jordan.frery@zama.ai>
Date: Fri, 10 Dec 2021 10:31:27 +0100
Subject: [PATCH] docs: update quantization.md

---
 docs/user/explanation/quantization.md | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)
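
The scale and zero-point formulas added to quantization.md below can be sanity-checked with a short Python sketch. The function names (`quantization_params`, `quantize`, `dequantize`) are hypothetical and not part of the Concrete Framework API. The quantize and dequantize steps follow the usual asymmetric scheme (`q = round(x / S) + Z`, clipped to `[0, 2^n - 1]`), which this patch does not spell out, so treat them as an assumption.

```python
def quantization_params(alpha: float, beta: float, n_bits: int = 7):
    """Return (S, Z) for the range [alpha, beta].

    S = (beta - alpha) / (2^n - 1) and Z = round(-alpha / S),
    as in the formulas added by this patch.
    """
    if beta <= alpha:
        raise ValueError("beta must be strictly greater than alpha")
    scale = (beta - alpha) / (2**n_bits - 1)
    zero_point = round(-alpha / scale)
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int, n_bits: int = 7) -> int:
    """Map a float to an unsigned integer (assumed scheme, not from the patch)."""
    q = round(x / scale) + zero_point
    # Clip so that out-of-range inputs still fit in n_bits unsigned bits.
    return max(0, min(2**n_bits - 1, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Approximate inverse of quantize."""
    return scale * (q - zero_point)
```

For the range `[-2, 6]` with 7 bits, this gives `S = 8 / 127` and `Z = 32`, and the round trip of an in-range value stays within `S / 2` of the original.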

diff --git a/docs/user/explanation/quantization.md b/docs/user/explanation/quantization.md
index 5800021da..ede5d863d 100644
--- a/docs/user/explanation/quantization.md
+++ b/docs/user/explanation/quantization.md
@@ -1,5 +1,5 @@
 ```{warning}
-FIXME(Jordan/Andrei): see if this is still appropriate, update etc; make link with use_quantization.md
+FIXME(Andrei): see if this is still appropriate, update etc; make link with use_quantization.md
 ```
 
 # Quantization
@@ -20,10 +20,29 @@ The basic idea of quantization is to take a range of values represented by a _la
 
 ## Quantization in practice
 
-To quantize a range of values on a smaller range of values, we first need to choose the data type that is going to be used. **ConcreteLib**, the library used in the **Concrete Framework**, is currently limited to 7 bits unsigned integers, so we'll use that for the example. Knowing that, for a value in the range `[min_range, max_range]`, we can compute the step of the quantization, which is `(max_range - min_range) / (2**n - 1)` where n is the number of bits, here 7, so in practice the quantization step is `step = (max_range - min_range) / 127`. This means the gap between consecutive representible values cannot be smaller than that `step` value which means there can be a substantial loss of precision. Every interval of length `step = (max_range - min_range) / 127` will be represented by a value in `[0..127]`.
+Let's first define some notation. Let $ [\alpha, \beta] $ be the range of the values to quantize, where $ \alpha $ is the minimum and $ \beta $ is the maximum.
 
-The IntelLabs distiller quantization documentation goes into a detailed explanation about the math to quantize values and how to keep computations consistent: [quantization algorithm documentation](https://intellabs.github.io/distiller/algo_quantization.html).
+To quantize a range of floating point values (in $ \mathbb{R} $) to unsigned integer values (in $ \mathbb{N} $), we first need to choose the data type that is going to be used. **ConcreteLib**, the library used in the **Concrete Framework**, is currently limited to 7-bit unsigned integers, so we'll use that for the example. Given values in the range $ [\alpha, \beta] $, we can compute the `scale` $ S $ of the quantization:
 
+$$ S =  \frac{\beta - \alpha}{2^n - 1} $$
+
+
+where $ n $ is the number of bits (here 7). In practice, the quantization scale is then $ S = \frac{\beta - \alpha}{127} $. The gap between consecutive representable values cannot be smaller than $ S $, so there can be a substantial loss of precision. Every interval of length $ S $ is represented by a single integer in the range $ [0..127] $.
+
+The other important parameter of this quantization scheme is the `zero point` $ Z $. It maps the floating point value 0 to a specific integer, which allows an asymmetric quantization whose results stay in the unsigned integer realm, $ \mathbb{N} $.
+
+$$ Z = \mathrm{round} \left(- \frac{\alpha}{S} \right) $$
+
+More mathematics is involved in how computations change when floating point values are replaced by integers in a fully connected or a convolutional layer. The IntelLabs distiller quantization documentation gives a [detailed explanation](https://intellabs.github.io/distiller/algo_quantization.html) of the maths used to quantize values and keep computations consistent.
+
+Regarding quantization and FHE compilation, it is important to understand the difference between two modes:
+
+1. the quantization is done before the compilation; notably, the quantization is completely controlled by the user, and can be done by any means, including by using third-party frameworks
+2. the quantization is done during the compilation (inside our framework), with much less control by the user.
+
+For the moment, only the second method is available in the Concrete Framework, but we plan to make the first method available in a future release, since it should give the user more freedom and better results.
+
+The use of quantization within the Concrete Framework is detailed [here](../howto/use_quantization.md).
 ## Resources
 
 - IntelLabs distiller explanation of quantization: [Distiller documentation](https://intellabs.github.io/distiller/algo_quantization.html)