mirror of
https://github.com/zama-ai/concrete.git
synced 2026-01-13 23:08:14 -05:00
764 lines
173 KiB
Plaintext
764 lines
173 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "_FTzVxUkjQno"
|
|
},
|
|
"source": [
|
|
"# SHA-256 Implementation Using Concrete\n",
|
|
"\n",
|
|
"In this tutorial, we will explore the implementation of SHA-256, a widely used hashing algorithm, using concrete-python. Details about the algorithm can be found [here](https://en.wikipedia.org/wiki/SHA-2).\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/"
|
|
},
|
|
"id": "zXozpJvmcBH1",
|
|
"outputId": "79dfc00b-10cc-4ffd-d4b9-a10f18d8d01e"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
|
|
"Requirement already satisfied: concrete-python in /usr/local/lib/python3.10/dist-packages (1.0.0)\n",
|
|
"Requirement already satisfied: numpy>=1.23 in /usr/local/lib/python3.10/dist-packages (from concrete-python) (1.24.3)\n",
|
|
"Requirement already satisfied: scipy>=1.10 in /usr/local/lib/python3.10/dist-packages (from concrete-python) (1.10.1)\n",
|
|
"Requirement already satisfied: torch>=1.13 in /usr/local/lib/python3.10/dist-packages (from concrete-python) (2.0.0+cu118)\n",
|
|
"Requirement already satisfied: networkx>=2.6 in /usr/local/lib/python3.10/dist-packages (from concrete-python) (3.1)\n",
|
|
"Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch>=1.13->concrete-python) (4.5.0)\n",
|
|
"Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.13->concrete-python) (2.0.0)\n",
|
|
"Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.13->concrete-python) (1.11.1)\n",
|
|
"Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.13->concrete-python) (3.1.2)\n",
|
|
"Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch>=1.13->concrete-python) (3.12.0)\n",
|
|
"Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.13->concrete-python) (3.25.2)\n",
|
|
"Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.13->concrete-python) (16.0.2)\n",
|
|
"Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.13->concrete-python) (2.1.2)\n",
|
|
"Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.13->concrete-python) (1.3.0)\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# Uncomment this line to install dependency\n",
|
|
"# ! pip install concrete-python\n",
|
|
"\n",
|
|
"# Required libraries\n",
|
|
"from concrete import fhe\n",
|
|
"import numpy as np"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "oCfjYazikbm_"
|
|
},
|
|
"source": [
|
|
"## Data Representation\n",
|
|
"As mentioned in the wiki page, all variables are $32$-bit unsigned integers. Additions should be calculated modulo $2^{32}$.\n",
|
|
"\n",
|
|
"While addition of 32-bit numbers are possible in the library, any other operations such modulizing, rotations, and bitwise operations are currently not possible. These operations require a lookup table with 32-bit inputs, but as of writing this tutorial, concrete-python supports up to 16-bit lookup tables. Higher precision lookup tables is still a research challenge in the homomorphic world and such a table would be dificult to compile and store at this moment.\n",
|
|
"\n",
|
|
"Thus, we need to break all the variables to **chunks** and work at the chunk level. Throughtout the code, *WIDTH* refers to the bitwidth of a chunk, and *NUM_CHUNKS* shows the number of chunks we need to represent a 32-bit data. These parameters are set at the begining of the code. We vary these parameters to see the impact of the *WIDTH* on the performance of the compiler and the circuit.\n",
|
|
"\n",
|
|
"\n",
|
|
"\n",
|
|
"Figure 1: Shows a break down of 32 bit of data into 4 chunks of 8 bit. This is not the only way to chunk the input."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"metadata": {
|
|
"id": "yaz8cNzjQ1UW"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Bitwidth of each chunk and number of chunks in each 32-bit number.\n",
|
|
"WIDTH, NUM_CHUNKS= 4, 8\n",
|
|
"\n",
|
|
"## Some other valid parameter sets\n",
|
|
"# WIDTH, NUM_CHUNKS= 8, 4\n",
|
|
"# WIDTH, NUM_CHUNKS= 2, 16\n",
|
|
"\n",
|
|
"assert (WIDTH * NUM_CHUNKS == 32)\n",
|
|
"\n",
|
|
"def break_down_data(data, data_size):\n",
|
|
" all_chunks = [\n",
|
|
" [ (x >> i*WIDTH)%(2**WIDTH) for i in range(data_size//WIDTH)[::-1] ]\n",
|
|
" for x in data\n",
|
|
" ]\n",
|
|
" return all_chunks\n",
|
|
"\n",
|
|
"def reshape_data(data):\n",
|
|
" return np.array(data).reshape(-1, NUM_CHUNKS)\n",
|
|
"\n",
|
|
"def chunks_to_uint32(chunks):\n",
|
|
" return int(sum([2**((NUM_CHUNKS-1-i)*WIDTH)*x for i, x in enumerate(chunks)]))\n",
|
|
"\n",
|
|
"def chunks_to_hexarray(chunks):\n",
|
|
" hexes = [hex(chunks_to_uint32(word))[2:] for word in chunks]\n",
|
|
" hexes = ['0'*(8-len(y))+y for y in hexes] #Appending leadning zero to the ones that are less than 8 characters TODO: write better\n",
|
|
" result = \"\".join(hexes)\n",
|
|
" return result\n"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "u7pA-B3As9u4"
|
|
},
|
|
"source": [
|
|
"### Creating Chunks\n",
|
|
"There are two list of constants in the algorithm, K and H. Before executing the algorithm, we need to break them to chunks using `split_to_chunks` function.\n",
|
|
"\n",
|
|
"\n",
|
|
"The input of the algorithm is arbitrary bytes. We might need to break each byte to smaller chunks based on the value of *WIDTH* after padding the data as per instructed by the algorithm. `break_down_data` function returns a numpy array of shape (48,NUM_CHUNKS)"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "b8rlvVf42CIa"
|
|
},
|
|
"source": [
|
|
"## Operations\n",
|
|
"Now that the data is stores as chunks, we must modify all operations we need to work at the level of chunks. In this section we explain how we implemented the required operations. The main three category of operations that we need to implement SHA-256 are:\n",
|
|
"\n",
|
|
"* Bitwise operations (AND, OR, XOR, NEGATE)\n",
|
|
"* Shifts and Rotations\n",
|
|
"* Modular Addition "
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "zlM1RN-NnjDn"
|
|
},
|
|
"source": [
|
|
"### Bitwise Operations\n",
|
|
"Bitwise operations are easily implemented in concrete-numpy. A bitwise operation over a 32-bit number is equivalent to the same operation over the chunks."
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "CxCwJOao2KCt"
|
|
},
|
|
"source": [
|
|
"### Rotation and Shifts\n",
|
|
"To understand how rotations work, consider a small example with 4 chunks of width 4, representing a 16-bit number, as shown in Figure 1. Most significant bits are located at index 0. So a 16-bit number will be `[[chunk_0], [chunk_1], [chunk_2], [chunk_3]]` with WIDTH=4. There are two possible scenario for rotations:\n",
|
|
"\n",
|
|
"1. Any rotation by a multiple of WIDTH (in this case, 4) will result in rotating the array of chunks. For example, right rotate(4) will be `[[chunk_3], [chunk_0], [chunk_1], [chunk_2]]`.\n",
|
|
"\n",
|
|
"2. For rotations less than WIDTH, for example `y`, we break every chunk into two parts of bitlength, `WIDTH-y` and `y`. We need to add the low `y`-bits of each chunk with the high `WIDTH-y` bits of the next chunk. Figure 2 illustrated this process. We leverage two lookup tables to extract the two segments of each chunk.\n",
|
|
"<!-- `table_extract_low_bits_and_raise` extracts the low `y`-bits of each chunk and places them in the MSB of the chunk. `table_extract_high_bits` extracts WIDTH-y most significant bits of each chunk. -->\n",
|
|
"\n",
|
|
"3. Rotations by other amounts are broken into the two steps described above.\n",
|
|
"\n",
|
|
""
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "575ogsJhFDIo"
|
|
},
|
|
"source": [
|
|
"### Shift\n",
|
|
"The shift operation is the same as rotation, but we prepend the encrypted scalar zero when we move the bits to the right."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"metadata": {
|
|
"id": "TRAFRZime-Jv"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"def right_rotate_list_of_chunks(list_to_rotate, amount):\n",
|
|
" return np.concatenate((\n",
|
|
" list_to_rotate[-amount:],\n",
|
|
" list_to_rotate[:-amount]\n",
|
|
" ))\n",
|
|
" \n",
|
|
"def right_shift_list_of_chunks(list_to_rotate, amount):\n",
|
|
" return np.concatenate((\n",
|
|
" [0] * list_to_rotate[-amount:].shape[0] ,\n",
|
|
" list_to_rotate[:- amount]\n",
|
|
" ))\n",
|
|
" \n",
|
|
"def left_shift_list_of_chunks(list_to_rotate, amount):\n",
|
|
" return np.concatenate((\n",
|
|
" list_to_rotate[amount:] ,\n",
|
|
" [0] * list_to_rotate[:amount].shape[0]\n",
|
|
" ))\n",
|
|
"\n",
|
|
"def rotate_less_than_width(chunks, shift):\n",
|
|
" raised_low_bits = fhe.univariate(lambda x: (x % 2**shift) << (WIDTH-shift))(chunks)\n",
|
|
" shifted_raised_low_bits = right_rotate_list_of_chunks(raised_low_bits, 1)\n",
|
|
"\n",
|
|
" high_bits = chunks >> shift\n",
|
|
" return shifted_raised_low_bits + high_bits\n",
|
|
"\n",
|
|
"def right_rotate(chunks, rotate_amount):\n",
|
|
" x = rotate_amount // WIDTH\n",
|
|
" y = rotate_amount % WIDTH\n",
|
|
" if x != 0: \n",
|
|
" rotated_chunks = right_rotate_list_of_chunks(chunks, x)\n",
|
|
" else:\n",
|
|
" rotated_chunks = chunks\n",
|
|
" if y != 0:\n",
|
|
" rotated = rotate_less_than_width(rotated_chunks, y)\n",
|
|
" else:\n",
|
|
" rotated = rotated_chunks\n",
|
|
"\n",
|
|
" return rotated\n",
|
|
"\n",
|
|
"def right_shift(chunks, shift_amount):\n",
|
|
" x = shift_amount // WIDTH\n",
|
|
" y = shift_amount % WIDTH\n",
|
|
" if x != 0:\n",
|
|
" shifted_chunks = right_shift_list_of_chunks(chunks, x)\n",
|
|
" else:\n",
|
|
" shifted_chunks = chunks\n",
|
|
" if y != 0:\n",
|
|
" # shift within chunks\n",
|
|
" raised_low_bits = fhe.univariate(lambda x: (x % 2**y) << (WIDTH-y))(shifted_chunks)\n",
|
|
" shifted_raised_low_bits = right_shift_list_of_chunks(raised_low_bits, 1)\n",
|
|
" high_bits = shifted_chunks >> y\n",
|
|
" result = shifted_raised_low_bits + high_bits\n",
|
|
" else:\n",
|
|
" result = shifted_chunks\n",
|
|
" return result"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "SKg8mKFOPXSV"
|
|
},
|
|
"source": [
|
|
"### Modular 32-bit Addition\n",
|
|
"Modular 32-bit addition is frequently used in SHA256. While Concrete supports additions of 32-bit numbers, modulizing the result requires a lookup table which is too large for Concrete. Hence, the addition must be done over chunks.\n",
|
|
"\n",
|
|
"Below is the function to add two 32-bit numbers mod $2^{32}$."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"metadata": {
|
|
"id": "EJEPvp2wQms9"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"def add_two_32_bits(a,b):\n",
|
|
" added = np.sum([a,b], axis=0)\n",
|
|
"\n",
|
|
" for i in range(NUM_CHUNKS):\n",
|
|
" results = added % (2 ** WIDTH)\n",
|
|
" if i < NUM_CHUNKS-1:\n",
|
|
" carries = added >> WIDTH\n",
|
|
" added = left_shift_list_of_chunks(carries, 1) + results\n",
|
|
"\n",
|
|
" return results\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"metadata": {
|
|
"id": "Uo6o_QMQn_fw"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Testing the addition function, adding four 32-bit numbers\n",
|
|
"test_inputs = np.random.randint(0,2**32, size=(2,))\n",
|
|
"input_chunks = break_down_data(test_inputs, 32)\n",
|
|
"\n",
|
|
"assert(chunks_to_uint32(add_two_32_bits(input_chunks[0], input_chunks[1]))== np.sum(test_inputs) % (2**32))"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "IOr8DTRJRYTl"
|
|
},
|
|
"source": [
|
|
"Adding two 4-bit numbers results in a 5-bit number. We then use two lookup tables:\n",
|
|
"\n",
|
|
"* `extract_carry` which extracts the carry of adding two chunks\n",
|
|
"* `extract_result` which extracts the 4-bit chunk which results from adding two chunks (without the carry)\n",
|
|
"\n",
|
|
"Each carry must now be added to the chunk next chunk and this process is repeated for as many chunks as there are. The figure below illustrates this process.\n",
|
|
"\n",
|
|
"\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "4-dQk0wbtPOe"
|
|
},
|
|
"source": [
|
|
"The benefit of this addition algorithm is that it can be extended to the case where more two 32-bit numbers are added. The only difference is that the carry from the first iteration of the loop can be larger than 1. Specifically, by adding $k$ 4-bit numbers, the carry can be as big as $\\log_2 k$. For correctness, $\\log_2 k$ must be less than 4 or $k<16$.\n",
|
|
"\n",
|
|
"In our implementation of SHA-256, we only have two input and four input additions, so we only implement those.\n",
|
|
"\n",
|
|
"For four input addition, he first iteration of the loop, we use a different lookup table that extract a 2-bit carry and rest of the chunk. The rest of the algorithm does not change."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"metadata": {
|
|
"id": "obO8wHRbXHfj"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"def add_four_32_bits(a,b,c,d):\n",
|
|
" added = np.sum([a,b,c,d], axis=0)\n",
|
|
" \n",
|
|
" # First iteration of the loop is seperated\n",
|
|
" carries = added >> WIDTH\n",
|
|
" results = added % (2**WIDTH)\n",
|
|
" shifted_carries = left_shift_list_of_chunks(carries, 1)\n",
|
|
" added = shifted_carries + results\n",
|
|
"\n",
|
|
" for i in range(1,NUM_CHUNKS):\n",
|
|
" results = added % (2**WIDTH)\n",
|
|
" \n",
|
|
" # In the last iteration, carries need not be calculated\n",
|
|
" if i != NUM_CHUNKS-1: \n",
|
|
" carries = added >> WIDTH\n",
|
|
" shifted_carries = left_shift_list_of_chunks(carries, 1)\n",
|
|
" added = shifted_carries + results\n",
|
|
"\n",
|
|
" return results"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"metadata": {
|
|
"id": "zcwnDdPFdqE1"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Testing the addition function, adding four 32-bit numbers\n",
|
|
"\n",
|
|
"for _ in range(1000):\n",
|
|
" test_inputs = np.random.randint(0,2**32, size=(4,))\n",
|
|
" input_chunks = break_down_data(test_inputs, 32)\n",
|
|
"\n",
|
|
" assert(chunks_to_uint32(add_four_32_bits(input_chunks[0], input_chunks[1], input_chunks[2], input_chunks[3]))== np.sum(test_inputs) % (2**32))"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "1g6eEdhGoJl9"
|
|
},
|
|
"source": [
|
|
"## Operations for SHA-256\n",
|
|
"\n",
|
|
"Using the basic operations from the previous section, we can now implement all the necessary functions for SHA256"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"metadata": {
|
|
"id": "uo7HfO1DpVFK"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Used in the expansion\n",
|
|
"\n",
|
|
"def s0(w):\n",
|
|
" return right_rotate(w, 7) ^ right_rotate(w, 18) ^ right_shift(w, 3)\n",
|
|
"\n",
|
|
"def s1(w):\n",
|
|
" return right_rotate(w, 17) ^ right_rotate(w, 19) ^ right_shift(w, 10)\n",
|
|
"\n",
|
|
"# Used in main loop\n",
|
|
"\n",
|
|
"def S0(a_word):\n",
|
|
" return right_rotate(a_word, 2) ^ right_rotate(a_word, 13) ^ right_rotate(a_word, 22)\n",
|
|
"\n",
|
|
"def S1(e_word):\n",
|
|
" return right_rotate(e_word, 6) ^ right_rotate(e_word, 11) ^ right_rotate(e_word, 25)\n",
|
|
"\n",
|
|
"def Ch(e_word, f_word, g_word):\n",
|
|
" return (e_word & f_word) ^ ((2**WIDTH-1 - e_word) & g_word)\n",
|
|
"\n",
|
|
"def Maj(a_word, b_word, c_word):\n",
|
|
" return (a_word & b_word) ^ (a_word & c_word) ^ (b_word & c_word)\n",
|
|
"\n",
|
|
"def main_loop(args, w_i_plus_k_i):\n",
|
|
" a, b, c, d, e, f, g, h = args\n",
|
|
" temp1 = add_four_32_bits(h,S1(e),Ch(e, f, g), w_i_plus_k_i)\n",
|
|
" temp2 = add_two_32_bits(S0(a), Maj(a, b, c))\n",
|
|
" new_a = add_two_32_bits(temp1, temp2)\n",
|
|
" new_e = add_two_32_bits(d, temp1)\n",
|
|
" return np.array([new_a, a, b, c, new_e, e, f, g])"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "biM997KmvwUL"
|
|
},
|
|
"source": [
|
|
"We also need a function to pad the input as the first step of SHA256."
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "fZ-3sEH5vopA"
|
|
},
|
|
"source": [
|
|
"Moreover, we need a function to parse the input given to the program. The input is given as bytes, but the chunks might be smaller. We extract smaller chunks from bytes using lookup tables."
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "L4leg-z_skkU"
|
|
},
|
|
"source": [
|
|
"## Bringing it all together\n",
|
|
"Using all the components from the above, we can implement SHA256 as shown below."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"metadata": {
|
|
"id": "tmSfvdpyrwUx"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"K = [\n",
|
|
" 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,\n",
|
|
" 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,\n",
|
|
" 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,\n",
|
|
" 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,\n",
|
|
" 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,\n",
|
|
" 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,\n",
|
|
" 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,\n",
|
|
" 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2\n",
|
|
"]\n",
|
|
"H = [0x6a09e667,0xbb67ae85,0x3c6ef372,0xa54ff53a,0x510e527f,0x9b05688c,0x1f83d9ab,0x5be0cd19]"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"metadata": {
|
|
"id": "NHGCiC-Gk_tw"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"k_in = reshape_data(break_down_data(K, 32))\n",
|
|
"h_in = reshape_data(break_down_data(H, 32))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"metadata": {
|
|
"id": "yTiMkmBsHmKy"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"def uint64_to_bin(uint64 : int):\n",
|
|
" return (\"\".join([str(uint64 >> i & 1) for i in range(63, -1, -1)]))\n",
|
|
"\n",
|
|
"def sha256_preprocess(text):\n",
|
|
" \"\"\"\n",
|
|
" Takes a message of arbitrary length and returns a message\n",
|
|
" of length that is a multiple of 512 bits, with the original message padded\n",
|
|
" with a 1 bit, followed by 0 bits, followed by the original message length\n",
|
|
" in bits\n",
|
|
" \"\"\"\n",
|
|
" data = text\n",
|
|
" # convert to uint4 and group into 32 bit words (8 uint4s)\n",
|
|
" # #log (\"data is:\", data, data.shape)\n",
|
|
" message_len = data.shape[0] * 8 # denoted as 'l' in spec\n",
|
|
" # find padding length 'k'\n",
|
|
" k = (((448 - 1 - message_len) % 512) + 512) % 512 \n",
|
|
" # #log (\"k is:\", k)\n",
|
|
" zero_pad_width_in_bits = k\n",
|
|
" padstring = \"1\" + \"0\" * zero_pad_width_in_bits + str(uint64_to_bin(message_len))\n",
|
|
" #log (\"padstring size:\", len(padstring))\n",
|
|
" #log (\"padstring is:\", padstring)\n",
|
|
"\n",
|
|
" total_size = len(padstring) + message_len\n",
|
|
" #log (\"total size:\", total_size)\n",
|
|
" assert total_size % 512 == 0\n",
|
|
"\n",
|
|
" pad = np.array([int(padstring[i:i+8], 2) for i in range(0, len(padstring), 8)], dtype=np.uint8)\n",
|
|
" padded = np.concatenate((data, pad))\n",
|
|
" words = break_down_data(padded, 8)\n",
|
|
" chunks = reshape_data(words)\n",
|
|
" return chunks"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"metadata": {
|
|
"id": "3ox6Zs-ysoLr"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Number of rounds must be 64 to have correct SHA256\n",
|
|
"# If looking to get a faster run, reduce the number of rounds (but it will not be correct)\n",
|
|
"\n",
|
|
"def sha256(data, number_of_rounds=64):\n",
|
|
" h_chunks = fhe.zeros((len(h_in), NUM_CHUNKS))\n",
|
|
" k_chunks = fhe.zeros((len(k_in), NUM_CHUNKS))\n",
|
|
" h_chunks += h_in\n",
|
|
" k_chunks += k_in\n",
|
|
"\n",
|
|
" num_of_iters = data.shape[0]*32//512\n",
|
|
" for chunk_iter in range(0, num_of_iters):\n",
|
|
" \n",
|
|
" # Initializing the variables\n",
|
|
" chunk = data[chunk_iter*16:(chunk_iter+1)*16]\n",
|
|
" w = [None for _ in range(number_of_rounds)]\n",
|
|
" # Starting the main loop and expansion\n",
|
|
" working_vars = h_chunks\n",
|
|
" for j in range(0, number_of_rounds):\n",
|
|
" if j<16:\n",
|
|
" w[j] = chunk[j]\n",
|
|
" else:\n",
|
|
" w[j] = add_four_32_bits(w[j-16], s0(w[j-15]), w[j-7], s1(w[j-2]))\n",
|
|
" w_i_k_i = add_two_32_bits(w[j], k_chunks[j])\n",
|
|
" working_vars = main_loop(working_vars,w_i_k_i)\n",
|
|
" \n",
|
|
" # Accumulating the results\n",
|
|
" for j in range(8):\n",
|
|
" h_chunks[j] = add_two_32_bits(h_chunks[j], working_vars[j])\n",
|
|
" return h_chunks"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "w89rhSOh4In2"
|
|
},
|
|
"source": [
|
|
"We can test the correctness of this function as below (this is not in encrypted form yet)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 13,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/"
|
|
},
|
|
"id": "006LZp7c0yBA",
|
|
"outputId": "31588127-23e9-4b49-e481-d14842e336e7"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
" SHA256: a412c46b0be134c593b0ad520d4a4c4e1d8aecca799be0be2c4d233ccf455cb7\n",
|
|
"Our SHA256: a412c46b0be134c593b0ad520d4a4c4e1d8aecca799be0be2c4d233ccf455cb7\n",
|
|
"Match: True\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"import hashlib\n",
|
|
"text = (\n",
|
|
" b\"Lorem ipsum dolor sit amet, consectetur adipiscing elit. \"\n",
|
|
" b\"Curabitur bibendum, urna eu bibendum egestas, neque augue eleifend odio, et sagittis viverra. and more than 150\"\n",
|
|
")\n",
|
|
"\n",
|
|
"result = sha256(sha256_preprocess(np.frombuffer(text, dtype=np.uint8)))\n",
|
|
"\n",
|
|
"m = hashlib.sha256()\n",
|
|
"m.update(text)\n",
|
|
"\n",
|
|
"print(\" SHA256:\", m.hexdigest())\n",
|
|
"print(\"Our SHA256:\", chunks_to_hexarray(result))\n",
|
|
"print(\"Match:\", chunks_to_hexarray(result)==m.hexdigest())"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"metadata": {
|
|
"id": "1uHN9GXgla_z"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"class HomomorphicSHA:\n",
|
|
" circuit: fhe.Circuit\n",
|
|
" def __init__(self, input_size_in_bytes=150, number_of_rounds=64) -> None:\n",
|
|
" self.input_size_in_bytes=input_size_in_bytes\n",
|
|
" assert 0 <= number_of_rounds <= 64, \"Number of rounds must be betweem zero and 64\"\n",
|
|
" self.number_of_rounds=number_of_rounds\n",
|
|
" inputset=[\n",
|
|
" sha256_preprocess(np.random.randint(0, 2**8, size=(input_size_in_bytes,)))\n",
|
|
" for _ in range(100)\n",
|
|
" ]\n",
|
|
" # Compilation of the circuit should take a few minutes\n",
|
|
" compiler = fhe.Compiler(lambda data: sha256(data, self.number_of_rounds), {\"data\": \"encrypted\"})\n",
|
|
" self.circuit = compiler.compile(\n",
|
|
" inputset=inputset,\n",
|
|
" configuration=fhe.Configuration(\n",
|
|
" enable_unsafe_features=True,\n",
|
|
" use_insecure_key_cache=True,\n",
|
|
" insecure_key_cache_location=\".keys\",\n",
|
|
" dataflow_parallelize=True,\n",
|
|
" ),\n",
|
|
" verbose=False,\n",
|
|
" )\n",
|
|
" \n",
|
|
" def getSHA(self, data):\n",
|
|
" assert len(data) == self.input_size_in_bytes, f\"Input size is not correct, should be {self.input_size_in_bytes} bytes/characters\"\n",
|
|
" return self.circuit.encrypt_run_decrypt(sha256_preprocess(data))\n",
|
|
"\n",
|
|
" def getPlainSHA(self, data):\n",
|
|
" return sha256(sha256_preprocess(data), self.number_of_rounds)"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "SpxY6dScee-k"
|
|
},
|
|
"source": [
|
|
"Now we are ready to compile the circuit! Note that **the compilation will take a long time**, so if you are looking to get a test run, you can set the number of rounds to something smaller than 64."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"metadata": {
|
|
"id": "P0cMOZUGee-k"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Warning: This will compile the circuit and will take a few minutes\n",
|
|
"\n",
|
|
"input_size_in_bytes = 150\n",
|
|
"running_small_example=True\n",
|
|
"\n",
|
|
"if running_small_example:\n",
|
|
" number_of_rounds = 2\n",
|
|
" sha = HomomorphicSHA(input_size_in_bytes, number_of_rounds)\n",
|
|
"else:\n",
|
|
" sha = HomomorphicSHA(input_size_in_bytes)"
|
|
]
|
|
},
|
|
{
|
|
"attachments": {},
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"id": "zz1rd7VWee-k"
|
|
},
|
|
"source": [
|
|
"And after compilation, we are ready to run the circuit. Remember that the input size has to match what you gave in the previous cell. Our function will check this first to make sure the input is of the correct size. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 16,
|
|
"metadata": {
|
|
"colab": {
|
|
"base_uri": "https://localhost:8080/"
|
|
},
|
|
"id": "EkF0UxTcv_cQ",
|
|
"outputId": "c4e2c710-02bc-40e2-a921-4a29ac88380b"
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"This cell is disabled. It can takes hours. If you want to run this cell, set accept_a_very_long_run=True\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# WARNING: This takes a LONG time\n",
|
|
"accept_a_very_long_run = False\n",
|
|
"if not accept_a_very_long_run:\n",
|
|
" print(\"This cell is disabled. It can takes hours. If you want to run this cell, set accept_a_very_long_run=True\")\n",
|
|
"else:\n",
|
|
" text = (\n",
|
|
" b\"Lorem ipsum dolor sit amet, consectetur adipiscing elit. \"\n",
|
|
" b\"Curabitur bibendum, urna eu bibendum egestas, neque augue eleifend odio, et sagittis viverra.\"\n",
|
|
" )\n",
|
|
" input_bytes = np.frombuffer(text, dtype=np.uint8)\n",
|
|
" encrypted_evaluation = sha.getSHA(input_bytes)\n",
|
|
"\n",
|
|
" print(\"Encrypted Evaluation: \", chunks_to_hexarray(encrypted_evaluation))\n",
|
|
" print(\" Plain Evaluation: \", chunks_to_hexarray(sha.getPlainSHA(input_bytes)))"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"colab": {
|
|
"provenance": []
|
|
},
|
|
"kernelspec": {
|
|
"display_name": "Python 3.10.7 64-bit",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.10.7"
|
|
},
|
|
"vscode": {
|
|
"interpreter": {
|
|
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
|
|
}
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 0
|
|
}
|