{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Voice judgoor\n",
"\n",
"Here we showcase a full-end-to-end flow of:\n",
"1. training a model for a specific task (judging voices)\n",
"2. creating a proof of judgment\n",
"3. creating and deploying an evm verifier\n",
"4. verifying the proof of judgment using the verifier\n",
"\n",
"First we download a few voice related datasets from kaggle, which are all labelled using the same emotion and tone labelling standard.\n",
"\n",
"We have 8 emotions in both speaking and singing datasets: neutral, calm, happy, sad, angry, fear, disgust, surprise.\n",
"\n",
"To download the dataset make sure you have the kaggle cli installed in your local env `pip install kaggle`. Make sure you set up your `kaggle.json` file as detailed [here](https://www.kaggle.com/docs/api#getting-started-installation-&-authentication).\n",
"Then run the associated `voice_data.sh` data download script: `sh voice_data.sh`.\n",
"\n",
"Make sure you set the `VOICE_DATA_DIR` variables to point to the directory the `voice_data.sh` script has downloaded to. This script also accepts an argument to download to a specific directory: `sh voice_data.sh /path/to/voice/data`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"import os\n",
"# os.environ[\"VOICE_DATA_DIR\"] = \".\"\n",
"\n",
"voice_data_dir = os.environ.get('VOICE_DATA_DIR')\n",
"\n",
"# if is none set to \"\"\n",
"if voice_data_dir is None:\n",
" voice_data_dir = \"\"\n",
"\n",
"print(\"voice_data_dir: \", voice_data_dir)\n"
]
},
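{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, you can run the dataset download from inside the notebook with the cell below. This is only a convenience sketch: it assumes `voice_data.sh` sits in the current working directory and that your Kaggle credentials (`kaggle.json`) are already configured."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional: download the datasets from inside the notebook.\n",
"# Sketch only -- assumes voice_data.sh is in the current working directory\n",
"# and that kaggle credentials are configured.\n",
"import subprocess\n",
"\n",
"if not os.path.isdir(os.path.join(voice_data_dir, \"data\")):\n",
"    subprocess.check_call([\"sh\", \"voice_data.sh\", voice_data_dir or \".\"])\n"
]
},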
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### TESS Dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# check if notebook is in colab\n",
"try:\n",
" # install ezkl\n",
" import google.colab\n",
" import subprocess\n",
" import sys\n",
" subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"ezkl\"])\n",
" subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"onnx\"])\n",
"\n",
"# rely on local installation of ezkl if the notebook is not in colab\n",
"except:\n",
" pass\n",
"\n",
"from torch import nn\n",
"import ezkl\n",
"import os\n",
"import json\n",
"import pandas as pd\n",
"import logging\n",
"\n",
"# read in VOICE_DATA_DIR from environment variable\n",
"\n",
"# FORMAT = '%(levelname)s %(name)s %(asctime)-15s %(filename)s:%(lineno)d %(message)s'\n",
"# logging.basicConfig(format=FORMAT)\n",
"# logging.getLogger().setLevel(logging.INFO)\n",
"\n",
"\n",
"Tess = os.path.join(voice_data_dir, \"data/TESS/\")\n",
"\n",
"tess = os.listdir(Tess)\n",
"\n",
"emotions = []\n",
"files = []\n",
"\n",
"for item in tess:\n",
" items = os.listdir(Tess + item)\n",
" for file in items:\n",
" part = file.split('.')[0]\n",
" part = part.split('_')[2]\n",
" if part == 'ps':\n",
" emotions.append('surprise')\n",
" else:\n",
" emotions.append(part)\n",
" files.append(Tess + item + '/' + file)\n",
"\n",
"tess_df = pd.concat([pd.DataFrame(emotions, columns=['Emotions']), pd.DataFrame(files, columns=['Files'])], axis=1)\n",
"tess_df"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### RAVDESS SONG dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Ravdess = os.path.join(voice_data_dir, \"data/RAVDESS_SONG/audio_song_actors_01-24/\")\n",
"\n",
"ravdess_list = os.listdir(Ravdess)\n",
"\n",
"files = []\n",
"emotions = []\n",
"\n",
"for item in ravdess_list:\n",
" actor = os.listdir(Ravdess + item)\n",
" for file in actor:\n",
" name = file.split('.')[0]\n",
" parts = name.split('-')\n",
" emotions.append(int(parts[2]))\n",
" files.append(Ravdess + item + '/' + file)\n",
"\n",
"emotion_data = pd.DataFrame(emotions, columns=['Emotions'])\n",
"files_data = pd.DataFrame(files, columns=['Files'])\n",
"\n",
"ravdess_song_df = pd.concat([emotion_data, files_data], axis=1)\n",
"\n",
"ravdess_song_df.Emotions.replace({1:'neutral', 2:'calm', 3:'happy', 4:'sad', 5:'angry', 6:'fear', 7:'disgust', 8:'surprise'}, inplace=True)\n",
"\n",
"ravdess_song_df"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### RAVDESS SPEECH Dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Ravdess = os.path.join(voice_data_dir, \"data/RAVDESS_SPEECH/audio_speech_actors_01-24/\")\n",
"\n",
"ravdess_list = os.listdir(Ravdess)\n",
"\n",
"files = []\n",
"emotions = []\n",
"\n",
"for item in ravdess_list:\n",
" actor = os.listdir(Ravdess + item)\n",
" for file in actor:\n",
" name = file.split('.')[0]\n",
" parts = name.split('-')\n",
" emotions.append(int(parts[2]))\n",
" files.append(Ravdess + item + '/' + file)\n",
" \n",
"emotion_data = pd.DataFrame(emotions, columns=['Emotions'])\n",
"files_data = pd.DataFrame(files, columns=['Files'])\n",
"\n",
"ravdess_df = pd.concat([emotion_data, files_data], axis=1)\n",
"\n",
"ravdess_df.Emotions.replace({1:'neutral', 2:'calm', 3:'happy', 4:'sad', 5:'angry', 6:'fear', 7:'disgust', 8:'surprise'}, inplace=True)\n",
"\n",
"ravdess_df"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### CREMA Dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Crema = os.path.join(voice_data_dir, \"data/CREMA-D/\")\n",
"\n",
"crema = os.listdir(Crema)\n",
"emotions = []\n",
"files = []\n",
"\n",
"for item in crema:\n",
" files.append(Crema + item)\n",
" \n",
" parts = item.split('_')\n",
" if parts[2] == 'SAD':\n",
" emotions.append('sad')\n",
" elif parts[2] == 'ANG':\n",
" emotions.append('angry')\n",
" elif parts[2] == 'DIS':\n",
" emotions.append('disgust')\n",
" elif parts[2] == 'FEA':\n",
" emotions.append('fear')\n",
" elif parts[2] == 'HAP':\n",
" emotions.append('happy')\n",
" elif parts[2] == 'NEU':\n",
" emotions.append('neutral')\n",
" else :\n",
" emotions.append('unknown')\n",
" \n",
"emotions_data = pd.DataFrame(emotions, columns=['Emotions'])\n",
"files_data = pd.DataFrame(files, columns=['Files'])\n",
"\n",
"crema_df = pd.concat([emotions_data, files_data], axis=1)\n",
"\n",
"crema_df"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### SAVEE Dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Savee = os.path.join(voice_data_dir,\"data/SAVEE/\")\n",
"\n",
"savee = os.listdir(Savee)\n",
"\n",
"emotions = []\n",
"files = []\n",
"\n",
"for item in savee:\n",
" files.append(Savee + item)\n",
" part = item.split('_')[1]\n",
" ele = part[:-6]\n",
" if ele == 'a':\n",
" emotions.append('angry')\n",
" elif ele == 'd':\n",
" emotions.append('disgust')\n",
" elif ele == 'f':\n",
" emotions.append('fear')\n",
" elif ele == 'h':\n",
" emotions.append('happy')\n",
" elif ele == 'n':\n",
" emotions.append('neutral')\n",
" elif ele == 'sa':\n",
" emotions.append('sad')\n",
" else:\n",
" emotions.append('surprise')\n",
"\n",
"savee_df = pd.concat([pd.DataFrame(emotions, columns=['Emotions']), pd.DataFrame(files, columns=['Files'])], axis=1)\n",
"savee_df"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Combining all datasets"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = pd.concat([ravdess_df, ravdess_song_df, crema_df, tess_df, savee_df], axis = 0)\n",
"# relabel indices\n",
"df.index = range(len(df.index))\n",
"df.to_csv(\"df.csv\",index=False)\n",
"df\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import seaborn as sns\n",
"sns.histplot(data=df, x=\"Emotions\")\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training \n",
"\n",
"Here we convert all audio files into 2D frequency-domain spectrograms so that we can leverage convolutional neural networks, which tend to be more efficient than time-series model like RNNs or LSTMs.\n",
"We thus: \n",
"1. Extract the mel spectrogram from each of the audio recordings. \n",
"2. Rescale each of these to the decibel (DB) scale. \n",
"3. Define the model as the following model: `(x) -> (conv) -> (relu) -> (linear) -> (y)`\n",
"\n",
"\n",
"You may notice that we introduce a second computational graph `(key) -> (key)`. The reasons for this are to do with MEV, and if you are not interested you can skip the following paragraph. \n",
"\n",
"Let's say that obtaining a high score from the judge and then submitting said score to the EVM verifier could result in the issuance of a reward (financial or otherwise). There is an incentive then for MEV bots to scalp any issued valid proof and submit a duplicate transaction with the same proof to the verifier contract in the hopes of obtaining the reward before the original issuer. Here we add `(key) -> (key)` such that the transaction creator's public key / address is both a private input AND a public input to the proof. As such the on-chain verification only succeeds if the key passed in during proof time is also passed in as a public input to the contract. The reward issued by the contract can then be irrevocably tied to that key such that even if the proof is submitted by another actor, the reward would STILL go to the original singer / transaction issuer. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"\n",
"import librosa\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"\n",
"#stft extraction from augmented data\n",
"def extract_mel_spec(filename):\n",
" x,sr=librosa.load(filename,duration=3,offset=0.5)\n",
" X = librosa.feature.melspectrogram(y=x, sr=sr)\n",
" Xdb = librosa.power_to_db(X, ref=np.max)\n",
" Xdb = Xdb.reshape(1,128,-1)\n",
" return Xdb\n",
"\n",
"Xdb=df.iloc[:,1].apply(lambda x: extract_mel_spec(x))\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we convert label to a number between 0 and 1 where 1 is pleasant surprised and 0 is disgust and the rest are floats in between. The model loves pleasantly surprised voices and hates disgust ;) "
]
},
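{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the cell below writes that emotion-to-score mapping out as an explicit dict. It is purely illustrative (the `emotion_score` name is ours) and is equivalent to the chained conditional used in the padding cell that follows."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Reference only: the emotion -> score mapping used below, written explicitly.\n",
"# Labels not listed here ('neutral', 'calm', 'angry', 'unknown') map to 0.8.\n",
"emotion_score = {\n",
"    'surprise': 1.0,\n",
"    'sad': 0.6,\n",
"    'happy': 0.4,\n",
"    'fear': 0.2,\n",
"    'disgust': 0.0,\n",
"}\n",
"print(df['Emotions'].map(lambda e: emotion_score.get(e, 0.8)).head())\n"
]
},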
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get max size\n",
"max_size = 0\n",
"for i in range(len(Xdb)):\n",
" if Xdb[i].shape[2] > max_size:\n",
" max_size = Xdb[i].shape[2]\n",
"\n",
"# 0 pad 2nd dim to max size\n",
"Xdb=Xdb.apply(lambda x: np.pad(x,((0,0),(0,0),(0,max_size-x.shape[2]))))\n",
"\n",
"Xdb=pd.DataFrame(Xdb)\n",
"Xdb['label'] = df['Emotions']\n",
"# convert label to a number between 0 and 1 where 1 is pleasant surprised and 0 is disgust and the rest are floats in betwee\n",
"Xdb['label'] = Xdb['label'].apply(lambda x: 1 if x=='surprise' else 0 if x=='disgust' else 0.2 if x=='fear' else 0.4 if x=='happy' else 0.6 if x=='sad' else 0.8)\n",
"\n",
"Xdb.iloc[0,0][0].shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"# Defines the model\n",
"# we got convs, we got relu, we got linear layers\n",
"# What else could one want ????\n",
"\n",
"class MyModel(nn.Module):\n",
" def __init__(self):\n",
" super(MyModel, self).__init__()\n",
"\n",
" self.conv1 = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=5, stride=4)\n",
"\n",
" self.d1 = nn.Linear(992, 1)\n",
"\n",
" self.sigmoid = nn.Sigmoid()\n",
" self.relu = nn.ReLU()\n",
"\n",
" def forward(self, key, x):\n",
" # 32x1x28x28 => 32x32x26x26\n",
" x = self.conv1(x)\n",
" x = self.relu(x)\n",
" x = x.flatten(start_dim=1)\n",
" x = self.d1(x)\n",
" x = self.sigmoid(x)\n",
"\n",
" return [key, x]\n",
"\n",
"\n",
"circuit = MyModel()\n",
"\n",
"output = circuit(0, torch.tensor(Xdb.iloc[0,0][0].reshape(1,1,128,130)))\n",
"\n",
"output\n",
"\n",
"\n",
"\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we leverage the classic Adam optimizer, coupled with 0.001 weight decay so as to regularize the model. The weight decay (a.k.a L2 regularization) can also help on the zk-circuit end of things in that it prevents inputs to Halo2 lookup tables from falling out of range (lookup tables are how we represent non-linearities like ReLU and Sigmoid inside our circuits). "
]
},
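{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the lookup-range point a bit more concrete, the sketch below illustrates power-of-two fixed-point quantization. This is only a rough illustration of the idea, not ezkl's internal code, and the `scale` value shown is just an example: a float `v` becomes roughly `round(v * 2**scale)`, and that integer has to land inside the lookup table's input range, which is why keeping weights and activations small helps."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Rough illustration only (not ezkl's internal code): power-of-two fixed-point\n",
"# quantization. Smaller activations map to smaller integers, which is why L2\n",
"# regularization helps keep lookup-table inputs in range.\n",
"def quantize(v, scale=7):\n",
"    return round(v * 2 ** scale)\n",
"\n",
"print(quantize(0.5))    # a small activation stays well within a modest lookup range\n",
"print(quantize(250.0))  # a large activation may fall outside it\n"
]
},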
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from tqdm import tqdm\n",
"\n",
"# Train the model using pytorch\n",
"n_epochs = 10 # number of epochs to run\n",
"batch_size = 10 # size of each batch\n",
"\n",
"\n",
"loss_fn = nn.MSELoss() #MSE\n",
"# adds l2 regularization\n",
"optimizer = torch.optim.Adam(circuit.parameters(), lr=0.001, weight_decay=0.001)\n",
"\n",
"# randomly shuffle dataset\n",
"Xdb = Xdb.sample(frac=1).reset_index(drop=True)\n",
"\n",
"# split into train and test and validation sets with 80% train, 10% test, 10% validation\n",
"train = Xdb.iloc[:int(len(Xdb)*0.8)]\n",
"test = Xdb.iloc[int(len(Xdb)*0.8):int(len(Xdb)*0.9)]\n",
"val = Xdb.iloc[int(len(Xdb)*0.9):]\n",
"\n",
"batches_per_epoch = len(train)\n",
"\n",
"\n",
"def get_loss(Xbatch, ybatch):\n",
" y_pred = circuit(0, Xbatch)[1]\n",
" loss = loss_fn(y_pred, ybatch)\n",
" return loss\n",
"\n",
"for epoch in range(n_epochs):\n",
" # X is a torch Variable\n",
" permutation = torch.randperm(batches_per_epoch)\n",
"\n",
" with tqdm(range(batches_per_epoch), unit=\"batch\", mininterval=0) as bar:\n",
" bar.set_description(f\"Epoch {epoch}\")\n",
" for i in bar:\n",
" start = i * batch_size\n",
" # take a batch\n",
" indices = np.random.choice(batches_per_epoch, batch_size)\n",
"\n",
" data = np.concatenate(train.iloc[indices.tolist(),0].values)\n",
" labels = train.iloc[indices.tolist(),1].values.astype(np.float32)\n",
"\n",
" data = data.reshape(batch_size,1,128,130)\n",
" labels = labels.reshape(batch_size,1)\n",
"\n",
" # convert to tensors\n",
" Xbatch = torch.tensor(data)\n",
" ybatch = torch.tensor(labels)\n",
"\n",
" # forward pass\n",
" loss = get_loss(Xbatch, ybatch)\n",
" # backward pass\n",
" optimizer.zero_grad()\n",
" loss.backward()\n",
" # update weights\n",
" optimizer.step()\n",
"\n",
" bar.set_postfix(\n",
" batch_loss=float(loss),\n",
" )\n",
" # get validation loss\n",
" val_data = np.concatenate(val.iloc[:,0].values)\n",
" val_labels = val.iloc[:,1].values.astype(np.float32)\n",
" val_data = val_data.reshape(len(val),1,128,130)\n",
" val_labels = val_labels.reshape(len(val),1)\n",
" val_loss = get_loss(torch.tensor(val_data), torch.tensor(val_labels))\n",
" print(f\"Validation loss: {val_loss}\")\n",
"\n",
"\n",
"\n",
"# get validation loss\n",
"test_data = np.concatenate(test.iloc[:,0].values)\n",
"test_labels = val.iloc[:,1].values.astype(np.float32)\n",
"test_data = val_data.reshape(len(val),1,128,130)\n",
"test_labels = val_labels.reshape(len(val),1)\n",
"test_loss = get_loss(torch.tensor(test_data), torch.tensor(test_labels))\n",
"print(f\"Test loss: {test_loss}\")\n",
"\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#\n",
"val_data = {\n",
" \"input_data\": [np.zeros(100).tolist(), np.concatenate(val.iloc[:100,0].values).flatten().tolist()],\n",
"}\n",
"# save as json file\n",
"with open(\"val_data.json\", \"w\") as f:\n",
" json.dump(val_data, f)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x = 0.1*torch.rand(1,*[1, 128, 130], requires_grad=True)\n",
"key = torch.rand(1,*[1], requires_grad=True)\n",
"\n",
"# Flips the neural net into inference mode\n",
"circuit.eval()\n",
"\n",
" # Export the model\n",
"torch.onnx.export(circuit, # model being run\n",
" (key, x), # model input (or a tuple for multiple inputs)\n",
" \"network.onnx\", # where to save the model (can be a file or file-like object)\n",
" export_params=True, # store the trained parameter weights inside the model file\n",
" opset_version=10, # the ONNX version to export the model to\n",
" do_constant_folding=True, # whether to execute constant folding for optimization\n",
" input_names = ['input'], # the model's input names\n",
" output_names = ['output'], # the model's output names\n",
" dynamic_axes={'input' : {0 : 'batch_size'},\n",
" 'input.1' : {0 : 'batch_size'}, # variable length axes\n",
" 'output' : {0 : 'batch_size'}})\n",
"\n",
"key_array = ((key).detach().numpy()).reshape([-1]).tolist()\n",
"data_array = ((x).detach().numpy()).reshape([-1]).tolist()\n",
"\n",
"data = dict(input_data = [key_array, data_array])\n",
"\n",
" # Serialize data into file:\n",
"json.dump( data, open(\"input.json\", 'w' ))\n",
"\n",
"\n",
"# ezkl.export(circuit, input_shape = [[1], [1025, 130]], run_gen_witness=False, run_calibrate_settings=False)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we set the visibility of the different parts of the circuit, whereby the model params and the outputs of the computational graph (the key and the judgment) are public"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import ezkl\n",
"import os \n",
"\n",
"model_path = os.path.join('network.onnx')\n",
"compiled_model_path = os.path.join('network.compiled')\n",
"pk_path = os.path.join('test.pk')\n",
"vk_path = os.path.join('test.vk')\n",
"settings_path = os.path.join('settings.json')\n",
"srs_path = os.path.join('kzg.params')\n",
"data_path = os.path.join('input.json')\n",
"val_data = os.path.join('val_data.json')\n",
"\n",
"run_args = ezkl.PyRunArgs()\n",
"run_args.input_visibility = \"private\"\n",
"run_args.param_visibility = \"fixed\"\n",
"run_args.output_visibility = \"public\"\n",
"run_args.variables = [(\"batch_size\", 1)]\n",
"\n",
"\n",
"# TODO: Dictionary outputs\n",
"res = ezkl.gen_settings(model_path, settings_path, py_run_args=run_args)\n",
"assert res == True\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we generate a settings file. This file basically instantiates a bunch of parameters that determine their circuit shape, size etc... Because of the way we represent nonlinearities in the circuit (using Halo2's [lookup tables](https://zcash.github.io/halo2/design/proving-system/lookup.html)), it is often best to _calibrate_ this settings file as some data can fall out of range of these lookups.\n",
"\n",
"You can pass a dataset for calibration that will be representative of real inputs you might find if and when you deploy the prover. Here we use the validation dataset we used during training. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"\n",
"res = ezkl.calibrate_settings(val_data, model_path, settings_path, \"resources\", scales = [4])\n",
"assert res == True\n",
"print(\"verified\")\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"res = ezkl.compile_circuit(model_path, compiled_model_path, settings_path)\n",
"assert res == True"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As we use Halo2 with KZG-commitments we need an SRS string from (preferably) a multi-party trusted setup ceremony. For an overview of the procedures for such a ceremony check out [this page](https://blog.ethereum.org/2023/01/16/announcing-kzg-ceremony). The `get_srs` command retrieves a correctly sized SRS given the calibrated settings file from [here](https://github.com/han0110/halo2-kzg-srs). \n",
"\n",
"These SRS were generated with [this](https://github.com/privacy-scaling-explorations/perpetualpowersoftau) ceremony. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"res = await ezkl.get_srs(settings_path)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"We now need to generate the (partial) circuit witness. These are the model outputs (and any hashes) that are generated when feeding the previously generated `input.json` through the circuit / model. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"witness_path = \"witness.json\"\n",
"\n",
"res = ezkl.gen_witness(data_path, compiled_model_path, witness_path)\n",
"assert os.path.isfile(witness_path)"
]
},
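{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, we can peek at the witness file we just generated. This is only a sanity-check sketch: the exact JSON layout varies between ezkl versions, so we just print the top-level keys rather than assuming particular field names."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sanity check: the witness layout differs between ezkl versions,\n",
"# so we only print the top-level keys here.\n",
"with open(witness_path, \"r\") as f:\n",
"    witness = json.load(f)\n",
"print(list(witness.keys()))\n"
]
},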
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sanity check we can run a mock proof. This just checks that all the constraints are valid. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"\n",
"res = ezkl.mock(witness_path, compiled_model_path)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we setup verifying and proving keys for the circuit. As the name suggests the proving key is needed for ... proving and the verifying key is needed for ... verifying. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!export RUST_LOG=trace\n",
"# HERE WE SETUP THE CIRCUIT PARAMS\n",
"# WE GOT KEYS\n",
"# WE GOT CIRCUIT PARAMETERS\n",
"# EVERYTHING ANYONE HAS EVER NEEDED FOR ZK\n",
"res = ezkl.setup(\n",
" compiled_model_path,\n",
" vk_path,\n",
" pk_path,\n",
" \n",
" )\n",
"\n",
"assert res == True\n",
"assert os.path.isfile(vk_path)\n",
"assert os.path.isfile(pk_path)\n",
"assert os.path.isfile(settings_path)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we generate a full proof. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# GENERATE A PROOF\n",
"\n",
"proof_path = os.path.join('test.pf')\n",
"\n",
"res = ezkl.prove(\n",
" witness_path,\n",
" compiled_model_path,\n",
" pk_path,\n",
" proof_path,\n",
" \n",
" \"single\",\n",
" )\n",
"\n",
"print(res)\n",
"assert os.path.isfile(proof_path)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"And verify it as a sanity check. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# VERIFY IT\n",
"\n",
"res = ezkl.verify(\n",
" proof_path,\n",
" settings_path,\n",
" vk_path,\n",
" \n",
" )\n",
"\n",
"assert res == True\n",
"print(\"verified\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"abi_path = 'test.abi'\n",
"sol_code_path = 'test.sol'\n",
"vk_path = os.path.join('test.vk')\n",
"srs_path = os.path.join('kzg.params')\n",
"settings_path = os.path.join('settings.json')\n",
"\n",
"\n",
"res = await ezkl.create_evm_verifier(\n",
" vk_path,\n",
" \n",
" settings_path,\n",
" sol_code_path,\n",
" abi_path,\n",
" )\n",
"\n",
"assert res == True"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Verify if the Verifier Works Locally"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Deploy The Contract"
]
},
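{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The deployment cell below needs a local EVM node. If you prefer to launch one from inside the notebook, the optional sketch below starts `anvil` on port 3030; it assumes the foundry toolchain (which provides `anvil`) is installed and on your PATH. Otherwise, run `anvil -p 3030` in a separate terminal as noted in the next cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional: start a local anvil node from the notebook. This assumes the\n",
"# foundry toolchain is installed; alternatively run `anvil -p 3030` in a\n",
"# separate terminal.\n",
"import subprocess\n",
"import time\n",
"\n",
"anvil_proc = subprocess.Popen([\"anvil\", \"-p\", \"3030\"])\n",
"time.sleep(2)  # give the node a moment to come up\n"
]
},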
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Make sure anvil is running locally first\n",
"# run with $ anvil -p 3030\n",
"# we use the default anvil node here\n",
"import json\n",
"\n",
"address_path = os.path.join(\"address.json\")\n",
"\n",
"res = await ezkl.deploy_evm(\n",
" address_path,\n",
" 'http://127.0.0.1:3030',\n",
" sol_code_path,\n",
")\n",
"\n",
"assert res == True\n",
"\n",
"with open(address_path, 'r') as file:\n",
" addr = file.read().rstrip()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# make sure anvil is running locally\n",
"# $ anvil -p 3030\n",
"\n",
"res = await ezkl.verify_evm(\n",
" addr,\n",
" \"http://127.0.0.1:3030\",\n",
" proof_path\n",
")\n",
"assert res == True"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.15"
}
},
"nbformat": 4,
"nbformat_minor": 2
}