ezkl/examples/notebooks/hashed_vis.ipynb

{
    "cells": [
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# hashed-ezkl\n",
                "\n",
                "Here's an example leveraging EZKL whereby the inputs to the model, and the model params themselves, are hashed inside a circuit.\n",
                "\n",
                "In this setup:\n",
                "- the hashes are publicly known to the prover and verifier\n",
                "- the hashes serve as \"public inputs\" (a.k.a instances) to the circuit\n",
                "\n",
                "We leave the outputs of the model as public as well (known to the  verifier and prover). \n"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": []
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "First we import the necessary dependencies and set up logging to be as informative as possible. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# check if notebook is in colab\n",
                "try:\n",
                "    # install ezkl\n",
                "    import google.colab\n",
                "    import subprocess\n",
                "    import sys\n",
                "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"ezkl\"])\n",
                "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"onnx\"])\n",
                "\n",
                "# rely on local installation of ezkl if the notebook is not in colab\n",
                "except:\n",
                "    pass\n",
                "\n",
                "from torch import nn\n",
                "import ezkl\n",
                "import os\n",
                "import json\n",
                "import logging\n",
                "\n",
                "# uncomment for more descriptive logging \n",
                "FORMAT = '%(levelname)s %(name)s %(asctime)-15s %(filename)s:%(lineno)d %(message)s'\n",
                "logging.basicConfig(format=FORMAT)\n",
                "logging.getLogger().setLevel(logging.INFO)\n"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Now we define our model. It is a humble model with but a conv layer and a $ReLU$ non-linearity, but it is a model nonetheless"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "import torch\n",
                "# Defines the model\n",
                "# we got convs, we got relu, \n",
                "# What else could one want ????\n",
                "\n",
                "class MyModel(nn.Module):\n",
                "    def __init__(self):\n",
                "        super(MyModel, self).__init__()\n",
                "\n",
                "        self.conv1 = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=5, stride=4)\n",
                "        self.relu = nn.ReLU()\n",
                "\n",
                "    def forward(self, x):\n",
                "        x = self.conv1(x)\n",
                "        x = self.relu(x)\n",
                "\n",
                "        return x\n",
                "\n",
                "\n",
                "circuit = MyModel()\n",
                "\n",
                "# this is where you'd train your model\n",
                "\n",
                "\n"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "We omit training for purposes of this demonstration. We've marked where training would happen in the cell above. \n",
                "Now we export the model to onnx and create a corresponding (randomly generated) input file.\n",
                "\n",
                "You can replace the random `x` with real data if you so wish. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "x = torch.rand(1,*[3, 8, 8], requires_grad=True)\n",
                "\n",
                "# Flips the neural net into inference mode\n",
                "circuit.eval()\n",
                "\n",
                "    # Export the model\n",
                "torch.onnx.export(circuit,               # model being run\n",
                "                      x,                   # model input (or a tuple for multiple inputs)\n",
                "                      \"network.onnx\",            # where to save the model (can be a file or file-like object)\n",
                "                      export_params=True,        # store the trained parameter weights inside the model file\n",
                "                      opset_version=10,          # the ONNX version to export the model to\n",
                "                      do_constant_folding=True,  # whether to execute constant folding for optimization\n",
                "                      input_names = ['input'],   # the model's input names\n",
                "                      output_names = ['output'], # the model's output names\n",
                "                      dynamic_axes={'input' : {0 : 'batch_size'},    # variable length axes\n",
                "                                    'output' : {0 : 'batch_size'}})\n",
                "\n",
                "data_array = ((x).detach().numpy()).reshape([-1]).tolist()\n",
                "\n",
                "data = dict(input_data = [data_array])\n",
                "\n",
                "    # Serialize data into file:\n",
                "json.dump( data, open(\"input.json\", 'w' ))\n",
                "\n"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "This is where the magic happens. We define our `PyRunArgs` objects which contains the visibility parameters for out model. \n",
                "- `input_visibility` defines the visibility of the model inputs\n",
                "- `param_visibility` defines the visibility of the model weights and constants and parameters \n",
                "- `output_visibility` defines the visibility of the model outputs\n",
                "\n",
                "There are currently 6 visibility settings:\n",
                "- `public`: known to both the verifier and prover (a subtle nuance is that this may not be the case for model parameters but until we have more rigorous theoretical results we don't want to make strong claims as to this). \n",
                "- `private`: known only to the prover\n",
                "- `fixed`: known to the prover and verifier (as a commit), but not modifiable by the prover.\n",
                "- `hashed`: the hash pre-image is known to the prover, the prover and verifier know the hash. The prover proves that the they know the pre-image to the hash. \n",
                "- `encrypted`: the non-encrypted element and the secret key used for decryption are known to the prover. The prover and the verifier know the encrypted element, the public key used to encrypt, and the hash of the decryption hey. The prover proves that they know the pre-image of the hashed decryption key and that this key can in fact decrypt the encrypted message.\n",
                "- `kzgcommit`: unblinded advice column which generates a kzg commitment. This doesn't appear in the instances of the circuit and must instead be inserted directly within the proof bytes.  \n",
                "\n",
                "\n",
                "Here we create the following setup:\n",
                "- `input_visibility`: \"hashed\"\n",
                "- `param_visibility`: \"hashed\"\n",
                "- `output_visibility`: public\n",
                "\n",
                "We encourage you to play around with other setups :) \n",
                "\n",
                "Shoutouts: \n",
                "\n",
                "- [summa-solvency](https://github.com/summa-dev/summa-solvency) for their help with the poseidon hashing chip. \n",
                "- [timeofey](https://github.com/timoftime) for providing inspiration in our developement of the el-gamal encryption circuit in Halo2. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "import ezkl\n",
                "\n",
                "model_path = os.path.join('network.onnx')\n",
                "compiled_model_path = os.path.join('network.compiled')\n",
                "pk_path = os.path.join('test.pk')\n",
                "vk_path = os.path.join('test.vk')\n",
                "settings_path = os.path.join('settings.json')\n",
                "srs_path = os.path.join('kzg.srs')\n",
                "data_path = os.path.join('input.json')\n",
                "\n",
                "run_args = ezkl.PyRunArgs()\n",
                "run_args.input_visibility = \"hashed\"\n",
                "run_args.param_visibility = \"hashed\"\n",
                "run_args.output_visibility = \"public\"\n",
                "run_args.variables = [(\"batch_size\", 1)]\n",
                "\n",
                "\n",
                "\n"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Now we generate a settings file. This file basically instantiates a bunch of parameters that determine their circuit shape, size etc... Because of the way we represent nonlinearities in the circuit (using Halo2's [lookup tables](https://zcash.github.io/halo2/design/proving-system/lookup.html)), it is often best to _calibrate_ this settings file as some data can fall out of range of these lookups.\n",
                "\n",
                "You can pass a dataset for calibration that will be representative of real inputs you might find if and when you deploy the prover. Here we create a dummy calibration dataset for demonstration purposes. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "!RUST_LOG=trace\n",
                "# TODO: Dictionary outputs\n",
                "res = ezkl.gen_settings(model_path, settings_path, py_run_args=run_args)\n",
                "assert res == True"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# generate a bunch of dummy calibration data\n",
                "cal_data = {\n",
                "    \"input_data\": [torch.cat((x, torch.rand(10, *[3, 8, 8]))).flatten().tolist()],\n",
                "}\n",
                "\n",
                "cal_path = os.path.join('val_data.json')\n",
                "# save as json file\n",
                "with open(cal_path, \"w\") as f:\n",
                "    json.dump(cal_data, f)\n",
                "\n",
                "res = ezkl.calibrate_settings(cal_path, model_path, settings_path, \"resources\")"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "res = ezkl.compile_circuit(model_path, compiled_model_path, settings_path)\n",
                "assert res == True"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "As we use Halo2 with KZG-commitments we need an SRS string from (preferably) a multi-party trusted setup ceremony. For an overview of the procedures for such a ceremony check out [this page](https://blog.ethereum.org/2023/01/16/announcing-kzg-ceremony). The `get_srs` command retrieves a correctly sized SRS given the calibrated settings file from [here](https://github.com/han0110/halo2-kzg-srs). \n",
                "\n",
                "These SRS were generated with [this](https://github.com/privacy-scaling-explorations/perpetualpowersoftau) ceremony. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "res = ezkl.get_srs(srs_path, settings_path)\n"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "We now need to generate the (partial) circuit witness. These are the model outputs (and any hashes) that are generated when feeding the previously generated `input.json` through the circuit / model. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "!export RUST_BACKTRACE=1\n",
                "\n",
                "witness_path = \"witness.json\"\n",
                "\n",
                "res = ezkl.gen_witness(data_path, compiled_model_path, witness_path)"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "As a sanity check you can \"mock prove\" (i.e check that all the constraints of the circuit match without generate a full proof). "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "\n",
                "\n",
                "res = ezkl.mock(witness_path, compiled_model_path)"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Here we setup verifying and proving keys for the circuit. As the name suggests the proving key is needed for ... proving and the verifying key is needed for ... verifying. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# HERE WE SETUP THE CIRCUIT PARAMS\n",
                "# WE GOT KEYS\n",
                "# WE GOT CIRCUIT PARAMETERS\n",
                "# EVERYTHING ANYONE HAS EVER NEEDED FOR ZK\n",
                "res = ezkl.setup(\n",
                "        compiled_model_path,\n",
                "        vk_path,\n",
                "        pk_path,\n",
                "        srs_path,\n",
                "    )\n",
                "\n",
                "assert res == True\n",
                "assert os.path.isfile(vk_path)\n",
                "assert os.path.isfile(pk_path)\n",
                "assert os.path.isfile(settings_path)"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Now we generate a full proof. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# GENERATE A PROOF\n",
                "\n",
                "proof_path = os.path.join('test.pf')\n",
                "\n",
                "res = ezkl.prove(\n",
                "        witness_path,\n",
                "        compiled_model_path,\n",
                "        pk_path,\n",
                "        proof_path,\n",
                "        srs_path,\n",
                "        \"single\",\n",
                "    )\n",
                "\n",
                "print(res)\n",
                "assert os.path.isfile(proof_path)"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "And verify it as a sanity check. "
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# VERIFY IT\n",
                "\n",
                "res = ezkl.verify(\n",
                "        proof_path,\n",
                "        settings_path,\n",
                "        vk_path,\n",
                "        srs_path,\n",
                "    )\n",
                "\n",
                "assert res == True\n",
                "print(\"verified\")"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "We can now create an EVM / `.sol` verifier that can be deployed on chain to verify submitted proofs using a view function."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "\n",
                "abi_path = 'test.abi'\n",
                "sol_code_path = 'test.sol'\n",
                "\n",
                "res = ezkl.create_evm_verifier(\n",
                "        vk_path,\n",
                "        srs_path,\n",
                "        settings_path,\n",
                "        sol_code_path,\n",
                "        abi_path,\n",
                "    )\n",
                "assert res == True\n"
            ]
        },
        {
            "attachments": {},
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Verify on the evm"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Make sure anvil is running locally first\n",
                "# run with $ anvil -p 3030\n",
                "# we use the default anvil node here\n",
                "import json\n",
                "\n",
                "address_path = os.path.join(\"address.json\")\n",
                "\n",
                "res = ezkl.deploy_evm(\n",
                "    address_path,\n",
                "    sol_code_path,\n",
                "    'http://127.0.0.1:3030'\n",
                ")\n",
                "\n",
                "assert res == True\n",
                "\n",
                "with open(address_path, 'r') as file:\n",
                "    addr = file.read().rstrip()"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": [
                "# make sure anvil is running locally\n",
                "# $ anvil -p 3030\n",
                "\n",
                "res = ezkl.verify_evm(\n",
                "    proof_path,\n",
                "    addr,\n",
                "    \"http://127.0.0.1:3030\"\n",
                ")\n",
                "assert res == True"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": null,
            "metadata": {},
            "outputs": [],
            "source": []
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "ezkl",
            "language": "python",
            "name": "python3"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "ipython",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.9.13"
        },
        "orig_nbformat": 4
    },
    "nbformat": 4,
    "nbformat_minor": 2
}