SHARK

High Performance Machine Learning and Data Analytics for CPUs, GPUs, Accelerators and Heterogeneous Clusters



Installation

Installation (Linux and macOS)

Set up a new pip Virtual Environment

This step sets up a new virtual environment for Python.

```shell
python --version  # Check that you have Python 3.7-3.10 on Linux or 3.10 on macOS
python -m venv shark_venv
source shark_venv/bin/activate

# If you are using conda, create and activate a new conda env instead.

# Some older pip installs may not be able to handle the recent PyTorch deps
python -m pip install --upgrade pip
```

macOS Metal users: please install the Vulkan SDK from https://sdk.lunarg.com/sdk/download/latest/mac/vulkan-sdk.dmg and enable "System wide install".

Install SHARK

This step pip-installs SHARK and related packages on Linux (Python 3.7, 3.8, 3.9, or 3.10) and macOS (Python 3.10).

```shell
pip install nodai-shark -f https://github.com/nod-ai/SHARK/releases -f https://github.com/llvm/torch-mlir/releases -f https://github.com/nod-ai/shark-runtime/releases --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```

If you are on an Intel macOS machine, you need this workaround for an upstream issue.
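To sanity-check the install, you can import the package and print its location. This is a minimal smoke test; `shark` is the package name used by the API examples later in this README:

```python
# Minimal smoke test: the package should import without errors.
import shark
print("SHARK imported from", shark.__file__)
```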

Download and run Resnet50 sample

```shell
curl -O https://raw.githubusercontent.com/nod-ai/SHARK/main/shark/examples/shark_inference/resnet50_script.py
# Install deps for the test script
pip install --pre torch torchvision torchaudio tqdm pillow gsutil --extra-index-url https://download.pytorch.org/whl/nightly/cpu
python ./resnet50_script.py --device="cpu"  # Use "cuda", "vulkan", or "metal" for other devices
```

Download and run BERT (MiniLM) sample

```shell
curl -O https://raw.githubusercontent.com/nod-ai/SHARK/main/shark/examples/shark_inference/minilm_jit.py
# Install deps for the test script
pip install transformers torch --extra-index-url https://download.pytorch.org/whl/nightly/cpu
python ./minilm_jit.py --device="cpu"  # Use "cuda", "vulkan", or "metal" for other devices
```

Source Installation

Check out the code

```shell
git clone https://github.com/nod-ai/SHARK.git
```

Set up your Python Virtual Environment and Dependencies

```shell
# Setup venv and install necessary packages (torch-mlir, nodLabs/Shark, ...).
./setup_venv.sh
source shark.venv/bin/activate
```

For example, if you want to use Python 3.10 and upstream IREE with the TF import tools, set environment variables like so:

```shell
PYTHON=python3.10 VENV_DIR=0617_venv IMPORTER=1 USE_IREE=1 ./setup_venv.sh
```

If you are a Torch-MLIR or IREE developer and want to test local changes, you can uninstall the provided packages with `pip uninstall torch-mlir` and/or `pip uninstall iree-compiler iree-runtime`, build locally with Python bindings, and set your `PYTHONPATH` as described in the IREE and Torch-MLIR documentation.

How to use your locally built Torch-MLIR with SHARK

1.) Run `./setup_venv.sh` in SHARK and activate the `shark.venv` virtual environment.
2.) Run `pip uninstall torch-mlir`.
3.) Go to your local Torch-MLIR directory.
4.) Activate the `mlir_venv` virtual environment.
5.) Run `pip uninstall -r requirements.txt`.
6.) Run `pip install -r requirements.txt`.
7.) Build Torch-MLIR.
8.) Activate the `shark.venv` virtual environment from the Torch-MLIR directory.
9.) Run `export PYTHONPATH=$(pwd)/build/tools/torch-mlir/python_packages/torch_mlir:$(pwd)/examples` in the Torch-MLIR directory.
10.) Go to the SHARK directory.

SHARK will now use your locally built Torch-MLIR.
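To confirm the local build is being picked up, a minimal check (assuming the `PYTHONPATH` export from step 9.) is:

```python
# Run inside shark.venv after exporting PYTHONPATH as above.
# The printed path should point into your Torch-MLIR build tree,
# not the venv's site-packages.
import torch_mlir
print(torch_mlir.__file__)
```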

Run a demo script

```shell
python -m shark.examples.shark_inference.resnet50_script --device="cpu"  # Use gpu | vulkan
# Or a pytest
pytest tank/test_models.py -k "MiniLM"
```

Testing and Benchmarks

Run all model tests on CPU/GPU/VULKAN/Metal

```shell
pytest tank/test_models.py

# If on Linux, run the tests with multiple workers on CPU for faster results:
pytest tank/test_models.py -n auto
```

Running specific tests


```shell
# Search for test cases with a keyword that matches all or part of the test case's name:
pytest tank/test_models.py -k "keyword"

# Test cases are named uniformly in the format:
# test_module_<model_name_underscores_only>_<torch/tf>_<static/dynamic>_<device>

# Example: test all models on NVIDIA GPU:
pytest tank/test_models.py -k "cuda"

# Example: test all TensorFlow ResNet models on the Vulkan backend:
pytest tank/test_models.py -k "resnet and tf and vulkan"

# Exclude a test case:
pytest tank/test_models.py -k "not ..."
```

Run benchmarks on SHARK tank pytests and generate bench_results.csv with results

(the following requires source installation with `IMPORTER=1 ./setup_venv.sh`)

```shell
pytest --benchmark tank/test_models.py

# Just do static GPU benchmarks for PyTorch tests:
pytest --benchmark tank/test_models.py -k "pytorch and static and cuda"
```
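To inspect the results afterwards, a minimal sketch (assuming `bench_results.csv` is written to the current working directory, as the heading above suggests):

```python
# Print the generated benchmark results row by row.
import csv

with open("bench_results.csv", newline="") as f:
    for row in csv.reader(f):
        print(row)
```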

Benchmark Resnet50, MiniLM on CPU

(requires source installation with `IMPORTER=1 ./setup_venv.sh`)

```shell
# We suggest running the following commands as root before running benchmarks on CPU.
# The first command takes each CPU's SMT sibling thread offline; the second disables
# turbo boost. Both reduce run-to-run variance in the benchmark numbers.
cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | awk -F, '{print $2}' | sort -n | uniq | ( while read X ; do echo $X ; echo 0 > /sys/devices/system/cpu/cpu$X/online ; done )
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
```

```shell
# Benchmark canonical Resnet50 on CPU via pytest
pytest --benchmark tank/test_models.py -k "resnet50 and tf_static_cpu"

# Benchmark canonical MiniLM on CPU via pytest
pytest --benchmark tank/test_models.py -k "MiniLM and cpu"

# Benchmark MiniLM on CPU via transformer-benchmarks:
git clone --recursive https://github.com/nod-ai/transformer-benchmarks.git
cd transformer-benchmarks
./perf-ci.sh -n
# Check detail.csv for MLIR/IREE results.
```

API Reference

Shark Inference API


```python
from shark.shark_importer import SharkImporter

# SharkImporter produces an MLIR module from a torch, tensorflow, or tf-lite module.
# `torch_module` and `input` are placeholders for your model and example input.
mlir_importer = SharkImporter(
    torch_module,
    (input,),
    frontend="torch",  # Or "tf", "tf-lite"
)
torch_mlir, func_name = mlir_importer.import_mlir(tracing_required=True)

# SharkInference accepts MLIR in the linalg, mhlo, or tosa dialect.
from shark.shark_inference import SharkInference

shark_module = SharkInference(torch_mlir, func_name, device="cpu", mlir_dialect="linalg")
shark_module.compile()
result = shark_module.forward((input,))
```
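Putting the two APIs together, here is a complete, runnable sketch of the same flow (the `TwoLayerNet` module, shapes, and input are illustrative placeholders, not part of the SHARK API):

```python
import torch

from shark.shark_importer import SharkImporter
from shark.shark_inference import SharkInference

# A toy model used only to illustrate the import/compile/run flow.
class TwoLayerNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(8, 16)
        self.fc2 = torch.nn.Linear(16, 4)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TwoLayerNet().eval()
example_input = torch.randn(1, 8)

# Trace the model into MLIR via the torch frontend.
importer = SharkImporter(model, (example_input,), frontend="torch")
mlir_module, func_name = importer.import_mlir(tracing_required=True)

# Compile for CPU and run.
shark_module = SharkInference(mlir_module, func_name, device="cpu", mlir_dialect="linalg")
shark_module.compile()
print(shark_module.forward((example_input,)))
```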

Example demonstrating running MHLO IR:

```python
from shark.shark_inference import SharkInference
import numpy as np

mhlo_ir = r"""builtin.module {
  func.func @forward(%arg0: tensor<1x4xf32>, %arg1: tensor<4x1xf32>) -> tensor<4x4xf32> {
    %0 = chlo.broadcast_add %arg0, %arg1 : (tensor<1x4xf32>, tensor<4x1xf32>) -> tensor<4x4xf32>
    %1 = "mhlo.abs"(%0) : (tensor<4x4xf32>) -> tensor<4x4xf32>
    return %1 : tensor<4x4xf32>
  }
}"""

arg0 = np.ones((1, 4)).astype(np.float32)
arg1 = np.ones((4, 1)).astype(np.float32)
shark_module = SharkInference(mhlo_ir, func_name="forward", device="cpu", mlir_dialect="mhlo")
shark_module.compile()
result = shark_module.forward((arg0, arg1))
```
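The IR broadcast-adds a (1, 4) row against a (4, 1) column and then takes the absolute value, so with all-ones inputs every element of the 4x4 result is 2.0. A quick NumPy cross-check (assuming `forward` returns a NumPy-compatible array, as the types above suggest):

```python
# NumPy reference for the MHLO computation above: abs(broadcast_add(arg0, arg1)).
expected = np.abs(arg0 + arg1)  # 4x4 array of 2.0 for the all-ones inputs
np.testing.assert_allclose(result, expected)
```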

Supported and Validated Models

PyTorch Models

Huggingface PyTorch Models

| Hugging Face Models | Torch-MLIR lowerable | SHARK-CPU | SHARK-CUDA | SHARK-METAL |
|---|---|---|---|---|
| BERT | 💚 (JIT) | 💚 | 💚 | 💚 |
| Albert | 💚 (JIT) | 💚 | 💚 | 💚 |
| BigBird | 💚 (AOT) | | | |
| DistilBERT | 💚 (JIT) | 💚 | 💚 | 💚 |
| GPT2 | 💔 (AOT) | | | |
| MobileBert | 💚 (JIT) | 💚 | 💚 | 💚 |

Torchvision Models

| TORCHVISION Models | Torch-MLIR lowerable | SHARK-CPU | SHARK-CUDA | SHARK-METAL |
|---|---|---|---|---|
| AlexNet | 💚 (Script) | 💚 | 💚 | 💚 |
| DenseNet121 | 💚 (Script) | | | |
| MNasNet1_0 | 💚 (Script) | 💚 | 💚 | 💚 |
| MobileNetV2 | 💚 (Script) | 💚 | 💚 | 💚 |
| MobileNetV3 | 💚 (Script) | 💚 | 💚 | 💚 |
| Unet | 💔 (Script) | | | |
| Resnet18 | 💚 (Script) | 💚 | 💚 | 💚 |
| Resnet50 | 💚 (Script) | 💚 | 💚 | 💚 |
| Resnet101 | 💚 (Script) | 💚 | 💚 | 💚 |
| Resnext50_32x4d | 💚 (Script) | 💚 | 💚 | 💚 |
| ShuffleNet_v2 | 💔 (Script) | | | |
| SqueezeNet | 💚 (Script) | 💚 | 💚 | 💚 |
| EfficientNet | 💚 (Script) | | | |
| Regnet | 💚 (Script) | 💚 | 💚 | 💚 |
| Resnest | 💔 (Script) | | | |
| Vision Transformer | 💚 (Script) | | | |
| VGG 16 | 💚 (Script) | 💚 | 💚 | |
| Wide Resnet | 💚 (Script) | 💚 | 💚 | 💚 |
| RAFT | 💔 (JIT) | | | |

For more information, refer to the MODEL TRACKING SHEET.

PyTorch Training Models

| Models | Torch-MLIR lowerable | SHARK-CPU | SHARK-CUDA | SHARK-METAL |
|---|---|---|---|---|
| BERT | 💔 | 💔 | | |
| FullyConnected | 💚 | 💚 | | |

JAX Models

JAX Models

| Models | JAX-MHLO lowerable | SHARK-CPU | SHARK-CUDA | SHARK-METAL |
|---|---|---|---|---|
| DALL-E | 💔 | 💔 | | |
| FullyConnected | 💚 | 💚 | | |

TFLite Models

TFLite Models

| Models | TOSA/LinAlg | SHARK-CPU | SHARK-CUDA | SHARK-METAL |
|---|---|---|---|---|
| BERT | 💔 | 💔 | | |
| FullyConnected | 💚 | 💚 | | |
| albert | 💚 | 💚 | | |
| asr_conformer | 💚 | 💚 | | |
| bird_classifier | 💚 | 💚 | | |
| cartoon_gan | 💚 | 💚 | | |
| craft_text | 💚 | 💚 | | |
| deeplab_v3 | 💚 | 💚 | | |
| densenet | 💚 | 💚 | | |
| east_text_detector | 💚 | 💚 | | |
| efficientnet_lite0_int8 | 💚 | 💚 | | |
| efficientnet | 💚 | 💚 | | |
| gpt2 | 💚 | 💚 | | |
| image_stylization | 💚 | 💚 | | |
| inception_v4 | 💚 | 💚 | | |
| inception_v4_uint8 | 💚 | 💚 | | |
| lightning_fp16 | 💚 | 💚 | | |
| lightning_i8 | 💚 | 💚 | | |
| lightning | 💚 | 💚 | | |
| magenta | 💚 | 💚 | | |
| midas | 💚 | 💚 | | |
| mirnet | 💚 | 💚 | | |
| mnasnet | 💚 | 💚 | | |
| mobilebert_edgetpu_s_float | 💚 | 💚 | | |
| mobilebert_edgetpu_s_quant | 💚 | 💚 | | |
| mobilebert | 💚 | 💚 | | |
| mobilebert_tf2_float | 💚 | 💚 | | |
| mobilebert_tf2_quant | 💚 | 💚 | | |
| mobilenet_ssd_quant | 💚 | 💚 | | |
| mobilenet_v1 | 💚 | 💚 | | |
| mobilenet_v1_uint8 | 💚 | 💚 | | |
| mobilenet_v2_int8 | 💚 | 💚 | | |
| mobilenet_v2 | 💚 | 💚 | | |
| mobilenet_v2_uint8 | 💚 | 💚 | | |
| mobilenet_v3-large | 💚 | 💚 | | |
| mobilenet_v3-large_uint8 | 💚 | 💚 | | |
| mobilenet_v35-int8 | 💚 | 💚 | | |
| nasnet | 💚 | 💚 | | |
| person_detect | 💚 | 💚 | | |
| posenet | 💚 | 💚 | | |
| resnet_50_int8 | 💚 | 💚 | | |
| rosetta | 💚 | 💚 | | |
| spice | 💚 | 💚 | | |
| squeezenet | 💚 | 💚 | | |
| ssd_mobilenet_v1 | 💚 | 💚 | | |
| ssd_mobilenet_v1_uint8 | 💚 | 💚 | | |
| ssd_mobilenet_v2_fpnlite | 💚 | 💚 | | |
| ssd_mobilenet_v2_fpnlite_uint8 | 💚 | 💚 | | |
| ssd_mobilenet_v2_int8 | 💚 | 💚 | | |
| ssd_mobilenet_v2 | 💚 | 💚 | | |
| ssd_spaghettinet_large | 💚 | 💚 | | |
| ssd_spaghettinet_large_uint8 | 💚 | 💚 | | |
| visual_wake_words_i8 | 💚 | 💚 | | |

TF Models

Tensorflow Models (Inference)

| Hugging Face Models | tf-mhlo lowerable | SHARK-CPU | SHARK-CUDA | SHARK-METAL |
|---|---|---|---|---|
| BERT | 💚 | 💚 | 💚 | 💚 |
| albert-base-v2 | 💚 | 💚 | 💚 | 💚 |
| DistilBERT | 💚 | 💚 | 💚 | 💚 |
| CamemBert | 💚 | 💚 | 💚 | 💚 |
| ConvBert | 💚 | 💚 | 💚 | 💚 |
| Deberta | | | | |
| electra | 💚 | 💚 | 💚 | 💚 |
| funnel | | | | |
| layoutlm | 💚 | 💚 | 💚 | 💚 |
| longformer | | | | |
| mobile-bert | 💚 | 💚 | 💚 | 💚 |
| remembert | | | | |
| tapas | | | | |
| flaubert | 💚 | 💚 | 💚 | 💚 |
| roberta | 💚 | 💚 | 💚 | 💚 |
| xlm-roberta | 💚 | 💚 | 💚 | 💚 |
| mpnet | 💚 | 💚 | 💚 | 💚 |

License

nod.ai SHARK is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.
