PhaneeshB
ee0009d4b8
pythonize uname for cpu target triple in windows
2023-01-12 22:39:49 +05:30
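The "pythonize uname" commit above replaces a shell-out to `uname` with Python's cross-platform `platform` module, since `uname` is unavailable on Windows. A minimal sketch of that idea (the arch aliases and triple suffixes here are illustrative, not SHARK's exact mapping):

```python
# Illustrative sketch: derive the CPU architecture with Python's platform
# module instead of shelling out to `uname -m`, which Windows lacks.
import platform

def get_cpu_arch():
    # platform.machine() works on Windows, Linux, and macOS alike.
    machine = platform.machine().lower()
    # Normalize common OS-specific spellings to LLVM-style arch names.
    # (This mapping is a guess for illustration, not SHARK's actual table.)
    aliases = {"amd64": "x86_64", "arm64": "aarch64"}
    return aliases.get(machine, machine)

def get_cpu_target_triple():
    # e.g. "x86_64-linux-gnu" on Linux, "x86_64-pc-windows-msvc" on Windows.
    suffixes = {
        "windows": "pc-windows-msvc",
        "linux": "linux-gnu",
        "darwin": "apple-darwin",
    }
    suffix = suffixes.get(platform.system().lower(), "unknown")
    return f"{get_cpu_arch()}-{suffix}"
```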
PhaneeshB
9d851c3346
small fixes
2023-01-12 22:32:24 +05:30
George Petterson
6ad9b213b9
Add GCN4
...
(cherry picked from commit 3be072b3c09c9b38bc2d79ad6e6900eefee49a1c)
2023-01-09 21:09:50 +05:30
PhaneeshB
e4375e8195
Add support for vulkan target env
2023-01-09 21:09:50 +05:30
jinchen62
017dcab685
Add target triple support for TITAN RTX (#756)

2023-01-04 15:39:00 -08:00
Abhishek Varma
e60b4568c6
[SharkInference] Make SharkInference compile the entire module (#708)
...
* [SharkInference] Make SharkInference compile the entire module
-- Previously SharkInference compiled and provided run APIs only
for a hardcoded function named "forward".
-- This commit makes the compile functionality generic, so any
function defined within the module can be run.
-- It also adds an API to fetch all the function names defined
within the compiled module.
-- This commit updates both web and command-line execution of Stable
Diffusion to use the new SharkInference API.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-01-03 23:25:23 +05:30
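The pattern described in the commit body above can be sketched with a pure-Python stand-in (this is not the real SharkInference API; the class and method names are hypothetical): compile the whole module once, enumerate its functions, then invoke any of them by name instead of a hardcoded "forward".

```python
# Hypothetical stand-in for the "compile the entire module" pattern:
# hold every compiled function, expose their names, run any by name.
class CompiledModule:
    def __init__(self, functions):
        self._functions = dict(functions)  # name -> callable

    def get_functions_in_module(self):
        # The kind of API the commit describes: list every function
        # defined within the compiled module.
        return list(self._functions)

    def run(self, name, *args):
        # Run an arbitrary function by name, not just "forward".
        return self._functions[name](*args)

module = CompiledModule({
    "forward": lambda x: x * 2,
    "encode": lambda x: x + 1,
})
print(module.get_functions_in_module())  # ['forward', 'encode']
print(module.run("encode", 41))          # 42
```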
powderluv
cc6fbdb0c3
Add sm_89 and point to nvcuda.dll (#731)
2022-12-26 10:54:38 -08:00
Stanley Winata
5cf4976054
[Vulkan][utils] Add GTX Pascal support. (#709)
2022-12-22 15:24:15 -08:00
PhaneeshB
2befe771b3
Add support for automatic target triple selection for SD
2022-12-21 22:38:06 +05:30
Stella Laurenzo
10630ab597
Add config stanza for NVIDIA RTX 2080. (#658)
...
Just happened to have this card on my Windows machine and verified that the SD demo works on it.
```
Average step time: 144.26142692565918ms/it
Clip Inference Avg time (ms) = (205.001 + 44.000) / 2 = 124.501
VAE Inference time (ms): 281.001
Total image generation time: 7.856997728347778sec
```
I'd love to add an API upstream to derive compiler tuning flags from a host device.
2022-12-18 16:40:47 -08:00
Quinn Dawkins
2bc6de650d
[SD] Add support for a compiled version of the discrete Euler scheduler (#657)
...
* Add Shark version of euler scheduler
* Add Shark version of euler scheduler to web ui
2022-12-17 19:25:43 -08:00
Phaneesh Barwaria
831f206cd0
Revert "Add target triple selection for multiple cards" (#655)
...
This reverts commit acb905f0cc.
2022-12-16 15:01:45 -08:00
PhaneeshB
acb905f0cc
Add target triple selection for multiple cards
2022-12-17 02:24:37 +05:30
nirvedhmeshram
2928179331
Add more NVIDIA targets (#640)
2022-12-15 11:24:38 -06:00
Boian Petkantchin
bc17c29b2e
In get_iree_runtime_config get the specific device instead of the default
2022-12-13 13:21:51 -08:00
Boian Petkantchin
aaf60bdee6
Simplify iree_device_map
2022-12-13 13:21:51 -08:00
Stanley Winata
57c94f8f80
[vulkan] Add "radeon" check to the default AMD triple (#604)
2022-12-10 09:05:48 -08:00
Ean Garvey
0225292a44
Remove print statements from compile utils (#593)
2022-12-08 13:40:47 -08:00
Ean Garvey
1699db79b5
Disable SHARK-Runtime flags if USE_IREE=1 specified during setup. (#588)
...
* Disable SHARK-Runtime flags if USE_IREE=1 specified during setup.
* Update setup_venv.sh
* Autodetect cpu count for runtime flags.
2022-12-08 02:31:31 -06:00
Ean Garvey
40eea21863
Enable conv nchw-to-nhwc flag by default for most models + minor fixes (#584)
2022-12-07 16:24:02 -08:00
Stanley Winata
6049f86bc4
[Vulkan][Utils] Automatic platform/OS detection (#569)
...
To enable AMD GPUs on macOS, we need this detection to let the compiler know that MoltenVK is required to use the GPU.
2022-12-07 12:05:00 +07:00
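The platform/OS detection in the commit above can be sketched with Python's stdlib (the function and suffix names here are illustrative guesses, not SHARK's actual helpers): detect the host OS so the compiler knows that on macOS the GPU is reached through MoltenVK.

```python
# Illustrative sketch of automatic platform/OS detection for the Vulkan
# target triple. On macOS, Vulkan runs on top of MoltenVK, so the host OS
# must be reported to the compiler.
import platform

def vulkan_triple_os_suffix():
    system = platform.system()
    if system == "Darwin":
        return "macos"    # Vulkan via MoltenVK on Apple platforms
    if system == "Windows":
        return "windows"
    return "linux"        # default to Linux for everything else

def make_vulkan_target_triple(arch="rdna2", vendor="unknown"):
    # e.g. "rdna2-unknown-macos" on an AMD GPU under macOS.
    return f"{arch}-{vendor}-{vulkan_triple_os_suffix()}"
```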
Stanley Winata
c4444ff695
[vulkan][utils] Add rdna3 detection (#565)
2022-12-05 23:56:06 -08:00
Quinn Dawkins
e19a97f316
Don't do a numpy copy on the results from compiled vm (#543)
2022-12-05 14:21:47 -05:00
aldesilv
9a8638a6d0
dump all isas with amdllpc (#517)
...
SHARK/shark/examples/shark_inference/stable_diffusion$ python main.py --precision="fp16" --device="vulkan" --iree-vulkan-target-triple=rdna3-unknown-linux --no-load_vmfb --dispatch_benchmarks="all" --dispatch_benchmarks_dir="SD_dispatches" --dump_isa
Co-authored-by: alexander <alexander@nod-labs.com>
2022-11-30 11:33:30 -08:00
Gaurav Shukla
a5445866b8
[WEB] Update the iree flag
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-11-30 18:56:48 +05:30
Phaneesh Barwaria
e67bcffea7
add vulkan-heap-block-size flag (#498)
2022-11-22 13:30:25 +05:30
Phaneesh Barwaria
d9f4a9954a
modify to get correct target triple (#485)
2022-11-13 20:13:44 -08:00
Mehdi Amini
a526f7d5b8
Fix dispatch saving code after 749a2c2d (#483)
...
In 749a2c2d, iree_device_map and iree_target_map were turned into
functions, but not all of their uses were updated.
2022-11-14 05:39:01 +05:30
Phaneesh Barwaria
749a2c2dec
add support for choosing vulkan device (#439)
2022-11-12 14:00:41 -08:00
yzhang93
9a86e5c476
Fix dispatch benchmarking tool (#460)
2022-11-08 09:37:12 -08:00
Eliasj42
32d3f4bd5f
added ordered benchmarks to dispatch benchmarking tool (#450)
...
* added ordered benchmarks to dispatch benchmarking tool
* saved changes
* updated readme
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2022-11-07 09:36:21 -08:00
Gaurav Shukla
25931d48a3
[WEB] Update stable diffusion UI and enable live preview (#447)
...
This commit enables live preview feature and also updates stable
diffusion web UI.
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-10-31 04:10:15 -07:00
powderluv
fd89b06641
Drop RDNA1 for now
2022-10-29 14:29:09 -07:00
Gaurav Shukla
f8dc996004
Update vulkan-target-triple for Radeon devices. (#446)
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-10-29 14:27:20 -07:00
Phaneesh Barwaria
e6a964088b
Add os agnostic vulkan device name check (#445)
2022-10-29 13:19:14 -07:00
Eliasj42
7f37599a60
Added a dispatch benchmarking tool (#441)
...
To produce benchmarks of individual dispatches, you can add --dispatch_benchmarks=All --dispatch_benchmarks_dir=<output_dir> to your command line arguments.
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2022-10-28 14:31:03 -07:00
Ean Garvey
fd7baae548
Serialize torch-mlir CAPI module as bytecode instead of string. (#435)
...
* Serialize torch-mlir CAPI as bytecode instead of string.
* Minor fixes to MLIR data handling in SHARK python.
2022-10-27 14:37:15 -05:00
PhaneeshB
fd578a48a9
add cli args for vulkan target triple
2022-10-25 21:47:26 +05:30
Quinn Dawkins
1d33913d48
Add option to save and load precompiled flatbuffer (#425)
2022-10-23 16:24:09 -07:00
Quinn Dawkins
7be1d7d0be
Add option for extra arguments through SharkInference.compile (#408)
2022-10-19 15:32:48 -05:00
gpetters94
53df0620e3
Add OPT to tank (#214)
2022-10-11 11:03:56 -05:00
Ean Garvey
d82b305781
Fix issues with loading .vmfb into SharkInference
2022-09-23 09:53:13 +05:30
Ean Garvey
814a6f8295
Modify vulkan target triple substring searches. (#318)
2022-09-20 01:20:20 -05:00
erman-gurses
fc8aa6ae63
Add ROCM parameters (#335)
2022-09-16 09:12:19 -07:00
Vivek Khandelwal
c43448a826
Update compile_utils.py
2022-09-15 18:28:10 +05:30
Ean Garvey
6cf5564c84
Remove "gpu" device alias and migrate to using "cuda" for NVIDIA GPU. (#325)
...
* Replace instances of "gpu" alias for devices with "cuda"
2022-09-13 01:16:56 -05:00
Stanley Winata
55bcb2eb3c
Level Zero Backend (#280)
2022-08-17 19:19:27 -07:00
Ean Garvey
22ff92c48b
Add config.VmModule argument to from_flatbuffer call. (#266)
2022-08-14 15:11:19 -07:00
powderluv
db6e2207ed
Update _common.py
2022-08-13 13:49:01 -07:00
Daniel Garvey
7975087ee2
change backend name (#265)
2022-08-13 12:01:12 -07:00