Abhishek Varma
c854208d49
[Llama2] Prefetch llama2 tokenizer configs (#1824)
-- This commit prefetches llama2 tokenizer configs from shark_tank.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
20230910.942
20230909.941
20230908.940
2023-09-08 11:29:54 -07:00
Gaurav Shukla
c5dcfc1f13
[vicuna] Exit when mlir is not present in shark tank (#1825)
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-08 10:30:29 -07:00
Abhishek Varma
bde63ee8ae
Add logging feature in WebUI (#1821)
2023-09-08 05:48:05 -07:00
Vivek Khandelwal
9681d494eb
Update decomp list and shark trainer for DLRM
20230907.939
20230907.938
2023-09-06 21:24:50 +05:30
Gaurav Shukla
ede6bf83e2
[vicuna] Disabling the IR generation path
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-06 20:13:17 +05:30
Ean Garvey
2c2693fb7d
Fix torchvision versioning in Linux importer setup. (#1809)
20230905.935
20230905.934
2023-09-05 12:57:03 -05:00
Vivek Khandelwal
1d31b2b2c6
Fix StableHLO Compilation flag
2023-09-05 21:32:33 +05:30
Gaurav Shukla
d2f64eefa3
[chatbot] Remove few outdated models from list (#1814)
20230904.932
2023-09-04 09:26:32 -07:00
Abhishek Varma
87ae14b6ff
[SD] Add sdpfa decomposition + update IREE flag
-- This commit adds Scaled Dot Product Flash Attention's decomposition
in shark_importer.
-- It also updates `iree-flow-enable-data-tiling` to `iree-opt-data-tiling`.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
20230904.931
2023-09-04 18:03:53 +05:30
Phaneesh Barwaria
1ccafa1fc1
fix llama2-70b rewrite tensor dim
20230903.930
20230903.929
20230902.928
20230901.927
2023-09-01 17:27:06 +05:30
jinchen62
4c3d8a0a7f
Enable downloading vmfb/mlir for webui (#1807)
20230831.925
20230831.924
2023-08-31 11:05:47 -07:00
jinchen62
3601dc7c3b
Fix llama2 13b combined ir (#1803)
20230830.923
20230829.922
20230829.921
20230828.920
2023-08-28 11:34:44 -07:00
Daniel Garvey
671881cf87
Llama2 70b (#1783)
* llama2 70b IR gen
* fix IR sec llama2 + debug
* llama270b
---------
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>
20230827.919
20230826.918
20230826.917
2023-08-25 23:04:28 -07:00
Gaurav Shukla
4e9be6be59
[chatbot] Add debug as class attribute (#1799)
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
20230825.916
20230825.915
2023-08-25 21:46:29 -07:00
Ean Garvey
9c8cbaf498
Add support for ROCM (Windows) in Studio + compile utils (#1770)
* WIP: MSVC ROCM support for SHARK Studio
* Make get_iree_rocm_args platform-agnostic.
* Update stable_args.py
* Update rocm arg handling in SD utils
* Guard quantization imports.
Co-authored-by: jam https://github.com/jammm
20230825.914
2023-08-25 20:56:05 -07:00
Ean Garvey
9e348a114e
Revert changes process_skipfiles.py (#1798)
Keeps a small typo fix but reverts the rest of changes to this file from 450c231171
2023-08-25 15:31:49 -07:00
jinchen62
51f90a4d56
Update conversion passes for brevitas quant op (#1795)
2023-08-25 17:28:07 -05:00
Abhishek Varma
310d5d0a49
Fix llama2 13b crashing + add spec file for CLI execution of Llama (#1797)
* [Llama2] Add a fix for Llama2 13B downloading/crashing
-- This commit fixes downloading/crashing of llama2 13B on wrong
.mlir file.
-- Also adds support for downloading vmfb from shark_tank in CLI.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
* [llama2] Add a spec file to run Llama/Vicuna CLI exe
-- This commit adds a spec file to run Llama/Vicuna CLI exe.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
---------
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
20230825.913
2023-08-25 09:36:09 -05:00
Ean Garvey
9697981004
Pipe through a debug option to iree compile utils. (#1796)
* Update compile_utils.py
* Pipe through a flag to toggle debug options in compile utils.
* Update SharkLLMBase.py
2023-08-25 07:11:11 -07:00
Ean Garvey
450c231171
Add tokenizers to requirements.txt (#1790)
* Add tokenizers to requirements and pin version
* Update process_skipfiles.py
20230824.911
2023-08-24 19:44:04 -05:00
Ean Garvey
07f6f4a2f7
Add a short README for the OPT examples and small tweaks. (#1793)
* Small changes to OPT example.
* Update opt README.
* Add a few modes to batch script.
* Update README.md
2023-08-24 17:26:11 -07:00
jinchen62
610813c72f
Add iree flag to strip assertions (#1791)
2023-08-24 10:51:19 -07:00
Ean Garvey
8e3860c9e6
Remove flags that are default in upstream IREE (#1785)
* Remove index bits flags now set by default
* Update shark_studio_imports.py
2023-08-24 11:57:54 -05:00
xzuyn
e37d6720eb
Add Hires Fix (#1787)
* improper test hiresfix
* add sliders & use `clear_cache`
* add resample choices & fix step adjustment
* add step adjustment to img2img
* add resample options to img2img
* simplify hiresfix
- import `img2img_inf` from `img2img_ui.py` instead of just copying it into `txt2img_ui.py`
* set `hri` to None after using
* add more resample types, and don't show output until hiresfix is done
* cleaner implementation
* ran black
* ran black again with jupyter dependencies
20230824.909
2023-08-24 09:01:41 -07:00
Vivek Khandelwal
16160d9a7d
Fix combine mlir script
2023-08-24 19:10:49 +05:30
Sungsoon Cho
79075a1a07
Opt perf (#1786)
* Define command line args, model-name, max-seq-len, platform, etc.
* Add usage example.
* Add opt_perf_comparision_batch.py.
* Use shlex instead.
2023-08-24 08:33:12 -05:00
Abhishek Varma
db990826d3
Add Llama2 13B int4 fp16 support (#1784)
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
20230823.908
2023-08-23 10:00:32 -07:00
gpetters94
7ee3e4ba5d
Add stencil_unet_512 support (#1778)
This should fix any remaining issues with stencils and long prompts.
20230822.907
2023-08-22 12:23:46 -04:00
Vivek Khandelwal
05889a8fe1
Add LLaMa2-int4-fp16 support (#1782)
20230822.906
2023-08-22 07:45:50 -07:00
jinchen62
b87efe7686
Fix venv setup for brevitas (#1779)
20230821.905
2023-08-21 11:58:51 -07:00
gpetters94
82b462de3a
Fix stencils for long prompts (#1777)
20230820.903
20230820.904
20230819.902
20230819.901
2023-08-19 00:26:51 -07:00
Daniel Garvey
d8f0f7bade
replace public with private (#1776)
unload footguns
20230818.899
2023-08-18 14:22:46 -07:00
gpetters94
79bd0b84a1
Fix an issue with diffusers>0.19.3 (#1775)
2023-08-18 14:06:06 -04:00
jinchen62
8738571d1e
Adapt the change of brevitas custom op name (#1772)
20230818.898
20230817.897
20230817.896
2023-08-17 14:24:43 -07:00
Gaurav Shukla
a4c354ce54
[version] Pin diffusers==0.19.3
Once the latest works with LORA train, unpin it.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
cc53efa89f
[cli] Fix chatbot cli
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
9ae8bc921e
[chatbot] Fix chatbot cli and webview warning
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
32eb78f0f9
[chatbot] Fix switching parameters in chatbot
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 19:14:17 +05:30
Ean Garvey
cb509343d9
Fix pytest benchmarks and shark_tank generation. (#1632)
- fix setup_venv.sh for benchmarks/imports etc.
- fix torch benchmarks in SharkBenchmarkRunner
- generate SD artifacts using build_tools/stable_diffusion_testing.py and --import_mlir
- decouple SD gen from tank/generate_sharktank for now
20230816.895
2023-08-16 17:48:47 -05:00
powderluv
6da391c9b1
update signtool to use /fd certHash
20230816.894
20230815.893
20230815.892
2023-08-15 15:11:40 -07:00
Ean Garvey
9dee7ae652
fix tkinter window (#1766)
2023-08-15 13:23:09 -07:00
Ean Garvey
343dfd901c
Update SHARK-Runtime links to SRT (#1765)
* Update nightly.yml
* Update setup_venv.ps1
* Update CMakeLists.txt
* Update shark_iree_profiling.md
* Update setup_venv.sh
* Update README.md
* Update .gitmodules
* Update CMakeLists.txt
* Update README.md
* fix signtool flags
* Update nightly.yml
* Update benchmark_utils.py
* uncomment tkinter launch
2023-08-15 12:40:44 -07:00
Ean Garvey
57260b9c37
(Studio) Add hf-hub to pyinstaller metadata (#1761)
20230814.887
2023-08-14 23:01:50 -05:00
Ean Garvey
18e7d2d061
Enable vae tunings for rdna3. (#1764)
2023-08-14 21:00:14 -07:00
Stanley Winata
51a1009796
Add Forward method to SHARKRunner and fix examples. (#1756)
2023-08-14 19:20:37 -07:00
Daniel Garvey
045c3c3852
enable iree-opt-const-expr-hoisting in vicuna (#1742)
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-08-14 18:43:42 -07:00
Ean Garvey
0139dd58d9
Specify max allocation size in IREE compile args. (#1760)
2023-08-14 15:43:09 -05:00
Ean Garvey
c96571855a
prevents recompiles for cuda benchmarks + update benchmark_module path (#1759)
* xfail resnet50_fp16
* Fix cuda benchmarks and prevent recompilation.
2023-08-14 15:30:32 -05:00
PhaneeshB
4f61d69d86
add support passing iree flags for LLMs
2023-08-15 00:22:56 +05:30
Phaneesh Barwaria
531d447768
set default allocator for metal device creation (#1755)
2023-08-14 06:17:52 -07:00