dan
489a858af1
enforce fp32 accumulates for cpu
2023-10-29 18:59:00 +00:00
Vivek Khandelwal
b83d32fafe
Fix Falcon GPTQ Pipeline
20231011.986
2023-10-11 20:09:32 +05:30
Vivek Khandelwal
0a618e1863
Add support for Falcon GPTQ
2023-10-11 10:47:48 +05:30
Phaneesh Barwaria
a731eb6ed4
Macos fixes ( #1883 )
...
* fix venv setup for MacOS
* allow stream fuse binding on mac
* clean iree metal args
20231010.985
2023-10-09 23:36:12 -07:00
Ean Garvey
2004d16945
Revert "[SDXL] Add SDXL pipeline to SHARK ( #1731 )" ( #1882 )
...
This reverts commit 9f0a421764 .
20231009.984
2023-10-09 18:01:44 -07:00
Gaurav Shukla
6e409bfb77
fix else if syntax error
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com >
2023-10-10 06:23:56 +05:30
Gaurav Shukla
77727d149c
[warning] Fix dropdown warning
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com >
2023-10-10 05:18:43 +05:30
Ean Garvey
66f6e79d68
Split CPU/GPU definitions conditionally outside of torch contexts. ( #1879 )
2023-10-09 16:46:41 -07:00
Ean Garvey
3b825579a7
(LLaMa-2) Point to int4 + f32 acc .mlir for cpu ( #1878 )
...
- fixes some issues with non-system prompt invocation
Co-authored-by: Gaurav Shukla <gauravshukla789@gmail.com >
2023-10-09 14:37:35 -05:00
Abhishek Varma
9f0a421764
[SDXL] Add SDXL pipeline to SHARK ( #1731 )
...
-- This commit adds SDXL pipeline to SHARK.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com >
2023-10-09 13:01:37 -05:00
Gaurav Shukla
c28682110c
[chatbot] Flag to add system prompt
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com >
2023-10-09 22:17:39 +05:30
Ean Garvey
caf6cc5d8f
Switch most compile flows to use ireec.compile_file. ( #1863 )
...
* Switch most compile flows to use ireec.compile_file.
* re-add input type to compile_str path.
* Check if mlir_module exists before checking if it's a path or pyobject.
* Fix some save_dir cases
20231009.983
20231006.980
2023-10-06 23:04:43 -05:00
Ean Garvey
8614a18474
Remove tf dependencies from importer path. ( #1874 )
...
* Remove tf dependencies from import path.
* Fix formatting.
20231006.979
2023-10-06 12:27:12 -07:00
Jakub Kuderski
86c1c0c215
Add aggregate statistics to microbenchmark ( #1871 )
...
Print averaged results at the end of all iterations. Increase the
default number of iterations to 5.
Example:
```
Number of iterations: 5
Prefill: avg. 0.03 s, stddev 0.00
Decode: avg. 43.34 tokens/s, stdev 0.13
```
Also remove the -2 in the number of generated tokens -- I did not find
any evidence we need it.
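A minimal sketch of how such aggregate statistics could be computed, using only the Python standard library. The function name `summarize` and its arguments are hypothetical and not taken from the SHARK code base; the sample standard deviation is an assumption, since the commit message does not say which estimator is used.

```python
import statistics


def summarize(prefill_times_s, decode_rates_tok_s):
    """Format averaged results over all iterations (hypothetical helper)."""

    def spread(values):
        # statistics.stdev needs at least two samples; report 0.0 otherwise.
        return statistics.stdev(values) if len(values) > 1 else 0.0

    lines = [
        f"Number of iterations: {len(prefill_times_s)}",
        f"Prefill: avg. {statistics.mean(prefill_times_s):.2f} s, "
        f"stddev {spread(prefill_times_s):.2f}",
        f"Decode: avg. {statistics.mean(decode_rates_tok_s):.2f} tokens/s, "
        f"stddev {spread(decode_rates_tok_s):.2f}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    print(summarize([0.03] * 5, [43.2, 43.3, 43.4, 43.5, 43.3]))
```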
2023-10-06 10:03:07 -07:00
Daniel Garvey
8bb364bcb8
enforce fp32 accumulates for cpu ( #1873 )
2023-10-06 11:34:49 -05:00
Daniel Garvey
7abddd01ec
argmax inside model + brevitas pin ( #1872 )
20231005.978
2023-10-05 20:15:21 -07:00
Abhishek Varma
2a451fa0c7
[Llama2] Add a standalone utility for dynamic and combining IRs
...
-- This commit adds a standalone utility script for converting Llama IRs
to dynamic and combining them.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com >
2023-10-05 20:01:06 +05:30
Jakub Kuderski
9c4610b9da
Add microbenchmark mode to vicuna CLI ( #1864 )
...
Add flags to enable a non-interactive mode for microbenchmarking llama
models. In this mode, the system and user prompts are specified with CLI
flags, and the number of generated tokens and iterations is fixed.
Also move the stats below the response and trim whitespace from the response.
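The non-interactive mode described above can be sketched roughly as follows. Every flag name here (`--microbenchmark`, `--system_prompt`, `--user_prompt`, `--num_tokens`, `--num_iterations`) and the `fake_generate` stand-in are assumptions made for illustration; the actual vicuna CLI flags are not listed in this log.

```python
import argparse
import time


def build_parser():
    # All flag names below are hypothetical placeholders.
    p = argparse.ArgumentParser(description="Microbenchmark mode sketch")
    p.add_argument("--microbenchmark", action="store_true",
                   help="run without prompting the user")
    p.add_argument("--system_prompt", default="")
    p.add_argument("--user_prompt", default="Hello")
    p.add_argument("--num_tokens", type=int, default=8)
    p.add_argument("--num_iterations", type=int, default=3)
    return p


def fake_generate(system_prompt, user_prompt, num_tokens):
    """Stand-in for a model: returns exactly num_tokens placeholder tokens."""
    return ["tok"] * num_tokens


def run(args):
    """Run a fixed number of iterations with fixed prompts and token count."""
    results = []
    for _ in range(args.num_iterations):
        start = time.perf_counter()
        tokens = fake_generate(args.system_prompt, args.user_prompt,
                               args.num_tokens)
        elapsed = time.perf_counter() - start
        # Trim surrounding whitespace before showing the response.
        response = " ".join(tokens).strip()
        results.append((response, len(tokens), elapsed))
    return results


if __name__ == "__main__":
    for response, n_tokens, elapsed in run(build_parser().parse_args()):
        print(response)
        # Stats are printed below the response.
        print(f"{n_tokens} tokens in {elapsed:.6f} s")
```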
20231004.977
2023-10-05 00:12:08 -04:00
powderluv
a38cc9d216
Update vulkan_utils.py for Radeon 780m igpu ( #1866 )
2023-10-04 20:33:07 -07:00
Jakub Kuderski
1c382449ec
[vulkan] Print note about module load times. NFC. ( #1862 )
...
Print a note ahead of a potentially long inactivity to set the right expectations.
Separately, we should add progress to the UI and make this loading faster.
20231004.976
20231003.975
2023-10-03 17:27:27 -04:00
Gaurav Shukla
7cc9b3f8e8
[llama cli] Fix llama cli
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com >
20231003.974
2023-10-03 20:39:53 +05:30
Gaurav Shukla
e54517e967
[UI] Disable config generator, lora train and model manager ( #1858 )
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com >
2023-10-02 22:34:40 -07:00
Ean Garvey
326327a799
Collect pipeline submodules for diffusers ckpt preprocessing. ( #1859 )
20231002.973
20231002.972
2023-10-03 00:29:28 -04:00
Ean Garvey
785b65c7b0
Add flag for specifying device-local caching allocator heap key. ( #1856 )
2023-10-03 00:28:39 -04:00
Sungsoon Cho
0d16c81687
Remove unused import. ( #1857 )
2023-10-02 11:36:08 -05:00
Vivek Khandelwal
8dd7850c69
Add Falcon-GPTQ support
2023-10-02 16:39:57 +05:30
Gaurav Shukla
e930ba85b4
[os] Remove os dependency from vmfb naming ( #1854 )
...
Also fixes a small ui issue for chatbot.
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com >
20231001.971
20230930.970
20230930.969
20230929.968
2023-09-29 12:38:17 -05:00
Gaurav Shukla
cd732e7a38
[chatbot] split execution time to prefill and decode
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com >
2023-09-29 13:18:03 +05:30
Gaurav Shukla
8e0f8b3227
[ui] Update chatbot UI
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com >
2023-09-29 13:18:03 +05:30
Gaurav Shukla
b8210ef796
[chatbot] Re-instantiate the chatbot object if device id changes
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com >
2023-09-29 13:18:03 +05:30
PhaneeshB
94594542a9
remove use of vulkaninfo
20230928.967
2023-09-28 21:57:00 +05:30
Gaurav Shukla
82f833e87d
[vulkan] Update vmfb naming
...
Update vmfb naming for vulkan devices in order to resolve naming
conflicts in the presence of multiple vulkan devices.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com >
2023-09-28 14:52:11 +05:30
Vivek Khandelwal
c9d6870105
Modify falcon pipeline for 180b support
2023-09-28 12:39:35 +05:30
Jakub Kuderski
4fec03a6cc
[vulkan] Switch from coop matrix NV to KHR ( #1848 )
20230927.965
2023-09-27 21:43:37 -04:00
harsh-nod
9a27f51378
Deprecate inference directory
...
This patch removes the inference directory that was no longer being used.
2023-09-27 14:29:00 -07:00
Abhishek Varma
ad1a0f35ff
Fix misdirection while saving vmfb
...
-- Currently SHARK reports that the vmfb has been saved even when
no vmfb was generated.
This is misleading, particularly for larger IRs/vmfbs.
-- This commit therefore fixes the misleading message.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com >
2023-09-27 16:25:29 +05:30
Nelson Sharpe
6773278ec2
Fix checkpoint_path unexpected argument ( #1832 )
20230926.964
20230925.963
20230924.962
2023-09-24 14:17:52 -07:00
Abhishek Varma
9a0efffcca
[Llama2] Fix wrong Vulkan device ID + Add Vulkan compile flags
...
-- This commit fixes the wrong Vulkan device being selected during
runtime.
-- It also adds a couple of IREE compilation flags to target a specific
Vulkan device.
-- It also changes the Vulkan device listing to be more in tune with
lowering control flow.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com >
20230923.961
20230922.960
2023-09-22 22:24:18 +05:30
gpetters94
61c6f153d9
Switch to keras-nightly to fix a Linux issue ( #1835 )
20230921.959
2023-09-21 12:33:45 -04:00
Phaneesh Barwaria
effd42e8f5
pin gradio to v3.44.3
2023-09-21 17:33:43 +05:30
Sungsoon Cho
b5fbb1a8a0
Rename the func arg save_json to avoid name collision. ( #1837 )
...
* Rename the func arg save_json to avoid name collision.
* black formatted.
20230920.958
20230919.957
2023-09-19 17:29:27 -05:00
Quinn Dawkins
ded74d09cd
[vicuna.py] Keep past key values on device ( #1836 )
...
The past key values are only used within the models themselves and can
be kept on device. For vulkan int4, this gives 44 tok/s (for the first
prompt) and settles at around 26 tok/s on 7900xtx.
2023-09-19 18:17:41 -04:00
Boian Petkantchin
79267931c1
Add argument --additional_compile_args ( #1119 )
...
This allows passing more arguments to the IREE compiler.
Example:
python my-app.py --additional_compile_args="--mlir-pretty-debuginfo --mlir-timing"
Co-authored-by: Boian Petkantchin <boian@nod-labs.com >
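One way a flag like `--additional_compile_args` could be forwarded to a compiler invocation, as a sketch only. The helper names and the `--example-base-flag` placeholder are hypothetical; splitting the string with `shlex.split` is an assumption about the approach, not a description of the SHARK implementation.

```python
import argparse
import shlex


def parse_additional_compile_args(argv):
    """Return the extra compiler flags as a list of separate arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--additional_compile_args",
        default="",
        help="extra flags forwarded verbatim to the compiler",
    )
    args = parser.parse_args(argv)
    # shlex.split honours shell-style quoting inside the string.
    return shlex.split(args.additional_compile_args)


def build_compile_flags(base_flags, argv):
    """Append user-supplied flags after the default ones."""
    return list(base_flags) + parse_additional_compile_args(argv)


if __name__ == "__main__":
    flags = build_compile_flags(
        ["--example-base-flag"],  # placeholder, not a real compiler flag
        ["--additional_compile_args=--mlir-pretty-debuginfo --mlir-timing"],
    )
    print(flags)
```

Using the `=` form keeps the whole string attached to the option, so values that themselves start with `--` are not mistaken for separate options.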
2023-09-19 11:26:03 -05:00
zjgarvey
9eceba69b7
local_tank_cache included into clear_all ( #1833 )
20230918.956
2023-09-18 00:27:23 -05:00
Ean Garvey
ca609afb6a
Update README.md ( #1830 )
20230917.955
20230916.954
20230915.953
20230914.951
20230914.950
2023-09-14 10:33:57 -05:00
Gaurav Shukla
11bdce9790
[flags] Fix vulkan runtime flags as vma is dropped from iree ( #1831 )
2023-09-14 08:58:59 -05:00
Ean Garvey
684943a4a6
(SD) Fix tokenizers imports in pyinstaller builds. ( #1828 )
...
* Fix tokenizers metadata.
* (SD) Disable VAE lowering configs (rdna3) and add versioned tunings.
* Update sd_annotation.py
* (SD) Add cv2 to spec.
* Update stencil pipeline with the new img2img arg.
20230913.949
20230912.948
2023-09-12 12:23:48 -05:00
PhaneeshB
b817bb8455
add roles for llama2
2023-09-12 10:59:28 +05:30
Ean Garvey
780f520f02
Fix vk.target_env extensions and remove redundant SD imports. ( #1826 )
...
* Remove redundant IREE runtime imports.
* Fix vulkan target env extensions.
20230911.946
2023-09-11 13:42:52 -05:00
Dom
c61b6f8d65
Code refactoring ( #1817 )
...
* use join
* fix bug
* further code optimizations
---------
Co-authored-by: Daniel Garvey <34486624+dan-garvey@users.noreply.github.com >
2023-09-11 11:30:56 -05:00