SHARK-Studio

github/SHARK-Studio

mirror of https://github.com/nod-ai/SHARK-Studio.git synced 2026-01-15 00:37:59 -05:00

Commit Graph

Author	SHA1	Message	Date
Ean Garvey	05b498267e	Add StreamingLLM support to studio2 chat (#2060) * Streaming LLM * Update precision and add gpu support * (studio2) Separate weights generation for quantization support * Adapt prompt changes to studio flow * Remove outdated flag from llm compile flags. * (studio2) use turbine vmfbRunner * tweaks to prompts * Update CPU path and llm api test. * Change device in test to cpu. * Fixes to runner, device names, vmfb mgmt * Use small test without external weights.	2024-01-18 19:01:07 -06:00

1 Commit