chenyu | f88506e630 | 2024-01-04 17:01:50 -05:00
move gpt2/llama sampling inside the model call (#3013)
* move gpt2/llama sampling inside the model call
* argmax uses one more kernel

George Hotz | 64dded27f0 | 2023-12-20 17:03:41 -08:00
pad ops broke coder (#2881)
* pad ops broke coder
* that contiguous fixes it
* Update lazy.py

George Hotz | 0fd44259cd | 2023-12-10 16:31:52 -08:00
bf16 fix + cleanups from mixtral (#2698)
* bf16 fix + cleanups from mixtral
* generic bf16 cast

George Hotz | 9d7ead84e1 | 2023-12-05 16:27:36 -08:00
hotfix: no need for model cache in examples/coder.py

George Hotz | 7170a9a057 | 2023-11-25 12:27:54 -08:00
coder.py can write and run code (#2439)
* wip mistral
* coder
* touchups
* cleanups
* mistral cleanups
* clean up cache create
* download the weights, fix tests
* fix llama loading
* global fixup
* clean up all
* move llama model
* cleanups
* Revert "cleanups"
  This reverts commit a71c5d59eb.
* fine, leave it