chenyu | c0f76ed4ea | transformer kvcache and mask have same dtype as input (#2771) | 2023-12-14 22:41:51 -05:00
* transformer kvcache and mask have same dtype as input
* don't use `=0` in cstyle ternary where
* (bool)
* where float16 test

George Hotz | b3982187d1 | Mixtral Example (#2691) | 2023-12-10 17:18:31 -08:00
* mixtral
* simpler
* global counters
* simpler
* weights arg

chenyu | 539b00a645 | move llama getenv("JIT") from models to examples (#2671) | 2023-12-07 12:43:22 -05:00
Transformer class has a jit param so we should use that in the caller

chenyu | 6ba6349c97 | JIT=0 llama.py should not jit (#2609) | 2023-12-04 20:21:07 -05:00

Davi Silva | ddeec24fa8 | Cleanup & fix llama.py (#2524) | 2023-11-30 16:00:17 -05:00
* docs, cleanup crap
* comma AI
* fix 70B
* this is why lexical scope exists

George Hotz | 7170a9a057 | coder.py can write and run code (#2439) | 2023-11-25 12:27:54 -08:00
* wip mistral
* coder
* touchups
* cleanups
* mistral cleanups
* clean up cache create
* download the weights, fix tests
* fix llama loading
* global fixup
* clean up all
* move llama model
* cleanups
* Revert "cleanups" (reverts commit a71c5d59eb)
* fine, leave it