Oleg Rybalko
7220f5c9fc
fixed hf convert and now it's working with tinyllama ( #2374 )
* fixed hf convert and now it's working with tinyllama
* added tinyllama config
* refactored code and made it work with all llama models
* prettier order
* prettier order
* fixed suffix for tinyllama and refactored convert_from_hf
* dynamically update help if MODEL_PARAMS changes and default size is the 1st
2023-11-21 14:36:52 -08:00
chenyu
d0f966b320
add a segfault linearizer test case ( #2383 )
* add a segfault linearizer test case
* another interesting one
2023-11-21 15:06:41 -05:00
chenyu
9eeba968cd
fix the variable arg order ( #2382 )
2023-11-21 12:02:31 -05:00
nimlgen
c5f429a40a
Fix linearizer cache ( #2371 )
* fix linearizer cache
* better comments
* a bit cleaner
2023-11-21 07:58:35 -08:00
Umut Zengin
0da72119bb
Readable and Faster Union of Vars ( #2380 )
* functools.reduce to set.union
* flake8
2023-11-21 09:45:19 -05:00
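The bullet above names the refactor exactly: a `functools.reduce` chain replaced by a single built-in `set.union` call. A minimal standalone sketch of the two equivalent forms, using plain Python sets rather than tinygrad's `Variable` objects:

```python
import functools

sets = [{1, 2}, {2, 3}, {3, 4}]

# before: the union built up pairwise with a reduce
reduced = functools.reduce(lambda a, b: a | b, sets, set())

# after: one built-in call that unions all operands at once (and reads better)
unioned = set().union(*sets)

assert reduced == unioned == {1, 2, 3, 4}
```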
qazal
15c316b9b1
add marker ( #2379 )
2023-11-21 09:44:15 -05:00
wozeparrot
fb0d650b25
feat: don't optimize buffers when it's not an astrunner ( #2377 )
2023-11-20 22:07:31 -08:00
wozeparrot
abbcc7aefa
missed cleanup from cache_id removal ( #2376 )
2023-11-21 01:03:43 -05:00
Duc TranMinh
179551a55c
remove file writing in metal ops ( #2369 )
* remove file writing in metal ops
* remove unused import
---------
Co-authored-by: ductm104 <ductm>
2023-11-20 19:24:39 -08:00
chenyu
c4cc4966ed
update some test_tensor.py cases with 0 in shape ( #2368 )
2023-11-19 20:35:05 -05:00
chenyu
6add808f6a
support tuple shape input for rand and empty ( #2367 )
2023-11-19 20:20:39 -05:00
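A sketch of what "tuple shape input" means in practice: `Tensor.rand(2, 3)` and `Tensor.rand((2, 3))` should both resolve to the same shape. The `normalize_shape` helper below is hypothetical, not tinygrad's actual implementation; it just illustrates the usual unwrapping trick:

```python
def normalize_shape(*shape):
    # unwrap a single tuple/list argument so both call styles resolve identically
    if len(shape) == 1 and isinstance(shape[0], (tuple, list)):
        shape = tuple(shape[0])
    return tuple(shape)

assert normalize_shape(2, 3) == normalize_shape((2, 3)) == (2, 3)
```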
chenyu
e9847be790
remove whisper +1-1 hack ( #2360 )
* remove whisper +1-1 hack
* Revert "remove whisper +1-1 hack"
This reverts commit 5db3800f09.
* update whisper tests
* comment context
2023-11-19 17:56:36 -05:00
George Hotz
a0890f4e6c
move fetch to helpers ( #2363 )
* switch datasets to new fetch
* add test_helpers
* fix convnext and delete old torch load
2023-11-19 12:29:51 -08:00
chenyu
03968622a2
Pretty multinomial ( #2365 )
* pretty multinomial
p, cdf_normalized -> weight, cdf
symmetric unsqueeze / squeeze
check num_sample > 0
TODO: how do we want to handle 0/0 in general?
* no 0-dim input
* single sum
2023-11-19 15:10:10 -05:00
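The bullets describe CDF-based multinomial sampling: weights cumulatively summed and normalized into a cdf, then uniform draws compared against it. A NumPy sketch of that technique (illustrative only, not tinygrad's actual code):

```python
import numpy as np

def multinomial(weight: np.ndarray, num_samples: int) -> np.ndarray:
    assert num_samples > 0, "num_samples must be positive"  # mirrors the commit's check
    cdf = weight.cumsum()
    cdf = cdf / cdf[-1]                    # normalize so the last entry is 1.0
    unif = np.random.rand(num_samples, 1)  # uniform draws in [0, 1)
    # sampled index = number of cdf entries strictly below each draw
    return (cdf[None, :] < unif).sum(axis=1)

# all mass on index 0 must always sample 0
assert (multinomial(np.array([1.0, 0.0, 0.0]), 10) == 0).all()
```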
Friedrich Carl Eichenroth
0eb0defa6f
remove unused key properties ( #2359 )
2023-11-18 23:30:21 -08:00
Friedrich Carl Eichenroth
b3a21eee7d
just new types ( #2358 )
2023-11-18 23:29:46 -08:00
chenyu
f203d37258
retry test_webgpu.js 3 times ( #2362 )
2023-11-18 21:24:47 -05:00
mmmkkaaayy
08d09eb666
Enable whisper test in CI for more backends ( #2355 )
2023-11-18 17:52:50 -05:00
chenyu
d7d078c7f9
Node.vars() returns a set and properly dedup ( #2356 )
* dedup RedNode.vars()
* vars returns a set
* fix more vars
* unused import
* update to_movement_ops
* comment
2023-11-18 17:44:52 -05:00
chenyu
0443cbfbb9
fix shm path test on macos ( #2357 )
AttributeError: 'PosixPath' object has no attribute 'startswith'
2023-11-18 17:37:42 -05:00
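The traceback quoted in the commit body points at the likely fix: a `pathlib.Path` was being passed where a string prefix check was expected. A tiny reproduction (the `/dev/shm` prefix here is an assumption based on the commit title, not taken from the diff):

```python
from pathlib import Path

p = Path("/dev/shm/test")
assert not hasattr(p, "startswith")   # Path has no startswith -> the AttributeError
assert str(p).startswith("/dev/shm")  # converting to str first restores the check
```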
chenyu
f02e17a967
Variable.num -> NumNode ( #2354 )
2023-11-18 15:45:52 -05:00
George Hotz
40246d35bc
ops_shm removed ( #2351 )
* ops_shm removed
* buf.cast
* err, forgot those
2023-11-18 11:41:58 -08:00
George Hotz
9b58d4cb37
cleanup unused movement ops ( #2353 )
* cleanup_mops
* no expand
* nothing
* revert that
* add comment
* add correctness check to disk tensor
2023-11-18 09:19:02 -08:00
chenyu
c4d97bba8c
simplify Node.sum, remove factorize method ( #2352 )
2023-11-18 11:55:48 -05:00
George Hotz
e35c31c8e5
xid for hip, device in time linearizer ( #2348 )
Co-authored-by: Tiny Box <tinybox@tinygrad.org>
2023-11-17 20:50:07 -08:00
chenyu
6e44a798df
update fixed linearizer test ( #2347 )
* update fixed linearizer test
* except CLANG
2023-11-17 23:46:37 -05:00
George Hotz
c8c5212dce
a lil more beautiful_mnist
2023-11-17 19:53:06 -08:00
George Hotz
c7b38b324b
A beautiful MNIST training example ( #2272 )
* beautiful mnist
* beautiful mnist example
* from tinygrad import Tensor
* more beautiful
* the jit is super core tinygrad
* globalcounters reset on jit run
* symlinks and exclude
* beautiful_cartpole
* evaluate is its own function
* no symlinks
* more beautiful
* jit reset for double speed
* type hinting for JIT
* beautiful_mnist gets 98%
* beautiful_mnist < 4s with BEAM=2
* better cartpole
* use actor critic
* zero_grad got lost
* delete double relu
* stable cartpole with PPO
* beautiful_cartpole is more beautiful
* REPLAY_BUFFER
* beautiful stuff typechecks
* None support in shape
* hp tuning
2023-11-17 19:42:43 -08:00
chenyu
74e6b6c9fc
types ( #2346 )
2023-11-17 18:49:24 -05:00
chenyu
d2c0035c73
add back as_strided, move rebuilt mops to extra ( #2344 )
* add back as_strided, move rebuilt mops to extra
* negative stride for ops_cpu
* Revert "negative stride for ops_cpu"
This reverts commit a13b6815ac.
* skip that
* style
2023-11-17 14:34:30 -05:00
nimlgen
064034c42c
hip free event + a bit faster cpu time ( #2342 )
* free hip events
* hip faster
2023-11-17 09:50:49 -08:00
chenyu
ad3d7428fa
good line shaves in st and faster ( #2343 )
2023-11-17 11:00:26 -05:00
George Hotz
652d2de256
wow how did i think that was okay ( #2339 )
2023-11-16 21:21:11 -08:00
chenyu
8e22c0d95c
everything can jit now ( #2338 )
2023-11-16 23:54:57 -05:00
Friedrich Carl Eichenroth
a8875bd770
add types to lazy ( #2327 )
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-16 20:48:41 -08:00
George Hotz
1d5501594e
force rebuild of ocelot ( #2334 )
* force rebuild of ocelot
* SzymonOzog gpuocelot
* delete that
* downgrade that
* non parallel
* force rebuild
* use llvm
* nauto
* less mem maybe
* print test
* helper_test_exception skip CUDACPU
* helper_test_exception
* shippable
2023-11-16 20:44:14 -08:00
imaolo
0d0c74bac9
Assert for memory allocation failures ( #2337 )
* assert adequate memory has been freed
* cleaned up runtime error message
* improved metal buffer alloc error catching and reporting
* decreased lines and altered messages
* removed unnecessary _get_cur_free_space() call
* improved assert message
* added allocate massive buffer test
* added test_lru_allocator_metal_max_buffer_length
* split into two asserts and removed walrus assignment from assert expression
* update assert message and use byte data type for clarity
2023-11-16 20:14:16 -08:00
chenyu
aa01a63b3f
cleanup of lines / unused / types ( #2336 )
2023-11-16 21:15:32 -05:00
chenyu
3971259832
fix test_real_world llama ( #2335 )
2023-11-16 19:50:08 -05:00
chenyu
3b9dd3330c
add device to beam search cache key ( #2333 )
2023-11-16 18:35:08 -05:00
Friedrich Carl Eichenroth
75676ab8e1
Profiling-helper ( #2321 )
* change profiler
* remove unused imports
* remove unused imports
* change lazybuffer references
* remove unused line
* remove unused import
* remove unused stuff
* add types
* typing
* typing
* typing
* trigger actions
* -1 loc
* fixup
* trigger actions
* revert lazy typing changes
* WIP profiler helper
* replace old start & stop profiler
* fixup
* linting
* Update llama.py
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-16 14:15:56 -08:00
mmmkkaaayy
8235da11dd
whisper: support batch inference, add librispeech WER test ( #2074 )
* whisper: support batch inference, add librispeech WER test, add kv caching and JIT
* remove JIT_SUPPORTED_DEVICE
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-16 13:50:08 -08:00
George Hotz
3baaf298d6
two stage cumsum in tensor.py ( #2331 )
* two stage cumsum in tensor.py
* 2 more kernels for llama cumsum
* gpt-2 and llama use fast multinomial
2023-11-16 12:09:53 -08:00
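"Two stage cumsum" is a standard trick for long scans: cumsum within fixed-size blocks, then add each block's running offset. A NumPy sketch of the idea (illustrative only; tinygrad's version in `tensor.py` builds the same thing out of reshape/pad tensor ops):

```python
import numpy as np

def two_stage_cumsum(x: np.ndarray, block: int = 4) -> np.ndarray:
    n = len(x)
    xp = np.pad(x, (0, (-n) % block)).reshape(-1, block)        # pad to a multiple of block
    local = xp.cumsum(axis=1)                                   # stage 1: cumsum inside each block
    offsets = np.concatenate([[0.0], local[:-1, -1].cumsum()])  # stage 2: running block totals
    return (local + offsets[:, None]).reshape(-1)[:n]

x = np.arange(10, dtype=float)
assert np.allclose(two_stage_cumsum(x), x.cumsum())
```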
chenyu
163b2bc26a
wgpu.utils._device -> wgpu.utils.device ( #2330 )
* wgpu.utils._device -> wgpu.utils.device
* can i do this?
* no need to specify metal
2023-11-16 12:52:13 -05:00
chenyu
27f4c26312
fix getitem slice when end < start ( #2329 )
2023-11-16 11:20:27 -05:00
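For reference, the Python slicing semantics the getitem fix presumably matches: a slice whose stop precedes its start (with positive step) yields an empty result rather than raising.

```python
a = [0, 1, 2, 3]
assert a[3:1] == []   # end < start with positive step: empty result, not an error
assert a[2:2] == []   # degenerate equal bounds behave the same way
```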
chenyu
822d6e6f18
Simpler mops verify ( #2325 )
* rewrite the to_movement_ops check using symbolic
* tweak
2023-11-15 21:47:18 -05:00
George Hotz
ef67d7ff5d
shapetracker whitespace
2023-11-15 15:24:09 -08:00
chenyu
a98511561c
fuzz_linearizer same api for interpreted and compiled ( #2320 )
2023-11-15 17:40:22 -05:00
George Hotz
294e71de15
remove lines (unused code) ( #2319 )
* remove lines
* uhh, i'm tired
* that function never worked
* types for ast_parse
2023-11-15 14:36:11 -08:00
George Hotz
628365eab6
JIT cleanups ( #2317 )
* cleanup cleanup
* dedup update_stats
2023-11-15 13:34:52 -08:00