chenyu
97275101e9
fix safetensor load uint32 and uint64 (#3315)
the correct keys are U32 and U64.
2024-02-04 10:46:27 -05:00
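For context: a safetensors file begins with a little-endian u64 length prefix followed by a JSON header, and each tensor's "dtype" field is a short string, so the correct keys for unsigned 32- and 64-bit ints are "U32" and "U64". A minimal sketch of reading the header; the dtype map is partial, and the numpy mapping is this note's assumption rather than tinygrad's exact table:

```python
import json, struct
import numpy as np

# partial map from safetensors dtype strings to numpy dtypes (illustrative)
SAFETENSORS_DTYPES = {
    "F32": np.float32, "F16": np.float16, "I32": np.int32,
    "U32": np.uint32, "U64": np.uint64, "BOOL": np.bool_,
}

def read_safetensors_header(fn: str) -> dict:
    with open(fn, "rb") as f:
        (hdr_len,) = struct.unpack("<Q", f.read(8))  # u64 little-endian header size
        return json.loads(f.read(hdr_len))  # name -> {"dtype", "shape", "data_offsets"}
```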
Yoshinori Sano
edb74897b2
support safe load bf16 (#3310)
* support safe load bf16
* fix lint error E501
* add test for loading safetensors
* key should be BOOL
* fix lint
2024-02-04 10:08:39 -05:00
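numpy has no native bfloat16, so safely loading BF16 data means widening it by hand. Since bfloat16 is exactly the upper 16 bits of a float32, one way to decode it (a sketch, not necessarily tinygrad's code path):

```python
import numpy as np

def bf16_to_float32(raw: bytes) -> np.ndarray:
    # bfloat16 is the high half of a float32: widen to uint32 and shift up
    u16 = np.frombuffer(raw, dtype=np.uint16)
    return (u16.astype(np.uint32) << 16).view(np.float32)

# round trip: truncate some float32s to bf16, then decode them back
src = np.array([1.0, -2.5, 3.0e8], dtype=np.float32)
hi = (src.view(np.uint32) >> 16).astype(np.uint16)
print(bf16_to_float32(hi.tobytes()))  # ~[1.0, -2.5, 3.0e8], up to bf16 precision
```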
chenyu
d459956966
move TestGetContraction to test_helpers (#3313)
also cleaned long lines in test_shapetracker and enabled the line length check
2024-02-04 06:05:01 -05:00
Felix Wu
021eea3a52
fix UnboundLocalError when running Compiler with DISABLE_COMPILER_CACHE (#3296)
2024-02-01 21:12:33 -05:00
David Hou
3378625773
name upcast variables (#3200)
* name upcast variables
* typing
* unused
2024-01-22 11:37:28 -05:00
George Hotz
d2aab65958
remove unused expr node (#3170)
* remove unused expr node
* still works
* simple expr_idxs
* fixup typing
2024-01-18 14:18:43 -08:00
George Hotz
f0c178b7e9
move get_contraction to helpers (#3162)
* move get_contraction to helpers
* move simplify
* lines
* to_movement_ops is not generic
2024-01-17 19:13:11 -08:00
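For reference, get_contraction answers which consecutive axes of old_shape collapse into each axis of new_shape. A simplified greedy sketch of that contract (this note's reconstruction, not the helper's actual code; it makes no attempt to absorb stray 1s):

```python
def get_contraction(old_shape, new_shape):
    # group consecutive axes of old_shape so each group's product equals the
    # matching dim of new_shape; return None if no such grouping exists
    groups, i = [], 0
    for dim in new_shape:
        group, prod = [], 1
        while prod < dim and i < len(old_shape):
            group.append(i); prod *= old_shape[i]; i += 1
        if prod != dim: return None
        groups.append(group)
    return groups if i == len(old_shape) else None

assert get_contraction((2, 3, 4), (6, 4)) == [[0, 1], [2]]
assert get_contraction((2, 3, 4), (5, 4)) is None
```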
George Hotz
a464909d79
fast resnet eval (#3135)
* fast resnet eval
* fix HIP multidevice graph
* neater expression for devices
* lines
* add decorator test
2024-01-15 14:15:18 -08:00
Paul Gustafson
6bb65cd02e
fix off-by-one error in st_equal (#3131)
* fix off by one error
* whitespace
2024-01-15 11:32:13 -08:00
chenyu
c658aa4fbf
minor cleanup of test_disk_tensor (#3112)
2024-01-13 20:54:58 -05:00
chenyu
a300fea2a4
failed test case due to cast resets shapetracker (#3109)
cast implicitly resets the shapetracker and makes it contiguous (for disk tensors), which fails on the Interpreted backend if the inputs contain a non-contiguous st.
2024-01-13 12:46:51 -05:00
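The behaviour described above mirrors numpy: casting a strided view materializes a fresh contiguous buffer, so any non-contiguity of the input is silently lost. A tiny illustration, with numpy standing in for the disk-tensor case:

```python
import numpy as np

x = np.arange(16, dtype=np.int32)[::2]  # a non-contiguous view (every other element)
assert not x.flags["C_CONTIGUOUS"]
y = x.astype(np.float32)                # the cast allocates a new buffer...
assert y.flags["C_CONTIGUOUS"]          # ...so the strides are implicitly "reset"
```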
chenyu
f018a55ea1
update NumNode.__hash__ to be hash(self.b) (#3105)
with this, `a:=NumNode(x) == b` implies `hash(a) == hash(b)`
2024-01-12 19:46:21 -05:00
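The motivation is Python's hash/eq contract: objects that compare equal must hash equal, or dict and set lookups break. A minimal sketch of the invariant (not the full NumNode class):

```python
class NumNode:
    def __init__(self, b: int): self.b = b
    def __eq__(self, other):
        # a NumNode compares equal to a plain int of the same value
        return self.b == (other.b if isinstance(other, NumNode) else other)
    def __hash__(self): return hash(self.b)  # so a == b implies hash(a) == hash(b)

a = NumNode(4)
assert a == 4 and hash(a) == hash(4)
assert {a: "found"}[4] == "found"  # the two are interchangeable as dict keys
```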
chenyu
dab8214103
unit tests for Device.canonicalize (#3055)
2024-01-09 12:47:20 -05:00
George Hotz
655c6f61d3
St real size (#3046)
* track the size in the lazybuffer
* shapetracker real size
* lint
2024-01-08 14:44:53 -08:00
George Hotz
c003be7309
Revert "track size in shapetracker" ( #3043 )
...
* Revert "track size in shapetracker (#3026 )"
This reverts commit a8ba1ac08f .
* st.size
2024-01-08 13:13:39 -08:00
George Hotz
a8ba1ac08f
track size in shapetracker (#3026)
* track size in shapetracker
* shapetracker adapter
* size is an int
* create Buffer with st.size
* only compare the views for the jit
* fix webgpu
2024-01-05 20:15:53 -08:00
George Hotz
60abc62a3f
fast hip read (#3014)
* fast hip read
* hip read faster
* fix tests
* to_mv
* simplify
* bump to 6k lines
2024-01-05 10:33:13 -08:00
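The to_mv mentioned above wraps a raw pointer as a zero-copy memoryview; a ctypes sketch of the idea (the signature is inferred from the commit, not quoted from the source):

```python
import ctypes

def to_mv(ptr: int, sz: int) -> memoryview:
    # view `sz` bytes at address `ptr` as a writable memoryview, without copying
    return memoryview((ctypes.c_uint8 * sz).from_address(ptr))

buf = (ctypes.c_uint8 * 8)(*range(8))
mv = to_mv(ctypes.addressof(buf), 8)
mv[0] = 0xFF               # writes through to the underlying allocation
assert buf[0] == 0xFF
```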
George Hotz
9699c8c90b
don't alloc for InterpretedASTRunner (#2999)
2024-01-03 17:05:53 -08:00
chenyu
74cc6fd3c2
remove AndNode.__floordiv__ special case (#2996)
* remove AndNode.__floordiv__
AndNode produces a Node whose min/max is bounded by [0, 1], so `//` on top of that is almost always 0.
we don't really use it either
* keep the test
2024-01-03 17:44:55 -05:00
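The reasoning above is plain integer floor division on a [0, 1]-bounded value: for any divisor d >= 2 the result is identically 0, so the special case never fires usefully. Spelled out:

```python
# a node bounded to [0, 1] (as AndNode is) floor-divided by any d >= 2 is always 0
for v in (0, 1):
    for d in range(2, 100):
        assert v // d == 0
```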
chenyu
ff5399f053
move one last dtype test from test_helpers to test_dtype (#2975)
2024-01-02 12:37:56 -05:00
George Hotz
a280cfe169
move dtypes to dtype.py (#2964)
* move dtypes to dtype.py
* fix urllib
2024-01-01 14:58:48 -08:00
George Hotz
c81ce9643d
move globalcounters to ops (#2960)
* move globalcounters to ops
* missed a few
* sick of that failing
2024-01-01 14:21:02 -08:00
chenyu
8291986959
Variable.sum -> Node.sum, Variable.ands -> Node.ands (#2961)
2024-01-01 16:21:28 -05:00
chenyu
3d720b5761
move expand_idx, iter_idxs and expand_node from symbolic to linearizer (#2959)
2024-01-01 14:41:21 -05:00
George Hotz
5cac6338a4
apply the multitensor optimizations in lazy.py (#2901)
* apply the multitensor optimizations in lazy.py
* less lines
* hack for webgpu
* save a line
2023-12-21 13:55:49 -08:00
George Hotz
1765849937
new lazy, benchmark (#2878)
* lazy rewrite, try 2
* min fix tests
* pass contig test
* put broken pads back
* move that to realize
* no contig child fixes array packing
* so wrong
* now that's correct
* base children
* fix bind issues
* disable to_image_idx
* fix tests
* that failure shouldn't break other tests
* more fixes
* fix torch
* skip failing tests in CI
* 1e-7
* half is broken
* 1e-6 margin of error
2023-12-20 14:33:21 -08:00
Peter Cawley
dae8976889
Fix reshape merging with masks (#2877)
2023-12-20 14:00:58 -08:00
George Hotz
ca59054463
fix shapetracker math (#2861)
* proper test
* all st math good now
* fix real_strides bug
2023-12-19 22:17:34 -08:00
chenyu
5a739e8c20
update one skipped pad_reshape test that was fine (#2860)
* update one skipped pad_reshape test that was fine
had a typo
* this one passed
2023-12-19 23:25:52 -05:00
chenyu
ad233d557f
disable reshape merging with masks (#2858)
a fuzzer found a bug, and the merging logic is not complete
2023-12-19 19:06:16 -05:00
Oleg Rybalko
42a038c83f
More readable torch_load ext check (#2853)
* more readable extension check
* enable tarfile test
* detach tensor if requires grad in torch
2023-12-19 14:53:15 -05:00
George Hotz
b2192b5400
minor improvements (#2845)
2023-12-18 22:09:08 -08:00
George Hotz
d086325b1b
hotfix: failing tests
2023-12-18 21:12:42 -08:00
George Hotz
b6d71b131e
hotfix: push broken tests
2023-12-18 21:08:42 -08:00
George Hotz
80f53245e8
shapetracker add and invert (#2828)
* invert (broken)
* decent invert
* shapetracker invert works
* plus is meh, invert is good
* support invert mask
* a few more invert tests
* shapetracker math invert test
2023-12-18 16:03:27 -08:00
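Invert here means building a view that undoes a movement op, so a shapetracker composed with its inverse round-trips the data. One concrete instance of the idea, inverting a permute via argsort (numpy used purely for illustration):

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
perm = (2, 0, 1)
inv = tuple(np.argsort(perm))  # the inverse permutation
assert (x.transpose(perm).transpose(inv) == x).all()
```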
chenyu
b4fa189c8c
Revert "Revert "Make Tensor creation allow multi-dim list of int and bool ( #2793 )" ( #2810 )" ( #2813 )
...
This reverts commit 71a60762ed.
2023-12-17 11:48:27 -05:00
chenyu
71a60762ed
Revert "Make Tensor creation allow multi-dim list of int and bool ( #2793 )" ( #2810 )
...
This reverts commit 798bf813b1.
2023-12-17 02:03:52 -05:00
geohotstan
798bf813b1
Make Tensor creation allow multi-dim list of int and bool (#2793)
* the universe is flat as a 2D tensor
* try this
* TESTS
* less lines in test
* don't change all_int since other places use it
* add tests and del noqa by making non-aesthetic spacing LOOOOOL
* some reordering
* fixed empty list and add tests
* more tests
* add list bool tensors
* clearer with least lines added
* added bool
* oops
* more tests
* improved tests
* oops
2023-12-17 01:58:10 -05:00
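Accepting a multi-dim list means inferring a shape from the nesting and flattening the payload. A minimal sketch of that step (helper names are this note's, not tinygrad's):

```python
def shape_of(x):
    # shape of a uniformly nested list; () for a scalar leaf
    if not isinstance(x, list): return ()
    if len(x) == 0: return (0,)
    inner = shape_of(x[0])
    assert all(shape_of(e) == inner for e in x), "ragged nested list"
    return (len(x),) + inner

def flatten(x):
    return [y for e in x for y in flatten(e)] if isinstance(x, list) else [x]

data = [[True, False], [False, True]]
assert shape_of(data) == (2, 2)
assert flatten(data) == [True, False, False, True]
```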
George Hotz
877c78b4ce
lazy tests (#2796)
* tests
* mini sd is very mini
2023-12-16 08:24:21 -08:00
chenyu
5235cdee3d
remove _arg_int32 internal type (#2767)
in DEFINE_GLOBAL, PtrDType(int32) is a buffer and int32 is an int
2023-12-14 14:17:14 -05:00
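The note above says the dtype alone distinguishes a buffer argument (a pointer dtype) from a plain int argument in DEFINE_GLOBAL, which is what makes a special _arg_int32 unnecessary. A hypothetical sketch of a renderer branching on it (all names here are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DType:
    name: str

@dataclass(frozen=True)
class PtrDType(DType):
    pass  # the type itself marks "pointer to a buffer of this dtype"

def render_arg(name: str, dtype: DType) -> str:
    # pointer dtypes render as buffer parameters, plain dtypes as scalar ints
    return f"{dtype.name}* {name}" if isinstance(dtype, PtrDType) else f"{dtype.name} {name}"

assert render_arg("data0", PtrDType("int")) == "int* data0"  # a buffer
assert render_arg("idx0", DType("int")) == "int idx0"        # a plain int
```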
George Hotz
7e5b3e53fe
changes to prep for new lazy (#2748)
* changes to prep for new lazy
* put those back
2023-12-13 10:28:22 -08:00
Umut Zengin
8ad7cfeeb1
More simplification in to_image_idx and symbolic (#2679)
* less valid
* add test
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-13 12:30:44 -05:00
George Hotz
6d6eb9302d
ruff checks the max line length is 150 (#2734)
* ruff checks the max line length is 150
* fix tensor.py
* a lot more
* done
2023-12-12 17:34:47 -08:00
Guy Leroy
ee9e1d3662
Extend available types for safe_save (#2720)
* Extend available types to save with
* Linter fix
2023-12-11 14:50:35 -08:00
George Hotz
0fd44259cd
bf16 fix + cleanups from mixtral (#2698)
* bf16 fix + cleanups from mixtral
* generic bf16 cast
2023-12-10 16:31:52 -08:00
qazal
73b067f5ce
Bitcast p2 bfloat16 tests + clang fix (#2635)
* add bf16 test support
this model takes me almost a minute to download though:
https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded/resolve/main/pytorch_model-00001-of-00014.bin?download=true : 100%|█████████████████████████████| 981M/981M [00:40<00:00, 24.2MB/s]
* ensure we first load if it is bitcast to avoid taking the address of an rvalue
* tiny bf16 in the cloud
skip GPU
* should skip torch
lint
* Revert "ensure we first load if it is bitcast to avoid taking the address of an rvalue"
This reverts commit b86a28ab84.
* break the kernel
* skip LLVM and GPU in CI
* skip CUDA
2023-12-08 10:30:10 -08:00
chenyu
b931a20882
minor shapetracker cleanup (#2652)
2023-12-06 11:43:52 -05:00
Amrit Sahu
71d989b476
adding test to cover #2644 failure (#2645)
2023-12-06 11:00:30 -05:00
George Hotz
232ed2af3f
more test cleanups (#2631)
* more test cleanups
* move test example back
2023-12-05 16:17:57 -08:00
George Hotz
35b5e95097
parallel beam search (#2610)
* better print
* fix beam search with vars
* cleanups
* parallel is not default
* restore that
* bugfix
* cleanups
* bugfix
2023-12-05 10:09:45 -08:00