wozeparrot
5f3d5cfb02
catch cycles in print_tree ( #2891 )
* feat: smaller tree on references
* fix: shorter line
* fix: huh
* fix: should be all
* feat: cleaner
* fix: extra imports
* fix: pass by reference
2023-12-21 18:40:37 -08:00
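The "pass by reference" fix above points at the standard way to catch cycles while printing: thread one visited set through the recursion. A minimal sketch, with hypothetical node dicts rather than tinygrad's actual print_tree:

```python
def print_tree(node, visited=None, depth=0):
    # track visited node ids in a set passed by reference, so a node
    # reached again through a cycle prints a marker instead of
    # recursing forever
    if visited is None: visited = set()
    lines = [f"{'  '*depth}{node['name']}"]
    if id(node) in visited:
        lines[-1] += "  <cycle>"
        return lines
    visited.add(id(node))
    for child in node.get("children", []):
        lines += print_tree(child, visited, depth+1)
    return lines

# a self-referential node: printing terminates instead of looping
a = {"name": "a", "children": []}
a["children"].append(a)
out = print_tree(a)
```

Passing `visited` by reference (one shared set) rather than copying it per call is what makes revisits detectable across sibling branches.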
George Hotz
4432cb17bb
minor cleanups / remove that op ( #2905 )
2023-12-21 18:24:20 -08:00
chenyu
fd0ba33b38
onnx_ops formatting cleanup ( #2904 )
also removed a case in safe_numpy that always converted 0-dim arrays to 1-dim
2023-12-21 20:06:06 -05:00
George Hotz
5cac6338a4
apply the multitensor optimizations in lazy.py ( #2901 )
* apply the multitensor optimizations in lazy.py
* less lines
* hack for webgpu
* save a line
2023-12-21 13:55:49 -08:00
chenyu
5bf43c9634
reenable one onnx test that failed due to dtype ( #2902 )
2023-12-21 15:50:02 -05:00
chenyu
677ae7673d
use np.less and torch.lt for CMPLT ( #2899 )
also removed one unused output_type
2023-12-21 14:37:24 -05:00
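For reference, numpy's comparison ufuncs already return a bool array, which is what makes `np.less` a natural backing for CMPLT and removes the need for a separate output_type:

```python
import numpy as np

# np.less produces dtype bool directly; no output_type override needed
out = np.less(np.array([1.0, 3.0]), np.array([2.0, 2.0]))
```

`torch.lt` behaves the same way, returning a `torch.bool` tensor.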
qazal
d2e9245de8
render_locals takes a dtype ( #2873 )
Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-21 14:15:28 -05:00
chenyu
6116039f7b
don't match dtype with first input in where ( #2898 )
* don't match dtype with first input in where
in `Tensor([1, 2, 3]).where(1.2, 2.3)`, the first `[1, 2, 3]` can be cast directly to bool without first casting to the (broadcasted) float dtype
* cast in one place
2023-12-21 13:02:15 -05:00
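The same behavior illustrated with numpy's `where` (analogous to, but not, tinygrad's `Tensor.where`): the condition casts straight to bool, so its dtype need not match the branch values:

```python
import numpy as np

cond = np.array([0, 1, 2])                    # int condition tensor
# the condition is cast to bool directly; only the two branch values
# participate in dtype promotion of the output
out = np.where(cond.astype(bool), 1.2, 2.3)
```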
chenyu
7dc3352877
increase stable diffusion validation threshold 1e-4 -> 3e-4 ( #2897 )
saw a flaky CI failure with 1.1e-4, and 3e-4 is a good number
2023-12-21 11:45:25 -05:00
qazal
24e79e0f53
Move the webgpu CMPLT hack to one place ( #2895 )
* move hacks to one place
* no casting in mlops, move to tensor
* ruff fix
2023-12-21 11:14:56 -05:00
George Hotz
852ef57ba4
fix readme typo
2023-12-21 08:06:24 -08:00
George Hotz
193109a88c
hotfix: compare on ids
2023-12-20 23:47:50 -08:00
George Hotz
f6c7833f9f
fast compare for lazyop ( #2893 )
2023-12-20 23:32:27 -08:00
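Deep structural comparison of nested op trees re-walks the whole tree on every `__eq__`. One common fix, sketched here with a hypothetical `Op` class rather than tinygrad's actual LazyOp, is to precompute a hashable key once at construction and compare that, short-circuiting on identity (the "compare on ids" hotfix above):

```python
class Op:
    def __init__(self, op, srcs=(), arg=None):
        self.op, self.srcs, self.arg = op, tuple(srcs), arg
        # build a flat key once; later comparisons reuse it instead
        # of re-walking the tree
        self._key = (op, tuple(s._key for s in self.srcs), arg)
    def __hash__(self): return hash(self._key)
    def __eq__(self, other): return self is other or self._key == other._key

a = Op("ADD", [Op("CONST", arg=1), Op("CONST", arg=2)])
b = Op("ADD", [Op("CONST", arg=1), Op("CONST", arg=2)])
c = Op("ADD", [Op("CONST", arg=1), Op("CONST", arg=3)])
```

Keys are built bottom-up from the children's cached keys, so construction stays linear in tree size.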
chenyu
1500aca43d
remove output_type in ops_cpu and ops_torch ( #2892 )
now that the input types are matched and checked in lazy, we can remove these output_types.
also removed the usage of least_upper_dtype in ops.py, since we can just use the input type
2023-12-21 02:11:27 -05:00
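least_upper_dtype is dtype promotion: the smallest dtype both inputs can be cast to without loss. numpy exposes the analogous operation as `np.promote_types` (its rules differ from tinygrad's in places, so this is illustrative only):

```python
import numpy as np

# least upper bound of two dtypes under numpy's promotion lattice
lub = np.promote_types(np.float16, np.float32)   # widest float wins
mix = np.promote_types(np.uint8, np.int8)        # no common 8-bit type
```

Once both inputs are pre-cast to the same dtype, the elementwise op's output dtype is simply that input dtype, which is what let output_type be dropped.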
chenyu
2d2c4980fe
assert for elementwise dtypes in lazy ( #2888 )
* assert for elementwise dtypes in lazy
* no image hack
* check dtype of scalar for IMAGE=2
2023-12-21 01:42:32 -05:00
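The shape of such an assert, sketched with a hypothetical helper (not the actual lazy.py code): every source of an elementwise op must already share one dtype, so a frontend that forgot to cast fails loudly at graph-build time rather than producing wrong results later:

```python
import numpy as np

def elementwise(op, *srcs):
    # inputs must be pre-cast by the frontend; no silent promotion here
    assert all(s.dtype == srcs[0].dtype for s in srcs), \
        f"dtype mismatch: {[s.dtype for s in srcs]}"
    return op(*srcs)

out = elementwise(np.add, np.float32([1.0]), np.float32([2.0]))
```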
George Hotz
41b2a25be6
Fix exponential behavior in lazyops ( #2890 )
* add cache to ast_parse and lazyop builder
* add caches
2023-12-20 22:06:50 -08:00
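Why a cache fixes the exponential behavior: a lazy graph is a DAG, and without memoization a recursive walk revisits each shared subtree once per path to it. A minimal sketch (a hypothetical walk, not tinygrad's ast_parse):

```python
def count_ops(op, cache=None):
    # memoize on node identity so each shared subtree is walked once
    if cache is None: cache = {}
    if id(op) in cache: return 0
    cache[id(op)] = True
    return 1 + sum(count_ops(s, cache) for s in op[1])

# a "ladder" DAG: each level references the level below twice
node = ("CONST", ())
for _ in range(40):
    node = ("ADD", (node, node))
n = count_ops(node)   # 41 unique nodes visited, not 2**41 - 1
```

Without the cache this walk makes 2**41 - 1 calls; with it, one per unique node.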
George Hotz
8c4a0f8e15
Fix int child count ( #2882 )
* pad ops broke coder
* that contiguous fixes it
* Update lazy.py
* recursive add
* fix all
* revert that
* todo test
2023-12-20 21:06:27 -08:00
chenyu
8a04107d30
move the op casting logic from mlops to tensor try 2 ( #2887 )
* unary works
* where works
* add sub mul
* xor div
* CMPLT
* sparse_categorical_crossentropy
* image const
* sparse_categorical_crossentropy
2023-12-20 23:50:37 -05:00
George Hotz
7da2325dc7
get_lazyops() -> lazyops ( #2884 )
* get_lazyops() -> lazyops
* don't compare empty mem
2023-12-20 18:04:49 -08:00
George Hotz
64dded27f0
pad ops broke coder ( #2881 )
* pad ops broke coder
* that contiguous fixes it
* Update lazy.py
2023-12-20 17:03:41 -08:00
George Hotz
e1861ab65e
remove realize from optimizer ( #2880 )
* remove realize from optimizer
* one still needed
* opt realize
2023-12-20 16:42:41 -08:00
George Hotz
1765849937
new lazy, benchmark ( #2878 )
* lazy rewrite, try 2
* min fix tests
* pass contig test
* put broken pads back
* move that to realize
* no contig child fixes array packing
* so wrong
* now that's correct
* base children
* fix bind issues
* disable to_image_idx
* fix tests
* that failure shouldn't break other tests
* more fixes
* fix torch
* skip failing tests in CI
* 1e-7
* half is broken
* 1e-6 margin of error
2023-12-20 14:33:21 -08:00
Peter Cawley
dae8976889
Fix reshape merging with masks ( #2877 )
2023-12-20 14:00:58 -08:00
George Hotz
8fe24038d8
Revert "mulacc fusion cleanup ( #2871 )" ( #2876 )
This reverts commit 863c5b26ed.
2023-12-20 13:26:25 -08:00
qazal
863c5b26ed
mulacc fusion cleanup ( #2871 )
* add mulacc fusion tests
* cleanup the implementation
* fix indent in the test utility
* less verbose
2023-12-20 15:39:54 -05:00
chenyu
e13b4964d7
remove the all_int(shape) check in Tensor._loadop ( #2874 )
* remove the all_int(shape) check in Tensor._loadop
we can now support jittable symbolic-shape random with custom rand, and can formalize it in a test once threefry is ready
* MOCKHIP false positive
2023-12-20 15:04:50 -05:00
qazal
5f07ef455e
update dtypes ( #2872 )
2023-12-20 15:04:02 -05:00
chenyu
857c35d256
make gpt2 decode output just once at the end ( #2869 )
also renamed greedy_until to generate, as it's neither greedy nor until-based
2023-12-20 12:14:55 -05:00
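Decoding once at the end avoids re-decoding a growing token prefix on every generation step. Sketched with a toy vocab (names hypothetical, not the gpt2 example's actual code):

```python
vocab = {0: "he", 1: "llo", 2: "!"}

def generate(steps):
    toks = []
    for t in steps:
        toks.append(t)       # only accumulate token ids in the loop
    # decode the whole output just once, after generation finishes
    return "".join(vocab[t] for t in toks)

text = generate([0, 1, 2])
```

This also sidesteps printing partial multi-byte tokens mid-stream, since the decode sees the full sequence.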
chenyu
e92069fb1c
remove unused symbolic.is_sym_int ( #2868 )
2023-12-20 11:37:54 -05:00
George Hotz
ca59054463
fix shapetracker math ( #2861 )
* proper test
* all st math good now
* fix real_strides bug
2023-12-19 22:17:34 -08:00
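For context on shapetracker math: the strides of a contiguous shape are the running products of the trailing dimensions, and real_strides has to recover these through views. A sketch of the contiguous case (tinygrad has a similar helper; this version is illustrative):

```python
def strides_for_shape(shape):
    # stride of dim i = product of all dims after it; last dim is 1
    strides, acc = [], 1
    for s in reversed(shape):
        strides.insert(0, acc)
        acc *= s
    return tuple(strides)

st = strides_for_shape((2, 3, 4))
```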
chenyu
5a739e8c20
update one skipped pad_reshape test that was fine ( #2860 )
* update one skipped pad_reshape test that was fine
had a typo
* this one passed
2023-12-19 23:25:52 -05:00
chenyu
39af93ed7c
minor tensor.py function cleanup ( #2859 )
* minor tensor.py function cleanup
* where outputs not aligned yet
2023-12-19 22:39:39 -05:00
George Hotz
94f71fe238
random and empty shouldn't reshape
2023-12-19 18:09:03 -08:00
George Hotz
637879af78
add direct install to readme
2023-12-19 18:04:00 -08:00
chenyu
ad233d557f
disable reshape merging with masks ( #2858 )
a fuzzer found a bug, and the merging logic isn't complete
2023-12-19 19:06:16 -05:00
chenyu
1231ec5a02
run the sz.py line count at the end of linter ci ( #2857 )
2023-12-19 16:33:12 -05:00
George Hotz
ac6ec936cd
update contributing
2023-12-19 12:19:14 -08:00
George Hotz
e477cc2f45
hotfix: README is ~25 ops to stop getting PRs about it
2023-12-19 11:53:35 -08:00
Oleg Rybalko
42a038c83f
More readable torch_load ext check ( #2853 )
* more readable extension check
* enable tarfile test
* detach tensor if requires grad in torch
2023-12-19 14:53:15 -05:00
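A readable extension check along these lines (hypothetical helper, not the exact torch_load code) dispatches on the file suffix instead of chained string slicing:

```python
from pathlib import Path

def load_kind(fn):
    # map known checkpoint extensions to a loader; case-insensitive
    suffix = Path(fn).suffix.lower()
    return {".pt": "zip_or_pickle", ".pth": "zip_or_pickle",
            ".tar": "tarfile"}.get(suffix, "unknown")

kind = load_kind("model.PTH")
```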
chenyu
172a88e719
skip slow test_indexing on METAL ( #2852 )
LLVM still runs and is a lot faster, would be curious to know why.
also reworded some error messages and removed the regex check
2023-12-19 12:00:54 -05:00
chenyu
6d7e9e0a56
hotfix convert Y_train to int before passing into index ( #2850 )
2023-12-19 11:40:56 -05:00
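The underlying issue: arrays used as indices must have an integer dtype. With numpy (analogous to the Tensor case in that hotfix), float labels must be cast before they can select rows:

```python
import numpy as np

Y_train = np.array([2.0, 0.0, 1.0])       # labels loaded as float
# cast to int before indexing; float index arrays raise IndexError
onehot = np.eye(3)[Y_train.astype(int)]
```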
geohotstan
fec8e9060c
Add simple fancy indexing exceptions ( #2706 )
* fancy indexing raise error
* updated error message
* improved error check
* oops
* fixed onnx
* oops typo
* merge
* add full_flatten
* try
* merged and updated some tests
* more cleaning
* done
* temp fix onnx
* try
* add todo in onnx_test
* reword
* gah
2023-12-19 11:23:51 -05:00
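The shape of raising early for unsupported fancy-index types, sketched with a hypothetical checker (the PR's actual rules live in tinygrad's `__getitem__`):

```python
def check_index(indices):
    # accept ints, slices, and lists of ints; reject anything else
    # with a clear error instead of failing deep inside indexing
    for i in indices:
        if not isinstance(i, (int, slice, list)):
            raise IndexError(f"unsupported index type {type(i).__name__}")
        if isinstance(i, list) and not all(isinstance(x, int) for x in i):
            raise IndexError("list indices must contain only ints")
    return True

ok = check_index((0, slice(None), [1, 2]))
```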
qazal
417d42a363
UOps support for ImageDType ( #2848 )
* cleanup buffer dtypes in global_load
* update with feedback
2023-12-19 09:39:48 -05:00
George Hotz
90fb09b55c
remove unused _device_extra_args
2023-12-18 22:14:58 -08:00
George Hotz
b2192b5400
minor improvements ( #2845 )
2023-12-18 22:09:08 -08:00
George Hotz
d086325b1b
hotfix: failing tests
2023-12-18 21:12:42 -08:00
George Hotz
07df14aa0e
HIP cleanups ( #2843 )
* move everything to code_for_op to reason about it
* loop the loopable parts
* it's not that unreadable
* these are loopable too
* nitpick
* tests p1 - replace these with the actual compiler running alu ops tests
* tests p2: compile test_dtype_alu in HIP!
+add to CI
* nobody liked test_renderer
* revert test_dtypes change
* isolated mockhip tests
* dont need the WHERE hack after #2782
+ruff
* bf16 is broken in HIP
job failed in: https://github.com/tinygrad/tinygrad/actions/runs/7232101987/job/19705951290?pr=2778#step:8:73
* picking this back up
* add compile tests for unary ops and binary ops
* MOD is only in ints
* CMPLT won't work after the dtypes pr is merged because it will always be bool
* test all combinations
* Update cstyle.py
* don't use vload
* no getenv
* set seed
---------
Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2023-12-18 21:09:32 -08:00
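"loop the loopable parts" refers to generating near-identical ALU render cases from one table instead of spelling each out in code_for_op. A sketch of the pattern (toy op table, not the actual cstyle.py contents):

```python
# build render functions for binary ALU ops from one table; the inner
# lambda is wrapped so each closure captures its own operator symbol
BINOPS = {"ADD": "+", "SUB": "-", "MUL": "*"}
code_for_op = {name: (lambda sym: (lambda a, b: f"({a}{sym}{b})"))(sym)
               for name, sym in BINOPS.items()}

expr = code_for_op["ADD"]("x", "y")
```

The immediately-invoked outer lambda avoids the classic late-binding bug where every generated function would render the last operator in the table.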
George Hotz
b6d71b131e
hotfix: push broken tests
2023-12-18 21:08:42 -08:00
George Hotz
9b35186bbe
hotfix: don't import from tinygrad in sz.py
2023-12-18 20:49:46 -08:00
George Hotz
6617dcf095
move graph to runtime, check line count with sz.py ( #2842 )
* move graph to runtime, check line count with sz.py
* oops, didn't save
* dtype aliases
* restore comment, REALCOUNT
2023-12-18 20:30:06 -08:00