George Hotz
4b3fcb4064
Revert "REDUCE_AXIS keepdim=False ( #11311 )" ( #11718 )
...
This reverts commit b518a7378a.
2025-08-18 13:28:53 -07:00
b1tg
b518a7378a
REDUCE_AXIS keepdim=False ( #11311 )
...
* progress
* fix tests
* fix tests
* remove hack for test_symfold
* fix test_conv.py on llvm
* hack test_cache_speed
* lint
* remove hack for helper_linearizer_opt
* tests
* fix DSP
* clean up
* remove hack for kernelize.py
* hack for test/test_multitensor.py TestMultiTensor.test_matmul_shard_none
* clean
* uop.r need reshape?
* lower_store cause fail
* fix lower?
* avoid contiguous hack
* 2134
* conv2d count
* remove unused
* hack lower
* reduced and clean up
* fix TestMultiTensor.test_matmul_shard_none
* src sync + fix TestMultiTensor.test_matmul_shard_none
* remove excluded in mop
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2025-08-18 10:09:17 -07:00
George Hotz
82be8abfd2
move opt under codegen ( #11569 )
2025-08-07 14:19:17 -07:00
George Hotz
21570545d3
move view pushing to codegen, try 2 ( #11534 )
...
* move view pushing to codegen, try 2
* fix up some linearizer tests
* fix test search
* fix test schedule
* delete that test
* fix test arange
* fix a few tests
* update tests
* push views
* ebs cleanup
* fix local/reg
* test and lint
* fix more tests
* test cleanups
* skipped that one
2025-08-06 15:58:38 -07:00
George Hotz
6fd1332763
update some tests for less Kernel ( #11543 )
...
* update some tests for less Kernel
* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
4fe11725c6
pass through sink arg, update linearizer test ( #11536 )
...
* pass through sink arg, update linearizer test
* get_program help
* bump line count
* use new api
2025-08-06 09:48:48 -07:00
chenyu
0e5d8d5c3c
remove tests that used .to_uop() ( #11425 )
...
* remove tests that used .to_uop()
* import
2025-07-29 15:52:16 -04:00
George Hotz
466ab5a3f2
store/load not pass through index ( #11381 )
...
* noop
* fix noop
* store cat is NOOP
* store dtype is void
* stores aren't passed through anymore
* meh, skip those for ptx
* correct ptx skip
* hl runs
2025-07-25 21:01:47 -07:00
George Hotz
e14b4fefa5
ranges on store ( #11334 )
...
* ranges on store
* fix store spec
* fix that
* fix gates
* fix tests
* fix ptx
2025-07-22 21:00:50 -07:00
George Hotz
affd83961c
small changes from define_reg ( #11327 )
...
* small changes from define_reg
* fix webgpu
2025-07-22 11:11:48 -07:00
George Hotz
3b674df34b
generic changes from define_reg_2 ( #11315 )
...
* generic changes from define_reg_2
* fix for ptx
* ugh, that one
2025-07-21 15:14:06 -07:00
chenyu
54924f9969
type remove Union and Optional [pr] ( #11283 )
...
use `|` for consistency
2025-07-19 14:05:52 -04:00
chenyu
ec3efd2919
move upcast before reduce ( #11250 )
...
* move upcast before reduce
upcast goes to end of global+local+upcast
* r_196_32_4_24_8
2025-07-18 14:42:15 -04:00
chenyu
522dc72f08
remove Kernel.local_dims [pr] ( #11268 )
...
* remove Kernel.local_dims [pr]
also not needed
* fix test_matvec
2025-07-16 17:46:19 -04:00
chenyu
c8e5c4d7c3
insert_before -> insert_at [pr] ( #11257 )
...
more precise
2025-07-15 17:44:34 -04:00
chenyu
b6662096cb
remove more first_reduce [pr] ( #11239 )
2025-07-14 19:13:44 -04:00
chenyu
eb8e17ef59
remove most of the first_upcast [pr] ( #11238 )
2025-07-14 16:54:24 -04:00
chenyu
674dc28505
remove Kernel.full_unupcasted_shape [pr] ( #11215 )
...
decomp to shape_len and first_upcast to get the last upcast-able dim
2025-07-13 13:56:23 -04:00
chenyu
2b48b961be
fix a few broken AMX tests ( #11204 )
2025-07-12 21:42:38 -04:00
chenyu
a0438012af
remove Kernel.get_program [pr] ( #11203 )
2025-07-12 20:50:29 -04:00
chenyu
6283d50224
DEPRECATED_linearize -> to_program [pr] ( #11198 )
2025-07-12 13:46:20 -04:00
George Hotz
2893feb9f6
cleanups for kernel.py ( #11143 )
...
* cleanups for kernel.py
* fixups
2025-07-08 18:10:25 -07:00
George Hotz
359bed74f8
axis type tracking [pr] ( #11137 )
...
* axis type tracking [pr]
* keep update_info
* keep legacy colors
* update tests to apply_opt
2025-07-08 14:16:25 -07:00
George Hotz
0597735f28
remove TC=3 not porting this ( #11045 )
2025-06-30 15:12:49 -07:00
chenyu
126fcf4129
clean up AMD_LLVM in tests ( #11021 )
2025-06-28 22:45:47 -04:00
George Hotz
be53ef4f0a
rename DEFINE_ACC -> DEFINE_REG ( #11006 )
...
* rename DEFINE_ACC -> DEFINE_REG
* add CMPEQ to groupops
2025-06-27 11:09:25 -07:00
George Hotz
5a1911b7c4
apply the global dims late ( #11002 )
...
* apply the global dims late [pr]
* late gpudims
* tests passing
* remove the random local_dims inc
* simpler
2025-06-27 09:54:34 -07:00
George Hotz
b4eb876d5a
kernel.py no longer permutes reduce axis [pr] ( #10968 )
...
* kernel.py no longer permutes reduce axis [pr]
* delete tests that handcode uops
* regen of sops is broken...
* put import back
* just remove that
* disable those tests
2025-06-26 17:44:58 -07:00
Ignacio Sica
579194f523
remove some linearize calls from tests 2 [pr] ( #10992 )
...
* refactor count_float4 to take uops as input instead of kernel
* remove some calls to linearize in test_linearizer
* remove some more calls
* remove one more call
2025-06-26 18:22:27 -03:00
Ignacio Sica
21f1c4cc09
remove some linearize calls from tests [pr] ( #10978 )
...
* remove some linearize calls from tests
speed_compare_cuda_ptx
test_uop_spec
test_linearizer
test_uops
test_winograd
* more clear assert message
2025-06-25 12:37:17 -07:00
Ignacio Sica
98d2cde293
revert tc_group feature ( #10971 )
2025-06-24 20:58:13 -07:00
George Hotz
8a65720528
hotfix: disable test_tensor_core_opts_group test on real metal
2025-06-24 15:21:33 -07:00
George Hotz
8743ca40e2
force reduce to be in axis order ( #10837 )
...
* force reduce to be in axis order
* disable rule causing loop
* disable that rule
* no ra there
* only move non reduce
* fix tests
2025-06-24 13:00:16 -07:00
Ignacio Sica
956a8391a5
minor cleanup on test_tensor_core_opts tests ( #10924 )
...
* minor cleanup on test_tensor_core_opts tests
Tests now notify when skipped
Before, they were silently skipped if the backend didn't have half precision and
accumulation
Also cleaned up atol and rtol setup
* refactor test_tensor_core_opts_group
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-06-23 16:30:21 -07:00
Ignacio Sica
b8d09a1dae
tc with group/grouptop ( #10903 )
2025-06-23 09:58:41 -07:00
George Hotz
92678e59ee
move kernel to opt ( #10899 )
2025-06-20 15:22:28 -07:00
George Hotz
32e9949052
rename lazydata to uop ( #10698 )
2025-06-08 08:42:22 -07:00
qazal
5b59728c75
refactor LOAD(DEFINE_GLOBAL, VIEW) in kernels to LOAD(VIEW(DEFINE_GLOBAL)) ( #10541 )
...
* changes to core tinygrad
* fixups pt1
TC=3
docs/abstractions2.py
IMAGE=2
test_quantize_dsp
test_schedule
* more tests
* green now
* images stay images
2025-05-30 14:27:58 +03:00
qazal
bbf05110a2
use kernelize in TestLinearizer.test_indexing_multireduce [pr] ( #10571 )
2025-05-30 11:27:09 +03:00
qazal
9169dcfb49
do not create kernels with more inputs than the backend allows ( #10510 )
...
* work
* no itertools + top down pass
* clean viz
* python can do that
* webgpu
* gbarrier of gbarrier is gbarrier
* device can be tuple
* bug in toposort
* failing test for gated toposort
* contiguous of gbarrier is gbarrier
* check for binops
* Revert "check for binops"
This reverts commit 53e3cdf720.
* viz + match on gbarrier, self exists by default
* alt
* green now
* cleanup
2025-05-26 18:02:03 +03:00
George Hotz
411392dfb7
move files into uop dir ( #10399 )
...
* move files into uop dir [pr]
* tinygrad.uop is a thing
* fix uop docs, no pr
* fix viz
2025-05-18 11:38:28 -07:00
Ignacio Sica
8f79492c75
fix test_tensor_cores_codegen for ptx renderer ( #10119 )
2025-05-01 21:52:36 -03:00
Ignacio Sica
bf5fb97498
fix AMD_LLVM bf16 tc for gfx1100 ( #10102 )
...
* fix amd_llvm bf16 tc
* cleanup pattern
2025-04-30 20:06:38 -03:00
Ignacio Sica
bda116d773
fix use_tensor_cores propagation ( #10048 )
...
* propagate use_tensor_cores
* add use_tensor_core to arg in test and search
* bugfix
* get TC val from ContextVar in search
* revert minor space change
* add tc emulation test to ci and benchmark
* revert
* revert whitespace change
* remove test for ptx
* add comment and remove llvm test run
2025-04-28 19:30:50 -03:00
George Hotz
4c242b0483
hotfix: tests all pass on metal local
2025-04-28 12:09:00 -04:00
qazal
d13c100981
don't sort dims in verify_sink_dims [pr] ( #10059 )
...
* don't sort dims in verify_sink_dims [pr]
* 1 can exist with n
* put process_replay warn last
* assert shape is the same
* bring that back
2025-04-26 23:24:30 +08:00
Ignacio Sica
76a86735c0
hotfix amd bf16 is supported case ( #10039 )
...
* hotfix amd and amd_llvm
* bf16 not supported in ci
* hotfix amd_llvm is not a device
* remove default
* don't gate on ci and amd_llvm
* minor cleanup
* skip bf16 tc test for amd_llvm
2025-04-24 21:29:27 -03:00
Ignacio Sica
b4f823acbe
fix helper_tc_allclose ( #9606 )
...
* fix helper_tc_allclose
* cleanup
* hotfix
* cleanup
* cleanup
* check real buffer and add cast for bf16
* cleanup
* fix padded for ops_python
* avoid assert on amd emulated tc
* swap dimensions
* revert, should have nothing to do with padded
* revert fix, should not go in this pr
* remove skip
2025-04-24 18:36:40 -03:00
Ignacio Sica
51ca19d061
set test_tensor_cores_padded_amd to expectedFailure ( #10036 )
...
* init
* add expected failure to correctly track progress
* hotfix
* skip for amd_llvm as well
* add skip
* add pr number
* move comment to amd test
* change reason
2025-04-24 17:11:40 -03:00
Ignacio Sica
373ca59b7f
use is_dtype_supported to check dtype support in tc tests ( #10035 )
2025-04-24 14:59:14 -03:00