chenyu
0e266f376c
ops_gpu -> ops_cl (#12103)
2025-09-10 15:15:48 -04:00
nimlgen
551560b87c
do not use getenv('PTX') in tests (#12095)
* test without ptx
* fix tests
* fix test
* linters
2025-09-10 14:04:07 +03:00
Sieds Lykles
c6c16b2946
var_vals uses str for var (#12011)
* var_vals is str,int
* remove imports
* remove print
* fix test
* change var_vals in hcq
* update test_hcq
* fix multitensor _device_num var
* fix syminfer test
* shorten line
* p.vars stays list[Variable]
* shorten line
* vars is back to tuple[Variable, ...]
* change var_vals in extra
* change var_vals from shapetracker
* var_vals is str:int
* fix signature
2025-09-06 04:16:12 +02:00
George Hotz
ee4f696086
delete more tests (#12043)
* delete more tests
* delete and simplify
* flaky on windows
* a few more, those remained
2025-09-05 15:31:30 -07:00
Sieds Lykles
572a3c15c6
Move Ops.SPECIAL arg to src (#11918)
* initial moving bound to src
* arg to src
* remove import
* fixup linearizer
* arg to src
* fix test_uop_graph
* fix more tests
* fix python renderer
* get const value from const uop
* ssimplify uop estimates
* fix webgpu locals
* fix old test
* gate Ops.SPECIAL in linearizer
* use ssimplify() for local/global_size
* remove toposort gate_parents_instead_of_self
* fix rendering in comment
* cleanup
* rename and add comments
* add BottomUpGate with test
2025-09-04 09:31:44 +02:00
Sieds Lykles
d1d0960e6e
remove intermediate cast using bounds - weaker pattern (#11974)
2025-09-03 06:24:40 +02:00
Sieds Lykles
d9560a631c
remove cast between ints if safe (#11946)
2025-09-01 05:56:49 +02:00
Sieds Lykles
f32f3464d6
Can safe cast from certain ints to floats (#11941)
* add rule
* add some tests
* prevent infinite loop with bfloat16
* add some ints to double and float can_safe_cast
* add tests
2025-09-01 00:51:24 +02:00
Sieds Lykles
1c6e43c203
Double cast is one cast if intermediate cast is safe (#11939)
* add rule
* add some tests
* prevent infinite loop with bfloat16
* prevent more infinite rewrite
2025-09-01 00:36:29 +02:00
George Hotz
afad7d0cd1
remove dtype from range, it will be dtypes.index soon [pr] (#11914)
* remove dtype from range, it will be dtypes.index soon [pr]
* a few more
2025-08-29 09:52:07 -07:00
George Hotz
b9b438c516
small updates from postopt (#11903)
* tests from postopt
* modernize
* skip lin tests
* that's fixed?
* skip, not failure
2025-08-28 12:34:52 -07:00
Ben Waldron
ea1be2e4cd
[bounty] Remove using reshape to register symbolic shape (#11771)
* Modify tests and start work towards removing symbolic reshape
* Refactor symbolic reshape
* fix small error
* much cleaner + fix more tests
* Can remove this now
* Update test_symbolic_ops and test_tiny
* Couple more tests
* Unused import
* More tests and add EXPAND to Tensor.empty
* Fix test beam search
* all int
* Fix rangeify by adding shrink
* Remove OOB check and so fix test_symbolic_jit
* test_symbolic_jit doesn't need OOB Context anymore either
* Should remove that test now
* Cleanups part 1
* fix linters
* Final cleanups
* Don't reassign inside for loop
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 12:30:49 -04:00
Sieds Lykles
a286a1a6f7
Fast idiv try removing factors of two before cast (#11824)
* try removing factors of two
* dont return if None
* add test
2025-08-24 20:04:25 +02:00
Sieds Lykles
10540414cd
Add Ops.CMPEQ (#10431)
* Add op
* add to Groupop.ALU
* fix spec
* fix ptx
* temporary pickle by name to see process replay
* add Ops.EQ to binary ops
* Actually rename properly
* add test to assert CMPEQ is being used
* Ops.CMPEQ is automatic cast to bool
* add Ops.CMPEQ to llvm
* add Ops.CMPEQ to llvm
2025-08-10 13:13:16 +02:00
George Hotz
82be8abfd2
move opt under codegen (#11569)
2025-08-07 14:19:17 -07:00
George Hotz
6fd1332763
update some tests for less Kernel (#11543)
* update some tests for less Kernel
* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
108aac8af4
use AddrSpace instead of local (#11314)
* use AddrSpace instead of local
* addrspace in test
2025-07-21 14:00:06 -07:00
qazal
7619bf35e7
cleanup: remove disabled TestIndexingOrdering (#11101)
* cleanup: remove disabled TestIndexingOrdering
* don't import kernelize internals
2025-07-05 18:14:37 +03:00
Ignacio Sica
21f1c4cc09
remove some linearize calls from tests [pr] (#10978)
* remove some linearize calls from tests
speed_compare_cuda_ptx
test_uop_spec
test_linearizer
test_uops
test_winograd
* more clear assert message
2025-06-25 12:37:17 -07:00
George Hotz
b41e0563a3
move stuff to kernelize folder (#10902)
* move stuff to kernelize folder
* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
92678e59ee
move kernel to opt (#10899)
2025-06-20 15:22:28 -07:00
George Hotz
cba6e15937
split grouper and kernelize [pr] (#10854)
2025-06-17 17:54:20 -07:00
George Hotz
5dc1bc6070
switch get_kernel -> get_program [pr] (#10817)
* switch get_kernel -> get_program [pr]
* fix tests
2025-06-15 12:26:50 -07:00
Sieds Lykles
37d3ca152e
Adapt >> for division by power of two to all ints (#10803)
* Change division by power of two to always use shift
* Change test to test int instead of uint
* simplify condition
* add old rule back with comment
* remove import
* use sresolve instead of simplify
* use keyword in simplify instead of sresolve
* webgpu cast y to uint
* remove comment
* explicitly set dtype in wgsl
* without simplify
* undo simplify kwarg
* change test to test both int32 and uint32
2025-06-14 14:55:51 -04:00
George Hotz
a38947b4bb
move symbolic and transcendental to uop [pr] (#10771)
2025-06-10 20:51:22 -07:00
George Hotz
32e9949052
rename lazydata to uop (#10698)
2025-06-08 08:42:22 -07:00
Ahmed Harmouche
650404a143
[webgpu] Proper shared mem size for packed types (#10585)
* Proper shared mem size in webgpu
* Add test
* Refactor test
2025-06-01 20:18:33 -04:00
George Hotz
411392dfb7
move files into uop dir (#10399)
* move files into uop dir [pr]
* tinygrad.uop is a thing
* fix uop docs, no pr
* fix viz
2025-05-18 11:38:28 -07:00
George Hotz
603c03bef2
fix tests for rewrite [pr] (#10167)
* fix tests for rewrite [pr]
* cleaner
* delete linearize_uop
* clean up the rest
2025-05-05 19:19:49 -07:00
Sieds Lykles
338f33efae
Fast mod (#10055)
* Enable fast mod
* Add test
2025-05-05 09:15:43 -07:00
quortus
5cdc96409e
Update outdated renderer.render calls (#10044)
2025-04-26 07:35:19 -04:00
George Hotz
2ed3acd767
toposort is a function [pr] (#10004)
2025-04-23 16:25:03 +01:00
Sieds Lykles
07d1aefaf4
fast idiv (#9755)
* fast idiv with tests and fuzzer
* Add todo comment
* Add env variable to toggle fast_idiv
* Move env check
* Add fuzz fast_idiv to ci
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-04-07 08:32:24 -04:00
qazal
891322fd51
split into grouper.py (#9768)
* split into grouper.py
* update tests
* reorder
2025-04-07 18:40:59 +08:00
qazal
8ddb1357c0
fix UPat.location after pickle (#9763)
* fix UPat.location after pickle [pr]
* named upat test
2025-04-07 15:16:42 +08:00
George Hotz
1714fc3ba4
start work on speed [pr] (#9707)
* fix get_location
* fix get_location try 2
* clean up split_load_store [pr]
* SHR fixup [pr]
2025-04-03 10:39:01 +08:00
George Hotz
3c5161b4cb
add validation of the bounds of Ops.INDEX (#9503)
* add validation of the bounds of Ops.INDEX
* do mask properly
* more validation
* correct
* fix gated
* add CAST support to vmin/vmax
* fix ptx and image
* ptx no diff
* upat.index also stays
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2025-03-20 12:15:55 +08:00
qazal
1839e8c9b3
place masks in INDEX for TestGatedStoreRewrite [pr] (#9512)
2025-03-20 09:46:53 +08:00
chenyu
2e7c2780a9
CLANG -> CPU (#9189)
2025-02-20 18:03:09 -05:00
George Hotz
caee42e8a6
Revert "name from uops [pr] (#9151)" (#9154)
This reverts commit 28897be9a2.
2025-02-18 16:06:44 +08:00
George Hotz
28897be9a2
name from uops [pr] (#9151)
2025-02-18 15:52:03 +08:00
George Hotz
a4dab3ec3f
add name uop (#9149)
* add name uop, TODO: refactor renderer to use
* renderer uses name uop
* fix tests
* render
* ptx
2025-02-18 15:26:58 +08:00
George Hotz
df3b320f46
rewriter -> devectorizer [pr] (#9147)
2025-02-18 12:42:08 +08:00
George Hotz
1bf66d62cf
symbolic gets its own file [pr] (#9132)
2025-02-17 18:55:21 +08:00
quortus
5bdf0c7951
Bitcast constant folding 2.0 (#9089)
* Prevent const folding in test_payne_hanek_reduction
* Do not use list as a default parameter
* Bitcast constant folding
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-17 18:08:20 +08:00
George Hotz
9289425170
add ast to ProgramSpec + pre matcher [pr] (#9128)
* add ast to ProgramSpec + pre matcher [pr]
* cleaner cast + test fix
2025-02-17 16:39:14 +08:00
qazal
c80603285e
bring back some things from the fix_kernel_ops diff [pr] (#9027)
* bring fix_kernel_ops back [pr]
* fix
2025-02-11 14:20:31 +01:00
George Hotz
fb698920f1
revert scheduler change (#9019)
* Revert "cleanup ast rewriter [pr] (#9012)"
This reverts commit bf0bcb2d5a.
* Revert "kernel op cleanups + use ScheduleItem [pr] (#9009)"
This reverts commit c52cd2b437.
* Revert "construct the schedule sink 2 (#8925)"
This reverts commit cfd3db7862.
2025-02-11 11:34:12 +08:00
qazal
bf0bcb2d5a
cleanup ast rewriter [pr] (#9012)
2025-02-10 19:07:59 +01:00
qazal
b17ec42b56
remove const_arg (#9002)
* remove const_arg
* use -m pytest
* remove test_const_arg test, variable arg on CONST does not exist.
* use base in test_const_dtype
2025-02-10 12:45:11 +01:00