George Hotz
2da02f1ae1
add loads at the end ( #12988 )
...
* add loads at the end
* simpler
* late load
* tests passing
* fix matvec
* spec test passes
* fix where on load
* fix abs2
* fix more tests
2025-10-30 10:42:19 +08:00
Sieds Lykles
9f39f6391c
shared_codegen_spec and fix index spec ( #12967 )
...
* split shared_codegen_spec and fix index
* add VCONST to program_spec and move index to shared_codegen_spec
* working ignore_oob=0
* cleanup
* fix spec
* undo that
* move barrier and special earlier
* fix more spec issues
* more updates
* remove special from program_spec
* cleanup and fixes
* move more to shared
* special is not in shared_spec
* some comments
* dont do bounds check there
2025-10-29 09:14:11 +01:00
George Hotz
5e01cc299b
zero len ranges fail ( #12974 )
...
* zero len ranges fail
* fix Python backend
* fix llvm
* fix ptx
* yolo fix nir
* this works...
* always store...
* always store...
* Revert "always store..."
This reverts commit 0816cf344d.
2025-10-28 22:49:55 +08:00
George Hotz
701a632907
move VECTORIZE/CONST ( #12942 )
2025-10-27 17:37:13 +08:00
George Hotz
804133cffd
rename RECIP to RECIPROCAL ( #12939 )
2025-10-27 16:53:13 +08:00
George Hotz
8a941d95a4
SPEC=2 is full spec, SPEC=1 is default ( #12910 )
...
* SPEC=1 passes all tests
* just use SPEC, not __debug__
2025-10-25 11:10:43 +08:00
George Hotz
e85cee0aad
flip Ops.END srcs ( #12882 )
...
* flip Ops.END srcs
* backward
* late end split
2025-10-23 12:47:50 +08:00
George Hotz
74b4cfe44b
Ops.GROUP + range check ( #12880 )
...
* simpler
* fix that
* Ops.GROUP + range check
* fix bugs
* fix linter
* fix test
2025-10-23 12:05:21 +08:00
George Hotz
7762b3558b
clean up the spec ( #12868 )
...
* tighten up the spec
* move validate into a different file
* that moved to validate
* after(barr)
2025-10-22 19:50:42 +08:00
George Hotz
726988fa4b
late ifs try 2 ( #12865 )
...
* late ifs try 2
* fix image
* fix that test
* panic
* ptx fixups
* preserve toposort
* those pass locally
* Revert "those pass locally"
This reverts commit 063409f828.
* no ls
* make that explicit
2025-10-22 18:49:27 +08:00
George Hotz
92778c7a8b
rename opts to ren, add store ranges back ( #12856 )
...
* rename opts to ren
* fix docs and bring store back
2025-10-22 09:15:38 +08:00
qazal
b6835f4134
remove Ops.VIEW and related UOp methods ( #12522 )
...
* remove Ops.VIEW and related UOp methods
* update abstractions2.py
* no ShapeTrackers in abstractions2.py
* it's a size 1
2025-10-08 14:47:02 +03:00
qazal
a7cb80bfab
use recursive_property in UOp device ( #12477 )
...
* simple failing test with RecursionError
* switch to @recursive_property
* merge 2
* diff
2025-10-08 06:15:05 +03:00
George Hotz
945cc46475
delete children tracking from uop ( #12491 )
...
* delete children tracking from uop
* uop children no longer exists
* no tracked children
* that test is flaky too
2025-10-08 09:04:14 +08:00
George Hotz
403fdfcfd4
check spec in test, cleanup vectorize render ( #12484 )
2025-10-07 17:05:50 +08:00
qazal
a95159d579
remove TestShapeSpec, it relies on ShapeTracker [pr] ( #12369 )
2025-09-30 14:20:35 +03:00
chenyu
0e266f376c
ops_gpu -> ops_cl ( #12103 )
2025-09-10 15:15:48 -04:00
nimlgen
551560b87c
do not use getenv('PTX') in tests ( #12095 )
...
* test without ptx
* fix tests
* fix test
* linters
2025-09-10 14:04:07 +03:00
Sieds Lykles
c6c16b2946
var_vals uses str for var ( #12011 )
...
* var_vals is str,int
* remove imports
* remove print
* fix test
* change var_vals in hcq
* update test_hcq
* fix multitensor _device_num var
* fix syminfer test
* shorten line
* p.vars stays list[Variable]
* shorten line
* vars is back to tuple[Variable, ...]
* change var_vals in extra
* change var_vals from shapetracker
* var_vals is str:int
* fix signature
2025-09-06 04:16:12 +02:00
George Hotz
ee4f696086
delete more tests ( #12043 )
...
* delete more tests
* delete and simplify
* flaky on windows
* a few more, those remained
2025-09-05 15:31:30 -07:00
Sieds Lykles
572a3c15c6
Move Ops.SPECIAL arg to src ( #11918 )
...
* initial moving bound to src
* arg to src
* remove import
* fixup linearizer
* arg to src
* fix test_uop_graph
* fix more tests
* fix python renderer
* get const value from const uop
* ssimplify uop estimates
* fix webgpu locals
* fix old test
* gate Ops.SPECIAL in linearizer
* use ssimplify() for local/global_size
* remove toposort gate_parents_instead_of_self
* fix rendering in comment
* cleanup
* rename and add comments
* add BottomUpGate with test
2025-09-04 09:31:44 +02:00
Sieds Lykles
d1d0960e6e
remove intermediate cast using bounds - weaker pattern ( #11974 )
2025-09-03 06:24:40 +02:00
Sieds Lykles
d9560a631c
remove cast between ints if safe ( #11946 )
2025-09-01 05:56:49 +02:00
Sieds Lykles
f32f3464d6
Can safe cast from certain ints to floats ( #11941 )
...
* add rule
* add some tests
* prevent infinite loop with bfloat16
* add some ints to double and float can_safe_cast
* add tests
2025-09-01 00:51:24 +02:00
Sieds Lykles
1c6e43c203
Double cast is one cast if intermediate cast is safe ( #11939 )
...
* add rule
* add some tests
* prevent infinite loop with bfloat16
* prevent more infinite rewrite
2025-09-01 00:36:29 +02:00
George Hotz
afad7d0cd1
remove dtype from range, it will be dtypes.index soon [pr] ( #11914 )
...
* remove dtype from range, it will be dtypes.index soon [pr]
* a few more
2025-08-29 09:52:07 -07:00
George Hotz
b9b438c516
small updates from postopt ( #11903 )
...
* tests from postopt
* modernize
* skip lin tests
* that's fixed?
* skip, not failure
2025-08-28 12:34:52 -07:00
Ben Waldron
ea1be2e4cd
[bounty] Remove using reshape to register symbolic shape ( #11771 )
...
* Modify tests and start work towards removing symbolic reshape
* Refactor symbolic reshape
* fix small error
* much cleaner + fix more tests
* Can remove this now
* Update test_symbolic_ops and test_tiny
* Couple more tests
* Unused import
* More tests and add EXPAND to Tensor.empty
* Fix test beam search
* all int
* Fix rangeify by adding shrink
* Remove OOB check and so fix test_symbolic_jit
* test_symbolic_jit doesn't need OOB Context anymore either
* Should remove that test now
* Cleanups part 1
* fix linters
* Final cleanups
* Don't reassign inside for loop
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 12:30:49 -04:00
Sieds Lykles
a286a1a6f7
Fast idiv try removing factors of two before cast ( #11824 )
...
* try removing factors of two
* dont return if None
* add test
2025-08-24 20:04:25 +02:00
Sieds Lykles
10540414cd
Add Ops.CMPEQ ( #10431 )
...
* Add op
* add to GroupOp.ALU
* fix spec
* fix ptx
* temporary pickle by name to see process replay
* add Ops.EQ to binary ops
* Actually rename properly
* add test to assert CMPEQ is being used
* Ops.CMPEQ is automatic cast to bool
* add Ops.CMPEQ to llvm
* add Ops.CMPEQ to llvm
2025-08-10 13:13:16 +02:00
George Hotz
82be8abfd2
move opt under codegen ( #11569 )
2025-08-07 14:19:17 -07:00
George Hotz
6fd1332763
update some tests for less Kernel ( #11543 )
...
* update some tests for less Kernel
* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
108aac8af4
use AddrSpace instead of local ( #11314 )
...
* use AddrSpace instead of local
* addrspace in test
2025-07-21 14:00:06 -07:00
qazal
7619bf35e7
cleanup: remove disabled TestIndexingOrdering ( #11101 )
...
* cleanup: remove disabled TestIndexingOrdering
* don't import kernelize internals
2025-07-05 18:14:37 +03:00
Ignacio Sica
21f1c4cc09
remove some linearize calls from tests [pr] ( #10978 )
...
* remove some linearize calls from tests
speed_compare_cuda_ptx
test_uop_spec
test_linearizer
test_uops
test_winograd
* more clear assert message
2025-06-25 12:37:17 -07:00
George Hotz
b41e0563a3
move stuff to kernelize folder ( #10902 )
...
* move stuff to kernelize folder
* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
92678e59ee
move kernel to opt ( #10899 )
2025-06-20 15:22:28 -07:00
George Hotz
cba6e15937
split grouper and kernelize [pr] ( #10854 )
2025-06-17 17:54:20 -07:00
George Hotz
5dc1bc6070
switch get_kernel -> get_program [pr] ( #10817 )
...
* switch get_kernel -> get_program [pr]
* fix tests
2025-06-15 12:26:50 -07:00
Sieds Lykles
37d3ca152e
Adapt >> for division by power of two to all ints ( #10803 )
...
* Change division by power of two to always use shift
* Change test to test int instead of uint
* simplify condition
* add old rule back with comment
* remove import
* use sresolve instead of simplify
* use keyword in simplify instead of sresolve
* webgpu cast y to uint
* remove comment
* explicitly set dtype in wgsl
* without simplify
* undo simplify kwarg
* change test to test both int32 and uint32
2025-06-14 14:55:51 -04:00
George Hotz
a38947b4bb
move symbolic and transcendental to uop [pr] ( #10771 )
2025-06-10 20:51:22 -07:00
George Hotz
32e9949052
rename lazydata to uop ( #10698 )
2025-06-08 08:42:22 -07:00
Ahmed Harmouche
650404a143
[webgpu] Proper shared mem size for packed types ( #10585 )
...
* Proper shared mem size in webgpu
* Add test
* Refactor test
2025-06-01 20:18:33 -04:00
George Hotz
411392dfb7
move files into uop dir ( #10399 )
...
* move files into uop dir [pr]
* tinygrad.uop is a thing
* fix uop docs, no pr
* fix viz
2025-05-18 11:38:28 -07:00
George Hotz
603c03bef2
fix tests for rewrite [pr] ( #10167 )
...
* fix tests for rewrite [pr]
* cleaner
* delete linearize_uop
* clean up the rest
2025-05-05 19:19:49 -07:00
Sieds Lykles
338f33efae
Fast mod ( #10055 )
...
* Enable fast mod
* Add test
2025-05-05 09:15:43 -07:00
quortus
5cdc96409e
Update outdated renderer.render calls ( #10044 )
2025-04-26 07:35:19 -04:00
George Hotz
2ed3acd767
toposort is a function [pr] ( #10004 )
2025-04-23 16:25:03 +01:00
Sieds Lykles
07d1aefaf4
fast idiv ( #9755 )
...
* fast idiv with tests and fuzzer
* Add todo comment
* Add env variable to toggle fast_idiv
* Move env check
* Add fuzz fast_idiv to ci
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-04-07 08:32:24 -04:00
qazal
891322fd51
split into grouper.py ( #9768 )
...
* split into grouper.py
* update tests
* reorder
2025-04-07 18:40:59 +08:00