Commit Graph

225 Commits

Author SHA1 Message Date
George Hotz
afad7d0cd1 remove dtype from range, it will be dtypes.index soon [pr] (#11914)
* remove dtype from range, it will be dtypes.index soon [pr]

* a few more
2025-08-29 09:52:07 -07:00
George Hotz
b9b438c516 small updates from postopt (#11903)
* tests from postopt

* modernize

* skip lin tests

* that's fixed?

* skip, not failure
2025-08-28 12:34:52 -07:00
Ben Waldron
ea1be2e4cd [bounty] Remove using reshape to register symbolic shape (#11771)
* Modify tests and start work towards removing symbolic reshape

* Refactor symbolic reshape

* fix small error

* much cleaner + fix more tests

* Can remove this now

* Update test_symbolic_ops and test_tiny

* Couple more tests

* Unused import

* More tests and add EXPAND to Tensor.empty

* Fix test beam search

* all int

* Fix rangeify by adding shrink

* Remove OOB check and so fix test_symbolic_jit

* test_symbolic_jit doesn't need OOB Context anymore either

* Should remove that test now

* Cleanups part 1

* fix linters

* Final cleanups

* Don't reassign inside for loop

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 12:30:49 -04:00
Sieds Lykles
a286a1a6f7 Fast idiv try removing factors of two before cast (#11824)
* try removing factors of two

* don't return if None

* add test
2025-08-24 20:04:25 +02:00
Sieds Lykles
10540414cd Add Ops.CMPEQ (#10431)
* Add op

* add to GroupOp.ALU

* fix spec

* fix ptx

* temporary pickle by name to see process replay

* add Ops.EQ to binary ops

* Actually rename properly

* add test to assert CMPEQ is being used

* Ops.CMPEQ is automatic cast to bool

* add Ops.CMPEQ to llvm

* add Ops.CMPEQ to llvm
2025-08-10 13:13:16 +02:00
George Hotz
82be8abfd2 move opt under codegen (#11569) 2025-08-07 14:19:17 -07:00
George Hotz
6fd1332763 update some tests for less Kernel (#11543)
* update some tests for less Kernel

* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
108aac8af4 use AddrSpace instead of local (#11314)
* use AddrSpace instead of local

* addrspace in test
2025-07-21 14:00:06 -07:00
qazal
7619bf35e7 cleanup: remove disabled TestIndexingOrdering (#11101)
* cleanup: remove disabled TestIndexingOrdering

* don't import kernelize internals
2025-07-05 18:14:37 +03:00
Ignacio Sica
21f1c4cc09 remove some linearize calls from tests [pr] (#10978)
* remove some linearize calls from tests

speed_compare_cuda_ptx
test_uop_spec
test_linearizer
test_uops
test_winograd

* more clear assert message
2025-06-25 12:37:17 -07:00
George Hotz
b41e0563a3 move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
92678e59ee move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
George Hotz
cba6e15937 split grouper and kernelize [pr] (#10854) 2025-06-17 17:54:20 -07:00
George Hotz
5dc1bc6070 switch get_kernel -> get_program [pr] (#10817)
* switch get_kernel -> get_program [pr]

* fix tests
2025-06-15 12:26:50 -07:00
Sieds Lykles
37d3ca152e Adapt >> for division by power of two to all ints (#10803)
* Change division by power of two to always use shift

* Change test to test int instead of uint

* simplify condition

* add old rule back with comment

* remove import

* use sresolve instead of simplify

* use keyword in simplify instead of sresolve

* webgpu cast y to uint

* remove comment

* explicitly set dtype in wgsl

* without simplify

* undo simplify kwarg

* change test to test both int32 and uint32
2025-06-14 14:55:51 -04:00
George Hotz
a38947b4bb move symbolic and transcendental to uop [pr] (#10771) 2025-06-10 20:51:22 -07:00
George Hotz
32e9949052 rename lazydata to uop (#10698) 2025-06-08 08:42:22 -07:00
Ahmed Harmouche
650404a143 [webgpu] Proper shared mem size for packed types (#10585)
* Proper shared mem size in webgpu

* Add test

* Refactor test
2025-06-01 20:18:33 -04:00
George Hotz
411392dfb7 move files into uop dir (#10399)
* move files into uop dir [pr]

* tinygrad.uop is a thing

* fix uop docs, no pr

* fix viz
2025-05-18 11:38:28 -07:00
George Hotz
603c03bef2 fix tests for rewrite [pr] (#10167)
* fix tests for rewrite [pr]

* cleaner

* delete linearize_uop

* clean up the rest
2025-05-05 19:19:49 -07:00
Sieds Lykles
338f33efae Fast mod (#10055)
* Enable fast mod

* Add test
2025-05-05 09:15:43 -07:00
quortus
5cdc96409e Update outdated renderer.render calls (#10044) 2025-04-26 07:35:19 -04:00
George Hotz
2ed3acd767 toposort is a function [pr] (#10004) 2025-04-23 16:25:03 +01:00
Sieds Lykles
07d1aefaf4 fast idiv (#9755)
* fast idiv with tests and fuzzer

* Add todo comment

* Add env variable to toggle fast_idiv

* Move env check

* Add fuzz fast_idiv to ci

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-04-07 08:32:24 -04:00
qazal
891322fd51 split into grouper.py (#9768)
* split into grouper.py

* update tests

* reorder
2025-04-07 18:40:59 +08:00
qazal
8ddb1357c0 fix UPat.location after pickle (#9763)
* fix UPat.location after pickle [pr]

* named upat test
2025-04-07 15:16:42 +08:00
George Hotz
1714fc3ba4 start work on speed [pr] (#9707)
* fix get_location

* fix get_location try 2

* clean up split_load_store [pr]

* SHR fixup [pr]
2025-04-03 10:39:01 +08:00
George Hotz
3c5161b4cb add validation of the bounds of Ops.INDEX (#9503)
* add validation of the bounds of Ops.INDEX

* do mask properly

* more validation

* correct

* fix gated

* add CAST support to vmin/vmax

* fix ptx and image

* ptx no diff

* upat.index also stays

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2025-03-20 12:15:55 +08:00
qazal
1839e8c9b3 place masks in INDEX for TestGatedStoreRewrite [pr] (#9512) 2025-03-20 09:46:53 +08:00
chenyu
2e7c2780a9 CLANG -> CPU (#9189) 2025-02-20 18:03:09 -05:00
George Hotz
caee42e8a6 Revert "name from uops [pr] (#9151)" (#9154)
This reverts commit 28897be9a2.
2025-02-18 16:06:44 +08:00
George Hotz
28897be9a2 name from uops [pr] (#9151) 2025-02-18 15:52:03 +08:00
George Hotz
a4dab3ec3f add name uop (#9149)
* add name uop, TODO: refactor renderer to use

* renderer uses name uop

* fix tests

* render

* ptx
2025-02-18 15:26:58 +08:00
George Hotz
df3b320f46 rewriter -> devectorizer [pr] (#9147) 2025-02-18 12:42:08 +08:00
George Hotz
1bf66d62cf symbolic gets its own file [pr] (#9132) 2025-02-17 18:55:21 +08:00
quortus
5bdf0c7951 Bitcast constant folding 2.0 (#9089)
* Prevent const folding in test_payne_hanek_reduction

* Do not use list as a default parameter

* Bitcast constant folding

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-17 18:08:20 +08:00
George Hotz
9289425170 add ast to ProgramSpec + pre matcher [pr] (#9128)
* add ast to ProgramSpec + pre matcher [pr]

* cleaner cast + test fix
2025-02-17 16:39:14 +08:00
qazal
c80603285e bring back some things from the fix_kernel_ops diff [pr] (#9027)
* bring fix_kernel_ops back [pr]

* fix
2025-02-11 14:20:31 +01:00
George Hotz
fb698920f1 revert scheduler change (#9019)
* Revert "cleanup ast rewriter [pr] (#9012)"

This reverts commit bf0bcb2d5a.

* Revert "kernel op cleanups + use ScheduleItem [pr] (#9009)"

This reverts commit c52cd2b437.

* Revert "construct the schedule sink 2 (#8925)"

This reverts commit cfd3db7862.
2025-02-11 11:34:12 +08:00
qazal
bf0bcb2d5a cleanup ast rewriter [pr] (#9012) 2025-02-10 19:07:59 +01:00
qazal
b17ec42b56 remove const_arg (#9002)
* remove const_arg

* use -m pytest

* remove test_const_arg test, variable arg on CONST does not exist.

* use base in test_const_dtype
2025-02-10 12:45:11 +01:00
qazal
fd9f9ec772 realized base tensors become RESHAPE(BUFFER) [pr] (#8994) 2025-02-10 10:17:54 +01:00
chenyu
a092b6395d Tuple -> tuple, List -> list [pr] (#8936) 2025-02-06 14:21:19 -05:00
eliotgolding
bb5ded85cc Don't rewrite idiv to rshift when numerator is negative (#8885)
* more conditions for shift rewrite mul/idiv

* make ptx test uint so the new condition is true

* delete idiv test

* rewrite to 0 is wrong for idiv, as denominator is cast to 0 before division

* mul/div by 2**(large count) is unsupported anyway
2025-02-05 07:47:33 +08:00
Ali Ladjevardi
6e523e4d17 Remove size arg from DEFINE_LOCAL [pr] (#8845)
* remove size arg from DEFINE_LOCAL

* make mypy happy

* whitespace

* don't change code in extra

* revert to temp1 to pass pr
2025-02-02 19:47:32 +08:00
George Hotz
643c09a6c6 tensor uop spec should be in spec.py [pr] (#8827)
* tensor uop spec should be in spec.py [pr]

* err, spec.py

* print uops can stay
2025-01-31 13:54:04 +08:00
qazal
5643429c17 give BUFFER UOp a ShapeTracker [pr] (#8811)
* give BUFFER UOp a ShapeTracker [pr]

* move that

* update contiguous

* test_advancedindex should use movement ops
2025-01-30 22:33:32 +02:00
qazal
ba17786068 do not construct unmasked VALID (#8759)
* new lines that exist in codegen/ops

* update tests

* update sops.gz (13071 -> 13070 asts)

* fix viz too

* remove that TODO

* diff pruning

* mask assert + device

* work

* diff pruning

* re: fix viz too

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-28 20:51:21 +02:00
qazal
3417bc1814 fix ShapeTracker spec for const [pr] (#8791) 2025-01-28 19:53:36 +02:00
qazal
aefbc2637f test fixups from unmasked valid deletion [pr] (#8776) 2025-01-28 09:23:30 +02:00