Commit Graph

262 Commits

Author SHA1 Message Date
Clément Verrier
ae013beab8 handle empty VECTORIZE in UOp.render() (#13847)
`UOp.render()` crashed with `IndexError: tuple index out of range` when
the UOp graph contained a `VECTORIZE` with empty `src=()`. This occurs
when reshaping to scalar shape `()`, e.g., `Tensor.ones(4).sum()`.

The bug was in the renderer's VECTORIZE pattern: `all_same(())` returns
`True` (vacuous truth), causing the code to access `x.src[0]` on an
empty tuple.
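
The failure mode described above can be reproduced outside the codebase. The following is a minimal sketch, not the actual renderer code: `all_same` is written here as a one-line stand-in, and `render_vectorize_unguarded` / `render_vectorize_guarded` are hypothetical names that take a plain tuple of strings rather than a UOp. It only illustrates why a vacuously true check leads to indexing an empty tuple, and one way a length guard avoids it.

```python
def all_same(items):
    # all() over an empty iterable is True, so all_same(()) is vacuously True.
    # items[0] is never evaluated for an empty tuple because nothing is iterated.
    return all(x == items[0] for x in items)


def render_vectorize_unguarded(src):
    # Hypothetical renderer step: collapse identical sources into a broadcast.
    if all_same(src):
        return f"broadcast({src[0]})"  # raises IndexError when src == ()
    return "{" + ",".join(src) + "}"


def render_vectorize_guarded(src):
    # Same step, but the broadcast shortcut is only taken for non-empty sources.
    if len(src) > 0 and all_same(src):
        return f"broadcast({src[0]})"
    return "{" + ",".join(src) + "}"
```

With an empty `src`, the unguarded version reaches `src[0]` and raises `IndexError`, while the guarded version falls through to the generic path.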

- Fix `IndexError` when calling `UOp.render()` on graphs containing
  empty `VECTORIZE` nodes.
- Add test for empty `VECTORIZE` rendering.
2025-12-27 10:09:39 -05:00
George Hotz
8dcba2e2cc no full_rewrite [pr] (#13809)
* no full_rewrite [pr]

* fix

* fix docs
2025-12-22 23:20:01 -05:00
George Hotz
744af193f0 remove ScheduleItem and merge it with ExecItem (#13759)
* remove ExecItem and merge it with ScheduleItem

* less diff

* fix issues

* min diff

* don't change bufs in _lower

* min diff

* update

* revert

* fixes

* diff
2025-12-19 17:04:24 -04:00
Christopher Milan
97103831c5 Revert "remove image from BufferSpec (#13636)" (#13761)
This reverts commit 2571a1eb47.
2025-12-19 13:54:36 -05:00
Christopher Milan
2571a1eb47 remove image from BufferSpec (#13636)
* remove image from BufferSpec

* cl tiny_gemm (64) works

* mypy

* padding

* openpilot CL

* reshape properly

* remove extra qcom checks

* pad output

* mypy

* update compile test

* move undo

* TestImageCopy valid images

* TestImageRealization valid images

* TestImageDType valid images

* cleanups

* test_renderer_failures

* ruff

* mypy

* simplify ops_qcom

* bump step time
2025-12-19 13:41:20 -05:00
kamilisjon
3d76ef9ba8 Update tests (#13479) 2025-11-28 18:35:28 -08:00
George Hotz
957cf717e7 Python speed (#13355)
* skip process replay by default

* work on python speed

* fix names of rewrite rules

* fix that test
2025-11-19 09:03:00 -08:00
George Hotz
2d4f01fda0 move mixins to mixin dir (#13105)
* move mixins to mixin dir

* math
2025-11-05 10:18:33 -08:00
chenyu
ca17718b6d remove symbolic_flat (#13083)
* remove symbolic_flat

some kernels are different but sometimes it's better so not clear, will merge as long as benchmark passes

* test_location
2025-11-03 17:25:21 -05:00
George Hotz
bc178d14a9 matmul example on metal showing off tensor core (#13033)
* matmul example on metal showing off tensor core

* flip the args of placeholder

* mat_idx

* imp
2025-10-31 19:40:36 +08:00
George Hotz
e456f2cb1e more uop programs (#13007)
* more uop program

* test_matmul_relu

* tests fix
2025-10-30 14:57:59 +08:00
George Hotz
e64d4b3b44 uops programs (#13005)
* uops programs

* work

* work

* more syntax

* more syntax

* comments
2025-10-30 12:28:10 +08:00
George Hotz
2da02f1ae1 add loads at the end (#12988)
* add loads at the end

* simpler

* late load

* tests passing

* fix matvec

* spec test passes

* fix where on load

* fix abs2

* fix more tests
2025-10-30 10:42:19 +08:00
Sieds Lykles
9f39f6391c shared_codegen_spec and fix index spec (#12967)
* split shared_codegen_spec and fix index

* add VCONST to program_spec and move index to shared_codegen_spec

* working ignore_oob=0

* cleanup

* fix spec

* undo that

* move barrier and special earlier

* fix more spec issues

* more updates

* remove special from program_spec

* cleanup and fixes

* move more to shared

* special is not in shared_spec

* some comments

* dont do bounds check there
2025-10-29 09:14:11 +01:00
George Hotz
5e01cc299b zero len ranges fail (#12974)
* zero len ranges fail

* fix Python backend

* fix llvm

* fix ptx

* yolo fix nir

* this works...

* always store...

* always store...

* Revert "always store..."

This reverts commit 0816cf344d.
2025-10-28 22:49:55 +08:00
George Hotz
701a632907 move VECTORIZE/CONST (#12942) 2025-10-27 17:37:13 +08:00
George Hotz
804133cffd rename RECIP to RECIPROCAL (#12939) 2025-10-27 16:53:13 +08:00
George Hotz
8a941d95a4 SPEC=2 is full spec, SPEC=1 is default (#12910)
* SPEC=1 passes all tests

* just use SPEC, not __debug__
2025-10-25 11:10:43 +08:00
George Hotz
e85cee0aad flip Ops.END srcs (#12882)
* flip Ops.END srcs

* backward

* late end split
2025-10-23 12:47:50 +08:00
George Hotz
74b4cfe44b Ops.GROUP + range check (#12880)
* simpler

* fix that

* Ops.GROUP + range check

* fix bugs

* fix linter

* fix test
2025-10-23 12:05:21 +08:00
George Hotz
7762b3558b clean up the spec (#12868)
* tighten up the spec

* move validate into a different file

* that moved to validate

* after(barr)
2025-10-22 19:50:42 +08:00
George Hotz
726988fa4b late ifs try 2 (#12865)
* late ifs try 2

* fix image

* fix that test

* panic

* ptx fixups

* preserve toposort

* those pass locally

* Revert "those pass locally"

This reverts commit 063409f828.

* no ls

* make that explicit
2025-10-22 18:49:27 +08:00
George Hotz
92778c7a8b rename opts to ren, add store ranges back (#12856)
* rename opts to ren

* fix docs and bring store back
2025-10-22 09:15:38 +08:00
qazal
b6835f4134 remove Ops.VIEW and related UOp methods (#12522)
* remove Ops.VIEW and related UOp methods

* update abstractions2.py

* no ShapeTrackers in abstractions2.py

* it's a size 1
2025-10-08 14:47:02 +03:00
qazal
a7cb80bfab use recursive_property in UOp device (#12477)
* simple failing test with RecursionError

* switch to @recursive_property

* merge 2

* diff
2025-10-08 06:15:05 +03:00
George Hotz
945cc46475 delete children tracking from uop (#12491)
* delete children tracking from uop

* uop children no longer exists

* no tracked children

* that test is flaky too
2025-10-08 09:04:14 +08:00
George Hotz
403fdfcfd4 check spec in test, cleanup vectorize render (#12484) 2025-10-07 17:05:50 +08:00
qazal
a95159d579 remove TestShapeSpec, it relies on ShapeTracker [pr] (#12369) 2025-09-30 14:20:35 +03:00
chenyu
0e266f376c ops_gpu -> ops_cl (#12103) 2025-09-10 15:15:48 -04:00
nimlgen
551560b87c do not use getenv('PTX') in tests (#12095)
* test without ptx

* fix tests

* fix test

* linters
2025-09-10 14:04:07 +03:00
Sieds Lykles
c6c16b2946 var_vals uses str for var (#12011)
* var_vals is str,int

* remove imports

* remove print

* fix test

* change var_vals in hcq

* update test_hcq

* fix multitensor _device_num var

* fix syminfer test

* shorten line

* p.vars stays list[Variable]

* shorten line

* vars is back to tuple[Variable, ...]

* change var_vals in extra

* change var_vals from shapetracker

* var_vals is str:int

* fix signature
2025-09-06 04:16:12 +02:00
George Hotz
ee4f696086 delete more tests (#12043)
* delete more tests

* delete and simplify

* flaky on windows

* a few more, those remained
2025-09-05 15:31:30 -07:00
Sieds Lykles
572a3c15c6 Move Ops.SPECIAL arg to src (#11918)
* initial moving bound to src

* arg to src

* remove import

* fixup linearizer

* arg to src

* fix test_uop_graph

* fix more tests

* fix python renderer

* get const value from const uop

* ssimplify uop estimates

* fix webgpu locals

* fix old test

* gate Ops.SPECIAL in linearizer

* use ssimplify() for local/global_size

* remove toposort gate_parents_instead_of_self

* fix rendering in comment

* cleanup

* rename and add comments

* add BottomUpGate with test
2025-09-04 09:31:44 +02:00
Sieds Lykles
d1d0960e6e remove intermediate cast using bounds - weaker pattern (#11974) 2025-09-03 06:24:40 +02:00
Sieds Lykles
d9560a631c remove cast between ints if safe (#11946) 2025-09-01 05:56:49 +02:00
Sieds Lykles
f32f3464d6 Can safe cast from certain ints to floats (#11941)
* add rule

* add some tests

* prevent infinite loop with bfloat16

* add some ints to double and float can_safe_cast

* add tests
2025-09-01 00:51:24 +02:00
Sieds Lykles
1c6e43c203 Double cast is one cast if intermediate cast is safe (#11939)
* add rule

* add some tests

* prevent infinite loop with bfloat16

* prevent more infinite rewrite
2025-09-01 00:36:29 +02:00
George Hotz
afad7d0cd1 remove dtype from range, it will be dtypes.index soon [pr] (#11914)
* remove dtype from range, it will be dtypes.index soon [pr]

* a few more
2025-08-29 09:52:07 -07:00
George Hotz
b9b438c516 small updates from postopt (#11903)
* tests from postopt

* modernize

* skip lin tests

* that's fixed?

* skip, not failure
2025-08-28 12:34:52 -07:00
Ben Waldron
ea1be2e4cd [bounty] Remove using reshape to register symbolic shape (#11771)
* Modify tests and start work towards removing symbolic reshape

* Refactor symbolic reshape

* fix small error

* much cleaner + fix more tests

* Can remove this now

* Update test_symbolic_ops and test_tiny

* Couple more tests

* Unused import

* More tests and add EXPAND to Tensor.empty

* Fix test beam search

* all int

* Fix rangeify by adding shrink

* Remove OOB check and so fix test_symbolic_jit

* test_symbolic_jit doesn't need OOB Context anymore either

* Should remove that test now

* Cleanups part 1

* fix linters

* Final cleanups

* Don't reassign inside for loop

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 12:30:49 -04:00
Sieds Lykles
a286a1a6f7 Fast idiv try removing factors of two before cast (#11824)
* try removing factors of two

* dont return if None

* add test
2025-08-24 20:04:25 +02:00
Sieds Lykles
10540414cd Add Ops.CMPEQ (#10431)
* Add op

* add to GroupOp.ALU

* fix spec

* fix ptx

* temporary pickle by name to see process replay

* add Ops.EQ to binary ops

* Actually rename properly

* add test to assert CMPEQ is being used

* Ops.CMPEQ is automatic cast to bool

* add Ops.CMPEQ to llvm

* add Ops.CMPEQ to llvm
2025-08-10 13:13:16 +02:00
George Hotz
82be8abfd2 move opt under codegen (#11569) 2025-08-07 14:19:17 -07:00
George Hotz
6fd1332763 update some tests for less Kernel (#11543)
* update some tests for less Kernel

* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
108aac8af4 use AddrSpace instead of local (#11314)
* use AddrSpace instead of local

* addrspace in test
2025-07-21 14:00:06 -07:00
qazal
7619bf35e7 cleanup: remove disabled TestIndexingOrdering (#11101)
* cleanup: remove disabled TestIndexingOrdering

* don't import kernelize internals
2025-07-05 18:14:37 +03:00
Ignacio Sica
21f1c4cc09 remove some linearize calls from tests [pr] (#10978)
* remove some linearize calls from tests

speed_compare_cuda_ptx
test_uop_spec
test_linearizer
test_uops
test_winograd

* more clear assert message
2025-06-25 12:37:17 -07:00
George Hotz
b41e0563a3 move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
92678e59ee move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
George Hotz
cba6e15937 split grouper and kernelize [pr] (#10854) 2025-06-17 17:54:20 -07:00