57 Commits

Author SHA1 Message Date
George Hotz
8dcba2e2cc no full_rewrite [pr] (#13809)
* no full_rewrite [pr]

* fix

* fix docs
2025-12-22 23:20:01 -05:00
George Hotz
744af193f0 remove ScheduleItem and merge it with ExecItem (#13759)
* remove ExecItem and merge it with ScheduleItem

* less diff

* fix issues

* min diff

* don't change bufs in _lower

* min diff

* update

* revert

* fixes

* diff
2025-12-19 17:04:24 -04:00
qazal
366badaa68 require renderer argument in get_program, removes device opening in process replay [pr] (#13524) 2025-12-03 02:05:31 +08:00
George Hotz
2da02f1ae1 add loads at the end (#12988)
* add loads at the end

* simpler

* late load

* tests passing

* fix matvec

* spec test passes

* fix where on load

* fix abs2

* fix more tests
2025-10-30 10:42:19 +08:00
George Hotz
203a93363c Revert "after clean up of locals (#12813)" (#12814)
This reverts commit 5d0d3d7aac.
2025-10-20 19:33:35 +08:00
George Hotz
5d0d3d7aac after clean up of locals (#12813) 2025-10-20 19:24:24 +08:00
chenyu
ae51bdd06a remove trivial use of RANGEIFY flag (#12550)
some tests need update still
2025-10-09 02:29:38 -04:00
George Hotz
44558a37f7 fix some rangeify tests (#12370)
* fix bad range merges

* fix rng

* fix uop gc

* fix some rangeify tests

* now that needs rangeify 2 also
2025-09-30 20:12:08 +08:00
nimlgen
551560b87c do not use getenv('PTX') in tests (#12095)
* test without ptx

* fix tests

* fix test

* linters
2025-09-10 14:04:07 +03:00
George Hotz
ee4f696086 delete more tests (#12043)
* delete more tests

* delete and simplify

* flaky on windows

* a few more, those remained
2025-09-05 15:31:30 -07:00
Sieds Lykles
f5404ca53c Divmod combine - associative variations (#12017)
* add rule and test

* more rules and tests

* add all four variations

* fix test

* test fixed!

* adjust comment

* add new variations

* disable intel tensor core ops count test for bigger_matmul_half
2025-09-05 03:44:02 +02:00
chenyu
edc8b99853 more tests that pass PTX now (#11992) 2025-09-03 21:18:14 -04:00
George Hotz
00391db628 no ast for mem estimate (#11744)
* no ast for mem estimate

* skip for webgpu
2025-08-19 20:18:45 -07:00
George Hotz
82be8abfd2 move opt under codegen (#11569) 2025-08-07 14:19:17 -07:00
George Hotz
6fd1332763 update some tests for less Kernel (#11543)
* update some tests for less Kernel

* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
4fe11725c6 pass through sink arg, update linearizer test (#11536)
* pass through sink arg, update linearizer test

* get_program help

* bump line count

* use new api
2025-08-06 09:48:48 -07:00
chenyu
a0438012af remove Kernel.get_program [pr] (#11203) 2025-07-12 20:50:29 -04:00
George Hotz
92678e59ee move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
George Hotz
411392dfb7 move files into uop dir (#10399)
* move files into uop dir [pr]

* tinygrad.uop is a thing

* fix uop docs, no pr

* fix viz
2025-05-18 11:38:28 -07:00
George Hotz
603c03bef2 fix tests for rewrite [pr] (#10167)
* fix tests for rewrite [pr]

* cleaner

* delete linearize_uop

* clean up the rest
2025-05-05 19:19:49 -07:00
George Hotz
7c33924a50 don't use real_size for mem_bytes [pr] (#10147) 2025-05-03 09:41:21 -04:00
George Hotz
cac8bcf8b5 use Ops.REDUCE (#9721)
* decrease bert python time [pr]

* order copies

* Revert "order copies"

This reverts commit 3f62c8693b.

* rewrite count

* Ops.REDUCE

* acc first in the add chain

* Fix tensor core acc

* arange patterns look good

* fix multireduce gate

* reduce rewrite rule

* bump that to 15 minutes

* multiwmma isn't fusing

* gep through wmma is gep pushing

* bump that timeout too, it's all env setup

* add failing test
2025-04-04 10:14:34 +08:00
chenyu
2e7c2780a9 CLANG -> CPU (#9189) 2025-02-20 18:03:09 -05:00
qazal
cde18fddce fix DEBUG=2 output for copy runners [pr] (#8579)
* fix DEBUG=2 output for copy runners [pr]

* itemsize is constant
2025-01-12 12:03:01 -05:00
qazal
866dfa1f23 create_schedule([x.lazydata]) -> x.schedule() in tests (#8449) 2024-12-31 03:15:52 +08:00
George Hotz
0ad264ed2d new from uops [pr] (#8330)
* new from uops [pr]

* mem_estimate is its own thing
2024-12-18 23:42:58 -08:00
George Hotz
52243b258c move flops_mem to renderer [pr] (#8320) 2024-12-18 12:13:17 -08:00
George Hotz
8f95b578f6 use Estimates class [pr] (#8319)
* use Estimates class [pr]

* frozen dataclass
2024-12-18 10:19:32 -08:00
Carl Basho
630a7f37cf update tests (#7554)
Co-authored-by: John Doe <null@mail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-05 11:35:15 -05:00
George Hotz
c8bf09b7d4 s/UOps/Ops (#7500)
* s/UOps/Ops [pr]

* fix
2024-11-03 11:26:10 +08:00
George Hotz
ee9ef93617 delete old rules [pr] (#7400) 2024-10-30 19:45:04 +08:00
George Hotz
ded1b38b84 minor dtype cleanup [pr] (#7124)
* minor dtype cleanup [pr]

* use ptr() function
2024-10-17 17:41:23 +08:00
George Hotz
e7a0ffe46a break out linearization [pr] (#6994) 2024-10-11 15:27:33 +08:00
George Hotz
b199b699ed use shl everywhere (#6744)
* use shl everywhere

* fix parens

* late patterns

* works as an extra pass

* ptx
2024-09-26 09:59:36 +08:00
George Hotz
0ab06d5840 push geps through wmma (#6559)
* push geps through wmma

* update tests
2024-09-17 14:38:40 +08:00
George Hotz
a2239c812e minimum new style expand (#6534)
* minimum new style expand [run_process_replay]

* float4 folding works

* fix uop graph

* if means or

* dype.count idx overload

* fix test arange

* expand nope

* fix expand contract

* fix amd tensor core

* oh, that's a good test with a real failure

* remove prints

* early reduce

* tomorrow, we remove sorted on expand args

* fix wmma issue

* that makes test_arange pass

* vectorized folding

* no check

* broadcast

* fix clang with self assign rule
2024-09-17 13:02:41 +08:00
CaltropHungerton
002f60b4c3 fix intel wmma flop counting, add flop counting tests for different tensor cores (#6192)
* fix wmma flop counting on intel, add count tests

* half

* add half gemm

* Update test.yml

* one test

* Update test_uops_stats.py

* Update test_uops_stats.py

* Update test_uops_stats.py

* smaller matrix, use unittest skipUnless decorator
2024-08-25 18:37:05 -07:00
George Hotz
16f420f7a7 split full_graph_rewrite and linearize_uop [run_process_replay] (#6215)
* split full_graph_rewrite and linearize_uop

* fix tests

* graph rewrite in test uops

* add types
2024-08-20 20:12:33 -07:00
qazal
5a266d5d0c type verify ImageDType and PtrDType [run_process_replay] (#6137)
* type verify ImageDType and PtrDType [run_process_replay]

* fix tests
2024-08-17 16:37:07 +03:00
George Hotz
912f01ed4b UOpGraph -> linearize_uop [run_process_replay] (#6119) 2024-08-16 19:48:39 -07:00
George Hotz
74ee9febec remove iter from uopgraph (#6110)
* remove iter from uopgraph

* linearize returns uops

* fix tests

* linearize in linearize

* tests fix

* touchup

* test failures
2024-08-16 15:58:29 -07:00
qazal
28c75bf2a6 merge uops with ops (#6111)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-08-16 18:17:57 -04:00
George Hotz
d73bc85ba9 UOpGraph not in renderer or Program [run_process_replay] (#5867)
* UOpGraph not in renderer or Program [run_process_replay]

* fix some tests

* fix ptx
2024-08-01 16:20:30 -07:00
George Hotz
72621d9e7c count the specials in uops [run_process_replay] (#5848)
* count the specials in uops [run_process_replay]

* cleanups
2024-07-31 14:53:18 -07:00
George Hotz
7c4b177e3a add tests for uops stats (#5649)
* add tests for uops stats

* no locals skip is fine

* eh
2024-07-22 21:57:03 -07:00
George Hotz
d0ab20a5e5 careful memory counting (with tests to specify behavior) (#5587) 2024-07-19 11:37:34 -07:00
George Hotz
2de82b8a5d remove get_lazyop_info (#5570)
* don't use get_lazyop_info more

* keep that min

* no ptx for that test
2024-07-19 03:05:33 -07:00
George Hotz
d13654a820 move uopgraph to file [run_process_replay] (#5364)
* move uopgraph to file [run_process_replay]

* fix print tree test
2024-07-10 17:34:50 -07:00
George Hotz
63a8add2c2 move uops add logic to linearize (#4952)
* move logic to linearize

* idk how this should work

* empty
2024-06-14 03:52:37 -07:00
George Hotz
9823752397 make uops.add private (#4950)
* make uops.add private

* modernize all tests
2024-06-14 03:23:25 -07:00