Commit Graph

10172 Commits

Author SHA1 Message Date
George Hotz
e0bc99b6c9 check clSetKernelArg 2025-09-13 16:05:29 +08:00
George Hotz
0695e322a8 fix android cpu device (#12148) 2025-09-13 15:42:04 +08:00
Sieds Lykles
e3a3764917 delete fold_unrolled_divs (#12146) 2025-09-13 03:09:36 +02:00
Sieds Lykles
51ed6e94b2 AxisType __repr__ method (#12145) 2025-09-13 01:15:38 +02:00
Sieds Lykles
0757a9a819 add pytest-timeout of 3 min per item (#12144)
* add pytest-timeout with timeout of 3 min

* func_only
2025-09-13 00:48:41 +02:00
Sieds Lykles
2fc0bd150b Arange overflow raises error and one_hot upcast (#11975)
* add error

* to_dtype

* shorten line

* add test

* upcast one hot dim if it overflows
2025-09-13 00:18:25 +02:00
chenyu
aac3dceaf6 merge two PYTHON backend ci jobs (#12143)
* merge two PYTHON backend ci jobs

and mark anything that takes > 10 in test_ops slow

* two more
2025-09-12 17:36:46 -04:00
ttomsa
a12d0933c1 fix vec dtype in fast idiv (#12080)
* fix

* add vec dtypes to fuzzer

* add vec=False

---------

Co-authored-by: Sieds Lykles <93992551+S-Lykles@users.noreply.github.com>
2025-09-12 23:00:43 +02:00
chenyu
25091951ba update test/models (#12142)
minor fix and run more stuff in tinygrad for speed
2025-09-12 16:43:28 -04:00
Sieds Lykles
62376c8b2b update store load noop pattern to use Invalid (#12141)
* update pattern

* add test
2025-09-12 22:25:53 +02:00
chenyu
647965fb09 test_train cleanup (#12140)
* test_train cleanup

remove skipIf due to buffer sizes, runs locally

* those are slow
2025-09-12 13:21:30 -04:00
chenyu
0fad07c684 viz serve default path (#12139)
`python tinygrad/viz/serve.py` shows last session instead of an empty page
2025-09-12 18:32:44 +03:00
nimlgen
81e33b8439 system: cpu memory mappings are uncached (#12137)
* system: cpu memory mappings are uncached

* adm amd
2025-09-12 13:28:25 +03:00
qazal
68b0ad05a4 viz: format tuple tags (#12135)
* viz: format tuple tags

* use python repr
2025-09-12 11:36:53 +03:00
qazal
e80c8a7548 merge TestIndexing with TestSchedule + remove duplicate tests (#12134)
* merge TestIndexing with TestSchedule

* remove the arange_copy tests

* no FUSE_ARANGE import
2025-09-12 10:35:14 +03:00
Sieds Lykles
b5a3b8de20 remove where on gated load if gates are the same (#12129)
* add rules

* add tests
2025-09-12 06:52:35 +02:00
George Hotz
a2f502b89e fix rangeify=1 ops on GPU (#12130) 2025-09-12 11:17:37 +08:00
George Hotz
0766616962 isolate the const hacks in the old kernelize (#12126)
* isolate the const hacks in the old kernelize

* if rangeify, don't waste time
2025-09-12 08:35:35 +08:00
Sieds Lykles
1f3950a484 Invalid idx (#12067)
* merge index_dtype_3

* new lowering with Invalid idx

* remove that dtype from range

* finish merge

* annotate better

* indentation

* dont need that anymore

* always process replay for openpilot

* more uop_given_valid for idx

* valid past index_child

* fix bug preventing load getting an alt value

* add track_match_stats back in in shapetracker and remove cache

* get_valid_idx -> get_valid and get_idx

* fix heuristics with new idx

* split line

* fix typo

* fix signature

* dont skip idx if stride is 0

the idx may still be invalid

* lower const with new valid

* delete to_indexed_uops

* update shapetracker test

* delete axis_is_masked

* add cache back

* move around comment

* fix get_valid bug

* move invalid fold to symbolic so its earlier

* cleanup

* update applying padto to new idx

* add unit tests

* cleanup

* fold line

* improve spec

* dont try to render Invalid as a float

* more consistent invalid index

* update some tests

* Fold index with true cond

* skip test

* vconst min max if Invalid in arg

* fix signature of UOp.const

* add test for min/max of Invalid CONST/VCONST

* add InvalidType to as_const signature

* is Invalid to isinstance

* Add InvalidType to ConstLike

* index gate is a where gate

* make that a metaclass

* fix heuristics for new idx

* mypy happy
2025-09-12 01:42:02 +02:00
chenyu
544eb2c402 clean up test_scatter_reduce (#12125) 2025-09-11 16:36:58 -04:00
chenyu
9ad6a56d17 smaller test_simple_reduce (#12124) 2025-09-11 15:45:38 -04:00
chenyu
e5ef9ec5b1 remove IGNORE_OOB=0 in ci tests (#12117) 2025-09-11 15:05:04 -04:00
chenyu
3a83b56da5 fix test_dequantization_mxfp4 (#12123)
* fix test_dequantization_mxfp4

* assert_allclose

* rtol
2025-09-11 14:22:06 -04:00
chenyu
520e2e0727 actually run unit tests in ci MacOS (unit) (#12122)
* actually run unit tests in ci MacOS (unit)

* that's always wrong
2025-09-11 13:32:30 -04:00
nimlgen
acb700fc26 ci: fix ptx env (#12120) 2025-09-11 12:42:15 -04:00
chenyu
20cd7177de delete test_bert_fuse_arange (#12121)
* delete test_bert_fuse_arange

it's the default now and we are not interested in FUSE_ARANGE=0 version

* remove -v
2025-09-11 12:35:51 -04:00
chenyu
b07f962058 split metal model tests (#12119)
* split metal model tests

* llama too
2025-09-11 12:20:12 -04:00
chenyu
66593f135f remove duplicated test_real_world (#12118)
included in the test/models right below
2025-09-11 11:57:14 -04:00
qazal
e76211fcbc viz: specify all rect styles in parent (#12115)
* viz: specify all rect styles in parent

Visually a no-op, but it's easier to reason about when the rect's coloring comes from `g` parent that holds UOp data.

* this stays
2025-09-11 13:48:59 +03:00
nimlgen
400ad93892 ci: gate boost paths for macos only (#12114) 2025-09-11 12:48:34 +03:00
George Hotz
3ef0e5e01e rangeify: use Ops.REALIZE and not Ops.CONTIGUOUS if it's added by system (#12111)
* rangeify: use Ops.REALIZE and not Ops.CONTIGUOUS if it's added by system

* fix contig + BufferizeOpts

* no outerworld
2025-09-11 11:56:59 +08:00
b1tg
52ebed991e change dtype promo lattice when fp8s is supported (#12088)
* change dtype promo lattice when fp8s is supported

* no device check

* int64 + uint64 => fp8
2025-09-10 22:09:11 -04:00
George Hotz
d4eba5800d rangeify cost function infrastructure (#12091)
* one call to hc opt

* does that pass?

* add cost function to rangeify

* test

* more test

* gate thread

* bufferize has shape

* ish

* match old behavior

* no ci there
2025-09-11 07:19:53 +08:00
qazal
78610b681e viz: light up children (#12107)
* viz: light up children

* keep tag coloring
2025-09-11 01:28:01 +03:00
Sieds Lykles
3989f5b559 Revert "Simplify valid in symbolic (#12104)" (#12108)
This reverts commit 73d479a016.
2025-09-10 23:36:40 +02:00
Sieds Lykles
73d479a016 Simplify valid in symbolic (#12104)
* cleanup cast_folding

* from sym to symbolic

* no more sym in dtype lowering

* move around simplify_valid

* update test
2025-09-10 23:26:19 +02:00
chenyu
e306650d39 remove GPUDevice (#12106) 2025-09-10 16:35:00 -04:00
George Hotz
d8a7a1c9c7 BUFFERIZE shape should be each range, not the product (#12105)
* BUFFERIZE shape should be each range, not the product

* fix tests

* resolve
2025-09-11 04:02:24 +08:00
Sieds Lykles
3730172c10 cleanup cast_folding (#12101)
* cleanup cast_folding

* from sym to symbolic

* no more sym in dtype lowering
2025-09-10 21:30:20 +02:00
chenyu
0e266f376c ops_gpu -> ops_cl (#12103) 2025-09-10 15:15:48 -04:00
chenyu
0599e86186 replace hardcoded GPU in llama debug msg (#12102) 2025-09-10 13:56:40 -04:00
qazal
5a84d86db7 viz: fix buffer tooltip offset (#12100)
* fixup offsets

* add buffer num to tooltip
2025-09-10 20:12:20 +03:00
nimlgen
fb96394ff5 auto-select available compilers (#12094)
* device: auto select compilers

* fix

* metal+opencl

* nv/cuda

* test without ptx

* ptx

* fix tests

* fix

* fix test

* rename

* test + cleaner

* xx

* ops

* better test

* win?

* um?

* types

* debug

* win??

* sep rung

* wtf?

* debug

* skip win

* revert this

* types
2025-09-10 19:52:01 +03:00
chenyu
bb67829e99 raise KernelOptError in TC _apply_tc_opt (#12099)
currently getting
```
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/search.py", line 149, in beam_search
    acted_lins: list[Scheduler] = flatten([get_kernel_actions(lin, include_0=False).values() for lin,_ in beam])
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/search.py", line 107, in get_kernel_actions
    lin2.apply_opt(a)
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/postrange.py", line 169, in apply_opt
    ret = self._apply_tc_opt(use_tensor_cores, cast(int, opt.axis), tc_select, tc_opt)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/postrange.py", line 235, in _apply_tc_opt
    idx = self.rngs.index(a)
          ^^^^^^^^^^^^^^^^^^
ValueError: UOp(Ops.RANGE, dtypes.index, arg=(1002, <AxisType.REDUCE: 6>), src=(
  UOp(Ops.CONST, dtypes.index, arg=15, src=()),)) is not in list
```
2025-09-10 12:32:19 -04:00
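The traceback in the entry above shows a plain `ValueError` from `list.index` escaping through the search loop. A minimal sketch of the general pattern the commit title suggests: translate the lookup failure into a domain-specific error so that a caller trying many candidates can skip the bad one. This is not tinygrad's code. The `KernelOptError` class here is a local stand-in, `axis_position` and `try_actions` are hypothetical helpers, and the assumption that the search loop skips candidates raising this error is mine.

```python
class KernelOptError(Exception):
    """Local stand-in for an optimizer error type (defined here for the sketch)."""


def axis_position(rngs: list, axis) -> int:
    """Return the position of `axis` in `rngs`, raising KernelOptError if absent."""
    try:
        return rngs.index(axis)
    except ValueError as e:
        # list.index raises ValueError for a missing item; re-raise it as the
        # error type that the caller already knows how to handle.
        raise KernelOptError(f"axis {axis!r} is not among the kernel ranges") from e


def try_actions(rngs: list, candidates: list) -> list:
    """Hypothetical search step: keep candidates that apply, skip those that do not."""
    applied = []
    for a in candidates:
        try:
            applied.append((a, axis_position(rngs, a)))
        except KernelOptError:
            continue  # an inapplicable candidate no longer aborts the whole search
    return applied
```

With this shape, `try_actions(["r0", "r1"], ["r1", "r9", "r0"])` drops the unknown `"r9"` and keeps the other two candidates instead of failing on the first missing axis.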
George Hotz
84b249ef0e move simplify reduce out of devectorizer (#12098) 2025-09-10 21:24:57 +08:00
qazal
5d66a2d885 viz: refactor range clipping (#12097) 2025-09-10 16:23:46 +03:00
George Hotz
9789337722 early reduce simplify (#12046)
* early reduce simplify

* min changes

* need that

* that goes in simplify

* no more arange reduce opt
2025-09-10 21:02:46 +08:00
nimlgen
21e6926a6a HostLLVMCompiler -> CPULLVMCompiler (#12096) 2025-09-10 14:04:16 +03:00
nimlgen
551560b87c do not use getenv('PTX') in tests (#12095)
* test without ptx

* fix tests

* fix test

* linters
2025-09-10 14:04:07 +03:00
Sieds Lykles
0e420e68b4 delete axis_is_masked (#12092) 2025-09-10 05:26:19 +02:00