10132 Commits

Author SHA1 Message Date
chenyu
0599e86186 replace hardcoded GPU in llama debug msg (#12102) 2025-09-10 13:56:40 -04:00
qazal
5a84d86db7 viz: fix buffer tooltip offset (#12100)
* fixup offsets

* add buffer num to tooltip
2025-09-10 20:12:20 +03:00
nimlgen
fb96394ff5 auto-select available compilers (#12094)
* device: auto select compilers

* fix

* metal+opencl

* nv/cuda

* test without ptx

* ptx

* fix tests

* fix

* fix test

* rename

* test + cleaner

* xx

* ops

* better test

* win?

* um?

* types

* debug

* win??

* sep rung

* wtf?

* debug

* skip win

* revert this

* types
2025-09-10 19:52:01 +03:00
chenyu
bb67829e99 raise KernelOptError in TC _apply_tc_opt (#12099)
currently getting
```
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/search.py", line 149, in beam_search
    acted_lins: list[Scheduler] = flatten([get_kernel_actions(lin, include_0=False).values() for lin,_ in beam])
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/search.py", line 107, in get_kernel_actions
    lin2.apply_opt(a)
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/postrange.py", line 169, in apply_opt
    ret = self._apply_tc_opt(use_tensor_cores, cast(int, opt.axis), tc_select, tc_opt)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/postrange.py", line 235, in _apply_tc_opt
    idx = self.rngs.index(a)
          ^^^^^^^^^^^^^^^^^^
ValueError: UOp(Ops.RANGE, dtypes.index, arg=(1002, <AxisType.REDUCE: 6>), src=(
  UOp(Ops.CONST, dtypes.index, arg=15, src=()),)) is not in list
  ```
2025-09-10 12:32:19 -04:00
George Hotz
84b249ef0e move simplify reduce out of devectorizer (#12098) 2025-09-10 21:24:57 +08:00
qazal
5d66a2d885 viz: refactor range clipping (#12097) 2025-09-10 16:23:46 +03:00
George Hotz
9789337722 early reduce simplify (#12046)
* early reduce simplify

* min changes

* need that

* that goes in simplify

* no more arange reduce opt
2025-09-10 21:02:46 +08:00
nimlgen
21e6926a6a HostLLVMCompiler -> CPULLVMCompiler (#12096) 2025-09-10 14:04:16 +03:00
nimlgen
551560b87c do not use getenv('PTX') in tests (#12095)
* test without ptx

* fix tests

* fix test

* linters
2025-09-10 14:04:07 +03:00
Sieds Lykles
0e420e68b4 delete axis_is_masked (#12092) 2025-09-10 05:26:19 +02:00
George Hotz
ef53a6fc19 one call to hc opt (#12074)
* one call to hc opt

* does that pass?

* Clean up postrange.py by removing comments
2025-09-10 11:18:18 +08:00
Sieds Lykles
499f50483b x | !x -> True (#12090) 2025-09-10 03:26:01 +02:00
Sieds Lykles
5b73076e48 assert benchmark times (#12042)
* assert jitted times in openpilot

* better error

* better error

* add ASSERT_MIN_STEP_TIME to more models

* t is step_times

* update benchmark times

* update times
2025-09-09 23:40:02 +02:00
b1tg
58d13a6e3e remove redundant check (#12087) 2025-09-09 15:15:39 -04:00
qazal
71fcb23d4a viz: cleanup renderDag (#12086) 2025-09-09 19:19:45 +03:00
b1tg
82e955fe79 fix inf bug in float_to_fp8 (#12085) 2025-09-09 12:02:56 -04:00
b1tg
14faf7a5c0 AutoCastType tests for fp8s/bf16 (#12084) 2025-09-09 11:33:01 -04:00
qazal
5e76eff26d viz: pre fetch workers (#12083)
* viz: pre fetch workers

* move check
2025-09-09 15:56:39 +03:00
qazal
5fde033794 viz: prune worker payload (#12082) 2025-09-09 14:45:13 +03:00
nimlgen
1c6c42715f unify cpu and llvm (#11982)
* try unify cpu and llvm

* fixes

* fix

* ops

* no llvm

* fix

* rm

* lvmm is ot

* oops

* override

* no llvm

* ignore

* skip llvm

* ooops
2025-09-09 13:54:44 +03:00
qazal
50cc7175cb viz: use complete progress helper (#12081)
* viz: use complete progress helper

* min diff

* rename show to start
2025-09-09 11:00:52 +03:00
Sieds Lykles
239091d111 numba>=0.55 for uv resolution (#12079)
* force numba version

* update comment
2025-09-09 01:43:32 +02:00
chenyu
2bd1fff79c ci GPU misc cleanups (#12078) 2025-09-08 16:47:29 -04:00
chenyu
1781d5bced remove PYTHONPATH in test.yml (#12077)
set globally already
2025-09-08 15:41:47 -04:00
nimlgen
9182948951 remove llvm_bf16_cast (#12075) 2025-09-08 20:51:15 +03:00
chenyu
11213398b9 reorder amdremote in test yml (#12073) 2025-09-08 13:43:04 -04:00
nimlgen
ebbcdd6577 cpu: use suppress_finalizing (#12071) 2025-09-08 18:28:09 +03:00
qazal
73ca0e870c viz: index visible rects (#12070) 2025-09-08 17:37:17 +03:00
chenyu
d40f5b766b default BEAM_PADTO to 0 (#12069)
seems incorrect, disable by default now
2025-09-08 10:17:03 -04:00
Sieds Lykles
75b58fe2d3 move simplify_valid pat to sym (#12065)
* move simplify_valid pat to sym

* fix expectedfailure
2025-09-08 07:01:26 +02:00
chenyu
56861852be enable IMAGE for test_mnist and test_mnist_backward (#12064)
passes now
2025-09-07 09:06:39 -04:00
nimlgen
ef71acc88a hcq: cleanup fileio iface (#12063)
* hcq: cleanup fileio iface

* typo

* _
2025-09-07 15:43:27 +03:00
nimlgen
35ddfc3d39 change default cpu_count (#12062) 2025-09-06 23:30:20 +03:00
nimlgen
97187bf8b6 cleanup win and arch checks (#12060)
* cleanup win and arch checks

* stupid mypy
2025-09-06 23:08:46 +03:00
Sieds Lykles
f326df8ae8 add type: ignore (#12059) 2025-09-06 21:17:35 +02:00
George Hotz
c66935f7b9 only run hcopts once (#12053)
* only run hcopts once

* same?
2025-09-06 11:14:52 -07:00
qazal
801be5f7b9 viz: memory graph cleanups (#12057)
* delete the total nbytes tooltip

* split pixel rescaling from layout
2025-09-06 19:44:53 +03:00
nimlgen
10ac427aaa cpu threading (#11951)
* start cpu threading

* fix

* fix2

* fix

* hacks?

* threads

* minor

* no dsp

* dsp 2

* n

* more

* test

* xm

* cleaner

* readable

* f

* reorder

* when no threads

* rangeify

* typos

* not needed

* reapply

* remoev this

* linter

* fixed cpu count in ci

* fix

* fixes

* rm

* typo

* sort based on speed

* test if test works in ci

* Revert "test if test works in ci"

This reverts commit 1f05edb531.

* do not pad thread
2025-09-06 16:13:43 +03:00
nimlgen
2b1844da27 cpu: support several threads in runtime (#12055) 2025-09-06 13:29:31 +03:00
nimlgen
f37b836618 factor out _globalizable_rngs (#12054) 2025-09-06 13:29:23 +03:00
nimlgen
1630c87d0e run optimize_local_size only when locals supported (#12056) 2025-09-06 13:29:09 +03:00
Jordan Chalupka
48ec5efad9 only run autogen tests on change (#12049)
* only run autogen tests on change

* example change

* rm example change
2025-09-05 23:53:01 -07:00
Sieds Lykles
581b2388c2 add dtypes.index (#12015)
* add dtypes.index

* cast shape, stride and mask to dtypes.index in view.create

* move pm_lower_index_dtype to ops

* DEFINE_VAR is dtype.index by default

* merge var_val_using_str

* remove int from commutative

* fix test_rewrite_map

* change that to dtypes.index

* change some int to index

* shorten those

* remove old cast in renderer

* cleanup

* change that back

* add comment

* delete comment

* just delete those

* view doesnt have to cast anymore

* adjust comment
2025-09-06 06:03:44 +02:00
Sieds Lykles
c6c16b2946 var_vals uses str for var (#12011)
* var_vals is str,int

* remove imports

* remove print

* fix test

* change var_vals in hcq

* update test_hcq

* fix multitensor _device_num var

* fix syminfer test

* shorten line

* p.vars stays list[Variable]

* shorten line

* vars is back to tuple[Variable, ...]

* change var_vals in extra

* change var_vals from shapetracker

* var_vals is str:int

* fix signature
2025-09-06 04:16:12 +02:00
George Hotz
8658a97197 hotfix: name the shift rewrite better + no ctx there 2025-09-05 19:01:59 -07:00
George Hotz
6ef3270fc8 fix opt gate (#12050) 2025-09-05 18:59:54 -07:00
George Hotz
66c5206b42 hotfix: minimal scheduler copy 2025-09-05 18:24:00 -07:00
George Hotz
478e758755 Revert "fix scheduler copy (#12048)"
This reverts commit 51b7c40788.
2025-09-05 18:21:55 -07:00
George Hotz
51b7c40788 fix scheduler copy (#12048)
* fix scheduler copy

* hand coded opt only runs once
2025-09-05 17:17:49 -07:00
George Hotz
0123c394e5 early simplfy_merge_adjacent (#12045)
* do simplify_merge_adjacent before schedule

* do simplify_merge_adjacent before schedule

* disable that slow test
2025-09-05 16:39:20 -07:00