chenyu
0599e86186
replace hardcoded GPU in llama debug msg ( #12102 )
2025-09-10 13:56:40 -04:00
qazal
5a84d86db7
viz: fix buffer tooltip offset ( #12100 )
...
* fixup offsets
* add buffer num to tooltip
2025-09-10 20:12:20 +03:00
nimlgen
fb96394ff5
auto-select available compilers ( #12094 )
...
* device: auto select compilers
* fix
* metal+opencl
* nv/cuda
* test without ptx
* ptx
* fix tests
* fix
* fix test
* rename
* test + cleaner
* xx
* ops
* better test
* win?
* um?
* types
* debug
* win??
* sep rung
* wtf?
* debug
* skip win
* revert this
* types
2025-09-10 19:52:01 +03:00
chenyu
bb67829e99
raise KernelOptError in TC _apply_tc_opt ( #12099 )
...
currently getting
```
File "/home/chenyu/tinygrad/tinygrad/codegen/opt/search.py", line 149, in beam_search
  acted_lins: list[Scheduler] = flatten([get_kernel_actions(lin, include_0=False).values() for lin,_ in beam])
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chenyu/tinygrad/tinygrad/codegen/opt/search.py", line 107, in get_kernel_actions
  lin2.apply_opt(a)
File "/home/chenyu/tinygrad/tinygrad/codegen/opt/postrange.py", line 169, in apply_opt
  ret = self._apply_tc_opt(use_tensor_cores, cast(int, opt.axis), tc_select, tc_opt)
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/chenyu/tinygrad/tinygrad/codegen/opt/postrange.py", line 235, in _apply_tc_opt
  idx = self.rngs.index(a)
  ^^^^^^^^^^^^^^^^^^
ValueError: UOp(Ops.RANGE, dtypes.index, arg=(1002, <AxisType.REDUCE: 6>), src=(
  UOp(Ops.CONST, dtypes.index, arg=15, src=()),)) is not in list
```
2025-09-10 12:32:19 -04:00
George Hotz
84b249ef0e
move simplify reduce out of devectorizer ( #12098 )
2025-09-10 21:24:57 +08:00
qazal
5d66a2d885
viz: refactor range clipping ( #12097 )
2025-09-10 16:23:46 +03:00
George Hotz
9789337722
early reduce simplify ( #12046 )
...
* early reduce simplify
* min changes
* need that
* that goes in simplify
* no more arange reduce opt
2025-09-10 21:02:46 +08:00
nimlgen
21e6926a6a
HostLLVMCompiler -> CPULLVMCompiler ( #12096 )
2025-09-10 14:04:16 +03:00
nimlgen
551560b87c
do not use getenv('PTX') in tests ( #12095 )
...
* test without ptx
* fix tests
* fix test
* linters
2025-09-10 14:04:07 +03:00
Sieds Lykles
0e420e68b4
delete axis_is_masked ( #12092 )
2025-09-10 05:26:19 +02:00
George Hotz
ef53a6fc19
one call to hc opt ( #12074 )
...
* one call to hc opt
* does that pass?
* Clean up postrange.py by removing comments
2025-09-10 11:18:18 +08:00
Sieds Lykles
499f50483b
x | !x -> True ( #12090 )
2025-09-10 03:26:01 +02:00
Sieds Lykles
5b73076e48
assert benchmark times ( #12042 )
...
* assert jitted times in openpilot
* better error
* better error
* add ASSERT_MIN_STEP_TIME to more models
* t is step_times
* update benchmark times
* update times
2025-09-09 23:40:02 +02:00
b1tg
58d13a6e3e
remove redundant check ( #12087 )
2025-09-09 15:15:39 -04:00
qazal
71fcb23d4a
viz: cleanup renderDag ( #12086 )
2025-09-09 19:19:45 +03:00
b1tg
82e955fe79
fix inf bug in float_to_fp8 ( #12085 )
2025-09-09 12:02:56 -04:00
b1tg
14faf7a5c0
AutoCastType tests for fp8s/bf16 ( #12084 )
2025-09-09 11:33:01 -04:00
qazal
5e76eff26d
viz: pre fetch workers ( #12083 )
...
* viz: pre fetch workers
* move check
2025-09-09 15:56:39 +03:00
qazal
5fde033794
viz: prune worker payload ( #12082 )
2025-09-09 14:45:13 +03:00
nimlgen
1c6c42715f
unify cpu and llvm ( #11982 )
...
* try unify cpu and llvm
* fixes
* fix
* ops
* no llvm
* fix
* rm
* lvmm is ot
* oops
* override
* no llvm
* ignore
* skip llvm
* ooops
2025-09-09 13:54:44 +03:00
qazal
50cc7175cb
viz: use complete progress helper ( #12081 )
...
* viz: use complete progress helper
* min diff
* rename show to start
2025-09-09 11:00:52 +03:00
Sieds Lykles
239091d111
numba>=0.55 for uv resolution ( #12079 )
...
* force numba version
* update comment
2025-09-09 01:43:32 +02:00
chenyu
2bd1fff79c
ci GPU misc cleanups ( #12078 )
2025-09-08 16:47:29 -04:00
chenyu
1781d5bced
remove PYTHONPATH in test.yml ( #12077 )
...
set globally already
2025-09-08 15:41:47 -04:00
nimlgen
9182948951
remove llvm_bf16_cast ( #12075 )
2025-09-08 20:51:15 +03:00
chenyu
11213398b9
reorder amdremote in test yml ( #12073 )
2025-09-08 13:43:04 -04:00
nimlgen
ebbcdd6577
cpu: use suppress_finalizing ( #12071 )
2025-09-08 18:28:09 +03:00
qazal
73ca0e870c
viz: index visible rects ( #12070 )
2025-09-08 17:37:17 +03:00
chenyu
d40f5b766b
default BEAM_PADTO to 0 ( #12069 )
...
seems incorrect, disable by default now
2025-09-08 10:17:03 -04:00
Sieds Lykles
75b58fe2d3
move simplify_valid pat to sym ( #12065 )
...
* move simplify_valid pat to sym
* fix expectedfailure
2025-09-08 07:01:26 +02:00
chenyu
56861852be
enable IMAGE for test_mnist and test_mnist_backward ( #12064 )
...
passes now
2025-09-07 09:06:39 -04:00
nimlgen
ef71acc88a
hcq: cleanup fileio iface ( #12063 )
...
* hcq: cleanup fileio iface
* typo
* _
2025-09-07 15:43:27 +03:00
nimlgen
35ddfc3d39
change default cpu_count ( #12062 )
2025-09-06 23:30:20 +03:00
nimlgen
97187bf8b6
cleanup win and arch checks ( #12060 )
...
* cleanup win and arch checks
* stupid mypy
2025-09-06 23:08:46 +03:00
Sieds Lykles
f326df8ae8
add type: ignore ( #12059 )
2025-09-06 21:17:35 +02:00
George Hotz
c66935f7b9
only run hcopts once ( #12053 )
...
* only run hcopts once
* same?
2025-09-06 11:14:52 -07:00
qazal
801be5f7b9
viz: memory graph cleanups ( #12057 )
...
* delete the total nbytes tooltip
* split pixel rescaling from layout
2025-09-06 19:44:53 +03:00
nimlgen
10ac427aaa
cpu threading ( #11951 )
...
* start cpu threading
* fix
* fix2
* fix
* hacks?
* threads
* minor
* no dsp
* dsp 2
* n
* more
* test
* xm
* cleaner
* readable
* f
* reorder
* when no threads
* rangeify
* typos
* not needed
* reapply
* remove this
* linter
* fixed cpu count in ci
* fix
* fixes
* rm
* typo
* sort based on speed
* test if test works in ci
* Revert "test if test works in ci"
This reverts commit 1f05edb531 .
* do not pad thread
2025-09-06 16:13:43 +03:00
nimlgen
2b1844da27
cpu: support several threads in runtime ( #12055 )
2025-09-06 13:29:31 +03:00
nimlgen
f37b836618
factor out _globalizable_rngs ( #12054 )
2025-09-06 13:29:23 +03:00
nimlgen
1630c87d0e
run optimize_local_size only when locals supported ( #12056 )
2025-09-06 13:29:09 +03:00
Jordan Chalupka
48ec5efad9
only run autogen tests on change ( #12049 )
...
* only run autogen tests on change
* example change
* rm example change
2025-09-05 23:53:01 -07:00
Sieds Lykles
581b2388c2
add dtypes.index ( #12015 )
...
* add dtypes.index
* cast shape, stride and mask to dtypes.index in view.create
* move pm_lower_index_dtype to ops
* DEFINE_VAR is dtype.index by default
* merge var_val_using_str
* remove int from commutative
* fix test_rewrite_map
* change that to dtypes.index
* change some int to index
* shorten those
* remove old cast in renderer
* cleanup
* change that back
* add comment
* delete comment
* just delete those
* view doesnt have to cast anymore
* adjust comment
2025-09-06 06:03:44 +02:00
Sieds Lykles
c6c16b2946
var_vals uses str for var ( #12011 )
...
* var_vals is str,int
* remove imports
* remove print
* fix test
* change var_vals in hcq
* update test_hcq
* fix multitensor _device_num var
* fix syminfer test
* shorten line
* p.vars stays list[Variable]
* shorten line
* vars is back to tuple[Variable, ...]
* change var_vals in extra
* change var_vals from shapetracker
* var_vals is str:int
* fix signature
2025-09-06 04:16:12 +02:00
George Hotz
8658a97197
hotfix: name the shift rewrite better + no ctx there
2025-09-05 19:01:59 -07:00
George Hotz
6ef3270fc8
fix opt gate ( #12050 )
2025-09-05 18:59:54 -07:00
George Hotz
66c5206b42
hotfix: minimal scheduler copy
2025-09-05 18:24:00 -07:00
George Hotz
478e758755
Revert "fix scheduler copy ( #12048 )"
...
This reverts commit 51b7c40788 .
2025-09-05 18:21:55 -07:00
George Hotz
51b7c40788
fix scheduler copy ( #12048 )
...
* fix scheduler copy
* hand coded opt only runs once
2025-09-05 17:17:49 -07:00
George Hotz
0123c394e5
early simplify_merge_adjacent ( #12045 )
...
* do simplify_merge_adjacent before schedule
* do simplify_merge_adjacent before schedule
* disable that slow test
2025-09-05 16:39:20 -07:00