George Hotz
d4eba5800d
rangeify cost function infrastructure ( #12091 )
...
* one call to hc opt
* does that pass?
* add cost function to rangeify
* test
* more test
* gate thread
* bufferize has shape
* ish
* match old behavior
* no ci there
2025-09-11 07:19:53 +08:00
qazal
78610b681e
viz: light up children ( #12107 )
...
* viz: light up children
* keep tag coloring
2025-09-11 01:28:01 +03:00
Sieds Lykles
3989f5b559
Revert "Simplify valid in symbolic ( #12104 )" ( #12108 )
...
This reverts commit 73d479a016.
2025-09-10 23:36:40 +02:00
Sieds Lykles
73d479a016
Simplify valid in symbolic ( #12104 )
...
* cleanup cast_folding
* from sym to symbolic
* no more sym in dtype lowering
* move around simplify_valid
* update test
2025-09-10 23:26:19 +02:00
chenyu
e306650d39
remove GPUDevice ( #12106 )
2025-09-10 16:35:00 -04:00
George Hotz
d8a7a1c9c7
BUFFERIZE shape should be each range, not the product ( #12105 )
...
* BUFFERIZE shape should be each range, not the product
* fix tests
* resolve
2025-09-11 04:02:24 +08:00
Sieds Lykles
3730172c10
cleanup cast_folding ( #12101 )
...
* cleanup cast_folding
* from sym to symbolic
* no more sym in dtype lowering
2025-09-10 21:30:20 +02:00
chenyu
0e266f376c
ops_gpu -> ops_cl ( #12103 )
2025-09-10 15:15:48 -04:00
chenyu
0599e86186
replace hardcoded GPU in llama debug msg ( #12102 )
2025-09-10 13:56:40 -04:00
qazal
5a84d86db7
viz: fix buffer tooltip offset ( #12100 )
...
* fixup offsets
* add buffer num to tooltip
2025-09-10 20:12:20 +03:00
nimlgen
fb96394ff5
auto-select available compilers ( #12094 )
...
* device: auto select compilers
* fix
* metal+opencl
* nv/cuda
* test without ptx
* ptx
* fix tests
* fix
* fix test
* rename
* test + cleaner
* xx
* ops
* better test
* win?
* um?
* types
* debug
* win??
* sep rung
* wtf?
* debug
* skip win
* revert this
* types
2025-09-10 19:52:01 +03:00
chenyu
bb67829e99
raise KernelOptError in TC _apply_tc_opt ( #12099 )
...
currently getting
```
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/search.py", line 149, in beam_search
    acted_lins: list[Scheduler] = flatten([get_kernel_actions(lin, include_0=False).values() for lin,_ in beam])
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/search.py", line 107, in get_kernel_actions
    lin2.apply_opt(a)
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/postrange.py", line 169, in apply_opt
    ret = self._apply_tc_opt(use_tensor_cores, cast(int, opt.axis), tc_select, tc_opt)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chenyu/tinygrad/tinygrad/codegen/opt/postrange.py", line 235, in _apply_tc_opt
    idx = self.rngs.index(a)
          ^^^^^^^^^^^^^^^^^^
ValueError: UOp(Ops.RANGE, dtypes.index, arg=(1002, <AxisType.REDUCE: 6>), src=(
  UOp(Ops.CONST, dtypes.index, arg=15, src=()),)) is not in list
```
2025-09-10 12:32:19 -04:00
George Hotz
84b249ef0e
move simplify reduce out of devectorizer ( #12098 )
2025-09-10 21:24:57 +08:00
qazal
5d66a2d885
viz: refactor range clipping ( #12097 )
2025-09-10 16:23:46 +03:00
George Hotz
9789337722
early reduce simplify ( #12046 )
...
* early reduce simplify
* min changes
* need that
* that goes in simplify
* no more arange reduce opt
2025-09-10 21:02:46 +08:00
nimlgen
21e6926a6a
HostLLVMCompiler -> CPULLVMCompiler ( #12096 )
2025-09-10 14:04:16 +03:00
nimlgen
551560b87c
do not use getenv('PTX') in tests ( #12095 )
...
* test without ptx
* fix tests
* fix test
* linters
2025-09-10 14:04:07 +03:00
Sieds Lykles
0e420e68b4
delete axis_is_masked ( #12092 )
2025-09-10 05:26:19 +02:00
George Hotz
ef53a6fc19
one call to hc opt ( #12074 )
...
* one call to hc opt
* does that pass?
* Clean up postrange.py by removing comments
2025-09-10 11:18:18 +08:00
Sieds Lykles
499f50483b
x | !x -> True ( #12090 )
2025-09-10 03:26:01 +02:00
Sieds Lykles
5b73076e48
assert benchmark times ( #12042 )
...
* assert jitted times in openpilot
* better error
* better error
* add ASSERT_MIN_STEP_TIME to more models
* t is step_times
* update benchmark times
* update times
2025-09-09 23:40:02 +02:00
b1tg
58d13a6e3e
remove redundant check ( #12087 )
2025-09-09 15:15:39 -04:00
qazal
71fcb23d4a
viz: cleanup renderDag ( #12086 )
2025-09-09 19:19:45 +03:00
b1tg
82e955fe79
fix inf bug in float_to_fp8 ( #12085 )
2025-09-09 12:02:56 -04:00
b1tg
14faf7a5c0
AutoCastType tests for fp8s/bf16 ( #12084 )
2025-09-09 11:33:01 -04:00
qazal
5e76eff26d
viz: pre fetch workers ( #12083 )
...
* viz: pre fetch workers
* move check
2025-09-09 15:56:39 +03:00
qazal
5fde033794
viz: prune worker payload ( #12082 )
2025-09-09 14:45:13 +03:00
nimlgen
1c6c42715f
unify cpu and llvm ( #11982 )
...
* try unify cpu and llvm
* fixes
* fix
* ops
* no llvm
* fix
* rm
* lvmm is ot
* oops
* override
* no llvm
* ignore
* skip llvm
* ooops
2025-09-09 13:54:44 +03:00
qazal
50cc7175cb
viz: use complete progress helper ( #12081 )
...
* viz: use complete progress helper
* min diff
* rename show to start
2025-09-09 11:00:52 +03:00
Sieds Lykles
239091d111
numba>=0.55 for uv resolution ( #12079 )
...
* force numba version
* update comment
2025-09-09 01:43:32 +02:00
chenyu
2bd1fff79c
ci GPU misc cleanups ( #12078 )
2025-09-08 16:47:29 -04:00
chenyu
1781d5bced
remove PYTHONPATH in test.yml ( #12077 )
...
set globally already
2025-09-08 15:41:47 -04:00
nimlgen
9182948951
remove llvm_bf16_cast ( #12075 )
2025-09-08 20:51:15 +03:00
chenyu
11213398b9
reorder amdremote in test yml ( #12073 )
2025-09-08 13:43:04 -04:00
nimlgen
ebbcdd6577
cpu: use suppress_finalizing ( #12071 )
2025-09-08 18:28:09 +03:00
qazal
73ca0e870c
viz: index visible rects ( #12070 )
2025-09-08 17:37:17 +03:00
chenyu
d40f5b766b
default BEAM_PADTO to 0 ( #12069 )
...
seems incorrect, disable by default now
2025-09-08 10:17:03 -04:00
Sieds Lykles
75b58fe2d3
move simplify_valid pat to sym ( #12065 )
...
* move simplify_valid pat to sym
* fix expectedfailure
2025-09-08 07:01:26 +02:00
chenyu
56861852be
enable IMAGE for test_mnist and test_mnist_backward ( #12064 )
...
passes now
2025-09-07 09:06:39 -04:00
nimlgen
ef71acc88a
hcq: cleanup fileio iface ( #12063 )
...
* hcq: cleanup fileio iface
* typo
* _
2025-09-07 15:43:27 +03:00
nimlgen
35ddfc3d39
change default cpu_count ( #12062 )
2025-09-06 23:30:20 +03:00
nimlgen
97187bf8b6
cleanup win and arch checks ( #12060 )
...
* cleanup win and arch checks
* stupid mypy
2025-09-06 23:08:46 +03:00
Sieds Lykles
f326df8ae8
add type: ignore ( #12059 )
2025-09-06 21:17:35 +02:00
George Hotz
c66935f7b9
only run hcopts once ( #12053 )
...
* only run hcopts once
* same?
2025-09-06 11:14:52 -07:00
qazal
801be5f7b9
viz: memory graph cleanups ( #12057 )
...
* delete the total nbytes tooltip
* split pixel rescaling from layout
2025-09-06 19:44:53 +03:00
nimlgen
10ac427aaa
cpu threading ( #11951 )
...
* start cpu threading
* fix
* fix2
* fix
* hacks?
* threads
* minor
* no dsp
* dsp 2
* n
* more
* test
* xm
* cleaner
* readable
* f
* reorder
* when no threads
* rangeify
* typos
* not needed
* reapply
* remove this
* linter
* fixed cpu count in ci
* fix
* fixes
* rm
* typo
* sort based on speed
* test if test works in ci
* Revert "test if test works in ci"
This reverts commit 1f05edb531.
* do not pad thread
2025-09-06 16:13:43 +03:00
nimlgen
2b1844da27
cpu: support several threads in runtime ( #12055 )
2025-09-06 13:29:31 +03:00
nimlgen
f37b836618
factor out _globalizable_rngs ( #12054 )
2025-09-06 13:29:23 +03:00
nimlgen
1630c87d0e
run optimize_local_size only when locals supported ( #12056 )
2025-09-06 13:29:09 +03:00
Jordan Chalupka
48ec5efad9
only run autogen tests on change ( #12049 )
...
* only run autogen tests on change
* example change
* rm example change
2025-09-05 23:53:01 -07:00