qazal
122a50fe8c
assert kernel count ( #12205 )
2025-09-16 14:24:39 +03:00
chenyu
e555748807
test rangeify const folding ( #12200 )
* test rangeify const folding
reduce i know how to fix, multi and test_cast_padded tbd
* test_instancenorm_3d is very slow
2025-09-15 20:03:48 -04:00
chenyu
f732f66709
rangeify test_nn almost pass ( #12198 )
* rangeify test_nn almost pass
* issue with jit
* flaky
2025-09-15 17:49:20 -04:00
chenyu
82e037aad5
ci test.yml updates ( #12197 )
* ci test.yml updates
move docs together and external_benchmark_schedule to unit
* torch
2025-09-15 17:09:02 -04:00
chenyu
146c31586d
split RANGEIFY ci ( #12196 )
one CPU and one CL for speed
2025-09-15 15:41:10 -04:00
chenyu
df1c183e46
Revert "more llvm intrinsics ( #11961 )" ( #12194 )
This reverts commit d01e3d7719.
2025-09-15 13:56:43 -04:00
b1tg
d01e3d7719
more llvm intrinsics ( #11961 )
* more llvm intrinsics
* assert nan
* skip test_log_nan on metal
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-09-15 13:05:23 -04:00
nimlgen
b63bd02969
update runtime docs ( #12191 )
2025-09-15 17:46:20 +03:00
qazal
57e8bf61e8
viz: fix Specificity for rect styling ( #12190 )
2025-09-15 17:33:37 +03:00
chenyu
72e010d816
fix rangeify ci ( #12189 )
CL=1, and multitensor needs to test with CPU since CL does not support multi in CI
2025-09-15 10:24:57 -04:00
qazal
f1bd06134d
test fuse with RANGEIFY=2 ( #12187 )
2025-09-15 15:51:23 +03:00
qazal
ef0ef705fe
viz: remove async from event listener ( #12186 )
2025-09-15 15:08:28 +03:00
qazal
d8855ec266
viz/serve.py cleanups ( #12185 )
* don't assign unused variable
* *path to
2025-09-15 13:43:26 +03:00
qazal
b8a74c1569
cpu: add disassembler err message ( #12184 )
* cpu: add disassembler err message
* print msg
2025-09-15 13:29:44 +03:00
qazal
a388d2cb1a
remove PROFILE=1 option, it's just VIZ=1 [pr] ( #12176 )
* remove PROFILE=1 option, it's just VIZ=1 [pr]
* sqtt
* sqtt 2
* return last
* rename
2025-09-15 12:51:50 +03:00
George Hotz
65397bfdeb
set testpath on pytest ( #12183 )
2025-09-15 16:13:05 +08:00
George Hotz
ae0edc8a67
renumber ranges ( #12182 )
* enable rangeify const folding
* renumber ranges for kernel deduping
2025-09-15 13:03:39 +08:00
hooved
e1fef895b1
don't hardcode weights path ( #12171 )
2025-09-15 00:33:47 -04:00
hooved
3a9db08b49
download data and ckpts for sd train/eval ( #12170 )
2025-09-15 00:31:45 -04:00
chenyu
bdb3afd566
failed test case for symbolic pad ( #12179 )
2025-09-15 00:25:21 -04:00
George Hotz
9fcc87761e
enable rangeify const folding ( #12181 )
2025-09-15 12:02:19 +08:00
George Hotz
1353250b6c
tags on bufferize are the tensor tags ( #12180 )
2025-09-15 11:46:03 +08:00
George Hotz
60d7db093e
delete bufferized consts + output noops ( #12163 )
* bring const folding to rangeify
* comment that
2025-09-15 11:07:44 +08:00
qazal
525c20dc7e
viz: remove unused runtime_stats feature ( #12177 )
2025-09-15 02:53:05 +03:00
qazal
75ff9b7a9a
viz: add buffer lifetime to tooltip ( #12175 )
2025-09-15 02:33:50 +03:00
chenyu
15b166ce6d
bump test_module_runs to 30 seconds ( #12174 )
25 seconds sometimes
2025-09-14 16:48:40 -04:00
ttomsa
943236ef74
move cast pat out of symbolic_simple ( #11945 )
* move pat
* move it here
* rm extra check
---------
Co-authored-by: Sieds Lykles <93992551+S-Lykles@users.noreply.github.com>
2025-09-14 21:39:48 +02:00
Steven Shi
25b1bc8eff
added top k sampling to examples/mamba ( #12061 )
2025-09-14 15:27:34 -04:00
Shun Usami
34a05b31fe
Fix advanced tensor indexing setitem ( #12128 )
* Add failure test case for advanced tensor indexing setitem
* Fix advanced tensor indexing setitem when permuted
* Reduce line count
* Revert unnecessary change
* Combine two lines into one
2025-09-14 15:22:40 -04:00
chenyu
d09c0f28c5
increase test_module_runs ( #12173 )
timed out on ci windows llvm
2025-09-14 15:19:21 -04:00
chenyu
12a910f1d2
update torch 2.8 ( #12172 )
support _reshape_alias. something is wrong with one case of unfold
2025-09-14 15:19:03 -04:00
chenyu
98ecab7563
remove ml_dtypes ( #12169 )
2025-09-14 14:20:05 -04:00
qazal
02054b53fe
remove tests that pre date the uop spec ( #12168 )
* remove tests that pre date the uop spec
* const src
* for RANGEIFY=1
* update with bind
* remove import
2025-09-14 18:47:42 +03:00
qazal
1591e4f66b
update outbufs selection in test_linearizer [pr] ( #12166 )
2025-09-14 13:46:49 +03:00
nimlgen
d1ae30f7ef
hcq: do not spam with errors in -m device ( #12150 )
* hcq: do not spam with errors in -m device
* um?
* um?
* nn
* helps?
* um?
* no gc?
* fix
2025-09-14 10:56:59 +03:00
George Hotz
d5bc27797b
fix some multitensor on rangeify ( #12162 )
* fix some multitensor on rangeify
* rangeify multi hacks
* copy on const
2025-09-14 14:31:57 +08:00
Meng Zhuo
4b7904eca9
add cpu support for riscv64 ( #12136 )
2025-09-14 11:40:58 +08:00
George Hotz
bcafa72b7f
use tags instead of graph_rewrite_map in rangeify ( #12110 )
* use tags instead of graph_rewrite_map in rangeify
* new style, add realize
* metadata works
* simple failure
* fix
* loops
* stuff becomes a NOOP when you remove it
* stuff becomes a NOOP when you remove it
* tags on bufferize
* bmnist works
* locals don't work
* shippable
* fix some tests
* simpler map_realize
* remove const hack
* debuggable test
* broke
* assign test
* straight up bug
* wooo it passes
* sink shouldn't be there
* fix ops
* bmnist
* kv cache ish
* Set RANGEIFY context variable to 0
* should work normal
* better
* types
* hacks to fix test_symbolic
* pm_add_buffers
* tests should pass
2025-09-14 11:39:01 +08:00
chenyu
d2316ba91a
don't validate output in sdxl with fakeweights ( #12160 )
NULL backend passed validation before because both desired and actual went through NULL backend
2025-09-13 21:47:51 -04:00
nimlgen
b1d1816f43
device: fix envvars ( #12159 )
2025-09-13 23:38:09 +03:00
nimlgen
19d9d29b7e
device: compilers in tinygrad.device ( #12151 )
* hcq: do not spam with errors in -m device
* -m tinygrad p2
* fix
* ugh
* comp in ckey
* fix
* one more
* print defaults
* xx
2025-09-13 21:45:29 +03:00
qazal
6410dcb7c2
viz: less verbose render loop ( #12158 )
* define visible once
* move y offsets to one place
2025-09-13 19:04:37 +03:00
nimlgen
92df52d79a
make method_cache account for compiler ( #12156 )
* make method_cache account for compiler
* sorry
2025-09-13 17:00:11 +03:00
chenyu
0c392089d9
update mypy ( #12155 )
2025-09-13 09:48:38 -04:00
qazal
fbca6183ad
do not launch BEAM when opts_to_apply exists [pr] ( #12152 )
2025-09-13 14:57:46 +03:00
George Hotz
b2a95d32bb
check clSetKernelArg ( #12149 )
2025-09-13 17:24:55 +08:00
George Hotz
0695e322a8
fix android cpu device ( #12148 )
2025-09-13 15:42:04 +08:00
Sieds Lykles
e3a3764917
delete fold_unrolled_divs ( #12146 )
2025-09-13 03:09:36 +02:00
Sieds Lykles
51ed6e94b2
AxisType __repr__ method ( #12145 )
2025-09-13 01:15:38 +02:00
Sieds Lykles
0757a9a819
add pytest-timeout of 3 min per item ( #12144 )
* add pytest-timeout with timeout of 3 min
* func_only
2025-09-13 00:48:41 +02:00