10417 Commits

Author SHA1 Message Date
qazal
122a50fe8c assert kernel count (#12205) 2025-09-16 14:24:39 +03:00
chenyu
e555748807 test rangeify const folding (#12200)
* test rangeify const folding

reduce i know how to fix, multi and test_cast_padded tbd

* test_instancenorm_3d is very slow
2025-09-15 20:03:48 -04:00
chenyu
f732f66709 rangeify test_nn almost pass (#12198)
* rangeify test_nn almost pass

* issue with jit

* flaky
2025-09-15 17:49:20 -04:00
chenyu
82e037aad5 ci test.yml updates (#12197)
* ci test.yml updates

move docs together and external_benchmark_schedule to unit

* torch
2025-09-15 17:09:02 -04:00
chenyu
146c31586d split RANGEIFY ci (#12196)
one CPU and one CL for speed
2025-09-15 15:41:10 -04:00
chenyu
df1c183e46 Revert "more llvm intrinsics (#11961)" (#12194)
This reverts commit d01e3d7719.
2025-09-15 13:56:43 -04:00
b1tg
d01e3d7719 more llvm intrinsics (#11961)
* more llvm intrinsics

* assert nan

* skip test_log_nan on metal

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-09-15 13:05:23 -04:00
nimlgen
b63bd02969 update runtime docs (#12191) 2025-09-15 17:46:20 +03:00
qazal
57e8bf61e8 viz: fix Specificity for rect styling (#12190) 2025-09-15 17:33:37 +03:00
chenyu
72e010d816 fix rangeify ci (#12189)
CL=1, and multitensor needs to test with CPU since CL does not support multi in CI
2025-09-15 10:24:57 -04:00
qazal
f1bd06134d test fuse with RANGEIFY=2 (#12187) 2025-09-15 15:51:23 +03:00
qazal
ef0ef705fe viz: remove async from event listener (#12186) 2025-09-15 15:08:28 +03:00
qazal
d8855ec266 viz/serve.py cleanups (#12185)
* don't assign unused variable

* *path to
2025-09-15 13:43:26 +03:00
qazal
b8a74c1569 cpu: add disassembler err message (#12184)
* cpu: add disassembler err message

* print msg
2025-09-15 13:29:44 +03:00
qazal
a388d2cb1a remove PROFILE=1 option, it's just VIZ=1 [pr] (#12176)
* remove PROFILE=1 option, it's just VIZ=1 [pr]

* sqtt

* sqtt 2

* return last

* rename
2025-09-15 12:51:50 +03:00
George Hotz
65397bfdeb set testpath on pytest (#12183) 2025-09-15 16:13:05 +08:00
George Hotz
ae0edc8a67 renumber ranges (#12182)
* enable rangeify const folding

* renumber ranges for kernel deduping
2025-09-15 13:03:39 +08:00
hooved
e1fef895b1 don't hardcode weights path (#12171) 2025-09-15 00:33:47 -04:00
hooved
3a9db08b49 download data and ckpts for sd train/eval (#12170) 2025-09-15 00:31:45 -04:00
chenyu
bdb3afd566 failed test case for symbolic pad (#12179) 2025-09-15 00:25:21 -04:00
George Hotz
9fcc87761e enable rangeify const folding (#12181) 2025-09-15 12:02:19 +08:00
George Hotz
1353250b6c tags on bufferize are the tensor tags (#12180) 2025-09-15 11:46:03 +08:00
George Hotz
60d7db093e delete bufferized consts + output noops (#12163)
* bring const folding to rangeify

* comment that
2025-09-15 11:07:44 +08:00
qazal
525c20dc7e viz: remove unused runtime_stats feature (#12177) 2025-09-15 02:53:05 +03:00
qazal
75ff9b7a9a viz: add buffer lifetime to tooltip (#12175) 2025-09-15 02:33:50 +03:00
chenyu
15b166ce6d bump test_module_runs to 30 seconds (#12174)
25 seconds sometimes
2025-09-14 16:48:40 -04:00
ttomsa
943236ef74 move cast pat out of symbolic_simple (#11945)
* move pat

* move it here

* rm extra check

---------

Co-authored-by: Sieds Lykles <93992551+S-Lykles@users.noreply.github.com>
2025-09-14 21:39:48 +02:00
Steven Shi
25b1bc8eff added top k sampling to examples/mamba (#12061) 2025-09-14 15:27:34 -04:00
Shun Usami
34a05b31fe Fix advanced tensor indexing setitem (#12128)
* Add failure test case for advanced tensor indexing setitem

* Fix advanced tensor indexing setitem when permuted

* Reduce line count

* Revert unnecessary change

* Combine two lines into one
2025-09-14 15:22:40 -04:00
chenyu
d09c0f28c5 increase test_module_runs (#12173)
timed out on ci windows llvm
2025-09-14 15:19:21 -04:00
chenyu
12a910f1d2 update torch 2.8 (#12172)
support _reshape_alias. something is wrong with one case of unfold
2025-09-14 15:19:03 -04:00
chenyu
98ecab7563 remove ml_dtypes (#12169) 2025-09-14 14:20:05 -04:00
qazal
02054b53fe remove tests that pre date the uop spec (#12168)
* remove tests that pre date the uop spec

* const src

* for RANGEIFY=1

* update with bind

* remove import
2025-09-14 18:47:42 +03:00
qazal
1591e4f66b update outbufs selection in test_linearizer [pr] (#12166) 2025-09-14 13:46:49 +03:00
nimlgen
d1ae30f7ef hcq: do not spam with errors in -m device (#12150)
* hcq: do not spam with errors in -m device

* um?

* um?

* nn

* helps?

* um?

* no gc?

* fix
2025-09-14 10:56:59 +03:00
George Hotz
d5bc27797b fix some multitensor on rangeify (#12162)
* fix some multitensor on rangeify

* rangeify multi hacks

* copy on const
2025-09-14 14:31:57 +08:00
Meng Zhuo
4b7904eca9 add cpu support for riscv64 (#12136) 2025-09-14 11:40:58 +08:00
George Hotz
bcafa72b7f use tags instead of graph_rewrite_map in rangeify (#12110)
* use tags instead of graph_rewrite_map in rangeify

* new style, add realize

* metadata works

* simple failure

* fix

* loops

* stuff becomes a NOOP when you remove it

* stuff becomes a NOOP when you remove it

* tags on bufferize

* bmnist works

* locals don't work

* shippable

* fix some tests

* simpler map_realize

* remove const hack

* debuggable test

* broke

* assign test

* straight up bug

* wooo it passes

* sink shouldn't be there

* fix ops

* bmnist

* kv cache ish

* Set RANGEIFY context variable to 0

* should work normal

* better

* types

* hacks to fix test_symbolic

* pm_add_buffers

* tests should pass
2025-09-14 11:39:01 +08:00
chenyu
d2316ba91a don't validate output in sdxl with fakeweights (#12160)
NULL backend passed validation before because both desired and actual went through NULL backend
2025-09-13 21:47:51 -04:00
nimlgen
b1d1816f43 device: fix envvars (#12159) 2025-09-13 23:38:09 +03:00
nimlgen
19d9d29b7e device: compilers in tinygrad.device (#12151)
* hcq: do not spam with errors in -m device

* -m tinygrad p2

* fix

* ugh

* comp in ckey

* fix

* one more

* print defaults

* xx
2025-09-13 21:45:29 +03:00
qazal
6410dcb7c2 viz: less verbose render loop (#12158)
* define visible once

* move y offsets to one place
2025-09-13 19:04:37 +03:00
nimlgen
92df52d79a make method_cache account for compiler (#12156)
* make method_cache account for compiler

* sorry
2025-09-13 17:00:11 +03:00
chenyu
0c392089d9 update mypy (#12155) 2025-09-13 09:48:38 -04:00
qazal
fbca6183ad do not launch BEAM when opts_to_apply exists [pr] (#12152) 2025-09-13 14:57:46 +03:00
George Hotz
b2a95d32bb check clSetKernelArg (#12149) 2025-09-13 17:24:55 +08:00
George Hotz
0695e322a8 fix android cpu device (#12148) 2025-09-13 15:42:04 +08:00
Sieds Lykles
e3a3764917 delete fold_unrolled_divs (#12146) 2025-09-13 03:09:36 +02:00
Sieds Lykles
51ed6e94b2 AxisType __repr__ method (#12145) 2025-09-13 01:15:38 +02:00
Sieds Lykles
0757a9a819 add pytest-timeout of 3 min per item (#12144)
* add pytest-timeout with timeout of 3 min

* func_only
2025-09-13 00:48:41 +02:00