George Hotz
d5bc27797b
fix some multitensor on rangeify ( #12162 )
...
* fix some multitensor on rangeify
* rangeify multi hacks
* copy on const
2025-09-14 14:31:57 +08:00
Meng Zhuo
4b7904eca9
add cpu support for riscv64 ( #12136 )
2025-09-14 11:40:58 +08:00
George Hotz
bcafa72b7f
use tags instead of graph_rewrite_map in rangeify ( #12110 )
...
* use tags instead of graph_rewrite_map in rangeify
* new style, add realize
* metadata works
* simple failure
* fix
* loops
* stuff becomes a NOOP when you remove it
* tags on bufferize
* bmnist works
* locals don't work
* shippable
* fix some tests
* simpler map_realize
* remove const hack
* debuggable test
* broke
* assign test
* straight up bug
* wooo it passes
* sink shouldn't be there
* fix ops
* bmnist
* kv cache ish
* Set RANGEIFY context variable to 0
* should work normal
* better
* types
* hacks to fix test_symbolic
* pm_add_buffers
* tests should pass
2025-09-14 11:39:01 +08:00
chenyu
d2316ba91a
don't validate output in sdxl with fakeweights ( #12160 )
...
NULL backend passed validation before because both desired and actual went through NULL backend
2025-09-13 21:47:51 -04:00
nimlgen
b1d1816f43
device: fix envvars ( #12159 )
2025-09-13 23:38:09 +03:00
nimlgen
19d9d29b7e
device: compilers in tinygrad.device ( #12151 )
...
* hcq: do not spam with errors in -m device
* -m tinygrad p2
* fix
* ugh
* comp in ckey
* fix
* one more
* print defaults
* xx
2025-09-13 21:45:29 +03:00
qazal
6410dcb7c2
viz: less verbose render loop ( #12158 )
...
* define visible once
* move y offsets to one place
2025-09-13 19:04:37 +03:00
nimlgen
92df52d79a
make method_cache account for compiler ( #12156 )
...
* make method_cache account for compiler
* sorry
2025-09-13 17:00:11 +03:00
chenyu
0c392089d9
update mypy ( #12155 )
2025-09-13 09:48:38 -04:00
qazal
fbca6183ad
do not launch BEAM when opts_to_apply exists [pr] ( #12152 )
2025-09-13 14:57:46 +03:00
George Hotz
b2a95d32bb
check clSetKernelArg ( #12149 )
2025-09-13 17:24:55 +08:00
George Hotz
0695e322a8
fix android cpu device ( #12148 )
2025-09-13 15:42:04 +08:00
Sieds Lykles
e3a3764917
delete fold_unrolled_divs ( #12146 )
2025-09-13 03:09:36 +02:00
Sieds Lykles
51ed6e94b2
AxisType __repr__ method ( #12145 )
2025-09-13 01:15:38 +02:00
Sieds Lykles
0757a9a819
add pytest-timeout of 3 min per item ( #12144 )
...
* add pytest-timeout with timeout of 3 min
* func_only
2025-09-13 00:48:41 +02:00
Sieds Lykles
2fc0bd150b
Arange overflow raises error and one_hot upcast ( #11975 )
...
* add error
* to_dtype
* shorten line
* add test
* upcast one hot dim if it overflows
2025-09-13 00:18:25 +02:00
chenyu
aac3dceaf6
merge two PYTHON backend ci job ( #12143 )
...
* merge two PYTHON backend ci job
and mark anything that takes > 10 in test_ops slow
* two more
2025-09-12 17:36:46 -04:00
ttomsa
a12d0933c1
fix vec dtype in fast idiv ( #12080 )
...
* fix
* add vec dtypes to fuzzer
* add vec=False
---------
Co-authored-by: Sieds Lykles <93992551+S-Lykles@users.noreply.github.com>
2025-09-12 23:00:43 +02:00
chenyu
25091951ba
update test/models ( #12142 )
...
minor fix and run more stuff in tinygrad for speed
2025-09-12 16:43:28 -04:00
Sieds Lykles
62376c8b2b
update store load noop pattern to use Invalid ( #12141 )
...
* update pattern
* add test
2025-09-12 22:25:53 +02:00
chenyu
647965fb09
test_train cleanup ( #12140 )
...
* test_train cleanup
remove skipIf due to buffer sizes, runs locally
* those are slow
2025-09-12 13:21:30 -04:00
chenyu
0fad07c684
viz serve default path ( #12139 )
...
`python tinygrad/viz/serve.py` shows last session instead of an empty page
2025-09-12 18:32:44 +03:00
nimlgen
81e33b8439
system: cpu memory mappings are uncached ( #12137 )
...
* system: cpu memory mappings are uncached
* adm amd
2025-09-12 13:28:25 +03:00
qazal
68b0ad05a4
viz: format tuple tags ( #12135 )
...
* viz: format tuple tags
* use python repr
2025-09-12 11:36:53 +03:00
qazal
e80c8a7548
merge TestIndexing with TestSchedule + remove duplicate tests ( #12134 )
...
* merge TestIndexing with TestSchedule
* remove the arange_copy tests
* no FUSE_ARANGE import
2025-09-12 10:35:14 +03:00
Sieds Lykles
b5a3b8de20
remove where on gated load if gates are the same ( #12129 )
...
* add rules
* add tests
2025-09-12 06:52:35 +02:00
George Hotz
a2f502b89e
fix rangeify=1 ops on GPU ( #12130 )
2025-09-12 11:17:37 +08:00
George Hotz
0766616962
isolate the const hacks in the old kernelize ( #12126 )
...
* isolate the const hacks in the old kernelize
* if rangeify, don't waste time
2025-09-12 08:35:35 +08:00
Sieds Lykles
1f3950a484
Invalid idx ( #12067 )
...
* merge index_dtype_3
* new lowering with Invalid idx
* remove that dtype from range
* finish merge
* annotate better
* indentation
* dont need that anymore
* always process replay for openpilot
* more uop_given_valid for idx
* valid past index_child
* fix bug preventing load getting an alt value
* add track_match_stats back in in shapetracker and remove cache
* get_valid_idx -> get_valid and get_idx
* fix heuristics with new idx
* split line
* fix typo
* fix signature
* dont skip idx if stride is 0
the idx may still be invalid
* lower const with new valid
* delete to_indexed_uops
* update shapetracker test
* delete axis_is_masked
* add cache back
* move around comment
* fix get_valid bug
* move invalid fold to symbolic so its earlier
* cleanup
* update applying padto to new idx
* add unit tests
* cleanup
* fold line
* improve spec
* dont try to render Invalid as a float
* more consistent invalid index
* update some tests
* Fold index with true cond
* skip test
* vconst min max if Invalid in arg
* fix signature of UOp.const
* add test for min/max of Invalid CONST/VCONST
* add InvalidType to as_const signature
* is Invalid to isinstance
* Add InvalidType to ConstLike
* index gate is a where gate
* make that a metaclass
* fix heuristics for new idx
* mypy happy
2025-09-12 01:42:02 +02:00
chenyu
544eb2c402
clean up test_scatter_reduce ( #12125 )
2025-09-11 16:36:58 -04:00
chenyu
9ad6a56d17
smaller test_simple_reduce ( #12124 )
2025-09-11 15:45:38 -04:00
chenyu
e5ef9ec5b1
remove IGNORE_OOB=0 in ci tests ( #12117 )
2025-09-11 15:05:04 -04:00
chenyu
3a83b56da5
fix test_dequantization_mxfp4 ( #12123 )
...
* fix test_dequantization_mxfp4
* assert_allclose
* rtol
2025-09-11 14:22:06 -04:00
chenyu
520e2e0727
actually run unit tests in ci MacOS (unit) ( #12122 )
...
* actually run unit tests in ci MacOS (unit)
* that's always wrong
2025-09-11 13:32:30 -04:00
nimlgen
acb700fc26
ci: fix ptx env ( #12120 )
2025-09-11 12:42:15 -04:00
chenyu
20cd7177de
delete test_bert_fuse_arange ( #12121 )
...
* delete test_bert_fuse_arange
it's the default now and we are not interested in FUSE_ARANGE=0 version
* remove -v
2025-09-11 12:35:51 -04:00
chenyu
b07f962058
split metal model tests ( #12119 )
...
* split metal model tests
* llama too
2025-09-11 12:20:12 -04:00
chenyu
66593f135f
remove duplicated test_real_world ( #12118 )
...
included in the test/models right below
2025-09-11 11:57:14 -04:00
qazal
e76211fcbc
viz: specify all rect styles in parent ( #12115 )
...
* viz: specify all rect styles in parent
Visually a no-op, but it's easier to reason about when the rect's coloring comes from `g` parent that holds UOp data.
* this stays
2025-09-11 13:48:59 +03:00
nimlgen
400ad93892
ci: gate boost paths for macos only ( #12114 )
2025-09-11 12:48:34 +03:00
George Hotz
3ef0e5e01e
rangeify: use Ops.REALIZE and not Ops.CONTIGUOUS if it's added by system ( #12111 )
...
* rangeify: use Ops.REALIZE and not Ops.CONTIGUOUS if it's added by system
* fix contig + BufferizeOpts
* no outerworld
2025-09-11 11:56:59 +08:00
b1tg
52ebed991e
change dtype promo lattice when fp8s is supported ( #12088 )
...
* change dtype promo lattice when fp8s is supported
* no device check
* int64 + uint64 => fp8
2025-09-10 22:09:11 -04:00
George Hotz
d4eba5800d
rangeify cost function infrastructure ( #12091 )
...
* one call to hc opt
* does that pass?
* add cost function to rangeify
* test
* more test
* gate thread
* bufferize has shape
* ish
* match old behavior
* no ci there
2025-09-11 07:19:53 +08:00
qazal
78610b681e
viz: light up children ( #12107 )
...
* viz: light up children
* keep tag coloring
2025-09-11 01:28:01 +03:00
Sieds Lykles
3989f5b559
Revert "Simplify valid in symbolic ( #12104 )" ( #12108 )
...
This reverts commit 73d479a016.
2025-09-10 23:36:40 +02:00
Sieds Lykles
73d479a016
Simplify valid in symbolic ( #12104 )
...
* cleanup cast_folding
* from sym to symbolic
* no more sym in dtype lowering
* move around simplify_valid
* update test
2025-09-10 23:26:19 +02:00
chenyu
e306650d39
remove GPUDevice ( #12106 )
2025-09-10 16:35:00 -04:00
George Hotz
d8a7a1c9c7
BUFFERIZE shape should be each range, not the product ( #12105 )
...
* BUFFERIZE shape should be each range, not the product
* fix tests
* resolve
2025-09-11 04:02:24 +08:00
Sieds Lykles
3730172c10
cleanup cast_folding ( #12101 )
...
* cleanup cast_folding
* from sym to symbolic
* no more sym in dtype lowering
2025-09-10 21:30:20 +02:00
chenyu
0e266f376c
ops_gpu -> ops_cl ( #12103 )
2025-09-10 15:15:48 -04:00