Commit Graph

4317 Commits

Author SHA1 Message Date
Shun Usami
34a05b31fe Fix advanced tensor indexing setitem (#12128)
* Add failure test case for advanced tensor indexing setitem

* Fix advanced tensor indexing setitem when permuted

* Reduce line count

* Revert unnecessary change

* Combine two lines into one
2025-09-14 15:22:40 -04:00
chenyu
d09c0f28c5 increase test_module_runs (#12173)
timed out on ci windows llvm
2025-09-14 15:19:21 -04:00
chenyu
12a910f1d2 update torch 2.8 (#12172)
support _reshape_alias. something is wrong with one case of unfold
2025-09-14 15:19:03 -04:00
chenyu
98ecab7563 remove ml_dtypes (#12169) 2025-09-14 14:20:05 -04:00
qazal
02054b53fe remove tests that pre date the uop spec (#12168)
* remove tests that pre date the uop spec

* const src

* for RANGEIFY=1

* update with bind

* remove import
2025-09-14 18:47:42 +03:00
qazal
1591e4f66b update outbufs selection in test_linearizer [pr] (#12166) 2025-09-14 13:46:49 +03:00
George Hotz
bcafa72b7f use tags instead of graph_rewrite_map in rangeify (#12110)
* use tags instead of graph_rewrite_map in rangeify

* new style, add realize

* metadata works

* simple failure

* fix

* loops

* stuff becomes a NOOP when you remove it

* stuff becomes a NOOP when you remove it

* tags on bufferize

* bmnist works

* locals don't work

* shippable

* fix some tests

* simpler map_realize

* remove const hack

* debuggable test

* broke

* assign test

* straight up bug

* wooo it passes

* sink shouldn't be there

* fix ops

* bmnist

* kv cache ish

* Set RANGEIFY context variable to 0

* should work normal

* better

* types

* hacks to fix test_symbolic

* pm_add_buffers

* tests should pass
2025-09-14 11:39:01 +08:00
nimlgen
b1d1816f43 device: fix envvars (#12159) 2025-09-13 23:38:09 +03:00
nimlgen
92df52d79a make method_cache account for compiler (#12156)
* make method_cache account for compiler

* sorry
2025-09-13 17:00:11 +03:00
Sieds Lykles
e3a3764917 delete fold_unrolled_divs (#12146) 2025-09-13 03:09:36 +02:00
Sieds Lykles
2fc0bd150b Arange overflow raises error and one_hot upcast (#11975)
* add error

* to_dtype

* shorten line

* add test

* upcast one hot dim im overflows
2025-09-13 00:18:25 +02:00
chenyu
aac3dceaf6 merge two PYTHON backend ci job (#12143)
* merge two PYTHON backend ci job

and mark anything that takes > 10 in test_ops slow

* two more
2025-09-12 17:36:46 -04:00
ttomsa
a12d0933c1 fix vec dtype in fast idiv (#12080)
* fix

* add vec dtypes to fuzzer

* add vec=False

---------

Co-authored-by: Sieds Lykles <93992551+S-Lykles@users.noreply.github.com>
2025-09-12 23:00:43 +02:00
chenyu
25091951ba update test/models (#12142)
minor fix and run more stuff in tinygrad for speed
2025-09-12 16:43:28 -04:00
Sieds Lykles
62376c8b2b update store load noop pattern to use Invalid (#12141)
* update pattern

* add test
2025-09-12 22:25:53 +02:00
chenyu
647965fb09 test_train cleanup (#12140)
* test_train cleanup

remove skipIf due to buffer sizes, runs locally

* those are slow
2025-09-12 13:21:30 -04:00
qazal
e80c8a7548 merge TestIndexing with TestSchedule + remove duplicate tests (#12134)
* merge TestIndexing with TestSchedule

* remove the arange_copy tests

* no FUSE_ARANGE import
2025-09-12 10:35:14 +03:00
Sieds Lykles
b5a3b8de20 remove where on gated load if gates are the same (#12129)
* add rules

* add tests
2025-09-12 06:52:35 +02:00
George Hotz
0766616962 isolate the const hacks in the old kernelize (#12126)
* isolate the const hacks in the old kernelize

* if rangeify, don't waste time
2025-09-12 08:35:35 +08:00
Sieds Lykles
1f3950a484 Invalid idx (#12067)
* merge index_dtype_3

* new lowering with Invalid idx

* remove that dtype from range

* finish merge

* annotate better

* indentation

* dont need that anymore

* always process replay for openpilot

* more uop_given_valid for idx

* valid past index_child

* fix bug preventing load getting an alt value

* add track_match_stats back in in shapetracker and remove cache

* get_valid_idx -> get_valid and get_idx

* fix heuristics with new idx

* split line

* fix typo

* fix signature

* dont skip idx if stride is 0

the idx may still be invalid

* lower const with new valid

* delete to_indexed_uops

* update shapetracker test

* delete axis_is_masked

* add cache back

* move around comment

* fix get_valid bug

* move invalid fold to symbolic so its earlier

* cleanup

* update applying padto to new idx

* add unit tests

* cleanup

* fold line

* improve spec

* dont try to render Invalid as a float

* more consistent invalid index

* update some tests

* Fold index with true cond

* skip test

* vconst min max if Invalid in arg

* fix signature of UOp.const

* add test for min/max of Invalid CONST/VCONST

* add InvalidType to as_const signature

* is Invalid to isinstance

* Add InvalidType to ConstLike

* index gate is a where gate

* make that a metaclass

* fix heurisics for new idx

* mypy happy
2025-09-12 01:42:02 +02:00
chenyu
544eb2c402 clean up test_scatter_reduce (#12125) 2025-09-11 16:36:58 -04:00
chenyu
9ad6a56d17 smaller test_simple_reduce (#12124) 2025-09-11 15:45:38 -04:00
chenyu
3a83b56da5 fix test_dequantization_mxfp4 (#12123)
* fix test_dequantization_mxfp4

* assert_allclose

* rtol
2025-09-11 14:22:06 -04:00
chenyu
520e2e0727 actually run unit tests in ci MacOS (unit) (#12122)
* actually run unit tests in ci MacOS (unit)

* that's always wrong
2025-09-11 13:32:30 -04:00
chenyu
20cd7177de delete test_bert_fuse_arange (#12121)
* delete test_bert_fuse_arange

it's the default now and we are not interested in FUSE_ARANGE=0 version

* remove -v
2025-09-11 12:35:51 -04:00
George Hotz
3ef0e5e01e rangeify: use Ops.REALIZE and not Ops.CONTIGUOUS if it's added by system (#12111)
* rangeify: use Ops.REALIZE and not Ops.CONTIGUOUS if it's added by system

* fix contig + BufferizeOpts

* no outerworld
2025-09-11 11:56:59 +08:00
b1tg
52ebed991e change dtype promo lattice when fp8s is supported (#12088)
* change dtype promo lattice when fp8s is supported

* no device check

* int64 + uint64 => fp8
2025-09-10 22:09:11 -04:00
George Hotz
d4eba5800d rangeify cost function infrastructure (#12091)
* one call to hc opt

* does that pass?

* add cost function to rangeify

* test

* more test

* gate thread

* bufferize has shape

* ish

* match old behavior

* no ci there
2025-09-11 07:19:53 +08:00
Sieds Lykles
3989f5b559 Revert "Simplify valid in symbolic (#12104)" (#12108)
This reverts commit 73d479a016.
2025-09-10 23:36:40 +02:00
Sieds Lykles
73d479a016 Simplify valid in symbolic (#12104)
* cleanup cast_folding

* from sym to symbolic

* no more sym in dtype lowering

* move around simplify_valid

* update test
2025-09-10 23:26:19 +02:00
chenyu
0e266f376c ops_gpu -> ops_cl (#12103) 2025-09-10 15:15:48 -04:00
nimlgen
fb96394ff5 auto-select available compilers (#12094)
* device: auto select compilers

* fix

* metal+opencl

* nv/cuda

* test without ptx

* ptx

* fix tests

* fix

* fix test

* rename

* test + cleaner

* xx

* ops

* better test

* win?

* um?

* types

* debug

* win??

* sep rung

* wtf?

* debug

* skip win

* revert this

* types
2025-09-10 19:52:01 +03:00
George Hotz
9789337722 early reduce simplify (#12046)
* early reduce simplify

* min changes

* need that

* that goes in simplify

* no more arange reduce opt
2025-09-10 21:02:46 +08:00
nimlgen
551560b87c do not use getenv('PTX') in tests (#12095)
* test without ptx

* fix tests

* fix test

* linters
2025-09-10 14:04:07 +03:00
Sieds Lykles
0e420e68b4 delete axis_is_masked (#12092) 2025-09-10 05:26:19 +02:00
Sieds Lykles
499f50483b x | !x -> True (#12090) 2025-09-10 03:26:01 +02:00
Sieds Lykles
5b73076e48 assert benchmark times (#12042)
* assert jitted times in openpilot

* better error

* better error

* add ASSERT_MIN_STEP_TIME to more models

* t is step_times

* update benchmark times

* update times
2025-09-09 23:40:02 +02:00
b1tg
58d13a6e3e remove redundant check (#12087) 2025-09-09 15:15:39 -04:00
b1tg
82e955fe79 fix inf bug in float_to_fp8 (#12085) 2025-09-09 12:02:56 -04:00
b1tg
14faf7a5c0 AutoCastType tests for fp8s/bf16 (#12084) 2025-09-09 11:33:01 -04:00
nimlgen
1c6c42715f unify cpu and llvm (#11982)
* try unify cpu and llvm

* fixes

* fix

* ops

* no llvm

* fix

* rm

* lvmm is ot

* oops

* override

* no llvm

* ignore

* skip llvm

* ooops
2025-09-09 13:54:44 +03:00
nimlgen
9182948951 remove llvm_bf16_cast (#12075) 2025-09-08 20:51:15 +03:00
Sieds Lykles
75b58fe2d3 move simplify_valid pat to sym (#12065)
* move simplify_valid pat to sym

* fix expectedfailure
2025-09-08 07:01:26 +02:00
chenyu
56861852be enable IMAGE for test_mnist and test_mnist_backward (#12064)
passes now
2025-09-07 09:06:39 -04:00
nimlgen
10ac427aaa cpu threading (#11951)
* start cpu threading

* fix

* fix2

* fix

* hacks?

* threads

* minor

* no dsp

* dsp 2

* n

* more

* test

* xm

* cleaner

* readable

* f

* reorder

* when no threads

* rangeify

* typos

* not needed

* reapply

* remoev this

* linter

* fixed cpu count in ci

* fix

* fixes

* rm

* typo

* sort based on speed

* test if test works in ci

* Revert "test if test works in ci"

This reverts commit 1f05edb531.

* do not pad thread
2025-09-06 16:13:43 +03:00
Sieds Lykles
581b2388c2 add dtypes.index (#12015)
* add dtypes.index

* cast shape, stride and mask to dtypes.index in view.create

* move pm_lower_index_dtype to ops

* DEFINE_VAR is dtype.index by default

* merge var_val_using_str

* remove int from commutative

* fix test_rewrite_map

* change that to dtypes.index

* change some int to index

* shorten those

* remove old cast in renderer

* cleanup

* change that back

* add comment

* delete comment

* just delete those

* view doesnt have to cast anymore

* adjust comment
2025-09-06 06:03:44 +02:00
Sieds Lykles
c6c16b2946 var_vals uses str for var (#12011)
* var_vals is str,int

* remove imports

* remove print

* fix test

* change var_vals in hcq

* update test_hcq

* fix multitensor _device_num var

* fix syminfer test

* shorten line

* p.vars stays list[Variable]

* shorten line

* vars is back to tuple[Variable, ...]

* change var_vals in extra

* change var_vals from shapetracker

* var_vals is str:int

* fix signature
2025-09-06 04:16:12 +02:00
George Hotz
38dcadf07b delete kernel.py (#12040)
* delete kernel.py

* delete that file

* rip and tear

* don't test search

* imports

* fix torch frontend

* not a part of regen
2025-09-05 15:52:07 -07:00
George Hotz
ee4f696086 delete more tests (#12043)
* delete more tests

* delete and simplify

* flaky on windows

* a few more, those remained
2025-09-05 15:31:30 -07:00
George Hotz
12c7b1bb01 cleanup lin tests without Kernel (#12041)
* cleanup lin tests without Kernel

* no kernel.py there

* remove that test
2025-09-05 15:13:14 -07:00