9368 Commits

Author SHA1 Message Date
qazal
f6d55d9272 viz: pickle UPat location (#11086) 2025-07-04 13:09:00 +03:00
qazal
2403f126ed move printable out of UPat [pr] (#11085)
* move printable out of UPat [pr]

* print_match_stats
2025-07-04 12:31:11 +03:00
qazal
988540f401 support capturing cpu_profile on error (#11078)
* support capturing cpu_profile on error

* spacing

* pylint complains
2025-07-04 11:53:12 +03:00
chenyu
a2f5a54458 move sparse_categorical_crossentropy to test_ops (#11083)
also flattened the tests
2025-07-03 21:40:54 -04:00
chenyu
7c8ccb0267 sparse_categorical_crossentropy cleanup [pr] (#11082) 2025-07-03 18:32:52 -04:00
nimlgen
e02ee8ef1b nv: cleanups from 5090 (#11081) 2025-07-04 00:08:47 +03:00
George Hotz
e9a01dd04a Revert "Fix division by zero in add views (#11075)" (#11080)
This reverts commit 19f07e72f6.
2025-07-03 11:39:44 -07:00
Sieds Lykles
19f07e72f6 Fix division by zero in add views (#11075) 2025-07-03 11:37:59 -07:00
chenyu
678cabc6f2 use argfix in Tensor.stack (#11077)
works for multiple Tensor args or single tuple/list of Tensors, but not the mixed
2025-07-03 12:15:11 -04:00
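The calling convention described in the message above (several positional arguments, or one tuple/list, but not a mix of both) can be illustrated with a small sketch. The name `argfix` is taken from the commit title; the body below is an assumption written for illustration, not a copy of tinygrad's helper, and `stack_like` is a hypothetical stand-in for a variadic API.

```python
from typing import Any, Tuple


def argfix(*args: Any) -> Tuple[Any, ...]:
    """Illustrative helper (assumed behavior): normalize f(a, b, c) and f([a, b, c])
    to the same tuple, and reject the mixed form f([a, b], c)."""
    if args and isinstance(args[0], (tuple, list)):
        if len(args) != 1:
            # A leading list/tuple followed by more positional args is ambiguous.
            raise ValueError(f"mixed arguments are not supported: {args!r}")
        return tuple(args[0])
    return args


def stack_like(*items: Any) -> Tuple[Any, ...]:
    """Hypothetical variadic API that accepts either calling style."""
    return argfix(*items)


if __name__ == "__main__":
    print(stack_like(1, 2, 3))    # positional arguments
    print(stack_like([1, 2, 3]))  # single list
    try:
        stack_like([1, 2], 3)     # mixed form is rejected
    except ValueError as err:
        print("rejected:", err)
```

Raising on the mixed form, rather than guessing the caller's intent, keeps the two accepted styles unambiguous.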
qazal
b695e8c4d6 viz: remove support for naming with self (#11076) 2025-07-03 17:29:14 +03:00
Sieds Lykles
53985297bd add test, fix rewrite rule and raise error on division by zero (#11073) 2025-07-03 08:25:06 -04:00
nimlgen
2d138c6cf1 am: factor out init_sw (#11070) 2025-07-03 11:01:17 +03:00
quortus
a937ac80dc Replace ASSIGN with STORE in UPat compiler (#11065) 2025-07-02 19:15:43 -07:00
George Hotz
d049639221 little setitem test (#11064)
* setitem has one less realize, why broken

* put realize back
2025-07-02 15:10:24 -07:00
quortus
17d85b9793 Refactor STORE implementation in ops_python (#11060) 2025-07-02 14:29:12 -07:00
George Hotz
3b85534df0 outerworld range test [pr] (#11059)
* outerworld range test [pr]

* bound range

* grad acc test

* more tests

* 5 steps is fine
2025-07-02 14:28:44 -07:00
chenyu
425d5f55c4 generate kernel dataset and upload artifact (#11063) 2025-07-02 17:21:25 -04:00
chenyu
09cc64eea7 remove const 0 clause in "UOp with size 0 is zero" [pr] (#11061) 2025-07-02 16:36:40 -04:00
chenyu
4d57437a67 add timeout to benchmark_search and mlperf action (#11058)
default timeout is 6 hours which is too long and occupies a box
2025-07-02 14:17:34 -04:00
nimlgen
6067568087 nv: remove hardcoded CTRL_CMD_VASPACE_COPY_SERVER_RESERVED_PDES (#11057) 2025-07-02 20:41:10 +03:00
qazal
ad155f5454 print inputs to get_program in process replay [pr] (#11051)
* print inputs to get_program in process replay [pr]

* colors

* keep dataclass default escapes

* Revert "keep dataclass default escapes"

This reverts commit c6db7e8a7a.

* note for ast_repr

* add that back
2025-07-02 20:20:01 +03:00
Ignacio Sica
a22aa77c82 cleanup opts_to_apply (#11055)
* fix kernelinfo init in fixup_ast

* opts_to_apply None
2025-07-02 20:03:19 +03:00
qazal
a919b8325b add test_kernel_info (#11054)
* add test_kernel_info

* reorder
2025-07-02 19:48:12 +03:00
kevvz
3b041d188f [bounty] Singular Value Decomposition (#10875)
* initial commit

* add qr + expand svd to full matrix

* add odd number support

* add linalg tests

* qr supports dims of arbitrary size

* add qr tests

* svd supports dims of arbitrary size

* small cleanup

* improvements over svd batch handling

* improve linalg tests

* make u_pad match q shape

* add nonfull matrix tests

* little less verbose nonfull svd test

* added dtypes on svd + return vt instead of vt

* lint

* more lint

* lint + set seed

* small fix

* small lint

* lint

* add int casting to indices and shapes

* remove int from shape tuple in svd

* small cleanup

* add return types

* reuse inverse_permute

* refactoring

* whitespace

* remove regularization term to prevent bad outputs on ill conditioned matrices

* remove seed

* refactor

* lint

* refactor

* spacing

* remove clone

* line reduction

* smarter heuristic for iterations_per_round

* add big test

* lint

* turns out no constant needed?

* wrap tests

* some small matrices need the constant

* remove realize

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-07-02 09:06:03 -07:00
Ignacio Sica
fc42c3063e use kernel info (#11049)
* use kernel info

* keep api

* revert change in comment
2025-07-02 08:42:32 -07:00
Ahmed Harmouche
e992ed10dc WebGPU on Windows (#10890)
* WebGPU on Windows

* Fix dawn-python install

* New test

* pydeps

* Minor fix

* Only install dawn-python on windows webgpu

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-07-02 08:38:45 -07:00
nimlgen
e67a6d2310 nv: tiny cleanups (#11053) 2025-07-02 18:37:32 +03:00
chenyu
4626e9c172 is_numpy_ndarray helper [pr] (#11050) 2025-07-02 09:12:53 -04:00
qazal
452b22c9b6 fix process replay diff in PYTHON device [pr] (#11052)
* fix process replay diff in PYTHON device [pr]

The PYTHON backend pickles and encodes UOps, the encoded binary can't be
directly diffed in process replay.

* note
2025-07-02 11:06:46 +03:00
geohotstan
8ebf0abaae ONNX external_test_onnx_backend use PYTHON device for model (#10915)
* try

* ruff check --fix

* no skip test

* hmmmmmmm I don't get this D:

* run CI again

* why is PYTHON device faster than CPU?

* run ci again and fix lint

* actually doesn't PYTHON device make sense here?

* see cpu speed again

* Revert "see cpu speed again"

This reverts commit 1e366f2256.

* trigger CI

* pretty good

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-01 12:11:17 -04:00
qazal
8b0871ac31 viz: test for no lockup on infinite loop (#11041)
* viz: add test infinite loop fallback

* assert

* continue til the end

* work

* bring that back

* fallback to nop
2025-07-01 17:44:20 +03:00
b1tg
fcbefde8f5 fix DiskDevice reuse (#11039)
* fix DiskDevice reuse

* fix mypy and DiskDevice.count

* mypy

* add test

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-07-01 10:29:21 -04:00
George Hotz
5628e2054c hotfix: if no ranges, return None 2025-06-30 18:07:56 -07:00
George Hotz
0597735f28 remove TC=3 not porting this (#11045) 2025-06-30 15:12:49 -07:00
George Hotz
cccfe6b422 hotfix: test_no_inf_loop_bottom_up 2025-06-30 14:21:45 -07:00
George Hotz
752c76ceb7 tc3 shape expand [pr] (#11043)
* tc3 shape expand [pr]

* remove unused stuff in lowerer
2025-06-30 13:38:14 -07:00
George Hotz
539b17fcbf expand local shape so shapes work [pr] (#11042) 2025-06-30 13:03:31 -07:00
nimlgen
9ea7deb515 hcq: select_iface shared (#11033)
* hcq: select_iface shared

* errs

* sorry

* upprt
2025-06-30 21:12:39 +03:00
qazal
013085da7d viz: only path "/" serves the UI (#11037)
The dict used to exist for /profiler and main localhost:8000, we don't
need it anymore.
2025-06-30 19:10:33 +03:00
George Hotz
b829331219 infinite loop detect in fixed_point_rewrite [pr] (#11038) 2025-06-30 08:57:29 -07:00
Nino Risteski
bc15e98f5c clean up unused imports in examples and update CI linting (#11024)
* clean up unused imports in examples

* enable unused import checking in examples

* lint

* ignore F541 and F841 - focus on unused imports only

* clean up

* restore tinygrad.frontend.torch for TINY_BACKEND

* tiny change
2025-06-30 08:21:27 -07:00
George Hotz
cb531dba42 detect infinite loop in graph rewrite [pr] (#11036) 2025-06-30 08:15:13 -07:00
qazal
710d734ce7 viz: don't need PICKLE_BUFFER=0 in capture (#11031) 2025-06-30 16:20:04 +03:00
qazal
2ea4737930 viz: fix newlines breaking label colors (#11030)
* viz: fix newlines breaking label colors

* TestViz.test_colored_label

* TestWordWrap
2025-06-30 13:39:44 +03:00
George Hotz
5911b71404 early support for bidirectional pattern matcher (#11027)
* early support for bidirectional pattern matcher

* expose it and add a test

* no bottom up arg there

* disable flaky test
2025-06-29 16:54:07 -07:00
George Hotz
ec1d97191d minor cleanup to lowerer [pr] (#11026)
* minor cleanup to lowerer [pr]

* add that rule to sym
2025-06-29 11:01:29 -07:00
Piyush
454bc3393d redundant code (#11014) 2025-06-29 09:06:10 -07:00
qazal
19b11cb778 hotfix: check canvas exists before access (#11022) 2025-06-29 14:44:14 +03:00
chenyu
126fcf4129 clean up AMD_LLVM in tests (#11021) 2025-06-28 22:45:47 -04:00
qazal
cb6a66ea84 viz: remove per schedule renderMemoryGraph (#11019)
replaced with per device Buffer viz https://github.com/tinygrad/tinygrad/pull/10960
2025-06-28 22:09:38 +03:00