ttomsa
4905af4ae0
remove invalid int div test ( #11106 )
...
* rm test
* also rm this
2025-07-05 18:57:55 -04:00
qazal
a4aa769c0a
fix: type checking for track_rewrites key [pr] ( #11104 )
...
* fix: type checking for track_rewrites key [pr]
* also for cpu_profile
* func.__name__ to start
2025-07-05 20:11:21 +03:00
qazal
81781dc12b
viz: renames and spacing changes to tracing ( #11102 )
2025-07-05 18:40:39 +03:00
qazal
7619bf35e7
cleanup: remove disabled TestIndexingOrdering ( #11101 )
...
* cleanup: remove disabled TestIndexingOrdering
* don't import kernelize internals
2025-07-05 18:14:37 +03:00
qazal
4fcfaa0ef7
viz: switch to TracingKey ( #11100 )
...
* viz: switch to TracingKey
* tuple
* order is name, keys, fmt
* add test_tracing_key
2025-07-05 17:46:18 +03:00
qazal
458be950d9
viz: add TINY device ( #11095 )
...
* viz: add TINY device
* replace Any with a proper type
* reorder
* diff
* rename
* space
* from diff
* multiple keys
2025-07-05 16:54:55 +03:00
nimlgen
4dccb2ea49
am_smi: increase kill retries ( #11099 )
2025-07-05 16:23:50 +03:00
chenyu
39b4d72687
remove flatten and reshape in sparse_categorical_crossentropy [pr] ( #11093 )
...
not needed, directly operating on the classes dim is fine
2025-07-04 15:15:27 -04:00
nimlgen
577afc9f05
hcq: remove redundant syncs and fix typing ( #11096 )
...
Before this patch the code could issue redundant syncs because of
the typing issue. Current tests should cover all correctness checks.
2025-07-04 21:49:47 +03:00
qazal
41aa54eb5a
viz: resolve all graph references in python ( #11087 )
...
* viz: resolve all graph references in python
* it just maps things to the index
* always map the name
* key on the uop
* diff
* close
2025-07-04 20:35:25 +03:00
qazal
3d8569f6d8
hotfix: infinite loop in tracking pattern matcher ( #11094 )
...
* failing test
* fix that
* given matchers
2025-07-04 19:55:26 +03:00
qazal
a783211fc7
viz: allow end_time=None in trace events ( #11092 )
2025-07-04 17:48:17 +03:00
0xSG
17119b0f23
hip_ioctl: platform.machine added ( #11084 )
2025-07-04 17:20:24 +03:00
nimlgen
6656aa162c
nv: enable huge pages ( #11091 )
2025-07-04 17:17:24 +03:00
nimlgen
01f3c4f44d
memory: simpler paddr allocation logic ( #11090 )
...
* memory: new paddr allocation logic
* am fix
* am refactors
* fix
* mypy
* use it
* am
2025-07-04 17:00:36 +03:00
qazal
f6d55d9272
viz: pickle UPat location ( #11086 )
2025-07-04 13:09:00 +03:00
qazal
2403f126ed
move printable out of UPat [pr] ( #11085 )
...
* move printable out of UPat [pr]
* print_match_stats
2025-07-04 12:31:11 +03:00
qazal
988540f401
support capturing cpu_profile on error ( #11078 )
...
* support capturing cpu_profile on error
* spacing
* pylint complains
2025-07-04 11:53:12 +03:00
chenyu
a2f5a54458
move sparse_categorical_crossentropy to test_ops ( #11083 )
...
also flattened the tests
2025-07-03 21:40:54 -04:00
chenyu
7c8ccb0267
sparse_categorical_crossentropy cleanup [pr] ( #11082 )
2025-07-03 18:32:52 -04:00
nimlgen
e02ee8ef1b
nv: cleanups from 5090 ( #11081 )
2025-07-04 00:08:47 +03:00
George Hotz
e9a01dd04a
Revert "Fix division by zero in add views ( #11075 )" ( #11080 )
...
This reverts commit 19f07e72f6.
2025-07-03 11:39:44 -07:00
Sieds Lykles
19f07e72f6
Fix division by zero in add views ( #11075 )
2025-07-03 11:37:59 -07:00
chenyu
678cabc6f2
use argfix in Tensor.stack ( #11077 )
...
works for multiple Tensor args or a single tuple/list of Tensors, but not a mix of the two
2025-07-03 12:15:11 -04:00
qazal
b695e8c4d6
viz: remove support for naming with self ( #11076 )
2025-07-03 17:29:14 +03:00
Sieds Lykles
53985297bd
add test, fix rewrite rule and raise error on division by zero ( #11073 )
2025-07-03 08:25:06 -04:00
nimlgen
2d138c6cf1
am: factor out init_sw ( #11070 )
2025-07-03 11:01:17 +03:00
quortus
a937ac80dc
Replace ASSIGN with STORE in UPat compiler ( #11065 )
2025-07-02 19:15:43 -07:00
George Hotz
d049639221
little setitem test ( #11064 )
...
* setitem has one less realize, why broken
* put realize back
2025-07-02 15:10:24 -07:00
quortus
17d85b9793
Refactor STORE implementation in ops_python ( #11060 )
2025-07-02 14:29:12 -07:00
George Hotz
3b85534df0
outerworld range test [pr] ( #11059 )
...
* outerworld range test [pr]
* bound range
* grad acc test
* more tests
* 5 steps is fine
2025-07-02 14:28:44 -07:00
chenyu
425d5f55c4
generate kernel dataset and upload artifact ( #11063 )
2025-07-02 17:21:25 -04:00
chenyu
09cc64eea7
remove const 0 clause in "UOp with size 0 is zero" [pr] ( #11061 )
2025-07-02 16:36:40 -04:00
chenyu
4d57437a67
add timeout to benchmark_search and mlperf action ( #11058 )
...
default timeout is 6 hours which is too long and occupies a box
2025-07-02 14:17:34 -04:00
nimlgen
6067568087
nv: remove hardcoded CTRL_CMD_VASPACE_COPY_SERVER_RESERVED_PDES ( #11057 )
2025-07-02 20:41:10 +03:00
qazal
ad155f5454
print inputs to get_program in process replay [pr] ( #11051 )
...
* print inputs to get_program in process replay [pr]
* colors
* keep dataclass default escapes
* Revert "keep dataclass default escapes"
This reverts commit c6db7e8a7a.
* note for ast_repr
* add that back
2025-07-02 20:20:01 +03:00
Ignacio Sica
a22aa77c82
cleanup opts_to_apply ( #11055 )
...
* fix kernelinfo init in fixup_ast
* opts_to_apply None
2025-07-02 20:03:19 +03:00
qazal
a919b8325b
add test_kernel_info ( #11054 )
...
* add test_kernel_info
* reorder
2025-07-02 19:48:12 +03:00
kevvz
3b041d188f
[bounty] Singular Value Decomposition ( #10875 )
...
* initial commit
* add qr + expand svd to full matrix
* add odd number support
* add linalg tests
* qr supports dims of arbitrary size
* add qr tests
* svd supports dims of arbitrary size
* small cleanup
* improvements over svd batch handling
* improve linalg tests
* make u_pad match q shape
* add nonfull matrix tests
* little less verbose nonfull svd test
* added dtypes on svd + return vt instead of vt
* lint
* more lint
* lint + set seed
* small fix
* small lint
* lint
* add int casting to indices and shapes
* remove int from shape tuple in svd
* small cleanup
* add return types
* reuse inverse_permute
* refactoring
* whitespace
* remove regularization term to prevent bad outputs on ill conditioned matrices
* remove seed
* refactor
* lint
* refactor
* spacing
* remove clone
* line reduction
* smarter heuristic for iterations_per_round
* add big test
* lint
* turns out no constant needed?
* wrap tests
* some small matrices need the constant
* remove realize
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-07-02 09:06:03 -07:00
Ignacio Sica
fc42c3063e
use kernel info ( #11049 )
...
* use kernel info
* keep api
* revert change in comment
2025-07-02 08:42:32 -07:00
Ahmed Harmouche
e992ed10dc
WebGPU on Windows ( #10890 )
...
* WebGPU on Windows
* Fix dawn-python install
* New test
* pydeps
* Minor fix
* Only install dawn-python on windows webgpu
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-07-02 08:38:45 -07:00
nimlgen
e67a6d2310
nv: tiny cleanups ( #11053 )
2025-07-02 18:37:32 +03:00
chenyu
4626e9c172
is_numpy_ndarray helper [pr] ( #11050 )
2025-07-02 09:12:53 -04:00
qazal
452b22c9b6
fix process replay diff in PYTHON device [pr] ( #11052 )
...
* fix process replay diff in PYTHON device [pr]
The PYTHON backend pickles and encodes UOps, the encoded binary can't be
directly diffed in process replay.
* note
2025-07-02 11:06:46 +03:00
geohotstan
8ebf0abaae
ONNX external_test_onnx_backend use PYTHON device for model ( #10915 )
...
* try
* ruff check --fix
* no skip test
* hmmmmmmm I don't get this D:
* run CI again
* why is PYTHON device faster than CPU?
* run ci again and fix lint
* actually doesn't PYTHON device make sense here?
* see cpu speed again
* Revert "see cpu speed again"
This reverts commit 1e366f2256.
* trigger CI
* pretty good
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-01 12:11:17 -04:00
qazal
8b0871ac31
viz: test for no lockup on infinite loop ( #11041 )
...
* viz: add test infinite loop fallback
* assert
* continue til the end
* work
* bring that back
* fallback to nop
2025-07-01 17:44:20 +03:00
b1tg
fcbefde8f5
fix DiskDevice reuse ( #11039 )
...
* fix DiskDevice reuse
* fix mypy and DiskDevice.count
* mypy
* add test
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-07-01 10:29:21 -04:00
George Hotz
5628e2054c
hotfix: if no ranges, return None
2025-06-30 18:07:56 -07:00
George Hotz
0597735f28
remove TC=3 not porting this ( #11045 )
2025-06-30 15:12:49 -07:00
George Hotz
cccfe6b422
hotfix: test_no_inf_loop_bottom_up
2025-06-30 14:21:45 -07:00