Commit Graph

10633 Commits

Author SHA1 Message Date
uuuvn
0d45e1a3ec Explicitly use CUDA_KERNEL_NODE_PARAMS v1 (#10776) 2025-06-11 16:27:50 +03:00
George Hotz
a38947b4bb move symbolic and transcendental to uop [pr] (#10771) 2025-06-10 20:51:22 -07:00
chenyu
81e296d7b8 remove Tensor.test() in retinanet (#10770)
test was removed
2025-06-10 22:14:57 -04:00
chenyu
25304c3dd0 default AMD_LLVM=1 (#10253) 2025-06-10 18:19:21 -04:00
George Hotz
9d0383634d bump cache and include full python version [pr] (#10768)
* bump cache and include full python version [pr]

* stupid windows

* really stupid windows
2025-06-10 15:07:30 -07:00
chenyu
612cdf5146 move fuzz_shape_ops to run with other fuzzer (#10767)
* move fuzz_shape_ops to run with other fuzzer

* don't skip CPU
2025-06-10 17:43:04 -04:00
chenyu
5e7ad70aae don't run linearize().uop tests in get_action_space test (#10766)
* don't run linearize().uop tests in get_action_space test

this part takes 2 minutes in CI and has nothing to do with action space. also not sure if the "for some reason" comment is still relevant

* -n=auto test/models
2025-06-10 17:23:53 -04:00
b1tg
52c49dd4f3 fix onnx ci (#10762)
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-06-10 14:28:40 -04:00
qazal
9e1d1ebc52 print tag in UOp [pr] (#10755) 2025-06-10 21:16:07 +03:00
chenyu
14fa62c61d move high level tests to unit (#10760)
either no backend is needed, or running on one to check suffices
2025-06-10 12:55:44 -04:00
George Hotz
0fbf3f5554 Revert "Revert "Update autogen ci runner to ubuntu 24.04 (#10736)" (#10757)" (#10758)
This reverts commit a6dba9b9d9.
2025-06-10 09:32:27 -07:00
George Hotz
a6dba9b9d9 Revert "Update autogen ci runner to ubuntu 24.04 (#10736)" (#10757)
This reverts commit 1d15374c7a.
2025-06-10 09:31:51 -07:00
uuuvn
1d15374c7a Update autogen ci runner to ubuntu 24.04 (#10736)
For `kfd.AMDKFD_IOC_EXPORT_DMABUF`
2025-06-10 08:33:02 -07:00
Adrian Wijaya
78b9c30640 move idiv to MathTraits [pr] (#10748) 2025-06-10 08:32:09 -07:00
Sieds Lykles
0daa4c6ed0 Add DType.min and DType.max properties (#10749)
* add properties

* cleaner test

* remove added newline
2025-06-10 08:31:34 -07:00
qazal
5d9c274924 keep UOp tags if sources are replaced (#10754)
* keep UOp tags in unified_rewrite

* add failing test, print tag if defined

* remove the repr change
2025-06-10 08:30:14 -07:00
qazal
3de4c9839f viz: display UOp tags (#10751)
* viz: display UOp tags

* g.tag
2025-06-10 16:02:23 +03:00
nimlgen
800d1796d5 am_smi: kill process group (#10750) 2025-06-10 15:23:39 +03:00
qazal
5bd4ad2e8b viz: remove unused arg (#10747) 2025-06-10 12:00:09 +03:00
George Hotz
413e223d6e Revert "remove cpu graph, it's different from the others (#10743)" (#10745)
This reverts commit 3d64a98432.
2025-06-09 22:40:48 -07:00
George Hotz
3d64a98432 remove cpu graph, it's different from the others (#10743)
* remove cpu graph, it's different from the others

* remote was blacklisting CPUGraph
2025-06-09 22:17:10 -07:00
George Hotz
245b1d3a46 move add/mul to MathTrait [pr] (#10741)
* move add to MathTrait [pr]

* both add and mul
2025-06-09 21:48:55 -07:00
George Hotz
c28eceaf44 move to mathtraits.py (#10742) 2025-06-09 21:17:35 -07:00
George Hotz
acf72872b3 move view left to the outer graph prereqs + testing (#10725)
* move view left to the outer graph

* global view right

* dont need that one

* remove comment

* test kernelize

* simple

* split onnx, test sdxl null

* fix testing

* ugh, wrong one

* Update test.yml
2025-06-09 20:43:25 -07:00
chenyu
b7198fdcfd linearizer failure from wino fuse arange cifar (#10739) 2025-06-09 23:10:19 -04:00
George Hotz
58eebdb507 don't reassign metadata to the same uop + ignore oob in pr [pr] (#10737) 2025-06-09 18:43:39 -07:00
chenyu
364b903850 minor cleanups in linearize.py [pr] (#10735) 2025-06-09 19:49:19 -04:00
George Hotz
81ef879da3 non recursive top_down_rewrite (#10729)
* non recursive top_down_rewrite

* nicer algorithm

* rewrite bottom up also

* only top down is broken?

* simpler iterative algo

* no recursion errors

* top down and bottom up

* unified rewrite

* simpler rewrite

* clean up comments

* move that comment
2025-06-09 16:33:04 -07:00
chenyu
53cbd4254b suppress filter_too_much on test_float_cast_to_unsigned (#10733)
flaky, already done in test_float_cast_to_unsigned_overflow and test_float_cast_to_unsigned_underflow
2025-06-09 18:30:04 -04:00
George Hotz
916bbd5c6b fixed point rewrite [pr] (#10732) 2025-06-09 14:46:20 -07:00
chenyu
55cdbb9a20 fix mask in expand into symbolic size (#10730)
failed before when old size is 1 and it expands into symbolic size, because `resolve(s != ns, False)` is False and it does not expand the mask
2025-06-09 17:33:22 -04:00
wozeparrot
926b11381c failing test for symbolic expand after pad (#10727)
* feat: failing test for symbolic expand after pad

* feat: mark test as failing
2025-06-09 16:55:21 -04:00
chenyu
49f999d919 update _reshape_mask for symbolic shape expand (#10726)
* don't merge shape symbolic reshape symbolic

* proper fix
2025-06-09 16:35:02 -04:00
wozeparrot
27dd97f688 support variable shape none slice in getitem (#10724) 2025-06-09 11:53:02 -07:00
Ignacio Sica
afd5140a09 remove no longer used IndexContext acc_num var (#10720) 2025-06-09 14:06:59 -04:00
George Hotz
f84c320548 better external_benchmark_schedule [pr] (#10722) 2025-06-09 10:26:11 -07:00
George Hotz
6270c0eac0 default ignore oob to 0 (#10660) 2025-06-09 10:25:43 -07:00
b1tg
24d328e313 onnx parser (#10435)
* onnx parser

* fix compile, lint

* onnx.load -> onnx_load

* compatible with ModelProto

* fix test external_test_onnx_ops.py

* fix tests

* fix signed int

* reduce to 261 lines

* fix TypeProto.Optional

* debug for _parse_message, add TypeProto.Sequence, cleanup

* onnx_load from Tensor

* remove BufferedReader

* 174 lines and reduce tensor copy

* cleanup

* use onnx_load in external_model_benchmark.py

* fix qcom test

* [onnx] parser support external data

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-09 12:44:28 -04:00
Sieds Lykles
cfa65bea05 Subtract 1 from Variable upper bound (#10715) 2025-06-09 09:25:53 -07:00
George Hotz
ef58ab340a hotfix: remove n=auto from REMOTE=1 test 2025-06-09 09:19:36 -07:00
qazal
419a1286f2 viz: share cacheKey [pr] (#10717) 2025-06-09 17:48:29 +03:00
chenyu
35523dc35f move BLOCK_REORDER to caller [pr] (#10711)
so block_reorder tests won't fail with flag set to 0
2025-06-08 23:26:01 -04:00
chenyu
bb34c28b36 debug flag for linearize block_reorder [pr] (#10710) 2025-06-08 22:26:06 -04:00
chenyu
d93a0bee6b mlperf ci uses its own cache (#10705)
not to interfere with regular cache which is used by benchmark
2025-06-08 19:43:32 -04:00
qazal
8cdf6e4d1e viz memory graph tiny fixes [pr] (#10709)
* sched_sink is a step

* offset for yaxis

* clear existing

* scale offset
2025-06-09 01:10:12 +03:00
George Hotz
81b9c04574 move high level stuff to unit tests [pr] (#10708)
* move high level stuff to unit tests [pr]

* process replay on unit tests

* fix pr, less compute

* set omp num threads

* set 200MB buffer size limit

* delete junk

* fix tests

* faster

* move test_indexing to unit

* faster
2025-06-08 14:05:56 -07:00
nimlgen
171580e9ec am: fix reg update (#10707) 2025-06-08 21:45:55 +03:00
George Hotz
4305f532d9 clean up apt stuff (#10706)
* clean up apt stuff

* single apt install

* fixes

* fix opencl + ldconfig
2025-06-08 11:06:09 -07:00
George Hotz
4e2c3560b4 smaller tests are faster tests [pr] (#10704)
* remove del spam from CI

* more

* preconstruct default buffer spec

* ignore those errors

* check exception

* more exception check

* skip stuff

* smaller tests mean faster tests

* a few more
2025-06-08 10:54:19 -07:00
George Hotz
67a1c92fc0 remove del spam from CI (#10699)
* remove del spam from CI

* more

* preconstruct default buffer spec

* ignore those errors

* check exception

* more exception check

* skip stuff
2025-06-08 10:14:30 -07:00