George Hotz
770a558585
lil cleanups from uop branch [pr] ( #11197 )
2025-07-12 09:46:28 -07:00
George Hotz
5625e1904b
axis types in KernelInfo ( #11196 )
* axis types in KernelInfo [pr]
* simpler lowerer
* fix tests
2025-07-12 09:36:20 -07:00
nimlgen
ea7f2f779c
hcq: p2p nv-amd ( #11195 )
* hcq: p2p between diff devices
* fix
2025-07-12 18:53:34 +03:00
qazal
6a9f059b21
viz: early convert to cpu time ( #11192 )
2025-07-12 17:19:41 +03:00
chenyu
12b04efd69
remove a TODO prod(k.full_shape[k.first_upcast:]) ( #11191 )
IMAGE=2 test/test_ops.py works now
2025-07-12 10:16:56 -04:00
nimlgen
6f5250d158
nv: fix typing in rpc_rm_control ( #11189 )
2025-07-12 16:09:42 +03:00
qazal
c0a5490c72
viz: minor profiler cleanup ( #11190 )
2025-07-12 14:18:24 +03:00
chenyu
fdcc25e392
some noop hand_coded_optimizations cleanup [pr] ( #11188 )
2025-07-12 00:09:23 -04:00
chenyu
1ad852a892
break up Kernel.reshape_and_permute [pr] ( #11187 )
2025-07-11 18:08:08 -04:00
uuuvn
d11b20129d
DMARef infra ( #10753 )
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-07-11 14:09:47 -07:00
chenyu
b072be0e2d
hotfix whisper main script ( #11184 )
2025-07-11 12:34:00 -04:00
qazal
0b7e9b5db7
viz: bugfix for multiple rewrites with the same name ( #11182 )
2025-07-11 18:26:12 +03:00
nimlgen
f9e4c4e57a
nv: nvpci blackwell support ( #11127 )
* nv: start 5090
* gsp init 5090
* mmu
* works
* after merge
* cleaner
* rwk
* x
* fx
* finish?
* fix
* unrelated
* fix
* comment
2025-07-11 17:02:09 +03:00
qazal
1d85323572
viz: absolute scaling of memory graph ( #11181 )
2025-07-11 16:39:11 +03:00
nimlgen
c7f6b617b4
nv: do not hardcode lv0 pd size ( #11180 )
2025-07-11 16:26:18 +03:00
nimlgen
27922c986a
nv: generic mmu impl ( #11179 )
2025-07-11 16:26:09 +03:00
qazal
d3ec63a5c3
viz: add base class for unittests ( #11178 )
2025-07-11 13:58:03 +03:00
qazal
b791ea117d
viz: enable scrolling in profiler ( #11169 )
* viz: add scrollbar to profiler
* using margin fixes the layout bug
* s/profiler.clientHeight/profiler.scrollHeight, it's important
* closer
* scrolling on the device list also works
2025-07-11 11:30:13 +03:00
chenyu
b219e47bef
remove Kernel.upcasted_axis [pr] ( #11175 )
2025-07-10 23:19:21 -04:00
George Hotz
ccd382bc6f
use axis_types more [pr] ( #11172 )
* use axis_types more
* fix local shape
* simpler clause
* fix local shape
2025-07-10 15:05:13 -07:00
nimlgen
fb278c6a02
do not recreate Compiled.profile_events in helper_collect_profile ( #11171 )
2025-07-10 23:55:12 +03:00
George Hotz
5c5eb92ed4
tc unroll after upcast [pr] ( #11170 )
2025-07-10 13:43:50 -07:00
George Hotz
05613c8cac
use shape str for tensor cores upcast/reduce [pr] ( #11168 )
* use shape str for tensor cores upcast/reduce [pr]
* reduce axis count isn't fixed
2025-07-10 13:10:58 -07:00
nimlgen
cc6ed30f4f
nv: relative lv addressing in NVPageTableEntry ( #11164 )
2025-07-10 22:35:50 +03:00
chenyu
439d033af9
update the README matmul example ( #11167 )
don't call rand and numpy to show that it's indeed one kernel
2025-07-10 14:47:29 -04:00
qazal
bde80c0cdf
record GraphEvents in metal graph ( #11145 )
* record GraphEvents in metal graph
* add TestProfiler.test_graph, revert old stuff
* move profile capture to MetalGraph
* comment
* don't double record graph command buffers
* wait_check
* explicit delete
2025-07-10 21:32:06 +03:00
George Hotz
8ce3d5906b
use shape_str for tensor cores ( #11165 )
2025-07-10 09:10:36 -07:00
nimlgen
581397110f
nv: use classes in GSP_IP ( #11163 )
2025-07-10 17:47:12 +03:00
nimlgen
705de6b8a6
nv: parse sizes of ctx buffers ( #11161 )
2025-07-10 17:46:48 +03:00
qazal
dcc9704b6b
viz: profile RewriteSteps in TINY device ( #11125 )
* viz: profile RewriteSteps in TINY device
* use TracingKey with category
* split by whitespace
* add tracing.py
* work
* tracing_key
* TRACK_MATCH_STATS=3, can this be in defaults?
* fallback name
* work
* javascript
* measure text is slow
* checkout
* profile graph_rewrite/graph_rewrite_map
* change that
* no as
* finally
* work
* linking works
2025-07-10 17:45:57 +03:00
Pyry Kovanen
32117402dd
metal: fix incorrect _free on interpreter exit ( #11158 )
2025-07-10 14:01:30 +03:00
qazal
3d610f6d2b
viz: small ui cleanup ( #11157 )
* viz: small ui cleanup
* 2
2025-07-10 11:43:36 +03:00
chenyu
7db07e5f2c
don't narrow range of CAST on bool/unsigned ( #11156 )
2025-07-09 22:20:09 -04:00
George Hotz
e154a66f43
unroll axis 0 in tensor core ( #11155 )
* unroll is 0 in tc [pr]
* flip order of upcast/reduce in tensor core
* Revert "flip order of upcast/reduce in tensor core"
This reverts commit e564e38bcd.
2025-07-09 17:28:23 -07:00
George Hotz
b7742ad9e4
migrate to string swizzle [pr] ( #11154 )
2025-07-09 16:57:53 -07:00
George Hotz
4156baee93
break swizzle into three chunks [pr] ( #11153 )
* break swizzle into three chunks [pr]
* test failed
2025-07-09 15:30:34 -07:00
George Hotz
ca2dc95433
swizzle in tc can't be none [pr] ( #11152 )
2025-07-09 14:44:23 -07:00
George Hotz
53ae153404
tc should be in opt ( #11148 )
* tc should be in opt [pr]
* fix import
2025-07-09 14:12:21 -07:00
wozeparrot
6697d0089d
initial gfx950 kfd support ( #11151 )
* feat: initial gfx950 support
* fix: lint
2025-07-09 13:45:16 -07:00
George Hotz
262054be52
gfx950 tc support ( #11150 )
2025-07-09 13:30:42 -07:00
nimlgen
b6981404ed
memory: use page shifts in memory manager ( #11149 )
* memory: use page shifts in memory manager
* fix
2025-07-09 22:05:00 +03:00
qazal
5c1d215b41
viz: add Graph stream ( #11144 )
* viz: stack an event for the entire batch
* multi
* whitespace
* work
* multi graph, Graph gets its own row
2025-07-09 20:56:46 +03:00
George Hotz
22305260e0
move tc to tc.py [pr] ( #11147 )
2025-07-09 10:55:56 -07:00
George Hotz
2893feb9f6
cleanups for kernel.py ( #11143 )
* cleanups for kernel.py
* fixups
2025-07-08 18:10:25 -07:00
George Hotz
b11ca104e9
axis cleanups [pr] ( #11142 )
2025-07-08 17:07:26 -07:00
chenyu
7ce9e45474
mypy onnx_parser ( #11141 )
2025-07-08 19:50:28 -04:00
George Hotz
a1b8f3e64f
delete info from kernel [pr] ( #11139 )
* delete info from kernel [pr]
* update kernel info
* delete info
2025-07-08 15:53:13 -07:00
George Hotz
359bed74f8
axis type tracking [pr] ( #11137 )
* axis type tracking [pr]
* keep update_info
* keep legacy colors
* update tests to apply_opt
2025-07-08 14:16:25 -07:00
chenyu
dada3f5bf3
skip some new onnx tests ( #11135 )
these fail on master with latest onnx
2025-07-08 16:12:48 -04:00
chenyu
ffcc557986
lint onnx and onnx_parser ( #11134 )
2025-07-08 15:28:35 -04:00