1791 Commits

Author SHA1 Message Date
nimlgen
6f1cb6be86 am: tiny err handling cleanups (#14981)
* am: tiny err handling cleanups

* x

* x
2026-02-24 12:43:45 +03:00
imaolo
405d37423e call release() in MetalAllocator._free (#14970)
* add failing test

* call MTLBuffer.release() in MetalAllocator._free()

* Update test_metal.py

---------

Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2026-02-23 23:33:31 +03:00
nimlgen
77db8e1c07 cpu: wait on dep signals (#14862)
* cpu: task_done() in case of failures

* print

* fix

* x

* f

* x

* um

* ?

* u

* f

* x

* gh

* f

* f

* virt

* x

* simpler
2026-02-23 21:09:41 +03:00
nimlgen
d86f1d66b5 system: apl validate dev_id bounds (#14964) 2026-02-23 12:18:03 +03:00
qazal
60f90dd97c sqtt: fix jitted program deduping, failing test for graphed kernels (#14951)
* work

* hcq_profile fix, test with JIT=2 passes

* ci, -n=auto

* rm duplicate test

* less
2026-02-22 15:22:31 +09:00
nimlgen
6de15dc480 mockam usb (#14916)
* mockam usb

* f

* win

* x

* x
2026-02-21 23:05:54 +03:00
Christopher Milan
815780f72f cl: fix multi-image arg kernels (#14920) 2026-02-20 17:34:17 -05:00
nimlgen
071403f9a1 system: use MAP_FIXED_NOREPLACE (#14884) 2026-02-19 18:32:50 +03:00
nimlgen
041dc0cf85 fix typos (#14886) 2026-02-19 17:37:15 +03:00
Kartik Vashishta
9a9c7648e9 system: fix pci_scan_bus vendor filter (#14885)
* system: fix pci_scan_bus vendor filter

* fix: formatting
2026-02-19 17:23:32 +03:00
nimlgen
1c8c17a593 am: aca (#14861) 2026-02-18 21:40:09 +03:00
nimlgen
dda5ccf63b hcq: fix usb<->cpu mappings (#14827)
* hcq: fix usb<->cpu mappings

* non cpu

* um
2026-02-17 18:04:18 +03:00
nimlgen
801677cf12 am: GCVM_L2_PROTECTION_FAULT_STATUS prints device (#14830) 2026-02-17 18:03:52 +03:00
nimlgen
a2586e4c70 nv: move reset earlier (#14824) 2026-02-17 17:25:49 +03:00
qazal
f8e485ee9e nvcc/nvdisasm macos shim (#14822)
* move to backend

* and arch

* setup_nvcc_osx

* blackwell

* min test

* now getting dumb assert is_ptx

* support cubin.

* work

* remove that

* simpler
2026-02-17 20:07:05 +09:00
Christopher Milan
275319c789 IMAGE=1 2d indexing (#14809)
* IMAGE=1 2d indexing

* cleanup

* oops

* go back to 'idx'

* fix vals

* fix

* ugh
2026-02-16 22:51:18 -05:00
nimlgen
131bbbbfd8 am: smu_v13_0_12 (#14800) 2026-02-16 22:58:10 +03:00
nimlgen
7ddc888ad5 am: 48bit for gfx950 (#14799) 2026-02-16 22:48:07 +03:00
nimlgen
9f8afb518c viz: sdma gb/s in graph (#14798)
* viz: sdma gb/s in graph

* f
2026-02-16 16:45:06 +03:00
qazal
db3db476ff viz: add GB/s to SDMA (#14795)
* work

* better

* fix that

* no decimal
2026-02-16 20:09:20 +09:00
Christopher Milan
9c95a11f90 autogen: handle rocm bump and better error wording (#14776)
* autogen: handle rocm bump and better error wording

* regen
2026-02-15 19:23:47 -05:00
nimlgen
26193cbf9a nv: prof cpu_access for nvd only (#14769) 2026-02-15 21:42:04 +03:00
nimlgen
e1a18dadae fix devices for copies (#14747)
* fix devices for copies

* add test
2026-02-14 17:39:41 +03:00
nimlgen
7d88626068 nv: fix pma_bytes to be system memory (#14733) 2026-02-13 17:55:46 +03:00
nimlgen
ba67425680 am: reset mi300 with pm4 (#14727) 2026-02-13 11:22:32 +03:00
Christopher Milan
7993f3a277 autogen: use snapshot.debian.org for linux src (#14718) 2026-02-12 23:36:38 -05:00
George Hotz
4088d686b2 remove llvm requirement from amd (#14717)
* remove llvm requirement from amd

* tests pass

* test

* sink kernarg_size

* move stuff

* amd_asm_matmul to new style

* default type

* fix tests, simpler

* cu mode is faster and simpler

* darken
2026-02-13 10:50:12 +08:00
Christopher Milan
d4bc5ab609 autogen: download linux sources (#14714) 2026-02-12 18:50:50 -05:00
chenyu
8551fa50d3 support bitcast in sym_infer (#14708)
fixed `DEBUG=2 DEV=WEBGPU python -m pytest test/backend/test_tensor_variable.py::TestTensorVariable::test_symbolic_pad`
2026-02-12 10:21:05 -05:00
nimlgen
10c94d2c2d amd: print more info about device hang (#14705) 2026-02-12 15:34:08 +03:00
George Hotz
4680247e35 renderer/amd: move in tree (#14702)
* renderer/amd: move in tree

* fix paths in tests

* 24000 lines

* no delete for amd files
2026-02-12 18:09:16 +08:00
nimlgen
869083e373 nv: pciiface pma (#14686)
* x

* w

* z

* clean

* o

* r

* x

* c

* r

* list

* deanon

* b
2026-02-11 23:29:07 +03:00
nimlgen
42ded7c34d amd: bind aql (#14666)
* amd: bind to aql

* bind

* x

* f
2026-02-10 16:28:11 +03:00
Christopher Milan
cdb78954cb better cl compiler name (#14660)
cl_compiler instead of compiler because overriding Compiled.compiler seems more confusing
2026-02-10 01:03:46 -05:00
Christopher Milan
e6562a5061 remove CompilerPair (#14638) 2026-02-09 19:51:18 -05:00
Christopher Milan
27f7ea478b new style DSP renderer (#14636)
* new style DSP renderer

* cleanup
2026-02-09 00:39:03 -05:00
Christopher Milan
efac5b9ef6 new style NV/CUDA renderers, try 2 (#14634)
* new style NV/CUDA renderers, try 2

* fix diskcache
2026-02-08 22:58:48 -05:00
Christopher Milan
0ebb508b85 new style metal compiler (#14632) 2026-02-08 21:58:25 -05:00
Christopher Milan
9eef9f38ad new style python renderer (#14631) 2026-02-08 21:45:07 -05:00
Christopher Milan
5f2f2cc956 Revert "new style NV/CUDA renderers (#14627)" (#14633)
This reverts commit 0e505951b0.
2026-02-08 21:16:03 -05:00
Christopher Milan
4ad787ece2 new style CPULLVMRenderer (#14629) 2026-02-08 21:05:01 -05:00
Christopher Milan
0e505951b0 new style NV/CUDA renderers (#14627)
* new style NV/CUDA renderers

* fix pickle

* oops

* fix CUDA_CC=NVCC

* mockgpu uses PTXCompiler

* oops

* ruff

* dont discard stderr

* ugh
2026-02-08 21:04:51 -05:00
nimlgen
a615b9d781 am: f8_mode for gfx94x only (#14620) 2026-02-08 17:38:48 +03:00
nimlgen
88c3022223 amd: kfd iface early exit (#14612)
* amd: kfd iface early exit

* l

* revert
2026-02-07 18:57:10 +03:00
nimlgen
ce7bfc6ce8 nv: use nv_flags for all fields (#14607) 2026-02-07 15:01:38 +03:00
nimlgen
fbeb978170 diff devices for sdma (#14589)
* start

* x

* fix

* sdma

* c

* clean

* x

* hm

* cleaer
2026-02-06 16:39:12 +03:00
nimlgen
483bba4f05 nv: use prof_exec_counter (#14559) 2026-02-05 19:00:14 +03:00
nimlgen
ec2b6bbda8 hcq: update signal logic (#14531) 2026-02-04 19:32:56 +03:00
nimlgen
62786d488a am: mi3xx perf (#14529) 2026-02-04 19:32:43 +03:00
nimlgen
2f55005ad9 qcom: sync cpu cache when from_blob (#14518)
* um

* fx

* d

* x

* x

* x

* x

* f

* ren
2026-02-03 21:51:03 +03:00