nimlgen
|
230d08ec70
|
test for am recovery and faults handling (#14421)
* test for am recovery and faults handling
* linter
|
2026-01-29 17:11:24 +03:00 |
|
George Hotz
|
793afbd473
|
simplify nn.Embedding, support AFTER in CUSTOM_KERNEL (#14419)
|
2026-01-29 17:22:13 +08:00 |
|
Christopher Milan
|
0c855d6149
|
ci: remove unused pydeps (#14418)
|
2026-01-29 01:51:26 -05:00 |
|
wozeparrot
|
4845e42135
|
llama3 gradacc fixes (#14414)
|
2026-01-28 19:12:39 -08:00 |
|
chenyu
|
37cde4a01a
|
add one line mypy report (#14415)
|
2026-01-28 20:39:32 -05:00 |
|
chenyu
|
15aed51544
|
return types for all math.py function (#14413)
calling int() on sint -> int, i think it's better support since some UOp can be safely cast to int
|
2026-01-28 20:10:11 -05:00 |
|
nimlgen
|
aec1ae0de1
|
llama: set manual_seed (#14409)
|
2026-01-28 14:40:00 -08:00 |
|
chenyu
|
0870ed28b1
|
add Self type to MathMixin (#14411)
these don't cause errors
|
2026-01-28 16:59:38 -05:00 |
|
chenyu
|
079f33c208
|
fix type in Tensor.mean and Tensor.var (#14410)
use Tensor.from_uop to wrap UOp from symbolic shape, kernels are the same
|
2026-01-28 15:24:02 -05:00 |
|
chenyu
|
2b5e99ccc1
|
minor type cleanups [pr] (#14408)
mypy --warn-redundant-casts has a false negative
|
2026-01-28 14:11:50 -05:00 |
|
chenyu
|
726415dbc8
|
import sint directly in movement.py TYPE_CHECKING (#14406)
avoid creating string TypeAlias, fixed warning in `TYPED=1 python test/test_tiny.py`
|
2026-01-28 12:47:26 -05:00 |
|
nimlgen
|
acb2fc36ba
|
nv_pma: add decoder (#14404)
* nv_pma: add decoder
* cl
|
2026-01-28 20:44:02 +03:00 |
|
chenyu
|
7b9bc1d8cf
|
_MockMemoryviewMeta for mockgpu (#14405)
fixed `PYTHONPATH=. TYPED=1 DEV=AMD MOCKGPU=1 python test/test_tiny.py`. basically make `isinstance(TrackedMemoryView_instance, memoryview)` true
|
2026-01-28 11:59:00 -05:00 |
|
chenyu
|
93793a645b
|
use cl.cl_mem instead of internal ctypes._CData (#14403)
fixed `CHECK_OOB=0 DEV=CL TYPED=1 python test/test_tiny.py`
|
2026-01-28 10:56:41 -05:00 |
|
chenyu
|
a9b44070a8
|
fix webgpu runtime types (#14402)
`CHECK_OOB=0 DEV=WEBGPU TYPED=1 python test/test_tiny.py` passed, also skip tests that failed locally
|
2026-01-28 10:37:25 -05:00 |
|
George Hotz
|
0c6b3f50aa
|
add marker to llama training (#14401)
|
2026-01-28 22:44:28 +08:00 |
|
Jakob Sachs
|
2b7c00d3d2
|
fix sd-example dtype for CLIP embeddings (#14397)
|
2026-01-28 09:07:19 -05:00 |
|
qazal
|
a5a9ce3fdf
|
viz: disasm cleanups from null emulate (#14399)
* it's AMDHIPRenderer
* don't need that indent
* less assignment stuff
* that arg order did not make sense
* pmc
|
2026-01-28 22:03:30 +09:00 |
|
nimlgen
|
544928766d
|
hcq_smi: kill mac pids (#14398)
|
2026-01-28 15:00:28 +03:00 |
|
George Hotz
|
202b74b369
|
assembly/amd: continue refactors (#14386)
* simpler
* merge
* flat
* no ctx
* use the correct apis
* dup code
* write clean code
* remove bad helpers
* bits junk remove
* junk remove
* smem test
* fix tests
* correct fix + tests
* Fmt matters it seems
* wmma refactor
* a lil more
* kimi cleanups
* line
|
2026-01-28 17:33:03 +08:00 |
|
qazal
|
5bffa17f82
|
llama train: better NULL=1 EMULATE=AMD_CDNA4 dev experience (#14395)
* beam opens devices
* switch to hip renderer
* amd: true?
* llvm true is for test_autogen
|
2026-01-28 17:31:22 +09:00 |
|
qazal
|
0294014108
|
fix bufferize cost function for multi, improve VIZ=-1 cli (#14394)
* improve cli
* remove_bufferize change
|
2026-01-28 15:53:18 +09:00 |
|
qazal
|
c158acea29
|
failing multi ram usage test from llama gemm (#14392)
|
2026-01-28 14:32:32 +09:00 |
|
Christopher Milan
|
067e27857e
|
nested composite actions don't work (#14393)
|
2026-01-28 00:13:30 -05:00 |
|
Christopher Milan
|
9dddf3d478
|
don't save caches for PRs, try 2 (#14391)
|
2026-01-27 23:30:17 -05:00 |
|
Christopher Milan
|
68fe5d8b36
|
Revert "don't save caches for PRs (#14389)" (#14390)
|
2026-01-27 23:22:26 -05:00 |
|
Christopher Milan
|
4ab228b498
|
don't save caches for PRs (#14389)
|
2026-01-27 23:21:31 -05:00 |
|
Christopher Milan
|
5e36482314
|
decompose long to ints where unsupported, try 2 (#14383)
|
2026-01-27 23:20:43 -05:00 |
|
wozeparrot
|
e496547720
|
llama3 gradacc (#14291)
|
2026-01-27 19:48:10 -08:00 |
|
George Hotz
|
88bc5ee212
|
assembly/amd: rename to better names (#14384)
* assembly/amd: rename to better names
* might help fuzzing segfault
* emu2 -> emu
|
2026-01-28 10:00:54 +08:00 |
|
George Hotz
|
065b95cfb0
|
Revert "add retry to fetch (#14370)" (#14385)
This reverts commit dc4d7f2d55.
|
2026-01-28 09:35:37 +08:00 |
|
Eitan Turok
|
dc4d7f2d55
|
add retry to fetch (#14370)
|
2026-01-27 14:04:25 -08:00 |
|
chenyu
|
8d1f3c8885
|
fix copysign for inf input (#14381)
* fix copysign for inf input
* llvm olt
|
2026-01-27 16:45:48 -05:00 |
|
Christopher Milan
|
289a3e415e
|
also skip test_nonoverlapping_shrink_assignment (#14382)
|
2026-01-27 16:26:26 -05:00 |
|
Christopher Milan
|
f34efc1ad1
|
DISABLE_FAST_IDIV actually works as a ContextVar (#14378)
|
2026-01-27 16:12:42 -05:00 |
|
chenyu
|
8c899e4aaf
|
fix copysign for -0 (#14380)
test both x and 1/x < 0 work too. and found another bug with the * 0 hack
|
2026-01-27 15:44:58 -05:00 |
|
chenyu
|
62884585a7
|
failed test case for copysign -0.0 (#14379)
* failed test case for copysign -0.0
* skip those
|
2026-01-27 14:37:17 -05:00 |
|
nimlgen
|
ec1b28bc2c
|
am: exit early in case of failures (#14376)
* am: exit early in case of failures
* sorry, pre-linter
* reset when error state
|
2026-01-27 22:10:02 +03:00 |
|
chenyu
|
cd22ee9ed0
|
add InvalidType to ConstType [pr] (#14373)
* add InvalidType to ConstType [pr]
TYPED=1 python test/test_tiny.py passes.
added PyConst = float|int|bool for some Tensor level input types
* hcq
|
2026-01-27 14:09:34 -05:00 |
|
Christopher Milan
|
5b42a1357b
|
SCACHE=0 works with DEBUG (#14377)
|
2026-01-27 13:12:43 -05:00 |
|
chenyu
|
db010a31be
|
IGNORE_OOB -> CHECK_OOB [pr] (#14374)
flip the meaning
|
2026-01-27 12:20:59 -05:00 |
|
chenyu
|
c22667b0c4
|
also skip test_overlapping_shrink_assignment_reverse (#14375)
crashing
|
2026-01-27 12:20:39 -05:00 |
|
nimlgen
|
e52d58b041
|
autogen: update amd (#14372)
|
2026-01-27 19:53:14 +03:00 |
|
nimlgen
|
cbf94a0a95
|
nv: exit early in case of failures (#14363)
* nv: exit early in case of failures
* f
* cleaner
|
2026-01-27 19:16:22 +03:00 |
|
nimlgen
|
ec691cb299
|
am: print sq intrs (#14366)
* am: print sq intrs
* cleaner
|
2026-01-27 18:28:13 +03:00 |
|
qazal
|
a5f3d46423
|
hcq: do not assume kernel names are unique (#14371)
* hcq: do not assume kernel names are unique
* colored kernel name
|
2026-01-27 23:03:15 +09:00 |
|
George Hotz
|
e5df7e640b
|
fix branches in amd_asm_matmul (#14369)
|
2026-01-27 20:48:42 +08:00 |
|
George Hotz
|
0ced258726
|
HOTFIX: skip crashing assign test
|
2026-01-27 20:35:17 +08:00 |
|
George Hotz
|
131ae604de
|
force_transcendental on sqrt (#14368)
|
2026-01-27 20:24:41 +08:00 |
|
imaolo
|
14574c68fa
|
Add ContextVar to disable the scheduler cache (#14257)
* add scheduler cache ContextVar
* test scheduler cache context var
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
|
2026-01-27 19:55:29 +08:00 |
|