chenyu
e8252e6e4f
use official gguf in test ( #14872 )
...
also deleted bad test_load_sample_mxfp4, added some hard coded simple tests
2026-02-18 19:55:09 -05:00
qazal
f590564bf7
gemm multiple is only for cdna4 asm ( #14814 )
...
* gemm multiple is only for cdna4 asm
* move to backend
* and arch
* path
2026-02-17 14:00:02 +09:00
nimlgen
131bbbbfd8
am: smu_v13_0_12 ( #14800 )
2026-02-16 22:58:10 +03:00
George Hotz
dff9cf35c2
amd asm emulator fixes + run it in CI ( #14786 )
...
* amd asm fix, try 2
* fix tests
2026-02-16 13:24:21 +08:00
qazal
55a4dfa2e0
cdna4 asm_gemm tests in CI on the null backend ( #14785 )
...
* cdna4 asm_gemm tests in CI on the null backend
* no .numpy() in null
* better
* gemm/asm: device comes from renderer
2026-02-16 14:06:23 +09:00
kevvz
33b2ade8cd
Rdna4 emulator test_ops, dtypes pass ( #14773 )
...
* test_ops, test_dtypes pass
* merge cdna4
* ruff + more tests
* reorganize
* /backend
* again
* again...
* add rdna4
2026-02-16 10:13:39 +08:00
George Hotz
bd18217f32
add rdna3/rdna4/cdna4 to testamd ( #14778 )
...
* add rdna3/rdna4/cdna4 to testamd
* test simplify
* ci cleanups
* mergeable
* skip slow
2026-02-16 09:45:16 +08:00
Christopher Milan
9c95a11f90
autogen: handle rocm bump and better error wording ( #14776 )
...
* autogen: handle rocm bump and better error wording
* regen
2026-02-15 19:23:47 -05:00
qazal
33b31d9cd6
tinykittens flash attention dtype fix, add CI ( #14770 )
...
* don't hardcode amd device
* add failing tests, ci too
* fix: fix for dtype mixin
* bump to rocm 7.1
---------
Co-authored-by: Woze Parrot <wozeparrot@gmail.com>
2026-02-16 01:15:11 +09:00
qazal
9da7f5e733
disable process replay for AMD emulator renderer [pr] ( #14766 )
...
* disable process replay for AMD emulator renderer [pr]
* line
* skip
2026-02-15 18:52:37 +09:00
George Hotz
5289b4e882
renderer/amd: add cdna emulator ( #14721 )
...
* renderer/amd: add cdna emulator
* fixes
* no predecode
* no early
* REMU_PATH
* delete that
* round
* Fix cache invalidation check in _compile_smem
2026-02-13 16:06:58 +08:00
George Hotz
d3adb8428e
Revert "hotfix: skip test/amd in macpytest" ( #14704 )
...
* Revert "hotfix: skip test/amd in macpytest"
This reverts commit b7dade2adf.
* no llvm subprocess
* simpler
* sys.exec
* cleanup
* process safe
* diag
* arm ftz support
* 5 sec
* this one
2026-02-13 08:00:24 +08:00
Christopher Milan
084d0d0103
cleanup macos webgpu tests ( #14715 )
2026-02-12 17:56:34 -05:00
Christopher Milan
c30bb0f006
fix WEBGPU isnan check ( #14711 )
2026-02-12 17:01:18 -05:00
George Hotz
b7dade2adf
hotfix: skip test/amd in macpytest
2026-02-12 18:16:04 +08:00
George Hotz
4680247e35
renderer/amd: move in tree ( #14702 )
...
* renderer/amd: move in tree
* fix paths in tests
* 24000 lines
* no delete for amd files
2026-02-12 18:09:16 +08:00
George Hotz
095a064ba8
test.yml explicitly says backend ( #14700 )
...
* test.yml explicitly says backend
* 1e-5
2026-02-12 16:03:44 +08:00
George Hotz
c331798201
move tests to test/backend ( #14691 )
...
* move tests to test/backend
* fix imports
* fix CI
* revert that one
* Fix formatting in README for test command
2026-02-12 11:09:44 +08:00
George Hotz
cc9bf8ccbc
move more to null/unit tests ( #14658 )
...
* move more to null tests
* move test_gc
* no test fusion op
2026-02-10 13:35:17 +08:00
Christopher Milan
b36b62eb59
don't push docker cache for PRs ( #14652 )
2026-02-09 19:55:55 -05:00
Christopher Milan
396e1320fb
bump cache version for z3 ( #14650 )
2026-02-09 19:32:07 -05:00
wozeparrot
d87ae1c84c
feat: tinyfs load test in benchmark ( #14602 )
2026-02-06 18:00:00 -08:00
Garret Castro
cee7ef7ab2
disable threads ( #14555 )
2026-02-05 16:11:32 -05:00
chenyu
41a179f542
fix test_xlm_roberta_large ( #14564 )
...
onnxruntime does not allow a symlink that points outside the model dir. update snapshot_download to use local_dir instead of cache_dir. also add an ad hoc migration step to copy the existing model
2026-02-05 14:56:06 -05:00
Christopher Milan
b47397ab17
list ml_dtypes as dependency for DSP ( #14562 )
...
* pin onnxruntime to 1.23.2 for DSP
* list ml_dtypes instead
This reverts commit 84bb2cc0fc.
2026-02-05 14:27:50 -05:00
George Hotz
d59e6e7a37
move more tests to test/null, split some existing ones ( #14512 )
...
* move more tests to test/null, split some existing ones
* null work
* null work
* move more
* fixes
* move PIL
* PIL in CLIP
* don't move that
2026-02-03 20:20:20 +08:00
George Hotz
dc77b3318b
move files that pass with NULL=1 to test/null ( #14508 )
...
* move files that pass with NULL=1 to test/null
* fix windows
* cpu 0
* bugfix + durations
2026-02-03 13:52:36 +08:00
George Hotz
85c7b23160
add pytest -nauto to benchmark for mac ( #14458 )
...
* add pytest -nauto to benchmark
* 3 minute timeout
* 3 min
* setup env
* comment
* fresh db
* in the pyenv
2026-02-03 12:26:09 +08:00
Christopher Milan
a5d7eb37db
IR3 works on versions earlier than 3.14 ( #14507 )
2026-02-02 23:10:19 -05:00
George Hotz
33c886cafa
disable copyout on NULL backend by default ( #14506 )
...
* disable copyout on NULL backend
* gate it
* allow copyout on some tests
2026-02-03 11:57:47 +08:00
George Hotz
6e958dbfd4
assembly/amd: add RDNA4 support to emulator ( #14341 )
...
* start new rdna4
* work
* plus works
* more pass
* rdna4
* assembly/amd: fix RDNA4 emulator for float16 and VOP3 clamp
* stale
* rev
* rr
* rdna4 emu tests
* cleanup
* cleanup
* simp
* works
* better factorization
* hacks
* fix mockgpu
* guard both
* cleaner
* gate
* bug fix and a few tests
* all test_tiny
2026-02-02 21:35:59 +08:00
Christopher Milan
e575dd8275
prevent UB in long decomp and more emulated tests ( #14447 )
2026-01-30 19:38:41 -05:00
Christopher Milan
1803ee939d
EMULATED_DTYPES=long works with CPU_LLVM ( #14446 )
2026-01-30 13:54:43 -05:00
Christopher Milan
88caf57ef4
ci: unify python versions ( #14430 )
2026-01-29 21:42:03 -05:00
Christopher Milan
e47f12f671
ci: replace testing_minimal with testing_unit ( #14427 )
2026-01-29 18:02:43 -05:00
Christopher Milan
0c855d6149
ci: remove unused pydeps ( #14418 )
2026-01-29 01:51:26 -05:00
chenyu
37cde4a01a
add one line mypy report ( #14415 )
2026-01-28 20:39:32 -05:00
nimlgen
544928766d
hcq_smi: kill mac pids ( #14398 )
2026-01-28 15:00:28 +03:00
qazal
5bffa17f82
llama train: better NULL=1 EMULATE=AMD_CDNA4 dev experience ( #14395 )
...
* beam opens devices
* switch to hip renderer
* amd: true?
* llvm true is for test_autogen
2026-01-28 17:31:22 +09:00
Christopher Milan
067e27857e
nested composite actions don't work ( #14393 )
2026-01-28 00:13:30 -05:00
Christopher Milan
9dddf3d478
don't save caches for PRs, try 2 ( #14391 )
2026-01-27 23:30:17 -05:00
Christopher Milan
68fe5d8b36
Revert "don't save caches for PRs ( #14389 )" ( #14390 )
2026-01-27 23:22:26 -05:00
Christopher Milan
4ab228b498
don't save caches for PRs ( #14389 )
2026-01-27 23:21:31 -05:00
Christopher Milan
5e36482314
decompose long to ints where unsupported, try 2 ( #14383 )
2026-01-27 23:20:43 -05:00
George Hotz
88bc5ee212
assembly/amd: rename to better names ( #14384 )
...
* assembly/amd: rename to better names
* might help fuzzing segfault
* emu2 -> emu
2026-01-28 10:00:54 +08:00
chenyu
cd22ee9ed0
add InvalidType to ConstType [pr] ( #14373 )
...
* add InvalidType to ConstType [pr]
TYPED=1 python test/test_tiny.py passes.
added PyConst = float|int|bool for some Tensor level input types
* hcq
2026-01-27 14:09:34 -05:00
chenyu
db010a31be
IGNORE_OOB -> CHECK_OOB [pr] ( #14374 )
...
flip the meaning
2026-01-27 12:20:59 -05:00
Christopher Milan
c9c533fc78
libclang path is homebrew on macos ( #14357 )
...
* libclang path is homebrew macos
* typo
* ugh
* typo
* regen
* no LIBCLANG_PATH
2026-01-26 17:32:09 -05:00
qazal
2d91fe6310
use amdgpu dsl in mmapeak ( #14342 )
...
* use amdgpu dsl in mmapeak
* don't rely on llvm for vgpr counting
* llvm roundtrip assert
* rm it, add ci
* vgpr_count
* move emulated test to amd, it needs comgr
* env
* arch
* inst._fields -> inst.operands
* vgpr offset
2026-01-26 22:03:43 +09:00
qazal
b2e2ace85b
viz: remove ci check, it's VIZ=-1/-2 ( #14343 )
2026-01-26 20:36:23 +09:00