chenyu
7cbafb2ef1
update hypothesis min version ( #13983 )
...
there was a local_constants perf regression that made hypothesis-related tests slow
2026-01-02 21:01:57 -05:00
Christopher Milan
9dc524536f
IMAGE=1 creates "dynamic" images ( #13769 )
...
* remove image from BufferSpec
* cl tiny_gemm (64) works
* mypy
* padding
* openpilot CL
* reshape properly
* remove extra qcom checks
* pad output
* mypy
* update compile test
* move undo
* TestImageCopy valid images
* TestImageRealization valid images
* TestImageDType valid images
* cleanups
* test_renderer_failures
* ruff
* mypy
* simplify ops_qcom
* bump step time
* Revert "bump step time"
This reverts commit 75a037c7d0.
* "dynamic textures" are optional
* a start
* IMAGE=1 works, no FLOAT16
* fast but wrong
* mypy
* some fixes
* better
* works
* refactor
* oops
2026-01-02 16:22:39 -05:00
Christopher Milan
61dc70f1a8
add driving_vision IMAGE=1 benchmark ( #13979 )
2026-01-02 13:58:27 -05:00
George Hotz
0e282025ff
assembly/amd: split test_emu into hw tests ( #13966 )
...
* assembly/amd: split test_emu into hw tests
* hw tests
* bugfixes
* more tests and fix
2026-01-02 08:04:56 -08:00
chenyu
2e2b5fed12
fix misspellings ( #13976 )
2026-01-02 10:37:38 -05:00
nietras
f49e4714af
Fix spelling errors in README for AMD assembly ( #13975 )
2026-01-02 10:15:20 -05:00
b1tg
a78fcc55a4
amd tc 1616128 ( #13439 )
...
* amd tc 1616128
* fix test
* remove hardcoded check in test
2026-01-02 09:01:05 -05:00
chenyu
fcbb896e05
remove unused to_struct [pr] ( #13973 )
2026-01-02 08:54:57 -05:00
nimlgen
ff7853a65a
am: fix aid doorbells ( #13971 )
2026-01-02 15:53:44 +03:00
nimlgen
42abb0586c
am: fix aid doorbells ( #13972 )
2026-01-02 15:53:13 +03:00
nimlgen
ebbaad6bfd
am: enable all sdma engines ( #13970 )
2026-01-02 15:25:15 +03:00
qazal
5f52266225
mi350x gemm: use Tensor.custom_kernel in asm test ( #13969 )
...
* mi350x gemm: use Tensor.custom_kernel in asm test
* A @ B for baseline
2026-01-02 18:30:50 +09:00
George Hotz
5a1a561e0f
assembly/amd: rdna4 autogen ( #13967 )
...
* assembly/amd: add pcode ds ops
* refactors
* fix ds op
* update autogen
* fix flat bug
* more tests
* fix emu test
* that's a hack
* generic
* fix all tests
* two tests
* fix test failure
* better
* remove __all__
* assembly/amd: fix autogen for RDNA4
2026-01-01 23:12:18 -05:00
wozeparrot
b27527f05a
fix: missed inner tracked range ( #13964 )
2026-01-01 18:09:57 -08:00
wozeparrot
ecbac8a338
tk: fa cleanups + causal test ( #13963 )
2026-01-01 18:05:00 -08:00
chenyu
af0392efea
only set DiskDevice.size if it opens successfully ( #13962 )
2026-01-01 19:33:26 -05:00
chenyu
e036d6df89
properly fix DiskDevice reuse ( #13961 )
2026-01-01 18:08:23 -05:00
George Hotz
dfb813b760
assembly/amd: add pcode ds ops ( #13939 )
...
* assembly/amd: add pcode ds ops
* refactors
* fix ds op
* update autogen
* fix flat bug
* more tests
* fix emu test
* that's a hack
* generic
* fix all tests
* two tests
* fix test failure
* better
* remove __all__
2026-01-01 16:24:13 -05:00
chenyu
cb7c76a3bd
update test_fuzz_failure to not construct full UOp ( #13960 )
2026-01-01 15:09:58 -05:00
chenyu
51398edf9c
fix indirect import ( #13958 )
...
also deleted old external tests
2026-01-01 14:22:45 -05:00
chenyu
8e416df438
simpler InvalidType [pr] ( #13957 )
...
simpler singleton pattern
2026-01-01 13:55:51 -05:00
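For readers unfamiliar with the term, the "simpler singleton pattern" mentioned above can be illustrated with a minimal sketch. This is not the code from the commit: the class name `InvalidTypeSketch` and its `__repr__` are hypothetical, and the real `InvalidType` may be implemented differently.

```python
class InvalidTypeSketch:
    """Hypothetical stand-in for a sentinel type with exactly one instance."""
    _instance = None

    def __new__(cls):
        # Create the instance on first use, then always hand back the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self) -> str:
        return "Invalid"


a = InvalidTypeSketch()
b = InvalidTypeSketch()
print(a is b)  # True: both names refer to the single shared instance
```

Because every construction returns the same object, callers can compare against the sentinel with `is` instead of `==`.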
nimlgen
b8ea0d779c
am: remove pipe, queue from setup_ring ( #13947 )
2026-01-01 21:06:41 +03:00
chenyu
4d5c4d256d
update tqdm for edge case ( #13956 )
...
1.00kit/s and not 1000it/s for value 999.5
2026-01-01 11:37:26 -05:00
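The edge case in the commit above (999.5 printing as 1000it/s rather than 1.00kit/s) can be reproduced with a small sketch. The helper `si_rate` and its `cutoff` parameter are hypothetical and written only for illustration; they are not the tqdm code in the repository. The idea is that the SI prefix should be chosen based on what the value rounds to when displayed, not on the raw value.

```python
def si_rate(value: float, cutoff: float = 999.5, unit: str = "it/s") -> str:
    """Format a rate with an SI prefix and three significant digits (hypothetical helper)."""
    prefixes = ["", "k", "M", "G", "T"]
    idx = 0
    # Move to the next prefix once the value would *display* as 1000 or more.
    while idx < len(prefixes) - 1 and value >= cutoff:
        value /= 1000.0
        idx += 1
    # Keep three significant digits: 1.00, 10.0, 100
    decimals = 2 if value < 9.995 else 1 if value < 99.95 else 0
    return f"{value:.{decimals}f}{prefixes[idx]}{unit}"


# Switching prefix only at 1000 lets 999.5 round up to a four-digit number.
naive = si_rate(999.5, cutoff=1000.0)
# Switching at 999.5 accounts for the rounding that happens at display time.
fixed = si_rate(999.5)
print(naive)  # 1000it/s
print(fixed)  # 1.00kit/s
```

The only difference between the two calls is the threshold at which the prefix changes, which is enough to produce both strings quoted in the commit message.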
chenyu
ed222070f7
update xlog2 fp16 decomp to not use fp32 ( #13955 )
2026-01-01 11:18:29 -05:00
chenyu
ce84a23142
remove tee in benchmark ( #13954 )
2026-01-01 10:55:36 -05:00
b1tg
24723327ac
fix tc_up in search ( #13438 )
...
* tensor_core is missing from Scheduler
* test upcast max
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-01-01 10:25:08 -05:00
qazal
9726500de8
enable using assembly in Tensor.custom_kernel ( #13895 )
2026-01-02 00:12:01 +09:00
qazal
c0f52c9dcb
split assembly gemm to per arch directory ( #13953 )
2026-01-02 00:10:22 +09:00
chenyu
c69470be52
fix test_symbolic_arange_sym_step ( #13952 )
2026-01-01 09:41:07 -05:00
chenyu
b91b46091c
delete test_tensor_uop ( #13951 )
...
old test for shape tracker. also update tests that refer to shapetracker names
2026-01-01 09:25:05 -05:00
chenyu
17ef4af72c
new ceildiv that fixed symbolic conv ( #13944 )
...
* new ceildiv that fixed symbolic conv
* smaller test case
2026-01-01 09:02:41 -05:00
qazal
6a5430ab00
correct args order in mi350x gemm ( #13949 )
2026-01-01 23:01:46 +09:00
chenyu
baff10d32c
clean up Tensor.svd slices ( #13948 )
2026-01-01 08:18:45 -05:00
nimlgen
1c5ed8e8b5
am: remove doorbells from setup_ring ( #13946 )
2026-01-01 14:39:21 +03:00
haofei
526fd4ec71
Fix SVD rank‑1 Jacobi rotation when tau == 0 ( #13945 )
2026-01-01 00:30:18 -05:00
haofei
20777f30b9
Fix QR/SVD NaNs on zero/orthogonal inputs ( #13943 )
2025-12-31 23:40:09 -05:00
chenyu
0ed58c1fcd
clean up some functions in helpers [pr] ( #13942 )
2025-12-31 18:29:16 -05:00
chenyu
e2987001ee
unify pre-commit mypy and ci mypy ( #13940 )
2025-12-31 17:51:51 -05:00
chenyu
8bf7c9c1d2
no-op cleanups for ptx [pr] ( #13938 )
2025-12-31 17:28:39 -05:00
George Hotz
2bb07d4824
assembly/amd: move Reg out of the pseudocode ( #13934 )
...
* assembly/amd: move Reg out of the pseudocode
* remove extra
* fix pcode tests
* simpler pcode
* simpler
* simpler
* cleaner
* fix mypy
2025-12-31 15:34:51 -05:00
chenyu
52acadc160
consolidate IGNORE_OOB=0 tests ( #13937 )
...
add a new unit test file and add more cases
2025-12-31 15:24:20 -05:00
chenyu
c0c1c1c8c8
remove unused validate rule ( #13936 )
2025-12-31 15:02:49 -05:00
chenyu
b6d08f247d
assert z3_xor input type ( #13933 )
2025-12-31 13:37:57 -05:00
George Hotz
f14428090f
assembly/amd: speed up emulator ( #13932 )
2025-12-31 13:32:25 -05:00
Christopher Milan
13973e4dea
refactor image pitch ( #13928 )
2025-12-31 13:22:38 -05:00
chenyu
051fe6c8bc
less toposort iteration in oob validate ( #13929 )
2025-12-31 13:16:34 -05:00
chenyu
a9a7b33404
IGNORE_OOB=0 in CI ( #13903 )
2025-12-31 12:56:59 -05:00
George Hotz
29402034a1
assembly/amd: cleanups to asm and emu ( #13912 )
...
* a bunch of cleanups
* ops are back
* bug fixes
* cleanups
* a lil simpler
* more refactors
* _disasm_vop1
* sops
* more
* continue
* more
* num_srcs
* simpler
* no _is16
* op cleanups
* isinstance
2025-12-31 12:46:11 -05:00
chenyu
ba9aa5cd6f
skip some PTX IGNORE_OOB validation ( #13927 )
2025-12-31 12:40:21 -05:00
chenyu
4968060ad4
fix IGNORE_OOB=0 for WEBGPU ( #13926 )
2025-12-31 10:41:28 -05:00