Commit Graph

  • cb7c76a3bd update test_fuzz_failure to not construct full UOp (#13960) chenyu 2026-01-01 15:09:58 -05:00
  • 51398edf9c fix indirect import (#13958) chenyu 2026-01-01 14:22:45 -05:00
  • 8e416df438 simpler InvalidType [pr] (#13957) chenyu 2026-01-01 13:55:51 -05:00
  • 729bb04d8c fix test failure George Hotz 2026-01-01 13:21:55 -05:00
  • 8f4de73141 two tests George Hotz 2026-01-01 13:13:01 -05:00
  • a5959ef0f1 fix all tests George Hotz 2026-01-01 13:11:51 -05:00
  • b8ea0d779c am: remove pipe, queue from setup_ring (#13947) nimlgen 2026-01-01 21:06:41 +03:00
  • 5ba06892c0 generic George Hotz 2026-01-01 12:46:08 -05:00
  • 469efe313d that's a hack George Hotz 2026-01-01 12:40:14 -05:00
  • 103a00d4c5 Merge origin/master rdna4_asm George Hotz 2026-01-01 17:15:45 +00:00
  • 8c14d9f427 rdna4 George Hotz 2026-01-01 17:14:52 +00:00
  • e3b3cb163d fix emu test George Hotz 2026-01-01 12:12:47 -05:00
  • 3e32185faf more tests George Hotz 2026-01-01 12:04:41 -05:00
  • 5328913d2b fix flat bug George Hotz 2026-01-01 11:51:10 -05:00
  • 4e03b3ebef rdna4 work George Hotz 2026-01-01 16:45:39 +00:00
  • 4d5c4d256d update tqdm for edge case (#13956) chenyu 2026-01-01 11:37:26 -05:00
  • 9c49ec1cc1 update autogen George Hotz 2026-01-01 11:36:33 -05:00
  • ed222070f7 update xlog2 fp16 decomp to not use fp32 (#13955) chenyu 2026-01-01 11:18:29 -05:00
  • ce84a23142 remove tee in benchmark (#13954) chenyu 2026-01-01 10:55:36 -05:00
  • 000d4a125b fix ds op George Hotz 2026-01-01 10:36:37 -05:00
  • 24723327ac fix tc_up in search (#13438) b1tg 2026-01-01 23:25:08 +08:00
  • 9726500de8 enable using assembly in Tensor.custom_kernel (#13895) qazal 2026-01-02 00:12:01 +09:00
  • c0f52c9dcb split assembly gemm to per arch directory (#13953) qazal 2026-01-02 00:10:22 +09:00
  • c69470be52 fix test_symbolic_arange_sym_step (#13952) chenyu 2026-01-01 09:41:07 -05:00
  • b91b46091c delete test_tensor_uop (#13951) chenyu 2026-01-01 09:25:05 -05:00
  • 17ef4af72c new ceildiv that fixed symbolic conv (#13944) chenyu 2026-01-01 09:02:41 -05:00
  • 6a5430ab00 correct args order in mi350x gemm (#13949) qazal 2026-01-01 23:01:46 +09:00
  • baff10d32c clean up Tensor.svd slices (#13948) chenyu 2026-01-01 08:18:45 -05:00
  • 1c5ed8e8b5 am: remove doorbells from setup_ring (#13946) nimlgen 2026-01-01 14:39:21 +03:00
  • 526fd4ec71 Fix SVD rank‑1 Jacobi rotation when tau == 0 (#13945) haofei 2025-12-31 21:30:18 -08:00
  • 20777f30b9 Fix QR/SVD NaNs on zero/orthogonal inputs (#13943) haofei 2025-12-31 20:40:09 -08:00
  • 0ed58c1fcd clean up some functions in helpers [pr] (#13942) chenyu 2025-12-31 18:29:16 -05:00
  • 63289902d8 refactors George Hotz 2025-12-31 17:57:27 -05:00
  • e2987001ee unify pre-commit mypy and ci mypy (#13940) chenyu 2025-12-31 17:51:51 -05:00
  • 8bf7c9c1d2 no-op cleanups for ptx [pr] (#13938) chenyu 2025-12-31 17:28:39 -05:00
  • b596f77e33 assembly/amd: add pcode ds ops George Hotz 2025-12-31 16:59:02 -05:00
  • 4571979fac refactor George Hotz 2025-12-31 21:33:37 +00:00
  • 9302f38f5b rdna4 works George Hotz 2025-12-31 21:20:47 +00:00
  • aec4d65241 ds compiled more_pcode George Hotz 2025-12-31 14:54:39 -05:00
  • f022a7d8a7 assembly/amd: move more instructions to pcode George Hotz 2025-12-31 13:13:19 -05:00
  • 2bb07d4824 assembly/amd: move Reg out of the pseudocode (#13934) George Hotz 2025-12-31 15:34:51 -05:00
  • 52acadc160 consolidate IGNORE_OOB=0 tests (#13937) chenyu 2025-12-31 15:24:20 -05:00
  • c0c1c1c8c8 remove unused validate rule (#13936) chenyu 2025-12-31 15:02:49 -05:00
  • b6d08f247d assert z3_xor input type (#13933) chenyu 2025-12-31 13:37:57 -05:00
  • f14428090f assembly/amd: speed up emulator (#13932) George Hotz 2025-12-31 13:32:25 -05:00
  • 13973e4dea refactor image pitch (#13928) Christopher Milan 2025-12-31 10:22:38 -08:00
  • 051fe6c8bc less toposort iteration in oob validate (#13929) chenyu 2025-12-31 13:16:34 -05:00
  • 2a6904029b more rdna4 work George Hotz 2025-12-31 18:07:42 +00:00
  • a9a7b33404 IGNORE_OOB=0 in CI (#13903) chenyu 2025-12-31 12:56:59 -05:00
  • 14bc1b0c68 Merge origin/master George Hotz 2025-12-31 17:47:59 +00:00
  • 29402034a1 assembly/amd: cleanups to asm and emu (#13912) George Hotz 2025-12-31 12:46:11 -05:00
  • ba9aa5cd6f skip some PTX IGNORE_OOB validation (#13927) chenyu 2025-12-31 12:40:21 -05:00
  • c9b074639e work on rdna4 asm George Hotz 2025-12-31 16:18:19 +00:00
  • 4968060ad4 fix IGNORE_OOB=0 for WEBGPU (#13926) chenyu 2025-12-31 10:41:28 -05:00
  • 35bd39e4ba update mypy and torch version in ci (#13925) chenyu 2025-12-31 10:29:28 -05:00
  • b998a80b5d assembly/amd: split generated stuff into enum/ins (#13924) George Hotz 2025-12-31 10:10:52 -05:00
  • 404755bafd merge ci ruff tests and update ruff version (#13922) chenyu 2025-12-31 09:53:49 -05:00
  • 25440f0f72 all2all (#13902) nimlgen 2025-12-31 16:38:32 +03:00
  • f7ee644950 amd: lazy sdma queue allocation (#13920) nimlgen 2025-12-31 15:17:13 +03:00
  • b063518ea7 am: several sdmas (#13919) nimlgen 2025-12-31 14:19:22 +03:00
  • b23f4517ab prep mi350x gemm for python dsl (#13918) qazal 2025-12-31 20:00:57 +09:00
  • 3f3786ded9 mmapeak: fix compiler import (#13915) qazal 2025-12-31 16:52:23 +09:00
  • a14896fff2 refactor QCOM arg parsing (#13914) Christopher Milan 2025-12-30 16:26:02 -08:00
  • c475c3a6d7 remove useless cast (#13911) Christopher Milan 2025-12-30 16:24:29 -08:00
  • 0221b96761 assembly/amd: fix all ops tests (#13910) George Hotz 2025-12-30 18:01:34 -05:00
  • dc27eb48ac remove PYTHONPATH="." from test.yml (#13909) chenyu 2025-12-30 17:00:16 -05:00
  • efc99d0c55 assembly/amd: more refactors (#13907) George Hotz 2025-12-30 16:13:24 -05:00
  • 7d82cd45a8 tests gen_pdf_fast George Hotz 2025-12-30 14:48:21 -05:00
  • 06809da01d Merge origin/master into gen_pdf_fast George Hotz 2025-12-30 14:48:03 -05:00
  • ef5ee0f723 assembly/amd: factor out pdf generation George Hotz 2025-12-30 14:44:45 -05:00
  • 7a1190b729 Merge origin/master into only_reg_emu2 (keep branch's Reg-based approach) only_reg_emu2 George Hotz 2025-12-30 18:53:50 +00:00
  • 49d1bf93d6 assembly/amd: refactor asm.py to be simpler (#13900) George Hotz 2025-12-30 13:51:40 -05:00
  • 433248c998 assembly/amd: only reg emu George Hotz 2025-12-30 18:05:09 +00:00
  • 04c79505ec no subnormal bf16 (#13905) George Hotz 2025-12-30 13:02:53 -05:00
  • 39f99b207a update IGNORE_OOB error message (#13904) chenyu 2025-12-30 12:25:55 -05:00
  • 7e14cdcb06 assembly/amd: clean up clt/ctz hack (#13901) George Hotz 2025-12-30 11:59:28 -05:00
  • 69cdc8066d assembly/amd: add dtype tests to AMD IDE CI (#13899) George Hotz 2025-12-30 11:09:51 -05:00
  • 9c89be5235 assembly/amd: fix v_perm_b32 + PC fixes (#13897) George Hotz 2025-12-30 09:25:40 -05:00
  • 2b838dc1d8 assembly/amd: fix AMD_LLVM=1 support in emulator (#13881) George Hotz 2025-12-30 09:09:57 -05:00
  • 7f139a934f assembly/amd: switch to Reg in pcode George Hotz 2025-12-30 14:00:28 +00:00
  • 05d27abcc2 tests pass only_reg_emu George Hotz 2025-12-30 13:49:05 +00:00
  • a19d21ea9c am: mi3xx smu clocks (#13894) nimlgen 2025-12-30 16:44:17 +03:00
  • 153c5a1670 assembly/amd: use Reg in emu George Hotz 2025-12-30 12:52:03 +00:00
  • b557c46233 assembly gemm clean ups, instructions for cli (#13892) qazal 2025-12-30 16:14:06 +09:00
  • d7e1f26e3d command line interface for sqtt viz (#13891) qazal 2025-12-30 12:33:21 +09:00
  • ab58926b00 update sampling in test_float_cast_to_unsigned (#13889) chenyu 2025-12-29 21:35:46 -05:00
  • 0497387e45 NIR: new-style (fix beam) (#13887) Christopher Milan 2025-12-29 15:41:29 -08:00
  • fc4faed0b2 Revert "NIR: new-style compilers (#13875)" (#13888) Christopher Milan 2025-12-29 14:42:28 -08:00
  • 94bca91f3e assembly/amd: have asm go through the dsl (#13886) George Hotz 2025-12-29 17:39:11 -05:00
  • 7322d9ec4a assembly/amd: add new instruction support to pcode (#13885) George Hotz 2025-12-29 17:30:17 -05:00
  • 170e8825c7 3 tests fail rdna3_vibes George Hotz 2025-12-29 22:12:45 +00:00
  • d0e470c308 Merge origin/master George Hotz 2025-12-29 21:31:58 +00:00
  • 0d326f5b9b fix missing instructions in pseudocode (#13884) George Hotz 2025-12-29 16:11:22 -05:00
  • 9c6850fc01 remove try-catches on llvm import (#13883) Christopher Milan 2025-12-29 12:56:17 -08:00
  • 9d8397be11 add CDNA3+RDNA4 support (#13882) George Hotz 2025-12-29 15:51:29 -05:00
  • 72236bbd3d NIR: new-style compilers (#13875) Christopher Milan 2025-12-29 12:31:41 -08:00
  • 81cf9ea0ab rename to extra.assembly.amd (#13879) George Hotz 2025-12-29 14:10:55 -05:00
  • 37f0fa11b6 rdna3 test cleanups (#13878) George Hotz 2025-12-29 13:41:59 -05:00
  • 6352e4dcea Merge origin/master George Hotz 2025-12-29 18:41:13 +00:00
  • 35db73b231 add cdna4 support to parsers (#13877) George Hotz 2025-12-29 13:23:43 -05:00