Commit Graph

  • a071adffc0 viz: amdgpu disassembly register highlighting UI (#14059) qazal 2026-01-08 21:27:09 -05:00
  • b878f9d5a4 reuse Tensor init with const path [pr] (#14076) chenyu 2026-01-08 17:49:37 -05:00
  • efcb32f6a9 unique const when requires_grad is set to True (#14075) chenyu 2026-01-08 16:30:45 -05:00
  • b34c637767 support bfloat16 for CL (#14073) chenyu 2026-01-08 14:14:29 -05:00
  • 16b652302e skip bf16 test if not supported by device (#14070) Garret Castro 2026-01-08 10:37:24 -08:00
  • 3f61a96d79 am: SetSoftMaxByFreq on gfx10+ (#14068) nimlgen 2026-01-08 17:00:03 +03:00
  • d10668283d Merge remote-tracking branch 'origin/master' into asm_ucode George Hotz 2026-01-08 05:14:42 -08:00
  • 627c440d38 minmax George Hotz 2026-01-08 05:14:04 -08:00
  • e7b5d8a434 assembly/amd: more RDNA4 asm (#14062) George Hotz 2026-01-08 05:09:37 -08:00
  • 0c40e52ae1 no void George Hotz 2026-01-08 05:06:23 -08:00
  • 894230d0a9 fix parser bugs George Hotz 2026-01-08 05:02:55 -08:00
  • 544a877960 cvt functions George Hotz 2026-01-08 04:57:04 -08:00
  • e372c841ba hevc: beam in decode (#14067) nimlgen 2026-01-08 15:47:16 +03:00
  • 1732a4ec4b am: rework set_clocks (#14065) nimlgen 2026-01-08 15:33:32 +03:00
  • 0dfdad0e76 cleanup pcode_parse George Hotz 2026-01-08 04:27:26 -08:00
  • f3aceaa08b hevc: fast decoder (#14057) nimlgen 2026-01-08 15:20:37 +03:00
  • 4a7456caef move more George Hotz 2026-01-08 04:18:46 -08:00
  • 309197bca5 assembly/amd: test_roundtrip for cdna/rdna4 (#14066) qazal 2026-01-08 07:03:13 -05:00
  • d84db5851f calls George Hotz 2026-01-08 00:55:57 -08:00
  • 15a056715d fix amd assembly IDE tests on macbook (#14063) qazal 2026-01-08 03:27:52 -05:00
  • 37b4751958 isNAN George Hotz 2026-01-08 00:24:10 -08:00
  • 5e923ccb5e simpler ucode George Hotz 2026-01-08 00:10:23 -08:00
  • 10836a5dba lil cleanups George Hotz 2026-01-07 23:49:53 -08:00
  • 027b935269 tk: fix grouped load store (#14035) wozeparrot 2026-01-08 01:38:02 -05:00
  • 56ba96f5cd uops have types George Hotz 2026-01-07 21:52:55 -08:00
  • 2db04d0696 assembly/amd: start adding RDNA4 support (#14060) George Hotz 2026-01-07 21:19:30 -08:00
  • c8b42edec6 a bunch of todos for my boy claude George Hotz 2026-01-06 14:02:29 -08:00
  • add569d94c test_unrealized_const_input_frozen (#14044) chenyu 2026-01-06 14:17:43 -05:00
  • e33d79226d amd: copies w/o sdma (#14036) nimlgen 2026-01-06 21:15:58 +03:00
  • caa52dcbe5 raise when jit fxn returns non-Tensor output (#14042) chenyu 2026-01-06 12:59:20 -05:00
  • aa96d826f4 JitError (#14041) chenyu 2026-01-06 12:19:50 -05:00
  • b4fd0954b7 test jit tolist failure (#14040) chenyu 2026-01-06 11:16:57 -05:00
  • 02ab3eb153 test case for jit a function with item call (#14039) chenyu 2026-01-06 10:40:43 -05:00
  • a6198a67fc mockdsp: use dsp allocator (#14037) nimlgen 2026-01-06 16:04:47 +03:00
  • 0069cd9a0b tk: support sliced local -> reg load (#14034) wozeparrot 2026-01-06 05:33:24 -05:00
  • cb500466c2 assembly/amd: amd_asm_matmul (#13989) George Hotz 2026-01-07 20:11:05 -08:00
  • 3caa1e2c98 fix cast HALF with PYTHON backend (#14058) chenyu 2026-01-07 16:52:05 -05:00
  • 5f1ede7f7e clean up test_dtype (#14055) chenyu 2026-01-07 15:45:42 -05:00
  • 5bd4593eda hevc: cleaner decoder (#14056) nimlgen 2026-01-07 18:29:30 +03:00
  • 241f0402b4 add seed in bert data shuffle (#14054) b1tg 2026-01-07 23:02:05 +08:00
  • 25c82dd242 nv: profile nvdec (#14053) nimlgen 2026-01-07 15:56:54 +03:00
  • 35900290b2 viz: configure text height for cfg (#14052) qazal 2026-01-07 04:58:56 -05:00
  • 87f4bc5446 update variable names around jit [pr] (#14049) chenyu 2026-01-06 22:32:41 -05:00
  • 2833c5a54b few more jit tests with multi tensor inputs (#14047) chenyu 2026-01-06 22:05:22 -05:00
  • 72a3f78d19 jit includes tensor inputs in containers (#14043) chenyu 2026-01-06 19:42:06 -05:00
  • c714881832 don't allow jit input to be const (#14045) chenyu 2026-01-06 18:15:22 -05:00
  • a8896f28e1 test_unrealized_const_input_frozen (#14044) chenyu 2026-01-06 14:17:43 -05:00
  • 325f4006ff amd: copies w/o sdma (#14036) nimlgen 2026-01-06 21:15:58 +03:00
  • 7fb18f7e47 raise when jit fxn returns non-Tensor output (#14042) chenyu 2026-01-06 12:59:20 -05:00
  • 4491ec0c9e JitError (#14041) chenyu 2026-01-06 12:19:50 -05:00
  • 6ddddc68af test jit tolist failure (#14040) chenyu 2026-01-06 11:16:57 -05:00
  • b699b9f763 test case for jit a function with item call (#14039) chenyu 2026-01-06 10:40:43 -05:00
  • 02084f5376 mockdsp: use dsp allocator (#14037) nimlgen 2026-01-06 16:04:47 +03:00
  • 2b3e01e79c tk: support sliced local -> reg load (#14034) wozeparrot 2026-01-06 05:33:24 -05:00
  • 947747eb5e Merge branch 'master' into asm_ucode George Hotz 2026-01-06 00:16:03 -08:00
  • 45f7fd073d assembly/amd: pcode bug fixes (#14032) George Hotz 2026-01-06 00:15:48 -08:00
  • 21d0f6bb76 tk: flat global -> local load (#14033) wozeparrot 2026-01-06 02:35:53 -05:00
  • 3170365a5b visualize SQTT with the same cfg infrastructure (#13870) qazal 2026-01-06 00:53:20 -05:00
  • 640dac46c2 pcode_exec George Hotz 2026-01-05 21:15:29 -08:00
  • 0120d69caa autogen: avcodec (and simplify workflow) (#14031) Christopher Milan 2026-01-05 20:30:25 -08:00
  • 05129e58b0 pcode back George Hotz 2026-01-05 20:15:30 -08:00
  • b7dc59a68d fix emu George Hotz 2026-01-05 20:09:20 -08:00
  • ec7ec99cbd better George Hotz 2026-01-05 20:06:52 -08:00
  • c8c6346336 tests George Hotz 2026-01-05 19:57:30 -08:00
  • 6de310c87f parsing George Hotz 2026-01-05 19:54:56 -08:00
  • ffba806b65 pdf/qcode work George Hotz 2026-01-05 19:42:47 -08:00
  • c4016d5cac fix psuedocode parsing George Hotz 2026-01-05 18:58:55 -08:00
  • a5587fbda1 Merge origin/master, delete pcode.py George Hotz 2026-01-05 18:53:43 -08:00
  • 20653d2996 assembly/amd: make pdf.py code shine (#14029) George Hotz 2026-01-05 18:49:40 -08:00
  • ea7b149ca5 viz command line tool (#14030) qazal 2026-01-05 20:19:47 -05:00
  • f86c728440 load libclang as 'libclang.so' too (#14028) Christopher Milan 2026-01-05 13:56:16 -08:00
  • f7e25e7632 regen cdna_asm George Hotz 2026-01-05 11:26:24 -08:00
  • 1155508f80 fix tests George Hotz 2026-01-05 10:18:00 -08:00
  • 4e213cee95 tests George Hotz 2026-01-05 09:42:26 -08:00
  • a6c17e7081 test fails George Hotz 2026-01-05 09:25:42 -08:00
  • 893f34a80f impl George Hotz 2026-01-05 08:57:41 -08:00
  • bb5103fdb0 simpler George Hotz 2026-01-05 08:20:03 -08:00
  • eaa5a05f3d 100% asm George Hotz 2026-01-05 07:45:31 -08:00
  • eda6a73897 clean up canonicalize_device (#14027) chenyu 2026-01-05 10:29:55 -05:00
  • ea244a4fce cdna progress George Hotz 2026-01-05 07:19:07 -08:00
  • ce464b147a clean up comments that mentioned outdated terms (#14026) chenyu 2026-01-05 09:42:58 -05:00
  • 83063cc3e4 onnx TensorScatter (#14024) chenyu 2026-01-05 09:05:22 -05:00
  • 9497ec00f2 fix onnx attention permute (#14025) chenyu 2026-01-05 08:58:50 -05:00
  • 5cff5698f7 viz: g key toggles graph and text view (#14023) qazal 2026-01-05 08:41:45 -05:00
  • 7a81a3cb98 more passed onnx tests (#14022) chenyu 2026-01-05 07:46:27 -05:00
  • 34fe105386 remove unused LazySeq (#14020) kim yongjin 2026-01-05 21:38:33 +09:00
  • 4f2f38bf64 viz: split cfg and table render (#14021) qazal 2026-01-05 06:59:08 -05:00
  • 74da1c6310 test qcode George Hotz 2026-01-05 02:42:27 -08:00
  • 70405b4f3c am_smi: mi350 (#14018) nimlgen 2026-01-05 13:10:56 +03:00
  • f2b11010e8 no skip George Hotz 2026-01-04 20:51:01 -08:00
  • 400d59c06b simpler George Hotz 2026-01-04 20:37:06 -08:00
  • b2a0b9c551 autogen: dump patch in CI (#14010) Christopher Milan 2026-01-04 19:38:12 -08:00
  • aae08b20e0 enable passed onnx tests (#14017) chenyu 2026-01-04 22:12:50 -05:00
  • 85b28faf33 more asm George Hotz 2026-01-04 18:41:01 -08:00
  • 57684d2777 no pcode George Hotz 2026-01-04 18:35:16 -08:00
  • 785d04d127 simpler einsum (#14014) chenyu 2026-01-04 20:38:59 -05:00
  • 8147a78d24 wide dtypes George Hotz 2026-01-04 17:26:12 -08:00
  • 486248f775 fix pcode George Hotz 2026-01-04 17:04:52 -08:00
  • 6175f4ce70 assembly/amd: cdna asm George Hotz 2026-01-04 16:36:49 -08:00
  • 87e72f1540 ftz George Hotz 2026-01-04 16:32:35 -08:00