Commit Graph

  • 3573037342 no need George Hotz 2025-12-20 03:43:31 +00:00
  • 9a7432487f code version George Hotz 2025-12-20 03:41:07 +00:00
  • b63d34bd79 type errors George Hotz 2025-12-20 03:34:41 +00:00
  • 3e4186f882 amd rdna George Hotz 2025-12-19 23:27:56 -04:00
  • f201c66c96 revert that George Hotz 2025-12-19 23:26:21 -04:00
  • b8e0fee3c6 tests pass George Hotz 2025-12-20 03:04:56 +00:00
  • b5204e69dd remu improvements George Hotz 2025-12-19 19:37:08 +00:00
  • e0d9c8ef2b remu fixes George Hotz 2025-12-19 19:05:29 +00:00
  • 8a8e7d6103 add RDNA backend CI runner George Hotz 2025-12-19 18:21:02 +00:00
  • 61b0a4886a more George Hotz 2025-12-19 18:17:37 +00:00
  • 19a581e1b7 all ops pass George Hotz 2025-12-19 18:09:48 +00:00
  • 41f1ae51fa all ops pass George Hotz 2025-12-19 18:00:28 +00:00
  • 4872ad2bf4 fix trig George Hotz 2025-12-19 16:45:57 +00:00
  • 6009a5e72b 6 failures George Hotz 2025-12-19 15:32:34 +00:00
  • 9e765ba513 work George Hotz 2025-12-19 12:13:40 +00:00
  • d782d5fdba refactor George Hotz 2025-12-19 01:16:49 +00:00
  • c253f15025 less lines George Hotz 2025-12-19 00:31:36 +00:00
  • 649ef75c5e less George Hotz 2025-12-19 00:09:54 +00:00
  • ec52c2821d progress George Hotz 2025-12-18 20:42:12 +00:00
  • 174b72fa55 no George Hotz 2025-12-18 16:49:48 +00:00
  • c6681d63bb tests George Hotz 2025-12-18 16:49:26 +00:00
  • 3bed227c14 fix wall time George Hotz 2025-12-17 23:19:57 +00:00
  • 8aae624a92 works George Hotz 2025-12-17 23:06:29 +00:00
  • e4bf751687 work George Hotz 2025-12-17 17:11:36 +00:00
  • c14594acb8 look ahead George Hotz 2025-12-17 14:32:17 +00:00
  • 66718494ef enable support float4 George Hotz 2025-12-17 12:18:50 +00:00
  • 70747d760f vibing George Hotz 2025-12-16 22:54:35 +00:00
  • 1282b387f3 more work George Hotz 2025-12-16 19:55:11 +00:00
  • 8b5d1e8a13 rdna3: add missing ops (NEG, MOD, IDIV) George Hotz 2025-12-16 15:58:08 +00:00
  • 14c9712259 progress George Hotz 2025-12-16 15:45:09 +00:00
  • 935c148f69 rdna3 assembly backend George Hotz 2025-12-16 08:43:30 -04:00
  • 0e409ff5ce fix indentation in UOp pretty_print for repeated references (#13857) Clément Verrier 2025-12-29 16:46:16 +01:00
  • f1471a3b99 speed up rdna3 unit tests + add to CI (#13871) George Hotz 2025-12-29 10:26:48 -05:00
  • 37720fd6c0 also look for linux libraries in RHEL-themed paths (#13863) h-vetinari 2025-12-30 02:05:32 +11:00
  • 25ef866e89 write python emulator from RDNA3 pseudocode in pdf (#13841) George Hotz 2025-12-29 07:39:53 -05:00
  • 88eb230326 memory: correct pa allocator size (#13861) nimlgen 2025-12-29 14:49:44 +03:00
  • f541540129 variable N for asm gemm (#13869) qazal 2025-12-29 19:35:50 +09:00
  • c6769badc2 mockgpu: async support (#13868) nimlgen 2025-12-29 13:18:37 +03:00
  • fc5278746f mi350x assembly gemm cleanups (#13867) qazal 2025-12-29 18:47:23 +09:00
  • f07c39cfa4 hwtest fixes for rdna3 dsl (#13865) George Hotz 2025-12-28 20:42:29 -05:00
  • 65aa41a116 hwtest fixes for rdna3 dsl [hwtest_fixes] George Hotz 2025-12-28 20:23:10 -05:00
  • d9603c1bee improve asm dsl syntax (#13864) George Hotz 2025-12-28 20:04:59 -05:00
  • f5090192c8 reorder AMD tensor core benchmark test (#13860) chenyu 2025-12-28 12:29:51 -05:00
  • 066d96c397 print tflops in asm gemm test (#13859) qazal 2025-12-29 02:26:40 +09:00
  • a03cd43e78 fix typing in compute_gradient (#13852) chenyu 2025-12-28 11:52:14 -05:00
  • cba05acadf re-enable TYPED=1 import test (#13858) chenyu 2025-12-28 11:49:06 -05:00
  • 2cfbabdc34 mi350x 1tflop bf16 gemm in extra (#13702) qazal 2025-12-28 21:45:42 +09:00
  • 2180eee5e4 use the asm dsl in remu hwtest.py (#13856) qazal 2025-12-28 11:32:41 +09:00
  • 784b919f7f Revert "optim empty shard #13513 (#13598)" (#13855) chenyu 2025-12-27 21:10:23 -05:00
  • 9b4de8abc7 fix beam in python 3.14+ (#13836) anu 2025-12-27 16:24:22 -05:00
  • 0f74909ae9 clean up rearrange (#13851) chenyu 2025-12-27 11:06:10 -05:00
  • f6c660f7fa simplify sqtt decoder infra (#13849) qazal 2025-12-28 00:31:16 +09:00
  • ae013beab8 handle empty VECTORIZE in UOp.render() (#13847) Clément Verrier 2025-12-27 16:09:39 +01:00
  • a2da61d096 use new style amd compiler in viz (#13848) qazal 2025-12-27 23:59:30 +09:00
  • 1ee92003ea minor typo (#13846) JINO ROHIT 2025-12-27 20:04:57 +05:30
  • 276159cb87 system: add base_class to pci_scan_bus (#13845) nimlgen 2025-12-27 13:22:21 +03:00
  • fac137779e remove flux1 seed image (#13843) Francis Lata 2025-12-27 00:45:11 -05:00
  • f6de9095a0 switch asm tests to dsl (#13840) qazal 2025-12-27 02:15:16 +09:00
  • ba922094f2 remove redundant check in disk_supports_fast_copyout (#13838) chenyu 2025-12-26 11:30:55 -05:00
  • e9f2aaba2a simplify rdna3 asm (#13835) George Hotz 2025-12-26 11:21:03 -05:00
  • c44b4f9ae0 am: fix sdma warm boot (#13837) nimlgen 2025-12-26 12:38:06 +03:00
  • c6937fa744 more work on RDNA3 asm (#13833) George Hotz 2025-12-25 23:28:14 -05:00
  • f1111ac7de move amd compilers to new style (#13831) George Hotz 2025-12-25 13:42:24 -05:00
  • 9d94b8c6b2 python asm dsl in extra + python REMU (#13436) George Hotz 2025-12-25 13:04:14 -05:00
  • b5f3a5ad79 am: cleanup comment (#13828) nimlgen 2025-12-25 18:00:28 +03:00
  • 8985a4a023 one less branch in Buffer.view [pr] (#13829) chenyu 2025-12-25 09:34:15 -05:00
  • 094753b4e0 renderer arch version cleanup [pr] (#13830) chenyu 2025-12-25 09:32:56 -05:00
  • 54af29dbdb trange can just be a function (#13827) chenyu 2025-12-24 23:57:10 -05:00
  • a1c1684b91 set .amdhsa_kernarg_size in asm test (#13826) qazal 2025-12-25 13:08:14 +09:00
  • da1cb6a9ec update llama dataloader (#13825) chenyu 2025-12-24 17:42:08 -05:00
  • a7fc0c288b clean up BufferCopy init [pr] (#13824) chenyu 2025-12-24 10:40:15 -05:00
  • 903753c60c llama wandb logging (#13822) chenyu 2025-12-24 10:24:59 -05:00
  • e3a646dce3 viz: skip plaintext disassemble for cfg (#13821) qazal 2025-12-24 23:16:59 +09:00
  • 63447d50ef pickle more_early_comps George Hotz 2025-12-23 19:34:04 -05:00
  • 2621e57c53 more George Hotz 2025-12-23 19:22:39 -05:00
  • cb07c5d0e8 fewer import annotations (#13819) chenyu 2025-12-23 18:45:50 -05:00
  • 8c05401d5d fix George Hotz 2025-12-23 18:28:13 -05:00
  • 7b0ce86e2a more early compilers George Hotz 2025-12-23 18:15:58 -05:00
  • 43c6e973d8 add optional compiler in Renderer (#13817) George Hotz 2025-12-23 17:58:46 -05:00
  • 8eab6175ee get_program refactor (#13816) George Hotz 2025-12-23 16:44:46 -05:00
  • 3d3c5b2fb9 add device to program (#13815) George Hotz 2025-12-23 16:15:33 -05:00
  • a07c9da26b Merge branch 'master' into remove_programspec [remove_programspec] George Hotz 2025-12-23 15:21:46 -05:00
  • 816a359a3c do programspec removal George Hotz 2025-12-23 15:21:02 -05:00
  • 90b217896f am: xgmi p2p (#13811) nimlgen 2025-12-23 20:11:38 +03:00
  • 6439a515be test fixups / speedups / var_vals refactor (#13812) George Hotz 2025-12-23 12:05:59 -05:00
  • 8dcba2e2cc no full_rewrite [pr] (#13809) George Hotz 2025-12-22 23:20:01 -05:00
  • edce2303f4 rewrite to program (#13808) George Hotz 2025-12-22 20:03:33 -05:00
  • 2af2b4da5d Revert "rewrites for renderer and compiler (#13646)" (#13806) George Hotz 2025-12-22 19:21:33 -05:00
  • 2aec58654a one compiler path [one_compiler] George Hotz 2025-12-22 19:20:04 -05:00
  • 339dadf056 rewrites for renderer and compiler (#13646) George Hotz 2025-12-22 18:58:43 -05:00
  • 4edaaf19e5 Handle tied embeddings for llama 3.2 1B (#13796) Daniel Xu 2025-12-22 13:31:40 -08:00
  • 7f1d41c9f9 delete files that import ShapeTracker (#13805) chenyu 2025-12-22 15:54:18 -05:00
  • 9e0a42ec0e typed typed_checks George Hotz 2025-12-22 18:40:53 +00:00
  • 703ab8c63e Merge branch 'master' into typed_checks George Hotz 2025-12-22 13:24:10 -05:00
  • b31373ca70 remove llvm-mca stuff from viz (#13802) qazal 2025-12-23 02:41:51 +09:00
  • 27d899ce97 TRAIN=0 to only eval llama (#13804) chenyu 2025-12-22 11:55:46 -05:00
  • 39d962106f update llama logging (#13803) chenyu 2025-12-22 11:28:29 -05:00
  • 389f01c7f4 viz: amdgpu assembly basic block graph (#13755) qazal 2025-12-23 00:17:16 +09:00
  • df0f9d6860 add olmoe support to llm (#13792) George Hotz 2025-12-22 10:41:35 -04:00
  • 81d9053013 roc: cast to nullptr instead of changing header (#13801) qazal 2025-12-22 23:34:06 +09:00