Commit Graph

  • d178235309 delete tree structure from CLAUDE.md (#13876) Clément Verrier 2025-12-29 19:23:20 +01:00
  • ff856a74cb minor refactoring for rdna3 (#13873) George Hotz 2025-12-29 13:20:00 -05:00
  • 39923203ba fix exception in cuda bindings code on windows (#13823) C T 2025-12-29 19:58:22 +02:00
  • 63a1bb8507 multi custom kernel: support input mixed with copy and shard (#13748) b1tg 2025-12-30 01:54:27 +08:00
  • 0a98fd38b3 fix tests that failed locally on mac (#13872) chenyu 2025-12-29 11:23:38 -05:00
  • f0f08c75e5 3 failures George Hotz 2025-12-29 01:07:54 +00:00
  • 0f2fd824e6 bug fix George Hotz 2025-12-28 13:41:24 +00:00
  • 923e5158e7 test okay George Hotz 2025-12-28 13:11:23 +00:00
  • 8440f35534 never change heuristic George Hotz 2025-12-27 21:19:13 +00:00
  • 0d7624d7cf 64-bit crap George Hotz 2025-12-27 21:04:21 +00:00
  • 6984125197 all tests pass except test_padded_conv3d George Hotz 2025-12-27 20:25:16 +00:00
  • 1104d659af bitfield cache George Hotz 2025-12-27 20:01:58 +00:00
  • a95b641a49 conv2d passes, even without ILP George Hotz 2025-12-27 18:27:05 +00:00
  • 82068cff9a fix recip George Hotz 2025-12-27 16:36:40 +00:00
  • 1382f9b9ab fix div George Hotz 2025-12-27 00:07:54 +00:00
  • 81e9ea2bec more George Hotz 2025-12-26 23:56:58 +00:00
  • 09c4f61aed work George Hotz 2025-12-26 21:36:16 +00:00
  • bc35d7ca37 simpler George Hotz 2025-12-26 19:13:58 +00:00
  • f851b885cd work George Hotz 2025-12-26 17:19:55 +00:00
  • b0b08604d8 rebase George Hotz 2025-12-26 16:32:54 +00:00
  • 834de38f72 work George Hotz 2025-12-26 16:10:35 +00:00
  • 1a2b954e7c rdna new George Hotz 2025-12-26 04:43:17 +00:00
  • 16be4f2107 switch to rdna_new renderer George Hotz 2025-12-26 04:29:31 +00:00
  • 727da0f4b3 all tests pass fast George Hotz 2025-12-25 23:20:35 -05:00
  • 8f0578f665 all tests pass George Hotz 2025-12-25 23:16:34 -05:00
  • e4d940263d dual mov George Hotz 2025-12-25 22:58:41 -05:00
  • c5ea05c682 tests pass George Hotz 2025-12-25 17:25:32 -05:00
  • 6cf535fd07 work George Hotz 2025-12-25 16:55:11 -05:00
  • 74266eaee5 more handwritten George Hotz 2025-12-25 16:45:12 -05:00
  • d41bb12a13 more handwritten George Hotz 2025-12-25 16:38:12 -05:00
  • f6d68f2090 work George Hotz 2025-12-25 16:23:11 -05:00
  • e500d0b197 roundtrip test George Hotz 2025-12-25 14:43:57 -05:00
  • 3ed01037ba more llvm asm tests George Hotz 2025-12-25 13:25:55 -05:00
  • e756709548 factorize a lil George Hotz 2025-12-25 16:11:13 +00:00
  • badf9339e1 simpler George Hotz 2025-12-23 23:31:48 +00:00
  • 0823952864 heuristic refactor George Hotz 2025-12-23 16:46:01 +00:00
  • c489eba654 nonsense George Hotz 2025-12-22 23:51:45 +00:00
  • afa490e3f4 this diff is getting dumb George Hotz 2025-12-22 21:44:18 +00:00
  • d6863e42bd that George Hotz 2025-12-22 15:32:52 +00:00
  • f0510d0e1d fix George Hotz 2025-12-22 14:42:28 +00:00
  • ab56fe5347 tests George Hotz 2025-12-22 14:00:54 +00:00
  • 1ea1ce8923 more tests George Hotz 2025-12-21 16:46:16 +00:00
  • a6b55a1db0 better dtypes George Hotz 2025-12-20 20:05:52 +00:00
  • 4ebdc9f86c work George Hotz 2025-12-20 12:11:35 +00:00
  • 1c932ccb8d fixes George Hotz 2025-12-20 04:18:05 +00:00
  • 3573037342 no need George Hotz 2025-12-20 03:43:31 +00:00
  • 9a7432487f code version George Hotz 2025-12-20 03:41:07 +00:00
  • b63d34bd79 tpye errors George Hotz 2025-12-20 03:34:41 +00:00
  • 3e4186f882 amd rdna George Hotz 2025-12-19 23:27:56 -04:00
  • f201c66c96 revert that George Hotz 2025-12-19 23:26:21 -04:00
  • b8e0fee3c6 tests pass George Hotz 2025-12-20 03:04:56 +00:00
  • b5204e69dd remu improvements George Hotz 2025-12-19 19:37:08 +00:00
  • e0d9c8ef2b remu fixes George Hotz 2025-12-19 19:05:29 +00:00
  • 8a8e7d6103 add RDNA backend CI runner George Hotz 2025-12-19 18:21:02 +00:00
  • 61b0a4886a more George Hotz 2025-12-19 18:17:37 +00:00
  • 19a581e1b7 all ops pass George Hotz 2025-12-19 18:09:48 +00:00
  • 41f1ae51fa all ops pass George Hotz 2025-12-19 18:00:28 +00:00
  • 4872ad2bf4 fix trig George Hotz 2025-12-19 16:45:57 +00:00
  • 6009a5e72b 6 failures George Hotz 2025-12-19 15:32:34 +00:00
  • 9e765ba513 work George Hotz 2025-12-19 12:13:40 +00:00
  • d782d5fdba refactor George Hotz 2025-12-19 01:16:49 +00:00
  • c253f15025 less lines George Hotz 2025-12-19 00:31:36 +00:00
  • 649ef75c5e less George Hotz 2025-12-19 00:09:54 +00:00
  • ec52c2821d progress George Hotz 2025-12-18 20:42:12 +00:00
  • 174b72fa55 no George Hotz 2025-12-18 16:49:48 +00:00
  • c6681d63bb tests George Hotz 2025-12-18 16:49:26 +00:00
  • 3bed227c14 fix wall time George Hotz 2025-12-17 23:19:57 +00:00
  • 8aae624a92 works George Hotz 2025-12-17 23:06:29 +00:00
  • e4bf751687 work George Hotz 2025-12-17 17:11:36 +00:00
  • c14594acb8 look ahead George Hotz 2025-12-17 14:32:17 +00:00
  • 66718494ef enable support float4 George Hotz 2025-12-17 12:18:50 +00:00
  • 70747d760f vibing George Hotz 2025-12-16 22:54:35 +00:00
  • 1282b387f3 more work George Hotz 2025-12-16 19:55:11 +00:00
  • 8b5d1e8a13 rdna3: add missing ops (NEG, MOD, IDIV) George Hotz 2025-12-16 15:58:08 +00:00
  • 14c9712259 progress George Hotz 2025-12-16 15:45:09 +00:00
  • 935c148f69 rdna3 assembly backend George Hotz 2025-12-16 08:43:30 -04:00
  • 0e409ff5ce fix indentation in UOp pretty_print for repeated references (#13857) Clément Verrier 2025-12-29 16:46:16 +01:00
  • f1471a3b99 speed up rdna3 unit tests + add to CI (#13871) George Hotz 2025-12-29 10:26:48 -05:00
  • 37720fd6c0 also look for linux libraries in RHEL-themed paths (#13863) h-vetinari 2025-12-30 02:05:32 +11:00
  • 25ef866e89 write python emulator from RDNA3 psuedocode in pdf (#13841) George Hotz 2025-12-29 07:39:53 -05:00
  • 88eb230326 memory: correct pa allocator size (#13861) nimlgen 2025-12-29 14:49:44 +03:00
  • f541540129 variable N for asm gemm (#13869) qazal 2025-12-29 19:35:50 +09:00
  • c6769badc2 mockgpu: async support (#13868) nimlgen 2025-12-29 13:18:37 +03:00
  • fc5278746f mi350x assembly gemm cleanups (#13867) qazal 2025-12-29 18:47:23 +09:00
  • f07c39cfa4 hwtest fixes for rdna3 dsl (#13865) George Hotz 2025-12-28 20:42:29 -05:00
  • 65aa41a116 hwtest fixes for rdna3 dsl [hwtest_fixes] George Hotz 2025-12-28 20:23:10 -05:00
  • d9603c1bee improve asm dsl syntax (#13864) George Hotz 2025-12-28 20:04:59 -05:00
  • f5090192c8 reorder AMD tensor core benchmark test (#13860) chenyu 2025-12-28 12:29:51 -05:00
  • 066d96c397 print tflops in asm gemm test (#13859) qazal 2025-12-29 02:26:40 +09:00
  • a03cd43e78 fix typing in compute_gradient (#13852) chenyu 2025-12-28 11:52:14 -05:00
  • cba05acadf re-enable TYPED=1 import test (#13858) chenyu 2025-12-28 11:49:06 -05:00
  • 2cfbabdc34 mi350x 1tflop bf16 gemm in extra (#13702) qazal 2025-12-28 21:45:42 +09:00
  • 2180eee5e4 use the asm dsl in remu hwtest.py (#13856) qazal 2025-12-28 11:32:41 +09:00
  • 784b919f7f Revert "optim empty shard #13513 (#13598)" (#13855) chenyu 2025-12-27 21:10:23 -05:00
  • 9b4de8abc7 fix beam in python 3.14+ (#13836) anu 2025-12-27 16:24:22 -05:00
  • 0f74909ae9 clean up rearrange (#13851) chenyu 2025-12-27 11:06:10 -05:00
  • f6c660f7fa simplify sqtt decoder infra (#13849) qazal 2025-12-28 00:31:16 +09:00
  • ae013beab8 handle empty VECTORIZE in UOp.render() (#13847) Clément Verrier 2025-12-27 16:09:39 +01:00
  • a2da61d096 use new style amd compiler in viz (#13848) qazal 2025-12-27 23:59:30 +09:00
  • 1ee92003ea minor typo (#13846) JINO ROHIT 2025-12-27 20:04:57 +05:30