Commit Graph

  • d65bd669f8 update tiny torch backend hook (#12575) Daniel 2025-10-15 20:02:33 +02:00
  • db5ae846aa nv: do not use va_addr for cpu accesses (#12697) nimlgen 2025-10-15 22:48:12 +08:00
  • 3ab23af829 nv: copy prog with copyin (#12701) nimlgen 2025-10-15 22:48:01 +08:00
  • fafbf3daea memory: reserve ptable (#12702) nimlgen 2025-10-15 22:47:50 +08:00
  • 85a907605c hotfix: only 20 steps of beautiful_mnist_torch, some CI machines are slow George Hotz 2025-10-15 22:29:34 +08:00
  • e1996d358c use RTLD_GLOBAL on macos (#12699) Christopher Milan 2025-10-15 10:24:50 -04:00
  • 312c622d35 support None in pad_to and shrink_to (#12700) chenyu 2025-10-15 09:25:31 -04:00
  • 612e3d6143 replace mop arg with vectorized index (#12695) George Hotz 2025-10-15 20:50:06 +08:00
  • 9ec4c06d7d feat: one request per device (#12698) wozeparrot 2025-10-15 05:22:07 -07:00
  • 99aa3bd5f9 reduce collapse reduce only the cut range (#12687) Sieds Lykles 2025-10-15 13:57:41 +02:00
  • 91ac4f1f92 late merging of where and load (#12694) Sieds Lykles 2025-10-15 13:33:06 +02:00
  • 768dc952de viz ui cleanups / renaming (#12691) qazal 2025-10-15 18:40:22 +08:00
  • 2e50ed0767 increase timeout of resnet cron (#12693) chenyu 2025-10-15 06:08:58 -04:00
  • 0aabc1e938 Mesa NIR backend (NAK/LLVMpipe) (#12089) Christopher Milan 2025-10-15 05:38:33 -04:00
  • f0268d13f6 cleanup viz server (#12688) qazal 2025-10-15 15:58:36 +08:00
  • aa81bde150 amd: usb4/thunderbolt on macs (#12641) nimlgen 2025-10-15 13:02:01 +08:00
  • 236c4590c3 use margs as intermediate for new style mops (#12686) George Hotz 2025-10-15 12:43:00 +08:00
  • 7597e1dcac pyrender in viz (#12682) qazal 2025-10-15 11:53:30 +08:00
  • 60e03eec37 viz: add View Program option (#12683) qazal 2025-10-15 11:37:51 +08:00
  • a59439d013 use UOp.shape property instead of UOp.st (#12664) George Hotz 2025-10-15 10:01:34 +08:00
  • 0b00981cd1 fix wmma (branch: new_shape) George Hotz 2025-10-15 09:38:07 +08:00
  • 5d485660da reproed failure in emulation George Hotz 2025-10-15 09:19:56 +08:00
  • 0e06b5cbb6 support emulate in the NullDevice George Hotz 2025-10-15 09:11:57 +08:00
  • 89df6f611d reenable sdxl mac benchmark (#12680) chenyu 2025-10-14 17:36:17 -04:00
  • d25ceffe8d update padto opts tests (#12679) chenyu 2025-10-14 17:00:42 -04:00
  • e8380968f2 add venv_sd_mlperf to gitignore (#12676) chenyu 2025-10-14 12:51:36 -04:00
  • f228c03f9f fetch raid from cloud (#10799) wozeparrot 2025-10-14 07:53:55 -07:00
  • 3a4a3e09ea Merge branch 'master' into new_shape George Hotz 2025-10-14 21:16:44 +08:00
  • 70dd297a05 BS=96 for bert (#12675) chenyu 2025-10-14 09:07:43 -04:00
  • 3b0b3dcff3 oops, i didn't mean to change that George Hotz 2025-10-14 20:12:47 +08:00
  • accee5d840 hack for 3 op assign George Hotz 2025-10-14 20:10:53 +08:00
  • e6812bbe63 one less st George Hotz 2025-10-14 19:54:37 +08:00
  • 18a6492e98 test is broken George Hotz 2025-10-14 19:50:17 +08:00
  • 9723b4f1c1 close George Hotz 2025-10-14 19:46:52 +08:00
  • 852d80dff9 better where on load folding (#12651) Sieds Lykles 2025-10-14 13:30:47 +02:00
  • 07162df323 size doesn't use st George Hotz 2025-10-14 19:30:24 +08:00
  • 7a2e206a0d fix tests George Hotz 2025-10-14 19:22:16 +08:00
  • c7e63601fd gfx1200 tc for AMD_LLVM (#12673) nimlgen 2025-10-14 19:17:48 +08:00
  • 61855c24a8 Merge branch 'master' into new_shape George Hotz 2025-10-14 19:15:09 +08:00
  • db4a359374 fix up some slow tests that launch python (#12672) George Hotz 2025-10-14 19:13:55 +08:00
  • 28076d9270 const uses _shape George Hotz 2025-10-14 19:12:03 +08:00
  • 2c90f3ea76 split test_advancedindex (branch: remove_slow_tests) George Hotz 2025-10-14 19:02:21 +08:00
  • d99457657b svd nonfull in parallel George Hotz 2025-10-14 18:50:11 +08:00
  • 8a34a4e2c7 fix up some slow tests that launch python George Hotz 2025-10-14 18:42:42 +08:00
  • 4918c827c2 amd: lib_gpu does not need cpu_access (#12670) nimlgen 2025-10-14 18:34:34 +08:00
  • 0c9d47deab hcq: add alignment to kernargs (#12669) nimlgen 2025-10-14 18:33:12 +08:00
  • d51cae1396 shape is good George Hotz 2025-10-14 18:28:00 +08:00
  • 0b69698ad4 mostly works George Hotz 2025-10-14 18:19:02 +08:00
  • d3bfcd3277 minor patches for SQTT over usb on gfx12 (#12627) qazal 2025-10-14 18:07:46 +08:00
  • 1e6e5a0efd parse_valid returns None instead of raising (#12663) Sieds Lykles 2025-10-14 11:57:38 +02:00
  • 04ead92ebd _shape is like _device George Hotz 2025-10-14 17:53:17 +08:00
  • 471bd30d16 cleanup viz/serve.py (#12665) qazal 2025-10-14 17:50:39 +08:00
  • faddebef07 need to cache it George Hotz 2025-10-14 17:35:29 +08:00
  • a659cb18a4 all mops George Hotz 2025-10-14 17:24:08 +08:00
  • 8721b6884c more mops George Hotz 2025-10-14 17:20:04 +08:00
  • 59512a49fa reshape causing issues George Hotz 2025-10-14 16:59:25 +08:00
  • a73b59caa2 work on shape property George Hotz 2025-10-14 16:50:43 +08:00
  • fb61f3519f remove assign contiguous hack (#12659) George Hotz 2025-10-14 16:42:14 +08:00
  • 30ee7c4c26 cleanup Device usage in Tensor (#12662) George Hotz 2025-10-14 16:22:22 +08:00
  • 1fd14a0889 assign remove_forced_re George Hotz 2025-10-14 16:15:54 +08:00
  • c29075ba8d remove bad contiguous usage in torch backend George Hotz 2025-10-14 16:11:26 +08:00
  • e06cbfcb8a combine pm_drop_and_clauses (#12660) Sieds Lykles 2025-10-14 10:09:41 +02:00
  • 84d4589ed4 remove pylint from pre-commit and CI (#12658) George Hotz 2025-10-14 15:39:59 +08:00
  • de6a8f5bd6 how did that typecheck? (branch: remove_pylint) George Hotz 2025-10-14 15:22:50 +08:00
  • 33d7c19c49 better name George Hotz 2025-10-14 15:15:17 +08:00
  • f14ccd06c9 8 is faster than 4 George Hotz 2025-10-14 15:09:38 +08:00
  • 65d8e1e0bf faster pre-commit George Hotz 2025-10-14 15:05:58 +08:00
  • 31bbcd729a multidevice test is fast George Hotz 2025-10-14 15:00:23 +08:00
  • 235cc39b96 remove pylint from pre-commit and CI George Hotz 2025-10-14 14:56:32 +08:00
  • 4c593feed3 remove assign contiguous hack George Hotz 2025-10-14 14:54:39 +08:00
  • 8ecaf839e2 cleanup UOp tracing [pr] (#12657) qazal 2025-10-14 14:50:59 +08:00
  • 30ff87eab4 realize sched (branch: outerworld_work) George Hotz 2025-10-14 14:44:56 +08:00
  • fe683bafa6 Merge branch 'master' into outerworld_work George Hotz 2025-10-14 14:26:52 +08:00
  • b9eb5b5d49 clean up the LLM tokenizer (#12653) George Hotz 2025-10-14 14:22:01 +08:00
  • a9ef93176f viz: add colored text helper (#12654) qazal 2025-10-14 13:05:26 +08:00
  • ecdc7539a2 add typing to MathTraits (#12650) George Hotz 2025-10-14 12:35:20 +08:00
  • 147fd0e2c6 fix assign (branch: mt_typing) George Hotz 2025-10-14 11:15:12 +08:00
  • 1ecb99480e add typing to MathTraits George Hotz 2025-10-14 10:59:00 +08:00
  • 9bf032de69 viz: keep focused shape in view (#12648) qazal 2025-10-14 10:49:08 +08:00
  • 77b5e6774e fix bert training config (#12647) chenyu 2025-10-13 15:03:47 -04:00
  • f1041dc0ac pylint 4.0.0 (#12642) nimlgen 2025-10-13 23:28:36 +08:00
  • 47e0c43976 feat: Tensor.{load, store} (#12629) wozeparrot 2025-10-13 08:04:41 -07:00
  • 0f776c6e46 examples/mlperf/training_submission_v6.0 (#12644) chenyu 2025-10-13 09:58:25 -04:00
  • e0139fafc1 UOp symbolic tests use eval to check against string (#12643) Sieds Lykles 2025-10-13 14:19:42 +02:00
  • 218225e8d0 pylint error (#12630) b1tg 2025-10-13 20:05:12 +08:00
  • 9096d7cc2e amd: support for rx9060 (#12640) nimlgen 2025-10-13 19:44:15 +08:00
  • 066d25f5fb refactor to trace_num property in buffers (#12638) qazal 2025-10-13 18:06:55 +08:00
  • cd6aeebfee sqtt: osx decoder installer (#12637) qazal 2025-10-13 17:26:12 +08:00
  • e537e895b1 drop unused invalid conditions (#12635) Sieds Lykles 2025-10-13 10:52:21 +02:00
  • 9ab06dffad hotfix: block from env (#12628) wozeparrot 2025-10-12 08:07:32 -07:00
  • 12435a2dab actual tinyfs device (#12620) wozeparrot 2025-10-12 07:51:17 -07:00
  • 8f5f57c7d9 smaller CNT fuzz shapetracker (#12626) chenyu 2025-10-12 08:52:30 -04:00
  • 1ecf403294 cleanup long lines [pr] (#12623) George Hotz 2025-10-12 20:18:05 +08:00
  • fd51ecf983 process_replay for get_rangeify_map (#12624) qazal 2025-10-12 15:14:40 +03:00
  • b5afa3848e viz: fix memory graph total nbytes (#12622) qazal 2025-10-12 14:32:46 +03:00
  • 822eab057f cpu: respect taskset + allow all cores (#12619) nimlgen 2025-10-12 14:31:40 +08:00
  • 7ac74d1550 remove unused type ignore [pr] (#12618) chenyu 2025-10-11 21:24:04 -04:00
  • 772a8dfe31 reshape uses valid when simplifying (#12597) Sieds Lykles 2025-10-11 17:02:54 +02:00
  • 08e62454b6 amd: use cpu_view() in sqtt (#12610) nimlgen 2025-10-11 18:11:25 +08:00
  • a2ae56674a uop_given_valid try multiple clauses (#12615) Sieds Lykles 2025-10-11 11:53:42 +02:00