Commit Graph

  • 5cf42dc4db add Scheduler to replace Kernel with POSTOPT=2 (#11924) George Hotz 2025-09-03 19:23:30 -07:00
  • b13e071463 move test_winograd to unit test (#11993) chenyu 2025-09-03 21:47:32 -04:00
  • edc8b99853 more tests that pass PTX now (#11992) chenyu 2025-09-03 21:18:14 -04:00
  • ed2f45712b remove skip PTX in test_arange (#11991) chenyu 2025-09-03 20:45:19 -04:00
  • a5f2b4872a use_tensor_cores is a heuristic (#11989) George Hotz 2025-09-03 17:05:10 -07:00
  • 63e930fec3 apply_tensor_cores is a heuristic (#11988) George Hotz 2025-09-03 16:39:33 -07:00
  • bf6feb7dcc flip that sign tc_is_heur George Hotz 2025-09-03 16:21:07 -07:00
  • 04363c9c86 move apply_tensor_cores to heuristics George Hotz 2025-09-03 16:12:16 -07:00
  • d0e739453e update many einsum tests (#11981) chenyu 2025-09-03 15:40:20 -04:00
  • 55e4bdd353 split_uop is a method (#11984) George Hotz 2025-09-03 10:46:17 -07:00
  • 1877eddde4 broadcast for upat (#11940) ttomsa 2025-09-03 18:04:23 +01:00
  • 5ed262982a remove some tc hacks from BEAM (#11980) George Hotz 2025-09-03 09:59:10 -07:00
  • 6d53cac457 dtype fuzz: log need input > 0 (#11979) b1tg 2025-09-04 00:10:42 +08:00
  • 68e83b850f nbytes should raise an exception when size is unlimited (#11928) Jordan Chalupka 2025-09-03 10:06:20 -04:00
  • 86e908db57 cast parents of int64 alu to int32 if possible (#11977) Sieds Lykles 2025-09-03 11:05:04 +02:00
  • 033184b3cb parse_valid with non const rhs (#11957) Sieds Lykles 2025-09-03 08:08:46 +02:00
  • 53eff8970a add Ops.GEP to _min_max (#11976) Sieds Lykles 2025-09-03 07:07:54 +02:00
  • d1d0960e6e remove intermediate cast using bounds - weaker pattern (#11974) Sieds Lykles 2025-09-03 06:24:40 +02:00
  • 8a2846b31a assert embedding input is integer dtype (#11963) Sieds Lykles 2025-09-03 01:44:26 +02:00
  • d16cc6c012 feat: resume ckpt (#11970) wozeparrot 2025-09-02 15:47:48 -07:00
  • 1b73993521 pyrender to render uops (#11968) George Hotz 2025-09-02 15:44:01 -07:00
  • e921fb44ee clean up testnvidia env (#11969) chenyu 2025-09-02 18:29:00 -04:00
  • 69dd1817d0 raise RuntimeError in merge_dicts instead of assert [pr] (#11965) chenyu 2025-09-02 17:18:44 -04:00
  • f750c15965 viz: add python marker (#11952) qazal 2025-09-02 23:44:00 +03:00
  • 550cf2ca7f tests from postopt (#11964) George Hotz 2025-09-02 13:34:17 -07:00
  • f150e27ad1 reraise is fine test_from_postopt George Hotz 2025-09-02 13:01:20 -07:00
  • 81d597ebbc tests from postopt George Hotz 2025-09-02 12:59:36 -07:00
  • b977ec0813 viz: axes domains cleanup (#11962) qazal 2025-09-02 19:30:45 +03:00
  • 897254ad6c ci: add dev<->cpu copy speeds (#11959) nimlgen 2025-09-02 15:22:44 +03:00
  • 74040663bf make ptrdtype a UOp property (#11955) George Hotz 2025-09-01 16:35:43 -07:00
  • 0dfca4e74b add failing test for rangeify setitem (#11954) George Hotz 2025-09-01 16:24:35 -07:00
  • 7c21271a5f feat: end_lr envvar (#11953) wozeparrot 2025-09-01 14:53:07 -07:00
  • 6a40216724 correct bf16 fuzz input in test_dtype_alu (#11933) chenyu 2025-09-01 10:52:26 -04:00
  • 965ea59b16 test_dtype_alu use AMD_LLVM from helpers (#11950) chenyu 2025-09-01 10:03:17 -04:00
  • a9f07c31bc fix amd llvm sqrt (#11936) b1tg 2025-09-01 21:31:14 +08:00
  • 0a53e72f70 viz: fix trace duration in python test decoder (#11949) qazal 2025-09-01 14:32:25 +03:00
  • 27c9ed5a84 viz: more consistent naming of events (#11948) qazal 2025-09-01 14:16:47 +03:00
  • c7bb561ef9 remu: add v_rsq_f32_e32 instruction (#11947) qazal 2025-09-01 11:29:31 +03:00
  • d9560a631c remove cast between ints if safe (#11946) Sieds Lykles 2025-09-01 05:56:49 +02:00
  • a19d689481 fix vec dtype _min_max (#11944) Sieds Lykles 2025-09-01 03:24:07 +02:00
  • f32f3464d6 Can safe cast from certain ints to floats (#11941) Sieds Lykles 2025-09-01 00:51:24 +02:00
  • 1c6e43c203 Double cast is one cast if intermediate cast is safe (#11939) Sieds Lykles 2025-09-01 00:36:29 +02:00
  • 7e68045fb2 feat: small llama3 training (#11829) wozeparrot 2025-08-31 13:41:47 -07:00
  • 020abe0556 hcq: finalize without synchronization when in error state (#11872) nimlgen 2025-08-31 18:39:13 +03:00
  • 2004c9757d tracing: add default clock (#11935) qazal 2025-08-31 18:24:44 +03:00
  • c1eeb3b99c only skip AMD_LLVM (#11934) b1tg 2025-08-31 23:15:47 +08:00
  • 75d380a77c fix transcendentals in python renderer (#11932) b1tg 2025-08-31 21:37:17 +08:00
  • 61e4dc6ad5 render special arg in cstyle if arg is UOp (#11931) Sieds Lykles 2025-08-31 07:01:29 +02:00
  • d3252ccd85 fix special vmax when arg is UOp (#11930) Sieds Lykles 2025-08-31 06:54:39 +02:00
  • 0bacd9fc9b viz: give disassembly its own node (#11927) qazal 2025-08-31 00:28:52 +03:00
  • af89be317e relax rtol for bfloat16 test_dtype_alu (#11926) chenyu 2025-08-30 17:16:08 -04:00
  • 632c2fb119 lowerer works on rangeifed + print exception (#11925) George Hotz 2025-08-30 12:05:44 -07:00
  • c27b99d68f viz: refactor to indexed rewrite traces (#11923) qazal 2025-08-30 20:01:47 +03:00
  • 9aff00a6ea switch viz command line args to pathlib (#11922) qazal 2025-08-30 18:13:47 +03:00
  • c86ee5bfaf viz: canonicalize device name colors (#11921) qazal 2025-08-30 18:12:30 +03:00
  • a4f05ebd1a ci: rebuild gpuocelot with boost libs (#11920) nimlgen 2025-08-30 17:24:19 +03:00
  • bf0d055b39 viz: color by name (#11919) qazal 2025-08-30 16:04:58 +03:00
  • 0bc34c000f simplify range mod its own upper bound (#11917) Sieds Lykles 2025-08-30 08:37:35 +02:00
  • f4e1b93225 fix beam working_postopt George Hotz 2025-08-29 19:05:49 -07:00
  • 4a31c319b3 work George Hotz 2025-08-29 18:42:42 -07:00
  • 59081645f7 beam in RKernel George Hotz 2025-08-29 18:17:23 -07:00
  • 561318fea7 Tensor.cos in test_stype_alu (#11916) chenyu 2025-08-29 20:26:36 -04:00
  • 0838021753 remove np from beautiful_cifar (#10988) NoahKusaba 2025-08-29 19:34:16 -04:00
  • cf9d8c8142 ci: pin boost for macos runners (#11910) nimlgen 2025-08-30 01:38:06 +03:00
  • c6e342cdac mockgpu: no hang if gpuocelot failed (#11915) nimlgen 2025-08-30 00:44:49 +03:00
  • 26d03a86a1 test_symbolic_ops.py cleanup (#11895) chenyu 2025-08-29 17:11:59 -04:00
  • 6e57905c6d Merge branch 'master' into working_postopt George Hotz 2025-08-29 12:37:01 -07:00
  • b2cc06218a python bfloat16 (#11912) b1tg 2025-08-30 03:18:02 +08:00
  • 40606c60b0 Merge branch 'master' into working_postopt George Hotz 2025-08-29 11:23:17 -07:00
  • afad7d0cd1 remove dtype from range, it will be dtypes.index soon [pr] (#11914) George Hotz 2025-08-29 09:52:07 -07:00
  • 80986321eb Merge branch 'master' into working_postopt George Hotz 2025-08-29 09:43:05 -07:00
  • 30e72d5820 multi device and copy tracing for NULL device (#11913) qazal 2025-08-29 15:31:00 +03:00
  • d8e1e4dc61 tracing: show NULL programs (#11911) qazal 2025-08-29 14:09:33 +03:00
  • 75678b2cbe amd: retire pm4 xcc sync (#11835) nimlgen 2025-08-29 09:56:27 +03:00
  • bd263cbcb0 Merge branch 'master' into working_postopt George Hotz 2025-08-28 15:13:02 -07:00
  • 394c2d1db1 update Kernel API in tests + move optimize_local_size (#11907) George Hotz 2025-08-28 15:12:47 -07:00
  • 2e41472b02 support tc 2, all are pad George Hotz 2025-08-28 14:40:50 -07:00
  • ac641f7b10 support tc padding George Hotz 2025-08-28 14:34:42 -07:00
  • 226c59fa5a bugfix George Hotz 2025-08-28 14:13:34 -07:00
  • 3bbfcbccde work George Hotz 2025-08-28 13:55:07 -07:00
  • fa695ac1ce ci: mac gpuocelot (#11906) nimlgen 2025-08-28 23:29:43 +03:00
  • 78a56b3461 fix some tests George Hotz 2025-08-28 12:55:12 -07:00
  • b5ac4501d4 Merge branch 'master' into working_postopt George Hotz 2025-08-28 12:35:39 -07:00
  • b9b438c516 small updates from postopt (#11903) George Hotz 2025-08-28 12:34:52 -07:00
  • bb55a3001f nv: flush reset message (#11897) nimlgen 2025-08-28 22:17:20 +03:00
  • 038d3bc295 clean up test George Hotz 2025-08-28 11:39:02 -07:00
  • 4b223c820a fix test George Hotz 2025-08-28 11:29:02 -07:00
  • e8289c75b1 ci: do not reinstall existing pkgs in macos (#11900) nimlgen 2025-08-28 21:20:15 +03:00
  • 528e285d81 more tests George Hotz 2025-08-28 11:15:47 -07:00
  • cd3dc67636 tensor cores need to pad George Hotz 2025-08-28 11:04:30 -07:00
  • b19a8963c3 revert George Hotz 2025-08-28 11:00:20 -07:00
  • ec10e00cf5 Merge branch 'master' into working_postopt George Hotz 2025-08-28 10:50:00 -07:00
  • d3aa38ad4a work George Hotz 2025-08-28 10:46:18 -07:00
  • 134cf56904 update cache name for gpuocelot (#11896) chenyu 2025-08-28 13:11:10 -04:00
  • ea1be2e4cd [bounty] Remove using reshape to register symbolic shape (#11771) Ben Waldron 2025-08-28 16:30:49 +00:00
  • 53853ae49b viz: switch to Path2D (#11892) qazal 2025-08-28 18:58:16 +03:00
  • 874c1db4af am: init support for aql (#11888) nimlgen 2025-08-28 18:41:46 +03:00
  • 17ecaf4682 Add test_variable_empty (#11889) Ben Waldron 2025-08-28 15:38:27 +00:00
  • 54be477152 rope cache optim for jit prune in llm.py (#11678) Nino Risteski 2025-08-28 17:31:29 +02:00
  • 6e41040e91 get tc ranges George Hotz 2025-08-28 08:24:33 -07:00