9 Commits

Author SHA1 Message Date
nimlgen
fb96394ff5 auto-select available compilers (#12094)
* device: auto select compilers

* fix

* metal+opencl

* nv/cuda

* test without ptx

* ptx

* fix tests

* fix

* fix test

* rename

* test + cleaner

* xx

* ops

* better test

* win?

* um?

* types

* debug

* win??

* sep rung

* wtf?

* debug

* skip win

* revert this

* types
2025-09-10 19:52:01 +03:00
George Hotz
394c2d1db1 update Kernel API in tests + move optimize_local_size (#11907) 2025-08-28 15:12:47 -07:00
George Hotz
82be8abfd2 move opt under codegen (#11569) 2025-08-07 14:19:17 -07:00
chenyu
a0438012af remove Kernel.get_program [pr] (#11203) 2025-07-12 20:50:29 -04:00
George Hotz
92678e59ee move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
George Hotz
411392dfb7 move files into uop dir (#10399)
* move files into uop dir [pr]

* tinygrad.uop is a thing

* fix uop docs, no pr

* fix viz
2025-05-18 11:38:28 -07:00
chenyu
720f20865b remove required_optimizations (#9848) 2025-04-19 16:51:16 -04:00
chenyu
f5256e0020 Kernel.apply_opts [pr] (#9917)
* Kernel.apply_opts [pr]

updated all `for opt in`. also updated a few test_linearizer tests to not implicitly depend on hand_coded_optimization

* not you yet
2025-04-17 08:00:56 -04:00
Francis Lam
1e5d9ad8f7 extra/gemm/max_matmul: start of custom kernels for GEMM (#6926)
* extra/gemm/max_matmul: start of custom kernels for GEMM

* add an unoptimized FP16/FP16 MMA example

* add slow 3-stage fp16 acc example

* add correct 3-stage pipeline with unswizzled/flat smem input (slow)

* add acc fp16 example with 3 stages and swizzle (no bank conflicts)

* add max version of NV fp16_fp16_fp16

* fix up comments and removed unused code in max variations

* add start of no_xor example

* fix to account for UOps to Ops
2025-03-19 15:04:57 +08:00