* assembler maybe

* custom asm

* rdna3 on quiet

* trigger crashes

* fixed notes

* non-fatal rdna2 crash

* Crash4

* improve rdna sniffer

* comments

* improve sniffer

* asm

* 131 TFLOPS RDNA3

* opt simple matmul

* todos
George Hotz
2023-05-16 05:33:57 -07:00
committed by GitHub
parent 89b8b39d9c
commit 90fff82c8a
14 changed files with 389 additions and 35 deletions

@@ -197,7 +197,7 @@ class CStyleCodegen(Linearizer):
# sometimes, there's more dimensions than len(self.lang.gid).
# compact all the dimensions into the first
# NOTE: this might make multiview shapetrackers
-# TODO: this exposes bugs in the optimizers assuming the strides are on a single vie
+# TODO: this exposes bugs in the optimizers assuming the strides are on a single view
"""
if len(self.lang.gid) and self.first_reduce > len(self.lang.gid):
num_to_merge = (self.first_reduce - len(self.lang.gid))+1
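
The comments in this hunk describe compacting extra dimensions into the first one when a kernel has more non-reduce dimensions than the backend has global IDs. A minimal standalone sketch of that idea follows. It is not the code from this commit: `compact_leading_dims` and its parameters are hypothetical names, `max_global_dims` stands in for `len(self.lang.gid)`, and the shape is assumed to be a plain tuple rather than a shapetracker.

```python
from math import prod

def compact_leading_dims(shape, first_reduce, max_global_dims):
    """Merge leading dims so at most max_global_dims non-reduce dims remain.

    shape:           full kernel shape as a tuple of ints (assumed)
    first_reduce:    index of the first reduce axis; axes before it are global
    max_global_dims: number of global IDs the backend offers (assumed stand-in
                     for len(self.lang.gid))
    """
    if max_global_dims and first_reduce > max_global_dims:
        # Collapse enough leading axes into one so the global axes fit.
        num_to_merge = (first_reduce - max_global_dims) + 1
        return (prod(shape[:num_to_merge]),) + tuple(shape[num_to_merge:])
    # Already fits (or the backend has no global IDs): leave the shape alone.
    return tuple(shape)

# Four global axes, three global IDs: the first two axes are merged.
merged = compact_leading_dims((2, 3, 4, 5, 16), first_reduce=4, max_global_dims=3)
print(merged)
```

Merging only changes how the index space is split across axes, so the total element count is preserved. As the NOTE in the hunk warns, the real merge may produce multi-view shapetrackers, which this tuple-only sketch does not model.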