15 Commits

Author SHA1 Message Date
George Hotz
c417cd3c97 fast HIP gemm -> 100 TFLOPS (#1476)
* fast HIP gemm

* wmma

* correct b

* fix spilling

* 60 TFLOPS

* 64 TFLOPS

* 65 TFLOPS
2023-08-09 06:54:15 -07:00
George Hotz
fe71282ba1 faster RDNA assembly backend (#990)
* fast asm

* torch gemm
2023-06-16 12:06:38 -07:00
George Hotz
ba56ee6020 RDNA assembly backend ($1000 bounty) (#787)
* Revert "Revert "ops rdna""

This reverts commit 0400315078.

* Revert "Revert "writing 2""

This reverts commit 325a3bf2cf.

* no dump

* 2x 2

* simple asm

* local size

* sub

* lil work

* support args != 3

* assembler work

* generate that

* ptx assembler

* begin index renderer

* max

* ptx loops

* gemms work

* valid works

* asm working a bit more

* close

* passing all ops tests

* ptx is a codegen only, not a backend

* ptx

* float16 support

* rdna goes here

* install types

* make amd disassemble

* ansilen for pretty print

* fix ptx log2/exp2

* assemblyinstruction

* new asm

* working gemm

* fix cmp

* more passing

* mod

* ptx works again

* rdna3 add works

* log exp

* sin is sin 2pi

* fix types

* progress

* loops work

* rdna xyz

* better addressing

* cleanups

* handle exception in early process

* div support

* rdna float4

* locals work

* fix neg index

* cast

* smaller diff

* yaml

* import only if selected

* fromimport

* types

* this all needs rewriting

* a few more
2023-06-16 09:33:18 -07:00
George Hotz
0400315078 Revert "ops rdna"
This reverts commit 81a11d891d.
2023-05-21 13:02:18 -07:00
George Hotz
325a3bf2cf Revert "writing 2"
This reverts commit dddd6c42f0.
2023-05-21 13:02:17 -07:00
George Hotz
dddd6c42f0 writing 2
2023-05-21 12:52:36 -07:00
George Hotz
81a11d891d ops rdna
2023-05-21 11:45:38 -07:00
George Hotz
90fff82c8a Rdna (#776)
* assembler maybe

* custom asm

* rdna3 on quiet

* trigger crashes

* fixed notes

* non-fatal rdna2 crash

* Crash4

* improve rdna sniffer

* comments

* improve sniffer

* asm

* 131 TFLOPS RDNA3

* opt simple matmul

* todos
2023-05-16 05:33:57 -07:00
George Hotz
5190037cbc rocm: disassembler for shader
2023-05-06 19:07:52 +00:00
George Hotz
42256c0d9d rocm sniffer dumps code
2023-05-05 18:36:53 +00:00
George Hotz
f2a964f447 nocopy (#764)
2023-05-05 09:32:06 -07:00
George Hotz
3a2011ab2d rocm sniffer
2023-05-04 22:22:39 +00:00
George Hotz
a55c4f5000 better rocm build scripts
2023-05-04 09:14:05 +00:00
George Hotz
987b1aaf96 rocm build scripts
2023-05-04 08:45:23 +00:00
George Hotz
3b933b0a2f rocm setup script
2023-05-03 16:01:17 +00:00