George Hotz
|
d10668283d
|
Merge remote-tracking branch 'origin/master' into asm_ucode
# Conflicts:
# test/test_jit.py
# test/test_jit_footguns.py
# tinygrad/engine/jit.py
|
2026-01-08 05:14:42 -08:00 |
|
George Hotz
|
627c440d38
|
minmax
|
2026-01-08 05:14:04 -08:00 |
|
George Hotz
|
e7b5d8a434
|
assembly/amd: more RDNA4 asm (#14062)
* rdna4 more
* asm
* fixes
* assembly/amd: handwritten wmma failing test
* passes
* wmma default hacks
* space
* 0 skips in rdna3/rdna4 disasm
* more RDNA4 tests
---------
Co-authored-by: qazal <qazal.software@gmail.com>
|
2026-01-08 05:09:37 -08:00 |
|
George Hotz
|
0c40e52ae1
|
no void
|
2026-01-08 05:06:23 -08:00 |
|
George Hotz
|
894230d0a9
|
fix parser bugs
|
2026-01-08 05:02:55 -08:00 |
|
George Hotz
|
544a877960
|
cvt functions
|
2026-01-08 04:57:04 -08:00 |
|
nimlgen
|
e372c841ba
|
hevc: beam in decode (#14067)
* hevc: beam in decode
* fine
* g
|
2026-01-08 15:47:16 +03:00 |
|
George Hotz
|
0dfdad0e76
|
cleanup pcode_parse
|
2026-01-08 04:27:26 -08:00 |
|
nimlgen
|
f3aceaa08b
|
hevc: fast decoder (#14057)
|
2026-01-08 15:20:37 +03:00 |
|
George Hotz
|
4a7456caef
|
move more
|
2026-01-08 04:18:46 -08:00 |
|
qazal
|
309197bca5
|
assembly/amd: test_roundtrip for cdna/rdna4 (#14066)
|
2026-01-08 21:03:13 +09:00 |
|
George Hotz
|
d84db5851f
|
calls
|
2026-01-08 00:55:57 -08:00 |
|
qazal
|
15a056715d
|
fix amd assembly IDE tests on macbook (#14063)
|
2026-01-08 17:27:52 +09:00 |
|
George Hotz
|
37b4751958
|
isNAN
|
2026-01-08 00:24:10 -08:00 |
|
George Hotz
|
5e923ccb5e
|
simpler ucode
|
2026-01-08 00:10:23 -08:00 |
|
George Hotz
|
10836a5dba
|
lil cleanups
|
2026-01-07 23:49:53 -08:00 |
|
George Hotz
|
56ba96f5cd
|
uops have types
|
2026-01-07 21:52:55 -08:00 |
|
George Hotz
|
2db04d0696
|
assembly/amd: start adding RDNA4 support (#14060)
* assembly/amd: start adding RDNA4 support
* rdna4 asm
|
2026-01-07 21:19:30 -08:00 |
|
George Hotz
|
c8b42edec6
|
a bunch of todos for my boy claude
|
2026-01-07 20:18:04 -08:00 |
|
wozeparrot
|
0069cd9a0b
|
tk: support sliced local -> reg load (#14034)
|
2026-01-07 20:18:04 -08:00 |
|
George Hotz
|
cb500466c2
|
assembly/amd: amd_asm_matmul (#13989)
* amd_asm_matmul
* dsl transform
* asm roundtrip
* fixed
* less
* better
* more
* simpler
* simplify
* lil
* simpler
* compact
* work
* cleanups
* simplify
* simpler
* cleanup
* name the regs
* simp
* big simp
* big simp
* simp
* acc grid
* fast
* stuff
* fast
* simpler
* owrks
* save vgprs
* save vgprs
* Compact
* less VGPRs
* after
* SQTT support
* fastest
* faster
* lil faster
* tile regs
* faster
* readable
* one more
* simpler
* lil simpler
* NO_GLOBAL skips early globals
* stock kernel
* cleanups
* cleanups
* one b reg
* safe reg changes
* acc is compact now
* remove confusing stuff
* sregs
* lds cleanups
* vopd
|
2026-01-07 20:11:05 -08:00 |
|
nimlgen
|
5bd4593eda
|
hevc: cleaner decoder (#14056)
* hevc: cleaner decoder
* nn
|
2026-01-07 18:29:30 +03:00 |
|
chenyu
|
c714881832
|
don't allow jit input to be const (#14045)
* don't allow jit input to be unbuffered like const
* just const to fix multi
* fix rnnt
|
2026-01-06 18:15:22 -05:00 |
|
wozeparrot
|
2b3e01e79c
|
tk: support sliced local -> reg load (#14034)
|
2026-01-06 05:33:24 -05:00 |
|
George Hotz
|
947747eb5e
|
Merge branch 'master' into asm_ucode
|
2026-01-06 00:16:03 -08:00 |
|
George Hotz
|
45f7fd073d
|
assembly/amd: pcode bug fixes (#14032)
* bring over pcode parser
* fixes
* pdf test
* delay alu
|
2026-01-06 00:15:48 -08:00 |
|
wozeparrot
|
21d0f6bb76
|
tk: flat global -> local load (#14033)
|
2026-01-05 23:35:53 -08:00 |
|
George Hotz
|
640dac46c2
|
pcode_exec
|
2026-01-05 21:15:29 -08:00 |
|
George Hotz
|
05129e58b0
|
pcode back
|
2026-01-05 20:15:30 -08:00 |
|
George Hotz
|
b7dc59a68d
|
fix emu
|
2026-01-05 20:09:20 -08:00 |
|
George Hotz
|
ec7ec99cbd
|
better
|
2026-01-05 20:06:52 -08:00 |
|
George Hotz
|
c8c6346336
|
tests
|
2026-01-05 19:57:30 -08:00 |
|
George Hotz
|
6de310c87f
|
parsing
|
2026-01-05 19:54:56 -08:00 |
|
George Hotz
|
ffba806b65
|
pdf/qcode work
|
2026-01-05 19:42:47 -08:00 |
|
George Hotz
|
c4016d5cac
|
fix psuedocode parsing
|
2026-01-05 18:58:55 -08:00 |
|
George Hotz
|
a5587fbda1
|
Merge origin/master, delete pcode.py
|
2026-01-05 18:53:43 -08:00 |
|
George Hotz
|
20653d2996
|
assembly/amd: make pdf.py code shine (#14029)
* assembly/amd: make pdf.py code shine
* no merge
* pdf2 is the future
* something
* regen enums
* test
* work
* remove junk
* write
* pcode extraction
* pdf2 passes all tests
* simplify
* simpler pdf
* late filter
* remove hacks
* simplify pdf2.py
* field type
* remove defaults
* don't export srcenum
* simple pdf.py
* simpler
* cleaner
* less hack in PDF
|
2026-01-05 18:49:40 -08:00 |
|
qazal
|
ea7b149ca5
|
viz command line tool (#14030)
|
2026-01-06 10:19:47 +09:00 |
|
George Hotz
|
74da1c6310
|
test qcode
|
2026-01-05 02:42:27 -08:00 |
|
nimlgen
|
70405b4f3c
|
am_smi: mi350 (#14018)
|
2026-01-05 13:10:56 +03:00 |
|
George Hotz
|
400d59c06b
|
simpler
|
2026-01-04 20:37:06 -08:00 |
|
George Hotz
|
57684d2777
|
no pcode
|
2026-01-04 18:35:16 -08:00 |
|
George Hotz
|
8147a78d24
|
wide dtypes
|
2026-01-04 17:26:12 -08:00 |
|
George Hotz
|
486248f775
|
fix pcode
|
2026-01-04 17:04:52 -08:00 |
|
George Hotz
|
87e72f1540
|
ftz
|
2026-01-04 16:32:35 -08:00 |
|
George Hotz
|
b52ff63896
|
fixes
|
2026-01-04 15:48:31 -08:00 |
|
George Hotz
|
404eed6172
|
assembly/amd: improve tests for asm (#14007)
* assembly/amd: improve tests for asm
* upd
* skip
* tests
* re bug
* more passing
* cleanups
* cdna fixups
* improve tests, better CDNA parsing
* fix CI
* no defs
* simpler
* all pass
* from pdf
* regen
|
2026-01-04 15:14:08 -08:00 |
|
George Hotz
|
7f7f12d5b4
|
99% match
|
2026-01-04 15:05:05 -08:00 |
|
George Hotz
|
b10ae6958e
|
roundtripping
|
2026-01-04 14:31:40 -08:00 |
|
George Hotz
|
10e2c47d52
|
don't make dtype
|
2026-01-04 13:49:47 -08:00 |
|