11735 Commits

Author SHA1 Message Date
George Hotz
322eb1fbc8 bitcast 2026-01-11 09:50:19 +09:00
George Hotz
18fabb8723 less spec 2026-01-11 09:48:41 +09:00
George Hotz
c028fdd83c simpler spec 2026-01-11 09:41:57 +09:00
George Hotz
688f57468b work 2026-01-11 09:14:48 +09:00
George Hotz
104f377c15 transform 2026-01-11 09:04:29 +09:00
George Hotz
e32259c937 spec 2026-01-11 08:51:39 +09:00
George Hotz
0de8f6dbdf work 2026-01-11 08:28:01 +09:00
George Hotz
ca9a7b5e10 less verbose 2026-01-11 08:24:07 +09:00
George Hotz
7c76883a99 all rewrite 2026-01-11 08:14:26 +09:00
George Hotz
7aae4a06fb more typed 2026-01-11 08:04:20 +09:00
George Hotz
d0921d944d simpler 2026-01-11 07:38:36 +09:00
George Hotz
8dda2a5ecf just Uops 2026-01-11 07:37:08 +09:00
George Hotz
ee87056305 single PM 2026-01-11 07:33:24 +09:00
George Hotz
146c4ebda4 simpler norm 2026-01-11 07:24:57 +09:00
George Hotz
172b3c870c assign/declare 2026-01-11 07:17:47 +09:00
George Hotz
25b79a72a1 move to transform 2026-01-11 07:10:54 +09:00
George Hotz
55677ff8ef comments 2026-01-09 17:02:02 -08:00
George Hotz
709a36ba20 transform 2026-01-09 13:11:47 -08:00
George Hotz
57fc145fdd pretty print 2026-01-09 12:58:18 -08:00
George Hotz
7a89b9866a more pcode parse 2026-01-09 11:50:33 -08:00
George Hotz
d10668283d Merge remote-tracking branch 'origin/master' into asm_ucode
# Conflicts:
#	test/test_jit.py
#	test/test_jit_footguns.py
#	tinygrad/engine/jit.py
2026-01-08 05:14:42 -08:00
George Hotz
627c440d38 minmax 2026-01-08 05:14:04 -08:00
George Hotz
e7b5d8a434 assembly/amd: more RDNA4 asm (#14062)
* rdna4 more

* asm

* fixes

* assembly/amd: handwritten wmma failing test

* passes

* wmma default hacks

* space

* 0 skips in rdna3/rdna4 disasm

* more RDNA4 tests

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2026-01-08 05:09:37 -08:00
George Hotz
0c40e52ae1 no void 2026-01-08 05:06:23 -08:00
George Hotz
894230d0a9 fix parser bugs 2026-01-08 05:02:55 -08:00
George Hotz
544a877960 cvt functions 2026-01-08 04:57:04 -08:00
nimlgen
e372c841ba hevc: beam in decode (#14067)
* hevc: beam in decode

* fine

* g
2026-01-08 15:47:16 +03:00
nimlgen
1732a4ec4b am: rework set_clocks (#14065) 2026-01-08 15:33:32 +03:00
George Hotz
0dfdad0e76 cleanup pcode_parse 2026-01-08 04:27:26 -08:00
nimlgen
f3aceaa08b hevc: fast decoder (#14057) 2026-01-08 15:20:37 +03:00
George Hotz
4a7456caef move more 2026-01-08 04:18:46 -08:00
qazal
309197bca5 assembly/amd: test_roundtrip for cdna/rdna4 (#14066) 2026-01-08 21:03:13 +09:00
George Hotz
d84db5851f calls 2026-01-08 00:55:57 -08:00
qazal
15a056715d fix amd assembly IDE tests on macbook (#14063) 2026-01-08 17:27:52 +09:00
George Hotz
37b4751958 isNAN 2026-01-08 00:24:10 -08:00
George Hotz
5e923ccb5e simpler ucode 2026-01-08 00:10:23 -08:00
George Hotz
10836a5dba lil cleanups 2026-01-07 23:49:53 -08:00
wozeparrot
027b935269 tk: fix grouped load store (#14035) 2026-01-07 22:38:02 -08:00
George Hotz
56ba96f5cd uops have types 2026-01-07 21:52:55 -08:00
George Hotz
2db04d0696 assembly/amd: start adding RDNA4 support (#14060)
* assembly/amd: start adding RDNA4 support

* rdna4 asm
2026-01-07 21:19:30 -08:00
George Hotz
c8b42edec6 a bunch of todos for my boy claude 2026-01-07 20:18:04 -08:00
chenyu
add569d94c test_unrealized_const_input_frozen (#14044)
unrealized const is not replaced in jit
2026-01-07 20:18:04 -08:00
nimlgen
e33d79226d amd: copies w/o sdma (#14036)
* amd: copies w/o sdma

* as_args

* fixes

* f
2026-01-07 20:18:04 -08:00
chenyu
caa52dcbe5 raise when jit fxn returns non-Tensor output (#14042) 2026-01-07 20:18:04 -08:00
chenyu
aa96d826f4 JitError (#14041)
* JitError

* test_symbolic_jit
2026-01-07 20:18:04 -08:00
chenyu
b4fd0954b7 test jit tolist failure (#14040)
also moved tests to test_jit_footguns
2026-01-07 20:18:04 -08:00
chenyu
02ab3eb153 test case for jit a function with item call (#14039)
* test case for jit a function with item call

output is silently wrong now

* no dtype
2026-01-07 20:18:04 -08:00
nimlgen
a6198a67fc mockdsp: use dsp allocator (#14037)
* mockdsp: use dsp allocator

* fix

* ?
2026-01-07 20:18:04 -08:00
wozeparrot
0069cd9a0b tk: support sliced local -> reg load (#14034) 2026-01-07 20:18:04 -08:00
George Hotz
cb500466c2 assembly/amd: amd_asm_matmul (#13989)
* amd_asm_matmul

* dsl transform

* asm roundtrip

* fixed

* less

* better

* more

* simpler

* simplify

* lil

* simpler

* compact

* work

* cleanups

* simplify

* simpler

* cleanup

* name the regs

* simp

* big simp

* big simp

* simp

* acc grid

* fast

* stuff

* fast

* simpler

* owrks

* save vgprs

* save vgprs

* Compact

* less VGPRs

* after

* SQTT support

* fastest

* faster

* lil faster

* tile regs

* faster

* readable

* one more

* simpler

* lil simpler

* NO_GLOBAL skips early globals

* stock kernel

* cleanups

* cleanups

* one b reg

* safe reg changes

* acc is compact now

* remove confusing stuff

* sregs

* lds cleanups

* vopd
2026-01-07 20:11:05 -08:00