George Hotz
87e72f1540
ftz
2026-01-04 16:32:35 -08:00
George Hotz
b52ff63896
fixes
2026-01-04 15:48:31 -08:00
George Hotz
7f7f12d5b4
99% match
2026-01-04 15:05:05 -08:00
George Hotz
b10ae6958e
roundtripping
2026-01-04 14:31:40 -08:00
George Hotz
10e2c47d52
don't make dtype
2026-01-04 13:49:47 -08:00
George Hotz
058816dd92
use tinygrad UOps as DSL
2026-01-04 13:40:15 -08:00
George Hotz
28846cb6c4
simpler dsl
2026-01-04 13:24:01 -08:00
George Hotz
958bfa1c5b
Op
2026-01-04 13:05:38 -08:00
George Hotz
23cf30820f
more correct
2026-01-04 12:52:20 -08:00
George Hotz
ea51512f90
CMPLE
2026-01-04 12:45:21 -08:00
George Hotz
e9664fdf28
dtype is uop
2026-01-04 12:31:40 -08:00
George Hotz
5d50281896
Merge remote-tracking branch 'origin/master' into asm_ucode
# Conflicts:
# tinygrad/dtype.py
2026-01-04 12:22:52 -08:00
George Hotz
cfeeab8485
work
2026-01-04 12:22:01 -08:00
George Hotz
7abf4591ba
use bitsize on dtype (#14011)
* use bitsize on dtype [pr]
* bitsize
* bitsize in js export, but might be wrong
* reverts
* revert that
2026-01-04 12:16:21 -08:00
George Hotz
2be5f8b688
work
2026-01-04 11:57:42 -08:00
George Hotz
db9140b8b7
work
2026-01-04 11:34:07 -08:00
George Hotz
63f663bd4b
progress
2026-01-04 10:24:40 -08:00
chenyu
cfb8bf5814
faster image load (#13977)
sometimes image load does not need to init with NAN
2026-01-04 13:09:59 -05:00
George Hotz
e38d311f3c
simpler
2026-01-04 10:02:33 -08:00
George Hotz
8e8ad423a7
post parser
2026-01-04 09:16:00 -08:00
George Hotz
7ebda28692
assembly/amd: add CDNA support to asm (#13982)
* add CDNA support
* more cdna tests
* something
* fix more stuff
* more work
* simpler
* simpler
* cdna
* disasm
* less skip
* fixes
* simpler
2026-01-04 08:53:56 -08:00
George Hotz
acad5d7b30
parser
2026-01-04 08:43:41 -08:00
George Hotz
9ef8ae3199
qcode
2026-01-04 08:33:57 -08:00
George Hotz
1f96afb1cb
getting big
2026-01-04 08:05:23 -08:00
chenyu
ad041416ca
delete unused rewrite rule [pr] (#14006)
2026-01-04 09:48:52 -05:00
nimlgen
bf356ae996
am: mi300 48bit address space (#14004)
* am: mi300 48bit address space
* fix
2026-01-04 15:19:25 +03:00
nimlgen
606786e152
am: do not sleep for each hive node during resets (#14003)
2026-01-04 14:02:11 +03:00
George Hotz
59144c6af6
assembly/amd: start replacing pcode with ucode
2026-01-03 23:27:41 -08:00
George Hotz
34ea053b26
assembly/amd: clean up pcode, jit pcode instead of static (#14001)
* assembly/amd: clean up pcode
* regen
* lil
* jit the pcode
* sendmsg
* cleanups
* inst prefetch lol
2026-01-03 23:06:15 -08:00
kamilisjon
280790e438
Reuse toposort in recursive_property (#13993)
2026-01-03 22:04:13 -08:00
kamilisjon
9a9564118c
[pr] Delete reverse_toposort (#13987)
* Delete reverse_toposort
* Update comment and profiler name
* Update profiler name
2026-01-03 22:03:44 -08:00
George Hotz
8328511808
assembly/amd: make the emu.py code shine (#13996)
* assembly/amd: make the code shine
* lil clean
* reg back in pcode
* cleanups
* gen fma_mix
* no writelane hacks
* fn cleanup
* dead vgpr_write
* readable
* smem
* cleanup bench_emu
* speedups
* simpler and faster
* direct inst._fn
* split fxn
* Revert "simpler and faster"
This reverts commit e85f6594b3.
* move lds to wavestate
* dispatcher
* pc in dispatch
* literal isn't wavestate
* cleanups + program
* one readlane
* exec_vop3sd in exec_vop
* cleaner exec_vopd
* fully merge VOP3P
* no special paths
* no SliceProxy
* low=0
* no bigint
* failing tests
* fma on python 3.13
2026-01-03 20:33:09 -08:00
qazal
bdb421f13e
process_replay: passthrough sink arg for Ops.PROGRAM input (#14000)
2026-01-04 13:09:39 +09:00
Galax
66caa9fe1d
fix: library linking for fedora systems (#13999)
2026-01-03 17:40:56 -08:00
chenyu
8003db2a28
test case of NOOP store load folding (#13997)
2026-01-03 14:39:26 -05:00
chenyu
c1b8644a3f
test removing expander rules [pr] (#13994)
2026-01-03 12:38:01 -05:00
Christopher Milan
35c2870b1f
gate image_conv2d pitch hacks on IMAGE==1 (#13995)
* gate image_conv2d pitch hacks on IMAGE==1
* fix opencl image copies
* cleanup
2026-01-03 12:27:31 -05:00
nimlgen
a49924a0e9
hcq: _sleep report status (#13992)
* hcq: _sleep report status
* msg
* print all
2026-01-03 14:28:28 +03:00
nimlgen
3b354bc11f
hcq: better queue management (#13991)
2026-01-03 13:11:15 +03:00
nimlgen
efb2ae87c6
hcq sync aql (#13756)
* hcq sync aql
* w
2026-01-03 12:59:24 +03:00
qazal
bd55507ee4
RDNA3 fp16 assembly gemm 85 TFLOPS (#13990)
2026-01-03 18:34:23 +09:00
wozeparrot
6242a9d151
tk: no global copy and clear ranges (#13988)
2026-01-02 23:45:15 -08:00
wozeparrot
9f082e8e25
fa: split kv bwd into 2 kernels (#13981)
2026-01-02 18:45:51 -08:00
qazal
2cc64d71b0
simplify mi350x gemm / viz asm tests (#13984)
* mi350x gemm cleanup
* asm tests work
* simpler asm tests
2026-01-03 11:11:07 +09:00
chenyu
7cbafb2ef1
update hypothesis min version (#13983)
there was a local_constants perf regression that made hypothesis related tests slow
2026-01-02 21:01:57 -05:00
Christopher Milan
9dc524536f
IMAGE=1 creates "dynamic" images (#13769)
* remove image from BufferSpec
* cl tiny_gemm (64) works
* mypy
* padding
* openpilot CL
* reshape properly
* remove extra qcom checks
* pad output
* mypy
* update compile test
* move undo
* TestImageCopy valid images
* TestImageRealization valid images
* TestImageDType valid images
* cleanups
* test_renderer_failures
* ruff
* mypy
* simplify ops_qcom
* bump step time
* Revert "bump step time"
This reverts commit 75a037c7d0.
* "dynamic textures" are optional
* a start
* IMAGE=1 works, no FLOAT16
* fast but wrong
* mypy
* some fixes
* better
* works
* refactor
* oops
2026-01-02 16:22:39 -05:00
Christopher Milan
61dc70f1a8
add driving_vision IMAGE=1 benchmark (#13979)
2026-01-02 13:58:27 -05:00
George Hotz
0e282025ff
assembly/amd: split test_emu into hw tests (#13966)
* assembly/amd: split test_emu into hw tests
* hw tests
* bugfixes
* more tests and fix
2026-01-02 08:04:56 -08:00
chenyu
2e2b5fed12
fix misspellings (#13976)
2026-01-02 10:37:38 -05:00
nietras
f49e4714af
Fix spelling errors in README for AMD assembly (#13975)
2026-01-02 10:15:20 -05:00