mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-04-29 03:00:14 -04:00
An integrated environment for AMD GPU assembly and emulation
Test with `pytest -n12 test/amd/`
`DEV=AMD:LLVM pytest -n12 test/amd/`
* dsl.py -- helpers for the autogen instruction classes in `__init__.py`. Should be standalone together with `__init__.py`
* test/mockgpu/amd/emu.py -- an emulator for RDNA that runs in tinygrad with `DEV=MOCK{KFD|USB}+AMD`
* generate.py -- extract assembly format + instruction pseudocode from AMD XML + PDF
* test/mockgpu/amd/pcode.py -- pseudocode to UOp transformation
* sqtt.py -- SQTT parser
The code should be as readable and deduplicated as possible. emu (in test/mockgpu/amd/) shouldn't be required for dsl.
The autogen folder is autogenerated from the AMD PDFs with `python3 -m tinygrad.renderer.amd.pdf --arch all`
test_emu.py has a good set of instruction tests for the emulator; with `USE_HW=1` it compares the results against real hardware.
Whenever an instruction is fixed, regression tests should be added here and confirmed with real hardware.
test_llvm.py tests asm/disasm on the LLVM tests, confirming it behaves the same as LLVM.
tinygrad's dtype tests should pass with and without LLVM. They run in about 12 seconds.
`DEV=MOCKKFD+AMD pytest -n=12 test/backend/test_dtype_alu.py test/backend/test_dtype.py`
`DEV=MOCKKFD+AMD:LLVM pytest -n=12 test/backend/test_dtype_alu.py test/backend/test_dtype.py`
The ops tests also pass, but they are very slow, so you should run them one at a time.
`SKIP_SLOW_TEST=1 DEV=MOCKKFD+AMD pytest -n=12 test/backend/test_ops.py`
`SKIP_SLOW_TEST=1 DEV=MOCKKFD+AMD:LLVM pytest -n=12 test/backend/test_ops.py`
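The dtype and ops commands above differ only in the `DEV` value, the optional `SKIP_SLOW_TEST=1`, and the test file, so they can be generated and run in sequence. A minimal sketch, not part of the repo: `build_runs` and `run_all` are hypothetical helper names, and unlike the commands above it launches one pytest process per test file.

```python
import os
import subprocess

# DEV values taken from the commands above: emulator without and with LLVM.
DEV_VARIANTS = ["MOCKKFD+AMD", "MOCKKFD+AMD:LLVM"]

def build_runs(test_files, skip_slow=False, workers=12):
  """Return (extra_env, argv) pairs, one per DEV variant and test file."""
  runs = []
  for dev in DEV_VARIANTS:
    for path in test_files:
      env = {"DEV": dev}
      if skip_slow: env["SKIP_SLOW_TEST"] = "1"
      runs.append((env, ["pytest", f"-n={workers}", path]))
  return runs

def run_all(runs):
  """Run the commands one at a time and return the ones that failed."""
  failed = []
  for extra_env, argv in runs:
    # Extend the current environment so PATH etc. stay available to pytest.
    result = subprocess.run(argv, env={**os.environ, **extra_env})
    if result.returncode != 0: failed.append((extra_env, argv))
  return failed
```

For example, `run_all(build_runs(["test/backend/test_ops.py"], skip_slow=True))` runs the two ops commands back to back and returns whichever of them failed.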
When something is caught by main tinygrad tests, a local regression test should be added to `test/amd`.
While working with tinygrad, you can dump the assembly with `DEBUG=7`. These tests all pass on real hardware.
If a test is failing with `DEV=MOCKKFD+AMD` it's because an instruction is emulated incorrectly.
You can run with just `DEV=AMD` to test on real hardware. If the test passes on real hardware, there's a bug in the emulator.
IMPORTANT: if a test is failing in the emulator, it's an instruction bug. Use DEBUG=7, get the instructions, and debug.
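The triage rule above can be sketched as a small script. This is an illustration under assumptions, not repo code: `passes`, `classify` and `triage` are hypothetical names, the test id is whatever pytest node id is failing, and the "fails on hardware too" branch is not covered by the text above.

```python
import os
import subprocess

def passes(test_id, dev, debug=None):
  """Run one pytest test id under the given DEV value; True if it passed."""
  env = {**os.environ, "DEV": dev}
  if debug is not None: env["DEBUG"] = str(debug)  # e.g. 7 to dump the assembly
  return subprocess.run(["pytest", test_id], env=env).returncode == 0

def classify(emu_passed, hw_passed):
  """Map the emulator and hardware results onto the rule described above."""
  if emu_passed: return "ok in emulator"
  if hw_passed: return "emulator bug: rerun with DEBUG=7 and inspect the instructions"
  # Assumption: the README does not say what to do when hardware fails as well.
  return "fails on hardware too: not necessarily an emulator bug"

def triage(test_id):
  """Run under the emulator first, then on real hardware only if that failed."""
  emu = passes(test_id, "MOCKKFD+AMD")
  hw = emu or passes(test_id, "AMD")
  return classify(emu, hw)
```

Once a test is classified as an emulator bug, `passes(test_id, "MOCKKFD+AMD", debug=7)` reruns it with the assembly dump enabled.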
Currently, only RDNA3 is well supported, but when finished, this will support RDNA3+RDNA4+CDNA in ~3000 lines.
Get line count with `cloc --by-file tinygrad/renderer/amd/*.py`
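If `cloc` is not installed, a rough stand-in can be written in a few lines. This is only an approximation under stated assumptions: it counts every line that is neither blank nor a pure `#` comment, so docstrings count as code here, and its totals need not match `cloc`. `count_code_lines` and `count_by_file` are hypothetical names.

```python
import glob

def count_code_lines(path):
  """Count lines that are neither blank nor pure `#` comments."""
  with open(path, encoding="utf-8") as f:
    return sum(1 for line in f if line.strip() and not line.strip().startswith("#"))

def count_by_file(pattern="tinygrad/renderer/amd/*.py"):
  """Return {path: code line count} for every file matching the glob."""
  return {path: count_code_lines(path) for path in sorted(glob.glob(pattern))}
```

`sum(count_by_file().values())` then gives a total to compare against the ~3000 line target.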