qazal
7622be761f
add new remu instructions from #13533 ( #13539 )
2025-12-03 06:29:20 +08:00
qazal
2f95c10702
remu new instructions / use volatile in emulator tests ( #12862 )
* remu new instructions
* start moving to volatile
* test_simple works
* test_exec_mov works and lid is still here
* test_exec_cmp_vopc
* clang did s_mov_b32 exec_lo, 1
* don't hardcode v1
* support volatile in tests
* hw_test passes
* only the volatile version
* subrev saturating behavior
2025-10-23 11:13:43 +08:00
qazal
e8c595c29e
remu: add new instructions introduced in RANGEIFY ( #12363 )
* add v_mad_i64_i32 for test_output_padded_conv_transpose2d
* run amd test_ops
* skip test_masked_select
2025-09-30 12:36:29 +03:00
qazal
c7bb561ef9
remu: add v_rsq_f32_e32 instruction ( #11947 )
https://github.com/tinygrad/tinygrad/pull/11936 introduces a change to
the AMD LLVM renderer that outputs this instruction. Adding both 32 and
64 bit variants.
2025-09-01 11:29:31 +03:00
chenyu
126fcf4129
clean up AMD_LLVM in tests ( #11021 )
2025-06-28 22:45:47 -04:00
George Hotz
32e9949052
rename lazydata to uop ( #10698 )
2025-06-08 08:42:22 -07:00
qazal
17f0f5e764
add v_rcp_f32_e64 to remu ( #10393 )
* tests from the box
* add v_rcp_f32_e64 to remu
* f32::from_bits utils
* v_cndmask_b32 tests
2025-05-18 17:08:21 +03:00
Ignacio Sica
a54fd745c3
simpler barrier match in remu ( #10339 )
* s_barrier
* remove s_barrier from syncs
2025-05-16 14:40:58 +03:00
Ignacio Sica
3c453e96a9
add ds_load_b96 and ds_store_b96 instructions ( #10338 )
2025-05-15 18:11:08 +03:00
qazal
be8202b293
add s_abs_i32 instruction to remu ( #10334 )
2025-05-15 16:47:58 +03:00
qazal
9210280811
add v_fmac_f16 vop3 instruction to remu ( #10247 )
* fmac vop3
* from the box
2025-05-10 23:48:25 +03:00
qazal
4ea3e373aa
decode lds ops in remu ( #10184 )
2025-05-07 16:44:18 +08:00
Ignacio Sica
74c25bdc8b
add support for ds_load_u8 in remu ( #10180 )
* add support for ds_load_u8 in remu
* add test for ds_load_u8
2025-05-06 20:31:00 +03:00
qazal
ac37510f60
remu: only write v_cmp result if exec is set ( #10084 )
2025-04-28 20:31:52 +08:00
qazal
d6b436a815
remu bugfix with -0.0 negation ( #10082 )
2025-04-28 15:46:42 +08:00
qazal
e1d2b64e92
remu new instructions ( #10050 )
* remu new instructions
* test_ds_store_half
* test_v_mul_f16
2025-04-26 02:04:12 +03:00
qazal
bba5d0a3e4
remu refactors ( #10028 )
* remu refactors
* scc is sgpr 253
* remove that
* rename to vcc_lo
* run cargo test in CI
* llvm-mc
* meh
* work
* work_group work 1
* seeded_lanes is dumb
* better than seeded_lanes
* does not need to be address
* 128 sgpr per wave
* scc is sgpr, we don't know which one
* null_src once more
* derive clone, wave init is cleaner
* init comes first
2025-04-26 04:31:10 +08:00
qazal
0b482fb824
add RDNA3 parser to remu ( #10025 )
* llvm ref
* work
* all of them
* salu
* cleaner
* start
* vector ops
* done
* replace SMEM
* vopd
* sop1
* SOPC
* null stays null_src
* sopp
* SOPK
* sop2
* vop1
* vop2
* remove allow(dead_code)
* vopc
2025-04-24 21:34:07 +08:00
qazal
16dfe0a902
upstream remu ( #9921 )
2025-04-18 01:57:36 +03:00