Commit Graph

38 Commits

Author SHA1 Message Date
qazal
a2da61d096 use new style amd compiler in viz (#13848)
* working version, handcode gfx1100 arch

* get target from device properties

* lib in cfg test program spec
2025-12-27 23:59:30 +09:00
qazal
f6de9095a0 switch asm tests to dsl (#13840)
* switch asm tests to dsl

* labeled basic blocks also work

* indenting for basic blocks

* allow define from star import
2025-12-27 02:15:16 +09:00
qazal
a1c1684b91 set .amdhsa_kernarg_size in asm test (#13826) 2025-12-25 13:08:14 +09:00
qazal
389f01c7f4 viz: amdgpu assembly basic block graph (#13755) 2025-12-22 23:17:16 +08:00
George Hotz
744af193f0 remove ScheduleItem and merge it with ExecItem (#13759)
* remove ExecItem and merge it with ScheduleItem

* less diff

* fix issues

* min diff

* don't change bufs in _lower

* min diff

* update

* revert

* fixes

* diff
2025-12-19 17:04:24 -04:00
wozeparrot
99e667bdcd tk fa bwd (#13480) 2025-12-17 23:56:37 -08:00
wozeparrot
5d509499b2 tk: kernel finish groups stores (#13704) 2025-12-15 09:16:17 -08:00
wozeparrot
7ef7ce2856 tk reg local store (#13689) 2025-12-14 23:07:30 -08:00
wozeparrot
93f1baca77 feat: tk fa in tensor (#13580) 2025-12-05 14:36:29 -08:00
George Hotz
6bd355fa26 add needs_second_gpu decorator (#13543)
* add needs_second_gpu decorator

* more skips

* two more fixes
2025-12-02 19:08:23 -08:00
wozeparrot
0d55aec605 fix after end (#13542) 2025-12-02 18:42:58 -08:00
nimlgen
0874ba8cc8 test_hevc: do not download the whole file (#13531)
* test_hevc: do not download the whole file

* fix
2025-12-02 21:31:28 +03:00
wozeparrot
1b7dbfb37f tk: named kernels + per kernel range id (#13522) 2025-12-01 22:51:04 -08:00
nimlgen
455dd88236 nv: minimal hevc (#13502)
* nv: minimal hevc

* validate

* not needed

* tralin

* var

* cpu

* fxi

* desc

* move

* cleanup
2025-11-30 16:46:55 +03:00
qazal
ae9c56134e skip test_tk failing locally on macbook (#13476) 2025-11-29 01:15:37 +08:00
wozeparrot
ffc31a23f4 tk mi350 (#13288) 2025-11-25 15:49:44 -08:00
wozeparrot
33773fda87 tk initial mi350 (#13289) 2025-11-17 11:46:32 -08:00
wozeparrot
7eb0d8e744 feat: mixins on tiles (#13246) 2025-11-13 16:52:52 -08:00
wozeparrot
759557f633 feat: move tk tests to testextra (#13242) 2025-11-12 17:06:53 -08:00
George Hotz
4156baee93 break swizzle into three chunks [pr] (#11153)
* break swizzle into three chunks [pr]

* test failed
2025-07-09 15:30:34 -07:00
wozeparrot
66e00c04dd fix: skip kernel timing tests on ci cuda (#10348) 2025-05-16 11:48:06 -07:00
wozeparrot
1ed04f993b move benchmark stat tracking to influxdb (#10185) 2025-05-15 16:14:56 -07:00
chenyu
c4988bc07b only run test_u32_to_f16 if it supports fp16 (#10277)
* only run test_u32_to_f16 if it supports fp16

* cleanup
2025-05-13 11:16:14 -04:00
chenyu
2e7c2780a9 CLANG -> CPU (#9189) 2025-02-20 18:03:09 -05:00
George Hotz
0b26cee2f1 fix some slow tests [pr] (#8979) 2025-02-09 15:57:04 +08:00
nimlgen
a647f3dd2c move mockgpu to tests [pr] (#8396)
* move mockgpu to tests

* linter

* i'm so sorry

* sorry, python

* path
2024-12-24 23:48:02 +03:00
Ahmed Harmouche
ba35c4138b Use matching JS TypedArray for buffer dtype (#8080) 2024-12-06 14:52:23 +01:00
Ahmed Harmouche
ce72fe1411 u32 to f16 in tinygrad (#8074)
* f16 decompression in tinygrad

* Typing and cleanup
2024-12-06 12:00:13 +01:00
Ahmed Harmouche
ff9a89f714 Proper dtypes for input/output of exported WebGPU model (#8053)
* Respect input/output dtypes in exported WebGPU model

* Add some comments about skipped dtypes
2024-12-05 10:38:05 +01:00
uuuvn
94a484542b Hook memoryview via class instead of a function (#7627) 2024-11-11 09:07:06 +08:00
Roelof van Dijk
975b811ad9 names shadowing builtins (#5179)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-06-27 08:15:01 -04:00
chenyu
b886d250fb improve test_dropout_on_shard (#4912)
tested some basic property, also minor formatting for a few Tensor.training setups
2024-06-11 11:36:02 -04:00
qazal
8b5bcf309a process replay in all of CI (#4884) 2024-06-10 14:49:29 -04:00
qazal
f64fa51a64 process replay for test/* (#4799)
* add input to unit tests [run_process_replay]

* add setup [run_process_replay]

* run tests [run_process_replay]

* add cuda and amd [run_process_replay]

* run everything but BEAM=2 [run_process_replay]

* skip export_model [run_process_replay]

* fix amd CI

* add concurrency back
2024-06-03 12:01:58 +03:00
George Hotz
17faae091b optimizer shouldn't be run without training (#4460)
* optimizer shouldn't be run without training

* set training in relevant tests

* fix multitensor

* that too
2024-05-06 15:34:12 -07:00
George Hotz
3527c5a9d2 add Tensor.replace (#3738)
* add Tensor.replace

* fix dtypes in that test

* should be replace

* and mixtral
2024-03-14 13:34:14 -07:00
George Hotz
838afbc351 assign tests (#3728) 2024-03-13 17:04:55 -07:00
George Hotz
ee83505fcc fix test extra issue (#3159) 2024-01-17 11:58:08 -08:00