George Hotz
32e9949052
rename lazydata to uop ( #10698 )
2025-06-08 08:42:22 -07:00
nimlgen
0e1beaf44f
nv: align copies + better test ( #10118 )
2025-04-30 20:09:53 +03:00
nimlgen
2ec3b722e2
nv: fix copies larger than 4g ( #10117 )
2025-04-30 18:43:17 +03:00
nimlgen
5c7d004da5
hcq: refactor int ptrs to hcqbuffers ( #10105 )
* hcq: refactor int ptrs to hcqbuffers
* more refactors
* linter
* use in allocator
* test fiz
* fx
* ops
* final?
* simpler
* keep this for now
2025-04-30 00:12:18 +03:00
George Hotz
68c5f7ba80
load fast in sdxl ( #10072 )
* load fast in sdxl
* back to that with the ret
* no context
2025-04-27 11:58:51 -04:00
nimlgen
fa888ee077
minor test cleanups ( #9770 )
* fix test_graph on max
* pcie5
2025-04-07 15:29:12 +03:00
nimlgen
8cae00833c
flaky test in ci ( #9321 )
2025-03-02 16:27:22 +03:00
nimlgen
70db8c3003
hcq: dyn alloc signals ( #9238 )
* hcq: dyn alloc signals
* types and uniqueue devs
* typing
* mypy
* mypy one more time
* test
* make fds to not intersect in mockgpu between drivers
2025-02-25 17:22:24 +03:00
Ignacio Sica
b240f12593
[TIP-9] rename Opt's amt to arg 2 ( #8770 )
* rename Opt amt to arg
* ignore_beam_cache for test_tiny
* move ignore_beam_cache to test_tiny
* move to separate pr
* revert space change
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-01-27 14:19:04 -05:00
George Hotz
3ed146a5ff
Revert "rename Opt amt to arg ( #8767 )" ( #8769 )
This reverts commit bf041659a5.
2025-01-27 23:46:37 +09:00
Ignacio Sica
bf041659a5
rename Opt amt to arg ( #8767 )
2025-01-27 23:36:47 +09:00
nimlgen
d224d0ed7f
nv: fix fault info ( #8587 )
* nv: fix fault info
* and emu for amd
* skip if not mock
2025-01-13 14:38:43 +03:00
nimlgen
92b59c9b7a
test_hcq limits for mockgpu not (only) ci ( #8555 )
* test_hcq limits for mockgpu not (only) ci
* rm CI
2025-01-10 17:37:28 +03:00
nimlgen
31fcfe764d
adjust hcq test for ci macos ( #8534 )
2025-01-08 16:18:31 +03:00
qazal
866dfa1f23
create_schedule([x.lazydata]) -> x.schedule() in tests ( #8449 )
2024-12-31 03:15:52 +08:00
George Hotz
29c14f1cbf
hotfix: update tests for no uop mut
2024-12-30 10:05:37 -05:00
nimlgen
0a139b1436
amd iface abstraction ( #8413 )
* start on amd iface
* t
* unused import
* fixes
* internal api
2024-12-27 15:53:53 +03:00
nimlgen
af87e4b53c
viz profiler ( #8287 )
* only hcq
* fix get_metadata
* linter
* oops
* tiny
* linter
* time
* print pm
* hmm
* nits
2024-12-17 20:00:53 +03:00
nimlgen
10f431b96d
hcq replace update with sint ( #7899 )
* try sym hcq
* start with amd
* move to nv
* nv works
* cache and qcom
* fixes
* signals
* fix nv
* qcom fixes
* linter
* linter
* cache + typings
* fixes
* tiny fixes
* linter
* linter
* lntr
* ugh
* comments
2024-11-29 20:08:13 +03:00
George Hotz
e9ae2ccd09
_prg to match _buf [pr] ( #7816 )
2024-11-21 12:44:48 +08:00
George Hotz
c5d458ce02
BufferSpec and ProgramSpec [pr] ( #7814 )
* BufferSpec and ProgramSpec [pr]
* delete preallocate, it's unused
* Revert "delete preallocate, it's unused"
This reverts commit dcfcfaccde.
2024-11-21 12:18:05 +08:00
George Hotz
eb0bb7dc0b
final dname to device [pr] ( #7806 )
* final dname to device [pr]
* oops, fix nv
2024-11-20 20:20:28 +08:00
George Hotz
27995a2a04
vcount + cleanups ( #7393 )
* Revert "Revert "Restore vcount [pr] (#7390)" (#7392)"
This reverts commit 4ca53db604.
* ugh bugfix [pr]
* uops_to_dtypes function
* fixups
* varnames
* fix mypy
* just 4,8
* tests
2024-10-30 12:50:15 +08:00
George Hotz
6b063450df
move hcq device to runtime [pr] ( #6879 )
* things that are only used in one place don't belong in helpers [pr]
* start moving hcq device [pr]
* fix paths
2024-10-04 22:26:50 +08:00
nimlgen
f0019ad29c
bump ci test timeout for test_speed_exec_time ( #6715 )
* bump ci test timeout for test_speed_exec_time
* more
2024-09-24 18:44:09 +08:00
George Hotz
a1a882b006
arange folding with new ge ( #6604 )
* arange folding with new ge
* bump allowed gated
* bump allowed speed
2024-09-19 18:01:28 +08:00
nimlgen
6c4ddd6260
hcq skip tests when no multidev ( #6235 )
* hcq skip tests when no multidev
* linter
* a bit higher tinout
2024-08-22 18:27:16 +03:00
chenyu
b36a7273c6
RUF018 assignment-in-assert [run_process_replay] ( #6172 )
assertion should not have side effect or `-O` breaks.
initially just wanted to fix the one in rearrange, but it also made some long lines less long
2024-08-19 00:34:52 -04:00
wozeparrot
0c5189de25
threefry half ( #6154 )
2024-08-18 15:23:12 -07:00
nimlgen
183c4c91a3
fix non-jitted transfers in profile ( #5980 )
* fix transfers in profile
* fix linter
* sync to be sure everythin is recorded
2024-08-08 17:58:08 +03:00
nimlgen
8d8704af2d
fix amd exec_update for locals ( #5966 )
2024-08-07 21:02:56 +03:00
nimlgen
590b9ebb34
hcq copy queue is optional ( #5909 )
* hcq copy queue is optional
* one more
* this
2024-08-05 14:03:25 +03:00
nimlgen
2777784b91
add dependency viewer to hcq profiler ( #5874 )
* hcq profiler support deps
* clean up
* cleaner
* cleanup
* revert this
* linter
* mypy
* add test
* sync is strange, need to take the end
* linter + test
2024-08-02 22:07:01 +03:00
George Hotz
53fcac9e80
hotfix: increase time on flaky NV test
2024-08-01 10:20:07 -07:00
nimlgen
ed1d784077
test profiler timer sync across devs ( #5751 )
* test profiler timer sync across devs
* more correct
* typo
2024-07-27 16:47:37 +03:00
nimlgen
1384f08cd4
hcq profile tests ( #5654 )
* profile tests
* fixes
* remove linter
2024-07-23 18:40:33 +03:00
nimlgen
26fc4610a0
amd more accurate cache managment ( #5631 )
* amd more accurate cache managment
* fix amd
* add memory_barrier + copies tests
* tranfer test as well
* linter
2024-07-22 19:07:01 +03:00
Vyacheslav Pachkov
edc58e6b6e
hcq: remove duplicate allocation of kernel args by abstracting ( #5633 )
2024-07-22 18:29:41 +03:00
nimlgen
b1782e3fef
hcq refactor signal into class ( #5575 )
* hcq refactor signal into class
* fix amd
* amd do not use amd_signal_t
* cleanup
* signal setter
* fix linter
* docs
* more docs + types
* fix types
2024-07-19 23:23:05 +03:00
nimlgen
9d7edc9269
hcq rename HCQCompat -> HCQ ( #5577 )
2024-07-19 11:34:17 +03:00
nimlgen
61822d1a14
nv fix timeline signal rollover on copy queue ( #5473 )
* hotfix: nv rollover to 32bits
* test both queues
2024-07-14 16:06:12 +03:00
nimlgen
8835d6c49a
cleanup nv/amd program ( #5449 )
* cleanup nv/amd program
* fix amd
* a bit cleaner
* ugh, typo
* linter
* fix nv
* tiny thing
2024-07-14 14:08:35 +03:00
nimlgen
1678199b15
add update_copy to hcq spec ( #5348 )
* add update_copy to hcq spec
* fix amd
2024-07-09 20:44:44 +03:00
nimlgen
7be776f9af
add _alloc_signal/_free_signal to hcq ( #5264 )
* add _alloc_signal/_free_signal api
* oops, revert this
* linter
2024-07-02 23:35:39 +03:00
nimlgen
57e89645cd
hcq spec test ( #5226 )
* start hcq spec test
* more test
* fixes
* run on amd as well
* test amdgpu exec
* fix amd
* amd mockgpu support sdma timestamp
2024-07-01 17:36:37 +03:00