nimlgen
|
f63a9fd649
|
hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability
* linter
|
2024-09-09 20:04:45 +03:00 |
|
wozeparrot
|
ea5b7910b7
|
AMD support gfx103x (#5926)
|
2024-08-28 14:17:08 -07:00 |
|
nimlgen
|
bc44e6501b
|
_gpu_alloc -> allocator.alloc (#6189)
* _gpu_alloc -> allocator.alloc
* not needed this import
* pylint
|
2024-08-19 23:34:22 +03:00 |
|
nimlgen
|
5f1554b574
|
amd fix uaf in program (#6114)
* amd fix uaf in program
* keep it align
* sync before free
|
2024-08-17 00:22:46 +03:00 |
|
nimlgen
|
7ab531aede
|
autogen cleanup (#6064)
* start autogen cleanup
* nvgpu
* better?
* better
* amd part
* gpu regen
* fix mockgpu amd
* nv
* amd fix linter
* remove import
* ugh
* nv on master
* amd on master
|
2024-08-14 20:20:35 +03:00 |
|
nimlgen
|
fa84e6ec48
|
init hcq args state (#6046)
* init hcq args state
* cleaner
* amd
* fillargs
* fixes
* myoy
* docs
* fix
* not needed
* spacing
|
2024-08-13 17:11:58 +03:00 |
|
nimlgen
|
e89eff11a6
|
amd raise when not supported arch (#5978)
|
2024-08-08 14:46:14 +03:00 |
|
nimlgen
|
8d8704af2d
|
fix amd exec_update for locals (#5966)
|
2024-08-07 21:02:56 +03:00 |
|
nimlgen
|
341c394c89
|
amd save exec offsets (#5928)
* amd save exec offsets
* fix
* better
* ugh
|
2024-08-06 12:11:46 +03:00 |
|
George Hotz
|
4dd24dc439
|
use decimal for timestamps for more precision [run_process_replay] (#5823)
* use decimal for timestamps for more precision
* err, didn't get saved
* fix types + 38 -> 40
|
2024-07-30 15:06:14 -07:00 |
|
nimlgen
|
71e1472290
|
hcq more types (#5791)
* hcq more types
* linter
* pylint
* docs: bind
|
2024-07-29 18:03:23 +03:00 |
|
nimlgen
|
73fda023d3
|
amd better comments for ENABLE_SGPR_DISPATCH_PTR (#5768)
* amd better comments for ENABLE_SGPR_DISPATCH_PTR
* fix linter
|
2024-07-28 16:23:38 +03:00 |
|
nimlgen
|
5d53fa491b
|
amd autogened kfd ioctls (#5757)
* amd autogened kio
* unused import
* linter
|
2024-07-27 22:49:48 +03:00 |
|
chenyu
|
9838c1a6ff
|
update import style in runtime (#5735)
|
2024-07-26 14:00:23 -04:00 |
|
George Hotz
|
5c688560bc
|
move CUDA/HIP compilers to their own files [run_process_replay] (#5732)
|
2024-07-26 10:00:15 -07:00 |
|
nimlgen
|
6ec9ea9ddd
|
hcq update_exec with optional params (#5708)
|
2024-07-26 00:04:57 +03:00 |
|
nimlgen
|
a93982ef42
|
hcq move out program call to base class (#5638)
* hcq move out program call to base class
* fix
|
2024-07-23 14:25:38 +03:00 |
|
nimlgen
|
4dcca0a6d4
|
amd tiny cleanups (#5651)
|
2024-07-23 13:06:23 +03:00 |
|
chenyu
|
fe17ea5c88
|
typo in ops_amd invalidate_caches (#5643)
led to it silently not being called
|
2024-07-22 18:37:11 -04:00 |
|
nimlgen
|
ee633c1988
|
hcq move out synchronize to base class (#5634)
|
2024-07-22 20:36:04 +03:00 |
|
nimlgen
|
26fc4610a0
|
amd more accurate cache management (#5631)
* amd more accurate cache management
* fix amd
* add memory_barrier + copies tests
* transfer test as well
* linter
|
2024-07-22 19:07:01 +03:00 |
|
Vyacheslav Pachkov
|
edc58e6b6e
|
hcq: remove duplicate allocation of kernel args by abstracting (#5633)
|
2024-07-22 18:29:41 +03:00 |
|
nimlgen
|
08a9c0ae5e
|
hcq cache invalidation for beam (#5630)
* nv full cache invalidation
* the same command on amd
* linter
* fix amd
* nv no hardcoded consts
* beam default
|
2024-07-22 18:13:17 +03:00 |
|
Vyacheslav Pachkov
|
583829ab44
|
helpers: remove duplicate data64 helpers in amd/nv (#5627)
|
2024-07-21 16:50:59 -07:00 |
|
nimlgen
|
0de5812032
|
hcq move map to allocator (#5610)
* hcq move map to allocator
* fix
|
2024-07-20 19:02:45 +03:00 |
|
nimlgen
|
646bdc1c0e
|
elf loader touchups (#5607)
* loadonly SHF_ALLOC sections
* revert this, just amd fix
|
2024-07-20 12:30:18 +03:00 |
|
nimlgen
|
b1782e3fef
|
hcq refactor signal into class (#5575)
* hcq refactor signal into class
* fix amd
* amd do not use amd_signal_t
* cleanup
* signal setter
* fix linter
* docs
* more docs + types
* fix types
|
2024-07-19 23:23:05 +03:00 |
|
nimlgen
|
9d7edc9269
|
hcq rename HCQCompat -> HCQ (#5577)
|
2024-07-19 11:34:17 +03:00 |
|
nimlgen
|
c30092e56d
|
amd remove useless barrier (#5550)
|
2024-07-18 18:05:33 +03:00 |
|
nimlgen
|
dcd462860f
|
elf loader (#5508)
* elf loader
* cleanup
* cleaner
* cleaner
* fixes
* revert this
* fix div 0
* fix nv
* amd fix
* fix mockgpu
* amd better?
* restore relocs for <12.4
* linter
* this is fixed now
* revert this
* process cdefines as function
* cleaner
* align
* save lines
* revert this change
|
2024-07-17 17:09:34 +03:00 |
|
nimlgen
|
8dfd11c1d8
|
docs: hcq add types (#5495)
* docs: hcq add types
* linter
|
2024-07-15 22:14:48 +03:00 |
|
nimlgen
|
c9ec7ce070
|
start hcq docs (#5411)
* start hcq docs
* more hcq docs
* docs
* docs
* linter
* correct args
* linter
* ts returns int
|
2024-07-15 21:31:11 +03:00 |
|
nimlgen
|
8835d6c49a
|
cleanup nv/amd program (#5449)
* cleanup nv/amd program
* fix amd
* a bit cleaner
* ugh, typo
* linter
* fix nv
* tiny thing
|
2024-07-14 14:08:35 +03:00 |
|
nimlgen
|
67f70cef02
|
amd better allocation error messages (#5462)
* amd better allocation error messages
* a bit better
|
2024-07-13 22:55:09 +03:00 |
|
nimlgen
|
f4944ced09
|
tiny amd cleanups (#5420)
|
2024-07-12 22:54:42 +03:00 |
|
uuuvn
|
3cb94a0a15
|
Rename tinygrad/runtime/driver to support (#5413)
|
2024-07-12 11:06:42 -07:00 |
|
nimlgen
|
6604d2b2c3
|
amd/nv respect visible devs (#5409)
* nv/amd respect visible devices
* linter
* sort amd gpus
* env docs
|
2024-07-12 20:02:12 +03:00 |
|
nimlgen
|
bd77efda2f
|
add HWCommandQueue base class for hcq devices (#5303)
* add HWCommandQueue as base queue for hcq devices
* try this
* fixes
* comments
* linter
* linter2
* linter
* linter
* fixed
* revert this
|
2024-07-11 16:19:13 +03:00 |
|
nimlgen
|
1678199b15
|
add update_copy to hcq spec (#5348)
* add update_copy to hcq spec
* fix amd
|
2024-07-09 20:44:44 +03:00 |
|
nimlgen
|
e815c57039
|
use hcq_profile in nv/amd program (#5344)
|
2024-07-09 15:56:06 +03:00 |
|
nimlgen
|
2778b6046c
|
new memory scheduler (#5278)
* new memory schedule algo
* works
* fix
* fix
* linter
* tiny fixes
* do not optimize copy buffers
* more comments
* tiny cleanups
|
2024-07-04 18:06:04 +03:00 |
|
nimlgen
|
84b3e3bb6f
|
hcq exec no embedded signal (#5142)
|
2024-07-04 13:29:21 +03:00 |
|
Vyacheslav Pachkov
|
d3e4e21759
|
add return type for HCQCompatAllocator _alloc (#5267)
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
|
2024-07-03 10:25:44 +03:00 |
|
nimlgen
|
7be776f9af
|
add _alloc_signal/_free_signal to hcq (#5264)
* add _alloc_signal/_free_signal api
* oops, revert this
* linter
|
2024-07-02 23:35:39 +03:00 |
|
nimlgen
|
57e89645cd
|
hcq spec test (#5226)
* start hcq spec test
* more test
* fixes
* run on amd as well
* test amdgpu exec
* fix amd
* amd mockgpu support sdma timestamp
|
2024-07-01 17:36:37 +03:00 |
|
nimlgen
|
7b7b751513
|
simple hip backend for debugging (#5201)
* hip backend
* fix mypy
* shorter
* fixes
* tiny changes
|
2024-06-30 23:00:11 +03:00 |
|
nimlgen
|
dd7eef7d71
|
libc defs to autogen (#5217)
* libc defs to autogen
* amd import libc
* linter
* better a bit
* remove comment, check this
* not hardcoded path
|
2024-06-29 14:37:33 +03:00 |
|
nimlgen
|
c941a58581
|
amd refactor queue creation (#5216)
* amd refactor queue creation
* fixes
* use data64_le
* fix linter
|
2024-06-28 23:24:49 +03:00 |
|
Roelof van Dijk
|
26e254c42b
|
ruff: else-raise and else-return (#5175)
* ruff: enable else-raise and else-return
* ruff: add error names
* fix order
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
|
2024-06-27 07:54:59 -04:00 |
|
nimlgen
|
69f116a7e1
|
nv/amd profiler (#4718)
* nv/amd profiler
* fix
* fix
* profile copies
* profile logger
* fixes
* more fixes
* less lines and fixes
* fixes
* some linter
* back sync, no related change
* fix gpu2cpu time def
* simpler
* linter
* linter
* docs
* add add_event api
|
2024-06-23 17:10:12 +03:00 |
|