Commit Graph

88 Commits

Author SHA1 Message Date
nimlgen
f63a9fd649 hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability

* linter
2024-09-09 20:04:45 +03:00
wozeparrot
ea5b7910b7 AMD support gfx103x (#5926) 2024-08-28 14:17:08 -07:00
nimlgen
bc44e6501b _gpu_alloc -> allocator.alloc (#6189)
* _gpu_alloc -> allocator.alloc

* not needed this import

* pylint
2024-08-19 23:34:22 +03:00
nimlgen
5f1554b574 amd fix uaf in program (#6114)
* amd fix uaf in program

* keep it align

* sync before free
2024-08-17 00:22:46 +03:00
nimlgen
7ab531aede autogen cleanup (#6064)
* start autogen cleanup

* nvgpu

* better?

* better

* amd part

* gpu regen

* fix mockgpu amd

* nv

* amd fix linter

* remove import

* ugh

* nv on master

* amd on master
2024-08-14 20:20:35 +03:00
nimlgen
fa84e6ec48 init hcq args state (#6046)
* init hcq args state

* cleaner

* amd

* fillargs

* fixes

* myoy

* docs

* fix

* not needed

* spacing
2024-08-13 17:11:58 +03:00
nimlgen
e89eff11a6 amd raise when not supported arch (#5978) 2024-08-08 14:46:14 +03:00
nimlgen
8d8704af2d fix amd exec_update for locals (#5966) 2024-08-07 21:02:56 +03:00
nimlgen
341c394c89 amd save exec offsets (#5928)
* amd save exec offsets

* fix

* better

* ugh
2024-08-06 12:11:46 +03:00
George Hotz
4dd24dc439 use decimal for timestamps for more precision [run_process_replay] (#5823)
* use decimal for timestamps for more precision

* err, didn't get saved

* fix types + 38 -> 40
2024-07-30 15:06:14 -07:00
nimlgen
71e1472290 hcq more types (#5791)
* hcq more types

* linter

* pylint

* docs: bind
2024-07-29 18:03:23 +03:00
nimlgen
73fda023d3 amd better comments for ENABLE_SGPR_DISPATCH_PTR (#5768)
* amd better comments for ENABLE_SGPR_DISPATCH_PTR

* fix linter
2024-07-28 16:23:38 +03:00
nimlgen
5d53fa491b amd autogened kfd ioctls (#5757)
* amd autogened kio

* unused import

* linter
2024-07-27 22:49:48 +03:00
chenyu
9838c1a6ff update import style in runtime (#5735) 2024-07-26 14:00:23 -04:00
George Hotz
5c688560bc move CUDA/HIP compilers to their own files [run_process_replay] (#5732) 2024-07-26 10:00:15 -07:00
nimlgen
6ec9ea9ddd hcq update_exec with optional params (#5708) 2024-07-26 00:04:57 +03:00
nimlgen
a93982ef42 hcq move out program call to base class (#5638)
* hcq move out program call to base class

* fix
2024-07-23 14:25:38 +03:00
nimlgen
4dcca0a6d4 amd tiny cleanups (#5651) 2024-07-23 13:06:23 +03:00
chenyu
fe17ea5c88 typo in ops_amd invalidate_caches (#5643)
led to it silently not being called
2024-07-22 18:37:11 -04:00
nimlgen
ee633c1988 hcq move out synchronize to base class (#5634) 2024-07-22 20:36:04 +03:00
nimlgen
26fc4610a0 amd more accurate cache management (#5631)
* amd more accurate cache management

* fix amd

* add memory_barrier + copies tests

* transfer test as well

* linter
2024-07-22 19:07:01 +03:00
Vyacheslav Pachkov
edc58e6b6e hcq: remove duplicate allocation of kernel args by abstracting (#5633) 2024-07-22 18:29:41 +03:00
nimlgen
08a9c0ae5e hcq cache invalidation for beam (#5630)
* nv full cache invalidation

* the same command on amd

* linter

* fix amd

* nv no hardcoded consts

* beam default
2024-07-22 18:13:17 +03:00
Vyacheslav Pachkov
583829ab44 helpers: remove duplicate data64 helpers in amd/nv (#5627) 2024-07-21 16:50:59 -07:00
nimlgen
0de5812032 hcq move map to allocator (#5610)
* hcq move map to allocator

* fix
2024-07-20 19:02:45 +03:00
nimlgen
646bdc1c0e elf loader touchups (#5607)
* loadonly SHF_ALLOC sections

* revert this, just amd fix
2024-07-20 12:30:18 +03:00
nimlgen
b1782e3fef hcq refactor signal into class (#5575)
* hcq refactor signal into class

* fix amd

* amd do not use amd_signal_t

* cleanup

* signal setter

* fix linter

* docs

* more docs + types

* fix types
2024-07-19 23:23:05 +03:00
nimlgen
9d7edc9269 hcq rename HCQCompat -> HCQ (#5577) 2024-07-19 11:34:17 +03:00
nimlgen
c30092e56d amd remove useless barrier (#5550) 2024-07-18 18:05:33 +03:00
nimlgen
dcd462860f elf loader (#5508)
* elf loader

* cleanup

* cleaner

* cleaner

* fixes

* revert this

* fix div 0

* fix nv

* amd fix

* fix mockgpu

* amd better?

* restore relocs for <12.4

* linter

* this is fixed now

* revert this

* process cdefines as function

* cleaner

* align

* save lines

* revert this change
2024-07-17 17:09:34 +03:00
nimlgen
8dfd11c1d8 docs: hcq add types (#5495)
* docs: hcq add types

* linter
2024-07-15 22:14:48 +03:00
nimlgen
c9ec7ce070 start hcq docs (#5411)
* start hcq docs

* more hcq docs

* docs

* docs

* linter

* correct args

* linter

* ts returns int
2024-07-15 21:31:11 +03:00
nimlgen
8835d6c49a cleanup nv/amd program (#5449)
* cleanup nv/amd program

* fix amd

* a bit cleaner

* ugh, typo

* linter

* fix nv

* tiny thing
2024-07-14 14:08:35 +03:00
nimlgen
67f70cef02 amd better allocation error messages (#5462)
* amd better allocation error messages

* a bit better
2024-07-13 22:55:09 +03:00
nimlgen
f4944ced09 tiny amd cleanups (#5420) 2024-07-12 22:54:42 +03:00
uuuvn
3cb94a0a15 Rename tinygrad/runtime/driver to support (#5413) 2024-07-12 11:06:42 -07:00
nimlgen
6604d2b2c3 amd/nv respect visible devs (#5409)
* nv/amd respect visible devices

* linter

* sort amd gpus

* env docs
2024-07-12 20:02:12 +03:00
nimlgen
bd77efda2f add HWCommandQueue base class for hcq devices (#5303)
* add HWCommandQueue as base queue for hcq devices

* try this

* fixes

* comments

* linter

* linter2

* linter

* linter

* fixed

* revert this
2024-07-11 16:19:13 +03:00
nimlgen
1678199b15 add update_copy to hcq spec (#5348)
* add update_copy to hcq spec

* fix amd
2024-07-09 20:44:44 +03:00
nimlgen
e815c57039 use hcq_profile in nv/amd program (#5344) 2024-07-09 15:56:06 +03:00
nimlgen
2778b6046c new memory scheduler (#5278)
* new memory schedule algo

* works

* fix

* fix

* linter

* tiny fixes

* do not optimize copy buffers

* more comments

* tiny cleanups
2024-07-04 18:06:04 +03:00
nimlgen
84b3e3bb6f hcq exec no embedded signal (#5142) 2024-07-04 13:29:21 +03:00
Vyacheslav Pachkov
d3e4e21759 add return type for HCQCompatAllocator _alloc (#5267)
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2024-07-03 10:25:44 +03:00
nimlgen
7be776f9af add _alloc_signal/_free_signal to hcq (#5264)
* add _alloc_signal/_free_signal api

* oops, revert this

* linter
2024-07-02 23:35:39 +03:00
nimlgen
57e89645cd hcq spec test (#5226)
* start hcq spec test

* more test

* fixes

* run on amd as well

* test amdgpu exec

* fix amd

* amd mockgpu support sdma timestamp
2024-07-01 17:36:37 +03:00
nimlgen
7b7b751513 simple hip backend for debugging (#5201)
* hip backend

* fix mypy

* shorter

* fixes

* tiny changes
2024-06-30 23:00:11 +03:00
nimlgen
dd7eef7d71 libc defs to autogen (#5217)
* libc defs to autogen

* amd import libc

* linter

* better a bit

* remove comment, check this

* not hardcoded path
2024-06-29 14:37:33 +03:00
nimlgen
c941a58581 amd refactor queue creation (#5216)
* amd refactor queue creation

* fixes

* use data64_le

* fix linter
2024-06-28 23:24:49 +03:00
Roelof van Dijk
26e254c42b ruff: else-raise and else-return (#5175)
* ruff: enable else-raise and else-return

* ruff: add error names

* fix order

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-06-27 07:54:59 -04:00
nimlgen
69f116a7e1 nv/amd profiler (#4718)
* nv/amd profiler

* fix

* fix

* profile copies

* profile logger

* fixes

* more fixes

* less lines and fixes

* fixes

* some linter

* back sync, no related change

* fix gpu2cpu time def

* simpler

* linter

* linter

* docs

* add add_event api
2024-06-23 17:10:12 +03:00