117 Commits

Author SHA1 Message Date
nimlgen
707c805a68 nv set localmem sm count to max (#6890) 2024-10-04 23:29:46 +03:00
George Hotz
6b063450df move hcq device to runtime [pr] (#6879)
* things that are only used in one place don't belong in helpers [pr]

* start moving hcq device [pr]

* fix paths
2024-10-04 22:26:50 +08:00
nimlgen
e213bea426 nv shorter (#6819) 2024-09-30 19:39:32 +03:00
nimlgen
d3ed50c769 fix typo in 'Too many resources requested for launch' (#6705) 2024-09-24 15:33:01 +08:00
nimlgen
eac046ea55 hcq check queue size before submit (#6481) 2024-09-11 23:13:13 +03:00
nimlgen
f63a9fd649 hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability

* linter
2024-09-09 20:04:45 +03:00
nimlgen
40e49b6b1a hcq share singal wait (#6394)
* hcq share singal wait

* linter
2024-09-06 18:35:06 +03:00
nimlgen
b1e5343133 nv better error msg for p2p failure (#6301)
* nv better error msg for p2p failure

* linetr

* from

* mypy
2024-08-28 01:40:45 +03:00
nimlgen
ac303146ca nv sure qmd addr less than 40bits (#6288) 2024-08-27 20:47:38 +03:00
nimlgen
89c4cffd86 nv fix size in SET_SEMAPHORE_A (#6213) 2024-08-21 01:47:10 +03:00
nimlgen
bc44e6501b _gpu_alloc -> allocator.alloc (#6189)
* _gpu_alloc -> allocator.alloc

* not needed this import

* pylint
2024-08-19 23:34:22 +03:00
nimlgen
b765996d54 hcq remove offset from progs (#6090) 2024-08-15 17:02:54 +03:00
nimlgen
fa84e6ec48 init hcq args state (#6046)
* init hcq args state

* cleaner

* amd

* fillargs

* fixes

* myoy

* docs

* fix

* not needed

* spacing
2024-08-13 17:11:58 +03:00
nimlgen
ce066fd754 nv do not recalc mv_address (#5998) 2024-08-09 17:16:34 +03:00
nimlgen
76eca0d27e nv fix host mem mappings (#5979) 2024-08-08 17:03:44 +03:00
nimlgen
564a352194 nv unify _gpu_free (#5961)
* nv unify _gpu_free

* revert this
2024-08-07 18:18:17 +03:00
nimlgen
895e062723 nv remove useless init (#5932) 2024-08-06 14:41:40 +03:00
George Hotz
7348c40d9d sampling time sync (8700 lines) (#5843)
* sampling time sync

* jitter matrix

* comment

* pass mypy

* line count
2024-08-02 14:44:35 -07:00
nimlgen
34168a64e3 optimize nv profiler (#5856)
* nv profiler fix

* cleanup hcq a bit

* fixes

* fix

* typo

* all signals put timestamp

* a bit cleaner

* merge fields

* type

* import

* tiny fix
2024-08-01 23:57:45 +03:00
George Hotz
4dd24dc439 use decimal for timestamps for more precision [run_process_replay] (#5823)
* use decimal for timestamps for more precision

* err, didn't get saved

* fix types + 38 -> 40
2024-07-30 15:06:14 -07:00
nimlgen
ca674c31f9 nv remove some type ignores (#5811) 2024-07-30 17:47:29 +03:00
nimlgen
a25e1a1c90 nv open correct device (#5796) 2024-07-29 23:40:52 +03:00
chenyu
471b188d79 fix mypy errors in latest mypy (#5794)
* fix mypy errors in latest mypy

mypy has stricter partial and api arg checks now

* PYTHONPATH="."
2024-07-29 14:53:30 -04:00
nimlgen
71e1472290 hcq more types (#5791)
* mhcq more types

* linter

* pylint

* docs: bind
2024-07-29 18:03:23 +03:00
nimlgen
ea27ec4cd0 nv switch classlist_v2 to classlist (#5763)
* nv switch classlist_v2 to classlist

* support in mockgpu

* fix mockgpu
2024-07-28 20:24:42 +03:00
nimlgen
1903542c2d nv/cuda compilers touchup (#5759)
* nv/cuda compilers touchup

* fix cuda check + move nv disasm

* remove includes

* fix nvrtc_check
2024-07-28 00:15:28 +03:00
chenyu
9838c1a6ff update import style in runtime (#5735) 2024-07-26 14:00:23 -04:00
George Hotz
5c688560bc move CUDA/HIP compilers to their own files [run_process_replay] (#5732) 2024-07-26 10:00:15 -07:00
nimlgen
6ec9ea9ddd hcq update_exec with optional params (#5708) 2024-07-26 00:04:57 +03:00
nimlgen
b026312a31 nv ptx print log (#5691) 2024-07-24 21:40:58 +03:00
nimlgen
2ea54176e2 docs: add more info on HCQProgram (#5683)
* docs: add more info on HCQProgram

* linter

* linter2

* one more type
2024-07-24 17:20:18 +03:00
nimlgen
baface413a nv better nvdisasm fail message (#5682)
* nv better nvdisasm message

* cuda
2024-07-24 16:19:26 +03:00
nimlgen
a93982ef42 hcq move out program call to base class (#5638)
* hcq move out program call to base class

* fix
2024-07-23 14:25:38 +03:00
nimlgen
ee633c1988 hcq move out synchronize to base class (#5634) 2024-07-22 20:36:04 +03:00
Vyacheslav Pachkov
edc58e6b6e hcq: remove duplicate allocation of kernel args by abstracting (#5633) 2024-07-22 18:29:41 +03:00
nimlgen
08a9c0ae5e hcq cache invalidation for beam (#5630)
* nv full cache invalidation

* the same command on amd

* linter

* fix amd

* nv no hardcoded consts

* beam default
2024-07-22 18:13:17 +03:00
Vyacheslav Pachkov
583829ab44 helpers: remove duplicate data64 helpers in amd/nv (#5627) 2024-07-21 16:50:59 -07:00
nimlgen
0de5812032 hcq move map to allocator (#5610)
* hcq move map to allocator

* fix
2024-07-20 19:02:45 +03:00
nimlgen
b1782e3fef hcq refactor signal into class (#5575)
* hcq refactor signal into class

* fix amd

* amd do not use amd_signal_t

* cleanup

* signal setter

* fix linter

* docs

* more docs + types

* fix types
2024-07-19 23:23:05 +03:00
nimlgen
9d7edc9269 hcq rename HCQCompat -> HCQ (#5577) 2024-07-19 11:34:17 +03:00
nimlgen
4e9d2b1615 nv memory_barrier command (#5548) 2024-07-18 16:23:11 +03:00
nimlgen
dcd462860f elf loader (#5508)
* elf loader

* cleanup

* cleaner

* cleaner

* fixes

* revert this

* fix div 0

* fix nv

* amd fix

* fix mockgpu

* amd better?

* restore relocs for <12.4

* linter

* this is fixed now

* revert this

* process cdefines as function

* cleaner

* align

* save lines

* revert this change
2024-07-17 17:09:34 +03:00
nimlgen
661da32aff nv do not map regions twice (#5521) 2024-07-17 11:20:02 +03:00
nimlgen
8dfd11c1d8 docs: hcq add types (#5495)
* docs: hcq add types

* linter
2024-07-15 22:14:48 +03:00
nimlgen
c9ec7ce070 start hcq docs (#5411)
* start hcq docs

* more hcq docs

* docs

* docs

* linter

* correct args

* linter

* ts returns int
2024-07-15 21:31:11 +03:00
chenyu
eef43c9f49 include dims in kernel/nv invalid err msg (#5487) 2024-07-14 22:51:30 -04:00
nimlgen
61822d1a14 nv fix timeline signal rollover on copy queue (#5473)
* hotfix: nv rollover to 32bits

* test both queues
2024-07-14 16:06:12 +03:00
nimlgen
8835d6c49a cleanup nv/amd program (#5449)
* cleanup nv/amd program

* fix amd

* a bit cleaner

* ugh, typo

* linter

* fix nv

* tiny thing
2024-07-14 14:08:35 +03:00
nimlgen
6943ea5f29 nv remove copy_from_cpu command (#5459) 2024-07-13 23:08:49 +03:00
nimlgen
6604d2b2c3 amd/nv respect visible devs (#5409)
* nv/amd respect visible devices

* linter

* sort amd gpus

* env docs
2024-07-12 20:02:12 +03:00