nimlgen
|
707c805a68
|
nv set localmem sm count to max (#6890)
|
2024-10-04 23:29:46 +03:00 |
|
George Hotz
|
6b063450df
|
move hcq device to runtime [pr] (#6879)
* things that are only used in one place don't belong in helpers [pr]
* start moving hcq device [pr]
* fix paths
|
2024-10-04 22:26:50 +08:00 |
|
nimlgen
|
e213bea426
|
nv shorter (#6819)
|
2024-09-30 19:39:32 +03:00 |
|
nimlgen
|
d3ed50c769
|
fix typo in 'Too many resources requested for launch' (#6705)
|
2024-09-24 15:33:01 +08:00 |
|
nimlgen
|
eac046ea55
|
hcq check queue size before submit (#6481)
|
2024-09-11 23:13:13 +03:00 |
|
nimlgen
|
f63a9fd649
|
hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability
* linter
|
2024-09-09 20:04:45 +03:00 |
|
nimlgen
|
40e49b6b1a
|
hcq share signal wait (#6394)
* hcq share signal wait
* linter
|
2024-09-06 18:35:06 +03:00 |
|
nimlgen
|
b1e5343133
|
nv better error msg for p2p failure (#6301)
* nv better error msg for p2p failure
* linter
* from
* mypy
|
2024-08-28 01:40:45 +03:00 |
|
nimlgen
|
ac303146ca
|
nv sure qmd addr less than 40bits (#6288)
|
2024-08-27 20:47:38 +03:00 |
|
nimlgen
|
89c4cffd86
|
nv fix size in SET_SEMAPHORE_A (#6213)
|
2024-08-21 01:47:10 +03:00 |
|
nimlgen
|
bc44e6501b
|
_gpu_alloc -> allocator.alloc (#6189)
* _gpu_alloc -> allocator.alloc
* not needed this import
* pylint
|
2024-08-19 23:34:22 +03:00 |
|
nimlgen
|
b765996d54
|
hcq remove offset from progs (#6090)
|
2024-08-15 17:02:54 +03:00 |
|
nimlgen
|
fa84e6ec48
|
init hcq args state (#6046)
* init hcq args state
* cleaner
* amd
* fillargs
* fixes
* mypy
* docs
* fix
* not needed
* spacing
|
2024-08-13 17:11:58 +03:00 |
|
nimlgen
|
ce066fd754
|
nv do not recalc mv_address (#5998)
|
2024-08-09 17:16:34 +03:00 |
|
nimlgen
|
76eca0d27e
|
nv fix host mem mappings (#5979)
|
2024-08-08 17:03:44 +03:00 |
|
nimlgen
|
564a352194
|
nv unify _gpu_free (#5961)
* nv unify _gpu_free
* revert this
|
2024-08-07 18:18:17 +03:00 |
|
nimlgen
|
895e062723
|
nv remove useless init (#5932)
|
2024-08-06 14:41:40 +03:00 |
|
George Hotz
|
7348c40d9d
|
sampling time sync (8700 lines) (#5843)
* sampling time sync
* jitter matrix
* comment
* pass mypy
* line count
|
2024-08-02 14:44:35 -07:00 |
|
nimlgen
|
34168a64e3
|
optimize nv profiler (#5856)
* nv profiler fix
* cleanup hcq a bit
* fixes
* fix
* typo
* all signals put timestamp
* a bit cleaner
* merge fields
* type
* import
* tiny fix
|
2024-08-01 23:57:45 +03:00 |
|
George Hotz
|
4dd24dc439
|
use decimal for timestamps for more precision [run_process_replay] (#5823)
* use decimal for timestamps for more precision
* err, didn't get saved
* fix types + 38 -> 40
|
2024-07-30 15:06:14 -07:00 |
|
nimlgen
|
ca674c31f9
|
nv remove some type ignores (#5811)
|
2024-07-30 17:47:29 +03:00 |
|
nimlgen
|
a25e1a1c90
|
nv open correct device (#5796)
|
2024-07-29 23:40:52 +03:00 |
|
chenyu
|
471b188d79
|
fix mypy errors in latest mypy (#5794)
* fix mypy errors in latest mypy
mypy has stricter partial and api arg checks now
* PYTHONPATH="."
|
2024-07-29 14:53:30 -04:00 |
|
nimlgen
|
71e1472290
|
hcq more types (#5791)
* hcq more types
* linter
* pylint
* docs: bind
|
2024-07-29 18:03:23 +03:00 |
|
nimlgen
|
ea27ec4cd0
|
nv switch classlist_v2 to classlist (#5763)
* nv switch classlist_v2 to classlist
* support in mockgpu
* fix mockgpu
|
2024-07-28 20:24:42 +03:00 |
|
nimlgen
|
1903542c2d
|
nv/cuda compilers touchup (#5759)
* nv/cuda compilers touchup
* fix cuda check + move nv disasm
* remove includes
* fix nvrtc_check
|
2024-07-28 00:15:28 +03:00 |
|
chenyu
|
9838c1a6ff
|
update import style in runtime (#5735)
|
2024-07-26 14:00:23 -04:00 |
|
George Hotz
|
5c688560bc
|
move CUDA/HIP compilers to their own files [run_process_replay] (#5732)
|
2024-07-26 10:00:15 -07:00 |
|
nimlgen
|
6ec9ea9ddd
|
hcq update_exec with optional params (#5708)
|
2024-07-26 00:04:57 +03:00 |
|
nimlgen
|
b026312a31
|
nv ptx print log (#5691)
|
2024-07-24 21:40:58 +03:00 |
|
nimlgen
|
2ea54176e2
|
docs: add more info on HCQProgram (#5683)
* docs: add more info on HCQProgram
* linter
* linter2
* one more type
|
2024-07-24 17:20:18 +03:00 |
|
nimlgen
|
baface413a
|
nv better nvdisasm fail message (#5682)
* nv better nvdisasm message
* cuda
|
2024-07-24 16:19:26 +03:00 |
|
nimlgen
|
a93982ef42
|
hcq move out program call to base class (#5638)
* hcq move out program call to base class
* fix
|
2024-07-23 14:25:38 +03:00 |
|
nimlgen
|
ee633c1988
|
hcq move out synchronize to base class (#5634)
|
2024-07-22 20:36:04 +03:00 |
|
Vyacheslav Pachkov
|
edc58e6b6e
|
hcq: remove duplicate allocation of kernel args by abstracting (#5633)
|
2024-07-22 18:29:41 +03:00 |
|
nimlgen
|
08a9c0ae5e
|
hcq cache invalidation for beam (#5630)
* nv full cache invalidation
* the same command on amd
* linter
* fix amd
* nv no hardcoded consts
* beam default
|
2024-07-22 18:13:17 +03:00 |
|
Vyacheslav Pachkov
|
583829ab44
|
helpers: remove duplicate data64 helpers in amd/nv (#5627)
|
2024-07-21 16:50:59 -07:00 |
|
nimlgen
|
0de5812032
|
hcq move map to allocator (#5610)
* hcq move map to allocator
* fix
|
2024-07-20 19:02:45 +03:00 |
|
nimlgen
|
b1782e3fef
|
hcq refactor signal into class (#5575)
* hcq refactor signal into class
* fix amd
* amd do not use amd_signal_t
* cleanup
* signal setter
* fix linter
* docs
* more docs + types
* fix types
|
2024-07-19 23:23:05 +03:00 |
|
nimlgen
|
9d7edc9269
|
hcq rename HCQCompat -> HCQ (#5577)
|
2024-07-19 11:34:17 +03:00 |
|
nimlgen
|
4e9d2b1615
|
nv memory_barrier command (#5548)
|
2024-07-18 16:23:11 +03:00 |
|
nimlgen
|
dcd462860f
|
elf loader (#5508)
* elf loader
* cleanup
* cleaner
* cleaner
* fixes
* revert this
* fix div 0
* fix nv
* amd fix
* fix mockgpu
* amd better?
* restore relocs for <12.4
* linter
* this is fixed now
* revert this
* process cdefines as function
* cleaner
* align
* save lines
* revert this change
|
2024-07-17 17:09:34 +03:00 |
|
nimlgen
|
661da32aff
|
nv do not map regions twice (#5521)
|
2024-07-17 11:20:02 +03:00 |
|
nimlgen
|
8dfd11c1d8
|
docs: hcq add types (#5495)
* docs: hcq add types
* linter
|
2024-07-15 22:14:48 +03:00 |
|
nimlgen
|
c9ec7ce070
|
start hcq docs (#5411)
* start hcq docs
* more hcq docs
* docs
* docs
* linter
* correct args
* linter
* ts returns int
|
2024-07-15 21:31:11 +03:00 |
|
chenyu
|
eef43c9f49
|
include dims in kernel/nv invalid err msg (#5487)
|
2024-07-14 22:51:30 -04:00 |
|
nimlgen
|
61822d1a14
|
nv fix timeline signal rollover on copy queue (#5473)
* hotfix: nv rollover to 32bits
* test both queues
|
2024-07-14 16:06:12 +03:00 |
|
nimlgen
|
8835d6c49a
|
cleanup nv/amd program (#5449)
* cleanup nv/amd program
* fix amd
* a bit cleaner
* ugh, typo
* linter
* fix nv
* tiny thing
|
2024-07-14 14:08:35 +03:00 |
|
nimlgen
|
6943ea5f29
|
nv remove copy_from_cpu command (#5459)
|
2024-07-13 23:08:49 +03:00 |
|
nimlgen
|
6604d2b2c3
|
amd/nv respect visible devs (#5409)
* nv/amd respect visible devices
* linter
* sort amd gpus
* env docs
|
2024-07-12 20:02:12 +03:00 |
|