131 Commits

Author SHA1 Message Date
nimlgen
68cd2c0669 nv correct local memory based on device (#7307)
* nv correct local memory based on device

* linter

* oops

* oops2
2024-10-25 22:23:42 +03:00
nimlgen
98f8d0ccf9 nv limit max local memory with envvar (#7265) 2024-10-24 16:01:50 +03:00
George Hotz
de7b9d7c42 improve pre-commit [pr] (#7256)
* improve pre-commit [pr]

* mypy passes on windows
2024-10-24 15:38:47 +08:00
nimlgen
ea11382087 nv fix shared_memory_size (#7239) 2024-10-23 21:59:47 +03:00
nimlgen
cef7078c14 nv limit mappings debug (#7215) 2024-10-22 16:41:43 +03:00
nimlgen
81349213c0 nv min regs count is 16 (#7166) 2024-10-20 20:03:55 +03:00
nimlgen
211d9753f8 nv more lc checks (#7139)
* nv more lc checks

* revert

* linter
2024-10-18 00:21:53 +03:00
George Hotz
ca0dca35f7 move ptx renderer [pr] (#7118) 2024-10-17 14:50:32 +08:00
nimlgen
83e7dbd89e nv fix reallocation local memory when oom (#7098) 2024-10-16 18:17:50 +03:00
nimlgen
9f00eacde5 nv tagged memory + resnet failed kernel (#7061)
* nv tagged memory

* linter

* metal fix?
2024-10-15 18:19:58 +03:00
nimlgen
586ff4c910 nv record uvm mappings (#7059)
* nv record uvm mappings

* linteeer

* smth

* ooops
2024-10-15 00:12:49 +03:00
nimlgen
8094340221 nv print info about faults (#7057)
* nv print info about faults

* unrelated changes

* nv_gpu.GT200_DEBUGGER in mockgpu

* regen with ocrrect version

* spacing
2024-10-14 21:49:38 +03:00
nimlgen
0d526e251e nv sync on gpu before local update (#6954) 2024-10-08 17:43:58 +03:00
nimlgen
42609300ff hcq no timeline signals in init (#6944) 2024-10-07 23:36:19 +03:00
nimlgen
707c805a68 nv set localmem sm count to max (#6890) 2024-10-04 23:29:46 +03:00
George Hotz
6b063450df move hcq device to runtime [pr] (#6879)
* things that are only used in one place don't belong in helpers [pr]

* start moving hcq device [pr]

* fix paths
2024-10-04 22:26:50 +08:00
nimlgen
e213bea426 nv shorter (#6819) 2024-09-30 19:39:32 +03:00
nimlgen
d3ed50c769 fix typo in 'Too many resources requested for launch' (#6705) 2024-09-24 15:33:01 +08:00
nimlgen
eac046ea55 hcq check queue size before submit (#6481) 2024-09-11 23:13:13 +03:00
nimlgen
f63a9fd649 hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability

* linter
2024-09-09 20:04:45 +03:00
nimlgen
40e49b6b1a hcq share singal wait (#6394)
* hcq share singal wait

* linter
2024-09-06 18:35:06 +03:00
nimlgen
b1e5343133 nv better error msg for p2p failure (#6301)
* nv better error msg for p2p failure

* linetr

* from

* mypy
2024-08-28 01:40:45 +03:00
nimlgen
ac303146ca nv sure qmd addr less than 40bits (#6288) 2024-08-27 20:47:38 +03:00
nimlgen
89c4cffd86 nv fix size in SET_SEMAPHORE_A (#6213) 2024-08-21 01:47:10 +03:00
nimlgen
bc44e6501b _gpu_alloc -> allocator.alloc (#6189)
* _gpu_alloc -> allocator.alloc

* not needed this import

* pylint
2024-08-19 23:34:22 +03:00
nimlgen
b765996d54 hcq remove offset from progs (#6090) 2024-08-15 17:02:54 +03:00
nimlgen
fa84e6ec48 init hcq args state (#6046)
* init hcq args state

* cleaner

* amd

* fillargs

* fixes

* myoy

* docs

* fix

* not needed

* spacing
2024-08-13 17:11:58 +03:00
nimlgen
ce066fd754 nv do not recalc mv_address (#5998) 2024-08-09 17:16:34 +03:00
nimlgen
76eca0d27e nv fix host mem mappings (#5979) 2024-08-08 17:03:44 +03:00
nimlgen
564a352194 nv unify _gpu_free (#5961)
* nv unify _gpu_free

* revert this
2024-08-07 18:18:17 +03:00
nimlgen
895e062723 nv remove useless init (#5932) 2024-08-06 14:41:40 +03:00
George Hotz
7348c40d9d sampling time sync (8700 lines) (#5843)
* sampling time sync

* jitter matrix

* comment

* pass mypy

* line count
2024-08-02 14:44:35 -07:00
nimlgen
34168a64e3 optimize nv profiler (#5856)
* nv profiler fix

* cleanup hcq a bit

* fixes

* fix

* typo

* all signals put timestamp

* a bit cleaner

* merge fields

* type

* import

* tiny fix
2024-08-01 23:57:45 +03:00
George Hotz
4dd24dc439 use decimal for timestamps for more precision [run_process_replay] (#5823)
* use decimal for timestamps for more precision

* err, didn't get saved

* fix types + 38 -> 40
2024-07-30 15:06:14 -07:00
nimlgen
ca674c31f9 nv remove some type ignores (#5811) 2024-07-30 17:47:29 +03:00
nimlgen
a25e1a1c90 nv open correct device (#5796) 2024-07-29 23:40:52 +03:00
chenyu
471b188d79 fix mypy errors in latest mypy (#5794)
* fix mypy errors in latest mypy

mypy has stricter partial and api arg checks now

* PYTHONPATH="."
2024-07-29 14:53:30 -04:00
nimlgen
71e1472290 hcq more types (#5791)
* mhcq more types

* linter

* pylint

* docs: bind
2024-07-29 18:03:23 +03:00
nimlgen
ea27ec4cd0 nv switch classlist_v2 to classlist (#5763)
* nv switch classlist_v2 to classlist

* support in mockgpu

* fix mockgpu
2024-07-28 20:24:42 +03:00
nimlgen
1903542c2d nv/cuda compilers touchup (#5759)
* nv/cuda compilers touchup

* fix cuda check + move nv disasm

* remove includes

* fix nvrtc_check
2024-07-28 00:15:28 +03:00
chenyu
9838c1a6ff update import style in runtime (#5735) 2024-07-26 14:00:23 -04:00
George Hotz
5c688560bc move CUDA/HIP compilers to their own files [run_process_replay] (#5732) 2024-07-26 10:00:15 -07:00
nimlgen
6ec9ea9ddd hcq update_exec with optional params (#5708) 2024-07-26 00:04:57 +03:00
nimlgen
b026312a31 nv ptx print log (#5691) 2024-07-24 21:40:58 +03:00
nimlgen
2ea54176e2 docs: add more info on HCQProgram (#5683)
* docs: add more info on HCQProgram

* linter

* linter2

* one more type
2024-07-24 17:20:18 +03:00
nimlgen
baface413a nv better nvdisasm fail message (#5682)
* nv better nvdisasm message

* cuda
2024-07-24 16:19:26 +03:00
nimlgen
a93982ef42 hcq move out program call to base class (#5638)
* hcq move out program call to base class

* fix
2024-07-23 14:25:38 +03:00
nimlgen
ee633c1988 hcq move out synchronize to base class (#5634) 2024-07-22 20:36:04 +03:00
Vyacheslav Pachkov
edc58e6b6e hcq: remove duplicate allocation of kernel args by abstracting (#5633) 2024-07-22 18:29:41 +03:00
nimlgen
08a9c0ae5e hcq cache invalidation for beam (#5630)
* nv full cache invalidation

* the same command on amd

* linter

* fix amd

* nv no hardcoded consts

* beam default
2024-07-22 18:13:17 +03:00