nimlgen
|
68cd2c0669
|
nv correct local memory based on device (#7307)
* nv correct local memory based on device
* linter
* oops
* oops2
|
2024-10-25 22:23:42 +03:00 |
|
nimlgen
|
98f8d0ccf9
|
nv limit max local memory with envvar (#7265)
|
2024-10-24 16:01:50 +03:00 |
|
George Hotz
|
de7b9d7c42
|
improve pre-commit [pr] (#7256)
* improve pre-commit [pr]
* mypy passes on windows
|
2024-10-24 15:38:47 +08:00 |
|
nimlgen
|
ea11382087
|
nv fix shared_memory_size (#7239)
|
2024-10-23 21:59:47 +03:00 |
|
nimlgen
|
cef7078c14
|
nv limit mappings debug (#7215)
|
2024-10-22 16:41:43 +03:00 |
|
nimlgen
|
81349213c0
|
nv min regs count is 16 (#7166)
|
2024-10-20 20:03:55 +03:00 |
|
nimlgen
|
211d9753f8
|
nv more lc checks (#7139)
* nv more lc checks
* revert
* linter
|
2024-10-18 00:21:53 +03:00 |
|
George Hotz
|
ca0dca35f7
|
move ptx renderer [pr] (#7118)
|
2024-10-17 14:50:32 +08:00 |
|
nimlgen
|
83e7dbd89e
|
nv fix reallocation local memory when oom (#7098)
|
2024-10-16 18:17:50 +03:00 |
|
nimlgen
|
9f00eacde5
|
nv tagged memory + resnet failed kernel (#7061)
* nv tagged memory
* linter
* metal fix?
|
2024-10-15 18:19:58 +03:00 |
|
nimlgen
|
586ff4c910
|
nv record uvm mappings (#7059)
* nv record uvm mappings
* linteeer
* smth
* ooops
|
2024-10-15 00:12:49 +03:00 |
|
nimlgen
|
8094340221
|
nv print info about faults (#7057)
* nv print info about faults
* unrelated changes
* nv_gpu.GT200_DEBUGGER in mockgpu
* regen with ocrrect version
* spacing
|
2024-10-14 21:49:38 +03:00 |
|
nimlgen
|
0d526e251e
|
nv sync on gpu before local update (#6954)
|
2024-10-08 17:43:58 +03:00 |
|
nimlgen
|
42609300ff
|
hcq no timeline signals in init (#6944)
|
2024-10-07 23:36:19 +03:00 |
|
nimlgen
|
707c805a68
|
nv set localmem sm count to max (#6890)
|
2024-10-04 23:29:46 +03:00 |
|
George Hotz
|
6b063450df
|
move hcq device to runtime [pr] (#6879)
* things that are only used in one place don't belong in helpers [pr]
* start moving hcq device [pr]
* fix paths
|
2024-10-04 22:26:50 +08:00 |
|
nimlgen
|
e213bea426
|
nv shorter (#6819)
|
2024-09-30 19:39:32 +03:00 |
|
nimlgen
|
d3ed50c769
|
fix typo in 'Too many resources requested for launch' (#6705)
|
2024-09-24 15:33:01 +08:00 |
|
nimlgen
|
eac046ea55
|
hcq check queue size before submit (#6481)
|
2024-09-11 23:13:13 +03:00 |
|
nimlgen
|
f63a9fd649
|
hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability
* linter
|
2024-09-09 20:04:45 +03:00 |
|
nimlgen
|
40e49b6b1a
|
hcq share singal wait (#6394)
* hcq share singal wait
* linter
|
2024-09-06 18:35:06 +03:00 |
|
nimlgen
|
b1e5343133
|
nv better error msg for p2p failure (#6301)
* nv better error msg for p2p failure
* linetr
* from
* mypy
|
2024-08-28 01:40:45 +03:00 |
|
nimlgen
|
ac303146ca
|
nv sure qmd addr less than 40bits (#6288)
|
2024-08-27 20:47:38 +03:00 |
|
nimlgen
|
89c4cffd86
|
nv fix size in SET_SEMAPHORE_A (#6213)
|
2024-08-21 01:47:10 +03:00 |
|
nimlgen
|
bc44e6501b
|
_gpu_alloc -> allocator.alloc (#6189)
* _gpu_alloc -> allocator.alloc
* not needed this import
* pylint
|
2024-08-19 23:34:22 +03:00 |
|
nimlgen
|
b765996d54
|
hcq remove offset from progs (#6090)
|
2024-08-15 17:02:54 +03:00 |
|
nimlgen
|
fa84e6ec48
|
init hcq args state (#6046)
* init hcq args state
* cleaner
* amd
* fillargs
* fixes
* myoy
* docs
* fix
* not needed
* spacing
|
2024-08-13 17:11:58 +03:00 |
|
nimlgen
|
ce066fd754
|
nv do not recalc mv_address (#5998)
|
2024-08-09 17:16:34 +03:00 |
|
nimlgen
|
76eca0d27e
|
nv fix host mem mappings (#5979)
|
2024-08-08 17:03:44 +03:00 |
|
nimlgen
|
564a352194
|
nv unify _gpu_free (#5961)
* nv unify _gpu_free
* revert this
|
2024-08-07 18:18:17 +03:00 |
|
nimlgen
|
895e062723
|
nv remove useless init (#5932)
|
2024-08-06 14:41:40 +03:00 |
|
George Hotz
|
7348c40d9d
|
sampling time sync (8700 lines) (#5843)
* sampling time sync
* jitter matrix
* comment
* pass mypy
* line count
|
2024-08-02 14:44:35 -07:00 |
|
nimlgen
|
34168a64e3
|
optimize nv profiler (#5856)
* nv profiler fix
* cleanup hcq a bit
* fixes
* fix
* typo
* all signals put timestamp
* a bit cleaner
* merge fields
* type
* import
* tiny fix
|
2024-08-01 23:57:45 +03:00 |
|
George Hotz
|
4dd24dc439
|
use decimal for timestamps for more precision [run_process_replay] (#5823)
* use decimal for timestamps for more precision
* err, didn't get saved
* fix types + 38 -> 40
|
2024-07-30 15:06:14 -07:00 |
|
nimlgen
|
ca674c31f9
|
nv remove some type ignores (#5811)
|
2024-07-30 17:47:29 +03:00 |
|
nimlgen
|
a25e1a1c90
|
nv open correct device (#5796)
|
2024-07-29 23:40:52 +03:00 |
|
chenyu
|
471b188d79
|
fix mypy errors in latest mypy (#5794)
* fix mypy errors in latest mypy
mypy has stricter partial and api arg checks now
* PYTHONPATH="."
|
2024-07-29 14:53:30 -04:00 |
|
nimlgen
|
71e1472290
|
hcq more types (#5791)
* mhcq more types
* linter
* pylint
* docs: bind
|
2024-07-29 18:03:23 +03:00 |
|
nimlgen
|
ea27ec4cd0
|
nv switch classlist_v2 to classlist (#5763)
* nv switch classlist_v2 to classlist
* support in mockgpu
* fix mockgpu
|
2024-07-28 20:24:42 +03:00 |
|
nimlgen
|
1903542c2d
|
nv/cuda compilers touchup (#5759)
* nv/cuda compilers touchup
* fix cuda check + move nv disasm
* remove includes
* fix nvrtc_check
|
2024-07-28 00:15:28 +03:00 |
|
chenyu
|
9838c1a6ff
|
update import style in runtime (#5735)
|
2024-07-26 14:00:23 -04:00 |
|
George Hotz
|
5c688560bc
|
move CUDA/HIP compilers to their own files [run_process_replay] (#5732)
|
2024-07-26 10:00:15 -07:00 |
|
nimlgen
|
6ec9ea9ddd
|
hcq update_exec with optional params (#5708)
|
2024-07-26 00:04:57 +03:00 |
|
nimlgen
|
b026312a31
|
nv ptx print log (#5691)
|
2024-07-24 21:40:58 +03:00 |
|
nimlgen
|
2ea54176e2
|
docs: add more info on HCQProgram (#5683)
* docs: add more info on HCQProgram
* linter
* linter2
* one more type
|
2024-07-24 17:20:18 +03:00 |
|
nimlgen
|
baface413a
|
nv better nvdisasm fail message (#5682)
* nv better nvdisasm message
* cuda
|
2024-07-24 16:19:26 +03:00 |
|
nimlgen
|
a93982ef42
|
hcq move out program call to base class (#5638)
* hcq move out program call to base class
* fix
|
2024-07-23 14:25:38 +03:00 |
|
nimlgen
|
ee633c1988
|
hcq move out synchronize to base class (#5634)
|
2024-07-22 20:36:04 +03:00 |
|
Vyacheslav Pachkov
|
edc58e6b6e
|
hcq: remove duplicate allocation of kernel args by abstracting (#5633)
|
2024-07-22 18:29:41 +03:00 |
|
nimlgen
|
08a9c0ae5e
|
hcq cache invalidation for beam (#5630)
* nv full cache invalidation
* the same command on amd
* linter
* fix amd
* nv no hardcoded consts
* beam default
|
2024-07-22 18:13:17 +03:00 |
|