139 Commits

Author SHA1 Message Date
George Hotz
439911b2e6 disable disable_abstract_method [pr] (#7815) 2024-11-21 12:28:57 +08:00
George Hotz
c5d458ce02 BufferSpec and ProgramSpec [pr] (#7814)
* BufferSpec and ProgramSpec [pr]

* delete preallocate, it's unused

* Revert "delete preallocate, it's unused"

This reverts commit dcfcfaccde.
2024-11-21 12:18:05 +08:00
George Hotz
490a6130af more hcq typing [pr] (#7813)
* more hcq typing [pr]

* minor

* less generic
2024-11-21 11:23:07 +08:00
George Hotz
9df5a62c5e unify to HWQueue [pr] (#7812)
* unify to HWCommandQueue [pr]

* all is HWQueue
2024-11-21 10:33:08 +08:00
George Hotz
eb0bb7dc0b final dname to device [pr] (#7806)
* final dname to device [pr]

* oops, fix nv
2024-11-20 20:20:28 +08:00
George Hotz
0a74acd90e add proper typing to HCQ [pr] (#7803)
* add proper typing to HCQ [pr]

* more types

* and qcom

* HCQProgram has device type

* typed allocator
2024-11-20 17:20:39 +08:00
George Hotz
6688539bc9 rename device to dev so Buffer can be Allocator [pr] (#7799)
* rename device to dev to Buffer can be Allocator [pr]

* missed those

* update the Program classes also

* more renames

* oops
2024-11-20 15:47:26 +08:00
George Hotz
6bb230287b pass the src into Metal [pr] (#7518)
* pass the src into Metal [pr]

* put that comment back

* keep old functionality

* move all to disassembler

* metal supports parallel beam

* touchups

* comment in correct place
2024-11-04 12:35:30 +08:00
nimlgen
68cd2c0669 nv correct local memory based on device (#7307)
* nv correct local memory based on device

* linter

* oops

* oops2
2024-10-25 22:23:42 +03:00
nimlgen
98f8d0ccf9 nv limit max local memory with envvar (#7265) 2024-10-24 16:01:50 +03:00
George Hotz
de7b9d7c42 improve pre-commit [pr] (#7256)
* improve pre-commit [pr]

* mypy passes on windows
2024-10-24 15:38:47 +08:00
nimlgen
ea11382087 nv fix shared_memory_size (#7239) 2024-10-23 21:59:47 +03:00
nimlgen
cef7078c14 nv limit mappings debug (#7215) 2024-10-22 16:41:43 +03:00
nimlgen
81349213c0 nv min regs count is 16 (#7166) 2024-10-20 20:03:55 +03:00
nimlgen
211d9753f8 nv more lc checks (#7139)
* nv more lc checks

* revert

* linter
2024-10-18 00:21:53 +03:00
George Hotz
ca0dca35f7 move ptx renderer [pr] (#7118) 2024-10-17 14:50:32 +08:00
nimlgen
83e7dbd89e nv fix reallocation local memory when oom (#7098) 2024-10-16 18:17:50 +03:00
nimlgen
9f00eacde5 nv tagged memory + resnet failed kernel (#7061)
* nv tagged memory

* linter

* metal fix?
2024-10-15 18:19:58 +03:00
nimlgen
586ff4c910 nv record uvm mappings (#7059)
* nv record uvm mappings

* linteeer

* smth

* ooops
2024-10-15 00:12:49 +03:00
nimlgen
8094340221 nv print info about faults (#7057)
* nv print info about faults

* unrelated changes

* nv_gpu.GT200_DEBUGGER in mockgpu

* regen with ocrrect version

* spacing
2024-10-14 21:49:38 +03:00
nimlgen
0d526e251e nv sync on gpu before local update (#6954) 2024-10-08 17:43:58 +03:00
nimlgen
42609300ff hcq no timeline signals in init (#6944) 2024-10-07 23:36:19 +03:00
nimlgen
707c805a68 nv set localmem sm count to max (#6890) 2024-10-04 23:29:46 +03:00
George Hotz
6b063450df move hcq device to runtime [pr] (#6879)
* things that are only used in one place don't belong in helpers [pr]

* start moving hcq device [pr]

* fix paths
2024-10-04 22:26:50 +08:00
nimlgen
e213bea426 nv shorter (#6819) 2024-09-30 19:39:32 +03:00
nimlgen
d3ed50c769 fix typo in 'Too many resources requested for launch' (#6705) 2024-09-24 15:33:01 +08:00
nimlgen
eac046ea55 hcq check queue size before submit (#6481) 2024-09-11 23:13:13 +03:00
nimlgen
f63a9fd649 hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability

* linter
2024-09-09 20:04:45 +03:00
nimlgen
40e49b6b1a hcq share singal wait (#6394)
* hcq share singal wait

* linter
2024-09-06 18:35:06 +03:00
nimlgen
b1e5343133 nv better error msg for p2p failure (#6301)
* nv better error msg for p2p failure

* linetr

* from

* mypy
2024-08-28 01:40:45 +03:00
nimlgen
ac303146ca nv sure qmd addr less than 40bits (#6288) 2024-08-27 20:47:38 +03:00
nimlgen
89c4cffd86 nv fix size in SET_SEMAPHORE_A (#6213) 2024-08-21 01:47:10 +03:00
nimlgen
bc44e6501b _gpu_alloc -> allocator.alloc (#6189)
* _gpu_alloc -> allocator.alloc

* not needed this import

* pylint
2024-08-19 23:34:22 +03:00
nimlgen
b765996d54 hcq remove offset from progs (#6090) 2024-08-15 17:02:54 +03:00
nimlgen
fa84e6ec48 init hcq args state (#6046)
* init hcq args state

* cleaner

* amd

* fillargs

* fixes

* myoy

* docs

* fix

* not needed

* spacing
2024-08-13 17:11:58 +03:00
nimlgen
ce066fd754 nv do not recalc mv_address (#5998) 2024-08-09 17:16:34 +03:00
nimlgen
76eca0d27e nv fix host mem mappings (#5979) 2024-08-08 17:03:44 +03:00
nimlgen
564a352194 nv unify _gpu_free (#5961)
* nv unify _gpu_free

* revert this
2024-08-07 18:18:17 +03:00
nimlgen
895e062723 nv remove useless init (#5932) 2024-08-06 14:41:40 +03:00
George Hotz
7348c40d9d sampling time sync (8700 lines) (#5843)
* sampling time sync

* jitter matrix

* comment

* pass mypy

* line count
2024-08-02 14:44:35 -07:00
nimlgen
34168a64e3 optimize nv profiler (#5856)
* nv profiler fix

* cleanup hcq a bit

* fixes

* fix

* typo

* all signals put timestamp

* a bit cleaner

* merge fields

* type

* import

* tiny fix
2024-08-01 23:57:45 +03:00
George Hotz
4dd24dc439 use decimal for timestamps for more precision [run_process_replay] (#5823)
* use decimal for timestamps for more precision

* err, didn't get saved

* fix types + 38 -> 40
2024-07-30 15:06:14 -07:00
nimlgen
ca674c31f9 nv remove some type ignores (#5811) 2024-07-30 17:47:29 +03:00
nimlgen
a25e1a1c90 nv open correct device (#5796) 2024-07-29 23:40:52 +03:00
chenyu
471b188d79 fix mypy errors in latest mypy (#5794)
* fix mypy errors in latest mypy

mypy has stricter partial and api arg checks now

* PYTHONPATH="."
2024-07-29 14:53:30 -04:00
nimlgen
71e1472290 hcq more types (#5791)
* mhcq more types

* linter

* pylint

* docs: bind
2024-07-29 18:03:23 +03:00
nimlgen
ea27ec4cd0 nv switch classlist_v2 to classlist (#5763)
* nv switch classlist_v2 to classlist

* support in mockgpu

* fix mockgpu
2024-07-28 20:24:42 +03:00
nimlgen
1903542c2d nv/cuda compilers touchup (#5759)
* nv/cuda compilers touchup

* fix cuda check + move nv disasm

* remove includes

* fix nvrtc_check
2024-07-28 00:15:28 +03:00
chenyu
9838c1a6ff update import style in runtime (#5735) 2024-07-26 14:00:23 -04:00
George Hotz
5c688560bc move CUDA/HIP compilers to their own files [run_process_replay] (#5732) 2024-07-26 10:00:15 -07:00