Commit Graph

91 Commits

Author SHA1 Message Date
chenyu
9838c1a6ff update import style in runtime (#5735) 2024-07-26 14:00:23 -04:00
George Hotz
5c688560bc move CUDA/HIP compilers to their own files [run_process_replay] (#5732) 2024-07-26 10:00:15 -07:00
nimlgen
6ec9ea9ddd hcq update_exec with optional params (#5708) 2024-07-26 00:04:57 +03:00
nimlgen
b026312a31 nv ptx print log (#5691) 2024-07-24 21:40:58 +03:00
nimlgen
2ea54176e2 docs: add more info on HCQProgram (#5683)
* docs: add more info on HCQProgram

* linter

* linter2

* one more type
2024-07-24 17:20:18 +03:00
nimlgen
baface413a nv better nvdisasm fail message (#5682)
* nv better nvdisasm message

* cuda
2024-07-24 16:19:26 +03:00
nimlgen
a93982ef42 hcq move out program call to base class (#5638)
* hcq move out program call to base class

* fix
2024-07-23 14:25:38 +03:00
nimlgen
ee633c1988 hcq move out synchronize to base class (#5634) 2024-07-22 20:36:04 +03:00
Vyacheslav Pachkov
edc58e6b6e hcq: remove duplicate allocation of kernel args by abstracting (#5633) 2024-07-22 18:29:41 +03:00
nimlgen
08a9c0ae5e hcq cache invalidation for beam (#5630)
* nv full cache invalidation

* the same command on amd

* linter

* fix amd

* nv no hardcoded consts

* beam default
2024-07-22 18:13:17 +03:00
Vyacheslav Pachkov
583829ab44 helpers: remove duplicate data64 helpers in amd/nv (#5627) 2024-07-21 16:50:59 -07:00
nimlgen
0de5812032 hcq move map to allocator (#5610)
* hcq move map to allocator

* fix
2024-07-20 19:02:45 +03:00
nimlgen
b1782e3fef hcq refactor signal into class (#5575)
* hcq refactor signal into class

* fix amd

* amd do not use amd_signal_t

* cleanup

* signal setter

* fix linter

* docs

* more docs + types

* fix types
2024-07-19 23:23:05 +03:00
nimlgen
9d7edc9269 hcq rename HCQCompat -> HCQ (#5577) 2024-07-19 11:34:17 +03:00
nimlgen
4e9d2b1615 nv memory_barrier command (#5548) 2024-07-18 16:23:11 +03:00
nimlgen
dcd462860f elf loader (#5508)
* elf loader

* cleanup

* cleaner

* cleaner

* fixes

* revert this

* fix div 0

* fix nv

* amd fix

* fix mockgpu

* amd better?

* restore relocs for <12.4

* linter

* this is fixed now

* revert this

* process cdefines as function

* cleaner

* align

* save lines

* revert this change
2024-07-17 17:09:34 +03:00
nimlgen
661da32aff nv do not map regions twice (#5521) 2024-07-17 11:20:02 +03:00
nimlgen
8dfd11c1d8 docs: hcq add types (#5495)
* docs: hcq add types

* linter
2024-07-15 22:14:48 +03:00
nimlgen
c9ec7ce070 start hcq docs (#5411)
* start hcq docs

* more hcq docs

* docs

* docs

* linter

* correct args

* linter

* ts returns int
2024-07-15 21:31:11 +03:00
chenyu
eef43c9f49 include dims in kernel/nv invalid err msg (#5487) 2024-07-14 22:51:30 -04:00
nimlgen
61822d1a14 nv fix timeline signal rollover on copy queue (#5473)
* hotfix: nv rollover to 32bits

* test both queues
2024-07-14 16:06:12 +03:00
nimlgen
8835d6c49a cleanup nv/amd program (#5449)
* cleanup nv/amd program

* fix amd

* a bit cleaner

* ugh, typo

* linter

* fix nv

* tiny thing
2024-07-14 14:08:35 +03:00
nimlgen
6943ea5f29 nv remove copy_from_cpu command (#5459) 2024-07-13 23:08:49 +03:00
nimlgen
6604d2b2c3 amd/nv respect visible devs (#5409)
* nv/amd respect visible devices

* linter

* sort amd gpus

* env docs
2024-07-12 20:02:12 +03:00
nimlgen
b3790b759b nv cleanup gpfifo setup (#5382)
* nv cleanup gpfifo setup

* save lines
2024-07-11 17:50:52 +03:00
nimlgen
2ba96d4c29 nv use mv_address (#5381)
* nv use mv_address

* unsued import
2024-07-11 16:45:03 +03:00
nimlgen
bd77efda2f add HWCommandQueue base class for hcq devices (#5303)
* add HWCommandQueue as base queue for hcq devices

* try this

* fixes

* comments

* linter

* linetr2

* linter

* linter

* fixed

* revert this
2024-07-11 16:19:13 +03:00
nimlgen
1678199b15 add update_copy to hcq spec (#5348)
* add update_copy to hcq spec

* fix amd
2024-07-09 20:44:44 +03:00
nimlgen
e815c57039 use hcq_profile in nv/amd program (#5344) 2024-07-09 15:56:06 +03:00
nimlgen
a2a9bfd2ec nv correct error messages with ptx (#5341)
* nv correct error messages with ptx

* return compile error
2024-07-09 10:39:39 +03:00
nimlgen
51d6f372e4 nv get classes based on device (#5325)
* nv get classes

* support in mockgpu

* choose sm based on gpu

* fix

* fix

* fix arch
2024-07-08 18:25:05 +03:00
nimlgen
b0c5c58833 nv rm_control to rmctrl type (#5327)
* nv rm_control to rmctrl type

* fix
2024-07-08 17:24:33 +03:00
nimlgen
778d1cdbee nv allocate local memory dynamically (#5277)
* nv allocate local memory dynamically

* fix

* linter

* linter 2

* linter

* fixes
2024-07-07 17:34:49 +03:00
nimlgen
2778b6046c new memory scheduler (#5278)
* new memory schedule algo

* works

* fix

* fix

* linter

* tiny fixes

* do not optimize copy buffers

* mpre comments

* tiny cleanups
2024-07-04 18:06:04 +03:00
nimlgen
84b3e3bb6f hcq exec no embedded signal (#5142) 2024-07-04 13:29:21 +03:00
nimlgen
21d41f06a2 nv follows HCQCompatAllocRes protocol (#5275)
* nv follows HCQCompatAllocRes protocol

* fix amd
2024-07-03 11:34:10 +03:00
Vyacheslav Pachkov
d3e4e21759 add return type for HCQCompatAllocator _alloc (#5267)
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2024-07-03 10:25:44 +03:00
nimlgen
7be776f9af add _alloc_signal/_free_signal to hcq (#5264)
* add _alloc_signal/_free_signal api

* oops, revert this

* linter
2024-07-02 23:35:39 +03:00
nimlgen
e050603b4b nv close fds after mapping (#5246) 2024-07-02 13:57:46 +03:00
nimlgen
dd7eef7d71 libc defs to autogen (#5217)
* libc defs to autogen

* amd import libc

* linter

* better a bit

* remove comment, check this

* not hardcoded path
2024-06-29 14:37:33 +03:00
nimlgen
ee02dcb98e nv supports PTX=1 (#5222)
* nv supports PTX=1

* not needed

* split nv compiler into nvrtc autogen

* remove to_c_array

* test

* Revert "test"

This reverts commit f0b56f308b.
2024-06-29 10:46:29 +03:00
nimlgen
ac748cccdb nv apply relocs (#5165)
* nv do reloc

* a bit cleaner
2024-06-27 23:54:16 +03:00
Roelof van Dijk
01e8838b65 ruff: suppressible-exception (#5182)
* fix: use contextlib to suppress errors

* enable rule SIM105

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-06-27 08:23:44 -07:00
Roelof van Dijk
f88f71d73a ruff: unnecessary-comprehension (#5174)
* enable ruff C416 unnecessary-comprehension

* already a list
2024-06-27 07:45:29 -04:00
nimlgen
69f116a7e1 nv/amd profiler (#4718)
* nv/amd profiler

* fix

* fix

* profile copies

* profile logger

* fixes

* more fixes

* less lines and fixes

* fixes

* some linter

* back sync, no related change

* fix gpu2cpu time def

* simpler

* linter

* linter

* docs

* add add_event api
2024-06-23 17:10:12 +03:00
nimlgen
2dcef5a0d7 hcq spec (#5081)
* hcq spec

* small change

* not used import

* fixes

* fix

* signals into base class

* more into base class

* remove imports

* fix wrap timeline

* raise when not implemented

* simpler
2024-06-22 15:32:12 +03:00
nimlgen
fb1bf48cfe io_uring for copies from disk (#5035)
* exp uring

* fixes and old version

* nv

* cleaner

* cmp vs aio

* fix

* no lib

* fix nv

* linter

* disk_speed_test now runs default

* fixes

* uring -> io_uring

* linter happy

* get_temp_buf comment added

* tiny nits

* put wait back

* test runs everywhere

* remove consts

* remove mmap consts

* do not require iouring to run test, they are generic
2024-06-21 11:36:51 +03:00
chenyu
a8e9307e0b pylint runtime/ and shape/ (#5044)
as pointed out by #4877, need to add `__init__.py` to trigger pylint. fixed some errors except ops_python (will do in a separate pr, it has a lot of errors), and sub-folders in runtime
2024-06-18 19:48:18 -04:00
nimlgen
194a168630 hcq signal scheduler (#5016)
* faster hcq

* fix nv

* linter

* cleaner

* fix sync

* cleaner

* a bit cleaner
2024-06-18 14:02:21 +03:00
nimlgen
794acefbf3 hcq update waits and signals in place (#4984)
* hcq update waits and signals in place

* start amd

* amd works

* prettier

* test

* normal messages

* linetr

* linter 2
2024-06-17 17:19:07 +03:00