Commit Graph

146 Commits

Author SHA1 Message Date
nimlgen
70db8c3003 hcq: dyn alloc signals (#9238)
* hcq: dyn alloc signals

* types and uniqueue devs

* typing

* mypy

* mypy one more time

* test

* make fds not intersect in mockgpu between drivers
2025-02-25 17:22:24 +03:00
uuuvn
8926bac00a am: profiling working (#9119)
ops_amd.py registers device finalization via atexit.register after
finalize_profile is registered in device.py. Because atexit handlers run in
LIFO order, the AM device is closed before the profile is finalized, which causes a hang.
(atexit.register is LIFO: https://docs.python.org/3.12/library/atexit.html#atexit.register)

This PR moves the registration of device finalization into device.py, before
the registration of profile finalization.
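
The ordering issue described in this message can be reproduced with a minimal sketch. The handler names below (finalize_device, finalize_profile) are hypothetical stand-ins, not the actual tinygrad functions. The sketch only illustrates that atexit runs handlers in reverse registration order, so the handler that must run last has to be registered first. A child interpreter is used so the exit handlers really fire.

```python
import subprocess
import sys
import textwrap

# Source for a child interpreter. The two handlers are hypothetical
# stand-ins for the device and profile finalizers named in the commit.
CHILD = textwrap.dedent("""
    import atexit

    def finalize_device():
        print("device closed")

    def finalize_profile():
        print("profile finalized")

    # atexit is LIFO: register device finalization first so it runs last,
    # after the profile has been finalized.
    atexit.register(finalize_device)
    atexit.register(finalize_profile)
""")

def run_child(source: str) -> list[str]:
    """Run `source` in a fresh interpreter and return its stdout lines."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

order = run_child(CHILD)
print(order)  # ['profile finalized', 'device closed']
```

Swapping the two atexit.register calls reverses the output, which matches the failure mode described above: the device would be closed while the profile still needs it.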
2025-02-16 18:51:08 +03:00
nimlgen
101652a55c hcq: thread fence (#8991)
* amd: thread fence

* nv
2025-02-11 18:09:37 +03:00
nimlgen
3e005ca0c2 am: resize bar0 to max supported (#9006) 2025-02-10 16:48:44 +03:00
nimlgen
88add71c25 amd: increase sdma copy size (#8989)
* amd: increase sdma max copy size

* rm this

* fix

* fx

* ops
2025-02-09 20:53:35 +03:00
nimlgen
c6c2373bc0 replace libpciaccess autogen with just pci regs (#8983)
* replace libpciaccess autogen with just pci regs

* add pci.py
2025-02-09 18:40:45 +03:00
nimlgen
e5a3f60fc2 am: remove libpciaccess dep (#8980)
* am: remove libpciaccess dep

* offset in mockhwiface

* op

* fake regions
2025-02-09 16:06:55 +03:00
nimlgen
79de980565 am: do not fork pci bars (#8969) 2025-02-08 19:03:17 +03:00
nimlgen
86feb98dcd am: add support for 7600 (#8910)
* am: start to add support for 7600

* test_tiny passes

* mmhub 3 0 2

* cleaner
2025-02-06 14:04:07 +03:00
nimlgen
4c28235bd1 am: remove hardcodes (#8895)
* am: remove hardcodes for 7900

* h
2025-02-05 00:52:53 +03:00
nimlgen
fa90079370 amd: reallocate scratch (#8872)
* amd: reallocate scratch

* use it

* oops

* allocate default

* mypy

* ops

* address realloc from none better

* types correct

* this better

* ops

* rm
2025-02-03 23:21:37 +03:00
nimlgen
b3fa76419a am: move queues to gpus (#8848)
* am: fix

* add flsg for thos

* do not depend on host parameter,
2025-02-01 18:02:52 +03:00
nimlgen
741bbc900d Revert "am: queues allocated on gpus (#8836)" (#8837)
This reverts commit 7bbb568dec.
2025-01-31 22:53:41 +03:00
nimlgen
7bbb568dec am: queues allocated on gpus (#8836)
* am: fix

* add flsg for thos
2025-01-31 22:14:43 +03:00
nimlgen
50ba2bb642 am: move ring to host mem (#8802) 2025-01-29 20:56:11 +03:00
nimlgen
c74c5901a8 am disable bind (#8747) 2025-01-25 19:06:35 +03:00
nimlgen
2f06eccf1d am: script and vfio msg (#8742)
* am: script and vfio msg

* use sysfs bars always for now

* tiny changes
2025-01-25 00:33:00 +03:00
nimlgen
dc10187fc0 am: add am_smi (#8739)
* am: start monitor

* cleanups

* fixes

* hmm

* progress

* cleanup
2025-01-24 20:16:19 +03:00
nimlgen
9d3c40601f am: fast memory manager (#8654)
* start

* progress

* fixes

* smth

* mini fixes

* fix2

* ugh, need this for now

* faster

* cleanups

* tiny linters

* make mypy happier

* test & free pts

* ops

* linter

* cleanup vm

* fix

* remove map_from

* tiny fixes

* add test to ci
2025-01-20 16:58:22 +03:00
nimlgen
f91ca508cf am: bind for sdma (#8633)
* am: bind for sdma

* fix
2025-01-16 15:22:27 +03:00
nimlgen
61665a63c9 am logs to debug2 (#8563) 2025-01-11 13:33:18 +03:00
nimlgen
f457cb64d6 am: do not reload fw each run (#8466)
* am do not reload fw each run

* works

* comment this

* clean + comment

* warn message

* linter

* move out pci en master

* useless

* more correct

* oops

* oops
2025-01-10 23:33:38 +03:00
nimlgen
337328e409 am: fini gpu after use (#8556)
* am: fini gpu after use

* mypy
2025-01-10 21:02:34 +03:00
patrini32
afef69a37d MOCKGPU on mac os (#8520)
* tweaks for macos

* fix

* fix

* typo

* remove nvidia changes

* remove nv related changes

* change address back
2025-01-07 20:27:43 +03:00
nimlgen
ab3ac2b58d hw interface abstraction (#8524)
* use HWInterface in autogen

* mockgpu

* HWInterface

* more HWInterface

* fix

* fix

* old code

* fix

* implicit field definition

* add offset check to mockgpu too

* refactor

* forgot to pass flags + read rewrite

* test

* play with vfio

* nv: this should be kept

* try this

* vfio

* rm overwrite=True

* linter

* do not reinit kfd

* minor

* mypy

* mock

* init them once

---------

Co-authored-by: patrini32 <patrini23@proton.me>
2025-01-07 18:18:28 +03:00
nimlgen
b4f4a3ac12 am: minor parts (#8507) 2025-01-05 23:05:21 +03:00
George Hotz
ddad4d55da add typing to tqdm [pr] (#8500) 2025-01-04 13:55:52 -05:00
nimlgen
5d37d33fc5 update typing.Optional to 3.10 for hcq (#8479) 2025-01-03 16:20:49 +03:00
nimlgen
c18307e749 AM driver (#6923)
* connect to gpu

* rlc init?

* gfx comp start init

* early init is hardcoded, some progress with fw

* gart

* progress, next mqd

* ring setup, still does not execute anything

* ugh write correct reg

* pci2: vm

* pci2: start psp

* vm seems to work

* pci2: gfx start

* pci2: fix psp ring resp

* pci2: try ring

* pci2: mes and some fixes

* pci2: some progress

* pci2: progress

* pci2: mm

* pci2: discovery

* pci2: correct apertures

* pci2: b

* pci2: i

* pci2: l

* pci2: o

* pci2: cmu

* pci2: mes_kiq works

* pci2: mes

* pci2: kcq does not work(

* pci2: unhalt gfx

* ops_am

* minor

* check if amdgpu is there, or we will crash

* bring back graph, it just works

* less prints

* do not init mes (not used)

* remove unused files

* ops_am: start move into core

* ops_am: works

* clcks, but still slower

* faster + no mes_kiq

* vm frags + remove mes

* cleanup fw

* gmc tiny cleanup

* move to ops_amd

* comment out what we dont really need

* driverless

* close in speed

* am clean most of ips

* gmc to ips

* cleaner

* new vm walker

* comment old one

* remove unused autogens

* last write ups

* remove psp hardcoded values

* more

* add logs

* ih

* p2p and sdma

* vfio hal and interrupts

* smth

* amd dev iface

* minor after rebase

* bind for sdma

* Revert "bind for sdma"

This reverts commit a90766514d.

* tmp

* debug new mm

* ugh, allreduce hangs fixed

* p1

* works

* no pci.py

* cleaner a bit

* smth

* tiny cleanups

* cleaner a bit

* pciiface

* linter

* linter 2

* linter 3

* linter

* pylint

* reverted unrelated changes

* unrelated

* cmp tool

* ugh wrong fw

* clockgating

* unrelated

* alloc smaller chunks

* this

* opt sigs

* collect stat

* ops

* upd

* proclogs

* proclogs2

* vfio

* ruff

* linter pylint

* oops

* mypy p1

* mem fix

* mypy p2

* mypy p3

* mypy p4

* correct

* minor

* more tests

* linter in tests

* pci_regs header

* minor write up

* setup

* do not require libs

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-12-31 23:06:17 +03:00
nimlgen
0a139b1436 amd iface abstraction (#8413)
* start on amd iface

* t

* unused import

* fixes

* internal api
2024-12-27 15:53:53 +03:00
chenyu
3f46425f1e typos found by gemini [pr] (#8400)
not very effective... maybe due to tokenizer
2024-12-24 22:32:25 -05:00
nimlgen
a647f3dd2c move mockgpu to tests [pr] (#8396)
* move mockgpu to tests

* linter

* i'm so sorry

* sorry, python

* path
2024-12-24 23:48:02 +03:00
chenyu
e63c7818dc few type cleanups [pr] (#8347) 2024-12-20 01:56:01 -05:00
George Hotz
9c77e9f9b7 replace Tuple with tuple [pr] (#8344)
* replace Tuple with tuple [pr]

* replace List with list [pr]

* replace Dict with dict [pr]

* replace Set with set [pr]
2024-12-19 21:27:56 -08:00
nimlgen
3a7d64b96c hcq remove update from args state (#8104)
* hcq remove update from args state

fix amd

ugh

qcom?

qcom ops

ops

qcom fix

qcom texture info

fx

qcom fix

qcom

qcom, sry

minor

works

* remove old code

* unrelated+sint

* qcom

* typing

* rm comments
2024-12-08 15:22:05 +03:00
nimlgen
d6e66095fd hcq buffer is a class (#8106)
* hcq buffer is a class

* qcom

* no from_mv in qcom

* remove qcombuffer

* useless cast

* mypy

* qcom fix

* _md -> meta
2024-12-08 13:29:43 +03:00
nimlgen
78c01a5c2b amd general _gpu_alloc (#8056)
* amd general _gpu_alloc

* hmm

* ops
2024-12-05 15:50:23 +03:00
nimlgen
7fda464b08 hcq c-like args state (#8020)
* hcq c-like args state

* ugh

* Dfix

* rename

* i
2024-12-03 23:53:35 +03:00
wozeparrot
077e7e8ed2 fix: private segment sgpr on gfx103x (#7987)
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-12-02 20:54:50 +08:00
nimlgen
10f431b96d hcq replace update with sint (#7899)
* try sym hcq

* start with amd

* move to nv

* nv works

* cache and qcom

* fixes

* signals

* fix nv

* qcom fixes

* linter

* linter

* cache + typings

* fixes

* tiny fixes

* linter

* linter

* lntr

* ugh

* comments
2024-11-29 20:08:13 +03:00
nimlgen
d3660ccc51 prereqs for hcq updates removal (#7959)
* hcq signals touch ups

* hcq compiled has device id

* helpers

* prereq hcq api

* oops
2024-11-29 18:20:07 +03:00
nimlgen
309dcb1044 hcq signal add sleep (#7955)
* hcqsignal sleep

* fixes

* typing

* time ms is int
2024-11-29 14:04:45 +03:00
nimlgen
81d415be03 amd pkt3 refactor (#7923)
* amd pkt3 refactor

* replace this

* linter

* fix

* cmt

* fast

* simpler

* linter

* smth

* missing
2024-11-28 11:06:37 +03:00
nimlgen
84f96e48a1 hcq signal tiny refactor (#7913)
* hcq signal tiny refactor

* no mv

* fix

* fix2

* fix3
2024-11-26 21:48:38 +03:00
George Hotz
439911b2e6 disable disable_abstract_method [pr] (#7815) 2024-11-21 12:28:57 +08:00
George Hotz
c5d458ce02 BufferSpec and ProgramSpec [pr] (#7814)
* BufferSpec and ProgramSpec [pr]

* delete preallocate, it's unused

* Revert "delete preallocate, it's unused"

This reverts commit dcfcfaccde.
2024-11-21 12:18:05 +08:00
George Hotz
490a6130af more hcq typing [pr] (#7813)
* more hcq typing [pr]

* minor

* less generic
2024-11-21 11:23:07 +08:00
George Hotz
9df5a62c5e unify to HWQueue [pr] (#7812)
* unify to HWCommandQueue [pr]

* all is HWQueue
2024-11-21 10:33:08 +08:00
George Hotz
0a74acd90e add proper typing to HCQ [pr] (#7803)
* add proper typing to HCQ [pr]

* more types

* and qcom

* HCQProgram has device type

* typed allocator
2024-11-20 17:20:39 +08:00
George Hotz
6688539bc9 rename device to dev so Buffer can be Allocator [pr] (#7799)
* rename device to dev so Buffer can be Allocator [pr]

* missed those

* update the Program classes also

* more renames

* oops
2024-11-20 15:47:26 +08:00