* hljs.highlightElement target code not pre
* createPre
* no style change
* real no style change
* remove unnecessary scroll bar
* horizontal scrollbar appears only when scrolled all the way to the bottom
* misc
* connect to gpu
* rlc init?
* gfx comp start init
* early init is hardcoded, some progress with fw
* gart
* progress, next mqd
* ring setup, still does not execute anything
* ugh write correct reg
* pci2: vm
* pci2: start psp
* vm seems to work
* pci2: gfx start
* pci2: fix psp ring resp
* pci2: try ring
* pci2: mes and some fixes
* pci2: some progress
* pci2: progress
* pci2: mm
* pci2: discovery
* pci2: correct apertures
* pci2: b
* pci2: i
* pci2: l
* pci2: o
* pci2: cmu
* pci2: mes_kiq works
* pci2: mes
* pci2: kcq does not work
* pci2: unhalt gfx
* ops_am
* minor
* check if amdgpu is there, or we will crash
* bring back graph, it just works
* less prints
* do not init mes (not used)
* remove unused files
* ops_am: start move into core
* ops_am: works
* clocks, but still slower
* faster + no mes_kiq
* vm frags + remove mes
* cleanup fw
* gmc tiny cleanup
* move to ops_amd
* comment out what we dont really need
* driverless
* close in speed
* am clean most of ips
* gmc to ips
* cleaner
* new vm walker
* comment old one
* remove unused autogens
* last write ups
* remove psp hardcoded values
* more
* add logs
* ih
* p2p and sdma
* vfio hal and interrupts
* smth
* amd dev iface
* minor after rebase
* bind for sdma
* Revert "bind for sdma"
This reverts commit a90766514d.
* tmp
* debug new mm
* ugh, allreduce hangs fixed
* p1
* works
* no pci.py
* cleaner a bit
* smth
* tiny cleanups
* cleaner a bit
* pciiface
* linter
* linter 2
* linter 3
* linter
* pylint
* reverted unrelated changes
* unrelated
* cmp tool
* ugh wrong fw
* clockgating
* unrelated
* alloc smaller chunks
* this
* opt sigs
* collect stat
* ops
* upd
* proclogs
* proclogs2
* vfio
* ruff
* linter pylint
* oops
* mypy p1
* mem fix
* mypy p2
* mypy p3
* mypy p4
* correct
* minor
* more tests
* linter in tests
* pci_regs header
* minor write up
* setup
* do not require libs
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
it's a Python-style mod. possibly can be cleaner with a floor div
relaxed the vmin for MOD slightly for C-style negative mod; it's more correct and might fix other bugs
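
For readers unfamiliar with the distinction, here is a minimal sketch of the two modulo conventions and of why a C-style MOD needs a lower bound below zero. The helper names (`c_mod`, `py_mod`, `c_mod_bounds`) are hypothetical and written only for illustration; this is not the code changed by the commit.

```python
def c_mod(a: int, b: int) -> int:
    """C-style remainder: the quotient truncates toward zero, so the sign follows a."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - q * b

def py_mod(a: int, b: int) -> int:
    """Python-style modulo written with a floor div: the sign follows b."""
    return a - (a // b) * b

def c_mod_bounds(x_min: int, x_max: int, c: int) -> tuple[int, int]:
    """Conservative (vmin, vmax) for c_mod(x, c) with x in [x_min, x_max] and c > 0.

    Assumption for this sketch: the divisor is a positive constant.
    """
    # A negative dividend can produce a negative remainder, down to -(c - 1).
    vmin = max(x_min, -(c - 1)) if x_min < 0 else 0
    vmax = min(x_max, c - 1) if x_max > 0 else 0
    return vmin, vmax

print(c_mod(-7, 3), py_mod(-7, 3))  # -1 2
print(c_mod_bounds(-10, 10, 4))     # (-3, 3)
```

With a possibly negative dividend, a range analysis that assumes `vmin = 0` would be unsound for the C-style operator, which is the kind of case a relaxed vmin covers.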
* remove uop mutability [pr]
* test fixups
* most tests pass
* more tests pass
* lil test fixups
* them too
* fix test
* unneeded
* err, that
* fix test_hcq
* fix test failures
* fix that test
* tensor universe
* does this pass test
* Revert "does this pass test"
This reverts commit ed516b3169.
* Revert "tensor universe"
This reverts commit c21301852a.
* proper spidering for uops
* cleanups
* all tensors
* all tensors
* slow but correct
* fast
* no WeakSet
* faster
* no need for list
* revert that
* merge all one op views [pr]
* does this work?
* this won't work (yet)
* apply movement ops on top of the BUFFER
* buffer needs to become base next
---------
Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
* UOp ShapeTracker conceptual refactor [pr]
* add the UOp shape spec
* assign spec
* test a permuted assign
* lint + more work
* collapse assign after it swizzles the store [pr]
* more work, refine valid
* permute the other way
* shapetracker cleanup
* this assert should work now
instead of having a class var for the whole stack, store the old context in each Context.
also updated a test to check that a ContextVar created inside a Context is not cleared after the Context block
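
A minimal sketch of the idea, assuming a simplified `ContextVar`/`Context` pair; the class bodies here are illustrative and are not copied from the repository. Each `Context` instance snapshots the values it sees on entry and restores only those on exit, so no shared class-level stack is needed.

```python
import contextlib

class ContextVar:
    # Registry of all variables by key (simplified for this sketch).
    _cache = {}

    def __init__(self, key, default):
        self.key, self.value = key, default
        ContextVar._cache[key] = self

class Context(contextlib.ContextDecorator):
    def __init__(self, **kwargs):
        self.kwargs = kwargs

    def __enter__(self):
        # Store the old values on this instance rather than on a class-level stack.
        self.old_context = {k: v.value for k, v in ContextVar._cache.items()}
        for k, v in self.kwargs.items():
            ContextVar._cache[k].value = v
        return self

    def __exit__(self, *args):
        # Restore only what existed on entry; variables created inside the block are kept.
        for k, v in self.old_context.items():
            ContextVar._cache[k].value = v

DEBUG = ContextVar("DEBUG", 0)
with Context(DEBUG=1):
    with Context(DEBUG=2):
        inner = DEBUG.value
    middle = DEBUG.value
    CREATED_INSIDE = ContextVar("CREATED_INSIDE", 5)
outer = DEBUG.value
```

Because the restore step iterates over the snapshot, a variable registered inside the block is absent from `old_context` and therefore survives after the block exits.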
* validate that FC exists before loading pretrained weights
* add test case for ResNet pretrained model without FC layer
* remove extra newline
* rename test case
* reraise exception if not handled by check
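
The pattern described in the last few messages can be sketched as follows. The `Model`, `Layer`, `get_child`, and `load_pretrained` names are hypothetical stand-ins, and the state dict is a plain Python dict; this assumes the checkpoint stores fully connected weights under keys starting with `fc.`.

```python
class Layer:
    def __init__(self):
        self.weight = None

class Model:
    def __init__(self, with_fc=True):
        self.conv1 = Layer()
        # A model built without a classifier head has fc set to None.
        self.fc = Layer() if with_fc else None

def get_child(obj, dotted_key):
    """Walk a dotted attribute path such as 'conv1' or 'layer1.conv1'."""
    for part in dotted_key.split("."):
        obj = getattr(obj, part)
    return obj

def load_pretrained(model, state_dict):
    loaded = []
    for key, value in state_dict.items():
        layer_name, attr = key.rsplit(".", 1)
        try:
            layer = get_child(model, layer_name)
            getattr(layer, attr)  # raises AttributeError if the target is missing
            setattr(layer, attr, value)
        except AttributeError:
            # Skip FC weights only when the model really has no FC layer.
            if key.startswith("fc.") and model.fc is None:
                continue
            raise  # anything else is a genuine mismatch, so surface it
        loaded.append(key)
    return loaded

headless = Model(with_fc=False)
loaded_keys = load_pretrained(headless, {"conv1.weight": 1.0, "fc.weight": 2.0})
```

Re-raising in the fallthrough branch keeps the FC check from silently hiding unrelated key mismatches.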