hcq replace update with sint (#7899)

* try sym hcq

* start with amd

* move to nv

* nv works

* cache and qcom

* fixes

* signals

* fix nv

* qcom fixes

* linter

* linter

* cache + typings

* fixes

* tiny fixes

* linter

* linter

* lntr

* ugh

* comments
nimlgen
2024-11-29 20:08:13 +03:00
committed by GitHub
parent aa51f3c14e
commit 10f431b96d
7 changed files with 261 additions and 358 deletions


@@ -24,27 +24,11 @@ Each runtime should implement the required functions that are defined in the `HW
"signal",
"wait",
"timestamp",
"update_signal",
"update_wait",
"bind",
"submit",
"memory_barrier",
"exec",
"update_exec",
"copy",
"update_copy",
]
show_source: false
#### Implementing custom commands
To implement custom commands in the queue, use the @hcq_command decorator for your command implementations.
::: tinygrad.runtime.support.hcq.hcq_command
options:
members: [
"copy",
"update_copy",
]
show_source: false
@@ -141,5 +125,5 @@ your_device.timeline_signal.wait(your_device.timeline_value - 1)
## HCQGraph
-[HCQGraph](https://github.com/tinygrad/tinygrad/tree/master/tinygrad/runtime/graph/hcq.py) is a core feature that implements `GraphRunner` for HCQ-compatible devices. `HCQGraph` builds static `HWQueue` for all operations per device. To optimize enqueue time, only the necessary parts of the queues are updated for each run using the update APIs of the queues, avoiding a complete rebuild.
+[HCQGraph](https://github.com/tinygrad/tinygrad/tree/master/tinygrad/runtime/graph/hcq.py) is a core feature that implements `GraphRunner` for HCQ-compatible devices. `HCQGraph` builds static `HWQueue` for all operations per device. To optimize enqueue time, only the necessary parts of the queues are updated for each run using the symbolic variables, avoiding a complete rebuild.
Optionally, queues can implement a `bind` API, which allows further optimization by eliminating the need to copy the queues into the device ring.
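
The idea of a static queue patched through symbolic variables can be illustrated with a minimal sketch. This is not tinygrad's implementation: `Sym`, `StaticQueue`, `push`, and `update` are hypothetical names invented for illustration, and the queue is modeled as a plain list of integers. The sketch assumes each symbolic slot is a simple affine expression of one named variable.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Sym:
    """Hypothetical symbolic placeholder: resolves to var * mul + add at run time."""
    name: str
    mul: int = 1
    add: int = 0

    def resolve(self, var_vals: dict[str, int]) -> int:
        return var_vals[self.name] * self.mul + self.add

Word = Union[int, Sym]

class StaticQueue:
    """Hypothetical command queue that is built once and patched per run."""
    def __init__(self) -> None:
        self.words: list[int] = []                # concrete command words
        self.patches: list[tuple[int, Sym]] = []  # (index, expression) for symbolic slots

    def push(self, *vals: Word) -> StaticQueue:
        for v in vals:
            if isinstance(v, Sym):
                # Remember where the symbolic value lives; fill it in later.
                self.patches.append((len(self.words), v))
                self.words.append(0)
            else:
                self.words.append(v)
        return self

    def update(self, var_vals: dict[str, int]) -> list[int]:
        # Only the symbolic slots are rewritten; the rest of the queue is untouched.
        for idx, sym in self.patches:
            self.words[idx] = sym.resolve(var_vals)
        return self.words

# Build once: two constant words and two slots that depend on "timeline".
q = StaticQueue().push(0x10, Sym("timeline"), 0x20, Sym("timeline", add=1))
run1 = list(q.update({"timeline": 5}))
run2 = list(q.update({"timeline": 7}))
```

In this sketch the per-run cost is proportional to the number of symbolic slots rather than to the queue length, which is the property the paragraph above describes.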