HCQArgsState lifetime docs (#6323)

This commit is contained in:
nimlgen
2024-08-30 00:31:49 +03:00
committed by GitHub
parent 56b7fadc2f
commit 9b616cb33e
2 changed files with 4 additions and 1 deletions

View File

@@ -134,6 +134,8 @@ Backends must adhere to the `HCQBuffer` protocol when returning allocation resul
members: true
show_source: false
**Lifetime**: The `HCQArgsState` is passed to `HWComputeQueue.exec` and is guaranteed not to be freed until `HWComputeQueue.submit` for the same queue is called.
### Synchronization
HCQ-compatible devices use a global timeline signal for synchronizing all operations. This mechanism ensures proper ordering and completion of tasks across the device. By convention, `self.timeline_value` points to the next value to signal. So, to wait for all previous operations on the device to complete, wait for `self.timeline_value - 1` value. The following Python code demonstrates the typical usage of signals to synchronize execution to other operations on the device:

View File

@@ -468,10 +468,11 @@ class HCQProgram:
Execution time of the kernel if 'wait' is True, otherwise None.
"""
kernargs = self.fill_kernargs(bufs, vals)
q = self.device.hw_compute_queue_t().wait(self.device.timeline_signal, self.device.timeline_value - 1).memory_barrier()
with hcq_profile(self.device, queue=q, desc=self.name, enabled=wait or PROFILE) as (sig_st, sig_en):
q.exec(self, self.fill_kernargs(bufs, vals), global_size, local_size)
q.exec(self, kernargs, global_size, local_size)
q.signal(self.device.timeline_signal, self.device.timeline_value).submit(self.device)
self.device.timeline_value += 1