HCQ: Increment timeline signal before submitting (#9550)

`AMDComputeQueue.__del__` frees `hw_page`. This is normally safe because
`AMDAllocator._free` calls `self.dev.synchronize()`, which is supposed
to wait for the IB to finish executing. That wait does not happen if the
AMDComputeQueue is dropped right after submit, before the timeline
signal is incremented, and most call sites do exactly that. The result
is a race whenever .bind() is also used (required for multi-xcc, because
a bug in the MEC firmware treats all PACKET3_PRED_EXECs outside IBs as
if they had an EXEC_COUNT of zero).
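
The ordering problem can be illustrated with a toy model. This is only a sketch: `ToyDevice`, `would_block`, and the two helper functions are hypothetical names, not tinygrad code, and `next_timeline()` is assumed to return the current timeline value and then increment it, which is what the diff in this commit implies.

```python
class ToyDevice:
    """Hypothetical stand-in for an HCQ device; not tinygrad's real class."""

    def __init__(self):
        self.timeline_value = 1  # value the next submission will signal
        self.signaled = 0        # last value the "GPU" has written so far

    def next_timeline(self):
        # Assumed semantics: hand out the current value, then bump it.
        self.timeline_value += 1
        return self.timeline_value - 1

    def would_block(self):
        # Models synchronize(): it waits for timeline_value - 1.
        # Returns True if that value has not been signaled yet.
        return self.signaled < self.timeline_value - 1


def old_order_blocks():
    dev = ToyDevice()
    target = dev.timeline_value    # queue is built to signal this value
    # ... submit happens here; the work is still in flight ...
    blocks = dev.would_block()     # queue dropped now -> synchronize()
    dev.timeline_value += 1        # increment arrives too late
    return blocks


def new_order_blocks():
    dev = ToyDevice()
    target = dev.next_timeline()   # increment happens while building the queue
    # ... submit happens here; the work is still in flight ...
    return dev.would_block()       # queue dropped now -> synchronize()


print(old_order_blocks())  # synchronize() returns immediately: race
print(new_order_blocks())  # synchronize() waits for the in-flight work
```

In the old ordering the wait target is still the previous timeline value when the queue is dropped, so the toy `synchronize()` has nothing to wait for. In the new ordering it must wait for the submission that is still in flight.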
uuuvn
2025-03-23 16:30:38 +05:00
committed by GitHub
parent d5667419af
commit c631c72f22
5 changed files with 21 additions and 31 deletions


@@ -115,9 +115,8 @@ HCQ-compatible devices use a global timeline signal for synchronizing all operat
```python
HWQueue().wait(your_device.timeline_signal, your_device.timeline_value - 1) \
.exec(...)
-.signal(your_device.timeline_signal, your_device.timeline_value) \
+.signal(your_device.timeline_signal, your_device.next_timeline()) \
.submit(your_device)
-your_device.timeline_value += 1
# Optionally wait for execution
your_device.timeline_signal.wait(your_device.timeline_value - 1)