icicle

mirror of https://github.com/pseXperiments/icicle.git synced 2026-01-09 21:17:56 -05:00

Author	SHA1	Message	Date
Jeremy Felder	2b07513310	[FEAT]: Golang Bindings for pinned host memory (#519) ## Describe the changes This PR adds the ability to pin host memory in the golang bindings, which speeds up data transfers. Memory can be pinned once for multiple devices by passing the flag `cuda_runtime.CudaHostRegisterPortable` or `cuda_runtime.CudaHostAllocPortable`, depending on how the memory is pinned (registered or allocated)	2024-06-24 14:03:44 +03:00
Jeremy Felder	14997566ff	[FIX]: Fix releasing device set on host thread during multigpu call (#501) ## Describe the changes This PR fixes an issue that occurs when `RunOnDevice` is called for multi-GPU while other goroutines calling device operations run outside of `RunOnDevice`. The issue comes from setting a device other than the default device (device 0) on a host thread within `RunOnDevice` and not unsetting that host thread's device when `RunOnDevice` finishes. When `RunOnDevice` locks a host thread to ensure that all other calls in the goroutine are on the same device, it never unsets that thread's device. Once the thread is unlocked, other goroutines can be scheduled onto it, but its device is still set to whatever it was while the thread was locked, so it's possible that the following sequence happens: 1. NTT domain is initialized on thread 2 via a goroutine on device 0 2. MSM multiGPU test runs and is locked on thread 3, setting its device to 1 3. Other tests run concurrently on threads other than 3 (since it is locked) 4. MSM multiGPU test finishes and releases thread 3 back to the pool, but its device is still 1 5. NTT test runs and is assigned to thread 3 --> this will fail because the thread's device wasn't released back We really only want to set a thread's device while the thread is locked. Once we unlock a thread, its device should return to whatever it was set to originally. In theory, it should always be 0 if `SetDevice` is never used outside of `RunOnDevice`, which it shouldn't be in most situations	2024-05-08 14:07:29 +03:00
Jeremy Felder	89082fb561	FEAT: MultiGPU for golang bindings (#417) ## Describe the changes This PR adds multi-GPU support to the golang bindings. The main changes are to DeviceSlice, which now includes a `deviceId` attribute specifying which device the underlying data resides on, and which checks for the correct deviceId and current device when DeviceSlices are used in any operation. In Go, most concurrency can be done via goroutines (described as lightweight threads; in reality, more of a threadpool manager). However, there is no guarantee that a goroutine stays on a specific host thread. Therefore, a function `RunOnDevice` was added to the cuda_runtime package which locks a goroutine to a specific host thread, sets the current GPU device, runs a provided function, and unlocks the goroutine from the host thread after the provided function finishes. While the goroutine is locked to the host thread, the Go runtime will not assign other goroutines to that host thread	2024-03-13 16:19:45 +02:00
Jeremy Felder	e8cd2d7a98	GoLang bindings for v1.x (#386)	2024-02-22 20:52:48 +02:00

4 Commits
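
The lock, set, run, restore, unlock sequence described in #417 and #501 can be sketched in plain Go as follows. This is a minimal illustration, not the code from the bindings: `runOnDevice`, `getDevice`, `setDevice`, and `currentDevice` are hypothetical names, and `currentDevice` is a single variable standing in for CUDA's per-thread current device so that the example runs without a GPU. Unlike whatever the real `RunOnDevice` does, this sketch simply blocks until the function returns.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentDevice is a stand-in (assumption) for the CUDA runtime's
// per-thread "current device"; a real binding would query the CUDA runtime.
var currentDevice = 0

// getDevice and setDevice are hypothetical helpers for this sketch.
func getDevice() int   { return currentDevice }
func setDevice(id int) { currentDevice = id }

// runOnDevice pins a goroutine to one host thread, switches that thread
// to deviceID, runs fn, and restores the previous device before the
// thread is released back to the Go scheduler.
func runOnDevice(deviceID int, fn func()) {
	done := make(chan struct{})
	go func() {
		defer close(done)

		// Keep this goroutine on a single OS thread while fn runs.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()

		// Remember the device the thread started with and restore it
		// on exit. Deferred calls run last-in first-out, so the restore
		// happens before the thread is unlocked.
		original := getDevice()
		setDevice(deviceID)
		defer setDevice(original)

		fn()
	}()
	<-done // wait so the caller observes fn's effects
}

func main() {
	runOnDevice(1, func() {
		fmt.Println("inside:", getDevice())
	})
	fmt.Println("after:", getDevice())
}
```

Restoring the device before unlocking is the point of the fix in #501: a goroutine that is later scheduled onto the same host thread should see the device the thread had before it was locked, not the one left behind by the multi-GPU call.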