Buffer Creation and Usage¶
In vkdispatch, nearly all data is stored inside “buffers” (each wrapping an individual VkBuffer object along with all the other objects needed to manage it). These are the equivalent of torch.Tensor in PyTorch or wp.array in warp-lang.
However, unlike torch.Tensor or wp.array, vkdispatch buffers are multi-device by default. This means that when a vkdispatch buffer is allocated on a multi-device or multi-queue context, multiple VkBuffers are allocated (one for each queue on each device). This architecture greatly simplifies multi-GPU programs, since every buffer can be assumed to exist on all devices and all queues.
To allocate a buffer, either use the constructor of the Buffer class directly or call the vd.asbuffer function to upload a numpy array to the GPU as a buffer.
Simple GPU Buffer Example¶
import vkdispatch as vd
import numpy as np
# Create a simple numpy array
cpu_data = np.arange(16, dtype=np.int32)
print(f"Original CPU data: {cpu_data}")
# Create a GPU buffer
gpu_buffer = vd.asbuffer(cpu_data)
# Read data back from GPU to CPU to verify
downloaded_data = gpu_buffer.read(0)
print(f"Data downloaded from GPU: {downloaded_data.flatten()}")
# Expected Output:
# Original CPU data: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
# Data downloaded from GPU: [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
What’s happening here?
- We import vkdispatch and numpy (a common dependency for numerical data).
- We use the vd.asbuffer function to upload the numpy array to a vkdispatch buffer.
- read(0) retrieves the data back from the GPU to the CPU. The number provided as an argument is the queue index to read from. For a simple context with one device and one queue, there is only one queue, so we read from index 0. If the index is omitted, the function returns a python list of the contents of all buffers on all queues and devices.
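To make the multi-queue behaviour of read concrete, the numpy-only sketch below models what the per-queue copies look like. This is not vkdispatch code: the queue count of 4 and the per_queue_copies list are arbitrary stand-ins for the per-queue VkBuffers that the context manages internally.

```python
import numpy as np

# Host data that would be uploaded to every per-queue VkBuffer.
cpu_data = np.arange(16, dtype=np.int32)

# Pretend the context spans 2 devices with 2 queues each (4 queues total).
num_queues = 4

# Each queue holds its own copy of the buffer, so reading without an
# index yields one array per queue.
per_queue_copies = [cpu_data.copy() for _ in range(num_queues)]

# read(0) corresponds to picking the copy held by queue 0.
assert np.array_equal(per_queue_copies[0], cpu_data)

# Stacking the copies adds a leading dimension of size num_queues.
stacked = np.stack(per_queue_copies)
print(stacked.shape)  # (4, 16)
```

Because every queue received the same upload, all four copies are identical here; they only diverge once per-queue compute writes into them.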
Buffer API Reference¶
- class vkdispatch.Buffer(shape: Tuple[int, ...], var_type: dtype, external_buffer: ExternalBufferInfo = None)¶
Represents a contiguous block of memory on the GPU (or shared across multiple devices).
Buffers are the primary mechanism for transferring data between the host (CPU) and the device (GPU). They are typed using vkdispatch.dtype and support multi-dimensional shapes, similar to NumPy arrays.
- Parameters:
shape (Tuple[int, ...]) – The dimensions of the buffer. Must be a tuple of 1, 2, or 3 integers.
var_type (vkdispatch.base.dtype.dtype) – The data type of the elements stored in the buffer.
- Raises:
ValueError – If the shape has more than 3 dimensions or if the requested size exceeds 2^30 elements.
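The shape and size constraints above can be sketched as a small validation helper. This is a hypothetical function written for illustration; the real checks live inside the Buffer constructor.

```python
import math

def validate_buffer_shape(shape):
    # Hypothetical sketch of the documented Buffer shape constraints
    # (not the actual vkdispatch implementation).
    if not isinstance(shape, tuple) or not (1 <= len(shape) <= 3):
        raise ValueError("shape must be a tuple of 1, 2, or 3 integers")
    if math.prod(shape) > 2**30:
        raise ValueError("requested size exceeds 2**30 elements")
    return True

print(validate_buffer_shape((16,)))        # a valid 1-D shape
print(validate_buffer_shape((64, 64, 4)))  # a valid 3-D shape
```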
- _destroy() None¶
Destroy the buffer and all child handles.
- read(index: int | None = None)¶
Downloads data from the GPU buffer to the host.
- Parameters:
index (Union[int, None]) – The device index to read from. If None, reads from all devices and returns a stacked array with an extra dimension for the device index.
- Returns:
A host array representation containing the buffer data.
- Raises:
ValueError – If the specified index is invalid.
- write(data: bytes | bytearray | memoryview | Any, index: int = None) None¶
Uploads data from the host to the GPU buffer.
If index is None, the data is broadcast to the memory of all active devices in the context. Otherwise, it writes only to the device specified by the index.
- Parameters:
data (Union[bytes, bytearray, memoryview, Any]) – The source data. Can be a bytes-like object or an array-like object.
index (Union[int, None]) – The device index to write to. Defaults to None (all devices).
- Raises:
ValueError – If the data size exceeds the buffer size or if the index is invalid.
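The broadcast-versus-indexed behaviour of write can be sketched in plain numpy. This is illustrative only: device_copies stands in for the per-queue VkBuffers, and the write helper below is a hypothetical model of the semantics described above, not the vkdispatch implementation.

```python
import numpy as np

num_queues = 3
# Per-queue device copies, initially zeroed.
device_copies = [np.zeros(8, dtype=np.int32) for _ in range(num_queues)]

def write(data, index=None):
    # Model of Buffer.write: broadcast to every queue, or write to one.
    if index is None:
        for copy in device_copies:
            copy[:] = data
    else:
        device_copies[index][:] = data

# Broadcast write: every per-queue copy receives the same data.
write(np.arange(8, dtype=np.int32))
assert all(np.array_equal(c, np.arange(8)) for c in device_copies)

# Indexed write: only queue 1 is updated; the others keep their contents.
write(np.full(8, -1, dtype=np.int32), index=1)
assert np.array_equal(device_copies[1], np.full(8, -1))
assert np.array_equal(device_copies[0], np.arange(8))
```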