DeviceMemory

DeviceMemory

RAII class for a CUDA memory buffer.

#include <DeviceMemory.h>

Public Functions

Name
DeviceMemory() =default
Default constructor, creates an empty object, without allocating any memory on the device.
DeviceMemory(size_t numberOfBytes)
Constructor allocating the required number of bytes.
DeviceMemory(size_t numberOfBytes, cudaStream_t cudaStream)
Constructor allocating the required number of bytes by making an async allocation. The allocation will be put in the given CUDA stream.
DeviceMemory(void * deviceMemory)
Constructor taking ownership of an already allocated CUDA memory pointer. This class will free the pointer once it goes out of scope.
boolallocateMemory(size_t numberOfBytes)
Allocates device memory. The memory allocated will automatically be freed by the destructor.
boolallocateMemoryAsync(size_t numberOfBytes, cudaStream_t cudaStream)
Allocates device memory. The memory allocated will automatically be freed by the destructor.
boolreallocateMemory(size_t numberOfBytes)
Reallocates device memory. Already existing memory allocation will be freed before the new allocation is made. In case this DeviceMemory has no earlier memory allocation, this method will just allocate new CUDA memory and return a success status.
boolreallocateMemoryAsync(size_t numberOfBytes, cudaStream_t cudaStream)
Asynchronously reallocate device memory. Already existing memory allocation will be freed before the new allocation is made. In case this DeviceMemory has no earlier memory allocation, this method will just allocate new CUDA memory and return a success status.
boolallocateAndResetMemory(size_t numberOfBytes)
Allocates device memory and resets all bytes to zeroes. The memory allocated will automatically be freed by the destructor.
boolallocateAndResetMemoryAsync(size_t numberOfBytes, cudaStream_t cudaStream)
Allocates device memory and resets all bytes to zeroes. The memory allocated will automatically be freed by the destructor.
boolfreeMemory()
Free the device memory held by this class. Calling this when no memory is allocated is a no-op.
boolfreeMemoryAsync(cudaStream_t cudaStream)
Deallocate memory asynchronously, in a given CUDA stream. Calling this when no memory is allocated is a no-op.
voidsetFreeingCudaStream(cudaStream_t cudaStream)
Set which CUDA stream to use for freeing this DeviceMemory. In case the DeviceMemory already holds a CUDA stream to use for freeing the memory, this will be overwritten.
~DeviceMemory()
Destructor, frees the internal CUDA memory.
template
T *
getDevicePointer() const
size_tgetSize() const
DeviceMemory(DeviceMemory && other)
DeviceMemory &operator=(DeviceMemory && other)
voidswap(DeviceMemory & other)
DeviceMemory(DeviceMemory const & ) =delete
DeviceMemory is not copyable.
DeviceMemoryoperator=(DeviceMemory const & ) =delete

Public Functions Documentation

function DeviceMemory

DeviceMemory() =default

Default constructor, creates an empty object, without allocating any memory on the device.

function DeviceMemory

explicit DeviceMemory(
    size_t numberOfBytes
)

Constructor allocating the required number of bytes.

Parameters:

  • numberOfBytes Number of bytes to allocate

Exceptions:

  • std::runtime_error In case the allocation failed

function DeviceMemory

explicit DeviceMemory(
    size_t numberOfBytes,
    cudaStream_t cudaStream
)

Constructor allocating the required number of bytes by making an async allocation. The allocation will be put in the given CUDA stream.

Parameters:

  • size Number of bytes to allocate
  • cudaStream The CUDA stream to put the async allocation in. A reference to this stream will also be saved internally to be used to asynchronously free them memory when the instance goes out of scope.

Exceptions:

  • std::runtime_error In case the async allocation failed to be put in queue.

See: freeMemoryAsync method is not explicitly called, the memory will be freed synchronously when this DeviceMemory instance goes out of scope, meaning that the entire GPU is synchronized, which will impact performance negatively.

Note:

  • The method will return as soon as the allocation is put in queue in the CUDA stream, i.e. before the actual allocation is made. Using this DeviceMemory is only valid as long as it is used in the same CUDA stream, or in case another stream is used, only if that stream is synchronized first with respect to cudaStream.
  • In case the

function DeviceMemory

explicit DeviceMemory(
    void * deviceMemory
)

Constructor taking ownership of an already allocated CUDA memory pointer. This class will free the pointer once it goes out of scope.

Parameters:

  • deviceMemory CUDA memory pointer to take ownership over.

function allocateMemory

bool allocateMemory(
    size_t numberOfBytes
)

Allocates device memory. The memory allocated will automatically be freed by the destructor.

Parameters:

  • numberOfBytes Number of bytes to allocate

Return: True on success, false if there is already memory allocated by this instance, or if the CUDA malloc failed.

function allocateMemoryAsync

bool allocateMemoryAsync(
    size_t numberOfBytes,
    cudaStream_t cudaStream
)

Allocates device memory. The memory allocated will automatically be freed by the destructor.

Parameters:

  • numberOfBytes Number of bytes to allocate
  • cudaStream The CUDA stream to use for the allocation. A reference to this stream will also be saved internally to be used to asynchronously free them memory when the instance goes out of scope.

Return: True on success, false if there is already memory allocated by this instance, or if the CUDA malloc failed.

function reallocateMemory

bool reallocateMemory(
    size_t numberOfBytes
)

Reallocates device memory. Already existing memory allocation will be freed before the new allocation is made. In case this DeviceMemory has no earlier memory allocation, this method will just allocate new CUDA memory and return a success status.

Parameters:

  • numberOfBytes Number of bytes to allocate in the new allocation

Return: True on success, false if CUDA free or CUDA malloc failed.

function reallocateMemoryAsync

bool reallocateMemoryAsync(
    size_t numberOfBytes,
    cudaStream_t cudaStream
)

Asynchronously reallocate device memory. Already existing memory allocation will be freed before the new allocation is made. In case this DeviceMemory has no earlier memory allocation, this method will just allocate new CUDA memory and return a success status.

Parameters:

  • numberOfBytes Number of bytes to allocate in the new allocation
  • cudaStream The CUDA stream to use for the allocation and freeing of memory. A reference to this stream will also be saved internally to be used to asynchronously free them memory when this instance goes out of scope.

Return: True on success, false if CUDA free or CUDA malloc failed.

function allocateAndResetMemory

bool allocateAndResetMemory(
    size_t numberOfBytes
)

Allocates device memory and resets all bytes to zeroes. The memory allocated will automatically be freed by the destructor.

Parameters:

  • numberOfBytes Number of bytes to allocate

Return: True on success, false if there is already memory allocated by this instance, or if any of the CUDA operations failed.

function allocateAndResetMemoryAsync

bool allocateAndResetMemoryAsync(
    size_t numberOfBytes,
    cudaStream_t cudaStream
)

Allocates device memory and resets all bytes to zeroes. The memory allocated will automatically be freed by the destructor.

Parameters:

  • numberOfBytes Number of bytes to allocate
  • cudaStream The CUDA stream to use for the allocation and resetting. A reference to this stream will also be saved internally to be used to asynchronously free them memory when the instance goes out of scope.

Return: True on success, false if there is already memory allocated by this instance, or if any of the CUDA operations failed.

function freeMemory

bool freeMemory()

Free the device memory held by this class. Calling this when no memory is allocated is a no-op.

See: freeMemoryAsync instead.

Return: True in case the memory was successfully freed (or not allocated to begin with), false otherwise.

Note: This method will free the memory in an synchronous fashion, synchronizing the entire CUDA context and ignoring the internally saved CUDA stream reference in case one exist. For async freeing of the memory, use

function freeMemoryAsync

bool freeMemoryAsync(
    cudaStream_t cudaStream
)

Deallocate memory asynchronously, in a given CUDA stream. Calling this when no memory is allocated is a no-op.

Parameters:

  • cudaStream The CUDA stream to free the memory asynchronously in

Return: True in case the memory deallocation request was successfully put in queue in the CUDA stream.

Note:

  • The method will return as soon as the deallocation is put in queue in the CUDA stream, i.e. before the actual deallocation is made.
  • It is the programmer’s responsibility to ensure this DeviceMemory is not used in another CUDA stream before this method is called. In case it is used in another CUDA stream, sufficient synchronization must be made before calling this method (and a very good reason given for not freeing the memory in that CUDA stream instead)

function setFreeingCudaStream

void setFreeingCudaStream(
    cudaStream_t cudaStream
)

Set which CUDA stream to use for freeing this DeviceMemory. In case the DeviceMemory already holds a CUDA stream to use for freeing the memory, this will be overwritten.

Parameters:

  • cudaStream The new CUDA stream to use for freeing the memory when the destructor is called.

Note: It is the programmer’s responsibility to ensure this DeviceMemory is not used in another CUDA stream before this instance is destructed. In case it is used in another CUDA stream, sufficient synchronization must be made before setting this as the new CUDA stream to use when freeing the memory.

function ~DeviceMemory

~DeviceMemory()

Destructor, frees the internal CUDA memory.

function getDevicePointer

template <typename T  =uint8_t>
inline T * getDevicePointer() const

Template Parameters:

  • T The pointer type to return

Return: the CUDA memory pointer handled by this class. Nullptr in case no memory is allocated.

function getSize

size_t getSize() const

Return: The size of the CUDA memory allocation held by this class.

function DeviceMemory

DeviceMemory(
    DeviceMemory && other
)

function operator=

DeviceMemory & operator=(
    DeviceMemory && other
)

function swap

void swap(
    DeviceMemory & other
)

function DeviceMemory

DeviceMemory(
    DeviceMemory const & 
) =delete

DeviceMemory is not copyable.

function operator=

DeviceMemory operator=(
    DeviceMemory const & 
) =delete

Updated on 2024-01-25 at 12:02:05 +0100