StarPU Handbook
Macros | |
#define | STARPU_MALLOC_PINNED |
#define | STARPU_MALLOC_COUNT |
#define | STARPU_MALLOC_NORECLAIM |
#define | STARPU_MEMORY_WAIT |
#define | STARPU_MEMORY_OVERFLOW |
#define | STARPU_MALLOC_SIMULATION_FOLDED |
#define | STARPU_MALLOC_SIMULATION_UNIQUE |
#define | starpu_data_malloc_pinned_if_possible |
#define | starpu_data_free_pinned_if_possible |
Typedefs | |
typedef int(* | starpu_malloc_hook) (unsigned dst_node, void **A, size_t dim, int flags) |
typedef int(* | starpu_free_hook) (unsigned dst_node, void *A, size_t dim, int flags) |
Functions | |
void | starpu_malloc_set_align (size_t align) |
int | starpu_malloc (void **A, size_t dim) |
int | starpu_free (void *A) |
int | starpu_malloc_flags (void **A, size_t dim, int flags) |
int | starpu_free_flags (void *A, size_t dim, int flags) |
int | starpu_free_noflag (void *A, size_t dim) |
void | starpu_malloc_set_hooks (starpu_malloc_hook malloc_hook, starpu_free_hook free_hook) |
int | starpu_memory_pin (void *addr, size_t size) |
int | starpu_memory_unpin (void *addr, size_t size) |
starpu_ssize_t | starpu_memory_get_total (unsigned node) |
starpu_ssize_t | starpu_memory_get_available (unsigned node) |
size_t | starpu_memory_get_used (unsigned node) |
starpu_ssize_t | starpu_memory_get_total_all_nodes (void) |
starpu_ssize_t | starpu_memory_get_available_all_nodes (void) |
size_t | starpu_memory_get_used_all_nodes (void) |
int | starpu_memory_allocate (unsigned node, size_t size, int flags) |
void | starpu_memory_deallocate (unsigned node, size_t size) |
void | starpu_memory_wait_available (unsigned node, size_t size) |
void | starpu_sleep (float nb_sec) |
void | starpu_usleep (float nb_micro_sec) |
void | starpu_energy_use (float joules) |
double | starpu_energy_used (void) |
#define STARPU_MALLOC_PINNED |
Value passed to the function starpu_malloc_flags() to indicate the memory allocation should be pinned.
#define STARPU_MALLOC_COUNT |
Value passed to the function starpu_malloc_flags() to indicate that the memory allocation should stay within the limit defined by the environment variables STARPU_LIMIT_CUDA_devid_MEM, STARPU_LIMIT_CUDA_MEM, STARPU_LIMIT_OPENCL_devid_MEM, STARPU_LIMIT_OPENCL_MEM, STARPU_LIMIT_HIP_MEM, STARPU_LIMIT_HIP_devid_MEM and STARPU_LIMIT_CPU_MEM (see Section How to Limit Memory Used By StarPU And Cache Buffer Allocations). If no memory is available, starpu_malloc_flags() tries to reclaim memory from StarPU. Memory allocated this way must be freed by calling the function starpu_free_flags() with the same flag.
#define STARPU_MALLOC_NORECLAIM |
Value passed to the function starpu_malloc_flags() along with STARPU_MALLOC_COUNT to indicate that, while the memory allocation should be kept within the limits defined for STARPU_MALLOC_COUNT, no reclaiming should be performed by starpu_malloc_flags() itself, so the memory node may overflow slightly. StarPU will reclaim memory after the next task termination, according to the STARPU_MINIMUM_AVAILABLE_MEM, STARPU_TARGET_AVAILABLE_MEM, STARPU_MINIMUM_CLEAN_BUFFERS, and STARPU_TARGET_CLEAN_BUFFERS environment variables. If STARPU_MEMORY_WAIT is set, no overflow will happen: starpu_malloc_flags() will wait for other eviction mechanisms to release enough memory.
#define STARPU_MEMORY_WAIT |
Value passed to starpu_memory_allocate() to specify that the function should wait for the requested amount of memory to become available, and atomically allocate it.
#define STARPU_MEMORY_OVERFLOW |
Value passed to starpu_memory_allocate() to specify that the function should allocate the amount of memory, even if that means overflowing the total size of the memory node.
#define STARPU_MALLOC_SIMULATION_FOLDED |
Value passed to the function starpu_malloc_flags() to indicate that, when StarPU is using SimGrid, the allocation can be "folded", i.e. a memory area is allocated, but its content is actually a replica of the same memory area, to avoid having to actually allocate that much memory. This provides a memory area that does not actually consume memory, which one can read from and write to normally, but which yields bogus values.
#define STARPU_MALLOC_SIMULATION_UNIQUE |
Value passed to the function starpu_malloc_flags() to indicate that, when StarPU is using SimGrid, the allocation for that size can be unique. Unlike with STARPU_MALLOC_SIMULATION_FOLDED alone, the same address will be given for all mallocs of that particular size.
#define starpu_data_malloc_pinned_if_possible |
#define starpu_data_free_pinned_if_possible |
void starpu_malloc_set_align (size_t align)
Set an alignment constraint for starpu_malloc() allocations. align must be a power of two. This is for instance called automatically by the OpenCL driver to specify its own alignment constraints. See Data Management Allocation for more details.
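The power-of-two requirement on align can be checked on the application side before the call. The following is a minimal sketch in plain C; is_power_of_two() and checked_alignment() are hypothetical helper names introduced here for illustration and are not part of the StarPU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of StarPU): true when align is a non-zero
 * power of two. A power of two has exactly one bit set, so clearing its
 * lowest set bit with align & (align - 1) yields zero. */
bool is_power_of_two(size_t align)
{
    return align != 0 && (align & (align - 1)) == 0;
}

/* Hypothetical helper: abort early on an invalid alignment, otherwise
 * return it unchanged so that it can be passed on directly. */
size_t checked_alignment(size_t align)
{
    assert(is_power_of_two(align));
    return align;
}
```

An application could then write starpu_malloc_set_align(checked_alignment(64)); so that an invalid value is caught before it reaches StarPU.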
int starpu_malloc (void **A, size_t dim)
Allocate data of the given size dim in main memory, and return the pointer to the allocated data through A. It will also try to pin it in CUDA or OpenCL, so that data transfers from this buffer can be asynchronous, and thus permit data transfer and computation overlapping. The allocated buffer must be freed with the starpu_free_noflag() function. See Data Management Allocation for more details.
int starpu_free (void *A)
int starpu_malloc_flags (void **A, size_t dim, int flags)
Perform a memory allocation based on the constraints defined by the given flags. See How to Limit Memory Used By StarPU And Cache Buffer Allocations for more details.
int starpu_free_flags (void *A, size_t dim, int flags)
Free memory by specifying its size. The given flags should be consistent with the ones given to starpu_malloc_flags() when allocating the memory. See How to Limit Memory Used By StarPU And Cache Buffer Allocations for more details.
int starpu_free_noflag (void *A, size_t dim)
Free memory by specifying its size. Should be used for memory allocated with starpu_malloc(). See Data Management Allocation for more details.
void starpu_malloc_set_hooks (starpu_malloc_hook malloc_hook, starpu_free_hook free_hook)
Set allocation functions to be used by StarPU. By default, StarPU will use malloc() (or cudaHostAlloc() if CUDA GPUs are used) for all its data handle allocations. The application can specify another allocation primitive by calling this function. The malloc_hook should pass the allocated pointer through the A parameter, and return 0 on success. On allocation failure, it should return -ENOMEM. The flags parameter contains STARPU_MALLOC_PINNED if the memory should be pinned by the hook for GPU transfer efficiency. The hook can use starpu_memory_pin() to achieve this. The dst_node parameter is the StarPU memory node; it can be converted to an hwloc logical id with starpu_memory_nodes_numa_id_to_hwloclogid() or to an OS NUMA number with starpu_memory_nodes_numa_devid_to_id(). See Data Management Allocation for more details.
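As an illustration of the hook contract described above, here is a minimal sketch of a hook pair that hands out 64-byte aligned buffers. It is written in plain C so that it compiles without the StarPU headers: the two typedefs are copied from this page, while example_malloc_hook(), example_free_hook() and EXAMPLE_ALIGNMENT are hypothetical names. The sketch ignores dst_node and flags, and returning 0 from the free hook is an assumption made by symmetry with the malloc hook, since the return convention of the free hook is not described here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hook signatures as listed on this page. In a real program, include
 * <starpu.h> instead of redeclaring them. */
typedef int (*starpu_malloc_hook)(unsigned dst_node, void **A, size_t dim, int flags);
typedef int (*starpu_free_hook)(unsigned dst_node, void *A, size_t dim, int flags);

#define EXAMPLE_ALIGNMENT ((size_t)64) /* assumption: cache-line sized alignment */

/* Hypothetical malloc hook: returns 0 and stores the buffer in *A on
 * success, -ENOMEM on failure, as the description above requires. */
int example_malloc_hook(unsigned dst_node, void **A, size_t dim, int flags)
{
    (void)dst_node; /* a NUMA-aware hook would select the target node here */
    (void)flags;    /* a real hook would test STARPU_MALLOC_PINNED and pin the buffer */
    assert(A != NULL);

    /* aligned_alloc() wants a size that is a multiple of the alignment;
     * refuse sizes whose rounding would overflow size_t. */
    if (dim > SIZE_MAX - EXAMPLE_ALIGNMENT)
        return -ENOMEM;
    size_t rounded = (dim + EXAMPLE_ALIGNMENT - 1) / EXAMPLE_ALIGNMENT * EXAMPLE_ALIGNMENT;
    if (rounded == 0)
        rounded = EXAMPLE_ALIGNMENT;

    void *ptr = aligned_alloc(EXAMPLE_ALIGNMENT, rounded);
    if (ptr == NULL)
        return -ENOMEM;
    *A = ptr;
    return 0;
}

/* Hypothetical free hook: releases a buffer obtained from example_malloc_hook(). */
int example_free_hook(unsigned dst_node, void *A, size_t dim, int flags)
{
    (void)dst_node;
    (void)dim;
    (void)flags;
    free(A);
    return 0; /* assumption: 0 signals success, mirroring the malloc hook */
}
```

With StarPU available, the pair would be registered once, before any data is allocated, with starpu_malloc_set_hooks(example_malloc_hook, example_free_hook);.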
int starpu_memory_pin (void *addr, size_t size)
Pin the given memory area, so that CPU-GPU transfers can be done asynchronously with DMAs. The memory must be unpinned with starpu_memory_unpin() before being freed. Return 0 on success, -1 on error. See Data Management Allocation for more details.
int starpu_memory_unpin (void *addr, size_t size)
Unpin the given memory area previously pinned with starpu_memory_pin(). Return 0 on success, -1 on error. See Data Management Allocation for more details.
starpu_ssize_t starpu_memory_get_total (unsigned node)
If a memory limit is defined on the given node, return the total amount of memory on the node. Otherwise return -1. See How to Limit Memory Used By StarPU And Cache Buffer Allocations for more details.
starpu_ssize_t starpu_memory_get_available (unsigned node)
If a memory limit is defined on the given node, return the amount of available memory on the node. Otherwise return -1. See How to Limit Memory Used By StarPU And Cache Buffer Allocations for more details.
size_t starpu_memory_get_used (unsigned node)
Return the amount of used memory on the node. See Data Management Allocation for more details.
starpu_ssize_t starpu_memory_get_total_all_nodes (void)
Return the total amount of memory on all memory nodes for which a memory limit is defined (see Section Data Management Allocation).
starpu_ssize_t starpu_memory_get_available_all_nodes (void)
Return the amount of available memory on all memory nodes for which a memory limit is defined (see Section Data Management Allocation).
size_t starpu_memory_get_used_all_nodes (void)
Return the amount of used memory on all memory nodes. See Data Management Allocation for more details.
int starpu_memory_allocate (unsigned node, size_t size, int flags)
If a memory limit is defined on the given node (see Section How to Limit Memory Used By StarPU And Cache Buffer Allocations), try to allocate some of it. This does not actually allocate memory, but only accounts for it. This can be useful when the application allocates data another way, but wants StarPU to be aware of the allocation size, e.g. for memory reclaiming. By default, return -ENOMEM if there is not enough room on the given node. flags can be either STARPU_MEMORY_WAIT or STARPU_MEMORY_OVERFLOW to change this. See How to Limit Memory Used By StarPU And Cache Buffer Allocations for more details.
void starpu_memory_deallocate (unsigned node, size_t size)
If a memory limit is defined on the given node (see Section How to Limit Memory Used By StarPU And Cache Buffer Allocations), free some of it. This does not actually free memory, but only accounts for it, like starpu_memory_allocate(). The amount does not have to be exactly the same as what was passed to starpu_memory_allocate(); only the total amount eventually deallocated needs to match, i.e. one call to starpu_memory_allocate() can be followed by several calls to starpu_memory_deallocate() to declare the deallocation piece by piece. See How to Limit Memory Used By StarPU And Cache Buffer Allocations for more details.
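The accounting-only behaviour of starpu_memory_allocate() and starpu_memory_deallocate() can be pictured with a small stand-alone model. The sketch below is plain C and is not StarPU's implementation: struct node_account, model_allocate(), model_deallocate() and MODEL_MEMORY_OVERFLOW are hypothetical names, and the flag value is a placeholder rather than the real value of STARPU_MEMORY_OVERFLOW. The blocking behaviour of STARPU_MEMORY_WAIT is left out of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder flag for the model; the real STARPU_MEMORY_OVERFLOW value
 * is defined by the StarPU headers. */
enum { MODEL_MEMORY_OVERFLOW = 1 };

/* Hypothetical per-node bookkeeping: no memory is ever allocated, only counted. */
struct node_account {
    size_t limit; /* memory limit defined on the node, in bytes */
    size_t used;  /* bytes currently accounted for */
};

/* Account for size bytes. Without the overflow flag, fail with -ENOMEM
 * when the request does not fit under the limit. */
int model_allocate(struct node_account *n, size_t size, int flags)
{
    assert(n != NULL);
    int fits = n->used <= n->limit && size <= n->limit - n->used;
    if (!fits && !(flags & MODEL_MEMORY_OVERFLOW))
        return -ENOMEM;
    n->used += size;
    return 0;
}

/* Release size bytes from the account. Calls need not mirror the earlier
 * allocations one to one, as long as the totals match in the end. */
void model_deallocate(struct node_account *n, size_t size)
{
    assert(n != NULL && size <= n->used);
    n->used -= size;
}
```

In this model, a node with a limit of 1000 bytes accepts a first 600-byte request, rejects a second one with -ENOMEM, and accepts it when the overflow flag is given; the resulting 1200 bytes can then be released in pieces of any size.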
void starpu_memory_wait_available (unsigned node, size_t size)
If a memory limit is defined on the given node (see Section How to Limit Memory Used By StarPU And Cache Buffer Allocations), this will wait for size bytes to become available on node. Since another thread may be allocating memory concurrently, this does not guarantee that this amount is still available when the function returns, only that this level of availability was reached at some point. To atomically wait for some amount of memory and reserve it, starpu_memory_allocate() should be used with the STARPU_MEMORY_WAIT flag. See How to Limit Memory Used By StarPU And Cache Buffer Allocations for more details.
void starpu_sleep (float nb_sec)
Sleep for the given nb_sec seconds. Similar to calling the Unix sleep function, except that it takes a float to allow sub-second sleeping, and when StarPU is compiled in SimGrid mode it does not really sleep but just makes SimGrid record that the thread has taken some time to sleep. See Helpers for more details.
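To make the sub-second point concrete: the Unix sleep function only accepts whole seconds, so a float duration has to be split into a seconds part and a nanoseconds part before it can be handed to something like POSIX nanosleep(). The sketch below shows that conversion in plain C. seconds_to_timespec() is a hypothetical helper, and whether StarPU performs the conversion this way internally is not stated on this page.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper (not part of StarPU): split a non-negative duration
 * in seconds into whole seconds and remaining nanoseconds. */
struct timespec seconds_to_timespec(float nb_sec)
{
    assert(nb_sec >= 0.0f);

    struct timespec ts;
    ts.tv_sec = (time_t)nb_sec; /* truncates toward zero */
    /* The fractional part is computed in double to limit rounding error. */
    ts.tv_nsec = (long)(((double)nb_sec - (double)ts.tv_sec) * 1e9);
    return ts;
}
```

The resulting struct timespec could be passed to nanosleep() on a POSIX system; with StarPU, calling starpu_sleep(0.25f) directly avoids the conversion and also covers the SimGrid case described above.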
void starpu_usleep (float nb_micro_sec)
Sleep for the given nb_micro_sec microseconds. In SimGrid mode, this only sleeps within virtual time. See Helpers for more details.
void starpu_energy_use (float joules)
Account for joules J being used. This is supported in SimGrid mode, to record how much energy was used, and will show up in subsequent calls to starpu_energy_used(). See Energy-based Scheduling for more details.
double starpu_energy_used (void)
Return the amount of energy used so far, in J. This accounts for the amounts passed to starpu_energy_use(), as well as the static energy use set by the STARPU_IDLE_POWER environment variable. See Energy-based Scheduling for more details.