StarPU Handbook
Data Structures | |
struct | starpu_perfmodel |
struct | starpu_perfmodel_regression_model |
struct | starpu_perfmodel_per_arch |
struct | starpu_perfmodel_history_list |
struct | starpu_perfmodel_history_entry |
Functions | |
void | starpu_perfmodel_free_sampling_directories (void) |
int | starpu_perfmodel_load_symbol (const char *symbol, struct starpu_perfmodel *model) |
int | starpu_perfmodel_unload_model (struct starpu_perfmodel *model) |
void | starpu_perfmodel_debugfilepath (struct starpu_perfmodel *model, enum starpu_perfmodel_archtype arch, char *path, size_t maxlen, unsigned nimpl) |
void | starpu_perfmodel_get_arch_name (enum starpu_perfmodel_archtype arch, char *archname, size_t maxlen, unsigned nimpl) |
enum starpu_perfmodel_archtype | starpu_worker_get_perf_archtype (int workerid) |
int | starpu_perfmodel_list (FILE *output) |
void | starpu_perfmodel_directory (FILE *output) |
void | starpu_perfmodel_print (struct starpu_perfmodel *model, enum starpu_perfmodel_archtype arch, unsigned nimpl, char *parameter, uint32_t *footprint, FILE *output) |
int | starpu_perfmodel_print_all (struct starpu_perfmodel *model, char *arch, char *parameter, uint32_t *footprint, FILE *output) |
void | starpu_bus_print_bandwidth (FILE *f) |
void | starpu_bus_print_affinity (FILE *f) |
void | starpu_perfmodel_update_history (struct starpu_perfmodel *model, struct starpu_task *task, enum starpu_perfmodel_archtype arch, unsigned cpuid, unsigned nimpl, double measured) |
double | starpu_transfer_bandwidth (unsigned src_node, unsigned dst_node) |
double | starpu_transfer_latency (unsigned src_node, unsigned dst_node) |
double | starpu_transfer_predict (unsigned src_node, unsigned dst_node, size_t size) |
struct starpu_perfmodel |
Contains all information about a performance model. At least the type and symbol fields must be filled in when defining a performance model for a codelet. For compatibility, initialize the whole structure to zero, either with an explicit memset or by letting the compiler do it implicitly, e.g. by giving the structure static storage. Any field that is not provided must be zero.
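The two ways of zero-initializing the structure can be sketched as follows. So that it compiles without the StarPU headers, this sketch uses a simplified stand-in, `struct demo_perfmodel`, with invented `DEMO_*` enumerators and an invented `demo_perfmodel_init()` helper; none of these names are part of StarPU. The same pattern is assumed to carry over to the real `struct starpu_perfmodel`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct starpu_perfmodel: only a few fields,
 * so that the example compiles without the StarPU headers. */
enum demo_perfmodel_type { DEMO_COMMON = 1, DEMO_HISTORY_BASED };

struct demo_perfmodel {
    enum demo_perfmodel_type type;
    double (*cost_function)(void *task, unsigned nimpl);
    const char *symbol;
    unsigned is_loaded;
    unsigned benchmarking;
};

/* Static storage: members not named in the initializer are implicitly zero. */
static struct demo_perfmodel static_model = {
    .type = DEMO_HISTORY_BASED,
    .symbol = "my_kernel"
};

/* Automatic or heap storage: zero the whole structure explicitly,
 * then fill in the two mandatory fields. */
void demo_perfmodel_init(struct demo_perfmodel *model, const char *symbol)
{
    assert(model != NULL && symbol != NULL);
    memset(model, 0, sizeof(*model));
    model->type = DEMO_HISTORY_BASED;
    model->symbol = symbol;
}
```

In both cases only type and symbol are set explicitly, and every other field ends up zero.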
Data Fields | |
enum starpu_perfmodel_type | type |
double(* | cost_model )(struct starpu_data_descr *) |
double(* | cost_function )(struct starpu_task *, unsigned nimpl) |
size_t(* | size_base )(struct starpu_task *, unsigned nimpl) |
struct starpu_perfmodel_per_arch | per_arch [STARPU_NARCH_VARIATIONS][STARPU_MAXIMPLEMENTATIONS] |
const char * | symbol |
unsigned | is_loaded |
unsigned | benchmarking |
starpu_pthread_rwlock_t | model_rwlock |
starpu_perfmodel::type |
The type of performance model.
starpu_perfmodel::cost_model |
starpu_perfmodel::cost_function |
Used by STARPU_COMMON: takes a task and an implementation number, and must return an estimate of the task duration in microseconds.
starpu_perfmodel::size_base |
Used by STARPU_HISTORY_BASED, STARPU_REGRESSION_BASED and STARPU_NL_REGRESSION_BASED. If not NULL, takes a task and implementation number, and returns the size to be used as index for history and regression.
starpu_perfmodel::per_arch |
Used by STARPU_PER_ARCH: array of starpu_perfmodel_per_arch structures.
starpu_perfmodel::symbol |
The symbol name of the performance model, which is used as the name of the file in which the model is stored. It must be set, otherwise the model is ignored.
starpu_perfmodel::is_loaded |
Whether the performance model is already loaded from the disk.
starpu_perfmodel::benchmarking |
Whether the performance model is still being calibrated.
starpu_perfmodel::model_rwlock |
Lock to protect concurrency between loading from disk (W), updating the values (W), and making a performance estimation (R).
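The roles of the cost_function and size_base fields described above can be illustrated with a minimal sketch. It assumes a hypothetical `struct demo_task` in place of `struct starpu_task`, and the per-element rates are invented figures; only the shape of the two callbacks (task and implementation number in, microseconds or a size out) is taken from the descriptions above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct starpu_task: carries a vector length. */
struct demo_task {
    size_t nelems;
};

/* Plays the role of cost_function: returns an estimated duration in
 * microseconds. The per-element rates are invented for illustration. */
double demo_cost_function(struct demo_task *task, unsigned nimpl)
{
    assert(task != NULL);
    /* Assume implementation 1 is a tuned variant that runs twice as fast. */
    double us_per_elem = (nimpl == 1) ? 0.001 : 0.002;
    return (double)task->nelems * us_per_elem;
}

/* Plays the role of size_base: returns the size used as the index for
 * history and regression, here the vector footprint in bytes. */
size_t demo_size_base(struct demo_task *task, unsigned nimpl)
{
    (void)nimpl; /* same index for every implementation in this sketch */
    assert(task != NULL);
    return task->nelems * sizeof(float);
}
```

Returning a size in bytes from the size_base callback is a design choice of this sketch; any quantity that grows with the amount of work could serve as the index.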
struct starpu_perfmodel_regression_model |
...
struct starpu_perfmodel_per_arch |
Contains information about the performance model of a given architecture.
Data Fields | |
double(* | cost_model )(struct starpu_data_descr *t) |
double(* | cost_function )(struct starpu_task *task, enum starpu_perfmodel_archtype arch, unsigned nimpl) |
size_t(* | size_base )(struct starpu_task *, enum starpu_perfmodel_archtype arch, unsigned nimpl) |
struct starpu_perfmodel_history_table * | history |
struct starpu_perfmodel_history_list * | list |
struct starpu_perfmodel_regression_model | regression |
starpu_perfmodel_per_arch::cost_model |
starpu_perfmodel_per_arch::cost_function |
Used by STARPU_PER_ARCH. Must point to a function that takes a task, the target arch and the implementation number (merely as a convenience, since the array is already indexed by these), and returns an estimate of the task duration in microseconds.
starpu_perfmodel_per_arch::size_base |
Same as in structure starpu_perfmodel, but per-arch, in case it depends on the architecture-specific implementation.
starpu_perfmodel_per_arch::history |
The history of performance measurements.
starpu_perfmodel_per_arch::list |
Used by STARPU_HISTORY_BASED and STARPU_NL_REGRESSION_BASED. Records all execution history measurements.
starpu_perfmodel_per_arch::regression |
Used by STARPU_REGRESSION_BASED and STARPU_NL_REGRESSION_BASED, contains the estimated factors of the regression.
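Since the per_arch array is indexed by architecture and implementation number, an estimate amounts to a table lookup followed by a call through the stored function pointer. The sketch below shows that lookup with hypothetical stand-ins (`enum demo_arch`, `DEMO_MAXIMPL`, `struct demo_per_arch`, `demo_estimate()`); real code would use `enum starpu_perfmodel_archtype`, `STARPU_NARCH_VARIATIONS` and `STARPU_MAXIMPLEMENTATIONS`, and the rates are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the StarPU architecture enum and limits. */
enum demo_arch { DEMO_ARCH_CPU, DEMO_ARCH_GPU, DEMO_NARCH };
#define DEMO_MAXIMPL 2

struct demo_task { size_t nelems; };

typedef double (*demo_cost_fn)(struct demo_task *, enum demo_arch, unsigned);

struct demo_per_arch { demo_cost_fn cost_function; };

static double cpu_cost(struct demo_task *t, enum demo_arch a, unsigned nimpl)
{
    (void)a; (void)nimpl;
    return (double)t->nelems * 4.0;   /* invented rate, in microseconds */
}

static double gpu_cost(struct demo_task *t, enum demo_arch a, unsigned nimpl)
{
    (void)a; (void)nimpl;
    return (double)t->nelems * 0.5;   /* invented rate, in microseconds */
}

/* Table indexed by [arch][nimpl]; slots that are not set stay NULL. */
static struct demo_per_arch per_arch[DEMO_NARCH][DEMO_MAXIMPL] = {
    [DEMO_ARCH_CPU] = { [0] = { cpu_cost } },
    [DEMO_ARCH_GPU] = { [0] = { gpu_cost } },
};

/* Returns the estimate in microseconds, or a negative value when no
 * function is registered for that (arch, nimpl) pair. */
double demo_estimate(struct demo_task *t, enum demo_arch arch, unsigned nimpl)
{
    assert(t != NULL && arch < DEMO_NARCH && nimpl < DEMO_MAXIMPL);
    demo_cost_fn fn = per_arch[arch][nimpl].cost_function;
    return fn ? fn(t, arch, nimpl) : -1.0;
}
```

The callbacks still receive arch and nimpl, so a single function can serve several table slots if desired.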
struct starpu_perfmodel_history_list |
todo
Data Fields | |
struct starpu_perfmodel_history_list * | next | todo |
struct starpu_perfmodel_history_entry * | entry | todo |
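The next and entry fields give the structure the shape of a singly linked list whose nodes each point to one history entry. A minimal sketch of building and walking such a list follows. The `demo_*` types and functions are hypothetical stand-ins that mirror only the two documented fields, and the contents of `struct demo_history_entry` are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins mirroring only the two documented fields. */
struct demo_history_entry { double mean_us; };   /* contents are invented */

struct demo_history_list {
    struct demo_history_list *next;
    struct demo_history_entry *entry;
};

/* Prepends a caller-allocated node and returns the new head. */
struct demo_history_list *demo_history_push(struct demo_history_list *head,
                                            struct demo_history_list *node,
                                            struct demo_history_entry *entry)
{
    assert(node != NULL);
    node->entry = entry;
    node->next = head;
    return node;
}

/* Walks the list and counts the nodes that carry an entry. */
size_t demo_history_count(const struct demo_history_list *head)
{
    size_t n = 0;
    for (const struct demo_history_list *it = head; it != NULL; it = it->next)
        if (it->entry != NULL)
            n++;
    return n;
}
```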
struct starpu_perfmodel_history_entry |
Enumerates the various types of architectures.
It is possible to have multiple versions of the same kind of worker, for instance multiple GPUs or even different CPUs within the same machine, so the archtype enum type is not used directly for performance models.
TODO
void starpu_perfmodel_free_sampling_directories | ( | void | ) |
Frees the internal memory used for sampling directory management. It should only be called by an application that does not call starpu_shutdown(), since starpu_shutdown() already calls it. See for example tools/starpu_perfmodel_display.c.
int starpu_perfmodel_load_symbol | ( | const char * | symbol, |
struct starpu_perfmodel * | model | ||
) |
Loads a given performance model. The model structure has to be completely zeroed, and is filled with the information saved in $STARPU_HOME/.starpu. The function is intended to be used by external tools that need to read the performance model files.
int starpu_perfmodel_unload_model | ( | struct starpu_perfmodel * | model | ) |
Unloads the given model, which was previously loaded through the function starpu_perfmodel_load_symbol().
void starpu_perfmodel_debugfilepath | ( | struct starpu_perfmodel * | model, |
enum starpu_perfmodel_archtype | arch, | ||
char * | path, | ||
size_t | maxlen, | ||
unsigned | nimpl | ||
) |
Stores in path (a buffer of maxlen bytes) the path to the debugging information for the performance model.
void starpu_perfmodel_get_arch_name | ( | enum starpu_perfmodel_archtype | arch, |
char * | archname, | ||
size_t | maxlen, | ||
unsigned | nimpl | ||
) |
Stores in archname (a buffer of maxlen bytes) the architecture name for arch.
enum starpu_perfmodel_archtype starpu_worker_get_perf_archtype | ( | int | workerid | ) |
returns the architecture type of a given worker.
int starpu_perfmodel_list | ( | FILE * | output | ) |
prints a list of all performance models on output
void starpu_perfmodel_directory | ( | FILE * | output | ) |
prints the directory name storing performance models on output
void starpu_perfmodel_print | ( | struct starpu_perfmodel * | model, |
enum starpu_perfmodel_archtype | arch, | ||
unsigned | nimpl, | ||
char * | parameter, | ||
uint32_t * | footprint, | ||
FILE * | output | ||
) |
todo
int starpu_perfmodel_print_all | ( | struct starpu_perfmodel * | model, |
char * | arch, | ||
char * | parameter, | ||
uint32_t * | footprint, | ||
FILE * | output | ||
) |
todo
void starpu_bus_print_bandwidth | ( | FILE * | f | ) |
Prints a matrix of bus bandwidths on f.
void starpu_bus_print_affinity | ( | FILE * | f | ) |
Prints the affinity devices on f.
void starpu_perfmodel_update_history | ( | struct starpu_perfmodel * | model, |
struct starpu_task * | task, | ||
enum starpu_perfmodel_archtype | arch, | ||
unsigned | cpuid, | ||
unsigned | nimpl, | ||
double | measured | ||
) |
Feeds the performance model model with an explicit measurement measured (in µs), in addition to the measurements done by StarPU itself. This can be useful when the application already has a set of measurements taken in good conditions, which StarPU can benefit from instead of doing on-line measurements. An example of use can be seen in Performance Model Example.
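To illustrate what feeding explicit measurements into a history can look like, here is a minimal sketch that accumulates externally measured durations and derives a mean. It is not StarPU's implementation: `struct demo_history_stats`, `demo_update_history()` and `demo_history_mean()` are hypothetical, and the assumption that a history bucket keeps a sample count and a sum is made only for this example.

```c
#include <assert.h>

/* Hypothetical running statistics for one history bucket. */
struct demo_history_stats {
    unsigned nsample;
    double sum_us;   /* sum of measured durations, in microseconds */
};

/* Adds one externally measured duration (in microseconds). */
void demo_update_history(struct demo_history_stats *s, double measured_us)
{
    assert(s != 0 && measured_us >= 0.0);
    s->nsample++;
    s->sum_us += measured_us;
}

/* Mean duration in microseconds, or a negative value with no samples. */
double demo_history_mean(const struct demo_history_stats *s)
{
    assert(s != 0);
    return s->nsample ? s->sum_us / s->nsample : -1.0;
}
```

An application holding a table of trusted timings would call the update function once per recorded timing before relying on the mean.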
double starpu_transfer_bandwidth | ( | unsigned | src_node, |
unsigned | dst_node | ||
) |
Returns the bandwidth of data transfers between two memory nodes.
double starpu_transfer_latency | ( | unsigned | src_node, |
unsigned | dst_node | ||
) |
Returns the latency of data transfers between two memory nodes.
double starpu_transfer_predict | ( | unsigned | src_node, |
unsigned | dst_node, | ||
size_t | size | ||
) |
Returns the estimated time to transfer a given size between two memory nodes.
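A common first-order way to combine a latency and a bandwidth into a transfer-time estimate is latency + size / bandwidth. The sketch below implements that simple linear model. It is an assumption made for illustration, not the formula StarPU uses, and both `demo_transfer_predict()` and the units (microseconds, bytes per microsecond) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Linear transfer-time model.
 * Assumed units: latency in microseconds, bandwidth in bytes per
 * microsecond, size in bytes, result in microseconds. */
double demo_transfer_predict(double latency_us, double bandwidth_bytes_per_us,
                             size_t size)
{
    assert(latency_us >= 0.0 && bandwidth_bytes_per_us > 0.0);
    return latency_us + (double)size / bandwidth_bytes_per_us;
}
```

With a latency of 10 µs and a bandwidth of 1024 bytes/µs, a 4096-byte transfer is estimated at 10 + 4096 / 1024 = 14 µs under this model.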