TECA
The Toolkit for Extreme Climate Analysis
teca_thread_util.h
#ifndef teca_thread_utils_h
#define teca_thread_utils_h

/// @file

#include "teca_common.h"
#include "teca_mpi.h"
#include <deque>

/// Codes for dealing with threading
namespace teca_thread_util
{
/** Load balances threads across an MPI communication space such that, on
 * each node, every physical core receives the same number of threads.
 * This is an MPI collective call. Building the affinity map relies on
 * features available only in _GNU_SOURCE. On systems where these features are
 * unavailable, when automated detection of the number of threads is requested,
 * the call will fail and n_threads will be set to 1.
 *
 * @param[in] comm an MPI communication space to load balance threads across.
 *                 The communicator is used to coordinate affinity mapping such
 *                 that each rank can allocate a number of threads bound to
 *                 unique cores.
 *
 * @param[in] base_core_id identifies the core in use by this MPI rank's main
 *                         thread. If -1 is passed, this will be automatically
 *                         determined.
 *
 * @param[in] n_requested the number of requested threads per rank. Passing a
 *                        value of -1 results in use of all the cores on the
 *                        node such that each physical core is assigned exactly
 *                        1 thread. Note that for performance reasons
 *                        hyperthreads are not used here. The suggested number
 *                        of threads is returned in n_threads, and the returned
 *                        affinity map specifies which core each thread should
 *                        be bound to in order to achieve this. Passing
 *                        n_requested >= 1 specifies a run time override. This
 *                        indicates that the caller wants to use a specific
 *                        number of threads, rather than one per physical core.
 *                        In this case the affinity map is also constructed.
 *
 * @param[in] bind if true, extra work is done to determine an affinity map
 *                 such that each thread can be bound to a unique core on the
 *                 node.
 *
 * @param[in] verbose if true, prints a report describing the affinity map.
 *
 * @param[in,out] n_threads if n_requested is -1, this will be set to the
 *                          number of threads one can use such that there is
 *                          one thread per physical core, taking into account
 *                          all ranks running on the node. If n_requested is
 *                          >= 1, n_threads will be set to n_requested. This
 *                          allows a run time override for cases when the
 *                          caller knows how the threads should be scheduled.
 *                          If an error occurs and n_requested is -1, this will
 *                          be set to 1.
 *
 * @param[out] affinity an affinity map describing, for each of n_threads,
 *                      a core id that the thread can be bound to. If
 *                      n_requested is -1, the map will contain an entry for
 *                      each of n_threads, where each of the threads is
 *                      assigned a unique physical core. When n_requested is
 *                      >= 1, the map contains an entry for each of the
 *                      n_requested threads such that, when more threads are
 *                      requested than cores, each core is assigned
 *                      approximately the same number of threads.
 *
 * @returns 0 on success
 */
int thread_parameters(MPI_Comm comm, int base_core_id, int n_requested,
    bool bind, bool verbose, int &n_threads, std::deque<int> &affinity);
};

#endif
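
The affinity map semantics described for n_requested and affinity can be illustrated with a small stand-alone sketch. This is not TECA's implementation: the helper name sketch_affinity_map and its core_ids argument are hypothetical, and the sketch assumes the list of physical cores available to the rank is already known, whereas thread_parameters determines it collectively over the MPI communicator. It only shows how a map with one entry per thread might be filled so that cores receive approximately equal numbers of threads.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper, not part of TECA. core_ids is assumed to hold the
// physical core ids available to this rank; the real function discovers
// these itself.
std::deque<int> sketch_affinity_map(const std::vector<int> &core_ids,
    int n_requested)
{
    assert(!core_ids.empty());

    // -1 requests one thread per available physical core. As a
    // simplification, any value below 1 is treated the same way here.
    std::size_t n_threads = n_requested < 1 ?
        core_ids.size() : static_cast<std::size_t>(n_requested);

    // Round-robin assignment: when more threads are requested than cores,
    // each core ends up with approximately the same number of threads.
    std::deque<int> affinity;
    for (std::size_t i = 0; i < n_threads; ++i)
        affinity.push_back(core_ids[i % core_ids.size()]);

    return affinity;
}
```

With four available cores, passing -1 to this sketch yields one entry per core, while requesting six threads wraps around so that two of the cores are listed twice. How the real function orders cores or handles hyperthreads is not shown in this header, so the round-robin order is an assumption.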