13.8. Internal frameworks
The Modular Component Architecture (MCA) is the backbone of Open MPI – most services and functionality are implemented through MCA components.
13.8.1. MPI layer frameworks
Here is a list of all the component frameworks in the MPI layer of Open MPI:
bml
: BTL management layer
coll
: MPI collective algorithms
fbtl
: point-to-point file byte transfer layer: abstraction for individual read and write operations for MPI I/O
fcoll
: collective read and write operations for MPI I/O
fs
: file system functions for MPI I/O
hook
: Generic hooks into Open MPI
io
: MPI I/O
mtl
: Matching transport layer, used for MPI point-to-point messages on some types of networks
op
: Back-end computations for intrinsic MPI_Op operators
osc
: MPI one-sided communications
pml
: MPI point-to-point management layer
sharedfp
: shared file pointer operations for MPI I/O
topo
: MPI topology routines
vprotocol
: Protocols for the “v” PML
13.8.2. OpenSHMEM component frameworks
atomic
: OpenSHMEM atomic operations
memheap
: OpenSHMEM memory allocators that support the PGAS memory model
scoll
: OpenSHMEM collective operations
spml
: OpenSHMEM “pml-like” layer: supports one-sided, point-to-point operations
sshmem
: OpenSHMEM shared memory backing facility
13.8.3. Miscellaneous frameworks
allocator
: Memory allocator
backtrace
: Debugging call stack backtrace support
btl
: Point-to-point Byte Transfer Layer
dl
: Dynamic loading library interface
hwloc
: Hardware locality (hwloc) versioning support
if
: OS IP interface support
installdirs
: Installation directory relocation services
memchecker
: Run-time memory checking
memcpy
: Memory copy support
memory
: Memory management hooks
mpool
: Memory pooling
patcher
: Symbol patcher hooks
pmix
: Process management interface (exascale)
rcache
: Memory registration cache
reachable
: Network reachability determination
shmem
: Shared memory support (NOT related to OpenSHMEM)
smsc
: Shared memory single-copy support
threads
: OS and userspace thread support
timer
: High-resolution timers
13.8.4. Framework notes
Each framework typically has one or more components that are used at run-time. For example, the btl framework is used by the MPI layer to send bytes across different types of underlying networks. The tcp btl, for example, sends messages across TCP-based networks; the ucx pml sends messages across InfiniBand-based networks.
13.8.5. MCA parameter notes
Each component typically has some tunable parameters that can be changed at run-time. Use the ompi_info(1) command to check a component to see what its tunable parameters are. For example:
shell$ ompi_info --param btl tcp
shows some of the parameters (and default values) for the tcp btl component (use --all or --level 9 to show all the parameters).
Note that ompi_info (without --all or a specified level) only shows a small number of a component’s MCA parameters by default. Each MCA parameter has a “level” value from 1 to 9, corresponding to the MPI-3 MPI_T tool interface levels. See the LEVELS section in the ompi_info(1) man page for an explanation of the levels and how they correspond to Open MPI’s code.
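For example, the following invocations compare the default output with progressively larger parameter listings for the tcp btl component (the exact parameters and defaults shown will vary with your Open MPI version):

shell$ ompi_info --param btl tcp
shell$ ompi_info --param btl tcp --level 9
shell$ ompi_info --all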
Here are some rules of thumb to keep in mind when using Open MPI’s levels:
Levels 1-3:
These levels should contain only a few MCA parameters.
Generally, only put MCA parameters in these levels that matter to users who just need to run Open MPI applications (and don’t know/care anything about MPI). Examples (these are not comprehensive):
Selection of which network interfaces to use.
Selection of which MCA components to use.
Selective disabling of warning messages (e.g., show warning message XYZ unless a specific MCA parameter is set, which disables showing that warning message).
Enabling additional stderr logging verbosity. This allows a user to run with this logging enabled, and then use that output to get technical assistance.
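Parameters at these levels are the kind a user would set directly on the mpirun command line. The parameter names below are illustrative of such user-facing controls; verify their names and availability with ompi_info on your installation:

shell$ mpirun --mca btl_tcp_if_include eth0 -np 4 ./my_mpi_app     # select network interfaces
shell$ mpirun --mca pml ob1 -np 4 ./my_mpi_app                     # select an MCA component
shell$ mpirun --mca btl_base_verbose 100 -np 4 ./my_mpi_app        # enable stderr logging verbosity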
Levels 4-6:
These levels should contain any other MCA parameters that are useful to expose to end users.
There is an expectation that “power users” will utilize these MCA parameters — e.g., those who are trying to tune the system and extract more performance.
Here are some examples of MCA parameters suitable for these levels (these are not comprehensive):
When you could have hard-coded a constant size of a resource (e.g., a resource pool size or buffer length), make it an MCA parameter instead.
When there are multiple different algorithms available for a particular operation, code them all up and provide an MCA parameter to let the user select between them.
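As a sketch of what such “power user” tuning looks like at run time (the parameter names below are examples only; their availability and defaults vary by Open MPI version):

shell$ mpirun --mca btl_tcp_sndbuf 131072 -np 4 ./my_mpi_app       # tune a buffer size instead of a hard-coded constant
shell$ mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 1 -np 4 ./my_mpi_app
                                                                   # select among multiple collective algorithms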
Levels 7-9:
Put any other MCA parameters here.
It’s ok for these MCA parameters to be esoteric and only relevant to deep magic / the internals of Open MPI.
There is little expectation of users using these MCA parameters.
See this section for details on how to set MCA parameters at run time.
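Briefly, an MCA parameter can typically be set either on the mpirun command line or through an environment variable named after the parameter with an OMPI_MCA_ prefix. For example, using the tcp btl’s interface-selection parameter (parameter name shown for illustration):

shell$ mpirun --mca btl_tcp_if_include eth0 -np 4 ./my_mpi_app

shell$ export OMPI_MCA_btl_tcp_if_include=eth0
shell$ mpirun -np 4 ./my_mpi_app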