3. RDMA Module
The RDMA grouping of functionality applies to general RDMA devices and is not specific to the IBA.
Programs that use a single port should use rdma.get_end_port() to find the port, passing in a command line argument to specify the port. The library provides standardized parsing of an end port description. Users are encouraged to use end port GIDs.
A list of RDMA devices is available through the rdma.get_devices() call.
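A minimal sketch of the command line pattern described above. The option name `-p/--port` and the helper names `build_parser` and `find_end_port` are illustrative assumptions, not part of the library. The lookup function is passed in as a parameter so the sketch does not depend on any device being present.

```python
import argparse


def build_parser():
    """Build a parser with a single optional end port argument."""
    parser = argparse.ArgumentParser(description="Example single-port tool")
    # The option name is an assumption; the value is passed through unparsed.
    parser.add_argument("-p", "--port", default=None,
                        help="end port: device, device/port, port GID or port GUID")
    return parser


def find_end_port(get_end_port, argv=None):
    """Parse argv and hand the raw string to the supplied lookup function.

    get_end_port is expected to behave like rdma.get_end_port(): None selects
    the default end port, any other string is an end port description.
    """
    args = build_parser().parse_args(argv)
    return get_end_port(args.port)
```

In a real program the call would presumably be `find_end_port(rdma.get_end_port)`, leaving all interpretation of the string to the library.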
3.4. Exceptions
The root exception for things thrown by this module is rdma.RDMAError.
Usage guidelines:
- rdma.RDMAError is thrown for general error conditions, like ‘Device Not Found’. The single argument is a string containing the failure message. These are either system failure messages with no possible recovery, or sane, context-localized failures, e.g. get_end_port() throws rdma.RDMAError if it cannot return a port.
- rdma.MADError is thrown for error conditions that arise from MAD RPC processing, including error status MADs and malformed replies.
- rdma.MADTimeoutError is thrown when a MAD RPC call times out.
- rdma.MADClassError is thrown when a MAD RPC call errors out with a class specific error.
- rdma.SysError is thrown for kernel syscalls that fail.
- rdma.path.SAPathNotFoundError is thrown when a path cannot be resolved because the SA reported it was not found.
- rdma.ibverbs.WRError is thrown when a verbs work request fails to post.
- rdma.ibverbs.WCError is thrown when a verbs work completion indicates an error.
- rdma.ibverbs.AsyncError is thrown when an error case verbs async event is received.
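The guidelines above suggest testing for the most specific exception first. The sketch below shows that ordering. The helper name `describe_failure` and its labels are hypothetical, and the ordering assumes that rdma.MADTimeoutError and rdma.MADClassError derive from rdma.MADError, which the text implies but does not state. The module is passed in as `rdma_mod` so the function can be exercised without the library.

```python
def describe_failure(exc, rdma_mod):
    """Return a short label for exc, checking the most specific classes first."""
    # Timeouts and class specific errors are tested before the general MAD
    # error so that they still match if they derive from it (an assumption).
    if isinstance(exc, rdma_mod.MADTimeoutError):
        return "timeout"
    if isinstance(exc, rdma_mod.MADClassError):
        return "class specific error"
    if isinstance(exc, rdma_mod.MADError):
        return "MAD RPC failure"
    if isinstance(exc, rdma_mod.SysError):
        return "system call failure"
    # RDMAError is documented as the root, so it is checked last.
    if isinstance(exc, rdma_mod.RDMAError):
        return "general RDMA failure"
    return "unrelated"
```

The same ordering applies to a chain of `except` clauses: the root class goes last, otherwise it would shadow everything beneath it.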
Python’s exception processing is somewhat limited in how it deals with complicated layering. For instance, if an RPC is performed to resolve a path, the exception may appear as a simple timeout error, which says nothing about what the library was trying to do. The rdma.MADError class therefore includes a mechanism to stack error messages: the lowest layer records a layer-appropriate message and higher layers stack their own layer-appropriate messages on top. To process this extra information, any application using the library should use a try block as follows:
import sys
import rdma

try:
    do_stuff()
except rdma.MADError as err:
    # Print the stacked messages, each line prefixed with "E:"
    err.dump_detailed(sys.stderr, "E:", level=level)
Where level is a verbosity level set by the user. The resulting dumps will look something like this at level 1:
E: Failed getting MAD path record for end port GID('fe80::2:c903:0:1492').
E: +RPC MAD_METHOD_GET(1) SAFormat(3.2) SAPathRecord(53) got class specific error 3
Level 2 will include dumps of the request and reply packets that caused the error. Naturally, raising the exception again will produce a traceback, which is generally less interesting for network-related RPC errors.
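The stacking idea can be illustrated with a toy exception class. This is not the library's implementation: `StackedError`, `send_rpc`, `resolve_path` and `run` are hypothetical names, and the output format only imitates the level 1 dump shown above.

```python
import io


class StackedError(Exception):
    """Toy exception that collects one message per software layer."""

    def __init__(self, msg):
        super().__init__(msg)
        self.messages = [msg]  # index 0 is the lowest layer

    def message(self, text):
        """Add a message from a higher layer."""
        self.messages.append(text)

    def dump(self, out, prefix=""):
        """Write the newest message first, marking older ones with '+'."""
        for depth, text in enumerate(reversed(self.messages)):
            out.write("%s %s%s\n" % (prefix, "+" * depth, text))


def send_rpc():
    # Lowest layer: only knows that the RPC failed.
    raise StackedError("RPC got class specific error 3")


def resolve_path():
    # Higher layer: adds what the RPC was for, then re-raises.
    try:
        send_rpc()
    except StackedError as err:
        err.message("Failed getting path record")
        raise


def run():
    """Trigger the failure and return the formatted dump as a string."""
    out = io.StringIO()
    try:
        resolve_path()
    except StackedError as err:
        err.dump(out, "E:")
    return out.getvalue()
```

Calling `run()` yields two lines, the path-level message followed by the RPC-level message marked with `+`, which is the same shape as the library's dump.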
3.5. rdma module
The top level import for the rdma module provides the exception types used in the package as well as the basic functions for reading the device list and instantiating access classes.
exception rdma.MADClassError(req, code, **kwargs)
    Thrown when a MAD RPC returns with a class specific error code.

    code = None
        Decoded error code
exception rdma.MADError(**kwargs)
    Thrown when a MAD RPC fails in some way. The throw site includes as much information about the error context as possible. Depending on the throw context not all members may be available.
    If the RPC is an incoming request then this exception contains enough information for the catch to generate an error reply MAD.

    Parameters:
    - req (derived from BinStruct) – The MAD’s *Format that was originally sent.
    - req_buf (bytearray) – The entire raw request MAD data, if fmt is not present.
    - reply_status (int) – The status value to use when generating an error reply for req_buf.
    - path (IBPath) – The destination path for the request.
    - rep (derived from BinStruct) – The MAD’s *Format reply.
    - rep_buf (bytearray) – The entire raw reply MAD data.
    - status (int) – The entire 16 bit status value.
    - exc_info – Result of sys.exc_info() if MAD processing failed due to an unexpected exception.
    dump_detailed(F=None, prefix='', level=1)
        Display detailed information about the exception. This prints a multi-line description to F. Many lines are prefixed with the text prefix. If level is 0 then the default summary line is displayed. If level is 1 then all summary information is shown. If level is 2 then request and reply packets are dumped.
        If the MADError includes a captured exception then dump_detailed will re-throw it after printing our information.

    message(s)
        Used to annotate additional messages onto the exception. For instance the library function issuing the RPC can call this with a short version of what the RPC actually was trying to do.
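One way a caller might use message() is through a small context manager that annotates and re-raises. The helper name `annotate` is hypothetical and not part of the library. It relies only on the message(s) method described above, and takes the exception class as a parameter so it can be tried with a stand-in class.

```python
from contextlib import contextmanager


@contextmanager
def annotate(error_cls, text):
    """Attach text to any error_cls raised inside the with block, then re-raise."""
    try:
        yield
    except error_cls as err:
        err.message(text)  # relies on the message(s) method described above
        raise
```

A caller could then write `with annotate(rdma.MADError, "Failed getting node info"):` around an RPC, where the message text is whatever short description suits that layer.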
exception rdma.MADTimeoutError(req, path)
    Thrown when a MAD RPC times out.
exception rdma.RDMAError
    General exception class for RDMA related errors.
exception rdma.SysError(errno, func, msg=None)
    Thrown when a system call fails. Includes the errno value.
    errno is the positive errno code, func is the system call that failed and msg is more information, if applicable.
rdma.get_device(name=None)
    Return a rdma.devices.RDMADevice for the default device if name is None, or for the device described by name.

    The device string format is one of:

    Format      Example
    device      mlx4_0
    Node GUID   0002:c903:0000:1491

    Return type: rdma.devices.RDMADevice
    Raises: rdma.RDMAError – If no matching device is found or name is invalid.
rdma.get_devices(refresh=False)
    Return a container of rdma.devices.RDMADevice objects for all devices in the system.
    The return result is an object that looks like an ordered list of rdma.devices.RDMADevice objects. However, indexing the list is done by device name not by index. If the length of the returned object is 0 then no devices were detected. Programs are encouraged to use rdma.get_end_port().

    Return type: DemandList, but this is an implementation detail.
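To make the "ordered list indexed by name" behaviour concrete, here is a toy container with the same outward shape. It is not the library's DemandList: `NameIndexedList` and `FakeDevice` are hypothetical, the `name` attribute on devices is an assumption, and raising KeyError for an unknown name is a choice made for this sketch only.

```python
from collections import namedtuple

# Stand-in for a device object; only a name is modelled here.
FakeDevice = namedtuple("FakeDevice", ["name"])


class NameIndexedList:
    """Iterates in insertion order, but is indexed by item name."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

    def __getitem__(self, name):
        # Linear search keeps the sketch short; order is preserved.
        for item in self._items:
            if item.name == name:
                return item
        raise KeyError(name)
```

With such an object, `len(devices) == 0` is the "no devices detected" check, iteration visits devices in order, and `devices["mlx4_0"]` selects one by name.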
rdma.get_end_port(name=None)
    Return a rdma.devices.EndPort for the default end port if name is None, or for the end port described by name.

    The end port string format is one of:

    Format        Example
    device        mlx4_0 (defaults to the first port)
    device/port   mlx4_0/1
    Port GID      fe80::2:c903:0:1491
    Port GUID     0002:c903:0000:1491

    Return type: rdma.devices.EndPort
    Raises: rdma.RDMAError – If no matching device is found or name is invalid.
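The four formats in the table can be told apart by shape alone. The following sketch classifies a string using only the examples above. It is an illustration, not the library's parser, which may accept other spellings. The function name `classify_end_port_name` and its return labels are hypothetical.

```python
import ipaddress
import re

# Four colon-separated groups of four hex digits, as in the GUID example.
_GUID_RE = re.compile(r"^[0-9a-fA-F]{4}(?::[0-9a-fA-F]{4}){3}$")


def classify_end_port_name(name):
    """Guess which of the documented formats an end port string uses."""
    if not name:
        raise ValueError("empty end port description")
    if _GUID_RE.match(name):
        return "port GUID"
    if ":" in name:
        # GIDs are written like IPv6 addresses; this raises ValueError
        # if the string is not a well-formed address.
        ipaddress.IPv6Address(name)
        return "port GID"
    if "/" in name:
        device, _, port = name.partition("/")
        if not device or not port.isdigit():
            raise ValueError("bad device/port description: %r" % name)
        return "device/port"
    return "device"
```

The GUID test runs before the GID test because both forms contain colons, and the fixed four-group pattern is the narrower of the two.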
rdma.get_gmp_mad(port, path=None, verbs=None, **kwargs)
    Return a subclass instance of rdma.madtransactor.MADTransactor for the associated rdma.devices.EndPort. If a verbs instance is already open then it should be passed in as verbs.
rdma.get_umad(port, path=None, **kwargs)
    Create a rdma.umad.UMAD instance for the associated rdma.devices.EndPort. UMAD instances can issue SMPs and GMPs. If only GMP is required then use get_gmp_mad().
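The choice between get_umad() and get_gmp_mad() can be written as a small helper. This is a sketch only: the name `open_mad` and its `need_smp` flag are hypothetical, and the module is passed in as `rdma_mod` so the function can be exercised with a stub. Only the two accessors and the `verbs` keyword documented above are used.

```python
def open_mad(rdma_mod, port, need_smp, verbs=None):
    """Pick a MAD interface for port based on whether SMPs are needed."""
    if need_smp:
        # UMAD instances can issue both SMPs and GMPs.
        return rdma_mod.get_umad(port)
    # GMP only: reuse an already open verbs instance if one was supplied.
    return rdma_mod.get_gmp_mad(port, verbs=verbs)
```

A real caller would presumably pass the imported module, for example `open_mad(rdma, end_port, need_smp=False)`.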
rdma.get_verbs(port, **kwargs)
    Create a rdma.uverbs.UVerbs instance for the associated rdma.devices.RDMADevice/rdma.devices.EndPort.