5. MAD RPC Processing¶
The python-rdma package includes a simplified system for processing MAD based RPCs defined in the InfiniBand Architecture. Most of the tedious processing is taken care of automatically and the user sees only the actual RPC payload they are interested in.
For example, this displays the SMPPortInfo for the local port:
import sys
import rdma
import rdma.path
import rdma.IBA as IBA

end_port = rdma.get_end_port();
path = rdma.path.IBDRPath(end_port);
with rdma.get_umad(end_port) as umad:
    pinf = umad.SubnGet(IBA.SMPPortInfo,path);
    pinf.printer(sys.stdout);
5.1. rdma.madtransactor
MAD RPC Mixin¶
MADTransactor provides the base set of methods for doing MAD RPC. Derived classes use this as a mixin to provide the basic API.
The user-visible API is the set of IBA-defined RPC names, e.g. SubnGet(), each of which performs the named RPC.
Two modes of operation are possible, synchronous and asynchronous. In synchronous mode the RPC API will return the decoded reply payload. In asynchronous mode the RPC API will return the request message details. The mode in use is determined by the derived class.
The API is quite simplified:
# Return the SMPPortInfo for port 1
pinf = mad.SubnGet(IBA.SMPPortInfo,path,1);
print pinf.masterSMLID;
Under the covers the MADTransactor produces a rdma.IBA.SMPFormat or rdma.IBA.SMPFormatDirected that contains as a payload a zeroed rdma.IBA.SMPPortInfo. The attributeID is set to IBA.SMPPortInfo.MAD_ATTRIBUTE_ID and the argument is tested to ensure that SubnGet is a legal RPC.
When a valid reply is received the generic MAD header is processed and errors are converted into exceptions. The payload is unpacked and a new rdma.IBA.SMPPortInfo is returned.
All RPC functions have a similar signature:
-
RPC
(payload, path, attributeModifier=0)¶
Parameters:
- payload (rdma.binstruct.BinStruct derived class adhering to the MAD protocol) – The RPC type to execute. If it is a class then the request payload is zero, otherwise the content of the instance is sent as the request.
- path (rdma.path.IBPath) – A reversible path specifying the target node.
- attributeModifier (int) – The value of the generic MAD rdma.IBA.MADHeader.attributeModifier field.
Returns: If payload is a class then an instance of that class, otherwise a new instance of payload.__class__.
Raises:
- rdma.MADError – If an error MAD is returned.
- rdma.MADTimeoutError – If the MAD timed out.
- AttributeError – If payload or path are invalid.
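The class-versus-instance rule for payload can be pictured with a small standalone sketch. It does not use python-rdma at all: ToyPortInfo and split_payload are hypothetical names invented for illustration, and the library may implement the rule differently.

```python
import inspect

class ToyPortInfo(object):
    """Hypothetical stand-in for a payload structure such as SMPPortInfo."""
    def __init__(self):
        self.masterSMLID = 0  # a freshly built payload is all zero

def split_payload(payload):
    """Return (request_instance, reply_class) for an RPC argument."""
    if inspect.isclass(payload):
        # A class was given: send a zero payload of that class.
        return payload(), payload
    # An instance was given: send its content, reply uses the same class.
    return payload, payload.__class__

zero_req, reply_cls = split_payload(ToyPortInfo)

filled = ToyPortInfo()
filled.masterSMLID = 7
inst_req, inst_cls = split_payload(filled)
```

Passing the class is the convenient form for Get-style queries, while passing a filled-in instance suits Set-style requests where the content matters.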
Support is also provided for processing incoming MADs as a server. The basic template is:
try:
    fmt,req = umad.parse_request(buf,path);
    # Handle supported requests here; anything else is rejected below.
    raise rdma.MADError(req=fmt,req_buf=buf,path=path,
                        reply_status=IBA.MAD_STATUS_UNSUP_METHOD_ATTR_COMBO,
                        msg="Unsupported attribute ID %s"%(
                            fmt.describe()));
except rdma.MADError as err:
    err.dump_detailed(sys.stderr,"E:",level=1);
    umad.send_error_exc(err);
fmt will be an instance of the appropriate class format structure, and req will be an instance of the appropriate payload structure. Continued parsing of the request should happen within the try block and errors raised as rdma.MADError with reply_status set appropriately. Once a reply is prepared use rdma.madtransactor.MADTransactor.send_reply().
-
class
rdma.madtransactor.
MADTransactor
¶ This class is a mixin for everything that implements a MAD RPC transaction interface. Derived classes must provide the
_execute()
method which sends the MAD and gets the reply. By design instances of this interface cannot be multi-threaded. For multi-threaded applications each thread must have a separate
MADTransactor
instance. Simple MAD request/reply transactors return the payload; other attributes for the last processed reply are available via instance attributes. Paths used with this object can have a MKey (for SMPs) and SMKey (for SA GMPs) attribute.
-
PerformanceGet
(payload, path, attributeModifier=0)¶
-
PerformanceSet
(payload, path, attributeModifier=0)¶
-
SubnAdmGet
(payload, path=None, attributeModifier=0)¶
-
SubnAdmGetTable
(payload, path=None, attributeModifier=0)¶
-
SubnAdmSet
(payload, path=None, attributeModifier=0)¶
-
SubnGet
(payload, path, attributeModifier=0)¶
-
SubnSet
(payload, path, attributeModifier=0)¶
-
VendGet
(payload, path, attributeModifier=0)¶
-
VendSet
(payload, path, attributeModifier=0)¶
-
do_async
(op)¶ This runs a simple async work coroutine against a synchronous instance. In this case the coroutine yields its own next result.
-
end_port
= None¶ The end_port this is associated with
-
static
get_request_match_key
(buf)¶ Return a
tuple
for matching a request MAD buf. The tuple
is ((oui << 8) | mgmtClass,(baseVersion << 8) | classVersion,attributeID), where oui is 0 if this is not a vendor OUI MAD.
-
is_async
¶ True if this is an async MADTransactor interface.
-
parse_request
(rbuf, path)¶ Parse a request packet into a format and data.
Raises: rdma.MADError – If the packet could not be parsed.
-
reply_fmt
= None¶ The MADFormat for the last reply packet processed
-
reply_path
= None¶ The path for the last reply packet processed
-
result
= None¶
-
send_error_exc
(exc)¶ Call
send_error_reply()
with the arguments derived from the rdma.MADError
exception passed in.
-
send_error_reply
(buf, path, status, class_code=0)¶ Generate an error reply for a MAD. buf is the full original packet. This entire packet is returned with an appropriate error code set. status and class_code should be set to the appropriate result code.
-
send_reply
(ofmt, payload, path, attributeModifier=0, status=0, class_code=0)¶ Generate a reply packet. ofmt should be the request format.
-
send_rmpp_reply
(ofmt, attrClass, payload, path, attributeModifier=0, status=0, class_code=0)¶ Like send_reply, but generates an RMPP reply packet. ofmt should be the request format. attrClass is the class of the attribute, eg. IBA.SANodeRecord. payload is a list or tuple of type attrClass, potentially with zero elements.
-
trace_func
= None¶ A function to call for tracing.
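The tuple described for get_request_match_key() can be reproduced by hand from the common MAD header. The sketch below is standalone and is not the library's implementation: toy_match_key is a hypothetical name, the byte offsets follow the 24-byte common MAD header layout (baseVersion, mgmtClass and classVersion in bytes 0 to 2, attributeID at offset 16), and the OUI is simply passed in by the caller rather than extracted from a vendor MAD.

```python
import struct

def toy_match_key(buf, oui=0):
    """Build the match tuple from the common MAD header at the start of buf."""
    # Bytes 0..2 hold baseVersion, mgmtClass, classVersion.
    base_version, mgmt_class, class_version = struct.unpack_from(">BBB", buf, 0)
    # The 16-bit big-endian attributeID sits at byte offset 16.
    (attribute_id,) = struct.unpack_from(">H", buf, 16)
    return ((oui << 8) | mgmt_class,
            (base_version << 8) | class_version,
            attribute_id)

# A 24-byte header with baseVersion=1, mgmtClass=0x81, classVersion=1 and
# attributeID=0x0015; every other field is left zero.
hdr = bytearray(24)
hdr[0], hdr[1], hdr[2] = 1, 0x81, 1
struct.pack_into(">H", hdr, 16, 0x0015)

key = toy_match_key(bytes(hdr))
vendor_key = toy_match_key(bytes(hdr), oui=0x1234)
```

Because the key is a plain tuple it can be used directly as a dictionary key when dispatching incoming requests to handlers.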
-
-
rdma.madtransactor.
dumper_tracer
(mt, kind, fmt=None, path=None, ret=None)¶ Logs full decoded packet dumps of what is happening to
sys.stdout
. Assign to rdma.madtransactor.MADTransactor.trace_func
.
-
rdma.madtransactor.
simple_tracer
(mt, kind, fmt=None, path=None, ret=None)¶ Simply logs summaries of what is happening to
sys.stdout
. Assign to rdma.madtransactor.MADTransactor.trace_func
.
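A custom tracer can follow the same signature as dumper_tracer and simple_tracer. The sketch below only records its arguments in a list. make_collecting_tracer is a hypothetical helper, and the kind value and placeholder arguments in the demonstration call are made up, since the values the library actually passes are not listed here.

```python
def make_collecting_tracer(events):
    """Return a tracer with the documented signature that appends to events."""
    def tracer(mt, kind, fmt=None, path=None, ret=None):
        # Keep only what is needed; mt is the transactor doing the call.
        events.append((kind, fmt, path, ret))
    return tracer

events = []
tracer = make_collecting_tracer(events)

# With a real transactor the hook would be installed with:
#     umad.trace_func = tracer
# Here the tracer is called directly with placeholder values.
tracer(None, "example", fmt="fmt-placeholder", path="path-placeholder")
```

Collecting into a list rather than printing makes the trace easy to inspect from a test or to filter before display.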
5.2. rdma.umad
Userspace MAD Interface¶
The userspace MAD interface is normally instantiated by rdma.get_umad()
which will select the appropriate implementation for the platform.
-
class
rdma.umad.
LazyIBPath
(end_port, **kwargs)¶ Bases:
rdma.path.LazyIBPath
Similar to
rdma.path.IBPath
but the unpack of the UMAD AH is deferred until necessary since most of the time we do not care. end_port is the
rdma.devices.EndPort
this path is associated with. kwargs is applied to set attributes of the instance during initialization.
-
class
rdma.umad.
UMAD
(parent)¶ Bases:
rdma.tools.SysFSDevice
,rdma.madtransactor.MADTransactor
Handle to a UMAD kernel interface. This class supports the context manager protocol.
parent is the owning
rdma.devices.EndPort
.
-
IB_IOCTL_MAGIC
= 27¶
-
IB_USER_MAD_ENABLE_PKEY
= 6915¶
-
IB_USER_MAD_REGISTER_AGENT
= 3223067393¶
-
IB_USER_MAD_UNREGISTER_AGENT
= 1074010882¶
-
ib_mad_addr_local_t
= <Struct object>¶
-
ib_mad_addr_t
= <Struct object>¶
-
ib_user_mad_t
= <Struct object>¶
-
recvfrom
(wakeat)¶ Receive a MAD packet. If the value of
rdma.tools.clock_monotonic()
exceeds wakeat then None
is returned. Returns: tuple(buf,path)
-
register_client
(mgmt_class, class_version, oui=0)¶ Manually register a MAD agent. This is done automatically for sending MADs; this API is mainly intended for special cases.
-
register_server
(mgmt_class, class_version, oui=0, method_mask=0)¶ Register to receive MADs that match the given pattern. method_mask is a bitmask of the method IDs to match; oui is only used for
rdma.IBA.VendOUIFormat
MADs.
-
register_server_fmt
(fmt)¶ Same as
register_server()
except the arguments are deduced from fmt, which should be derived from rdma.binstruct.BinFormat
.
-
sendto
(buf, path, agent_id=None)¶ Send a MAD packet. buf is the raw MAD to send, starting with the first byte of
rdma.IBA.MADHeader
. path is the destination.
-
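The numeric IB_USER_MAD_* constants listed for UMAD are ordinary Linux ioctl request numbers, and can be decoded to see where the values come from. This sketch assumes the common asm-generic encoding (8-bit number, 8-bit type, 14-bit size, 2-bit direction); a few architectures use a different split. decode_ioctl is a hypothetical helper, not part of python-rdma.

```python
IOC_NONE, IOC_WRITE, IOC_READ = 0, 1, 2  # direction bits, asm-generic layout

def decode_ioctl(number):
    """Split an ioctl request number into (direction, size, type, nr)."""
    nr = number & 0xFF
    ioc_type = (number >> 8) & 0xFF
    size = (number >> 16) & 0x3FFF
    direction = (number >> 30) & 0x3
    return direction, size, ioc_type, nr

enable_pkey = decode_ioctl(6915)             # IB_USER_MAD_ENABLE_PKEY
register_agent = decode_ioctl(3223067393)    # IB_USER_MAD_REGISTER_AGENT
unregister_agent = decode_ioctl(1074010882)  # IB_USER_MAD_UNREGISTER_AGENT
```

All three decode to type 27, which is IB_IOCTL_MAGIC. The register request is read/write with a 28-byte argument, the unregister request is write-only with a 4-byte argument, and the PKey request carries no argument.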
5.3. rdma.vmad
Verbs MAD Interface¶
The verbs MAD interface can be used to send GMP MADs (e.g. to QPN 1), which is useful for SA communication. This class creates a UD QP using verbs and uses that to send all GMPs. This means the source QPN of the GMP will not be 1, which is a configuration supported by IBA.
-
class
rdma.vmad.
VMAD
(parent, path, depth=16)¶ Bases:
rdma.madtransactor.MADTransactor
Provide a UMAD style interface that runs on ibverbs. This can be used with GMP (eg QPN=1) traffic.
path is used to set the PKey and QKey for all MADs sent through this interface.
-
close
()¶ Free the resources held by the object.
-
end_port
= None¶ rdma.devices.EndPort
this is associated with.
-
recvfrom
(wakeat)¶ Receive a MAD packet. If the value of
rdma.tools.clock_monotonic()
exceeds wakeat then None
is returned. Returns: tuple(buf,path)
-
sendto
(buf, path)¶ Send a MAD packet. buf is the raw MAD to send, starting with the first byte of
rdma.IBA.MADHeader
. path is the destination.
-
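The wakeat argument of recvfrom() is an absolute deadline on a monotonic clock rather than a relative timeout. The standalone sketch below imitates that contract with an in-memory queue. toy_recvfrom is a hypothetical function, time.monotonic (Python 3) stands in for rdma.tools.clock_monotonic(), and polling with a short sleep is only a simplification for illustration.

```python
import collections
import time

def toy_recvfrom(pending, wakeat, clock=time.monotonic, poll=0.001):
    """Return the next queued item, or None once clock() exceeds wakeat."""
    while True:
        if pending:
            return pending.popleft()
        if clock() > wakeat:
            return None
        time.sleep(poll)

pending = collections.deque([("buf-placeholder", "path-placeholder")])

# One packet is queued, so this returns at once.
got = toy_recvfrom(pending, time.monotonic() + 0.05)
# Nothing is left, so this returns None after the deadline passes.
timed_out = toy_recvfrom(pending, time.monotonic() + 0.01)
```

An absolute deadline lets a caller loop over several receives without recomputing the remaining time after each one.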
5.4. rdma.sched
Parallel MAD Scheduler¶
MADSchedule is a parallel MAD scheduling system built using Python coroutines as the scheduling element. It provides for very simplified programming of parallel MAD operations.
A simple use of the class to fetch rdma.IBA.SMPNodeInfo for a list of paths:
def get_nodeinfo(sched,node):
    node.ninf = yield sched.SubnGet(IBA.SMPNodeInfo,node.path);

nodes = [..];
sched = rdma.sched.MADSchedule(umad);
sched.run(mqueue=(get_nodeinfo(sched,I) for I in nodes));
The scheduler pulls coroutines from the mqueue argument and runs them to obtain MADs to send, bounding the total outstanding MAD count and returning replies as the result of yield.
Simplified, MADSchedule manages a set of generators and coroutines and schedules when each is running. Generators yield coroutines and coroutines yield MADs to execute. Generators are started by calling mqueue(). Typically this would be done using a generator expression as an argument, but this is not required.
Coroutines are the functions that actually process the MADs. They are started either by being yielded from a generator or via the queue() call. The typical format of a coroutine is:
def get_nodeinfo(sched,node):
    node.ninf = yield sched.SubnGet(IBA.SMPNodeInfo,node.path);
MADSchedule implements the asynchronous interface for MADTransactor, so the RPC functions return the MAD to send. The coroutine yields these MADs back to the scheduler which issues them on the network and waits for a reply. When a reply (or exception) is returned for the MAD the yield statement will return that exactly as though the synchronous interface to MADTransactor was being used.
While a coroutine is suspended at a yield, other coroutines can execute until rdma.sched.MADSchedule.max_outstanding MADs are issued, at which point the scheduler waits for MADs on the network to complete. As coroutines exit, queued generators are called to produce more coroutines until there is no more work to do.
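The scheduling model described above can be imitated with plain Python generators. The toy below is not the rdma.sched implementation and sends nothing on a network: ToyScheduler, fake_reply and get_info are hypothetical names, and every request is answered immediately. It only shows how a bound on outstanding requests interacts with coroutines that yield requests and receive replies.

```python
import collections

def fake_reply(request):
    """Hypothetical network: answer every request immediately."""
    return ("reply", request[1])

class ToyScheduler(object):
    def __init__(self, max_outstanding=4):
        self.max_outstanding = max_outstanding
        self.peak = 0  # highest number of requests in flight at once

    def _advance(self, coro, value, in_flight):
        # Resume coro with value; remember the next request it yields.
        try:
            request = coro.send(value)
        except StopIteration:
            return
        in_flight.append((coro, request))

    def run(self, mqueue):
        mqueue = iter(mqueue)
        in_flight = collections.deque()
        while True:
            # Start new coroutines while there is room for more requests.
            while len(in_flight) < self.max_outstanding:
                coro = next(mqueue, None)
                if coro is None:
                    break
                self._advance(coro, None, in_flight)
            if not in_flight:
                return
            self.peak = max(self.peak, len(in_flight))
            # Complete the oldest request and hand the reply to its coroutine.
            coro, request = in_flight.popleft()
            self._advance(coro, fake_reply(request), in_flight)

def get_info(results, name):
    results[name] = yield ("Get", name)

results = {}
sched = ToyScheduler(max_outstanding=2)
sched.run(get_info(results, n) for n in "abcde")
```

Five coroutines are run, but at most two requests are ever in flight, and new coroutines are pulled from the generator only as earlier ones finish.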
A coroutine may also yield another coroutine. In this instance the scheduler treats it as a function call and runs the returned coroutine to completion before returning from yield. If the coroutine produces an exception then it will pass through the yield statement as well. The called coroutine can return a result to the parent by setting the result attribute before returning.
This example shows how to perform directed route discovery of a network using parallel MAD scheduling:
def get_port_info(sched,path,port,follow):
    pinf = yield sched.SubnGet(IBA.SMPPortInfo,path,port);
    if follow and pinf.portState != IBA.PORT_STATE_DOWN:
        # Extend the directed route by one hop and visit the far side.
        npath = rdma.path.IBDRPath(end_port,drPath=path.drPath + chr(port));
        yield get_node_info(sched,npath);

def get_node_info(sched,path):
    ninf = yield sched.SubnGet(IBA.SMPNodeInfo,path);
    if ninf.nodeGUID in guids:
        return;
    guids[ninf.nodeGUID] = ninf;

    if ninf.nodeType == IBA.NODE_SWITCH:
        # Query every switch port in parallel.
        sched.mqueue(get_port_info(sched,path,I,True)
                     for I in range(1,ninf.numPorts+1));
        pinf = yield sched.SubnGet(IBA.SMPPortInfo,path,0);
    else:
        yield get_port_info(sched,path,ninf.localPortNum,
                            len(path.drPath) == 1);

guids = {};
end_port = rdma.get_end_port();
with rdma.get_umad(end_port) as umad:
    sched = rdma.sched.MADSchedule(umad);
    local_path = rdma.path.IBDRPath(end_port);
    sched.run(get_node_info(sched,local_path));
5.4.1. What can be Yielded¶
- A generator can yield:
  - A coroutine. The coroutine is scheduled to run as though rdma.sched.MADSchedule.queue() had been called. The generator does not wait for the coroutine to finish.
  - The result of queue(), or the result of yielding a coroutine. Yield will return once the thing queued is finished.
  - The result of mqueue(). Yield will return once the generator is exhausted and all the coroutines it spawned are finished.
- A coroutine can yield:
  - The result of an RPC call function (a tuple describing the MAD to send). The yield result will be the MAD reply.
  - A coroutine. Once the coroutine exits by raising StopIteration, the yield result will be the value of result, or True if result is None. Exceptions raised by the coroutine will propagate through the yield as though the yield was a function call.
  - A generator. This is identical to yielding a coroutine: the generator runs sequentially through its work and blocks at each yield.
  - The result of queue(). Yield will return once the thing queued is finished.
  - The result of mqueue(). Yield will return once the generator is exhausted and all the coroutines it spawned are finished.
  - None. Yield immediately returns. This is useful for calling something that might be a coroutine or a normal function that returns None.
-
class
rdma.sched.
Context
(op, gengen, parent=None)¶ Bases:
object
-
class
rdma.sched.
MADSchedule
(umad)¶ Bases:
rdma.madtransactor.MADTransactor
This class provides a MADTransactor interface suitable for use by python coroutines. The implementation gets MAD parallelism by running multiple coroutines at once. Coroutines are implemented as generators.
umad is a
rdma.umad.UMAD
instance which will be used to issue the MADs.
-
class
Work
(buf, fmt, path, newer, completer)¶ Bases:
tuple
-
buf
¶ Alias for field number 0
-
completer
¶ Alias for field number 4
-
fmt
¶ Alias for field number 1
-
newer
¶ Alias for field number 3
-
path
¶ Alias for field number 2
-
-
MADSchedule.
is_async
¶
-
MADSchedule.
max_outstanding
= 4¶ Maximum number of outstanding MADs at any time.
-
MADSchedule.
mqueue
(works)¶ works is a generator returning coroutines. All coroutines can run in parallel.
Returns: An opaque context reference.
-
MADSchedule.
queue
(work)¶ work is either a single coroutine or a tuple of coroutines.
Returns: An opaque context reference.
-
MADSchedule.
result
= None¶ Set to return a result from a coroutine
-
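The Work entry above, with "Bases: tuple" and numbered field aliases, has the shape that collections.namedtuple generates. A minimal sketch of an equivalent type follows. The field order is taken from the listing, while the placeholder values are made up and the meaning of newer and completer is not documented here.

```python
import collections

# Field order matches the listing: buf=0, fmt=1, path=2, newer=3, completer=4.
Work = collections.namedtuple("Work", "buf fmt path newer completer")

w = Work(buf=b"\x01", fmt="fmt-placeholder", path="path-placeholder",
         newer=None, completer=None)
```

A named tuple of this kind can be unpacked positionally or read by field name, which suits a value that is yielded from coroutines and consumed by a scheduler.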
5.5. rdma.satransactor
Automatic SubnGet to SubnAdmGet Conversion¶
IBA provides two ways to get information about objects managed by an SMA: the first is a SubnGet SMP RPC to the end port, the second is a SubnAdmGet GMP RPC to the SA. These should return the same information and are generally interchangeable.
This class provides an easy way for tools to access the information either using SubnGet or using SubnAdmGet without really affecting the source code. The SubnGet is transparently recoded into a SubnAdmGet with the proper query components set from the path and attribute ID and proper unwrapping of the SA reply.
The class can wrap both synchronous and asynchronous MADTransactor instances. When wrapping a synchronous instance the class can also automatically resolve a DR path to a LID for use with the SA.
I highly recommend that all tools with the capability to perform SubnGet provide an option to use this class to rely on the SA. IBA defines operation modes that would deny all SubnGet operations without a valid MKey.
Example:
end_port = rdma.get_end_port();
path = rdma.path.IBDRPath(end_port);
with rdma.satransactor.SATransactor(rdma.get_gmp_mad(end_port)) as umad:
    pinf = umad.SubnGet(IBA.SMPPortInfo,path);
    pinf.printer(sys.stdout);
-
class
rdma.satransactor.
SATransactor
(parent, sa_path=None)¶ Bases:
rdma.madtransactor.MADTransactor
This class wraps another MADTransactor and transparently changes SMP queries into corresponding SA queries. It is useful for writing applications that need to support both methods.
There are some limitations due to how the SA interface is defined.
SMPPortInfo
requires the port number, which is often 0. This requires extra work unless the node type is known since the SA does not support the same port 0 semantics. Generally using ninf.localPortNum as the attributeModifier works around this. When using the async interface it is not possible to use an
IBDRPath
since that requires multiple MADs to resolve the DR path to a LID through the SA. The class will collect and cache information in the path to try and work around some of these issues.
It is also a context manager that wraps the parent's
close()
. parent is the
MADTransactor
we are wrapping.
-
SubnGet
(payload, path, attributeModifier=0)¶
-
close
()¶
-
get_path_lid
(path)¶ Resolve path to a LID. This only does something if path is a directed route.
-
is_async
¶
-
prepare_path_lid
(path)¶ Coroutine to resolve path to a LID. This only does something if path is a directed route. This must be performed when using directed route paths with asynchronous MAD transactors.
-
result
¶
-
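The wrapping idea behind SATransactor (intercept SubnGet, forward an SA-style call to the parent, and close the parent when the context exits) can be shown with stand-in classes. This is only a sketch of the pattern: ToyParent and ToyWrapper are hypothetical, and the real class also builds the SA query components from the path and attribute ID and unwraps the SA reply, none of which is modelled here.

```python
class ToyParent(object):
    """Hypothetical parent transactor that records what it was asked."""
    def __init__(self):
        self.closed = False
        self.calls = []

    def SubnAdmGet(self, payload, path=None, attributeModifier=0):
        self.calls.append(("SubnAdmGet", payload, attributeModifier))
        return "sa-reply-placeholder"

    def close(self):
        self.closed = True

class ToyWrapper(object):
    """Forward SubnGet to the parent's SubnAdmGet and wrap its close()."""
    def __init__(self, parent):
        self._parent = parent

    def SubnGet(self, payload, path, attributeModifier=0):
        # The real recoding is far more involved; this only forwards the call.
        return self._parent.SubnAdmGet(payload,
                                       attributeModifier=attributeModifier)

    def close(self):
        return self._parent.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions

parent = ToyParent()
with ToyWrapper(parent) as mad:
    reply = mad.SubnGet("payload-placeholder", "path-placeholder", 1)
```

Because the wrapper exposes the same SubnGet call as the object it wraps, code written against the SMP-style API does not need to change when the SA is used instead.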