Read Completion Boundary

cfg_rcb_status[0] or cfg_rcb_status[1]    Bytes    DW    QW    Credits
0                                         64       16    8     4
1                                         128      32    16    8
When calculating the number of completion credits a non-posted request requires, you must determine how many RCB-bounded blocks the completion response might require. This is the same as the number of completion header credits required.
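The relationship between an RCB value, its equivalent units, and the number of RCB-bounded blocks a read can touch can be sketched as follows. This is a minimal illustration, not part of the core; the function names `rcb_units` and `rcb_blocks_touched` are hypothetical, and the sketch assumes one data credit covers 16 bytes.

```python
CREDIT_BYTES = 16  # assumption: one data credit covers 16 bytes (4 DW)


def rcb_units(rcb_bytes):
    """Express an RCB value (64 or 128 bytes) in DW, QW, and data credits."""
    if rcb_bytes not in (64, 128):
        raise ValueError("RCB is either 64 or 128 bytes")
    return {
        "bytes": rcb_bytes,
        "dw": rcb_bytes // 4,
        "qw": rcb_bytes // 8,
        "credits": rcb_bytes // CREDIT_BYTES,
    }


def rcb_blocks_touched(start_address, length_bytes, rcb_bytes):
    """Count the RCB-bounded blocks spanned by a read of length_bytes.

    Each block may come back as its own completion, so this is also the
    number of completion header credits the request may need.
    """
    if length_bytes < 1:
        raise ValueError("length_bytes must be at least 1")
    first_block = start_address // rcb_bytes
    last_block = (start_address + length_bytes - 1) // rcb_bytes
    return last_block - first_block + 1
```

An 8-byte read that starts at 7Ch crosses the boundary at 80h, so `rcb_blocks_touched(0x7C, 8, 64)` counts two blocks.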
Important Note For High Performance Applications
While the user application can use the programmed RCB value to compute the maximum number of completions returned for a request, most high-performance memory controllers can optionally combine RCB-sized completions for large read requests (read lengths that are multiples of the RCB value) into completions at or near the Max_Payload_Size value programmed for the link. You are encouraged to take advantage of this feature if the memory controller on the host CPU supports it. Exchanging data in completions that are integer multiples (>1) of the RCB value results in greater PCI Express interface utilization and payload efficiency, as well as more efficient use of completion space in the Endpoint receiver.
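To see why combined completions help, the sketch below counts the completion packets needed for one read. It assumes an RCB-aligned request whose length is a multiple of the RCB value, and a completer that returns every completion at a single fixed size; `completion_count` and the sizes used are illustrative assumptions, not values reported by any particular memory controller.

```python
def completion_count(read_bytes, completion_bytes):
    """Number of completions needed to return read_bytes of data when each
    completion carries at most completion_bytes (integer ceiling division)."""
    if read_bytes < 1 or completion_bytes < 1:
        raise ValueError("sizes must be positive")
    return (read_bytes + completion_bytes - 1) // completion_bytes


# A 512-byte read returned in RCB-sized (64-byte) completions
rcb_sized = completion_count(512, 64)
# The same read returned in combined 256-byte completions
combined = completion_count(512, 256)
```

Under these assumptions the same 512 bytes arrive in two packets instead of eight, so fewer completion headers are spent on the same amount of data.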
Methods of Managing Completion Space
A user application can choose one of four methods to manage receive-buffer completion space, as listed in Table C-4. For convenience, this discussion refers to these methods as LIMIT_FC, PACKET_FC, RCB_FC, and DATA_FC. Each method has advantages and disadvantages that you need to consider when developing the user application.

Table C-4: Managing Receive Completion Space Methods

Method: LIMIT_FC
  Description: Limit the total number of outstanding NP Requests
  Advantage: Simplest method to implement in user logic
  Disadvantage: Much Completion capacity goes unused

Method: PACKET_FC
  Description: Track the number of outstanding CplH and CplD credits; allocate and deallocate on a per-packet basis
  Advantage: Relatively simple user logic; finer allocation granularity means less wasted capacity than LIMIT_FC
  Disadvantage: As with LIMIT_FC, credits for an NP are still tied up until the request is completely satisfied

Method: RCB_FC
  Description: Track the number of outstanding CplH and CplD credits; allocate and deallocate on a per-RCB basis
  Advantage: Ties up credits for less time than PACKET_FC
  Disadvantage: More complex user logic than LIMIT_FC or PACKET_FC

Method: DATA_FC
  Description: Track the number of outstanding CplH and CplD credits; allocate and deallocate on a per-RCB basis
  Advantage: Lowest amount of wasted capacity
  Disadvantage: More complex user logic than LIMIT_FC, PACKET_FC, and RCB_FC

UltraScale Devices Gen3 Block for PCIe v4.4 PG156 April 4, 2018
LIMIT_FC Method
The LIMIT_FC method is the simplest to implement. The user application assesses the
maximum number of outstanding Non-Posted Requests allowed at one time, MAX_NP. To calculate this value, perform these steps:
1. Determine the number of CplH credits required by a Max_Request_Size packet:
   Max_Header_Count = ceiling(Max_Request_Size / RCB)
2. Determine the greatest number of maximum-sized completions supported by the CplD credit pool:
   Max_Packet_Count_CplD = floor(CplD / Max_Request_Size)
3. Determine the greatest number of maximum-sized completions supported by the CplH credit pool:
   Max_Packet_Count_CplH = floor(CplH / Max_Header_Count)
4. Use the smaller of the two quantities from steps 2 and 3 to obtain the maximum number of outstanding Non-Posted requests:
   MAX_NP = min(Max_Packet_Count_CplH, Max_Packet_Count_CplD)
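The four steps above can be written as a short function. This is a minimal sketch; the function name `max_np` and its parameter names are hypothetical. It assumes Max_Request_Size, RCB, and the CplD pool are all expressed in bytes, and CplH as a count of header credits.

```python
def max_np(max_request_size, rcb, cplh, cpld_bytes):
    """Maximum number of outstanding non-posted requests under LIMIT_FC."""
    # Step 1: CplH credits needed by one Max_Request_Size request
    max_header_count = (max_request_size + rcb - 1) // rcb
    # Step 2: maximum-sized completions the CplD pool can hold
    max_packet_count_cpld = cpld_bytes // max_request_size
    # Step 3: maximum-sized completions the CplH pool can hold
    max_packet_count_cplh = cplh // max_header_count
    # Step 4: the smaller pool sets the limit
    return min(max_packet_count_cplh, max_packet_count_cpld)


# Max_Request_Size = 128B, RCB = 64B, CplH = 64, CplD = 15,872B
limit = max_np(max_request_size=128, rcb=64, cplh=64, cpld_bytes=15872)
```

With these inputs the header pool is the limiting resource, and `limit` evaluates to 32.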
With knowledge of MAX_NP, the user application can load a register NP_PENDING with zero at reset and make sure it always stays within the range 0 to MAX_NP. When a non-posted request is transmitted, NP_PENDING increases by one. When all completions for an outstanding non-posted request are received, NP_PENDING decreases by one.
For example, given:
• Max_Request_Size = 128B
• RCB = 64B
• CplH = 64
• CplD = 15,872B
the calculation yields:
• Max_Header_Count = 2
• Max_Packet_Count_CplD = 124
• Max_Packet_Count_CplH = 32
• MAX_NP = 32
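The NP_PENDING bookkeeping can be modeled in software as below. This is a sketch under a stated assumption: NP_PENDING counts requests currently in flight, so it goes up on transmit and down on completion, which is the reading consistent with a zero reset value and the 0 to MAX_NP range. The class name `NpLimiter` and its methods are hypothetical and stand in for user logic.

```python
class NpLimiter:
    """Software model of LIMIT_FC gating with a single NP_PENDING counter."""

    def __init__(self, max_np):
        self.max_np = max_np
        self.np_pending = 0  # loaded with zero at reset

    def try_send(self):
        """Return True and count the request if one more NP request fits."""
        if self.np_pending >= self.max_np:
            return False  # limit reached: hold the request
        self.np_pending += 1
        return True

    def complete(self):
        """Call once all completions for one outstanding request arrived."""
        if self.np_pending == 0:
            raise RuntimeError("completion received with nothing pending")
        self.np_pending -= 1
```

A request that is refused by `try_send` is simply held until a call to `complete` frees a slot.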
Although this method is the simplest to implement, it can waste the most receiver space because an entire Max_Request_Size block of completion credit is allocated for each non-posted request, regardless of actual request size. The waste grows as the user application issues a larger proportion of short memory reads (on the order of a single DWORD), I/O reads, and I/O writes.
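As a rough illustration of that waste, the sketch below compares the data a request actually returns with the Max_Request_Size block that LIMIT_FC sets aside for it. The function name `limit_fc_data_utilization` is hypothetical, and the 128-byte Max_Request_Size is only an example value.

```python
def limit_fc_data_utilization(actual_read_bytes, max_request_size):
    """Fraction of the reserved completion data space a request really uses."""
    if not 0 <= actual_read_bytes <= max_request_size:
        raise ValueError("request must not exceed Max_Request_Size")
    return actual_read_bytes / max_request_size


# A single-DWORD (4-byte) read with Max_Request_Size = 128B
single_dword = limit_fc_data_utilization(4, 128)
# An I/O write returns no data at all
io_write = limit_fc_data_utilization(0, 128)
```

In this example a single-DWORD read uses 4 of the 128 reserved bytes, and an I/O write uses none of them.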
PACKET_FC Method
The PACKET_FC method allocates blocks of credit in finer granularities than LIMIT_FC, using the receive completion space more efficiently with a small increase in user logic.
Start with two registers, CPLH_PENDING and CPLD_PENDING (both loaded with zero at reset), and then perform these steps:
1. When the user application needs to send an NP request, determine the potential number of CplH and CplD credits it might require:
   NP_CplH = ceiling[((Start_Address mod RCB) + Request_Size) / RCB]
   NP_CplD = ceiling[((Start_Address mod 16 bytes) + Request_Size) / 16 bytes] (except I/O Write, which returns zero data) [(req_size + 15)/16]
   The modulo and ceiling functions ensure that any fractional RCB or credit blocks are rounded up. For example, if a memory read requests 8 bytes of data from address 7Ch, the data can potentially be returned over two completion packets (7Ch-7Fh, followed by 80h-83h). This would require two RCB blocks and two data credits.
2. Check these:
   CPLH_PENDING + NP_CplH < Total_CplH
   CPLD_PENDING + NP_CplD < Total_CplD
3. If both inequalities are true, transmit the non-posted request, and increase CPLH_PENDING by NP_CplH and CPLD_PENDING by NP_CplD. For each non-posted request transmitted, keep NP_CplH and NP_CplD for later use.
4. When all completion data is returned for a non-posted request, decrease CPLH_PENDING and CPLD_PENDING accordingly.
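The PACKET_FC steps can be modeled as follows. This is a minimal software sketch rather than a hardware implementation: the names `np_credits`, `PacketFc`, and the per-request `tag` key are hypothetical, Total_CplD is assumed to be expressed in 16-byte data credits, and the strict `<` comparisons are taken as written in step 2.

```python
CREDIT_BYTES = 16  # assumption: one data credit covers 16 bytes


def np_credits(start_address, request_size, rcb, returns_data=True):
    """Step 1: potential CplH and CplD credits for one NP request."""
    np_cplh = ((start_address % rcb) + request_size + rcb - 1) // rcb
    if not returns_data:  # for example an I/O Write, which returns no data
        return np_cplh, 0
    offset = start_address % CREDIT_BYTES
    np_cpld = (offset + request_size + CREDIT_BYTES - 1) // CREDIT_BYTES
    return np_cplh, np_cpld


class PacketFc:
    """Tracks CPLH_PENDING and CPLD_PENDING on a per-packet basis."""

    def __init__(self, total_cplh, total_cpld):
        self.total_cplh = total_cplh
        self.total_cpld = total_cpld
        self.cplh_pending = 0  # loaded with zero at reset
        self.cpld_pending = 0
        self.in_flight = {}    # tag -> (NP_CplH, NP_CplD) kept for later

    def try_send(self, tag, start_address, request_size, rcb,
                 returns_data=True):
        """Steps 2 and 3: check both pools, then allocate and record."""
        cplh, cpld = np_credits(start_address, request_size, rcb,
                                returns_data)
        if (self.cplh_pending + cplh < self.total_cplh
                and self.cpld_pending + cpld < self.total_cpld):
            self.cplh_pending += cplh
            self.cpld_pending += cpld
            self.in_flight[tag] = (cplh, cpld)
            return True
        return False  # not enough completion space: hold the request

    def complete(self, tag):
        """Step 4: release credits once all completion data has returned."""
        cplh, cpld = self.in_flight.pop(tag)
        self.cplh_pending -= cplh
        self.cpld_pending -= cpld
```

For the 8-byte read from address 7Ch described in step 1, `np_credits(0x7C, 8, 64)` returns two header credits and two data credits.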