Full 100G TCP Offload for AMD Alveo Accelerator Card | TOE100G-IP Core
TOE100G-IP Core on Alveo Card for Full TCP offload

Nowadays, large amounts of data need to be stored and accessed through the servers inside the data center. The network channel of each server at the same location requires high bandwidth because of the large data size.

The network channel for sharing large amounts of data among servers installed at different locations is also very important. If the connection between locations does not have enough bandwidth, or cannot use the network bandwidth effectively, it becomes the critical point and bottleneck of the overall system.

A 100G Ethernet connection is an ideal solution to the large-data problem in the data center. It supports 100 Gbps for transferring data efficiently at a reasonable infrastructure cost, which satisfies the data center's requirements.

However, without a hardware accelerator, network performance sometimes drops because the CPU and the OS need to handle other tasks. The performance graph below shows that about 68 Gbps, or 68% of the maximum bandwidth of 100G Ethernet, can be achieved, with significant drops at times.

100G network performance limitation without hardware accelerator

Next, let us look at the CPU tasks for handling TCP/IP packets when a standard NIC is used. The software on the CPU consists of many parts, one for processing each network layer.

Starting from the lowest layer, these are the device driver, the network subsystem, the TCP/IP stack, the socket interface, and the application.

To remove this CPU bottleneck, a complete CPU offload engine implemented on an accelerator card is proposed. Most CPU tasks for handling TCP/IP packets are handled by the TOE100G-IP and the Alveo accelerator card instead.

Offload Engine By Accelerator Card Ref: https://www.cs.cornell.edu/~qizhec/paper/tcp_2021.pdf

Implementation of Full 100G TCP offload by TOE100G-IP and Alveo Card

There are two key hardware blocks inside the Alveo accelerator card, the TOE100G-IP and the DMA engine, which together build the accelerator system.

In the Sender process, the DMA engine transfers the data from the system memory to the TOE100G-IP. After that, the TOE100G-IP builds Ethernet packets that include the application data and transfers them to the target system via 100G Ethernet.

In the Receiver process, the TOE100G-IP extracts the application data from the Ethernet packets received on 100G Ethernet.

Next, the DMA engine transfers the application data from the TOE100G-IP to the system memory. The application can then process the data in the system memory.

Offload Engine by Accelerator Card

Let’s look at the data flow of the Send process in more detail.

  • First, the TOE Application generates the data, called the TCP payload, and writes it to the main memory.
  • Next, the TOE Application sends a request to the DMA engine, via the TOE function, to transfer the data from the main memory to the TOE100G-IP.
  • Finally, the TOE Application sends a request to the TOE100G-IP to create Ethernet frames that include the TCP payload data and send them to the target system.
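
The three Send steps above can be sketched as a small software model. This is only an illustration of the order of operations, not the actual TOE100G-IP or driver interface: the names `ToeModel`, `dma_to_toe`, and `send_flow`, and the 8,960-byte payload limit, are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToeModel:
    """Hypothetical software stand-in for the send path of the TOE100G-IP."""
    tx_buffer: bytearray = field(default_factory=bytearray)
    frames: list = field(default_factory=list)

    def load(self, data: bytes) -> None:
        # Data arriving from the DMA engine is queued inside the IP.
        self.tx_buffer.extend(data)

    def send(self, max_payload: int) -> int:
        # Split the queued data into TCP payloads, one per Ethernet frame.
        # Header generation is left out of this model.
        sent = 0
        while self.tx_buffer:
            self.frames.append(bytes(self.tx_buffer[:max_payload]))
            del self.tx_buffer[:max_payload]
            sent += 1
        return sent


def dma_to_toe(main_memory: bytes, offset: int, length: int, toe: ToeModel) -> None:
    # Step 2: the DMA engine copies a region of main memory to the IP.
    toe.load(main_memory[offset:offset + length])


def send_flow(payload: bytes, toe: ToeModel, max_payload: int = 8960) -> int:
    main_memory = bytes(payload)                       # Step 1: payload written to main memory
    dma_to_toe(main_memory, 0, len(main_memory), toe)  # Step 2: DMA request
    return toe.send(max_payload)                       # Step 3: send request to the IP


if __name__ == "__main__":
    toe = ToeModel()
    data = bytes(i & 0xFF for i in range(25_600))
    print(send_flow(data, toe), "frames queued")
```

In the real system, steps 2 and 3 run in hardware, so the CPU only issues the two requests rather than copying or segmenting the data itself.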

When the test application writes incremental data, the performance result is up to 9,180 MB/s. Without spending CPU time on generating incremental data, assuming that the data is already available in the main memory (as dummy data), a peak performance of 12,300 MB/s on 100G Ethernet can be achieved.
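
To relate these MB/s figures to the 100 Gbps line rate, the conversion can be written out as below. The sketch assumes that 1 MB means 10^6 bytes, which the article does not state; the helper name `mb_per_s_to_gbps` is made up for this example.

```python
LINE_RATE_GBPS = 100.0  # nominal 100G Ethernet line rate


def mb_per_s_to_gbps(mb_per_s: float) -> float:
    # Assumption: 1 MB = 10**6 bytes, 1 Gbps = 10**9 bits per second.
    return mb_per_s * 1e6 * 8 / 1e9


if __name__ == "__main__":
    for label, rate in [("with data generation", 9_180), ("dummy data", 12_300)]:
        gbps = mb_per_s_to_gbps(rate)
        share = 100.0 * gbps / LINE_RATE_GBPS
        print(f"{label}: {rate} MB/s = {gbps:.2f} Gbps ({share:.1f}% of line rate)")
```

Under that assumption, 9,180 MB/s corresponds to about 73.4 Gbps and 12,300 MB/s to about 98.4 Gbps, compared with the roughly 68 Gbps measured without a hardware accelerator.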

TOE100G-IP on Alveo Card (Send)

In the Receive process, the data flow is reversed.

  • The TOE100G-IP receives the Ethernet frame, extracts the TCP payload from it, and transfers the payload to the DMA engine.
  • Next, the DMA engine uploads the data to the main memory and signals to the TOE Application that new data has arrived.

The TOE Application then reads the data from the main memory and verifies it. Similar to the Send process, the performance is about 9,700 MB/s when the Application verifies the received data. Without spending CPU time on data verification, the Application shows a peak performance of 12,300 MB/s.
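
The Receive steps can be modelled in the same spirit. The sketch below is an assumption-laden illustration: the function names are hypothetical, the 54-byte header assumes Ethernet, IPv4, and TCP headers with no VLAN tag or options, and the byte-wise incrementing pattern is only a guess at what "incremental data" looks like in the test application.

```python
HEADER_LEN = 14 + 20 + 20  # Ethernet + IPv4 + TCP headers (assumed: no VLAN tag, no options)


def extract_payload(frame: bytes, header_len: int = HEADER_LEN) -> bytes:
    # Model of the TOE100G-IP stripping the headers from a received frame.
    return frame[header_len:]


def receive_flow(frames, header_len: int = HEADER_LEN):
    main_memory = bytearray()
    notifications = 0
    for frame in frames:
        payload = extract_payload(frame, header_len)
        main_memory.extend(payload)  # model of the DMA engine uploading to main memory
        notifications += 1           # model of the "new data" signal to the application
    return bytes(main_memory), notifications


def verify_incremental(data: bytes) -> bool:
    # Application-side check; this is the part that costs CPU time.
    return all(value == (index & 0xFF) for index, value in enumerate(data))


if __name__ == "__main__":
    pattern = bytes(i & 0xFF for i in range(3000))
    frames = [bytes(HEADER_LEN) + pattern[i:i + 1000] for i in range(0, 3000, 1000)]
    data, count = receive_flow(frames)
    print(count, "notifications, data valid:", verify_incremental(data))
```

Only `verify_incremental` corresponds to work that stays on the CPU, which matches the observation that disabling verification raises throughput from about 9,700 MB/s to 12,300 MB/s.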

TOE100G-IP on Alveo Card (Receive)

Now we show the demo setup and the performance results obtained with two Accelerator systems.

Each Accelerator system consists of an Alveo card, either U50 or U250, and DG's Turnkey Accelerator system.

Test Environment Set Up

Run the application, TOE100DMATest, on the two Turnkey systems.

The left-side console shows the IP initialized in Server mode. The right-side console shows the IP initialized in Client mode.

To show a half duplex transfer, the left-side console selects the Send data test menu using 256 GB of data, with jumbo frame size applied. The right-side console selects the Receive data test menu. With test data generation and verification disabled, 12,300 MB/s can be achieved.
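
As a rough cross-check of the 12,300 MB/s figure, the best-case TCP payload rate on a 100G link can be estimated from the per-frame overhead. The sketch below assumes a 9,000-byte MTU for jumbo frames (the article does not give the exact size), IPv4 and TCP headers without options, 1 MB = 10^6 bytes, and 1 GB = 10^9 bytes. The function names are made up for this example.

```python
ETH_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, Ethernet header, FCS, inter-frame gap (bytes)
IP_TCP_HEADERS = 20 + 20        # IPv4 + TCP headers without options (bytes)


def tcp_goodput_mb_per_s(mtu: int, line_rate_gbps: float = 100.0) -> float:
    payload = mtu - IP_TCP_HEADERS  # TCP payload bytes carried per frame
    on_wire = mtu + ETH_OVERHEAD    # bytes each frame occupies on the link
    line_rate_mb_per_s = line_rate_gbps * 1e9 / 8 / 1e6
    return line_rate_mb_per_s * payload / on_wire


def transfer_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    # Assumption: 1 GB = 10**9 bytes and 1 MB = 10**6 bytes.
    return size_gb * 1e9 / (rate_mb_per_s * 1e6)


if __name__ == "__main__":
    print(f"jumbo (MTU 9000):    {tcp_goodput_mb_per_s(9000):.0f} MB/s")
    print(f"standard (MTU 1500): {tcp_goodput_mb_per_s(1500):.0f} MB/s")
    print(f"256 GB at 12,300 MB/s: {transfer_seconds(256, 12_300):.1f} s")
```

Under these assumptions the jumbo-frame limit comes out near 12,390 MB/s, so the measured 12,300 MB/s is close to what the link can carry, while a standard 1,500-byte MTU would cap the payload rate at roughly 11,870 MB/s.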

Half duplex TCP sending

Half duplex TCP receiving

When running a full duplex transfer, the performance result is about 10,000 MB/s.

Full duplex TCP sending & receiving

Example Use Cases

The TOE100G-IP with Alveo card demo can be applied to real-time data processing applications. The system can transfer large amounts of data in a very short time, which is the core requirement of this kind of application. When the bandwidth is not enough, the number of 100G Ethernet connections can be increased by adding more Alveo cards.
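
A first estimate of how many 100G connections a given data rate needs is a simple ceiling division. The sketch below assumes that each connection sustains the 12,300 MB/s peak reported above and that throughput scales linearly with the number of connections; neither is guaranteed by the article, and `links_needed` is a hypothetical helper.

```python
import math


def links_needed(required_mb_per_s: float, per_link_mb_per_s: float = 12_300.0) -> int:
    # Assumption: every 100G connection delivers the same sustained rate
    # and the rates simply add up.
    if required_mb_per_s <= 0:
        raise ValueError("required rate must be positive")
    return math.ceil(required_mb_per_s / per_link_mb_per_s)


if __name__ == "__main__":
    for rate in (10_000, 30_000):
        print(f"{rate} MB/s -> {links_needed(rate)} x 100G connection(s)")
```

For example, a 30,000 MB/s data source would need three connections under these assumptions.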

Real-Time Data Processing

One TOE100G-IP is designed to handle the data of one TCP session. When multiple TCP sessions are required for transferring several data types, multiple TOE100G-IPs and DMA engines can be integrated into the Accelerator system.
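
The one-session-per-IP rule means that software on top of such a system has to track which TOE100G-IP instance serves which TCP session. A minimal sketch of that bookkeeping follows; the `ToePool` class and the way a session is identified are assumptions for illustration, not part of the TOE100G-IP deliverables.

```python
class ToePool:
    """Hypothetical allocator: each TOE100G-IP instance serves one TCP session."""

    def __init__(self, engine_count: int) -> None:
        self._free = list(range(engine_count))  # indices of idle IP instances
        self._by_session = {}                   # session key -> engine index

    def open(self, session) -> int:
        # Reuse the engine if the session is already open.
        if session in self._by_session:
            return self._by_session[session]
        if not self._free:
            raise RuntimeError("no free TOE100G-IP instance for a new TCP session")
        engine = self._free.pop(0)
        self._by_session[session] = engine
        return engine

    def close(self, session) -> None:
        # Return the engine to the idle list so another session can use it.
        self._free.append(self._by_session.pop(session))


if __name__ == "__main__":
    pool = ToePool(engine_count=2)
    print(pool.open(("192.168.100.2", 5000)))  # first session
    print(pool.open(("192.168.100.3", 5001)))  # second session
```

With two instances integrated, a third simultaneous session is rejected until one of the first two is closed, which is why the number of IPs has to match the number of concurrent sessions the application needs.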

Real-Time Data Processing

When the Accelerator system needs to support both the TCP/IP and UDP/IP protocols, the UDP100G-IP can also be integrated to work together with the TOE100G-IP.

UDP/TCP Data Processing

For more information about TOE100G-IP and DG's Turnkey Accelerator System, please visit our website.
