Wednesday, August 26, 2009

Software design and implementation for TCP/UDP/IP Checksum offloading Interface

        Most modern NICs (Network Interface Cards) offer gigabit speeds and come with TCP/UDP/IP checksum offloading support. Embedded operating systems can no longer postpone integrating checksum offloading support into their drivers and protocol stacks.

        A look at a few controllers, such as the Xilinx TEMAC and the PowerPC TSEC, makes it clear that the extent of support varies widely from one controller to another and is not standardized. This, in turn, complicates the design and implementation of a standardized software interface that can support all kinds of controllers.
Varying features of checksum offloading

Feature | How controllers differ
Supported layers (TCP/UDP/IP) | Some support only TCP/UDP. Some support IP too.
Support for packets with an options header | Some controllers do not calculate the checksum for packets with an options header.
Checksum calculation for the UDP/TCP pseudo header | Most controllers that do not support the IP layer and options headers expect the protocol stack to calculate the UDP/TCP pseudo header checksum and seed the engine with it.
Driver interface specification | The driver interface, such as the input parameters and output format of the Checksum Offload Engine, varies, though it is mostly exposed through buffer descriptors.
Support for fragmented packets | Some support fragmented packets and some do not.
Error packet handling | Whether the hardware rejects erroneous packets or leaves them to the software also varies.
VLAN packet support | Some support VLAN packets and some do not.


        From the above table, it is clear that the protocol stack cannot fully depend on the hardware engine for checksum calculation. Some packets are not supported by the Checksum Offload Engine, and they have to go through the software checksum calculation.

        In this article, I try to design a standardized software interface for the checksum offloading functionality. The implementation is divided into three modules called Configuration, Outbound flow and Inbound flow.
  • Configuration
        It is about advertising the abilities of the controller's Checksum Offload Engine across the protocol stack (let's call it COE; I hesitate to name it TOE (TCP offload engine) since it offloads UDP checksum calculation too). In other words, it is about initializing the network interface structures with the capabilities of the COE so that each protocol layer can check whether the COE supports that particular layer and process the packets accordingly. The main capabilities to be advertised are: Which layers are supported (UDP/TCP/IP)? What is the extent of support (partial or full, where partial means the COE does not calculate the pseudo header checksum itself)? Does it support fragments or not?
Interface configuration
COE supports IP?
COE supports TCP?
COE supports UDP?
COE support is full or partial?
COE supports fragmented packets?


        How this information is maintained in the network protocol stack is implementation specific. However, this article suggests storing it as flags in the network interface structure, with the driver doing the initialization. Ok! How can this information be used in the protocol stack? While sending and receiving each packet, each layer refers to the above flags to learn the abilities of the COE and processes the packet accordingly. So, the TCP/UDP layer must support three types of processing: 1) Software 2) Partial 3) Full. And the IP layer must support two types: 1) Software 2) Full. Each type is explained below.

        Software: If the COE does not support the particular layer or the packet type (for example, fragmented packets), the checksum will be calculated as usual by the software routine.

        Partial: This is a special case, mainly for the TCP and UDP layers. Some COEs support TCP and UDP checksum calculation but require the protocol stack to calculate the pseudo header checksum and feed it to them (What a pity!). In this case, the TCP and UDP layers need to calculate only the pseudo header checksum and send it to the driver.

        Full: The COE calculates the whole checksum for a particular layer. The software routine does not need to do anything.
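        The flags and the three processing types could be sketched in C roughly as below. This is only a minimal illustration: the names `coe_cap`, `netif_coe`, `coe_l4_mode` and `coe_ip_mode` are made up for this article and do not come from any particular stack or driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability flags that the driver sets once at initialization. */
enum coe_cap {
    COE_CAP_IP   = 1u << 0, /* COE handles the IP header checksum */
    COE_CAP_TCP  = 1u << 1, /* COE handles the TCP checksum */
    COE_CAP_UDP  = 1u << 2, /* COE handles the UDP checksum */
    COE_CAP_FULL = 1u << 3, /* set: full support; clear: partial support */
    COE_CAP_FRAG = 1u << 4  /* COE can handle fragmented packets */
};

enum coe_mode { COE_MODE_SOFTWARE, COE_MODE_PARTIAL, COE_MODE_FULL };

/* Assumed to be embedded in the stack's network interface structure. */
struct netif_coe {
    uint32_t caps;
};

/* Pick the processing type for a TCP or UDP packet.
 * layer_cap is COE_CAP_TCP or COE_CAP_UDP. */
static enum coe_mode coe_l4_mode(const struct netif_coe *nif,
                                 uint32_t layer_cap, bool fragmented)
{
    if (!(nif->caps & layer_cap))
        return COE_MODE_SOFTWARE;   /* layer not supported by the COE */
    if (fragmented && !(nif->caps & COE_CAP_FRAG))
        return COE_MODE_SOFTWARE;   /* packet type not supported */
    return (nif->caps & COE_CAP_FULL) ? COE_MODE_FULL : COE_MODE_PARTIAL;
}

/* The IP layer only distinguishes software and full processing. */
static enum coe_mode coe_ip_mode(const struct netif_coe *nif)
{
    return (nif->caps & COE_CAP_IP) ? COE_MODE_FULL : COE_MODE_SOFTWARE;
}
```

        A single bitmask keeps the per-packet check cheap, which matters because every layer consults it for every packet.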

        Done ! Now, COE abilities are maintained in the protocol stack. Let's see how to process each packet.
  • Outbound flow and parameters
        Generalizing the input parameters for the controllers yields the following table of input parameters that need to be passed to the driver with each packet.
Outbound parameters

Parameter | Type
Layer3 type = IPv4 or IPv6? | flag
Layer4 type = TCP or UDP? | flag
Layer3 calculation by COE? | flag
Layer4 calculation by COE? | flag
Should calculate Pseudo header? | flag
Byte offset for layer4 start | offset
Checksum offset for layer4 | offset
Checksum offset for layer3 | offset
Pseudo Header Checksum | data


        But some of the parameters can easily be calculated or fixed by the driver rather than being sent all the way down with the packet. With that optimization, the list becomes as follows:

Optimized outbound parameters

Parameter | Type
Layer3 type = IPv4 or IPv6? | flag
Layer4 type = TCP or UDP? | flag
Layer3 calculation by COE? | flag
Layer4 calculation by COE? | flag
Pseudo Header Checksum | data
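        As a rough sketch, the optimized list could travel with each packet as a small structure, while the driver works out the offsets on its own. The structure and function names here (`coe_tx_req`, `coe_tx_offsets`, `coe_derive_offsets`) are hypothetical. The offsets used are the standard ones: the IPv4 header checksum sits at byte 10 of the IP header, the TCP checksum at byte 16 of the TCP header and the UDP checksum at byte 6 of the UDP header. IPv6 has no header checksum, so the layer3 checksum offset is left at 0 in that case.

```c
#include <stdbool.h>
#include <stdint.h>

enum coe_l3_type { COE_L3_IPV4, COE_L3_IPV6 };
enum coe_l4_type { COE_L4_TCP, COE_L4_UDP };

/* Hypothetical per-packet metadata mirroring the optimized parameter list. */
struct coe_tx_req {
    enum coe_l3_type l3_type;
    enum coe_l4_type l4_type;
    bool     l3_by_coe;        /* Layer3 calculation by COE? */
    bool     l4_by_coe;        /* Layer4 calculation by COE? */
    uint16_t pseudo_hdr_csum;  /* only meaningful for partial support */
};

/* Values the driver derives itself instead of receiving with the packet. */
struct coe_tx_offsets {
    uint16_t l4_start;     /* byte offset where the layer4 header starts */
    uint16_t l3_csum_off;  /* byte offset of the IPv4 header checksum, 0 for IPv6 */
    uint16_t l4_csum_off;  /* byte offset of the TCP/UDP checksum field */
};

/* l3_start: offset of the IP header in the frame (14 for plain Ethernet).
 * l3_hdr_len: IP header length in bytes, parsed by the driver. */
static struct coe_tx_offsets coe_derive_offsets(const struct coe_tx_req *req,
                                                uint16_t l3_start,
                                                uint16_t l3_hdr_len)
{
    struct coe_tx_offsets o;

    o.l4_start = (uint16_t)(l3_start + l3_hdr_len);
    o.l3_csum_off = (req->l3_type == COE_L3_IPV4)
                        ? (uint16_t)(l3_start + 10) : 0;
    o.l4_csum_off = (uint16_t)(o.l4_start +
                               (req->l4_type == COE_L4_TCP ? 16 : 6));
    return o;
}
```

        For an untagged Ethernet frame carrying IPv4 without options (l3_start = 14, l3_hdr_len = 20), this gives a layer4 start of 34 and a UDP checksum offset of 40.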


        Now, the UDP and IP checksum calculation logic in the protocol stack becomes as follows. All the above parameters are passed to the driver, and the driver programs the COE using them. That is everything. The packet will come out of the controller with its checksum calculated.

UDP
Set UDP Checksum field to 0.
IF (COE Supports UDP? is No)
{
    Calculate by software
}
ELSE IF (this is fragmented packet AND COE supports fragmented packet? is False)
{
    /* COE cannot handle this packet type */
    Calculate by software
}
ELSE
{
    /* Just send the packet. Let COE calculate the checksum */
    Set Layer4 type = UDP;
    Set Layer4 calculation by COE? = Yes;
    IF (COE support is partial)
    {
        Pseudo header checksum = Calculate just pseudo checksum
    }
}
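The UDP outbound logic above could look roughly like this in C for IPv4. It is a sketch under assumptions: the names `udp_tx_decision`, `udp4_pseudo_sum` and `udp_tx_decide` are made up, addresses are taken in host byte order, and the pseudo header value is the folded 16-bit one's complement sum without the final inversion. Whether a given controller wants the seed inverted, or in a particular byte order, has to be checked in its manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of the UDP outbound decision. */
struct udp_tx_decision {
    bool     software;     /* stack must compute the whole checksum itself */
    bool     l4_by_coe;    /* Layer4 calculation by COE? */
    bool     has_pseudo;   /* pseudo_csum is valid (partial support) */
    uint16_t pseudo_csum;  /* folded pseudo header sum, not inverted */
};

/* One's complement sum of the IPv4 UDP pseudo header:
 * source address, destination address, zero byte + protocol, UDP length. */
static uint16_t udp4_pseudo_sum(uint32_t src, uint32_t dst, uint16_t udp_len)
{
    uint32_t sum = 0;

    sum += src >> 16;
    sum += src & 0xFFFFu;
    sum += dst >> 16;
    sum += dst & 0xFFFFu;
    sum += 17u;            /* zero byte followed by protocol number 17 (UDP) */
    sum += udp_len;
    while (sum >> 16)      /* fold the carries back into the low 16 bits */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

static struct udp_tx_decision udp_tx_decide(bool coe_udp, bool coe_full,
                                            bool coe_frag, bool fragmented,
                                            uint32_t src, uint32_t dst,
                                            uint16_t udp_len)
{
    struct udp_tx_decision d = { false, false, false, 0 };

    if (!coe_udp || (fragmented && !coe_frag)) {
        d.software = true;           /* calculate by software */
        return d;
    }
    d.l4_by_coe = true;              /* let the COE calculate the checksum */
    if (!coe_full) {                 /* partial: supply the pseudo header sum */
        d.has_pseudo = true;
        d.pseudo_csum = udp4_pseudo_sum(src, dst, udp_len);
    }
    return d;
}
```

For example, with source 192.168.0.1, destination 192.168.0.2 and a UDP length of 16, the pseudo header sum works out to 0x8175.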


IP
IF (COE Supports IP? is No)
{
    Calculate by software
}
ELSE
{
    Set Layer3 type = IP;
    Set Layer3 calculation by COE? = Yes;
}

The TCP logic is the same as the UDP logic. How all this information is passed to the driver is implementation specific. However, it can be passed with the packet structure as the flags and data bytes listed in the above table.
  • Inbound flow and parameters
        Some controllers give the calculated checksum, and some only report whether the checksum verification passed or failed. And some controllers drop the erroneous packets themselves. So, as a general rule, this article suggests verifying the checksum at the driver level and simply dropping the packets with a checksum error. No parameters are passed from the driver to the upper layer. It follows that only two types of packets are sent to the upper layer: 1) packets whose checksum has been verified as correct 2) packets unsupported by the COE (for example, fragmented packets). So the UDP and TCP layers must verify by software the checksum of packets the COE could not handle, such as fragmented packets. The logic becomes as follows:

IP
IF (COE Supports IP? is No)
{
    Verify by software
}

/* All received are correct packets, when COE is enabled */


UDP
IF (COE Supports UDP? is No)
{
    Verify by software
}
ELSE IF (packet is fragmented AND COE supports fragmented packet? is False)
{
    Verify by software
}
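
The inbound decision and the software fallback could be sketched in C as below. The function names (`udp_rx_needs_sw_verify`, `csum_sum`, `ipv4_header_ok`) are made up for illustration. The verification relies on the standard property of the Internet checksum: summing a header together with its own checksum field in one's complement arithmetic gives 0xFFFF when the header is intact.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the UDP inbound logic: decide whether the stack must verify
 * the checksum itself because the COE did not cover this packet. */
static bool udp_rx_needs_sw_verify(bool coe_udp, bool coe_frag,
                                   bool fragmented)
{
    if (!coe_udp)
        return true;                /* layer not supported by the COE */
    if (fragmented && !coe_frag)
        return true;                /* packet type not supported */
    return false;                   /* driver already dropped bad packets */
}

/* Folded 16-bit one's complement sum over a buffer of big-endian words. */
static uint16_t csum_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)                    /* pad a trailing odd byte with zero */
        sum += (uint32_t)data[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Software verification of an IPv4 header, used when the COE
 * does not support the IP layer. */
static bool ipv4_header_ok(const uint8_t *hdr, size_t hdr_len)
{
    return csum_sum(hdr, hdr_len) == 0xFFFFu;
}
```

The same `csum_sum` routine can serve the UDP and TCP fallback if it is run over the pseudo header and the segment together.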

The algorithm for TCP is the same as for UDP. That is all. Big job Done!!

(Please leave your comments on this article. That will help me to improve this. See you!)
