Thursday, December 22, 2011

Little Endian support in MicroBlaze: Application Notes

Little Endian support was added to MicroBlaze in EDK 12.3 along with AXI bus support. These notes focus on the differences in the MicroBlaze core and the features added by each EDK version, and on the software changes needed for Little Endian support.

EDK Version History

ISE Design Suite 12.3
ISE Design Suite 13.1
ISE Design Suite 13.2 & 13.3
  • MicroBlaze version V8.20.a
Software Changes
Since both the MicroBlaze CPU and the AXI bus support Little Endian, no specific changes are required except in the following cases:

1) Reading or writing data using different types of pointers

For example, reading a half-word through a word pointer, or writing a byte through a word pointer. In the AXI UART 16550 driver, for Big Endian the data byte has to be read or written at address (C_BASEADDR + 0x1000 + 3). For Little Endian, the address (C_BASEADDR + 0x1000) has to be read or written.
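
As a minimal sketch of the UART example above, the byte address can be computed from the word address and the endianness. The helper name, the macro name and the base address used in the usage note are hypothetical and not taken from the driver; only the 0x1000 offset and the +3 adjustment come from the text.

```c
#include <stdint.h>

/* Offset used in the text above (C_BASEADDR + 0x1000). */
#define UART16550_REG_OFFSET 0x1000u

/* Hypothetical helper: address at which the data byte of the 32-bit
 * register must be accessed. On Big Endian the least significant byte
 * is the last byte of the word, on Little Endian it is the first. */
static uint32_t uart_data_byte_addr(uint32_t c_baseaddr, int little_endian)
{
    uint32_t word_addr = c_baseaddr + UART16550_REG_OFFSET;

    return little_endian ? word_addr : word_addr + 3u;
}
```

For an illustrative base address of 0x40600000, this gives 0x40601003 on Big Endian and 0x40601000 on Little Endian.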

2) Byte array is merged into half-word or word

For example, in AXI Ethernet Lite driver, when writing MAC address to transmit dual port memory, for Big Endian, it has to be written as follows:

Word 0 = (MAC[0] << 24) | (MAC[1] << 16) | (MAC[2] << 8) | MAC[3];
Word 1 = (MAC[4] << 24) | (MAC[5] << 16)

For Little Endian, it has to be written as follows:

Word 0 = (MAC[3] << 24) | (MAC[2] << 16) | (MAC[1] << 8) | MAC[0];
Word 1 = (MAC[5] << 8) | MAC[4]
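
The two packings above can be put side by side in one function. This is only a sketch: the function name and the run-time little_endian parameter are assumptions made for illustration, and a real driver would more likely pick one branch at compile time.

```c
#include <stdint.h>

/* Hypothetical helper: pack a 6-byte MAC address into the two 32-bit
 * words that are written to the transmit dual port memory. */
static void pack_mac(const uint8_t mac[6], int little_endian, uint32_t word[2])
{
    if (little_endian) {
        word[0] = ((uint32_t)mac[3] << 24) | ((uint32_t)mac[2] << 16) |
                  ((uint32_t)mac[1] << 8)  |  (uint32_t)mac[0];
        word[1] = ((uint32_t)mac[5] << 8)  |  (uint32_t)mac[4];
    } else {
        word[0] = ((uint32_t)mac[0] << 24) | ((uint32_t)mac[1] << 16) |
                  ((uint32_t)mac[2] << 8)  |  (uint32_t)mac[3];
        word[1] = ((uint32_t)mac[4] << 24) | ((uint32_t)mac[5] << 16);
    }
}
```

In both cases the bytes end up in memory in the order MAC[0], MAC[1], ... MAC[5]. Only the position of each byte inside the CPU register differs.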

Similarly, when reading the Ethernet packet type and packet length from the AXI Ethernet Lite receive dual port memory, the code should be as follows:

unsigned char *dpm_buf;

Ethernet packet type = (dpm_buf[12] << 8) | dpm_buf[13]
Packet data length = (dpm_buf[16] << 8) | dpm_buf[17]

Otherwise, if you read these fields as a word or half-word, you need to convert them explicitly with the ntohl/ntohs functions or the lwr/lhur instructions.
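
A small sketch of both approaches follows. read_be16 and swap16 are hypothetical helper names. read_be16 is the byte-wise read shown above and works on either endianness. swap16 only illustrates the byte swap that a half-word read would additionally need on a Little Endian CPU.

```c
#include <stdint.h>

/* Read a 16-bit network-order (big-endian) field byte by byte.
 * The result does not depend on the CPU endianness. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Swap the two bytes of a half-word, as needed after a plain
 * half-word load of a network-order field on a Little Endian CPU. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

With a receive buffer dpm_buf, the packet type would then be read as read_be16(dpm_buf + 12) and the length as read_be16(dpm_buf + 16).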

3) Half-word or word is converted into byte array
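
No example is given for this case. As a sketch only, with hypothetical helper names, a value can be split into a byte array using shifts so that the resulting byte order is the same on Big Endian and Little Endian:

```c
#include <stdint.h>

/* Store a 16-bit value into buf[0..1], most significant byte first. */
static void store_be16(uint8_t *buf, uint16_t v)
{
    buf[0] = (uint8_t)(v >> 8);
    buf[1] = (uint8_t)(v & 0xFF);
}

/* Store a 32-bit value into buf[0..3], most significant byte first. */
static void store_be32(uint8_t *buf, uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)(v & 0xFF);
}
```

Copying the value through a byte pointer instead would give a different byte order on each endianness.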


4) Self-modifying code
 

If you use self-modifying code, please update it as specified in the "Self-modifying Code" section of the MicroBlaze Processor Reference Guide for EDK 13.3:
http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_3/mb_ref_guide.pdf

5) Memory Barrier Instruction
In places where memory synchronization is required, use the memory barrier instruction (mbar).

Conditional Assembly in GCC
 

Sometimes you may need to conditionally assemble your assembly source using the compiler's predefined macros. However, these macros are not visible to the GCC assembler itself. To use them, change the file extension of your assembly source to a capital ".S" (not a lowercase ".s") so that the file is run through the C preprocessor, and use C-style #ifdef..#else..#endif. For example, to conditionally assemble your source according to endianness, use constructs such as the following:

#ifdef __LITTLE_ENDIAN__
#else
#endif

or

#ifdef __BIG_ENDIAN__
#else
#endif


Note that the corresponding compiler flags are -mlittle-endian for little endian and -mbig-endian for big endian.
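
The same predefined macros can be tested in C source. The sketch below is illustrative: the function names are hypothetical, and the fallback on __BYTE_ORDER__ is an addition for GCC targets that do not define __LITTLE_ENDIAN__ or __BIG_ENDIAN__, not something taken from the MicroBlaze tools.

```c
#include <stdint.h>

/* Endianness selected at compile time:
 * 1 = little endian, 0 = big endian, -1 = no known macro defined. */
static int compiled_little_endian(void)
{
#if defined(__LITTLE_ENDIAN__)
    return 1;
#elif defined(__BIG_ENDIAN__)
    return 0;
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
    return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
    return -1;
#endif
}

/* Endianness observed at run time: look at the first byte of the value 1. */
static int runtime_little_endian(void)
{
    const uint32_t probe = 1;

    return *(const uint8_t *)&probe == 1;
}
```

If the compile-time result is known, it should agree with the run-time check on the target.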

Enjoy the FPGA!

Tuesday, December 20, 2011

Monday, December 19, 2011

How to create new Computex CSIDE project

When HUDI, PALMiCE2 or PALMiCE3 is used for development or debugging, the CSIDE debug software is used. To invoke the debug environment, you need a CSIDE project or workspace. When a new CSIDE project (*.cpf file) has to be created for a new CPU, please follow these steps:
  • Start menu→Programs→CSIDE→CSIDE for PALMiCE3 SH
Now, if you are asked to load an existing project, DO NOT load one. Simply close the dialog. You will get an empty debug window.

Now,
  • Settings→Target System Settings
You will get the CSIDE for PALMiCE3 SH project information window. Select the Category and CPU and press OK.

The Target System Settings window opens. Now, save your project file with
  • File→Save as→Project

Saturday, September 10, 2011

How to switch on Wireless in DELL Inspiron 1525

I have a DELL Inspiron 1525 laptop with Windows Vista. A few days ago, I tried to enable WiFi on it. But I could not turn it on or off, even though I enabled the Wireless switch and the Wireless adapter in the BIOS and slid the switch as explained in the following "DELL INSPIRON SETUP GUIDE".

http://premiersupport.dell.com/support/edocs/systems/ins1525/en/SG/Y465HA01MR.pdf

Then I found that it has to be enabled in the Windows Device Manager too. Here are the steps:

① Open Windows Explorer and right-click "Computer". From the pop-up menu select "Properties".

② In the dialog box, click on "Device manager". In the "Device Manager" dialog box,  expand the "Network adapter" and right-click on "Dell wireless 1490 Dual band WLAN Mini-card" and select "Properties" from the pop-up menu.

③ From the dialog box "Properties of Dell wireless 1490 Dual band WLAN Mini-card", select "Driver" tab and click "Enable" button. This will enable the device.

Now, in BIOS, I enabled the Wireless adapter and disabled the switch. So, the WiFi is always enabled.

Wednesday, July 27, 2011

Data read from Kinetis K60 Flash is inconsistent

Problem:
When data is written to the Flash in the Kinetis K60 controller and then read back by the program, the data read differs from the data actually written to the Flash. However, the data is intact when read through the debugger. The contents of the flash memory seem to be corrupted or inconsistent.

Solution:
Although the Kinetis K60 has no separate cache, the Flash controller has its own prefetch and caching enabled, so the program can read stale data. To rule this out, disable the cache, prefetch and single entry buffer as follows:


    uint32 temp_reg;

    temp_reg = FMC_PFAPR;    /* save present value of FMC_PFAPR */
    FMC_PFAPR = 0x00ff0000; /* Disable prefetch temporarily */

    /* Clear bit 19 in both bank control registers */
    FMC_PFB0CR &= ~0x00080000;
    FMC_PFB1CR &= ~0x00080000;

    /* Clear bits 4..0: cache, prefetch and single entry buffer enables */
    FMC_PFB0CR &= ~0x0000001f;
    FMC_PFB1CR &= ~0x0000001f;

    FMC_PFAPR = temp_reg;    /* restore original value of FMC_PFAPR */

Have great embedded computing!

"Data TLB access error" exception when XMD(GDB) resumes execution from breakpoint


After a breakpoint is set in the Xilinx XMD (GDB) debugger and the program has run to the breakpoint, a "Data TLB access error" exception occurs when trying to continue/resume execution or step through the code.

This problem indicates stack corruption or an uninitialized initial stack frame. When program execution reaches a breakpoint, execution pauses and gdb begins to reconstruct the backtrace. That is, using the contents of the stack, the debugger traces through the previous stack frames and lists the calling function corresponding to each stack frame.

Each stack frame contains the back chain pointer, which points to the previously allocated stack frame, and the pointer to the calling function.

GDB (XMD) continues to trace until the back chain pointer of a stack frame is 0x00000000, which marks the initial stack frame corresponding to the start of the program.

Therefore, if the back chain pointer of the initial stack frame is not initialized to NULL (0x00000000) by the program, or if the stack is corrupted, invalid addresses may be accessed during the backtrace, and the "Data TLB access error" exception occurs when resuming execution.
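
The walk described above can be modelled with a simple linked list. This is an illustrative sketch, not GDB's actual implementation: the struct layout, the names and the max_frames guard are all assumptions.

```c
#include <stddef.h>

/* Simplified model of a stack frame as described above. */
struct frame {
    const struct frame *back_chain;  /* previous frame, NULL in the initial frame */
    const void *return_addr;         /* pointer into the calling function */
};

/* Follow the back chain until the NULL terminator is found.
 * Returns the number of frames visited, or -1 if no terminator is
 * found within max_frames (as with a corrupted or uninitialized chain). */
static int backtrace_depth(const struct frame *fp, int max_frames)
{
    int depth = 0;

    while (fp != NULL) {
        if (depth == max_frames)
            return -1;
        depth++;
        fp = fp->back_chain;
    }
    return depth;
}
```

A debugger that has no such guard simply keeps dereferencing whatever the back chain contains, which is where the invalid access comes from.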

For more details about GDB backtrace, please refer to the following:
http://devpit.org/wiki/GDB

Have a nice debug!

Monday, July 25, 2011

Xilinx TEMAC checksum offload verification: Optimized solution

The XPS LL TEMAC core calculates a raw checksum over the entire Ethernet payload. According to the product specification, to verify the checksum, the checksum of the fields that should not have been included must be subtracted from the raw checksum, and the adjusted raw checksum has to be compared with the checksum field of the TCP or UDP header.

But the above method is time-consuming. I found a more optimized solution in which the raw checksum is compared directly with the checksum of the fields that should not have been included. If both are equal, the checksum verification has passed. Otherwise, it has failed.

How is it possible?

Raw checksum = Checksum of TCP/UDP payload + Checksum field of TCP/UDP packet + Checksum of fields that should not have been included

For a valid packet, the whole checksum (including the checksum field) of the TCP/UDP packet is zero in ones' complement arithmetic. So,

Raw checksum = 0 + Checksum of fields that should not have been included

Raw checksum = Checksum of fields that should not have been included
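
The identity can be checked with a few made-up 16-bit words. This is a sketch under stated assumptions: the helper names and all sample values are invented for illustration and do not form a real packet. Note that in ones' complement arithmetic "zero" shows up as 0xFFFF when the words are summed.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of 16-bit words, folded to 16 bits. */
static uint16_t oc_sum(const uint16_t *words, size_t count)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < count; i++)
        sum += words[i];
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

/* Returns 1 if the raw checksum over all words equals the checksum of
 * the fields that should not have been included. */
static int raw_matches_excluded(void)
{
    /* words[0..2]: fields that should not have been included.
     * words[3..7]: checksummed part, words[6] is the checksum field. */
    uint16_t words[8] = { 0x4500, 0x1234, 0x4000,
                          0xC0A8, 0x0001, 0x0011, 0x0000, 0xBEEF };

    /* Fill in a valid checksum for the checksummed part. */
    words[6] = (uint16_t)~oc_sum(&words[3], 5);

    return oc_sum(&words[3], 5) == 0xFFFF &&        /* valid part sums to "zero" */
           oc_sum(words, 8) == oc_sum(words, 3);    /* raw == excluded fields */
}
```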

Pseudo code for the solution


bool temac_verify_csum(unsigned char *pkt_buf, unsigned short rx_csraw)
{
    unsigned short *temp;
    unsigned int csum;


    if(ntohs(*(unsigned short *)(pkt_buf + 12)) == 0x0800)/* IPv4 */
    {
        struct ip_header *ip_hdr = (struct ip_header *)(pkt_buf + 14); /* skip 14-byte Ethernet header */
        temp = (unsigned short *)ip_hdr;
     
        /* Other than TCP/UDP, pass to the upper layer */
        if(ip_hdr->protocol != UDP && ip_hdr->protocol != TCP)
            return PASSED;


        /* If fragment, pass it to upper layer */
        if(ip_hdr->fragment)
            return PASSED;


        /* UDP with checksum 0. No need to verify */
        if(ip_hdr->protocol == UDP && *(temp + 13) == 0)
            return PASSED;


        /* Add the fields that should not be included */
        csum = *temp++;      /* Version, IHL, Differentiated Services */
        csum += htons(ip_header_length); /* 20, in the same byte order as the words read above */
        temp++;
        csum += *temp++; /* Identification */
        csum += *temp++; /* Flags and Fragment offset */
        csum += *temp++ & htons(0xFF00); /* TTL */
        csum += *temp;   /* IP header checksum */
    }
    else if(ntohs(*(unsigned short *)(pkt_buf + 12)) == 0x86dd)/* IPv6 */
    {
        struct ip6_header *ip6_hdr = (struct ip6_header *)(pkt_buf + 14); /* skip 14-byte Ethernet header */
        temp = (unsigned short *)ip6_hdr;


        if(ip6_hdr->next_header != TCP && ip6_hdr->next_header != UDP)
            return PASSED;
        if(ip6_hdr->next_header == UDP && *(temp + 23) == 0)
            return PASSED;/* UDP with checksum 0. No need to verify */
        csum = *temp++;   /* Version, Traffic class */
        csum += *temp++;  /* and Flow Label */
        temp++;
        csum += *temp++;  /* Next header and Hop Limit */
        /* Next header must be included. So, minus it */
        csum -= htons(ip6_hdr->next_header);
    }
    else
        return PASSED;


    csum = (csum & 0xFFFF) + (csum >> 16);
    csum += (csum >> 16);
    if((unsigned short)csum == rx_csraw)
        return PASSED;
    return FAILED;
}


Here, rx_csraw is the raw checksum in the descriptor and pkt_buf is the pointer to the buffer of the descriptor. Surely, it is an optimized solution, right?

For details on checksum offload of transmission side, look at the following post:
http://embeddedknowledge.blogspot.com/2009/08/xilinx-temac-checksum-offload.html

If you have any queries, write them as comments.