TMS320C66x programming: 16-bit to 32-bit operations

Converting 16-bit operations into 32-bit operations, using intrinsic functions, const, and so on.
1. Source code:


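#include <math.h>   /* floor() */

/* Word32, L_shr() and L_saturate() are assumed to be defined elsewhere. */
Word32 L_mpy_ll(Word32 L_var1, Word32 L_var2)
{
        double aReg;
        Word32 lvar;
        /* (unsigned)low1 * (unsigned)low2 */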
        aReg = (double)(0xffff & L_var1) * (double)(0xffff & L_var2) * 2.0;
        /* >> 16 */
        aReg = (aReg / 65536);
        aReg = floor(aReg);
        /* (unsigned)low1 * (signed)high2 */
        aReg += (double)(0xffff & L_var1) * ((double)L_shr(L_var2, 16)) * 2.0;
        /* (unsigned)low2 * (signed)high1 */
        aReg += (double)(0xffff & L_var2) * ((double)L_shr(L_var1, 16)) * 2.0;
        /* >> 16 */
        aReg = (aReg / 65536);
        aReg = floor(aReg);
        /* (signed)high1 * (signed)high2 */
        aReg += (double)(L_shr(L_var1, 16)) * (double)(L_shr(L_var2, 16)) * 2.0;
        /* saturate result.. */
        lvar = L_saturate(aReg);
        return(lvar);
}
2. Rewritten code:
static inline Word32 L_mpy_ll(Word32 L_var1, Word32 L_var2)
{
        Word32 aReg_hh;
        Word40 aReg, aReg_ll, aReg_lh, aReg_hl;

        aReg_ll = (Word40)_mpyu(L_var1, L_var2) >> 16;
        aReg_lh = (Word40)_mpyluhs(L_var1, L_var2);
        aReg_hl = (Word40)_mpyhslu(L_var1, L_var2);
        aReg_hh = _smpyh(L_var1, L_var2);
        aReg = _lsadd(aReg_ll, _lsadd(aReg_lh, aReg_hl));
        aReg = _lsadd(aReg >> 15, aReg_hh);

        return(_sat(aReg));
}
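As a cross-check for the two versions above, here is a minimal host-side sketch in portable C of the same arithmetic, using 64-bit integers instead of double. It assumes Word32 is a 32-bit signed integer and that L_shr(x, 16) is an arithmetic right shift. The names floor_div, sat32 and ref_mpy_ll are made up for this illustration and are not part of any TI or codec library.

```c
#include <stdint.h>

/* Hypothetical helper: floor division by a positive divisor.
   Used instead of ">>" so negative values round toward minus infinity
   without relying on implementation-defined shifts. */
static int64_t floor_div(int64_t x, int64_t d)
{
    int64_t q = x / d;
    if ((x % d) != 0 && x < 0)
        q -= 1;
    return q;
}

/* Hypothetical helper: saturate a 64-bit value to the signed 32-bit range. */
static int32_t sat32(int64_t x)
{
    if (x > INT32_MAX) return INT32_MAX;
    if (x < INT32_MIN) return INT32_MIN;
    return (int32_t)x;
}

/* Integer-only model of the original L_mpy_ll: the same four partial
   products and the same two ">> 16" steps, without floating point. */
static int32_t ref_mpy_ll(int32_t a, int32_t b)
{
    int64_t lo1 = (int64_t)((uint32_t)a & 0xffffu);  /* unsigned low halves */
    int64_t lo2 = (int64_t)((uint32_t)b & 0xffffu);
    int64_t hi1 = floor_div(a, 65536);               /* signed high halves  */
    int64_t hi2 = floor_div(b, 65536);

    int64_t acc = floor_div(lo1 * lo2 * 2, 65536);   /* low * low, >> 16    */
    acc += lo1 * hi2 * 2;                            /* low1 * high2        */
    acc += lo2 * hi1 * 2;                            /* low2 * high1        */
    acc = floor_div(acc, 65536);                     /* >> 16               */
    acc += hi1 * hi2 * 2;                            /* high * high         */
    return sat32(acc);
}
```

If the operands are read as Q31 fractions, ref_mpy_ll(0x40000000, 0x40000000) returns 0x20000000, that is 0.5 × 0.5 = 0.25, and the only input pair that saturates is INT32_MIN × INT32_MIN.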
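3. Explanation of the optimization:
        The intrinsics provided by the C6000 compiler are a quick way to optimize C code. An intrinsic is written with a leading underscore and is called like an ordinary function, but the compiler maps it directly onto inline C6000 instructions.
        For example, the source code above uses no intrinsics, so each line of C takes many instruction cycles, whereas in the rewritten code each line maps to only one or two instructions.
        For example, in "aReg_ll = (Word40)_mpyu(L_var1, L_var2) >> 16", "_mpyu" is an intrinsic: it multiplies the low 16 bits of its two operands as unsigned numbers and returns the result. All the intrinsics supported by the C6000 and what they do are listed on pages 265 and 266 of the book 《TMS320C6000系列DSP的原理与应用》 (Principles and Applications of the TMS320C6000 Series DSPs), which also gives further examples. These intrinsics are declared in the file C6X.h in the C6000 / CGTOOLS / Include directory of the CCS installation.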
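To make it concrete which 16-bit halves each multiply intrinsic uses, here is a small host-side model in portable C. The emu_* names and the half-extraction helpers are hypothetical and exist only for this sketch; on the DSP the real _mpy, _mpyh and _mpyu come from the compiler, and this model only illustrates their arithmetic, not their timing.

```c
#include <stdint.h>

/* Hypothetical helper: low 16 bits of a word, read as a signed value. */
static int32_t lo_signed(int32_t x)
{
    int32_t v = (int32_t)((uint32_t)x & 0xffffu);
    return v >= 0x8000 ? v - 0x10000 : v;
}

/* Hypothetical helper: high 16 bits of a word, read as a signed value. */
static int32_t hi_signed(int32_t x)
{
    int32_t v = (int32_t)(((uint32_t)x >> 16) & 0xffffu);
    return v >= 0x8000 ? v - 0x10000 : v;
}

/* Model of _mpy: signed low half times signed low half. */
static int32_t emu_mpy(int32_t a, int32_t b)
{
    return lo_signed(a) * lo_signed(b);
}

/* Model of _mpyh: signed high half times signed high half. */
static int32_t emu_mpyh(int32_t a, int32_t b)
{
    return hi_signed(a) * hi_signed(b);
}

/* Model of _mpyu: unsigned low half times unsigned low half. */
static uint32_t emu_mpyu(uint32_t a, uint32_t b)
{
    return (a & 0xffffu) * (b & 0xffffu);
}
```

With a = 0x0003FFFF and b = 0x00050002, emu_mpyu gives 0xFFFF × 2 = 0x1FFFE, emu_mpy gives (-1) × 2 = -2, and emu_mpyh gives 3 × 5 = 15.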
The following example of optimizing C code with intrinsics is taken from the C6000 "Programmer's Guide".


Source code:


int dotprod(const short *a, const short *b, unsigned int N)
{
        int i, sum = 0;

        for (i = 0; i < N; i++)
                sum += a[i] * b[i];
        return sum;
}
Rewritten code:
int dotprod(const int *a, const int *b, unsigned int N)
{
        int i, sum1 = 0, sum2 = 0;

        for (i = 0; i < (N >> 1); i++)
        {
                sum1 += _mpy(a[i], b[i]);
                sum2 += _mpyh(a[i], b[i]);
        }
        return sum1 + sum2;
}

Tips: Once the C code has been fully debugged, try rewriting as many statements as possible with intrinsics, especially inside loop bodies. This can greatly reduce execution time.
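The packed dot product can be sanity-checked on a host before moving to the DSP. The sketch below is a portable C model under two assumptions: each 32-bit word holds two consecutive 16-bit samples, with the even-indexed sample in the low half, and N is even. All names here (pack2, dotprod_scalar, dotprod_packed, demo_a, demo_b, demo_packed) are hypothetical, and the half-word multiplies are written out by hand in place of _mpy and _mpyh.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: signed low and high halves of a packed word. */
static int32_t lo_signed(uint32_t w)
{
    int32_t v = (int32_t)(w & 0xffffu);
    return v >= 0x8000 ? v - 0x10000 : v;
}

static int32_t hi_signed(uint32_t w)
{
    int32_t v = (int32_t)(w >> 16);
    return v >= 0x8000 ? v - 0x10000 : v;
}

/* Pack two 16-bit samples into one word: 'lo' in bits 0..15, 'hi' in 16..31. */
static uint32_t pack2(int16_t lo, int16_t hi)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Scalar reference: one multiply-accumulate per 16-bit sample. */
static int32_t dotprod_scalar(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
}

/* Packed version: two multiply-accumulates per 32-bit word,
   mirroring the sum1 (low halves) / sum2 (high halves) split. */
static int32_t dotprod_packed(const uint32_t *a, const uint32_t *b, size_t n_words)
{
    int32_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < n_words; i++) {
        sum1 += lo_signed(a[i]) * lo_signed(b[i]);  /* stands in for _mpy  */
        sum2 += hi_signed(a[i]) * hi_signed(b[i]);  /* stands in for _mpyh */
    }
    return sum1 + sum2;
}

/* Fixed test vectors for a quick comparison. */
static const int16_t demo_a[4] = { 1, -2, 3, 4 };
static const int16_t demo_b[4] = { 5, 6, -7, 8 };

/* Packs the test vectors and runs the packed dot product on them. */
static int32_t demo_packed(void)
{
    uint32_t pa[2], pb[2];
    for (size_t i = 0; i < 2; i++) {
        pa[i] = pack2(demo_a[2 * i], demo_a[2 * i + 1]);
        pb[i] = pack2(demo_b[2 * i], demo_b[2 * i + 1]);
    }
    return dotprod_packed(pa, pb, 2);
}
```

Both dotprod_scalar(demo_a, demo_b, 4) and demo_packed() return 4 for these vectors. On the DSP the two half-word products per iteration are what let the loop consume two samples per 32-bit load.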

This post is from Microcontroller MCU
