A GCC Bug When Handling Some LoongISA Instructions

Today, while compiling the Loongson-optimized build of libjpeg-turbo, I ran into a strange error:

[ 11%] Built target md5cmp
[ 11%] Built target wrjpgcom
[ 11%] Built target rdjpgcom
{standard input}: Assembler messages:
{standard input}:8082: Error: operand 2 out of range `gssdlc1 $f0,7+128($sp)'
{standard input}:8083: Error: operand 2 out of range `gssdrc1 $f0,128($sp)'
{standard input}:14889: Error: operand 2 out of range `gssdlc1 $f0,7+128($sp)'
{standard input}:14890: Error: operand 2 out of range `gssdrc1 $f0,128($sp)'
make[2]: *** [simd/CMakeFiles/simd.dir/build.make:131: simd/CMakeFiles/simd.dir

Operand 2, the immediate offset, is out of range: it should fit in a signed 8-bit field, and both 128 and 7+128 are too large for that.
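To make the range problem concrete, here is a minimal sketch in C. It assumes, as stated above, that the offset field of these instructions is a signed 8-bit value, i.e. the range [-128, 127]. The helper names `fits_signed_8bit` and `pair_offsets_fit` are made up for illustration and do not exist in GCC, binutils or libjpeg-turbo.

```c
#include <stdbool.h>

/* Assumption: the immediate offset is a signed 8-bit field,
 * so the encodable range is [-128, 127]. */
bool fits_signed_8bit(long offset)
{
    return offset >= -128 && offset <= 127;
}

/* Hypothetical helper: the gssdlc1/gssdrc1 pair in the failing code uses
 * the offsets base+7 and base, so both of them have to be encodable. */
bool pair_offsets_fit(long base_offset)
{
    return fits_signed_8bit(base_offset) &&
           fits_signed_8bit(base_offset + 7);
}
```

With the stack offset of 128 from the log, `pair_offsets_fit(128)` is false, because neither 128 nor 135 fits. The largest base offset that would still work under this assumption is 120.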

Tracing it back to the code that triggers the error:

extern __inline void FUNCTION_ATTRIBS
_mm_store_si64(__m64 *dest, __m64 src)
{
  asm("gssdlc1 %1, 7+%0\n\t"
      "gssdrc1 %1, %0\n\t"
      : "=m" (*dest)
      : "f" (src)
      : "memory"
     );
}

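Clearly, GCC does not handle the "=m" memory constraint correctly here: it emits a register+offset address sized for the offset field of ordinary load/store instructions, which is wider than what these instructions accept. This is really a GCC bug, but for now I will simply work around it. The fix is simple: pass the address held in dest to the asm as an input operand in a register. No offset from GCC is needed then, and the instructions can address the memory directly.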

extern __inline void FUNCTION_ATTRIBS
_mm_store_si64(__m64 *dest, __m64 src)
{
  asm("gssdlc1 %1, 7(%0)\n\t"
      "gssdrc1 %1, 0(%0)\n\t"
      : /* no outputs */
      : "r" (dest), "f" (src)
      : "memory"
     );
}
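When chasing test failures, a portable reference to compare against can help. The sketch below assumes that the gssdlc1/gssdrc1 pair together performs a 64-bit store to a possibly unaligned address, and models that with `memcpy`. The names `store_u64_unaligned` and `load_u64_unaligned` are hypothetical and not part of libjpeg-turbo. `uint64_t` stands in for `__m64` here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reference helper: store 8 bytes to an address that does
 * not need to be 8-byte aligned. memcpy keeps this well-defined C. */
void store_u64_unaligned(void *dest, uint64_t src)
{
    memcpy(dest, &src, sizeof src);
}

/* Hypothetical counterpart, used to read the value back when checking. */
uint64_t load_u64_unaligned(const void *src)
{
    uint64_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Storing through `store_u64_unaligned(buf + 3, v)` and reading back with `load_u64_unaligned(buf + 3)` should return `v` and leave the bytes on either side untouched. The inline-asm version can be checked against the same expectation.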

Thanks to 黑老师 for the help! The resulting libjpeg-turbo build still fails the test suite, though, so I will work through that bit by bit.

Licensed under CC BY 4.0