public marks

PUBLIC MARKS with tag cos

February 2008

Simple SSE optimized sin, cos, log and exp

by ogrisel
I chose to write them in pure SSE1+MMX so that they run on your grandmother's Pentium III, and also on my brave Athlon XP, since those beasts are not SSE2-aware. Intel AMath showed me that the performance gain from using SSE2 for that purpose was not large enough (10%) to justify providing an SSE2 version (though one could be written very quickly). The functions use only the _mm_ intrinsics; there is no inline assembly in the code. Advantages: the code is easier to debug, works out of the box on 64-bit setups, and lets the compiler choose what is kept in a register and what is stored in memory. Drawback: some versions of gcc 3.x are badly broken with certain intrinsic functions (_mm_movehl_ps, _mm_cmpeq_ps, etc.), MinGW's gcc for example, and beware that the brokenness depends on the optimization level. A workaround is provided (inline asm replacements for the braindead intrinsics); it is not nice but it is robust, and broken compilers are detected by the validation program below.
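To make the idea of checking for miscompiled intrinsics concrete, here is a minimal sketch in C. It is not the author's validation program. The helper names (hsum_ps, eq_mask_ps, check_intrinsics) are hypothetical and chosen only for illustration. It exercises the two intrinsics named above using SSE1 only and compares the results with values that are easy to verify by hand.

```c
#include <xmmintrin.h>  /* SSE1 intrinsics only, no SSE2 required */

/* Hypothetical helper: sum the four lanes of v with SSE1 instructions. */
static float hsum_ps(__m128 v)
{
    __m128 hi = _mm_movehl_ps(v, v);   /* hi = (v2, v3, v2, v3) */
    __m128 s  = _mm_add_ps(v, hi);     /* s0 = v0+v2, s1 = v1+v3 */
    /* Broadcast lane 1 so it can be added to lane 0. */
    __m128 s1 = _mm_shuffle_ps(s, s, _MM_SHUFFLE(1, 1, 1, 1));
    s = _mm_add_ss(s, s1);             /* lane 0 = v0+v1+v2+v3 */
    return _mm_cvtss_f32(s);
}

/* Hypothetical helper: 4-bit mask, bit i is set when a[i] == b[i]. */
static int eq_mask_ps(__m128 a, __m128 b)
{
    return _mm_movemask_ps(_mm_cmpeq_ps(a, b));
}

/* Returns the number of failed checks; 0 means both intrinsics
   produced the expected results with this compiler and these flags. */
static int check_intrinsics(void)
{
    __m128 a = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);
    __m128 b = _mm_setr_ps(1.0f, 0.0f, 3.0f, 0.0f);
    int failures = 0;

    if (hsum_ps(a) != 10.0f)      /* 1+2+3+4, exact in float */
        failures++;
    if (eq_mask_ps(a, b) != 0x5)  /* lanes 0 and 2 are equal */
        failures++;
    return failures;
}
```

Since the brokenness reportedly depends on the optimization level, a check like this would only be meaningful when compiled with the same flags as the real code, for example by calling check_intrinsics() once at program startup.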

September 2007


by pyxosledisciple
13e Régiment de Dragons Parachutistes (13th Parachute Dragoon Regiment). Come and discover the regiment of the men in the shadows on this forum.
