AMD today announced SSE5, a set of extensions to the x86 instruction set. The extensions should be especially useful for algorithms that require fast floating-point matrix and vector processing. A floating-point matrix multiply using the new SSE5 extensions is reportedly 30 percent faster than a similar algorithm implemented with the existing SSE instructions. The extensions are still two years away, though.

For more, visit: http://developer.amd.com/SSE5