AMD Confidential
User Manual November 21st, 2008
224 Appendix A
Converts packed double-precision floating-point values in an XMM register or 128-bit memory location to packed doubleword integer values in the destination MMX™ register.
Converts two packed doubleword integer values in an MMX™ register or 64-bit memory location to two packed double-precision floating-point values in the destination XMM register.
Converts packed doubleword integer values in an MMX™ register or 64-bit memory location to single-precision floating-point values in the destination XMM register.
A.6.6 3DNow!™ Instruction Set
This section describes the 3DNow! instruction set that the simulator supports. 3DNow! technology is a group of instructions that relieves the traditional processing bottlenecks for floating-point-intensive and multimedia applications.
FEMMS                         Fast enter/exit of the MMX or floating-point state.
PAVGUSB mmreg1,mmreg2/m64     Average of unsigned packed 8-bit values.
PF2ID mmreg1,mmreg2/m64       Converts packed floating-point operand to packed 32-bit integer.
PFACC mmreg1,mmreg2/m64       Floating-point accumulate.
PFADD mmreg1,mmreg2/m64       Packed floating-point addition.
PFCMPEQ mmreg1,mmreg2/m64     Packed floating-point comparison, equal to.
PFCMPGE mmreg1,mmreg2/m64     Packed floating-point comparison, greater than or equal to.
PFCMPGT mmreg1,mmreg2/m64     Packed floating-point comparison, greater than.
PFMAX mmreg1,mmreg2/m64       Packed floating-point maximum.
PFMIN mmreg1,mmreg2/m64       Packed floating-point minimum.
PFMUL mmreg1,mmreg2/m64       Packed floating-point multiplication.
PFRCP mmreg1,mmreg2/m64       Packed floating-point reciprocal approximation.
PFRCPIT1 mmreg1,mmreg2/m64    Packed floating-point reciprocal, first iteration step.
PFRCPIT2 mmreg1,mmreg2/m64    Packed floating-point reciprocal, second iteration step.
PFRSQIT1 mmreg1,mmreg2/m64    Packed floating-point reciprocal square root, first iteration step.
PFRSQRT mmreg1,mmreg2/m64     Packed floating-point reciprocal square root approximation.
PFSUB mmreg1,mmreg2/m64       Packed floating-point subtraction.
PFSUBR mmreg1,mmreg2/m64      Packed floating-point reverse subtraction.
PI2FD mmreg1,mmreg2/m64       Packed 32-bit integer to floating-point conversion.
PMULHRW mmreg1,mmreg2/m64     Multiply signed packed 16-bit values with rounding and store the high 16 bits.
PREFETCH/PREFETCHW mem8       Prefetch processor cache line into L1 data cache (Dcache).
Table 15-10: 3DNow!™ Instruction Reference
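As an informal illustration of a few rows in the table above, the following Python sketch models the per-lane arithmetic of PAVGUSB, PMULHRW, PFACC, PFSUBR, and PFCMPGE. It is a reading aid under stated assumptions, not the simulator's implementation: the function names and the list-of-lanes representation of a 64-bit MMX register are hypothetical, and floating-point lanes are modeled with Python floats (double precision) rather than true single precision.

```python
# Scalar reference models of a few 3DNow! operations (illustrative only).
# A 64-bit MMX register is modeled as a Python list of lanes, lowest lane
# first. The names below are hypothetical helpers, not a simulator API.

def pavgusb(dest, src):
    """Rounded average of eight unsigned 8-bit lanes: (a + b + 1) >> 1."""
    assert len(dest) == len(src) == 8
    return [(a + b + 1) >> 1 for a, b in zip(dest, src)]

def pmulhrw(dest, src):
    """Signed 16-bit multiply; add 0x8000 to round, keep the high 16 bits."""
    assert len(dest) == len(src) == 4
    return [(a * b + 0x8000) >> 16 for a, b in zip(dest, src)]

def pfacc(dest, src):
    """Accumulate: low lane = sum of dest lanes, high lane = sum of src lanes."""
    assert len(dest) == len(src) == 2
    return [dest[0] + dest[1], src[0] + src[1]]

def pfsubr(dest, src):
    """Reverse subtraction: source minus destination, per lane."""
    assert len(dest) == len(src) == 2
    return [s - d for d, s in zip(dest, src)]

def pfcmpge(dest, src):
    """Compare per lane; all-ones mask where dest >= src, else zero."""
    assert len(dest) == len(src) == 2
    return [0xFFFFFFFF if d >= s else 0 for d, s in zip(dest, src)]
```

For example, under this model `pavgusb` rounds halves upward, so averaging 1 and 2 gives 2, and `pfcmpge([1.0, 2.0], [1.0, 3.0])` yields an all-ones mask in the low lane and zero in the high lane. The comparison result is a bit mask rather than a flag, which lets it be combined with the MMX logical instructions to select between values without branching.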