cjed
05-20-2008, 07:18 AM
Hi, here are some links that explain how to optimize an audio application for both PowerPC and Intel Macs with little extra effort.
Since Mac OS X 10.4 (Tiger), the vector engine APIs are hardware agnostic: the same code works for PPC AltiVec and Intel SSE. This is the Accelerate framework, which previously supported only PPC AltiVec:
http://developer.apple.com/releasenotes/Performance/RN-vecLib/index.html
"Accelerate.framework is the fundamental support library for SIMD programming (both AltiVec and SSE/SSE2/SSE3/...) on MacOS X. The vecLibTypes.h header (automatically included when you #include <Accelerate/Accelerate.h>) defines unified 128-bit SIMD data types that work for both AltiVec and Intel's vector architecture. Using these types (e.g. vFloat, vSInt32), it is possible to write a single piece of vector code that compiles and runs on both the PowerPC and Intel vector engines on both 32- and 64-bit architectures using GCC."
Among other dedicated APIs (for image processing or large math computations), the Accelerate framework includes the vDSP library, which provides fast computations (Fast Fourier Transform, etc.) aimed at audio applications:
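To make the "single piece of vector code" idea concrete, here is a minimal sketch in C. It assumes GCC or Clang, where the usual arithmetic operators work on 128-bit vector types. The function name apply_gain4 is made up for illustration, and the typedef in the #else branch is only a stand-in so the sketch also builds on non-Apple systems; on a Mac, vFloat comes from vecLibTypes.h as described in the quote.

```c
#include <assert.h>
#include <string.h>

#if defined(__APPLE__)
#include <Accelerate/Accelerate.h>  /* pulls in vecLibTypes.h, which defines vFloat */
#else
/* Assumption: stand-in for non-Apple builds, using the GCC/Clang vector extension. */
typedef float vFloat __attribute__((vector_size(16)));
#endif

/* Hypothetical helper: multiply four samples by four gains in one vector operation. */
void apply_gain4(const float in[4], const float gain[4], float out[4])
{
    vFloat x, g, y;

    /* A vFloat is expected to hold exactly four single-precision floats. */
    assert(sizeof(vFloat) == 4 * sizeof(float));

    /* memcpy avoids any alignment requirement on the caller's arrays. */
    memcpy(&x, in, sizeof x);
    memcpy(&g, gain, sizeof g);

    y = x * g;  /* element-wise multiply; the compiler picks AltiVec or SSE */

    memcpy(out, &y, sizeof y);
}
```

The source contains no AltiVec or SSE intrinsics, so the same file should compile for either architecture, which is the point the release notes make.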
"Digital Signal Processing: vDSP
The vDSP library is focused primarily in the realm of Fourier Transforms, vector-to-scalar, and vector-to-vector operations. The vDSP library has a wide range of applications, including signal processing (audio, digital image, and speech), physics, statistics, and cryptography. The vDSP library can perform both one and two dimensional Fourier transforms. vDSP functions operate on both real and complex data types.
vDSP uses vectorized code to implement functions that operate on single precision data. This code uses AltiVec extensions when a PowerPC G4 or G5 is present, or the SSE extensions when an Intel microprocessor is present. On the PowerPC G3 processor, vDSP uses scalar code...
http://developer.apple.com/documentation/Darwin/Reference/ManPages/man7/SSE3.7.html
The vDSP functions have been implemented in two ways: as vectorized code, using the vector unit on the PowerPC and Intel microprocessors, and as scalar code, which runs on all machines. Vector code often has special alignment restrictions. If your data is not properly aligned it is common for vDSP to use the scalar path as a fallback. For best results, align your data to a multiple of 16 bytes. (Malloc naturally aligns memory blocks that it allocates to 16 bytes on MacOS X.)
It is noteworthy that vDSP's FFTs are one of the fastest implementations of the Discrete Fourier Transforms available anywhere."
http://developer.apple.com/documentation/Performance/Conceptual/PerformanceOverview/BasicTips/chapter_3_section_3.html#//apple_ref/doc/uid/TP40001410-CH204-DontLinkElementID_4
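As a rough illustration of the vector-to-vector operations and the 16-byte alignment advice quoted above, here is a small sketch in C. vDSP_vadd is a real vDSP routine (element-wise add of two float vectors); the helper names is_aligned16 and mix_buffers are made up for this example, and the scalar loop in the #else branch is only a stand-in so the sketch builds on non-Apple systems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__APPLE__)
#include <Accelerate/Accelerate.h>
#endif

/* Hypothetical helper: returns 1 if p sits on a 16-byte boundary, else 0. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n), e.g. mixing two audio buffers. */
void mix_buffers(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__APPLE__)
    /* Stride 1 means contiguous samples in all three buffers. */
    vDSP_vadd(a, 1, b, 1, out, 1, (vDSP_Length)n);
#else
    /* Assumption: plain scalar loop standing in for vDSP on other platforms. */
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Calling is_aligned16 on each buffer before the processing loop is a cheap way to spot data that might push vDSP onto its slower scalar path. Buffers that come straight from malloc on Mac OS X should already pass, per the quote above.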
So, bus and memory speed considerations apart, EastWest PLAY should perform well on PowerPC. I wonder what specific Intel optimizations have been made?