2001
A simple and efficient parallel FFT algorithm using the BSP model
Publication
Publication
Parallel Computing , Volume 27 p. 1847- 1878
We present a new parallel radix-4 FFT algorithm based on the BSP model. Our parallel algorithm uses the group-cyclic distribution family, which makes it simple to understand and easy to implement. We show how to reduce the communication cost of the algorithm by a factor of 3, in the case that the input/output vector is in the cyclic distribution. We also show how to reduce computation time on computers with a cache-based architecture. We present performance results on a Cray T3E with up to 64 processors, obtaining reasonable efficiency levels for local problem sizes as small as 256 and very good efficiency levels for local sizes larger than 2048.
| Additional Metadata | |
|---|---|
| Parallel Computing | |
|
Inda, M. A., & Bisseling, R. H. (2001). A simple and efficient parallel FFT algorithm using the BSP model. Parallel Computing, 27, 1847–1878. |
|