Sparse frequency domain input
Reuse of pre-allocated memory
Support of negative indexing for frequency domain data
Parallelization and acceleration are optional
Unified interface for calculations on CPUs and GPUs
Support of Complex-To-Real and Real-To-Complex transforms that exploit the full Hermitian symmetry of the data
C++, C and Fortran interfaces
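To illustrate what negative indexing of frequency domain data means in practice, here is a minimal sketch. It assumes the common FFT convention in which, for a dimension of size N, the index -k refers to the same element as N - k. The helper `to_storage_index` is hypothetical and not part of the library's interface.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper, NOT part of the library's API.
// Maps a frequency index that may be negative (centered indexing) onto a
// non-negative storage index in [0, dim), assuming the usual FFT convention
// that index -k aliases index dim - k.
inline std::size_t to_storage_index(int index, int dim) {
  if (dim <= 0) {
    throw std::invalid_argument("dim must be positive");
  }
  // Accept centered indices down to -floor(dim / 2) and
  // non-centered indices up to dim - 1.
  if (index < -(dim / 2) || index >= dim) {
    throw std::out_of_range("frequency index outside of dimension");
  }
  return static_cast<std::size_t>(index < 0 ? index + dim : index);
}
```

With a dimension of size 8, for example, `to_storage_index(-1, 8)` yields 7, while non-negative indices such as 3 are returned unchanged.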
To allow for pre-allocation and reuse of memory, the design is based on two classes:
Grid: Allocates memory for transforms up to a given size in each dimension.
Transform: Is associated with a Grid and can have any size up to the Grid dimensions. A Transform holds a counted reference to the underlying Grid. Therefore, Transforms created with the same Grid share memory, which is freed only once the Grid and all associated Transforms are destroyed.
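The ownership model described above can be sketched with `std::shared_ptr`. The classes `SketchGrid` and `SketchTransform` below are illustrative stand-ins invented for this example. They are not the library's classes and say nothing about its actual implementation; they only show how a counted reference keeps shared memory alive until the last user is destroyed.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Illustrative stand-in for a Grid: owns a buffer sized for the largest
// transform it has to support. NOT the library's class.
class SketchGrid {
public:
  explicit SketchGrid(std::size_t maxElements)
      : buffer_(std::make_shared<std::vector<double>>(maxElements)) {}

  std::size_t max_size() const { return buffer_->size(); }
  long use_count() const { return buffer_.use_count(); }

private:
  friend class SketchTransform;
  std::shared_ptr<std::vector<double>> buffer_;
};

// Illustrative stand-in for a Transform: holds a counted reference to the
// buffer of the grid it was created from and may use any size up to the
// grid's maximum.
class SketchTransform {
public:
  SketchTransform(const SketchGrid& grid, std::size_t size)
      : buffer_(grid.buffer_), size_(size) {
    if (size > buffer_->size()) {
      throw std::invalid_argument("transform exceeds grid size");
    }
  }

  double* data() { return buffer_->data(); }
  std::size_t size() const { return size_; }
  long use_count() const { return buffer_.use_count(); }

private:
  std::shared_ptr<std::vector<double>> buffer_;
  std::size_t size_;
};
```

In this sketch, two `SketchTransform` objects created from the same `SketchGrid` return the same pointer from `data()`, and the buffer remains valid after the grid object itself goes out of scope as long as one transform still exists. A consequence of such sharing is that transforms on the same grid overwrite each other's data, so they should not be used concurrently.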
A transform can be computed in-place or out-of-place. Additionally, an internally allocated work buffer can optionally be used for input / output of space domain data.
The creation of Grids and Transforms, as well as forward and backward execution, may entail MPI calls and must be synchronized between all ranks.