Re: Very Fast Fourrier Transform
Posted:
Mon Aug 03, 2015 9:13 pm
by aolofsson
So your routine on a simple scalar Epiphany core at 600MHz runs 39% faster than FFTW running on a 4-way A9 core at 667MHz?
I'd say that's pretty darn impressive!
Andreas
Re: Very Fast Fourrier Transform
Posted:
Tue Aug 04, 2015 7:43 am
by tnt
Yeah, I'm pretty happy about it.
The big advantages of the Epiphany in this case are:
- Large register file: except for the FFT data loads and stores, there are no memory accesses for temporary results. Despite pipelining the loop and processing 4 data points per iteration (2 radix-2 ops in parallel), I only ever use registers, and only the caller-saved ones at that, so I don't even need to save/restore them.
- BITR opcode: infinitely useful for this :p
- Easy-to-predict low-level behavior: because I can understand exactly how the CPU will execute things, I can hand-tune the operations much better. Optimizing for ARM (or, even worse, Intel) involves so many rules that I can't keep them all in my head...
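For readers wondering why a bit-reverse opcode matters here: an in-place radix-2 FFT normally needs its input reordered by bit-reversed index before the butterfly passes. The Python below is only an illustrative sketch of that idea, not the Epiphany assembly discussed in this thread. The names `bit_reverse` and `fft_radix2` are made up for the example, and the software loop in `bit_reverse` stands in for what a hardware bit-reverse instruction would do in a single operation.

```python
import cmath


def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of `value`.

    Software stand-in (hypothetical helper) for a hardware bit-reverse.
    """
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def fft_radix2(samples):
    """Iterative radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(samples)
    bits = n.bit_length() - 1
    if n == 0 or (1 << bits) != n:
        raise ValueError("length must be a power of two")

    # Reorder the input by bit-reversed index so the butterflies can run in place.
    data = [complex(samples[bit_reverse(i, bits)]) for i in range(n)]

    half = 1
    while half < n:
        # Twiddle factor increment for this pass.
        step = cmath.exp(-1j * cmath.pi / half)
        for start in range(0, n, 2 * half):
            w = 1 + 0j
            for k in range(half):
                # One radix-2 butterfly.
                a = data[start + k]
                b = data[start + k + half] * w
                data[start + k] = a + b
                data[start + k + half] = a - b
                w *= step
        half *= 2
    return data
```

With 10 index bits (a 1024-point transform), `bit_reverse(1, 10)` gives 512, so sample 1 swaps places with sample 512 before the first pass.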
The next step will probably be to extend this to larger FFTs using multiple cores. (The current one uses local memory only, so you can do at most 2048 points, but more realistically 1024 when using double-buffering.)
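A rough back-of-the-envelope check of those size limits, as a sketch only. It assumes 32 KB of local memory per core, 8 bytes per complex sample (two 32-bit floats), and a stored twiddle table of N/2 complex entries. None of these figures are stated in the post, and the actual routine may lay out its memory differently. Code and stack also have to fit in the same local memory, which this ignores.

```python
LOCAL_MEM_BYTES = 32 * 1024  # assumption: 32 KB of local memory per core
BYTES_PER_COMPLEX = 8        # assumption: two 32-bit floats per sample


def fft_footprint(points, buffers=1):
    """Bytes for the sample buffers plus an assumed N/2-entry twiddle table."""
    data = points * BYTES_PER_COMPLEX * buffers
    twiddles = (points // 2) * BYTES_PER_COMPLEX
    return data + twiddles


def fits(points, buffers=1):
    """True if the data and twiddles alone fit in local memory."""
    return fft_footprint(points, buffers) <= LOCAL_MEM_BYTES


for points, buffers in [(2048, 1), (2048, 2), (1024, 2), (4096, 1)]:
    print(points, buffers, fft_footprint(points, buffers), fits(points, buffers))
```

Under these assumptions a single 2048-point buffer takes 24 KB including twiddles, which leaves little room for anything else, while two 1024-point buffers take 20 KB. That is consistent with 2048 being the ceiling and 1024 being the practical size with double-buffering.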
Re: Very Fast Fourrier Transform
Posted:
Tue Aug 04, 2015 12:26 pm
by aolofsson
That's great to hear! I look forward to your input in the following topic.
viewtopic.php?f=23&t=3127