I note with interest that the Epiphany core has 64 registers. A few years ago, I programmed a neural-network-based pattern classifier in Power MacForth, a Forth dialect that used the large number of registers in the PowerPC architecture to pass up to 8 named parameters to functions. These could be accessed as named local variables within the function, which of course made it child's play to call functions recursively. This not only made Forth programming much easier, since far less stack rearranging was necessary, it was also very fast. And the code was readable!
I hope that someone implements a Forth machine on the Parallella architecture. My pattern classifier would benefit greatly from running on multiple cores.