Epiphany not #1 in $/performance

Postby suarezvictor » Thu Oct 02, 2014 7:05 pm

In terms of performance/$, Epiphany seems more expensive than other solutions.

$12 Rockchip RK3188 4-core 7.2 GFLOP/core (NEON): ~$0.4/GFLOP @ 1.8GHz [1]
$250 TI 66AK2H12 8-DSP 19.2 GFLOP/core: ~$1.6/GFLOP @ 1.4GHz [2]
$74 Adapteva 16-core 2 GFLOP/core: ~$2.3/GFLOP @ 0.7GHz [3]
(All single precision peak performance)
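
For anyone who wants to check the arithmetic: each figure is just the price divided by the peak throughput of the whole chip (cores times GFLOP per core). A minimal Python sketch using only the numbers quoted above (the function name `dollars_per_gflop` is made up for illustration):

```python
# Price ($), core count and peak single-precision GFLOP per core, as quoted in the post.
chips = {
    "Rockchip RK3188": (12.0, 4, 7.2),
    "TI 66AK2H12": (250.0, 8, 19.2),
    "Adapteva 16-core": (74.0, 16, 2.0),
}


def dollars_per_gflop(price, cores, gflops_per_core):
    """Price divided by the chip's peak throughput (cores * GFLOP per core)."""
    return price / (cores * gflops_per_core)


for name, (price, cores, per_core) in chips.items():
    print(f"{name}: ${dollars_per_gflop(price, cores, per_core):.2f}/GFLOP")
```

This prints roughly $0.42, $1.63 and $2.31 per GFLOP, matching the rounded figures in the list.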


[1] https://olimex.wordpress.com/2013/07/08 ... at-1-8ghz/
[2] http://www.ti.com/product/66AK2H12/samplebuy
[3] http://shop.adapteva.com/collections/fe ... ii-samples
Re: Epiphany 90X more expensive than Rockchip

Postby mhonman » Thu Oct 02, 2014 7:50 pm

There's something to bear in mind - these are all "guaranteed never to exceed" performance figures (actually for the Epiphany E16G03 it's 1.4GFlops/core and $3/GFlop).

Application overheads - often related to the memory hierarchy and/or communication structure - can only reduce performance from these levels.
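
To put numbers on that: with 1.4 GFlops/core the $74 E16G03 comes to about $3.3 per peak GFlop, and any overhead pushes the cost per sustained GFlop higher still. A rough Python sketch, where the 50% efficiency figure is purely an illustrative assumption and not a measurement:

```python
def dollars_per_sustained_gflop(price, cores, gflops_per_core, efficiency=1.0):
    """Cost per GFLOPS actually delivered.

    efficiency is the fraction of peak the application achieves (0 < e <= 1).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return price / (cores * gflops_per_core * efficiency)


peak = dollars_per_sustained_gflop(74.0, 16, 1.4)
half = dollars_per_sustained_gflop(74.0, 16, 1.4, efficiency=0.5)  # hypothetical 50%
print(f"at peak: ${peak:.2f}/GFLOP, at 50% of peak: ${half:.2f}/GFLOP")
```

Since efficiency can never exceed 1, the peak figure is a lower bound on the real cost per GFlop.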

Now the realities surrounding these parts:

Rockchip RK3188
- very high volume product for mobile devices - development costs are spread over millions of units sold
- BUT there is no high-speed inter-chip communication capability. If you want more performance than you can get from one RK3188, you're out of luck

TI 66AK2H12
- medium volume (but getting better now that HP are selling them in servers)
- very expensive if you want a board with one, somewhere north of $1000 IIRC
- lots of on-chip memory and/or cache - 18MB vs 512KB for RK3188 & E16G03
- lots of parallelism in each DSP core, but _very_ hard to exploit due to VLIW architecture (here's a paper by some experts: http://www.cs.utexas.edu/users/flame/pubs/FLAWN61.pdf). Can't just recompile and enjoy the performance!
- It has a proper communication fabric (SRIO) - so can go multi-chip, but SRIO is quite power-hungry and has relatively high latency
- BTW this part also has 4 ARM cores each with a NEON unit, so the overall performance is even higher than you quoted

Adapteva Epiphany E16G03
- small volume - hence high overhead costs/chip
- nice cheap board available - Parallella is an absolute bargain.
- good on-chip and between-chip network, low latency is very nice for parallel programming
- that's a bit academic because AFAIK no-one has made a nice 2D mesh-based Epiphany system!
- applicability is limited by small per-core RAM & need to be partnered with an FPGA.

Having compared the TI C6678 against Epiphany for matrix multiplication - without using the tricks in the paper referenced above - the price/performance comparison is not in Adapteva's favour. How I wish it were different...

However if you have an application that has "perfect" data locality (I think someone mentioned neural networks as an example), simple logic, and the need for > 20GFlops with low power consumption - there's not much around that can compete with the Epiphany.
Re: Epiphany not #1 in $/performance

Postby suarezvictor » Thu Oct 02, 2014 8:02 pm

My metric is $/performance and it seems Adapteva is easily beaten. Other factors may apply but this comparison is JUST PERFORMANCE/$, for a somewhat easy to program multicore CPU (not GPU nor FPGA)

Also, TI TMS320C6678 8-DSP 20 GF/core achieves $1/GF (I'd be delighted to know more about your benchmark)

I think Adapteva should publish volume prices, maybe they reach the sub-$1/GFLOP mark.
Re: Epiphany 90X more expensive than Rockchip

Postby yanidubin » Thu Oct 02, 2014 8:39 pm

I'm interested in the answer to this question also - it looks like there are a few other contenders in terms of raw performance.

But first a few points about the comparison you are drawing - some is common sense, some is a bit speculative, to be taken with a grain of salt.

The first article points to a 100,000 quantity price, the second to 1,000s of units, versus samples of a chip which is not in mass production. I have been unable to find any sort of 1-off quantity price for the RK3188 to make any sort of comparison - where can I even purchase samples from? For the second chip, when I look at suppliers like Avnet and DigiKey, the price for a single chip is $300-400, so actually quite good in terms of translation from medium to small volumes, and from a less than competitively priced distributor. So quantities play some part in the pricing (by as much as several times) which is worth bearing in mind, but in no way accounts for a factor of 30X or 90X.

The second important factor is that (from my understanding at least) this is not (yet) an in-volume production chip, which will have a very significant impact. These are not going to be cheap to make in small quantities, so none of the efficiencies of scale will be in play to drive the cost down, as are likely the case for the other chips you are comparing with. Expect this to change if/when the Epiphany is in mass production. Comparing anything to a chipset mass produced for the huge tablet/phone market is going to look like a poor proposition. If you need something dirt cheap, re-use something ubiquitous from such a market (provided it meets all your other needs).

I would guess Adapteva will be having to spread significant NRE costs in getting the chips produced over a fairly small production run. The Parallella Kickstarter boards (of which there were maybe 10,000?) were very competitively priced, so I expect there was little in the way of recouping NRE costs - you can already see the effect of that (buying a non-kickstarter board simply costs more). I would not be surprised if the chip cost is still artificially high due to the cost of making the chip in small quantities, and not passing on the entire NRE cost to the boards they were built for.

You may be aware, for instance, that unless they reached a $3m stretch goal on kickstarter, it was too expensive to produce the 64-core chip at all (they were funded less than $1M). This may give you an idea of the sort of cost/benefit calls they needed to make. Note that I'm not suggesting you can determine the tooling cost from this (much of the kickstarter cost would have gone towards the rest of the board / chips - and on the other side, I expect Adapteva likely has other investment which would allow them to go into production of the new core without covering it all from Kickstarter).

I'm not even going to explain the apples vs oranges comparison in terms of architecture - the Epiphany offers capabilities around meshing with other devices to form a large cluster of cores linked via a high speed bus - which I expect you won't find on a quad-core Cortex A9. But then it provides very few of the peripherals you would expect on a Cortex A9 - completely different beast. I'm sure someone else can give you a much more satisfying answer there.

Another point here is that they've certainly suggested W/GFLOP is where the Epiphany shines. It might be worth doing a comparison of these figures also to see how they stack up. If you are building a supercomputer, it is not simply the cost of the chip you need to factor in, but the cost of running it constantly for many years.
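
As a sketch of that point, the lifetime cost of a chip can be modelled as purchase price plus the electricity it burns. Every number in the usage lines below (watts, electricity price, years, chip prices) is a placeholder made up for illustration, not a figure for any real part:

```python
HOURS_PER_YEAR = 24 * 365


def lifetime_cost(chip_price, watts, years, dollars_per_kwh):
    """Purchase price plus electricity for running flat out for `years`."""
    energy_kwh = watts / 1000.0 * HOURS_PER_YEAR * years
    return chip_price + energy_kwh * dollars_per_kwh


def lifetime_dollars_per_gflop(chip_price, watts, gflops, years, dollars_per_kwh):
    """Lifetime cost divided by peak GFLOPS."""
    return lifetime_cost(chip_price, watts, years, dollars_per_kwh) / gflops


# Hypothetical parts with the same 32 GFLOPS peak:
# a cheap but power-hungry chip vs. a pricier low-power one.
hungry = lifetime_dollars_per_gflop(20.0, 20.0, 32.0, years=5, dollars_per_kwh=0.15)
frugal = lifetime_dollars_per_gflop(74.0, 2.0, 32.0, years=5, dollars_per_kwh=0.15)
print(f"power-hungry: ${hungry:.2f}/GFLOP, low-power: ${frugal:.2f}/GFLOP")
```

With these made-up inputs the more expensive low-power chip ends up cheaper per GFLOP over five years, which is the effect to look for when plugging in real W/GFLOP figures.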

But obviously if building a small project with only one or few cores, the $/GFLOP question you are asking is more relevant.

I agree that going by the (rather unfair) comparison you have made, they do not nearly stack up in terms of those $/GFLOPs numbers. As I said, I too am interested in the answer, and I think the question needs to be "what is the target volume production price for the Epiphany (16-core, 64-core, etc)?" so that we can draw a meaningful comparison.

I'm also interested to know, since the 16-core chip is only the first chip on the roadmap, how Adapteva expect the price per core to drop if/when they move to production quantities of the 16/64/1024 core chips and beyond.
Re: Epiphany not #1 in $/performance

Postby yanidubin » Thu Oct 02, 2014 8:58 pm

Ah, sorry - looks like you already got a good reply, I didn't see that come in.

And you should probably retract your claim of a factor of 90X from the post as well as the topic.

Given the difference is so slim, and the explanations around volumes/production sizes, I'm actually surprised how reasonably the Epiphany is priced, all things considered. While there is still a factor of 6X, as you say, the volume price could well put it on a very competitive footing here.
Re: Epiphany not #1 in $/performance

Postby notzed » Fri Oct 03, 2014 7:28 am

suarezvictor wrote: Why?

Basic economics?
Re: Epiphany not #1 in $/performance

Postby 8l » Fri Oct 03, 2014 4:02 pm

Time moves fast, and the current CPU rankings are changing rapidly.
With a lot of money, people can do many different things.

Who knows, Qualcomm's next chips (Krait 888?) might come with 256 cores but only consume 1W.
Qualcomm’s MARE (Multicore Asynchronous Runtime Environment)
http://techcrunch.com/2014/09/15/qualco ... -euvision/

Could innovation win over money?
Re: Epiphany not #1 in $/performance

Postby aolofsson » Fri Oct 03, 2014 8:36 pm

My thoughts/experiences on the subject here:
http://www.adapteva.com/andreas-blog/se ... nomics-101

Have a great weekend!
Re: Epiphany not #1 in $/performance

Postby suarezvictor » Tue Oct 07, 2014 5:46 am

Andreas, I'm happy to know I triggered this interesting discussion, which resulted in your article. I'm a backer of the project, so my intention is just to make all other backers more informed about $/performance.

I think the next thing we'd appreciate is to know
a) what needs to happen to reach, let's say, < $1/GFlop - for example, how many chips you need to sell to get into the right economics
b) when you estimate this will happen (i.e. an equation or something) - that is, when Adapteva could be suitable for a cost-sensitive project.
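
One simple way to frame question (a) is a toy amortization model: price per chip = per-chip manufacturing cost + NRE / volume. Solving for the volume that hits a target $/GFlop gives the sketch below. The NRE and per-chip cost are invented placeholders, NOT real Adapteva numbers; only the 22.4 GFlops (16 cores x 1.4) comes from this thread.

```python
import math


def breakeven_volume(nre_dollars, unit_cost, gflops, target_dollars_per_gflop):
    """Smallest volume at which (unit_cost + NRE/volume) / gflops meets the target.

    Returns None if the target cannot be reached at any volume.
    """
    target_price = target_dollars_per_gflop * gflops
    if target_price <= unit_cost:
        return None  # even with NRE fully paid off, the chip is too expensive
    return math.ceil(nre_dollars / (target_price - unit_cost))


# Placeholder inputs for illustration only.
volume = breakeven_volume(nre_dollars=1_000_000, unit_cost=10.0,
                          gflops=22.4, target_dollars_per_gflop=1.0)
print(volume)
```

With these placeholders the model lands at roughly 80,000 chips; the real answer depends entirely on the actual NRE and manufacturing cost.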

PS: If you liked my topic, please also take another look at my tweet about SHP
http://www.eetimes.com/author.asp?secti ... id=1320965
Re: Epiphany not #1 in $/performance

Postby Gravis » Wed Oct 15, 2014 9:25 pm

suarezvictor wrote: In terms of performance/$, Epiphany seems more expensive than other solutions.

hertz isn't a good scale, for a good reason: if addition takes 3 cycles on one chip and 4 cycles on another, that's a large difference. MIPS is a better scale but it doesn't take into account any novel instructions added to the instruction set.
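
to illustrate with made-up numbers (and assuming a simple non-pipelined model where each addition occupies the core for its full cycle count), a chip with the higher clock can still do fewer additions per second:

```python
def ops_per_second(clock_hz, cycles_per_op):
    """Operations per second when every operation takes `cycles_per_op` cycles."""
    return clock_hz / cycles_per_op


chip_a = ops_per_second(1.0e9, 3)  # hypothetical 1.0 GHz chip, 3-cycle add
chip_b = ops_per_second(1.2e9, 4)  # hypothetical 1.2 GHz chip, 4-cycle add
print(f"A: {chip_a / 1e6:.0f} M adds/s, B: {chip_b / 1e6:.0f} M adds/s")
```

here chip A wins despite the 20% lower clock, which is why clock speed alone says little.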

also, nothing is in a bubble, so consider the following:

  • MIPS/$ (initial cost of processor)
  • MIPS/meter (computing density)
  • MIPS/Watt (power consumption)
  • MIPS/BTU (heat output)
  • rate of failure
  • special instructions that could boost the speed for your program.
  • communication speed between processors. (epiphany chips have a built-in protocol)

i honestly don't know how it all works out but i know epiphany is low power and has a fast bitswap instruction that speeds up things like FFT algos.
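
for context, and assuming "bitswap" here means a bit-reversal operation: radix-2 FFTs reorder their data by bit-reversed index, so doing the reversal in one instruction saves a loop like the one below for every element. a plain software sketch (function names are just for illustration):

```python
def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of `value`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)  # move the lowest bit of value into result
        value >>= 1
    return result


def bit_reversed_order(n):
    """Index order used by a radix-2 FFT of length n (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]


print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```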
