Epiphany vs Intel Core i7

Postby eoghanoh » Tue Jan 07, 2014 7:56 am

Hi,

I'm doing a thesis around this board and in fact this paradigm. Unfortunately I haven't got my board yet (I pledged for a 64 core board), so I'm under a lot of pressure. I’ve already extended once and the new deadline is April. If somebody has a few minutes and wants to try out their new board, could they check something for me? I think it would be interesting for the community also.

I have an Intel Core i7 2GHz (turbo boosting to 2.6GHz) and I want to see where the Epiphany Chip could potentially outperform the i7. Obviously the i7 has much faster clock speed, much bigger caches, Turbo Boost, hyperthreading etc, but there are so many cores on the Epiphany that if it scales linearly (as shodruk's mandelbrot demo shows) then at what point and for what type of programs is it better to have an Epiphany chip rather than an i7? Or is it ever? Perhaps the cost would be restrictive.

As one of the demos is a matrix multiplication (the one that they have shown a lot), I’ve run a matrix multiplication OpenMP program (http://www.arc.vt.edu/resources/softwar ... mp_mmult.c) on my system. I used matrices of 2048 x 2048 to give a longer runtime and so differentiate the results.

Here are the matrix multiplication times:
1 Core - 98 seconds
2 Cores - 55 seconds
4 Cores - 41 seconds
8 Cores - 32 seconds
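
For reference, the times above can be turned into speedup and parallel efficiency figures. This is only a small sketch over the four numbers posted here; the function name `speedup_table` is made up for illustration, and "threads" is used because some of the 8 may be hyperthreads rather than physical cores.

```python
# Wall-clock times (seconds) posted above, keyed by thread count.
times = {1: 98.0, 2: 55.0, 4: 41.0, 8: 32.0}

def speedup_table(times):
    """Return (threads, speedup, efficiency) rows relative to the 1-thread run."""
    base = times[1]
    rows = []
    for n in sorted(times):
        s = base / times[n]          # speedup = T(1) / T(n)
        rows.append((n, s, s / n))   # efficiency = speedup / n
    return rows

for n, s, e in speedup_table(times):
    print(f"{n} threads: speedup {s:.2f}x, efficiency {e:.0%}")
```

That works out to roughly 1.78x on 2, 2.39x on 4 and 3.06x on 8, so efficiency falls from about 89% to about 38%.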

I’ve a lot more tests I want to do when I get the board, but it would be great if somebody with a 16 core board could try this out for me (adjust the parameters to use 2048 x 2048 matrices). It would be even better if somebody had a 64 core board as we could see how it scales as it goes significantly higher. Ideally you would try it at intervals of powers of 2.

Depending on the results of this test I’ll work towards something for when I get my board. Perhaps there's no advantage on a load like this but a big advantage if you’re running enough threads that are doing DIFFERENT things. But I’ll leave that for later.

If anybody could run this I’d really appreciate it.

Thanks,
Eoghan.
eoghanoh
 
Posts: 23
Joined: Mon Dec 17, 2012 3:22 am

Re: Epiphany vs Intel Core i7

Postby ed2k » Wed Jan 08, 2014 3:49 am

I did offer to share my board with someone, but there seems to be no interest.
About the comparison: a multicore i7 is faster on many levels. To get linear scaling on either chip, the same consideration applies. Data has to be kept local, otherwise the CPU just sits waiting for data to be ready; access to cache is often 100 times faster than access to DRAM. The same problem applies to the Epiphany as well. So to do a performance analysis you don't need the actual chip, just compare the architecture with Intel's.

I would say the biggest advantage of the Epiphany is its performance per watt.
ed2k
 
Posts: 113
Joined: Mon Dec 17, 2012 3:27 am

Re: Epiphany vs Intel Core i7

Postby eoghanoh » Wed Jan 08, 2014 9:03 am

Hi there,

Yes, certainly in these tests I would be keeping the data local and trying to fit everything in the 32K of onboard RAM. I would need to do that if I've any hope of linear speedup etc. I'd just start off with an embarrassingly simple problem with no communication other than the final result (CRC for example) and try to see how many cores it would take to outperform a Core i7. In fact, I'll also be looking at the larger picture - typically the way people use an i7 is through an operating system, so you don't have exact control over the allocation of thread to processor (you CAN set affinity, and when it runs it will run on that processor, but it still may be swapped out for another thread). With the Epiphany, there is no OS so you have access to the raw core which means no swapping out threads etc. So with an i7 your caching would be hurt if swapping threads very often but you wouldn't have the same issue on an Epiphany chip as there is no swapping. Hence why I say perhaps if there's enough threads doing different things there might be an advantage.

After that (CRC) I would move onto something like Mandelbrot which relies on passing data back. After that I would move onto something like Jacobi (Red Black SOR maybe) which relies heavily on inter-core communication. This starts to investigate the inter-core messaging. A reasonable size grid and code would easily fit on the local 32K RAM, but is the message passing the killer here? As it scales, does efficiency drop off dramatically?
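
To make the Jacobi / Red Black SOR idea concrete, here is a minimal single-process sketch of red-black SOR for Laplace's equation. It is not Epiphany code: the grid size, boundary values, `omega` and sweep count are all assumptions chosen for illustration. On a multi-core version each core would presumably hold one tile of the grid and exchange edge values with its neighbours after each colour pass, which is exactly the inter-core traffic in question.

```python
def red_black_sor(grid, omega=1.5, sweeps=200):
    """Run red-black SOR sweeps for Laplace's equation on a square grid, in place.

    Boundary rows/columns are held fixed. Points with (i + j) even are "red",
    the rest "black"; each colour depends only on the other colour, so all
    points of one colour could be updated in parallel.
    """
    n = len(grid)
    for _ in range(sweeps):
        for colour in (0, 1):
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 != colour:
                        continue
                    # Average of the four neighbours (all of the other colour).
                    neighbours = (grid[i - 1][j] + grid[i + 1][j] +
                                  grid[i][j - 1] + grid[i][j + 1]) / 4.0
                    # Over-relaxed update towards that average.
                    grid[i][j] += omega * (neighbours - grid[i][j])
    return grid

# ASSUMPTION: 16 x 16 grid, top edge held at 100.0, everything else starts at 0.0.
N = 16
grid = [[0.0] * N for _ in range(N)]
grid[0] = [100.0] * N
red_black_sor(grid)
print(f"centre value after relaxation: {grid[N // 2][N // 2]:.2f}")
```

For scale, a 32 x 32 tile of 4-byte floats takes 4 KB, so a tile like that plus the code would sit comfortably inside 32K.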

I don't necessarily expect the Epiphany to outperform a Core i7 - but what if it did? What if there was a certain type of program for which it did? And if that certain type of program happened to be something that mobiles would do, or if an existing mobile program could be restated and redesigned in such a way as to take advantage of that model, then (for that task at least) you would have the equivalent of an i7 inside your mobile but with much less power.

These are the kind of things I want to look at but it's very difficult without a board!

I didn't see your original very kind offer of sharing your board but if it is still open then I would love to take you up on it until mine arrives.

Alternatively, if anybody has ordered 2 boards and they have arrived and doesn't quite need their second board yet (or if somebody won't get a chance to use their single board for another few months due to existing project commitments etc) perhaps somebody could loan it to me? I know by the way that this is essentially saying "Hi, you know that board that you waited a year and a half to get? Well, could I have it?"!!! But I'm just asking because I'm really stuck.

Thanks,
Eoghan.
eoghanoh
 
Posts: 23
Joined: Mon Dec 17, 2012 3:22 am

Re: Epiphany vs Intel Core i7

Postby optimaler » Wed Jan 08, 2014 5:34 pm

Being a student myself, I can understand the pain of deadline extensions. They are not fun.

I don't know how soon you will have your board, but if you need extra resources before your April deadline I supposedly have a cluster on the way (8x16 epiphany cores) which I'd be fine giving you ssh access to once I get it set up. I'm probably not going to be able to work on my parallella ideas until March because of my own deadlines anyway. Obviously you won't be able to scale out to 64 cores or more without using MPI and dealing with the communication overhead that comes with it, and I can see why that might not be compatible with your experiment, but if you're okay with that then we can have a dialogue about it.
optimaler
 
Posts: 24
Joined: Mon Dec 17, 2012 3:29 am

Re: Epiphany vs Intel Core i7

Postby shodruk » Thu Jan 09, 2014 12:14 pm

eoghanoh wrote:With the Epiphany, there is no OS so you have access to the raw core which means no swapping out threads etc. So with an i7 your caching would be hurt if swapping threads very often but you wouldn't have the same issue on an Epiphany chip as there is no swapping. Hence why I say perhaps if there's enough threads doing different things there might be an advantage.


Interesting insight!
It also indicates the Epiphany's advantage over GPGPU.
I think Epiphany is excellent not only in data parallelism but also in task parallelism.
Shodruky
shodruk
 
Posts: 464
Joined: Mon Apr 08, 2013 7:03 pm

Re: Epiphany vs Intel Core i7

Postby ed2k » Fri Jan 10, 2014 12:33 am

shodruk wrote:Interesting insight!
It also indicates the Epiphany's advantage over GPGPU.
I think Epiphany is excellent not only in data parallelism but also in task parallelism.


On the Epiphany, you are dealing with bare metal.
ed2k
 
Posts: 113
Joined: Mon Dec 17, 2012 3:27 am

Re: Epiphany vs Intel Core i7

Postby eoghanoh » Sat Jan 11, 2014 10:16 pm

optimaler wrote:Being a student myself, I can understand the pain of deadline extensions. They are not fun.

I don't know how soon you will have your board, but if you need extra resources before your April deadline I supposedly have a cluster on the way (8x16 epiphany cores) which I'd be fine giving you ssh access to once I get it set up. I'm probably not going to be able to work on my parallella ideas until March because of my own deadlines anyway. Obviously you won't be able to scale out to 64 cores or more without using MPI and dealing with the communication overhead that comes with it, and I can see why that might not be compatible with your experiment, but if you're okay with that then we can have a dialogue about it.


That is a very nice offer from you optimaler, thank you. I may take you up on that if I still haven't got a 64 core board. But you're right, the message passing would be an overhead that would have to be adjusted for, somehow. If I can I'll stay away from it as it just brings in more complications!!

Thanks,
Eoghan.
eoghanoh
 
Posts: 23
Joined: Mon Dec 17, 2012 3:22 am

Re: Epiphany vs Intel Core i7

Postby greytery » Thu Jan 16, 2014 11:52 pm

Being a pensioner myself, I've decided that there's little point in extending my (undefined, but final :-( ) Deadline, but I hope there will be no pain. I'm patiently waiting for my two Epiphany III boards which I hope to get before that Deadline expires.

@eoghanoh - a couple of thoughts on your posts ..

You refer to shodruk's excellent and exciting demo: "if it scales linearly (as shodruk's mandelbrot demo shows)".
Shodruk writes that his multi-core timing results show that the application is not linear, but near linear.
Using Amdahl's equation to make a guess at the application seriality, it shows a small seriality factor as core numbers increase.
Playing with the numbers, extrapolating up to 64 and then on up to 1024, 4096 (er, I mean 4095) cores shows that this tiny serial component would be significant at the higher values. Theoretically though, it's still an impressive speedup, and that's before looking at the energy costs.
(I'm breaking a Golden Rule of performance engineering here: "Interpolate not extrapolate", but this is just for fun, OK)
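
A quick sketch of that extrapolation using Amdahl's law, speedup = 1 / (s + (1 - s) / N). The serial fraction below is an assumption picked only to show the shape of the curve; it is not fitted to shodruk's timings, which aren't reproduced in this thread.

```python
def amdahl_speedup(n_cores, serial_fraction):
    """Amdahl's law: speedup on n_cores when serial_fraction of the work cannot be parallelised."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

# ASSUMPTION: 0.2% serial work, chosen for illustration only; not a measured value.
SERIAL_FRACTION = 0.002

for n in (16, 64, 1024, 4096):
    s = amdahl_speedup(n, SERIAL_FRACTION)
    print(f"{n:5d} cores: speedup {s:7.1f}x, efficiency {s / n:.1%}")
```

With any fixed serial fraction s the speedup can never exceed 1/s (500x for the value assumed here), which is why a component that looks negligible at 16 cores dominates at 4096.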

You also show numbers for your application runs on the i7. Charting your results shows a huge tail-off as the effects of seriality and coherence (see N.Gunther) kick in from the Hyperthreading chip and operating system. If you tried this on someone's larger 12 (Hyperthreading) CPU Intel machine it's likely that you would not get much more speedup.
(I've worked on the performance of large Xeon and Hyperdome systems, which go up to a couple of hundred cores, and shown that real speedup curves can actually be negative at higher core numbers! Mileage varies considerably with the application).
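
For anyone who hasn't met Gunther's model, here is a rough sketch of his Universal Scalability Law, which adds a coherency term to the Amdahl-style contention term and so lets the curve bend back down. The two coefficients are assumptions for illustration, not values fitted to the i7 numbers or to any system mentioned here.

```python
import math

def usl_speedup(n, sigma, kappa):
    """Gunther's Universal Scalability Law: relative capacity at n processors.

    sigma models contention (the serial share); kappa models coherency delay.
    """
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))

# ASSUMPTION: illustrative coefficients, not fitted to any measurement in this thread.
SIGMA, KAPPA = 0.05, 0.0005

# Processor count at which the curve tops out: sqrt((1 - sigma) / kappa).
peak = math.sqrt((1.0 - SIGMA) / KAPPA)
print(f"speedup peaks near {peak:.0f} processors")
for n in (8, 32, 64, 128, 256):
    print(f"{n:4d} processors: {usl_speedup(n, SIGMA, KAPPA):5.2f}x")
```

Past the peak, adding processors makes the run slower, which is the downward-sloping curve described above. With kappa set to 0 the formula reduces to Amdahl's law.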

As and when you recode/rerun for Epiphany, it's likely that the coherence effects will be considerably less - as has been said, bare metal, less Windoze or Linuzzz to get in the way - but there will still be some seriality which becomes significant at higher CPU numbers.

So I don't think there is "any hope of linear speedup". Near, but no cigar.

Looking forward to any results you may get if/when you get access to the chips. These are exactly the sort of games I intend to play myself. :-)

tery
greytery
 
Posts: 205
Joined: Sat Dec 07, 2013 12:19 pm
Location: ^Wycombe, UK

Re: Epiphany vs Intel Core i7

Postby mhonman » Fri Jan 17, 2014 12:12 pm

I have an Intel Core i7 2GHz (turbo boosting to 2.6GHz) <snip>
Here are the matrix multiplication times:
1 Core - 98 seconds
2 Cores - 55 seconds
4 Cores - 41 seconds
8 Cores - 32 seconds


Something to bear in mind is that Intel's "Turbo Boost"* works best when only one core is active - when all cores are active too much heat is generated for them to all run at the top frequency (the effectiveness of Turbo Boost is determined by Intel's marketing people, so you'll have to check Intel's documentation for your CPU to discover its Turbo Boost rules). For a better idea of how the problem scales on the i7, try turning off Turbo Boost.

When it comes to the lack of speedup from 4 to 8 cores, have you actually got a 4-core processor there? The i7 is a superscalar architecture, i.e. it has extra functional units beyond those visible in the Instruction Set Architecture - this allows the processor to simultaneously schedule several instructions that need to use the same kind of functional unit. Hyper-threading is another way of sharing those functional units and since they are limited in number there will not be much speedup if a single thread is already able to keep most of them busy.

Conversely when writing "tight" code for the Epiphany cores one needs to be aware that they are not superscalar. In particular register dependencies will cause pipeline stalls (see "Pipeline description" in the architecture reference). The gcc compiler is able to identify instruction-level parallelism, but in order to get best results from the FPU the inner loop must have enough independent work (e.g. computation of different values in the result matrix) for the compiler to work with. The Adapteva matrix multiplication example is, well, a good example of how to do this. The reasons behind this are very well explained in Paralant's matrix multiplication whitepaper.

* sadly the term "Turbo Boost" reminds me of the little red button some 80286 PCs were equipped with - its primary function seemed to be to make them crash more often than usual.
mhonman
 
Posts: 112
Joined: Thu Apr 25, 2013 2:22 pm

Re: Epiphany vs Intel Core i7

Postby shodruk » Fri Jan 17, 2014 6:17 pm

greytery wrote:You refer to shodruk's excellent and exciting demo: "if it scales linearly (as shodruk's mandelbrot demo shows)".
Shodruk writes that his multi-core timing results show that the application is not linear, but near linear.


The scaling linearity of my mandelbrot example is 0.97.
I think this is a very good result compared to a general multicore environment, but why isn't it completely linear?
The missing 0.03 is probably due to framebuffer blitting.
The eCore local memory is completely independent from the others, so that part must scale linearly, but the framebuffer is shared by all the eCores and the host cores, so it never scales.
The more the calculation speeds up, the more the bottleneck of the framebuffer (shared memory) stands out.
Shodruky
shodruk
 
Posts: 464
Joined: Mon Apr 08, 2013 7:03 pm
