
OpenCL is the best.

Posted: Mon Dec 17, 2012 6:59 am
by Ariemeth
This is the reason I was excited about the Parallella board. Too bad I'll have to make do with 16 cores instead of 64.

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 8:15 am
by Dade
I'm looking forward to OpenCL support too :D

BTW, is there going to be an "official" OpenCL SDK for Parallella, or will the community need to port one of the available open source implementations?

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 9:28 am
by the_summer
Can you recommend a good resource for learning OpenCL? I am familiar with general C/C++ programming, but parallel programming needs a different way of thinking, I guess.

Best regards,

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 9:37 am
by BlueByLiquid
@the_summer,

"OpenCL in Action" is very good. "Heterogeneous Computing with OpenCL" and the "OpenCL Programming Guide" are also very good. I have all three, and each is good in its own way. "OpenCL in Action" is the easiest read if you are new to OpenCL.

If you are looking for free resources, there are a number out there. AMD and NVIDIA have their own guides, but they are geared towards their own hardware. Intel also has a small amount of coverage on the subject. I would highly recommend any of the books, as their coverage is fairly complete. That said, programming this board will be different from programming a GPU or a CPU, but the books still demonstrate the concepts and the language well.

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 10:41 am
by the_summer
Thank you very much. I will see if I can get one of the books before Christmas. ;-)

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 1:17 pm
by Marco
Same here. I've got no idea how to write parallel code, but hopefully the PDF book supplied by Parallella will help. ;)

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 2:29 pm
by Ariemeth
The biggest hurdle for me when I started working on parallel code was getting over how ugly the code looked once it was set up to run efficiently in parallel. I find both "OpenCL Programming Guide" and "Heterogeneous Computing with OpenCL", which just released a second edition, to be very good.

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 3:34 pm
by jar
For those of you interested in the Parallella SDK, see the 'current' branch (https://github.com/browndeer/coprthr/tree/current) of the COPRTHR SDK (https://github.com/browndeer/coprthr). I'm not sure if Adapteva is going to be supporting a fork of this code but this is what they're basing the Parallella SDK on.

The COPRTHR SDK includes an OpenCL 1.1 implementation with a compiler based on GNU/GCC that supports x86_64, ARM, and Epiphany microarchitectures. Also included is a lightweight interface to OpenCL, called Standard Compute Layer (STDCL), that should ease the development of new codes and help new programmers get started. For those of you who have written OpenCL codes, you know how verbose the API can be. The STDCL interface is closer to the CUDA API in simplicity, yet portable across architectures and other vendors' OpenCL implementations.

The SDK also includes other tools related to OpenCL software development such as clcc (an offline OpenCL kernel compilation tool) and cltrace (a tool used to trace OpenCL API calls).

There's a lot more to the SDK, but this should help you guys get started. See some of the example codes that are included (https://github.com/browndeer/coprthr/tree/current/examples).

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 3:45 pm
by Ariemeth
cltrace was handy at times.

Re: OpenCL is the best.

Posted: Mon Dec 17, 2012 4:05 pm
by jar
@Ariemeth

The only thing I'm aware of in the STDCL interface that uses macros that call other macros would be the clforka() call in stdcl_host.h. Consider the following OpenCL example snippet:
Code:
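   // Bind the four kernel arguments one at a time: n, then the three buffers.
   clSetKernelArg(krn,0,sizeof(cl_uint),&n);
   clSetKernelArg(krn,1,sizeof(cl_mem),&aa_buf);
   clSetKernelArg(krn,2,sizeof(cl_mem),&b_buf);
   clSetKernelArg(krn,3,sizeof(cl_mem),&c_buf);
   // One-dimensional range: n work-items in total, in work-groups of 64.
   size_t gtdsz[] = { n };
   size_t ltdsz[] = { 64 };
   cl_event ev[10];
   // Enqueue the kernel; ev[0] receives the event for this launch.
   clEnqueueNDRangeKernel(cmdq,krn,1,0,gtdsz,ltdsz,0,0,&ev[0]);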

This may be replaced by the following STDCL code (clndrange_init1d and clforka are macros):
Code:
   clndrange_t ndr = clndrange_init1d( 0, n, 64);
   clforka(cp,devnum,krn,&ndr,CL_EVENT_NOWAIT,n,aa,b,c);


These snippets are taken from the hello_opencl and hello_stdcl examples, respectively.
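
If the global and local sizes in those calls look opaque, here is a minimal plain-C sketch of what a 1-D launch amounts to. It is a CPU emulation only, not OpenCL or STDCL code. The names run_ndrange_1d and matvec_work_item are made up for illustration, and the kernel body (each work-item computes one row of a matrix-vector product) is an assumption about what krn does, not something taken from the examples.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel body: one work-item computes one row
   of c = A * b. In real OpenCL C this would be a __kernel function and
   gid would come from get_global_id(0). */
void matvec_work_item(size_t gid, unsigned n,
                      const float *aa, const float *b, float *c)
{
    float sum = 0.0f;
    for (unsigned j = 0; j < n; j++)
        sum += aa[gid * n + j] * b[j];
    c[gid] = sum;
}

/* CPU emulation of a 1-D NDRange launch: global_size work-items, split into
   work-groups of local_size. Returns the number of work-groups executed. */
size_t run_ndrange_1d(size_t global_size, size_t local_size, unsigned n,
                      const float *aa, const float *b, float *c)
{
    /* OpenCL 1.1 requires the global size to be a multiple of the local size. */
    assert(local_size > 0 && global_size % local_size == 0);

    size_t num_groups = global_size / local_size;
    for (size_t group = 0; group < num_groups; group++) {
        for (size_t lid = 0; lid < local_size; lid++) {
            /* Same value get_global_id(0) would return for this work-item. */
            size_t gid = group * local_size + lid;
            matvec_work_item(gid, n, aa, b, c);
        }
    }
    return num_groups;
}
```

With a global size of 128 and a local size of 64, the loop above runs two work-groups of 64 work-items each. On real hardware the work-items run in parallel rather than in a loop, which is the whole point.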

You don't have to use the STDCL macros if you don't want to. All the underlying OpenCL data structures are exposed if you choose to use them. It's left up to the programmer to decide which they would rather use.
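
For anyone curious how a macro can take a variable list of kernel arguments and turn it into one set-argument call per argument, here is a rough sketch of the general preprocessor trick. To be clear, this is not how stdcl_host.h is written; SET_ARGS, PICK, mock_kernel and mock_set_arg are all hypothetical names, and mock_set_arg only stands in for clSetKernelArg so the example compiles without an OpenCL installation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 4

/* Hypothetical stand-in for a kernel object; it just records what was set. */
struct mock_kernel {
    unsigned count;
    size_t size[MAX_ARGS];
    const void *value[MAX_ARGS];
};

/* Plays the role of clSetKernelArg: one argument per call, 0 on success. */
int mock_set_arg(struct mock_kernel *k, unsigned index,
                 size_t size, const void *value)
{
    assert(k != NULL);
    if (index >= MAX_ARGS)
        return -1;
    k->size[index] = size;
    k->value[index] = value;
    if (index + 1 > k->count)
        k->count = index + 1;
    return 0;
}

/* One argument: size and address are derived from the variable itself. */
#define SET_ARG(k, i, a) mock_set_arg((k), (i), sizeof(a), &(a))

/* Each macro calls the next smaller one, then sets one more argument. */
#define SET_ARGS_1(k, a0)             (SET_ARG(k, 0, a0))
#define SET_ARGS_2(k, a0, a1)         (SET_ARGS_1(k, a0), SET_ARG(k, 1, a1))
#define SET_ARGS_3(k, a0, a1, a2)     (SET_ARGS_2(k, a0, a1), SET_ARG(k, 2, a2))
#define SET_ARGS_4(k, a0, a1, a2, a3) (SET_ARGS_3(k, a0, a1, a2), SET_ARG(k, 3, a3))

/* Select the right SET_ARGS_n by counting the arguments (1 to 4 supported). */
#define PICK(_1, _2, _3, _4, NAME, ...) NAME
#define SET_ARGS(k, ...) \
    PICK(__VA_ARGS__, SET_ARGS_4, SET_ARGS_3, SET_ARGS_2, SET_ARGS_1, 0)(k, __VA_ARGS__)
```

Calling SET_ARGS(&k, n, aa, b, c) then expands to four mock_set_arg calls in order. The arguments have to be variables (lvalues), since the macro takes their address.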