Calling e_dma_copy from within a kernel

Moderator: dar

Post by nickoppen » Sun Jan 03, 2016 6:53 am

Hi,

I've been trying to compile the nbody example from the US Army Research Lab's MPI examples.

When using the provided Makefile, I get three linker errors:

Code:
/usr/local/browndeer/include/coprthr_xdevice.h:42: undefined reference to `e_mutex_lock(unsigned int, unsigned int, int*)'
/usr/local/browndeer/include/coprthr_xdevice.h:45: undefined reference to `e_mutex_unlock(unsigned int, unsigned int, int*)'
/tmp/xclfoLLmH/AE1nwe.cpp:69: undefined reference to `e_dma_copy(void*, void*, unsigned long)'

The first two are peculiar because the calls come from within a COPRTHR include file. The third one is really interesting because I'd love to be able to call e_dma_copy from my kernel and take advantage of the speed increase over a copy loop.

I think that the relevant environment variables are:

Code:
EPIPHANY_HDF=/opt/adapteva/esdk/bsps/current/platform.hdf
EPIPHANY_HOME=/opt/adapteva/esdk
LD_LIBRARY_PATH=/usr/local/browndeer/lib:/opt/adapteva/esdk/tools/host/lib:/opt/openmpi/lib:
PATH=/home/parallella/bin:/usr/local/browndeer/bin:/opt/adapteva/esdk/tools/host/bin:/opt/adapteva/esdk/tools/e-gnu/bin:/opt/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

Is there anything else I need?
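
A quick sanity check on settings like these can be sketched in shell. The function name check_epiphany_env is made up for illustration and is not part of the eSDK or COPRTHR; it only confirms that EPIPHANY_HOME is a directory, that EPIPHANY_HDF is a file, and that the tools named as arguments are on PATH.

```shell
# Sketch only: check_epiphany_env is a hypothetical helper, not an SDK tool.
# It reports problems on stderr and returns non-zero if anything is missing.
check_epiphany_env() {
    status=0
    # EPIPHANY_HOME should point at the eSDK root directory.
    if [ ! -d "${EPIPHANY_HOME:-}" ]; then
        echo "EPIPHANY_HOME is unset or not a directory" >&2
        status=1
    fi
    # EPIPHANY_HDF should point at a platform description file.
    if [ ! -f "${EPIPHANY_HDF:-}" ]; then
        echo "EPIPHANY_HDF is unset or not a file" >&2
        status=1
    fi
    # Every tool named on the command line must be reachable through PATH.
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH" >&2
            status=1
        fi
    done
    return "$status"
}
```

For the build discussed here it could be called as `check_epiphany_env clcc epiphany-elf-objcopy`, since both of those tools are invoked during compilation.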

Thanks,

nick
Sharing is what makes the internet Great!
nickoppen
 
Posts: 266
Joined: Mon Dec 17, 2012 3:21 am
Location: Sydney NSW, Australia

Re: Calling e_dma_copy from within a kernel

Post by jar » Mon Jan 04, 2016 5:00 pm

I'm not sure why you have errors. I believe the threaded MPI COPRTHR beta download should be used rather than the COPRTHR source on GitHub.

The kernel already has e_dma_copy in it for off-chip DMAs (bringing data in and writing out results). The on-chip MPI_Sendrecv_replace routine in the n-body code uses the DMA engine on the back end, although that's an implementation detail the application developer doesn't need to know. Regardless of the copy method, the n-body algorithm is not communication-bound for large problems, so even if a more efficient copy method were written, it would not increase performance significantly (less than 5%).

I see you have some OpenMPI stuff installed. OpenMPI is different from COPRTHR threaded MPI. OpenMPI is strictly for the ARM host cores, not the Epiphany cores.
jar
 
Posts: 295
Joined: Mon Dec 17, 2012 3:27 am

Re: Calling e_dma_copy from within a kernel

Post by nickoppen » Wed Jan 06, 2016 12:17 pm

Thanks jar. That's actually very frustrating. I spent hours trying to figure out how to get the COPRTHR MPI interface and the best I could find was the README.md from the para-para example. I also downloaded a tiny compressed file from browndeer that included the coprthr_mpi.h file, but that seemed too small to be anything of substance. Clearly not the good stuff.

Would you be so kind as to post a link to the COPRTHR beta version that includes the MPI extensions? I'll get rid of OpenMPI, and that might get the e_dma_copy and mutex_lock problems sorted as well.

nick

Re: Calling e_dma_copy from within a kernel

Post by jar » Wed Jan 06, 2016 3:18 pm

nickoppen wrote: Would you be so kind as to post a link to the COPRTHR beta version that includes the MPI extensions? I'll get rid of OpenMPI, and that might get the e_dma_copy and mutex_lock problems sorted as well.


Here's an article discussing the work for reference:
https://www.parallella.org/2015/04/09/t ... hitecture/

Current beta (preview) as of this post:
http://www.browndeertechnology.com/code ... review.tgz

Re: Calling e_dma_copy from within a kernel

Post by nickoppen » Thu Jan 14, 2016 10:45 am

I'm missing something here.

Having gotten rid of OpenMPI and installed the coprthr_mpi file (http://www.browndeertechnology.com/code ... review.tgz), I still get:

Code:
parallella@parallella:~/Work/mpi-epiphany/nbody$ make
clcc --coprthr-cc -mtarget=e32 -D__link_mpi__ --dump-bin -I/opt/adapteva/esdk/tools/e-gnu/epiphany-elf/include -I ./ -DECORE=16 -DNPARTICLE=2048 -DSTEPS=32 -DSIZE=1 -DMPI_BUF_SIZE=1024 -DCOPRTHR_MPI_COMPAT mpi_tfunc.c
e32.o: In function `coprthr_mutex_lock':
/usr/local/browndeer/include/coprthr_xdevice.h:42: undefined reference to `e_mutex_lock(unsigned int, unsigned int, int*)'
/usr/local/browndeer/include/coprthr_xdevice.h:42: undefined reference to `e_mutex_lock(unsigned int, unsigned int, int*)'
e32.o: In function `coprthr_mutex_unlock':
/usr/local/browndeer/include/coprthr_xdevice.h:45: undefined reference to `e_mutex_unlock(unsigned int, unsigned int, int*)'
/usr/local/browndeer/include/coprthr_xdevice.h:45: undefined reference to `e_mutex_unlock(unsigned int, unsigned int, int*)'
e32.o: In function `nbody_thread':
/tmp/xclACl76Z/onoE1P.cpp:69: undefined reference to `e_dma_copy(void*, void*, unsigned long)'
/tmp/xclACl76Z/onoE1P.cpp:69: undefined reference to `e_dma_copy(void*, void*, unsigned long)'
collect2: error: ld returned 1 exit status
epiphany-elf-objcopy: 'e32.0.elf': No such file
cat: e32.0.srec: No such file or directory
nm: 'e32.0.elf': No such file
[2793] clmesg ERROR: computil_e32.h(397): addr_core_local_data not found
[2793] clmesg ERROR: compiler_cc.c(212): compile returned -11

[2792] clmesg ERROR: clcc.c(668): clcc: clcc1 returned non-zero exit status 11
make: *** [mpi_tfunc.cbin.3.e32] Error 255


I found e_dma.h and e_mutex.h and added a -I switch for their directory, as per the mpi_fft2d example in parallella-examples, but that had no effect. Indeed, the mpi_fft2d example gave me the same result.

I must be missing something and I have no idea what or where to look.

Any suggestions would be greatly appreciated.

nick

Re: Calling e_dma_copy from within a kernel

Post by jar » Thu Jan 14, 2016 5:50 pm

I'm sorry you're having a tough time with this. It appears that the compilation process for clcc isn't finding the correct binary for e_mutex_*.

There is a utility called 'cldebug' included with the COPRTHR SDK. The cldebug utility can be used to dump out extra information about the compile process and even save its temporary build directory. What you are seeing is just the error, not what caused it.

Try running the following:
Code:
cldebug -t ./temp -- clcc --coprthr-cc -mtarget=e32 -D__link_mpi__ --dump-bin -I/opt/adapteva/esdk/tools/e-gnu/epiphany-elf/include -I ./ -DECORE=16 -DNPARTICLE=2048 -DSTEPS=32 -DSIZE=1 -DMPI_BUF_SIZE=1024 -DCOPRTHR_MPI_COMPAT mpi_tfunc.c


The "-t ./temp" will save the temporary files, but may not be necessary. The extra information should help you identify what's going wrong.
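
One way to sift through that extra information is to look for include directories that are named on a compiler command line but do not exist on disk. The sketch below assumes the saved log contains command lines with -I<dir>, "-I <dir>" or "-isystem <dir>" options; the actual cldebug output format is not shown in this thread, and missing_include_dirs is a made-up helper name.

```shell
# Sketch only: missing_include_dirs is a hypothetical helper.
# $1 is a log file; the assumed format is ordinary compiler command lines.
# Directory names containing spaces are not handled.
missing_include_dirs() {
    # Put every whitespace-separated word on its own line.
    tr -s ' \t' '\n\n' < "$1" |
    awk '
        NF == 0 { next }                                  # skip empty lines
        want    { print; want = 0; next }                 # directory is the word after the flag
        $0 == "-I" || $0 == "-isystem" { want = 1; next }
        /^-I./  { print substr($0, 3) }                   # directory attached to -I
    ' |
    sort -u |
    while IFS= read -r dir; do
        [ -d "$dir" ] || printf 'missing: %s\n' "$dir"
    done
}
```

Run it against whichever log file the cldebug invocation leaves behind, for example `missing_include_dirs build.log`, where build.log is a placeholder name.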

Re: Calling e_dma_copy from within a kernel

Post by nickoppen » Fri Jan 15, 2016 2:32 am

Hi JAR,

Thanks for that. I followed the log file through and everything was going fine until there was a reference to /opt/adapteva/esdk/tools/e-gnu/epiphany-elf/sys-include. I don't have such a directory. I've got /opt/adapteva/esdk/tools/e-gnu/epiphany-elf/include, which has e_mutex.h and e_dma.h in it.

When I created a soft link to ...-elf/include, giving myself a ...-elf/sys-include directory, everything was happy.
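
For anyone repeating the workaround, it can be sketched as a small shell function. The default path is the one quoted in this post; the name link_sys_include is made up for illustration, and writing under /opt will normally need root.

```shell
# Sketch only: link_sys_include is a hypothetical helper, not part of any SDK.
# It makes <elf_dir>/sys-include a symbolic link to <elf_dir>/include.
link_sys_include() {
    # Default taken from the paths quoted in this thread; pass another as $1.
    elf_dir=${1:-/opt/adapteva/esdk/tools/e-gnu/epiphany-elf}
    if [ ! -d "$elf_dir/include" ]; then
        echo "no include directory under $elf_dir" >&2
        return 1
    fi
    # Never overwrite an existing directory or link (including a dangling one).
    if [ -e "$elf_dir/sys-include" ] || [ -L "$elf_dir/sys-include" ]; then
        echo "$elf_dir/sys-include already exists, leaving it alone" >&2
        return 0
    fi
    # Relative target, so the link stays valid if the SDK tree is moved.
    ln -s include "$elf_dir/sys-include"
}
```

Calling `link_sys_include` with no argument, from a root shell, acts on the default path; a later SDK reinstall may remove the link, so it might need to be recreated.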

I'm happy with this workaround, but is there a setting somewhere that causes the script to use ...-elf/sys-include rather than ...-elf/include?

nick

Re: Calling e_dma_copy from within a kernel

Post by jar » Fri Jan 15, 2016 2:49 am

It's probably set in the configure process while building the COPRTHR SDK.

Re: Calling e_dma_copy from within a kernel

Post by nickoppen » Fri Jan 15, 2016 9:10 am

Thanks. I'll live with it for now. Hopefully version 2 will have it fixed.

nick

