OpenCL SDK and ESDK misalignment

Moderator: dar

Postby Dade » Sat Apr 06, 2013 3:34 pm

I'm trying to install the Parallella OpenCL SDK (binary version downloaded from http://www.browndeertechnology.com/code ... 130215.tgz), and there may be a misalignment between the OpenCL SDK binaries and the Epiphany SDK (/opt/adapteva/esdk.4.13.03.30) I received with the board.

Once I'm at the point of "make quicktest", I get the following error:

Code: Select all
clinfo.c:407:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘cl_ulong’ [-Wformat]
/usr/bin/ld: warning: libe-loader.so.1, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libe-host.so.1, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libe_platform.so, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_open'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_set_host_verbosity'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_mwrite_buf'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_get_platform_info'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_load'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_alloc'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_set_loader_verbosity'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_mread_buf'
collect2: ld returned 1 exit status
make: *** [clinfo.x] Error 1


libcoprthr-e.so seems to refer to libraries that do not exist in my ESDK installation. The same symbols seem to be defined in another library (/opt/adapteva/esdk/tools/host/lib/libe-hal.so), so I added "-L/opt/adapteva/esdk/tools/host/lib -le-hal" to the makefile. The result is:

Code: Select all
/usr/bin/ld: warning: libe-loader.so.1, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libe-host.so.1, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libe_platform.so, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_mwrite_buf'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_mread_buf'
collect2: ld returned 1 exit status
make: *** [clinfo.x] Error 1


e_mread_buf/e_mwrite_buf seem to be defined inside libe-hal (but maybe with a different signature).

It looks like the OpenCL SDK was built to work with a different version of the Epiphany SDK?
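
One way to triage linker output like the logs above is to separate the missing shared libraries from the undefined symbols, since the first list points at what the binary was built against. The following is a minimal Python sketch; the function name summarize_ld_output is hypothetical, and the sample lines are copied from the second log above.

```python
import re

def summarize_ld_output(text):
    """Collect missing shared libraries and undefined symbols from GNU ld output."""
    missing_libs = []
    undefined = []
    for line in text.splitlines():
        # e.g. "warning: libe-host.so.1, needed by /path/lib.so, not found"
        m = re.search(r"warning: (\S+), needed by \S+, not found", line)
        if m and m.group(1) not in missing_libs:
            missing_libs.append(m.group(1))
        # e.g. "undefined reference to `e_open'"
        m = re.search(r"undefined reference to [`'](\w+)'", line)
        if m and m.group(1) not in undefined:
            undefined.append(m.group(1))
    return missing_libs, undefined

sample = """\
/usr/bin/ld: warning: libe-loader.so.1, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libe-host.so.1, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libe_platform.so, needed by /usr/local/browndeer/lib/libcoprthr-e.so, not found (try using -rpath or -rpath-link)
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_mwrite_buf'
/usr/local/browndeer/lib/libcoprthr-e.so: undefined reference to `e_mread_buf'
"""

libs, syms = summarize_ld_output(sample)
print("missing libraries:", libs)
print("undefined symbols:", syms)
```

In practice the text would come from the captured output of the failing make step rather than from a hard-coded sample.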

I have also tried to compile the OpenCL SDK from source, but I end up with an internal compiler error:

Code: Select all
make -C tools/cltrace 
make[1]: Entering directory `/home/linaro/opt/browndeer-src/coprthr/tools/cltrace'
cc -O1 -fPIC   -I/home/linaro/opt/browndeer-src/coprthr/include -c libcltrace1.c
libcltrace1.c: In function ‘_libcltrace1_fini’:
libcltrace1.c:245:6: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘long long unsigned int’ [-Wformat]
cc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.6/README.Bugs> for instructions.
make[1]: *** [libcltrace1.o] Error 4
make[1]: Leaving directory `/home/linaro/opt/browndeer-src/coprthr/tools/cltrace'
make: *** [tools/cltrace] Error 2

Re: OpenCL SDK and ESDK misalignment

Postby dar » Sat Apr 06, 2013 4:39 pm

rc2 was built against eSDK 4.13.01.04, so it appears we have a misalignment.

Regarding the build from source, we have not run into that internal compiler error. Note that if you are getting stuck in tools/cltrace, that tool is deprecated and is being replaced with the cldebug interface, so you could try

./configure --enable-epiphany --disable-cltrace

in order to bypass it.

Also, note that building the COPRTHR package in parts can be tricky, since the steps have to be done in a certain order. Until you get comfortable with it, it is always best to run make in the root directory.

We will try to get this misalignment resolved. In principle we should not have these issues going forward, once the paths and API become fixed in place. It only takes a small detail to knock this off track.

Finally, be aware there is a kernel launch lock-up issue we also need to resolve. We have been working on it and believe we have a solution that will go out in rc3. The practical consequence is that you can run OpenCL codes fine (we do this a lot using our prototype), but you will run into intermittent errors. This intermittent error will cause the quicktest to fail, since it hammers the OpenCL implementation with hundreds of kernel tests and one is bound to hit the lock-up issue. Just be aware of this, and know we are aware of it and are working to get the fix pushed out in a release. It's been a difficult issue to track down.

Re: OpenCL SDK and ESDK misalignment

Postby Dade » Sat Apr 06, 2013 7:37 pm

dar wrote: rc2 was built against eSDK 4.13.01.04, so it appears we have a misalignment.


Thanks Dar, I have downloaded eSDK 4.13.01.04 from the Parallella FTP site and I can now compile clinfo without any problem.

According to ftp://ftp.parallella.org/esdk/esdk.revisions.txt:

Code: Select all
esdk.3.12.11.20_linux_x86_64.tgz  - latest EMEK3 eSDK release
esdk.4.13.01.04_linux_armv7l.tgz  - Parallella Prototype ZedBoard/E16 eSDK release, with e-host driver
esdk.4.13.03.30_linux_armv7l.tgz  - Parallella Prototype ZedBoard/E16/E64 eSDK release, with e-hal driver


So I guess something has drastically changed in the newer version: it looks like there is a single e-hal library, etc.
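
To see at a glance which release ships which host driver, the listing above can be parsed mechanically. This is only a sketch: parse_revisions is a hypothetical helper, and the regular expressions assume the file keeps the exact line format quoted above.

```python
import re

# Listing copied from esdk.revisions.txt as quoted in this thread.
REVISIONS = """\
esdk.3.12.11.20_linux_x86_64.tgz  - latest EMEK3 eSDK release
esdk.4.13.01.04_linux_armv7l.tgz  - Parallella Prototype ZedBoard/E16 eSDK release, with e-host driver
esdk.4.13.03.30_linux_armv7l.tgz  - Parallella Prototype ZedBoard/E16/E64 eSDK release, with e-hal driver
"""

def parse_revisions(text):
    """Return one dict per release line: version, arch, and driver (or None)."""
    releases = []
    for line in text.splitlines():
        m = re.match(r"esdk\.([\d.]+)_linux_(\w+)\.tgz\s+-\s+(.*)", line.strip())
        if not m:
            continue  # skip anything that is not a release line
        version, arch, note = m.groups()
        driver = re.search(r"with (e-\w+) driver", note)
        releases.append({
            "version": version,
            "arch": arch,
            "driver": driver.group(1) if driver else None,
        })
    return releases

releases = parse_revisions(REVISIONS)
for r in releases:
    print(r["version"], r["arch"], r["driver"])
```

The driver column makes the change visible: the release that rc2 was built against uses e-host, while the one shipped with the board uses e-hal.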

I had another problem running clinfo:

Code: Select all
[4434] clmesg CRITICAL: libocl.c(849): Linux mmap does not support MAP_NOSYNC, demand a work-around
_libocl_clproc_state 0xb6f5b000
[4434] clinfo: report OpenCL platform and device information: 127.0.1.1
[4434] clmesg WARNING: libocl.c(149): cannot read ocl.conf, using ICD fallback (/etc/OpenCL/vendors
[4434] nplatforms0
[4434] clinfo: No platforms found


I assume the first CRITICAL message is just a warning like the second one. The ICD support was installed correctly: /etc/OpenCL/vendors/coprthr-e.icd contained the path to the shared library as expected, "/usr/local/browndeer/lib/libcoprthr-e.so". However, running ldd on the .so showed some missing dependencies:

Code: Select all
linaro@linaro-ubuntu-desktop:~/projects/ocl-examples/examples$ ldd /usr/local/browndeer/lib/libcoprthr-e.so
   libelf.so.0 => /usr/local/lib/libelf.so.0 (0xb6ead000)
   libe-loader.so.1 => not found
   libe-host.so.1 => not found
   libpthread.so.0 => /lib/arm-linux-gnueabi/libpthread.so.0 (0xb6e89000)
   libdl.so.2 => /lib/arm-linux-gnueabi/libdl.so.2 (0xb6e7e000)
   libe_platform.so => /usr/local/browndeer/lib/libe_platform.so (0xb6e74000)
   libc.so.6 => /lib/arm-linux-gnueabi/libc.so.6 (0xb6d92000)
   /lib/ld-linux.so.3 (0xb6efe000)


In the end it was just a wrong path in the LD_LIBRARY_PATH declaration. Apparently the old eSDK also used a different path for its libraries, so changing the /opt/adapteva/esdk symbolic link is not enough to switch eSDK versions. I'm noting this in case someone else runs into the same problem.
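
A quick way to catch this kind of problem is to scan the ldd output for "not found" entries before running anything. Below is a minimal Python sketch; missing_dependencies and run_ldd are hypothetical helper names, and the sample text is the ldd output shown above.

```python
import subprocess

def missing_dependencies(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing

def run_ldd(path):
    """Run ldd on a shared object and return its text output (Linux only)."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return result.stdout

sample = """\
   libelf.so.0 => /usr/local/lib/libelf.so.0 (0xb6ead000)
   libe-loader.so.1 => not found
   libe-host.so.1 => not found
   libpthread.so.0 => /lib/arm-linux-gnueabi/libpthread.so.0 (0xb6e89000)
   libe_platform.so => /usr/local/browndeer/lib/libe_platform.so (0xb6e74000)
   /lib/ld-linux.so.3 (0xb6efe000)
"""

missing = missing_dependencies(sample)
print("unresolved:", missing)
```

On the board, missing_dependencies(run_ldd("/usr/local/browndeer/lib/libcoprthr-e.so")) should return an empty list once LD_LIBRARY_PATH points at the library directory of the matching eSDK.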

Now clinfo works fine:

Code: Select all
linaro@linaro-ubuntu-desktop:~/projects/ocl-examples/examples$ ./clinfo
[4758] clmesg CRITICAL: libocl.c(849): Linux mmap does not support MAP_NOSYNC, demand a work-around
_libocl_clproc_state 0xb6f59000
[4758] clinfo: report OpenCL platform and device information: 127.0.1.1
[4758] clmesg WARNING: libocl.c(149): cannot read ocl.conf, using ICD fallback (/etc/OpenCL/vendors
coprthr-1.5.0-RC1 (Marathon)
clinfo: e_open(): mmap failure.
clinfo: e_alloc(): mmap failure.
[4758] clmesg ERROR: device.c(257): e_alloc returned 1
[4758] nplatforms1
[4758] clinfo: Number of platforms found = 1
[4758] clinfo: platform 0:
[4758] clinfo: CL_PLATFORM_PROFILE = <profile>
[4758] clinfo: CL_PLATFORM_VERSION = coprthr-1.5.0-RC1 (Marathon)
[4758] clinfo: CL_PLATFORM_NAME = coprthr-e
[4758] clinfo: CL_PLATFORM_VENDOR = Brown Deer Technology, LLC.
[4758] clinfo: CL_PLATFORM_EXTENSIONS = cl_khr_icd
[4758] clinfo: Number of devices found for this platform = 1
clinfo: device 0:
clinfo: CL_DEVICE_TYPE =  ACCELERATOR
clinfo: CL_DEVICE_VENDOR_ID = 9002
clinfo: CL_DEVICE_MAX_COMPUTE_UNITS = 16
clinfo: CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS = 3
clinfo: CL_DEVICE_MAX_WORK_ITEM_SIZES =  1024 (symmetric)
clinfo: CL_DEVICE_MAX_WORK_GROUP_SIZE = 16
clinfo: CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR = 4
clinfo: CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT = 2
clinfo: CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT = 1
clinfo: CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG = 8
clinfo: CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT = 1
clinfo: CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE = 8
clinfo: CL_DEVICE_MAX_CLOCK_FREQUENCY = 1000
clinfo: CL_DEVICE_ADDRESS_BITS = 0x20
clinfo: CL_DEVICE_GLOBAL_MEM_SZ = 33554432
clinfo: CL_DEVICE_IMAGE_SUPPORT = false
clinfo: CL_DEVICE_MAX_PARAMETER_SIZE = 256
clinfo: CL_DEVICE_MEM_BASE_ADDRESS_ALIGN = 64
clinfo: CL_DEVICE_NAME = E16G Needham
clinfo: CL_DEVICE_VENDOR = Adapteva, Inc.
clinfo: CL_DEVICE_VERSION =
clinfo: CL_DRIVER_VERSION =
clinfo: CL_DEVICE_LOCAL_MEM_SIZE = 32768
[4758] clinfo: clCreateContext 0x1fd70 (0)
[4758] clinfo: clCreateCommandQueue [0] 0x1fdc8 (0)
[4758] clmesg info: cmdsched.c(88): cmdqx0: run
[4758] clmesg info: cmdsched.c(193): cmdqx0: shutdown
done.


However, the e_open()/e_alloc() errors look a bit suspicious.

dar wrote: Finally, be aware there is a kernel launch lock-up issue we also need to resolve.


OK, thanks, I will keep this problem in mind.

Re: OpenCL SDK and ESDK misalignment

Postby ysapir » Sun Apr 07, 2013 7:58 am

@Dade,

"So I guess something has drastically changed in the newer version: it looks like there is a single e-hal library, etc."

Many things have radically changed in the new release. It is totally incompatible with the January release. Most probably, any host app (including the OpenCL environment) needs to be revised and rebuilt to work with the new interface.

You can find more information in the "using_esdk_for_a_host_side_application.pdf" document and in the SDK Reference manual (both in the docs/ directory).

Re: OpenCL SDK and ESDK misalignment

Postby Dade » Sun Apr 07, 2013 8:21 am

ysapir wrote: Many things have radically changed in the new release. It is totally incompatible with the January release. Most probably, any host app (including the OpenCL environment) needs to be revised and rebuilt to work with the new interface.


I assume that just installing and using the old version of the eSDK is enough (unless something has also changed in the hardware so that only the new eSDK works?).

It looks like I still have some problems running OpenCL applications. I'm trying to run the "hello_opencl" example and I get a crash (it is inside the eSDK, so it may be related to the version problems we are talking about):

Code: Select all
linaro@linaro-ubuntu-desktop:~/projects/ocl-examples/examples/hello_opencl$ gdb ./hello_opencl
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/linaro/projects/ocl-examples/examples/hello_opencl/hello_opencl...done.
(gdb) r
Starting program: /home/linaro/projects/ocl-examples/examples/hello_opencl/hello_opencl
[Thread debugging using libthread_db enabled]
[4189] clmesg CRITICAL: libocl.c(849): Linux mmap does not support MAP_NOSYNC, demand a work-around
_libocl_clproc_state 0xb6ffa000
[4189] clmesg WARNING: libocl.c(149): cannot read ocl.conf, using ICD fallback (/etc/OpenCL/vendors
coprthr-1.5.0-RC1 (Marathon)
hello_opencl: e_open(): mmap failure.
hello_opencl: e_alloc(): mmap failure.
[4189] clmesg ERROR: device.c(257): e_alloc returned 1
[New Thread 0xb6d99470 (LWP 4192)]
[4189] clmesg info: cmdsched.c(88): cmdqx0: run
[4189] clmesg WARNING: memobj.c(117): workaround: CL_MEM_USE_HOST_PTR => CL_MEM_COPY_HOST_PTR, fix this

Program received signal SIGSEGV, Segmentation fault.
0xb6f2159c in memcpy () from /lib/arm-linux-gnueabi/libc.so.6
(gdb) bt
#0  0xb6f2159c in memcpy () from /lib/arm-linux-gnueabi/libc.so.6
#1  0xb6da59f8 in e_mwrite_buf () from /opt/adapteva/esdk/tools/host/armv7l/lib/libe-host.so.1
#2  0xb6dd9764 in __do_create_buffer () from /usr/local/browndeer/lib/libcoprthr-e.so
#3  0xb6dd3f64 in clCreateBuffer () from /usr/local/browndeer/lib/libcoprthr-e.so
#4  0xb6fd4288 in clCreateBuffer () from /usr/local/browndeer/lib/libocl.so
#5  0x00008f80 in main () at hello_opencl.c:63
(gdb)

Re: OpenCL SDK and ESDK misalignment

Postby ysapir » Sun Apr 07, 2013 8:25 am

Should be enough, but make sure you update the /opt/adapteva/esdk symlink, and the EPIPHANY_HOME variable accordingly.
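
A small consistency check along these lines can confirm that the symlink and the variable resolve to the same directory. This is a sketch only: esdk_paths_agree is a hypothetical helper, and the demonstration builds a scratch directory as a stand-in for /opt/adapteva so it can run anywhere.

```python
import os
import tempfile

def esdk_paths_agree(symlink, environ):
    """True if EPIPHANY_HOME is set and resolves to the same directory as the symlink."""
    home = environ.get("EPIPHANY_HOME")
    if not home:
        return False
    return os.path.realpath(home) == os.path.realpath(symlink)

# Scratch layout mimicking two installed eSDK versions plus the "esdk" symlink.
root = tempfile.mkdtemp()
old = os.path.join(root, "esdk.4.13.01.04")
new = os.path.join(root, "esdk.4.13.03.30")
os.mkdir(old)
os.mkdir(new)
link = os.path.join(root, "esdk")
os.symlink(old, link)

demo_match = esdk_paths_agree(link, {"EPIPHANY_HOME": old})
demo_mismatch = esdk_paths_agree(link, {"EPIPHANY_HOME": new})
print("symlink and EPIPHANY_HOME agree:", demo_match)
print("after pointing EPIPHANY_HOME elsewhere:", demo_mismatch)
```

On a real installation the call would be esdk_paths_agree("/opt/adapteva/esdk", os.environ). Note that this does not check LD_LIBRARY_PATH, which also had to be corrected earlier in this thread.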

Re: OpenCL SDK and ESDK misalignment

Postby dar » Mon Apr 08, 2013 4:48 am

I noticed this line in the output,

Code: Select all
hello_opencl: e_open(): mmap failure.


This would suggest the code was not run as root. Presently you need to run as root. This is not related to OpenCL, but has to do with permissions on the mechanism for accessing shared DRAM in the eSDK.

Try to re-run as root. This is described in the parallella quick start guide linked from the same page where the pre-built package was found.
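
For scripts that launch OpenCL programs on the board, a preflight check of the effective user ID can flag this before the mmap failure appears. The sketch below is illustrative only; has_root_privileges and preflight are hypothetical names, and the warning text is not produced by the eSDK or COPRTHR themselves.

```python
import os

def has_root_privileges(euid):
    """Effective UID 0 means root on Linux."""
    return euid == 0

def preflight(euid, program="hello_opencl"):
    """Return a short status line to print before launching the program."""
    if has_root_privileges(euid):
        return "ok: running as root"
    return (f"warning: {program} is not running as root; "
            "e_open()/e_alloc() may report 'mmap failure'")

print(preflight(os.geteuid()))
```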

Also, note that the OpenCL implementation is "noisy" in that it generates some scary debug messages that no longer are considered a problem. There are various ways to silence the pre-built package, but easiest is to wait for rc3 when some of that stuff will be cleaned up.

Re: OpenCL SDK and ESDK misalignment

Postby Dade » Mon Apr 08, 2013 8:52 am

dar wrote: I noticed this line in the output,

Code: Select all
hello_opencl: e_open(): mmap failure.


This would suggest the code was not run as root. Presently you need to run as root. This is not related to OpenCL, but has to do with permissions on the mechanism for accessing shared DRAM in the eSDK.

Try to re-run as root. This is described in the parallella quick start guide linked from the same page where the pre-built package was found.


Thanks, haha, it works!

[Attachment: mandel-small.jpg (234.24 KiB)]

Going to try to run something more complex than a Mandelbrot renderer.

P.S. I opened a couple of issues on GitHub related to conformance with the OpenCL specs.

Re: OpenCL SDK and ESDK misalignment

Postby Dade » Mon Apr 08, 2013 8:57 am

dar wrote: Finally, be aware there is a kernel launch lock-up issue we also need to resolve.


Is the "XXX check corenum" message a symptom of this problem? Looking at the OpenCL SDK code, it seems to be a warning (printed while waiting for the Epiphany cores to be ready), but sometimes the application hangs, writing an endless number of "XXX check corenum" messages.

Re: OpenCL SDK and ESDK misalignment

Postby dar » Mon Apr 08, 2013 11:17 am

Dade wrote: sometimes the application hangs, writing an endless number of "XXX check corenum" messages


The short answer is yes. The fact that you get at least one message is the result of trying to address the issue at the time of rc2. This is the intermittent kernel launch lock-up issue mentioned above. As I mentioned, we believe it's understood, and a fix is being completed for rc3 to make this go away. The debug messages will be de-escalated once that is done. It proved to be a tough problem due to its very infrequent occurrence on the prototype we used for development.
