Parallella for bio-inspired neural networks

Post by illioren » Wed Apr 01, 2015 7:27 pm


I am currently a PhD student in neuroscience. I received a Parallella from one of my co-workers (woohoo, it's Christmas) and would like to use it to accelerate one of my "large scale" neural simulations. Before allocating any time to the task, I would like to know a few things about the feasibility of the project.
I have read the board specification (architecture, core frequency and memory) but still cannot get the hang of what the main restrictions will be.

The current simulation is built in C++ using Boost (just for its queue implementation, which can be rewritten if needed) and, soon enough, OpenMP for the multi-threading (and quite possibly MPI to connect many boards/computers together).
The entire simulation currently contains about a million neurons. A typical neuron has about 15 float variables and a bunch of ODEs. Each neuron inherits from an abstract class and contains a C array of pointers to Synapse objects (up to a thousand). Here are my questions:

- Would using C++ classes instead of C functions and struct pointers generate a memory overhead, given the 32 KB of memory each Epiphany core possesses? I actually have no clear idea what using C++ rather than C implies when it comes to hardware memory management and overhead. As far as I understand, C++ is more of a syntactic abstraction for some pointer acrobatics that are still possible, but much more painful, in pure C.
- Is using that small part of the Boost library a major hindrance with regard to the cores' memory (yes, the memory once again)?
- When using OpenMP to dispatch the thread queue to the processors, does the core's memory contain only the function/variables currently being executed (analogous to the low-level cache of Intel processors, for instance), or is all the memory necessary for the complete thread execution imported at once?
- How expensive is it, time-wise, to copy from the global memory to the core memory?
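To put rough numbers on the memory questions above, here is a back-of-the-envelope sketch in C++. The neuron count, the 15 floats, the thousand synapse pointers and the 32 KB per core come from this post; the 4-byte `float`, the 4-byte pointer (assuming a 32-bit target) and all constant and function names are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Figures quoted in the post.
constexpr std::size_t kNeurons           = 1000000;   // "about a million"
constexpr std::size_t kFloatsPerNeuron   = 15;
constexpr std::size_t kSynapsesPerNeuron = 1000;      // upper bound
constexpr std::size_t kLocalMemBytes     = 32 * 1024; // per-core memory

// Assumptions, not taken from the post.
constexpr std::size_t kFloatBytes = 4; // single-precision float
constexpr std::size_t kPtrBytes   = 4; // pointer size on a 32-bit target

// Bytes of state variables per neuron.
constexpr std::size_t kStateBytes = kFloatsPerNeuron * kFloatBytes;
// Bytes of the synapse pointer array per neuron (worst case).
constexpr std::size_t kSynapseTableBytes = kSynapsesPerNeuron * kPtrBytes;

// How many neurons of a given size fit in one core's memory,
// ignoring the space needed for code and stack.
inline std::size_t neurons_per_core(std::size_t bytes_per_neuron) {
    assert(bytes_per_neuron > 0);
    return kLocalMemBytes / bytes_per_neuron;
}

// Total state for the whole network, in bytes.
inline std::size_t total_state_bytes() {
    return kNeurons * kStateBytes;
}
```

Under these assumptions the state alone is 60 bytes per neuron, so at most 546 neurons fit in one core's 32 KB, and only 8 if each neuron also carries a full thousand-entry pointer array. The state of the whole network is about 60 MB, so it would have to live in shared memory and be streamed through the cores in small batches. Real capacity will be lower, since the same local memory also has to hold code and stack.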

The code involves many other parts that manage space, create the neurons and wire them together (about 300 MB of RAM on my computer), but this only takes a few seconds on my desktop computer and is executed once before the actual neural simulation begins, which is where the acceleration is required.

Ultimately, I would like to use the board to let my simulation run in a robotic environment as a substitute for traditional vision systems (I am working on a Retina model).

Thank you in advance for your time and advice :)
Posts: 3
Joined: Wed Apr 01, 2015 6:54 pm

Re: Parallella for bio-inspired neural networks

Post by sebraa » Thu Apr 02, 2015 12:38 pm

If you stick to classes and inheritance, your code should not inflate by much. Same goes for templates, if the resulting code does not require dozens of almost-identical functions. You are right that this part of C++ is just a nicer syntax for things you can do similarly in C.
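As a small illustration of that point, the sketch below compares a C-style struct with a class that inherits from an abstract base. The type names, the toy leak equation and the helper `per_object_overhead` are hypothetical and only stand in for the real neuron model. With common compilers, a class with virtual functions typically carries one hidden pointer per object plus one shared table per class, but the exact layout is implementation-defined, so treat the result as something to measure on the actual toolchain.

```cpp
#include <cassert>
#include <cstddef>

// C-style layout: plain data plus a free function.
struct PlainNeuron {
    float state[15];
};

inline void step_plain(PlainNeuron& n, float dt) {
    assert(dt >= 0.0f);
    n.state[0] += dt * (-n.state[0]); // toy leak term standing in for the real ODEs
}

// C++-style layout: abstract base class with a virtual step().
struct AbstractNeuron {
    virtual ~AbstractNeuron() = default;
    virtual void step(float dt) = 0;
};

struct LeakyNeuron : AbstractNeuron {
    float state[15] = {};
    void step(float dt) override {
        assert(dt >= 0.0f);
        state[0] += dt * (-state[0]); // same toy update as step_plain
    }
};

// Extra bytes each object pays for virtual dispatch (hidden pointer plus padding).
inline std::size_t per_object_overhead() {
    return sizeof(LeakyNeuron) - sizeof(PlainNeuron);
}
```

Printing `sizeof(PlainNeuron)`, `sizeof(LeakyNeuron)` and `per_object_overhead()` with the target compiler gives the per-object cost directly. On a 32-bit target the hidden pointer would usually be 4 bytes per neuron, which is small next to 60 bytes of state but does add up over many objects in a 32 KB memory.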

I don't know anything about Boost or OpenMP, so I can't comment on it. However, using the C++ standard library seems to be a problem with regard to memory consumption already.

Writes from core-local memory to shared memory run at about 150 MB/s, but might be faster with the new images (I haven't tested). Read accesses are generally slower; there are some numbers elsewhere on the forum.
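To get a feel for what such a figure means in practice, here is a minimal estimate in C++. It uses the roughly 150 MB/s write rate quoted above, assumes MB means 10^6 bytes, and ignores latency and contention between cores; the constant and function names are made up for this sketch.

```cpp
#include <cassert>

// Write rate quoted in this post, taken as 150 * 10^6 bytes per second (assumption).
constexpr double kWriteBytesPerSec = 150.0e6;

// Idealised transfer time: payload size divided by sustained bandwidth.
inline double transfer_seconds(double bytes,
                               double bytes_per_sec = kWriteBytesPerSec) {
    assert(bytes >= 0.0 && bytes_per_sec > 0.0);
    return bytes / bytes_per_sec;
}
```

With these assumptions, `transfer_seconds(32.0 * 1024.0)` gives roughly 0.22 ms to move one core's worth of local memory, and `transfer_seconds(60.0e6)` gives 0.4 s to move the state of a million neurons once (15 floats of 4 bytes each). Since reads are reported to be slower than writes, the real cost of streaming neuron state into the cores is likely higher than this estimate.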
Posts: 495
Joined: Mon Jul 21, 2014 7:54 pm

Re: Parallella for bio-inspired neural networks

Post by illioren » Fri Apr 03, 2015 3:28 pm

Thank you for your quick answer. I am kind of relieved that C++ is still fine memory-wise. Converting the whole code would have been such a pain!
I'll try to find a tool to monitor the size of my classes/functions before the implementation and post an update in this thread with some numbers if anyone is interested. That looks like something missing on this forum (or maybe I missed the right post).

I also read that the C++ standard library is heavy, which is not good news on its own given how useful it is. Hopefully, Boost will turn out to be lighter (fingers crossed); otherwise, I'll just attempt to re-code the part needed for my code.

If anyone has experience with OpenMP on the Parallella in particular, could you let me know? I'm particularly interested in knowing exactly "what" is copied to the Epiphany internal memory when a thread is assigned.

Thank you all :)
Posts: 3
Joined: Wed Apr 01, 2015 6:54 pm
