Page 1 of 1

Sorting 2D data

PostPosted: Fri Dec 20, 2013 11:59 am
by nickoppen
My 5-year-old woke me up the other night and it took a while to get back to sleep so I used the time to think about sorting 2D data point on the Parallella (it worked! I got back to sleep at least before the sun came up).

Many algorithms that manipulate two dimensional data need to first sort the data from an un-ordered point cloud into something that is more manageable. The application that I have in mind is DeLaunay Triangulation where each point is connected to it's nearest neighbours http://en.wikipedia.org/wiki/Delaunay_triangulation.

The parallella seems to be well suited to this type of processing.

I've written up my initial thoughts on a Parallella-friendly algorithm here: http://nicksparallellaideas.blogspot.com/2013/12/sorting-of-spatial-data-for-meshed.html. Please comment if you have any ideas, especially if I've missed something.

Thanks,

nick

Re: Sorting 2D data

PostPosted: Fri Dec 20, 2013 4:49 pm
by fdeutschmann
Hilbert curves (or other space filling curves, though Hilberts are particularly well-suited) can be very useful for this sort of thing....

-frank

Re: Sorting 2D data

PostPosted: Fri Dec 20, 2013 5:19 pm
by shodruk
nice! :D

Re: Sorting 2D data

PostPosted: Mon Apr 28, 2014 6:07 am
by nickoppen
I had another look at my blog entry for the sorting algorithm and I realised that I was still thinking sequentially when I wrote it.

Distribution in a round-robin fashion allocates one point at a time per core. There is no reason why that core needs to wait until it gets its full number of points before it sorts, especially if quick sort then needs to run through them sequentially again. The quick sort step should start as soon as the first point arrives.

Similarly, a core that passes a point that is bigger than its' neighbour's lowest value is also just adding another point. Thus the distribution by the main core and the shuffling between cores both need to call the same function (e.g. add_point(...)) and the receiving core needs to sort the incoming point into its' list.

Since quick-sort itself is a linear process, using this parallelised method, there is a chance that a core is not finished with the sort before the next point arrives, especially when the number of points gets large. Therefore, each core should maintain an incoming point list. Perhaps the call would be better named queue_point_for_sort(...).

I also believe that the efficient DeLaunay Triangulation algorithm. Lee and Schachter, (http://www.personal.psu.edu/cxc11/AERSP560/DELAUNEY/13_Two_algorithms_Delauney.pdf) is also a divide and conquer type algorithm. Maybe that can start straight away as well???

Re: Sorting 2D data

PostPosted: Mon Apr 28, 2014 8:03 pm
by NealKelly
Hi Nick,

I'm curious, since you're already using quicksort once on the x values, and then again on the y values, why not complete the 2d sort as follows:

1) quicksort x values.
2) split x values into 4 arrays of length ~FullLength/4
3) quicksort those 4 arrays by y value
4) split the 4 arrays into 4 more arrays of length ~ArrayLength/4

I'm also curious how you created the graphs that contrast the orginal pattern to the points after distribution. Did you randomly generate the points on the right as (rand(), rand()), and then apply some f(x,y) to get the graph on the left? I also wondered if the cells on the right were created by 2-d sorting the points on the left, and then plotted as (indexInXSort,indexInYSort), but they don't appear to be integer values.

Re: Sorting 2D data

PostPosted: Mon Apr 28, 2014 11:12 pm
by nickoppen
Hi Neal,

Thanks for your sorting suggestion. I did gloss over this in my blog entry and it is not as easy as "apply a quick sort... problem solved".

Now that I come to think about it in more detail, I'm thinking that it might be the subject of another blog entry (which is short hand for "I've only got sketchy ideas at the moment"). I should have a look at some literature regarding 2D sorts and see if there are any ideas out there that can be well parallelised.

Regarding the diagrams, you give me way too much credit. I used inkscape and a mouse. I kept on clicking random points until the diagrams looked like what I was trying to demonstrate. Then I coloured some of the points red and saved the drawing as a png file. I have not received my parallella yet so all of my blog entries are purely "thought experiments" at the moment.

nick