Prime Factorization

Generic algorithms that could be implemented in almost any language, e.g. matrix operations and FFT.

Prime Factorization

Postby doddy1985 » Wed Jun 11, 2014 8:49 pm

I don't have any src code yet, but just thought I'd put the idea out there..

I think parallell processing could handle this pretty well (and it would be interesting to see how one could implement scaling when additional resources are added):

Given some Composite number C

HOST
do
set x = sqrt(C); floor x (or ceiling maybe??) //all numbers below repeat above sqrt(C)
send x to all e-cores.
wait for prime factors to come back in memory

ECORES
get x
(this part i'm not sure how would happen yet)
Split up the number line from 2->x (integers only) for n number of cores
each core check divisibility of all 6K+/-1 in the selected range for C //all primes of the form 6K+/-1
if a hit found; write value to memory for HOST to pick up.


Example would be like C = 101*23 = 2323 (small composite number)
sqrt(C) = 48.19....
x = 48 (floor result)
wait for ECORE results (interrupt/loop to look at memory)..

ECORES
core0 get x, C
core0 tell core1-coreM value of C
core0 tell core1-coreM here is your range to look at: //(assume M=15 for 16 core board)
core0 range 3->5
core1 range 6->9
core2 range 10->12
...
core15 range 45->48

Each ECORE (1->M):
accept range from ecore0
create array in range for the form 6*k+/-1
C mod (i) // pointer i iterates through array
if value at array(i) ==0, you have a hit.. pass value to HOST.

Could do some other stuff like.. if you know it is a semi-prime, once you get a hit just kill all other processes (you've found the ONLY lower prime factor value)


I dunno, poke fun if you wish :oops:


edit1 6/11/2014: it's def floor not ceiling
doddy1985
 
Posts: 15
Joined: Tue Oct 08, 2013 4:38 pm

Re: Prime Factorization

Postby over9000 » Wed Jun 11, 2014 10:08 pm

doddy1985 wrote:Split up the number line from 2->x (integers only) for n number of cores
each core check divisibility of all 6K+/-1 in the selected range for C //all primes of the form 6K+/-1
if a hit found; write value to memory for HOST to pick up.


Are you talking about using a wheel? I think you are, in which case instead of a wheel of size 6 (or is it 7? sorry it's been a while), you could go up to 16, one for each core.

Also, you'll probably use different algorithms depending on whether you're sieving (finding all primes up to some limit) and factorisation. I'm not sure which one you're doing?

Actually, I'm already implementing Shor's algorithm on my Parallella cluster. Right now, they're in a superposition of being shipped and not being shipped. Some parts of the quantum waveform have actually collapsed, in that I know I'm not getting the cluster connection cables, but there's still enough uncertainty about the rest of the hardware that I'm already on target to take the prize for discovering the largest Mersenne prime, provided I can actually read the value off if/when I get my boards (decoherence---and anticipation---is a bitch).
over9000
 
Posts: 98
Joined: Tue Aug 06, 2013 1:49 am

Re: Prime Factorization

Postby doddy1985 » Wed Jun 11, 2014 10:36 pm

Oh wow (about the mersenne prime). You aren't from Missouri are you?

Yes this is for factorization (not sieve), so sieve indeed would be a different algorithm. Although.. it would be interesting just running a sieve up to sqrt(C) and save to memory (or file on hard disk) and then just check all those primes for factorizing. I guess that'd be a different approach.

I do believe this is a wheel. Although I figured there can be a bit of optimization (stuff like, check to see if the last integer in a number is 5/0) I figured i'd just start the wheel at 7 also (never look at an even composite) and divisible by 3 is an easy check (sum of integers is divisible by 3). So some of those easy cases just bump them off as one comes across them.

The wheel size also I was thinking for scalability in the future (but 16 to start):
Check to see how many resources are available, ie: if it's a cluster then programmatically determine how many cores available to work with and make the wheel size = num_cores

something like that I guess. Just figured I'd start spitting out some garbled mess on the forum, see what comes of it. It's def something I will program myself to get familiar with the SDK and epiphany chips, but if there is interest with anyone else, I'd love to see what people come up with.
doddy1985
 
Posts: 15
Joined: Tue Oct 08, 2013 4:38 pm

Re: Prime Factorization

Postby over9000 » Thu Jun 12, 2014 12:21 am

doddy1985 wrote:Oh wow (about the mersenne prime). You aren't from Missouri are you?

Eh, nope. I guess they get some twisters there? Otherwise I don't really see the connection :) (Oh, and I can't show you yet, or the quantum waveform would collapse, ruining the experiment)

I do believe this is a wheel.

Cool. I wasn't sure I remembered/recognised right.

Although I figured there can be a bit of optimization (stuff like, check to see if the last integer in a number is 5/0) I figured i'd just start the wheel at 7 also (never look at an even composite) and divisible by 3 is an easy check (sum of integers is divisible by 3).


Yeah, but some of them are only optimisations in base 10 (the divisibility by 3, 5 shortcuts, anyway).

So some of those easy cases just bump them off as one comes across them.


Unfortunately, only /2 is really easy, while there are specific (base-2) hacks for /3 at least, and maybe more if you want to work at it. If you want factors and not just primality testing you'll also have to be able to deal with shifting off known factors, which requires methods of telling the coprocessors to start looking at the reduced (after all known factors are eliminated) values.

The wheel size also I was thinking for scalability in the future (but 16 to start):
Check to see how many resources are available, ie: if it's a cluster then programmatically determine how many cores available to work with and make the wheel size = num_cores


Sounds like it could work, but:

* dividing by smaller numbers should be quicker (so cores doing that will be left idle for longer)
* bignums (assuming you want to scale in that way) are big (a significant portion of the program will involve shifting stuff around in memory, so you may only be able to store some numbers across a number of cores or stream them between them)
* you may be better off using cores to work on the modulus operator and let ARM side handle the high-level algorithm (more benefit for less effort)
* e-cores don't actually have divide operators (so long division is needed, and that's linear with the size of operands)
* you might want a kind of interrupt to stop unnecessary work, and the easiest way to do this is to set a flag in interrupt handler and do work (like block-by-block division) in smallish packets and test flag after every block
* at that stage, you might as well look for better algorithms (optimising an inefficient algorithm is a bad idea; better to replace it)

something like that I guess. Just figured I'd start spitting out some garbled mess on the forum, see what comes of it. It's def something I will program myself to get familiar with the SDK and epiphany chips, but if there is interest with anyone else, I'd love to see what people come up with.


You and me both (for some substring of the above paragraph).
over9000
 
Posts: 98
Joined: Tue Aug 06, 2013 1:49 am

Re: Prime Factorization

Postby xilman » Thu Jun 12, 2014 9:10 am

doddy1985 wrote:I don't have any src code yet, but just thought I'd put the idea out there..

I think parallell processing could handle this pretty well (and it would be interesting to see how one could implement scaling when additional resources are added):

[Deletia]
I dunno, poke fun if you wish :oops:
I'm interested in factoring integers and have been for a very long time. Hate to disillusion you but your algorithm will only work for small integers as it scales as sqrt(N). Fine if you want to factor, say, 40-bit integers but useless for even 100-bitters let alone kilobit integers.

When my 4-node cluster arrives I intend to try porting some real factorization code to it, probably starting with the P-1 algorithm for simplicity (and as a learning exercise)then ECM for the heavy lifting.

Anyone else who is interested in integer factorization: please get in contact and/or contribute to this thread.
xilman
 
Posts: 80
Joined: Sat May 10, 2014 8:10 pm
Location: UK

Re: Prime Factorization

Postby doddy1985 » Thu Jun 12, 2014 5:36 pm

Maybe the sieve would make more sense in the long run.. I was really only thinking for smaller numbers (even less than 40 bit ints). I guess I just wanted to learn how to use the ephiphany chip first before really concentrating efforts on a more advanced algorithm. This algorithm seemed simple enough at least in the beginning just to start thinking parallel. Very intersting comments though from both of you. I'll certainly keep your feedback in mind when the time comes to come up with a more practical solution for doing some heavy lifting.

best,
~doddy
doddy1985
 
Posts: 15
Joined: Tue Oct 08, 2013 4:38 pm

Re: Prime Factorization

Postby xilman » Thu Jun 12, 2014 7:27 pm

over9000 wrote:[
* you may be better off using cores to work on the modulus operator and let ARM side handle the high-level algorithm (more benefit for less effort)
* e-cores don't actually have divide operators (so long division is needed, and that's linear with the size of operands)
Just spotted this bit. I strongly recommend that you research "Montgomery multiplication" and "Barrett reduction". You may then find why your second statement quoted above is rather misleading.

The first statement is right on!
xilman
 
Posts: 80
Joined: Sat May 10, 2014 8:10 pm
Location: UK

Re: Prime Factorization

Postby over9000 » Thu Jun 12, 2014 8:00 pm

xilman wrote:
over9000 wrote:[
* you may be better off using cores to work on the modulus operator and let ARM side handle the high-level algorithm (more benefit for less effort)
* e-cores don't actually have divide operators (so long division is needed, and that's linear with the size of operands)
Just spotted this bit. I strongly recommend that you research "Montgomery multiplication" and "Barrett reduction". You may then find why your second statement quoted above is rather misleading.

The first statement is right on!


Are you objecting that it's not linear time, or that division isn't necessary? Do either of those algorithms do better?

I looked them up and looks like they're both effectively replacing costly division with a similarly-costly inversion step (doesn't Newton's method, for example, have the same big-O complexity as regular division?). Or am I reading that wrong? I know that for some applications, doing an inverse is fine because you can amortise the cost of inversion by reusing the same divisor, but surely if you're doing a naive primality test (or gcd, even), all your divisors will be different?
over9000
 
Posts: 98
Joined: Tue Aug 06, 2013 1:49 am

Re: Prime Factorization

Postby xilman » Sun Jun 15, 2014 9:14 am

over9000 wrote:Are you objecting that it's not linear time, or that division isn't necessary? Do either of those algorithms do better?

I looked them up and looks like they're both effectively replacing costly division with a similarly-costly inversion step (doesn't Newton's method, for example, have the same big-O complexity as regular division?). Or am I reading that wrong? I know that for some applications, doing an inverse is fine because you can amortise the cost of inversion by reusing the same divisor, but surely if you're doing a naive primality test (or gcd, even), all your divisors will be different?
Sorry about this. I have big difficulties getting my head around that people still want to factor single precision integers by trial division. Yes, I know that you proposed project is for educational purposes but I've just been in this game for too long and my thought processes are ossified.

Once you've got the trial division code working you may wish to investigate multiple precision arithmetic and somewhat more efficient factoring algorithms. The heavy lifting, such as converting into and out of Montgomery format, should be run on the Linux side (IMAO) and the relatively simple parallelism on the Epiphany. There are two ways to parallelise --- the arithmetic itself and the factoring algorithms. The former generally gives lower latency at the cost of lower throughput. The latter is applicable in at least two ways. Either factor multiple integers at one or run multiple attempts on the same integer. ECM is ideally suited to the latter case though Pollard's rho and, to a small extent, the P+1 algorithm can parallelised.

One of my planned projects is definitely to get modern factorization algorithms working on the Parallellas when they arrive.
xilman
 
Posts: 80
Joined: Sat May 10, 2014 8:10 pm
Location: UK

Re: Prime Factorization

Postby webmasterpdx » Tue Jun 24, 2014 9:54 am

I've done this before on a cray machine with 64 processors. I was interested at the time in the number of primes between powers of 2.
The determination of whether a number is prime is pretty quick. I found it best to assigm that to a simple C thread and get it assigned to different processors for different integers.
The problem I ran into is where you run out of bits for the integers.

Soooo.....what would be very useful for this kind of stuff is providing libraries for expanding bitwidth integers for when you get greater than 2 to the power of 32.... i.e. Then you'll need to do 64-bit arithmetic using 2 cores at a time, and then eventually using 3 cores at a time as the integers get bigger and bigger. Alternatively could have each core handling the higher resoluion so you have 64 cores doing 64-bit and then 96-bit and eventually 128-bit as the integers get bigger.

I think having the same thing for floating point could be very useful too, so that we could calculate pi to a large number of places......beat the record.... :lol:
webmasterpdx
 
Posts: 5
Joined: Tue Jun 24, 2014 8:59 am

Next

Return to Algorithms

Who is online

Users browsing this forum: No registered users and 2 guests