by piotr5 » Fri Apr 26, 2013 8:52 am
I have already talked about this topic. Basically, for division of integers, take a look at gcc/libgcc/config/epiphany/divsi3.c in the eSDK from the public repository. If you find an algorithm that is faster than the assembler version of this bit-by-bit division (after the overhead, it takes 4 cycles for each bit of the result), or one that uses less memory, you are welcome to post it. You know, memory is limited to 32k, and at least 8k of that is reserved for stack, heap and whatever (because the rest can be protected against writing), so every byte spent on a division algorithm is precious...
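To make "bit-by-bit division" concrete, here is a minimal sketch of plain shift-and-subtract division in C. This is not the code from divsi3.c: the function name udiv_bitwise and its signature are made up for illustration, it handles unsigned values only, and it always runs 32 iterations instead of skipping leading zero bits.

```c
#include <stdint.h>

/* Hypothetical helper, NOT the routine from divsi3.c.
 * Restoring shift-and-subtract division: one quotient bit per iteration.
 * The divisor d must be non-zero; rem may be NULL if the remainder
 * is not needed. */
uint32_t udiv_bitwise(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0; /* quotient, built from the top bit down */
    uint32_t r = 0; /* running remainder, always < d after each step */

    for (int i = 31; i >= 0; i--) {
        /* Bring down the next bit of the dividend. */
        r = (r << 1) | ((n >> i) & 1u);

        /* If the divisor fits, subtract it and set this quotient bit. */
        if (r >= d) {
            r -= d;
            q |= (uint32_t)1 << i;
        }
    }

    if (rem)
        *rem = r;
    return q;
}
```

For example, udiv_bitwise(100, 7, &r) returns 14 and leaves 2 in r. The loop body is just a shift, a compare, a subtract and a bit set, which is why this kind of division costs a small fixed number of cycles per bit and needs no lookup tables.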
As for counting cycles, as a rule of thumb every assembler instruction takes 1 cycle. Multiplication and float instructions take 2 more cycles, but they don't block the program flow (i.e. they also cost just 1 cycle if the result isn't needed immediately). Branching also takes 3 cycles, and it always blocks if the branch is taken. There is no cache and no prediction; all of that you must implement yourself. Access beyond those 32k is easily possible, but it is like accessing stuff outside of a cache: either like a level-2 cache (whenever it's the memory of another core) or like raw memory access (much, much slower)...
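The rule of thumb above can be written down as a tiny cost model. This is only a sketch for back-of-the-envelope estimates, not a simulator: the type and function names are invented here, and it assumes that a branch which is not taken costs 1 cycle like any other instruction. Accesses outside the local 32k are not modelled at all.

```c
#include <stdbool.h>

/* Instruction classes from the rule of thumb; names are illustrative only. */
enum insn_kind { INSN_PLAIN, INSN_MUL_OR_FLOAT, INSN_BRANCH };

struct insn {
    enum insn_kind kind;
    /* MUL/FLOAT: the result is needed by the very next instruction.
     * BRANCH:    the branch is taken.
     * Ignored for INSN_PLAIN. */
    bool stalls;
};

/* Rough cycle estimate for a straight list of instructions.
 * Assumption: a branch that is not taken costs 1 cycle. */
unsigned estimate_cycles(const struct insn *code, unsigned count)
{
    unsigned cycles = 0;

    for (unsigned i = 0; i < count; i++) {
        cycles += 1; /* every instruction issues in 1 cycle */

        /* 2 extra cycles when a mul/float result is needed right away
         * or when a branch is taken. */
        if (code[i].kind != INSN_PLAIN && code[i].stalls)
            cycles += 2;
    }
    return cycles;
}
```

With this model, a multiply whose result is used immediately counts as 3 cycles, while the same multiply followed by two unrelated instructions counts as 1, so reordering instructions to fill the gap is where the savings come from.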