brainstorming, FPGA/cache

Q1: would there be any way the FPGA could assist the Parallella with DDR access, e.g. by implementing a write-back cache between the Epiphany and the 32 MB shared area of the 1 GB DDR?
Or would this be pointless? Is the performance hit mostly latency from going off-chip anyway, so that a cache couldn't do anything the DDR bank/burst mechanism doesn't already do?
- I wondered if having a cache there would allow multiple areas to be accessed without thrashing the DDR's banks as much, buffer more writes into bursts, or reduce contention with the ARM.
- It might also prevent temporaries from having to reach DDR.
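A rough way to sanity-check the "temporaries never reach DDR" idea is a toy model that only counts DDR transactions. The sketch below is not a model of the real FPGA, e-Link or DDR controller: the cache size, line size, buffer size and pass count are all made-up placeholder values, and the class and function names (WriteBackCache, simulate) are hypothetical. It says nothing about latency, which may well be the dominant cost.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy fully associative write-back cache with LRU eviction.

    Counts DDR line transfers only; it does not model timing.
    """

    def __init__(self, num_lines, line_bytes):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.lines = OrderedDict()  # line index -> dirty flag
        self.ddr_fills = 0          # lines read from DDR on a miss
        self.ddr_write_backs = 0    # dirty lines written to DDR

    def _access(self, addr, is_write):
        line = addr // self.line_bytes
        if line in self.lines:
            # Hit: update the dirty flag and mark as most recently used.
            self.lines[line] = self.lines[line] or is_write
            self.lines.move_to_end(line)
            return
        if len(self.lines) >= self.num_lines:
            # Miss with a full cache: evict the least recently used line.
            _, was_dirty = self.lines.popitem(last=False)
            if was_dirty:
                self.ddr_write_backs += 1
        self.ddr_fills += 1  # simple write-allocate: every miss fills a line
        self.lines[line] = is_write

    def read(self, addr):
        self._access(addr, False)

    def write(self, addr):
        self._access(addr, True)

    def flush(self):
        """Write every dirty line back to DDR, one line burst each."""
        for line, dirty in self.lines.items():
            if dirty:
                self.ddr_write_backs += 1
                self.lines[line] = False


def simulate(buffer_bytes=4096, word_bytes=4, passes=10,
             cache_lines=256, line_bytes=32):
    """Write then re-read a temporary buffer several times.

    All default sizes are hypothetical placeholders, not Parallella figures.
    """
    words = buffer_bytes // word_bytes
    # Without a cache, every word access is its own DDR transaction.
    uncached = passes * words * 2

    cache = WriteBackCache(cache_lines, line_bytes)
    for _ in range(passes):
        for i in range(words):
            cache.write(i * word_bytes)
        for i in range(words):
            cache.read(i * word_bytes)
    cache.flush()

    return {
        "uncached_ddr_accesses": uncached,
        "cached_line_fills": cache.ddr_fills,
        "cached_write_backs": cache.ddr_write_backs,
    }


if __name__ == "__main__":
    print(simulate())
```

With these placeholder sizes the buffer fits in the cache, so the only DDR traffic is the initial line fills and the final flush. If the temporary were simply discarded instead of flushed, the write-backs would disappear as well. Whether that translates into real speedup depends on the link latencies and bandwidths, which this sketch does not model.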
What are the bandwidth and latency of each link: DDR <-> FPGA, FPGA <-> Epiphany, Epiphany <-> north/south e-Links?
Maybe there are ARM SoCs where an L2 cache is available to DMA.
(CELL had SPU DMA aware of the L2 cache.)