Enable double-linefill and increase prefetch offset, which gives
considerable read performance boost. The following numbers were
obtained using lmbench 3.0 bw_mem tool, for easier comparison, the
numbers are pasted in two columns. The test machine has Cyclone V
SoC running at 800MHz MPU clock and 512MiB 333MHz 16bit DDR3 DRAM.