Managing data movement bottlenecks in accelerator-rich systems