FPGA-Based Overlay Accelerators with Massive Parallel Processing Units to Accelerate Deep Neural Networks