This chapter uses the roofline model to identify the most suitable platform, multicore CPU or general-purpose GPU (GPGPU), for training a neural network that recognizes handwritten digits. The network is trained on the MNIST dataset using the pattern-parallel training approach, and the parallel training is demonstrated on both the multicore CPU and the GPGPU with multiple data layouts. The roofline model is used to explain several bottlenecks, and because the model is simple, it is easy to construct. The data layouts and the limits on memory and computation guide the choice of the optimum platform. In all rooflines, the operational intensity is shifted to the right, which improves performance. The most suitable hardware platform is then selected based on the results of optimization and the range of data sizes, core counts, and operational intensities considered.
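To make the roofline idea concrete, here is a minimal Python sketch of the standard roofline bound: attainable performance is the smaller of the compute roof and the memory roof (bandwidth times operational intensity). The function names and the platform figures are hypothetical and chosen only for illustration. They are not measurements or code from the chapter.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound in GFLOP/s for a kernel with the given
    operational intensity (FLOPs per byte moved to/from memory)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Operational intensity at which the memory roof meets the compute roof."""
    return peak_gflops / bandwidth_gbs


def bound_kind(peak_gflops, bandwidth_gbs, intensity):
    """Classify a kernel as limited by memory bandwidth or by compute."""
    if intensity < ridge_point(peak_gflops, bandwidth_gbs):
        return "memory-bound"
    return "compute-bound"


# Hypothetical platform: 200 GFLOP/s peak compute, 50 GB/s memory bandwidth.
PEAK, BW = 200.0, 50.0

if __name__ == "__main__":
    for oi in (1.0, 2.0, 8.0):
        print(oi, attainable_gflops(PEAK, BW, oi), bound_kind(PEAK, BW, oi))
```

In this sketch, a kernel below the ridge point sits under the sloped memory roof, so raising its operational intensity (moving it to the right on the plot) raises the attainable performance until the flat compute roof is reached.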
Noor Mowafeq Al layla,
Department of Computer Engineering, College of Engineering, University of Mosul, Iraq.
Shefa A. Dawwd,
Department of Computer Engineering, College of Engineering, University of Mosul, Iraq.
Please see the link here: https://stm.bookpi.org/NRAMCS-V4/article/view/7030