PROGRAM PIPELINING OF GPU USING C-SLOW RETIMING
Keywords:
synchronous data flow, software pipelining, C-slow retiming, graphics processing unit
Abstract
Software pipelining methods used in various computer architectures are discussed. In particular, software pipelining can significantly increase the performance of graphics processing units (GPUs). The proposed method builds a synchronous data flow graph of the program, optimises it by folding with C-slow retiming, and then describes the resulting graph in a programming language.
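As an illustration only, the C-slow step can be sketched on a toy feedback loop. The recurrence y[n] = a*y[n-1] + x[n] and the names `run_original` and `run_c_slow` are assumptions made for this sketch, not the graph or code used in the paper. The single delay in the loop is replaced by C delays, and C independent input streams are interleaved so that each stream sees exactly the original recurrence. The retiming step, which would then move the extra delays into the loop body to form pipeline stages, is not modelled here.

```python
from collections import deque


def run_original(x, a=0.5):
    """Reference loop with one delay: y[n] = a * y[n-1] + x[n]."""
    y_prev, out = 0.0, []
    for sample in x:
        y_prev = a * y_prev + sample
        out.append(y_prev)
    return out


def run_c_slow(streams, a=0.5):
    """Same loop after C-slowing (hypothetical sketch).

    The single delay becomes C delays, modelled as a FIFO of length C,
    and C equal-length independent streams are interleaved sample by sample.
    """
    c = len(streams)
    delays = deque([0.0] * c)      # C registers in the feedback path
    outs = [[] for _ in range(c)]
    for n in range(len(streams[0])):
        for k in range(c):         # one interleaved time slot per stream
            # The value leaving the FIFO was written C slots ago,
            # i.e. it is the previous output of stream k.
            y = a * delays.popleft() + streams[k][n]
            delays.append(y)
            outs[k].append(y)
    return outs


if __name__ == "__main__":
    streams = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [0.5, 0.25, 0.125]]
    print(run_c_slow(streams))
    print([run_original(s) for s in streams])
```

Both printed lists should match, since each interleaved stream only ever reads its own previous output from the delay line.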