Publication:
A power-efficient and scalable load-store queue design

Publication Date
2005
Publisher
Springer-Verlag Berlin
Abstract
The load-store queue (LSQ) of modern superscalar processors is responsible for maintaining the order of memory operations. As the performance gap between processor speed and memory access widens, the capacity requirements for the LSQ increase, and its design becomes challenging because of its CAM (content-addressable memory) structure. In this paper we propose an efficient load-store queue state filtering mechanism that provides a significant energy reduction (on average 35% in the LSQ and 3.5% in the whole processor) and incurs only a negligible performance loss of less than 0.6%.
Description
© Springer-Verlag Berlin Heidelberg 2005. We want to thank Simha Sethumadhavan for his helpful and thorough comments. 15th International Workshop on Power and Timing Modeling, Optimization and Simulation, September 21-23, 2005, Leuven, Belgium.