Gem5-X: a many-core heterogeneous simulation platform for architectural exploration and optimization

Qureshi, Yasir Mahmood and Simon, William Andrew and Zapater, Marina and Olcoz Herrero, Katzalin and Atienza, David (2021) Gem5-X: a many-core heterogeneous simulation platform for architectural exploration and optimization. ACM Transactions on Architecture and Code Optimization, 18 (4). ISSN 1544-3566

PDF (3MB). License: Creative Commons Attribution.

Official URL: http://dx.doi.org/10.1145/3461662




Abstract

The increasing adoption of smart systems in our daily life has led to the development of new applications with varying performance and energy constraints, which in turn require suitable computing architectures. In this article, we present gem5-X, a system-level simulation framework, based on gem5, for architectural exploration of heterogeneous many-core systems. To demonstrate the capabilities of gem5-X, real-time video analytics is used as a case study. The application is composed of two kernels, namely video encoding and image classification using convolutional neural networks (CNNs). First, we use gem5-X to explore the benefits of the latest 3D high-bandwidth memory (HBM2) in different architectural configurations. Then, using a two-step exploration methodology, we develop a new optimized clustered-heterogeneous architecture with HBM2 in gem5-X for the video analytics application. In this proposed clustered-heterogeneous architecture, an ARMv8 in-order cluster with an in-cache computing engine executes the video encoding kernel, giving 20% performance and 54% energy benefits compared to baseline ARM in-order and out-of-order systems, respectively. Furthermore, thanks to gem5-X, we conclude that ARM out-of-order clusters with HBM2 are the best choice to run visual recognition using CNNs, as they outperform DDR4-based systems by up to 30% in terms of both performance and energy savings.


Item Type: Article
Additional Information:

©2020 Association for Computing Machinery
This work has been partially supported by the ERC Consolidator Grant COMPUSAPIEN (GA No. 725657), the EC H2020 WiPLASH (GA No. 863337), the EC H2020 RECIPE (GA No. 801137), the Spanish CM (S2018/TCS-4423), the EU FEDER, and the Spanish MINECO (GA No. RTI2018-093684-B-I00).

Uncontrolled Keywords: Power; Many-core; Architectural exploration; Gem5; In-cache; HBM; Heterogeneous architectures; Cluster
Subjects: Sciences > Computer science > Artificial intelligence
ID Code: 68402
Deposited On: 04 Nov 2021 19:49
Last Modified: 05 Nov 2021 08:20
