Genome sequence alignment-design space exploration for optimal performance and energy architectures




Qureshi, Yasir Mahmood and Herruzo, José M. and Zapater, Marina and Olcoz Herrero, Katzalin and González Navarro, Sonia and Plata, Óscar and Atienza, David (2021) Genome sequence alignment-design space exploration for optimal performance and energy architectures. IEEE transactions on computers, 70 (12). pp. 2218-2233. ISSN 0018-9340




Next-generation workloads, such as genome sequencing, have an astounding impact on the healthcare sector. Sequence alignment, the first step in genome sequencing, has experienced recent breakthroughs, which resulted in next-generation sequencing (NGS). As NGS applications are memory-bound with random memory access patterns, we propose the use of high-bandwidth memories such as 3D-stacked HBM2, instead of traditional DRAMs such as DDR4, along with energy-efficient compute cores to improve both performance and energy efficiency. Three state-of-the-art NGS applications, Bowtie2, BWA-MEM, and HISAT2, are used as case studies to explore and optimize NGS computing architectures. Using the gem5-X architectural simulator, we obtain an overall 68 percent performance improvement and 71 percent energy savings by replacing DDR4 with HBM2. Furthermore, we propose an architecture based on ARMv8 cores and demonstrate that 16 ARMv8 64-bit out-of-order (OoO) cores with HBM2 outperform 32 cores of the Intel Xeon Phi Knights Landing (KNL) processor with 3D-stacked memory. Moreover, we show that by using frequency scaling we can achieve up to 59 percent and 61 percent energy savings for ARM in-order and OoO cores, respectively. Lastly, we show that many ARMv8 in-order cores at 1.5 GHz match the performance of fewer OoO cores at 2 GHz, while attaining 4.5x energy savings.

Item Type: Article
Additional Information:

©2021 IEEE
This work was supported in part by the ERC Consolidator Grant COMPUSAPIEN (GA No. 725657), the EC H2020 WiPLASH (GA No. 863337), the EC H2020 RECIPE (GA No. 801137), the EU FEDER and the Spanish MINECO (GA No. RTI2018-093684-B-I00), the Spanish CM (S2018/TCS-4423), Spanish MINECO TIN2016-80920-R, JA2012 P12-TIC-1470 and UMA18-FEDERJA-197 projects.

Uncontrolled Keywords: Read alignment; Sequential analysis; Bioinformatics; Genomics; Bandwidth; Three-dimensional displays; Data centers; Field programmable gate arrays; Genome sequencing; sequence alignment; NGS; HPC; HBM2; KNL; architecture exploration; many-core
Subjects: Sciences > Computer science > Artificial intelligence
ID Code: 69204
Deposited On: 20 Dec 2021 12:35
Last Modified: 20 Dec 2021 12:46

