Improvements in CPU performance have slowed since the end of Dennard scaling, while the volume of data to be processed continues to grow exponentially, making it increasingly difficult for CPUs alone to keep pace. As a result, there is a growing trend of offloading frequently used operations to hardware accelerators to improve application performance while reducing CPU cycle consumption. In line with this trend, Intel introduced the In-Memory Analytics Accelerator (IAA) as an on-chip accelerator in its 4th-generation Xeon® Scalable CPUs (Sapphire Rapids). IAA is designed to offload common big data and in-memory analytics operations from the CPU, accelerating query processing primitives such as CRC64, expand, extract, scan, and select, in addition to (de)compression. Because IAA is integrated with the CPU as an on-chip accelerator, it can access the CPU’s caches and memory directly and cache-coherently, offering lower latency, lower power consumption, and reduced programming complexity compared to off-chip accelerators. In this paper, we first introduce the hardware and software architectures of IAA, highlighting its latest features. We then evaluate IAA’s performance using various microbenchmarks and widely used analytics applications, including Pandas, Citus, and ClickHouse. Finally, based on our findings, we provide guidelines for effectively utilizing and optimizing IAA for analytics applications.
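Applications typically reach IAA through Intel’s Query Processing Library (QPL). The sketch below is illustrative only and is not drawn from the paper’s evaluation code: it shows how a single DEFLATE compression job might be described and submitted to the accelerator, assuming QPL is installed and an IAA device is enabled; buffer management and error handling are simplified.

```c
#include <stdlib.h>
#include <stdint.h>
#include "qpl/qpl.h"

/* Minimal sketch: compress `src` into `dst` on the IAA hardware path.
 * Returns the compressed size, or 0 on failure. */
static uint32_t iaa_compress(const uint8_t *src, uint32_t src_size,
                             uint8_t *dst, uint32_t dst_size) {
    uint32_t job_size = 0;
    qpl_job *job = NULL;

    /* Query the size of the QPL job structure and allocate it. */
    if (qpl_get_job_size(qpl_path_hardware, &job_size) != QPL_STS_OK)
        return 0;
    job = (qpl_job *)malloc(job_size);
    if (job == NULL || qpl_init_job(qpl_path_hardware, job) != QPL_STS_OK) {
        free(job);
        return 0;
    }

    /* Describe a single-shot DEFLATE compression job. */
    job->op            = qpl_op_compress;
    job->level         = qpl_default_level;
    job->next_in_ptr   = (uint8_t *)src;
    job->available_in  = src_size;
    job->next_out_ptr  = dst;
    job->available_out = dst_size;
    job->flags         = QPL_FLAG_FIRST | QPL_FLAG_LAST |
                         QPL_FLAG_DYNAMIC_HUFFMAN | QPL_FLAG_OMIT_VERIFY;

    /* Submit the job to the accelerator and wait for completion. */
    uint32_t out_size = 0;
    if (qpl_execute_job(job) == QPL_STS_OK)
        out_size = job->total_out;

    qpl_fini_job(job);
    free(job);
    return out_size;
}
```

The same job interface covers the other primitives mentioned above: setting `job->op` to `qpl_op_decompress`, `qpl_op_scan_range`, or `qpl_op_select` targets decompression and filtering, and requesting `qpl_path_software` instead of `qpl_path_hardware` falls back to a CPU implementation when no IAA device is available.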