
X-DINC: Toward Cross-Layer Approximation for the Distributed and In-Network ACceleration of Multi-Kernel Applications

With the rapid evolution of programmable network devices and the push for energy-efficient and sustainable computing, network infrastructures are evolving into computing pipelines that provide In-Network Computing (INC) capability. Despite initial success in offloading single or small kernels to network devices, deploying multi-kernel applications remains challenging due to limited memory and computing resources and the lack of support for Floating Point (FP) and complex operations. To tackle these challenges, we present a cross-layer approximation and distribution methodology (X-DINC) that exploits the error resilience of applications. X-DINC uses a chain of techniques to facilitate kernel deployment and distribution across heterogeneous devices in INC environments. First, we identify approximation and optimization opportunities in the data acquisition and computation phases of multi-kernel applications. Second, we simplify complex arithmetic operations to cope with the computation limitations of programmable network switches. Third, we perform application-level sensitivity analysis to measure the trade-off between performance gain and Quality of Results (QoR) loss when approximating individual kernels via various techniques. Finally, a greedy heuristic swiftly generates Pareto/near-Pareto mixed-precision configurations that maximize the performance gain while maintaining the user-defined QoR. X-DINC is prototyped on a Virtex-7 Field Programmable Gate Array (FPGA) and evaluated using the Blind Source Separation (BSS) application on an industrial audio dataset. Results show that, when distributed across 2 to 7 network nodes, X-DINC performs separation up to 35% faster with up to 88% lower Area-Delay Product (ADP) than an Accurate-Centralized approach, while maintaining audio quality within an acceptable range of 15-20 dB.
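The abstract does not spell out the greedy heuristic, so the following is only a minimal sketch of how a greedy mixed-precision selection under a QoR constraint could look. It assumes that QoR loss is additive across kernels and that each kernel offers an ordered list of options from accurate to most approximate; the `Option` type, the kernel names, the precision labels, and all numbers are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str        # hypothetical precision label, e.g. "fp32" or "fx16"
    gain: float      # assumed performance gain versus the accurate kernel
    qor_loss: float  # assumed QoR loss in dB (treated as additive here)


def greedy_config(kernels, baseline_qor_db, min_qor_db):
    """Pick one option per kernel, greedily trading QoR for gain.

    kernels maps a kernel name to its options, ordered from accurate
    to most approximate. Returns (chosen option names, resulting QoR).
    """
    choice = {k: 0 for k in kernels}  # start with every kernel accurate
    qor = baseline_qor_db
    while True:
        best = None
        for k, opts in kernels.items():
            i = choice[k]
            if i + 1 >= len(opts):
                continue  # kernel is already at its most approximate option
            d_gain = opts[i + 1].gain - opts[i].gain
            d_loss = opts[i + 1].qor_loss - opts[i].qor_loss
            if d_gain <= 0 or qor - d_loss < min_qor_db:
                continue  # no benefit, or it would violate the QoR bound
            score = d_gain / max(d_loss, 1e-9)  # gain per dB of QoR lost
            if best is None or score > best[0]:
                best = (score, k, d_loss)
        if best is None:
            break  # no feasible step remains
        _, k, d_loss = best
        choice[k] += 1
        qor -= d_loss
    return {k: kernels[k][i].name for k, i in choice.items()}, qor


# Hypothetical two-kernel example with a 15 dB lower bound on QoR.
kernels = {
    "whitening": [Option("fp32", 0.0, 0.0), Option("fx16", 4.0, 1.0), Option("fx8", 6.0, 6.0)],
    "ica": [Option("fp32", 0.0, 0.0), Option("fx16", 5.0, 3.0), Option("fx8", 9.0, 10.0)],
}
config, qor = greedy_config(kernels, baseline_qor_db=22.0, min_qor_db=15.0)
```

In this sketch each step takes the single downgrade with the best gain per dB of QoR lost and stops once no further step stays above the bound, so it returns one feasible configuration rather than a full Pareto front.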

In-network computing (INC); Distributed computing; Cross-layer approximation; Blind source separation; Independent component analysis; Field programmable gate array (FPGA); Programmable networks; P4; Energy-efficiency; Sustainability

Zahra Ebrahimi, Maryam Eslami, Xun Xiao, Akash Kumar


Center for Advancing Electronics Dresden (cfaed), Technische Universitaet Dresden, Dresden, Germany; Chair for Embedded Systems, Ruhr-Universitaet Bochum, Bochum, Germany

Chair for Embedded Systems, Ruhr-Universitaet Bochum, Bochum, Germany

Munich Research Center, Huawei Technologies, Munich, Germany

2025

Future generation computer systems: FGCS


ISSN:0167-739X
Year, volume (issue): 2025, 172 (Nov.)
  • 90