
Approximate Soft-Error Fault-Tolerant Design for Systolic-Array Neural Network Accelerators

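Building on the inherent error resilience of neural networks and the similarity among filters within a layer, this paper proposes an approximate fault-tolerant design that partitions the filters into check groups for inexact checking while ensuring that severe errors are detected and recovered. By optimizing the mapping of filters to processing elements so that the checking flow matches the systolic-array dataflow, the proposed design reduces performance overhead by 73.39% compared with conventional dual modular redundancy.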
Systolic array-based CNN accelerator soft error approximate fault tolerance design
To satisfy the massive computational requirements of Convolutional Neural Networks (CNNs), various Domain-Specific Architecture based accelerators have been deployed in large-scale systems. While these accelerators improve performance significantly, their high level of integration makes them much more susceptible to soft errors, which propagate and are amplified layer by layer during CNN execution, ultimately disturbing the CNN's decision and leading to catastrophic consequences. CNNs are increasingly deployed in security-critical areas, so reliable execution demands more attention. Although classical fault-tolerant approaches are effective against errors, the performance and energy overheads they introduce are non-negligible, which runs counter to the design philosophy of CNN accelerators. In this article, we leverage CNNs' intrinsic tolerance of minor errors and the similarity of filters within a layer to explore Approximate Fault Tolerance opportunities for reducing the fault tolerance overhead of CNN accelerators. By clustering the filters into several check groups and performing an inexact check that still ensures serious errors are mitigated, our approximate fault tolerance design reduces fault tolerance overhead significantly. Furthermore, we remap the filters so that the checking process matches the dataflow of the systolic array, which satisfies the real-time checking demands of CNNs. Experimental results show that our approach reduces performance overhead by 73.39% compared with baseline dual modular redundancy (DMR).
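The abstract does not spell out the checking rule, so the following is only a minimal illustrative sketch of the general idea of group-based inexact checking, not the authors' implementation. It assumes a simple greedy distance-based grouping in place of the unspecified clustering step, and it assumes a Cauchy-Schwarz bound as the acceptance threshold. The names `group_filters`, `inexact_check`, `radius`, and `eps` are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors (one flattened filter window)."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(a, a))


def group_filters(filters, radius):
    """Greedy grouping (an assumed stand-in for clustering): a filter joins the
    first group whose representative lies within `radius`, else starts a group."""
    groups = []  # list of (representative filter, member indices)
    for i, w in enumerate(filters):
        for rep, members in groups:
            if norm([a - b for a, b in zip(w, rep)]) <= radius:
                members.append(i)
                break
        else:
            groups.append((list(w), [i]))
    return groups


def inexact_check(filters, groups, x, outputs, eps=1e-9):
    """Return indices of outputs that deviate from their group's reference by
    more than a fault-free output ever could.

    For a fault-free output, |w.x - rep.x| <= ||w - rep|| * ||x|| always holds
    (Cauchy-Schwarz), so this check raises no false alarms. Errors smaller than
    the bound go unnoticed, which is the approximate part."""
    suspicious = []
    x_norm = norm(x)
    for rep, members in groups:
        ref = dot(rep, x)  # one redundant dot product per group, not per filter
        for i in members:
            diff = [a - b for a, b in zip(filters[i], rep)]
            bound = norm(diff) * x_norm
            if abs(outputs[i] - ref) > bound + eps:
                suspicious.append(i)
    return suspicious


# Toy data: four flattened filters forming two groups of similar filters.
filters = [[1.0, 2.0, 3.0], [1.1, 2.0, 2.9], [-2.0, 0.5, 1.0], [-2.1, 0.4, 1.0]]
x = [0.5, -1.0, 2.0]
groups = group_filters(filters, radius=0.5)

outputs = [dot(w, x) for w in filters]               # fault-free results
print(inexact_check(filters, groups, x, outputs))    # []

outputs[3] += 0.1                                    # minor error: tolerated
print(inexact_check(filters, groups, x, outputs))    # []

outputs[1] += 64.0                                   # large error, e.g. a high-order bit flip
print(inexact_check(filters, groups, x, outputs))    # [1]
```

In this toy setup the checker computes two reference dot products (one per group) where full duplication would recompute all four, at the cost of letting small deviations pass. The grouping radius trades checking cost against the size of errors that can escape.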

Keywords: computer architecture; convolutional neural network; systolic array; soft error; approximate fault tolerance

Wei Xiaohui (魏晓辉), Wang Chenyang (王晨洋), Wu Qi (吴旗), Zheng Xinyang (郑新阳), Yu Hongmei (于洪梅), Yue Hengshan (岳恒山)


College of Computer Science and Technology, Jilin University, Changchun 130012


Funding: National Natural Science Foundation of China (62272190, U19A2061)

2024

Journal of Jilin University (Engineering and Technology Edition)
Jilin University


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.792
ISSN:1671-5497
Year, volume (issue): 2024, 54(6)