A Fast and Secure Transformer Inference Scheme Based on Secure Multi-Party Computation

刘伟欣¹, 管晔玮¹, 霍嘉荣¹, 丁元朝¹, 郭华², 李博³

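Author information

1. School of Cyber Science and Technology, Beihang University, Beijing 100191
2. School of Cyber Science and Technology, Beihang University, Beijing 100191; State Key Laboratory of Complex & Critical Software Environment (Beihang University), Beijing 100878
3. State Key Laboratory of Complex & Critical Software Environment (Beihang University), Beijing 100878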

Abstract

Transformer models are widely used in natural language processing, computer vision, and many other fields, where they perform remarkably well. During inference, however, the user's data is exposed to the model provider. As public concern about data privacy grows, this leakage has motivated research on secure Transformer inference, and implementing it with secure multi-party computation (MPC) is currently an active research topic. Because Transformer models contain many non-linear functions, MPC-based secure inference incurs heavy computation and communication overhead. Targeting Softmax attention, a costly component of secure Transformer inference, we propose two MPC-friendly attention mechanisms: Softmax freeDiv Attention and 2Quad freeDiv Attention. By replacing the Softmax attention in a Transformer with these mechanisms, and combining this with a replacement of the GeLU activation function and knowledge distillation, we propose an MPC-friendly Transformer conversion framework that converts a Transformer model into an MPC-friendly one and thereby improves the efficiency of secure inference. Using this framework, we perform secure inference with Bert-Base on SST-2 in a LAN setting, using the privacy-preserving computation protocols provided by the secure processing unit (SPU). The results show that secure inference achieves a 2.26× speedup while inference accuracy remains consistent with that of the model without approximation.

Key words

secure inference / Transformer / secure multi-party computation (MPC) / secure processing unit (SPU) / knowledge distillation

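Funding

National Key Research and Development Program of China (2021YFB2700200)

National Natural Science Foundation of China (U21B2021)

National Natural Science Foundation of China (61972018)

National Natural Science Foundation of China (61932014)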

Year of publication

2024

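Journal of Computer Research and Development

Institute of Computing Technology, Chinese Academy of Sciences; China Computer Federation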

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 2.649

ISSN: 1000-1239

Number of references: 32
