遥测遥控 (Journal of Telemetry, Tracking and Command), 2024, Vol. 45, Issue 3: 43-51. DOI: 10.12347/j.ycyk.20231130001

PTNet: A Semi-supervised Parallel Transformer Network for Encrypted Traffic Classification

Feng Shuwen (冯舒文), Li Yuheng (李育恒), Bai Xuyang (白旭洋)

Author information

  • 1. Beijing Research Institute of Telemetry, Beijing 100076

Abstract

With the widespread use of network encryption protocols, traditional network traffic classification techniques face major challenges. Existing methods have two limitations. First, the models rely heavily on deep features, which requires a sufficiently large labeled training set; otherwise the models struggle to generalize to new data. Second, the models focus on a single modality of traffic features, and features of the same modality may not differ clearly between traffic classes. To address these problems, this paper proposes a deep learning-based encrypted traffic classification model, Parallel Transformer Net (PTNet). Following the semi-supervised pre-training and fine-tuning paradigm, the model is pre-trained on the large amount of unlabeled traffic data available on the network and then fine-tuned on a small amount of labeled data. In addition, the model extracts traffic features from two modalities in parallel, the payload and the packet length sequence, and fuses them. On three traffic classification tasks and their corresponding public datasets (Android, USTC-TFC, and CSTNET-TLS1.3), the model achieves classification accuracies of 95%, 98%, and 97%, respectively.
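The two modalities named in the abstract, the payload and the packet length sequence, can be illustrated with a small input-preparation sketch. The abstract does not describe how PTNet tokenizes or pads its inputs, so everything below is an assumption made for illustration: byte-level token ids, the padding values, the truncation limits, and the helper names `payload_tokens`, `length_sequence`, and `two_modal_inputs` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical padding values (assumptions, not from the paper).
PAD_BYTE = 256  # token id outside the byte range 0..255
PAD_LEN = 0     # padding value for missing packets


def payload_tokens(packets: Sequence[bytes], max_tokens: int = 64) -> List[int]:
    """Flatten packet payloads into byte-level token ids, truncated and padded."""
    flat = [b for pkt in packets for b in pkt][:max_tokens]
    return flat + [PAD_BYTE] * (max_tokens - len(flat))


def length_sequence(packets: Sequence[bytes], max_packets: int = 8) -> List[int]:
    """Build the per-flow packet length sequence, truncated and padded."""
    lens = [len(pkt) for pkt in packets][:max_packets]
    return lens + [PAD_LEN] * (max_packets - len(lens))


def two_modal_inputs(
    packets: Sequence[bytes], max_tokens: int = 64, max_packets: int = 8
) -> Tuple[List[int], List[int]]:
    """Return one input per modality; each would feed its own encoder branch."""
    return payload_tokens(packets, max_tokens), length_sequence(packets, max_packets)


# Toy flow with two packets (made-up bytes, for illustration only).
flow = [b"\x16\x03\x01", b"\x17\x03\x03\x00\x10"]
tokens, lengths = two_modal_inputs(flow, max_tokens=10, max_packets=4)
```

In a model of this kind, each of the two sequences would be passed through its own encoder branch and the resulting feature vectors fused before classification; the sketch only covers preparing the two inputs.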

Keywords

Network traffic classification / Encrypted traffic / Deep learning / PTNet

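Publication year

2024
遥测遥控 (Journal of Telemetry, Tracking and Command)
No. 704 Research Institute, China Aerospace Industry Corporation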

CSTPCD
Impact factor: 0.28
ISSN:
Number of references: 20