Chinese Journal of Network and Information Security (网络与信息安全学报), 2024, Vol. 10, Issue 4: 98-108. DOI: 10.11959/j.issn.2096-109x.2024056

Encrypted traffic identification scheme based on sliding window and randomness features

LIU Jiachi (刘家池) 1, KUANG Boyu (况博裕) 1, SU Mang (苏铓) 2, XU Yaqian (许亚倩) 3, FU Anmin (付安民) 4

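Author information

  • 1. School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China
  • 2. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China
  • 3. China Center for Information Industry Development, Beijing 100000, China
  • 4. School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China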

Abstract

With the development of information technology, network security has increasingly become a focal point for users and organizations, and encrypted data transmission has gradually become mainstream, driving a continuous rise in the proportion of encrypted traffic on the Internet. However, while data encryption ensures privacy and security, it has also become a means for illegal content to evade network supervision. Detecting and analyzing encrypted traffic therefore first requires identifying it efficiently, but the presence of compressed traffic seriously interferes with this identification. To address this issue, an encrypted traffic identification scheme based on sliding windows and randomness features was designed to identify encrypted traffic efficiently and accurately. Specifically, the scheme uses a sliding window mechanism to sample the payloads of the data transmission packets in a session, obtaining a sequence of data blocks that reflects the information patterns of the original traffic. For each data block, randomness measurement algorithms are used to extract sample features, which together form the randomness features of the original payload. In addition, a decision tree model based on the CART (classification and regression tree) algorithm was designed, which improves the accuracy of identifying encrypted and compressed traffic while greatly reducing the false negative rate for encrypted traffic. A balanced dataset was constructed by randomly sampling data from several authoritative websites, and experiments demonstrated the feasibility and efficiency of the proposed scheme.
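The sampling and feature-extraction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific randomness measures, window size, or step, so the choices here (byte-level Shannon entropy, a chi-square statistic against a uniform byte distribution, a 256-byte window with a 128-byte step) and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def sliding_blocks(payload: bytes, window: int = 256, step: int = 128):
    """Yield blocks sampled by sliding a fixed-size window over the payload.

    window and step are illustrative values, not taken from the paper.
    """
    if len(payload) < window:
        # Short payload: treat whatever is there as a single block.
        if payload:
            yield payload
        return
    for start in range(0, len(payload) - window + 1, step):
        yield payload[start:start + window]


def shannon_entropy(block: bytes) -> float:
    """Byte-level Shannon entropy in bits per byte (between 0 and 8)."""
    n = len(block)
    counts = Counter(block)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def chi_square(block: bytes) -> float:
    """Chi-square statistic of byte counts against a uniform distribution."""
    expected = len(block) / 256
    counts = Counter(block)
    return sum((counts.get(b, 0) - expected) ** 2 / expected
               for b in range(256))


def randomness_features(payload: bytes, window: int = 256, step: int = 128):
    """Return one (entropy, chi_square) pair per sampled block."""
    return [(shannon_entropy(b), chi_square(b))
            for b in sliding_blocks(payload, window, step)]
```

The per-block feature pairs could then be passed to a CART-style classifier; scikit-learn's `DecisionTreeClassifier`, for example, is one readily available CART implementation, though whether the authors used it is not stated.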

Key words

encrypted traffic/compressed traffic/randomness feature/sliding sampling

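Funding

National Natural Science Foundation of China (62072239)

National Natural Science Foundation of China (62372236)

Qing Lan Project of Jiangsu Province; Jiangsu Funding Program for Excellent Postdoctoral Talent ()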

Publication year

2024
Chinese Journal of Network and Information Security (网络与信息安全学报)
Posts & Telecom Press (人民邮电出版社)

CSTPCD
ISSN:2096-109X