NetExtractor: Unknown protocol reverse engineering approach based on network traces
Network protocol reverse engineering is an important challenge in many security domains. The current mainstream approach is to compare and slice characters and tokens across network traces, but existing work is limited by the high variance and complex states of binary protocol field values during inference, and it also suffers from format over-slicing and low accuracy in multi-state field annotation. To address these challenges, the authors propose the NetExtractor tool, which integrates optimization methods for format extraction and state annotation. In the format extraction phase, the spatiotemporal characteristics of network traces are extracted for coarse clustering, followed by multiple sequence alignment; the aligned results are then merged and optimized using statistical characteristics to further improve the accuracy of format extraction. In the state annotation phase, edit distance is introduced to measure the differences between fields, and random forest and statistical properties are combined to constrain the candidate state fields, improving the accuracy of multi-state field annotation. To validate the effectiveness of the proposed method, the NetExtractor tool is employed to automatically reverse engineer the protocol format and state machine of the ZeroAccess botnet. Evaluation experiments are conducted on eight commonly used protocols to assess the efficiency of the proposed method. The experimental results demonstrate that, compared with the leading research work in the field, NetExtractor can improve the accuracy of protocol format and protocol state identification, which is of great significance for network security analysis.
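As an illustration of how edit distance can measure the difference between two field values, the following is a minimal sketch, not the NetExtractor implementation. It assumes fields are raw byte strings and uses the standard Levenshtein distance; the function names `edit_distance` and `normalized_distance`, and the normalization by the longer field length, are hypothetical choices made for this example.

```python
def edit_distance(a: bytes, b: bytes) -> int:
    """Levenshtein distance between two byte strings (hypothetical helper)."""
    # prev[j] holds the distance between the first i-1 bytes of a
    # and the first j bytes of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete a byte from a
                curr[j - 1] + 1,     # insert a byte into a
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a: bytes, b: bytes) -> float:
    """Scale the distance to [0, 1] by the longer field (an assumption)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0
```

Under these assumptions, a candidate field whose values differ only slightly between messages yields a small normalized distance, while a high-variance field such as a nonce yields values close to 1.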