A Binary Modularization Approach Based on Graph Community Detection Method
As information technology continues to develop, the scale of software keeps growing. Complex, large-scale software is built by combining components that each perform an independent function. Once the source code is compiled into binary files, however, this modular information is lost, and the goal of binary modularization is to reconstruct it. Binary modularization has many downstream applications, such as detecting binary code reuse, binary similarity detection, and binary software composition analysis. We introduce a new graph community detection algorithm and design a binary modularization method based on it. The method's effectiveness is verified by modularizing 7,839 binary files from the Linux system. Experiments show that the method achieves a Normalized Turbo MQ score of 0.557, a 58.6% improvement over existing state-of-the-art methods, with a running time much lower than that of existing methods. We also propose a library-level binary modularization method. Existing binary modularization methods can only decompose a binary into several modules, whereas the proposed library-level method decomposes a binary into several libraries. Finally, we demonstrate the application of this method to malware classification.
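To make the evaluation metric concrete, the following is a minimal sketch of how a Turbo MQ style score can be computed for a partition of a weighted, directed function call graph. It uses the standard TurboMQ cluster factor, CF_i = 2*mu_i / (2*mu_i + sum_j(eps_ij + eps_ji)), where mu_i is the total weight of edges inside module i and eps_ij is the weight of edges from module i to module j. The normalization shown here (dividing by the number of modules so the score lies in [0, 1]) is an assumption, since the abstract does not define it, and the names `turbo_mq`, `normalized_turbo_mq`, `edges`, and `module_of` are illustrative only. The sketch does not implement the community detection algorithm proposed in this work.

```python
from collections import defaultdict


def turbo_mq(edges, module_of):
    """Sum of cluster factors over all modules.

    edges:     iterable of (caller, callee, weight) tuples (hypothetical input format)
    module_of: dict mapping each function name to its module id
    """
    intra = defaultdict(float)  # mu_i: weight of edges inside module i
    inter = defaultdict(float)  # sum over j of (eps_ij + eps_ji) for module i
    for src, dst, weight in edges:
        a, b = module_of[src], module_of[dst]
        if a == b:
            intra[a] += weight
        else:
            # A cross-module edge counts against both endpoints' modules.
            inter[a] += weight
            inter[b] += weight

    total = 0.0
    for module in set(module_of.values()):
        mu = intra[module]
        if mu > 0:  # by definition, CF_i is 0 when the module has no internal edges
            total += 2 * mu / (2 * mu + inter[module])
    return total


def normalized_turbo_mq(edges, module_of):
    """Assumed normalization: TurboMQ divided by the number of modules."""
    k = len(set(module_of.values()))
    return turbo_mq(edges, module_of) / k if k else 0.0


if __name__ == "__main__":
    # Toy call graph: functions a, b, c form module 0; d, e form module 1.
    edges = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1), ("d", "e", 1)]
    module_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
    print(round(normalized_turbo_mq(edges, module_of), 3))  # prints 0.733
```

In the toy graph, module 0 has two internal edges and one cross-module edge (cluster factor 4/5), and module 1 has one internal edge and one cross-module edge (cluster factor 2/3), so a partition that cuts fewer call edges scores higher.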