Online Log Parsing Method Based on BERT and Adaptive Clustering
Log parsing is a technique for extracting useful information from raw log files; it supports tasks such as system troubleshooting, performance analysis, and security auditing. The main challenge is that log data is unstructured, diverse, and dynamic: different systems and applications may use different log formats, and those formats may change over time. This paper therefore proposes BertLP, an online log parsing method that automatically adapts to different log sources and to variations in log format. BertLP combines a pre-trained language model, BERT, with an adaptive clustering algorithm to classify the words in each log as static or dynamic, group the logs, and generate log templates. Rather than relying on manually defined log templates or regular expressions, or on word frequency counts, BertLP identifies log fields and their types automatically by learning the semantic and structural features of log messages. Comparative experiments on public log datasets show that BertLP improves log parsing accuracy by 6.1% compared with the best available method and performs better on log parsing tasks.
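To make the idea of online template extraction concrete, the following is a minimal sketch and not the BertLP implementation. It rests on several simplifying assumptions: whitespace tokenization, a position-wise token-match ratio in place of BERT embeddings, and a fixed similarity threshold in place of the adaptive clustering the paper describes. The names `OnlineParser`, `Cluster`, `similarity`, and `merge` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

WILDCARD = "<*>"  # placeholder for a dynamic (variable) token


@dataclass
class Cluster:
    template: List[str]  # static tokens, with WILDCARD at dynamic positions
    count: int = 1       # number of log lines assigned to this cluster


def similarity(template: List[str], tokens: List[str]) -> float:
    """Fraction of positions where the log token equals the template token.

    Assumption: a simple stand-in for a semantic similarity score.
    Lines of different length are treated as dissimilar.
    """
    if not tokens or len(template) != len(tokens):
        return 0.0
    same = sum(1 for t, w in zip(template, tokens) if t == w)
    return same / len(tokens)


def merge(template: List[str], tokens: List[str]) -> List[str]:
    """Keep tokens that agree (static); mark the rest as dynamic."""
    return [t if t == w else WILDCARD for t, w in zip(template, tokens)]


class OnlineParser:
    def __init__(self, threshold: float = 0.5) -> None:
        # Assumption: fixed threshold; an adaptive scheme would tune this.
        self.threshold = threshold
        self.clusters: List[Cluster] = []

    def parse(self, line: str) -> str:
        """Assign one log line to a cluster and return its current template."""
        tokens = line.split()
        best: Optional[Cluster] = None
        best_sim = 0.0
        for cluster in self.clusters:
            sim = similarity(cluster.template, tokens)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= self.threshold:
            best.template = merge(best.template, tokens)
            best.count += 1
            return " ".join(best.template)
        # No sufficiently similar cluster: start a new one.
        self.clusters.append(Cluster(template=tokens))
        return " ".join(tokens)


parser = OnlineParser(threshold=0.5)
for log in [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Disk usage at 91 percent",
]:
    print(parser.parse(log))
```

Each line is processed once, as it arrives, which is what makes the scheme online. Under the stated assumptions, positions that differ between lines in the same cluster become dynamic fields, while positions that agree remain static template text.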