Trace clustering sampling framework for heterogeneous event logs

Information systems collect large volumes of business process event logs during execution. Process discovery aims to derive process models from these logs and thereby provide factual evidence for business process improvement. Existing process discovery approaches still face performance bottlenecks on large-scale event logs, and event log sampling offers an effective way to improve discovery efficiency. Existing event log sampling techniques usually assume that the log is homogeneous, that is, that it comes from or corresponds to a single business process. However, given the complexity and dynamic changes of business processes, the traces in a single event log are often heterogeneous, that is, the log comes from or corresponds to multiple business processes with different behaviors. On heterogeneous event logs, the sample logs produced by existing sampling techniques suffer from problems such as low accuracy, whereas trace clustering handles this heterogeneity well. To address this challenge, a trace clustering sampling framework for heterogeneous event logs is proposed. First, the event log is decomposed into a set of homogeneous sub-logs by a trace clustering method. Second, each sub-log is sampled with an existing sampling method. Third, the sample logs of the sub-logs are merged into the final sample log. Finally, the quality of the sample log is evaluated from the perspective of process model mining. Experiments on six public datasets show that the proposed method provides an effective solution for high-quality sampling of heterogeneous event logs.
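The cluster, sample, and merge steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the clustering or sampling methods, so the greedy Jaccard-based clustering, the variant-frequency sampling, the function names, and the `ratio` and `threshold` parameters are all assumptions made for illustration. The final evaluation step (discovering a model from the sample and measuring its quality) is omitted because it requires a process mining toolkit.

```python
from collections import Counter
import math

# A trace is a sequence of activity names; an event log is a list of traces.
Trace = tuple[str, ...]


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity between two activity sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_traces(log: list[Trace], threshold: float = 0.5) -> list[list[Trace]]:
    """Assumed clustering step: greedy leader clustering on activity sets.

    A trace joins the first cluster whose leader is similar enough,
    otherwise it starts a new cluster and becomes its leader.
    """
    leaders: list[frozenset] = []
    clusters: list[list[Trace]] = []
    for trace in log:
        activities = frozenset(trace)
        for i, leader in enumerate(leaders):
            if jaccard(activities, leader) >= threshold:
                clusters[i].append(trace)
                break
        else:
            leaders.append(activities)
            clusters.append([trace])
    return clusters


def sample_sublog(sublog: list[Trace], ratio: float) -> list[Trace]:
    """Assumed sampling step: keep the most frequent variants, scaled by ratio."""
    target = max(1, math.ceil(len(sublog) * ratio))
    sample: list[Trace] = []
    for variant, count in Counter(sublog).most_common():
        if len(sample) >= target:
            break
        keep = max(1, round(count * ratio))
        sample.extend([variant] * min(keep, target - len(sample)))
    return sample


def cluster_then_sample(log: list[Trace], ratio: float = 0.3,
                        threshold: float = 0.5) -> list[Trace]:
    """Decompose the log into sub-logs, sample each, and merge the samples."""
    merged: list[Trace] = []
    for sublog in cluster_traces(log, threshold):
        merged.extend(sample_sublog(sublog, ratio))
    return merged


if __name__ == "__main__":
    # Toy heterogeneous log: a frequent approval process and a rare shipping process.
    approve = ("register", "check", "approve", "archive")
    reject = ("register", "check", "reject", "archive")
    ship = ("order", "pick", "ship")
    log = [approve] * 6 + [reject] * 2 + [ship] * 2
    for trace in cluster_then_sample(log):
        print(trace)
```

Because every sub-log contributes at least one trace, the rare shipping process in the toy log is still represented in the merged sample, which is the kind of behavior a single global sampler may miss on a heterogeneous log.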

heterogeneity; trace clustering; log sampling; process discovery; quality measurement

Zhang Shuaipeng, Liu Cong, Su Xuan, Guo Na, Gao Qingxin, Li Caihong, Zeng Qingtian

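School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China

College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, China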

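Funding: National Natural Science Foundation of China (62472264, 52374221); Taishan Scholars Program of Shandong Province (tsqn201909109, ts20190936); Excellent Youth Fund of the Shandong Provincial Natural Science Foundation (ZR2021YQ45); Youth Innovation Science and Technology Program Innovation Team of Shandong Higher Education Institutions (2021KJ031)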

2024

Computer Integrated Manufacturing Systems
No. 210 Research Institute of China North Industries Group Corporation

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.092
ISSN: 1006-5911
Year, volume (issue): 2024, 30(9)