[Computer and Information Forum (计信讲坛)] Lecture 115: Evolving Ranking-Based Failure Proximities for Better Clustering in Fault Isolation
Posted by [College of Computer and Information] on October 7, 2022
Talk title |
Evolving Ranking-Based Failure Proximities for Better Clustering in Fault Isolation |
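Time |
October 10, 2022, 10:00-10:45 |
Venue |
Academic Lecture Hall, 1st floor, Information Science Building |
Speaker |
Yi Song (宋壹), Ph.D. |
Host |
College of Computer and Information |
Notes |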
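Speaker: Yi Song (宋壹) is a Ph.D. student (2020 cohort) in the Department of Software Engineering, School of Computer Science, Wuhan University. Song's main research interests include software testing and debugging, software fault localization, and search-based software engineering. As first author, Song has published one paper at ASE (CCF-A), a top international conference in software engineering, and one paper in JSS (CCF-B; Chinese Academy of Sciences Zone 2), a well-known international journal. Song has been granted one national invention patent, one utility model patent, and three software copyrights, and has participated in one project funded by the Natural Science Foundation of Hubei Province. Song was invited to conduct joint research at the Advanced Research Center for Software Testing and Quality Assurance at the University of Texas at Dallas, and received the National Scholarship for master's students. Song serves as co-chair of the International Workshop on Software Defect Prediction and Analysis (SDPA 2022) and as a reviewer for the SCI-indexed Journal of Supercomputing.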
[Talk Title] Evolving Ranking-Based Failure Proximities for Better Clustering in Fault Isolation
[Abstract] Failures that are not related to a specific fault can reduce the effectiveness of fault localization in multi-fault scenarios. To tackle this challenge, researchers and practitioners typically cluster failures (e.g., failed test cases) into several disjoint groups, with those caused by the same fault grouped together. In such a fault isolation process, which requires input in a mathematical form, ranking-based failure proximity (R-proximity) is widely used to model failed test cases. In R-proximity, each failed test case is represented as a suspiciousness ranking list of program statements through a fingerprinting function (i.e., a risk evaluation formula, REF). Although many off-the-shelf REFs have been integrated into R-proximity, they were originally designed for single-fault localization. To the best of our knowledge, no REF has been developed to serve as a fingerprinting function of R-proximity in multi-fault scenarios. To better cluster failures in fault isolation, in this paper we present a genetic programming-based framework, along with a sophisticated fitness function, for evolving REFs with the goal of representing failures more properly in multi-fault scenarios. By using a small set of programs for training, we get a collection of REFs that achieve good results across a larger and more general range of scenarios. The best one of them outperforms the state-of-the-art by 50.72% and 47.41% in fault number estimation and clustering effectiveness, respectively. Our framework is highly configurable for further use, and the evolved formulas can be directly applied in future failure representation tasks without any retraining.
|
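To make the R-proximity idea in the abstract concrete, here is a minimal sketch of how one failed test case might be turned into a suspiciousness ranking and how two such rankings might be compared. It is not the framework presented in the talk. It assumes Ochiai as the off-the-shelf REF, a plain Kendall tau distance as the proximity measure, and a toy 0/1 coverage matrix; the function names and the data are hypothetical.

```python
import math
from itertools import combinations


def ochiai(ef, ep, nf):
    """Ochiai REF: ef / sqrt((ef + nf) * (ef + ep)); 0.0 if the denominator is 0."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def fingerprint(failed_cov, passed_covs):
    """Rank statements for ONE failed test paired with all passed tests.

    failed_cov and each entry of passed_covs are 0/1 coverage vectors.
    Returns statement indices ordered from most to least suspicious.
    """
    n = len(failed_cov)
    scores = []
    for s in range(n):
        ef = failed_cov[s]                       # failed test covers s?
        nf = 1 - ef                              # failed test misses s?
        ep = sum(cov[s] for cov in passed_covs)  # passed tests covering s
        scores.append(ochiai(ef, ep, nf))
    # Ties are broken by statement index to keep the ranking deterministic.
    return sorted(range(n), key=lambda s: (-scores[s], s))


def kendall_tau_distance(rank_a, rank_b):
    """Count statement pairs that the two rankings order differently."""
    pos_a = {s: i for i, s in enumerate(rank_a)}
    pos_b = {s: i for i, s in enumerate(rank_b)}
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


# Toy data (hypothetical): 4 statements, 2 passed tests, 3 failed tests.
passed = [[1, 0, 1, 0], [1, 0, 1, 0]]
f1, f2, f3 = [1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 1]

r1, r2, r3 = (fingerprint(f, passed) for f in (f1, f2, f3))
d12 = kendall_tau_distance(r1, r2)  # identical coverage, so identical rankings
d13 = kendall_tau_distance(r1, r3)  # different coverage, so rankings diverge
```

In this toy setup, `f1` and `f2` yield the same ranking and therefore a distance of 0, while `f3` lands farther away; a clustering step would then group failures by such distances. Swapping `ochiai` for another formula changes the fingerprints, which is the degree of freedom the abstract proposes to evolve with genetic programming.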