Mining Software Data for Better Software Quality
Automated bug fixing from Bug Reports
Define Fix Pattern
R2Fix: Automatically Generating Bug Fixes from Bug Reports

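The speaker manually studied how developers fix three major, important bug types. For example, 86.6% of buffer overflows can be fixed in one of the following ways: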
- Allocate a longer buffer
- Assign fewer bytes to a buffer
- Modify the bound check condition
E.g.,
    - strcpy(<buffer>, <expression>);
    + strncpy(<buffer>, <expression>, sizeof(<buffer>));
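To make the idea of a fix pattern concrete, here is a minimal sketch of applying one mechanically. It is not R2Fix's implementation: the regex, the function name `apply_strcpy_pattern`, and the choice of `strncpy` as the bounded replacement are assumptions made for illustration, and the regex only handles simple calls without nested commas or parentheses.

```python
import re

# Hypothetical encoding of one fix pattern:
#   strcpy(<buffer>, <expression>)
#   -> strncpy(<buffer>, <expression>, sizeof(<buffer>))
# Only simple arguments (no nested commas or parentheses) are matched.
STRCPY_CALL = re.compile(
    r"\bstrcpy\s*\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)"
)

def apply_strcpy_pattern(c_source: str) -> str:
    """Rewrite simple strcpy calls into bounded strncpy calls."""
    return STRCPY_CALL.sub(r"strncpy(\1, \2, sizeof(\1))", c_source)

patched = apply_strcpy_pattern("strcpy(buf, name);")
print(patched)
```

Note that `sizeof(<buffer>)` gives the buffer length only when the buffer is an array in scope, not a pointer, so a generated patch of this kind still needs review before it is applied.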
Using classification to decide the bug type
graph LR;
A[Bug reports e.g. Bugzilla]--> |Bag-of-words model| B[Real bug?]
B[Real bug?]--> |Yes| C[Root Causes?]
B[Real bug?]--> |No| D[ ]
C[Root Causes?]-->E[Buffer overflow]
C[Root Causes?]-->F[Null pointer]
C[Root Causes?]-->G[Memory leak]
C[Root Causes?]-->H[Not one of the three bug types]
90.3% of the classified ‘buffer overflow’ bug reports are true buffer overflows.
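The classification step can be sketched as follows. The notes only say that a bag-of-words model is used; the naive Bayes classifier, the class name `NaiveBayes`, and the tiny training set below are assumptions made up for illustration, not the classifier or data from the talk.

```python
import math
import re
from collections import Counter, defaultdict

def bag_of_words(text):
    """Lowercase the text and count word occurrences, ignoring word order."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))

class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words features (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of reports
        self.vocab = set()

    def train(self, text, label):
        bag = bag_of_words(text)
        self.word_counts[label].update(bag)
        self.doc_counts[label] += 1
        self.vocab.update(bag)

    def predict(self, text):
        bag = bag_of_words(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word, n in bag.items():
                # Laplace smoothing so unseen words do not zero out a class.
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += n * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up toy bug reports, one label per root cause.
clf = NaiveBayes()
clf.train("buffer overflow in strcpy when copying long file name", "buffer overflow")
clf.train("stack buffer overrun, array index out of bounds", "buffer overflow")
clf.train("crash on null pointer dereference in parser", "null pointer")
clf.train("segfault because pointer is NULL after failed lookup", "null pointer")
clf.train("memory leak: allocated block never freed on error path", "memory leak")
clf.train("leak of memory when connection closes without free", "memory leak")

print(clf.predict("overflow when copying into a fixed size buffer"))
```

In the pipeline drawn above, a first classifier of this kind would separate real bugs from non-bugs, and a second would assign the root cause.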
Extracting Web API Specifications from Documentation
Towards Extracting Web API Specifications from Documentation

Challenges faced by consumers:
- Finding “the right” APIs
- Correctly using APIs
- Detecting and reacting to API changes
The speaker proposed a method for automatically extracting specifications from API documentation.
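As a rough illustration of what extracting a specification from documentation means, the sketch below pulls HTTP methods, path templates, and path parameters out of free text. The notes do not describe the actual extraction technique, so the regex heuristic, the function name `extract_endpoints`, and the sample text are all assumptions.

```python
import re

HTTP_METHODS = "GET|POST|PUT|PATCH|DELETE"
# An HTTP verb followed by a path such as /users/{id}.
ENDPOINT = re.compile(rf"\b({HTTP_METHODS})\s+(/[\w/{{}}.\-]*)")

def extract_endpoints(doc_text):
    """Return sorted (method, path_template, path_params) tuples found in text."""
    found = set()
    for method, path in ENDPOINT.findall(doc_text):
        path = path.rstrip(".")  # drop a sentence-ending period
        params = tuple(re.findall(r"\{(\w+)\}", path))
        found.add((method, path, params))
    return sorted(found)

# Made-up documentation snippet.
sample_doc = (
    "To fetch a user, send GET /users/{id} with an API key. "
    "Create one with POST /users."
)
for endpoint in extract_endpoints(sample_doc):
    print(endpoint)
```

Real documentation is far less regular than this sample, which is one reason extracted results need to be checked against existing specifications.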

Extraction results compared with the specifications on API Harmony and APIs.guru:
- Our approach achieves reasonably precise results
- Documentation may contain errors
- There exist many inconsistencies between documentation and crawled specifications, e.g., due to API updates.
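A comparison of this kind can be sketched as a set difference between the endpoints extracted from documentation and those in a crawled specification. The function names and the example endpoints are hypothetical. Since either side may be wrong or outdated, the "precision" computed here is only relative to the crawled specification, not to ground truth.

```python
def diff_specs(extracted, reference):
    """Compare two collections of (method, path) pairs and report disagreements."""
    extracted, reference = set(extracted), set(reference)
    return {
        "only_in_docs": sorted(extracted - reference),
        "only_in_reference": sorted(reference - extracted),
        "shared": sorted(extracted & reference),
    }

def precision_against(extracted, reference):
    """Fraction of extracted endpoints that also appear in the reference."""
    extracted, reference = set(extracted), set(reference)
    return len(extracted & reference) / len(extracted) if extracted else 0.0

# Made-up example: the documentation describes a newer endpoint
# that the crawled specification does not list yet.
from_docs = [("GET", "/users/{id}"), ("POST", "/users"), ("GET", "/v2/teams")]
crawled = [("GET", "/users/{id}"), ("POST", "/users"), ("GET", "/teams")]

report = diff_specs(from_docs, crawled)
print(report["only_in_docs"])
print(report["only_in_reference"])
print(precision_against(from_docs, crawled))
```

Entries that appear on only one side are candidates for manual inspection: they may be extraction mistakes, documentation errors, or API updates not yet reflected in the crawled specification.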
Studying and Improving Testing Practices
A Study of Oracle Approximations in Testing Deep Learning Libraries

- There exists a significant portion of oracle approximations in DL libraries
- Developers constantly modify oracle approximations as the code evolves, especially by adjusting tolerances to avoid flaky tests and to manage variability
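For readers unfamiliar with the term, an oracle approximation is a test assertion that accepts a range of outputs instead of one exact value. The sketch below shows the idea with plain floating-point arithmetic; the helper name `check_close` and the chosen tolerances are assumptions for illustration, not taken from any DL library's test suite.

```python
import math

def check_close(actual, expected, rel_tol=1e-9, abs_tol=0.0):
    """Approximate oracle: pass when actual is within tolerance of expected."""
    return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)

result = 0.1 + 0.2

# Exact oracle: fails because of floating-point rounding.
exact_oracle_passes = (result == 0.3)

# Approximate oracle: passes with a small relative tolerance.
approx_oracle_passes = check_close(result, 0.3)

# With zero tolerance the approximate oracle degenerates to the exact one.
zero_tolerance_passes = check_close(result, 0.3, rel_tol=0.0)

print(exact_oracle_passes, approx_oracle_passes, zero_tolerance_passes)
```

Choosing the tolerance is the hard part: too tight and the test becomes flaky across hardware or random seeds, too loose and real regressions slip through, which is consistent with the observation that developers keep revising these values.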