Mining Software Data for Better Software Quality
Automated bug fixing from Bug Reports
Define Fix Pattern
R2Fix: Automatically Generating Bug Fixes from Bug Reports
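The speaker manually studied how developers fix three major, important bug types. For example, 86.6% of buffer overflows can be fixed in one of the following ways: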
- Allocate a longer buffer
- Assign fewer bytes to a buffer
- Modify the bound check condition
E.g.,

    - strcpy(<buffer>, <expression>);
    + strncpy(<buffer>, <expression>, sizeof(<buffer>));
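To make the idea of a fix pattern concrete, here is a minimal sketch of applying the bounded-copy pattern mechanically. It is not R2Fix's implementation: the regular expression, the function name `apply_bounded_copy_pattern`, and the sample input are all hypothetical. The sketch assumes the added line uses the bounded `strncpy`, since a three-argument call cannot be plain `strcpy`.

```python
import re

# Hypothetical pattern: a two-argument strcpy call whose destination is a
# plain identifier. A real tool would work on parsed code, not on regexes.
STRCPY_CALL = re.compile(r"\bstrcpy\(\s*(\w+)\s*,\s*([^;]+?)\s*\)\s*;")


def apply_bounded_copy_pattern(c_source: str) -> str:
    """Rewrite strcpy(buf, expr); into strncpy(buf, expr, sizeof(buf));"""
    return STRCPY_CALL.sub(r"strncpy(\1, \2, sizeof(\1));", c_source)


# Invented one-line C statement used as input.
patched = apply_bounded_copy_pattern("strcpy(name, user_input);")
print(patched)
```

Two caveats apply to the generated C: `sizeof(<buffer>)` yields the buffer size only when the buffer is an array rather than a pointer, and `strncpy` does not null-terminate the destination when the source is too long, so a candidate patch like this still needs review.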
Using classification to decide the bug type
graph LR
  A[Bug reports, e.g. Bugzilla] -->|Bag-of-words model| B[Real bug?]
  B -->|Yes| C[Root causes?]
  B -->|No| D[ ]
  C --> E[Buffer overflow]
  C --> F[Null pointer]
  C --> G[Memory leak]
  C --> H[Not one of the three bug types]
90.3% of the classified ‘buffer overflow’ bug reports are true buffer overflows.
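The root-cause stage of this pipeline can be sketched as a bag-of-words text classifier. The notes only name the bag-of-words model, so the classifier used here (multinomial naive Bayes with add-one smoothing) is an assumption, and the six training reports are invented for illustration. The "real bug?" stage and the "not one of the three" class are left out to keep the sketch short.

```python
import math
import re
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lowercase the text and count word occurrences, ignoring word order."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over word counts."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        words = bag_of_words(text)
        total_docs = sum(self.label_counts.values())

        def score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.label_counts[label] / total_docs)
            for word, n in words.items():
                log_prob += n * math.log((counts[word] + 1) / denom)
            return log_prob

        return max(self.label_counts, key=score)


# Invented bug-report snippets; a real classifier needs far more data.
reports = [
    ("buffer overflow when copying long string into fixed size array", "buffer overflow"),
    ("stack buffer overrun in strcpy with oversized input", "buffer overflow"),
    ("crash dereferencing null pointer after failed lookup", "null pointer"),
    ("null pointer dereference when config is missing", "null pointer"),
    ("memory leak: allocated block never freed on error path", "memory leak"),
    ("leak of memory in parser, malloc without free", "memory leak"),
]
model = NaiveBayes().fit([t for t, _ in reports], [l for _, l in reports])
print(model.predict("segfault from null pointer in lookup"))
```

The predicted label would then select which set of fix patterns to try on the report.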
Extracting Web API Specifications from Documentation
Towards Extracting Web API Specifications from Documentation
Challenges faced by consumers:
- Finding “the right” APIs
- Correctly using APIs
- Detecting and reacting to API changes
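The speaker proposed a method for automatically extracting specifications.
The extracted results were compared with the specifications on API Harmony and APIs.guru: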
- Our approach achieves reasonably precise results
- Documentation may contain errors
- There exist many inconsistencies between documentation and crawled specifications, e.g., due to API updates.
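The notes do not describe how the extraction works, so the following is only a toy stand-in: a regular expression that pulls HTTP method and path pairs out of free-text documentation and diffs them against a reference specification. The documentation text, the endpoints, and the names `extract_endpoints` and `REFERENCE_SPEC` are all hypothetical.

```python
import re

# Invented documentation snippet; the endpoints are for illustration only.
DOC_TEXT = """
To list users, send GET /v1/users.
Create a user with POST /v1/users and remove one with DELETE /v1/users/{id}.
"""

# Invented reference specification, as (method, path) pairs.
REFERENCE_SPEC = {
    ("GET", "/v1/users"),
    ("POST", "/v1/users"),
    ("PUT", "/v1/users/{id}"),
}

ENDPOINT = re.compile(r"\b(GET|POST|PUT|PATCH|DELETE)\s+(/[\w/{}.-]*)")


def extract_endpoints(doc_text):
    """Collect (HTTP method, path) pairs mentioned in documentation text."""
    # Strip a trailing period picked up from the end of a sentence.
    return {(method, path.rstrip(".")) for method, path in ENDPOINT.findall(doc_text)}


documented = extract_endpoints(DOC_TEXT)
only_in_docs = documented - REFERENCE_SPEC   # documented but missing from the spec
only_in_spec = REFERENCE_SPEC - documented   # in the spec but not documented
print(sorted(only_in_docs), sorted(only_in_spec))
```

Each non-empty difference is a candidate inconsistency. Deciding whether the documentation or the specification is the outdated side still requires checking against the live API.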
Studying and improving testing practices
A Study of Oracle Approximations in Testing Deep Learning Libraries
- A significant portion of the test oracles in DL libraries are oracle approximations
- Developers constantly modify oracle approximations as the code evolves, especially by adjusting tolerances to avoid flaky tests and manage variability
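An oracle approximation is a test assertion that accepts a range of outputs instead of one exact value. The sketch below is not taken from any DL library: the `softmax` function, the helper `assert_close`, and the tolerance values are assumptions chosen for illustration, using only `math.isclose` from the Python standard library.

```python
import math


def softmax(xs):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def assert_close(actual, expected, rel_tol=1e-9, abs_tol=1e-12):
    """Oracle approximation: pass when the values agree within the tolerances."""
    assert math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol), (
        f"{actual!r} differs from {expected!r} beyond tolerance"
    )


probs = softmax([1.0, 2.0, 3.0])

# An exact check such as sum(probs) == 1.0 can fail because of floating-point
# rounding, so the test compares within a tolerance instead.
assert_close(sum(probs), 1.0)

# Neighbouring logits differ by 1, so their probabilities differ by a factor of e.
assert_close(probs[2] / probs[1], math.e)
```

The tolerance is the part that tends to change over time: too tight and the test becomes flaky across hardware or library versions, too loose and it stops catching real regressions.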