Mining Software Data for Better Software Quality

Automated bug fixing from Bug Reports

Define Fix Pattern

R2Fix: Automatically Generating Bug Fixes from Bug Reports

image1

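The speaker manually studied how developers fix three major and important bug types. For example, 86.6% of buffer overflows can be fixed in one of the following ways: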

  • Allocate a longer buffer
  • Assign fewer bytes to a buffer
  • Modify the bound check condition
E.g.,
- strcpy(<buffer>,<expression>);
+ strncpy(<buffer>,<expression>,sizeof(<buffer>));
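To make the idea of a fix pattern concrete, here is a minimal sketch of applying such a pattern mechanically: it rewrites an unbounded `strcpy` call into a `strncpy` call bounded by `sizeof(<buffer>)`. The regular expression and the function name `apply_strcpy_pattern` are assumptions made for illustration; this is not R2Fix's actual implementation, which the notes do not describe.

```python
import re

# Hypothetical, simplified pattern: matches strcpy(<buffer>, <expression>);
# where <buffer> is a plain identifier. Real code needs a proper C parser.
STRCPY_CALL = re.compile(
    r"\bstrcpy\(\s*(?P<buffer>\w+)\s*,\s*(?P<expression>[^;]+?)\s*\)\s*;"
)


def apply_strcpy_pattern(source: str) -> str:
    """Rewrite each matched strcpy call into a strncpy call bounded by sizeof."""
    return STRCPY_CALL.sub(
        r"strncpy(\g<buffer>, \g<expression>, sizeof(\g<buffer>));",
        source,
    )


patched = apply_strcpy_pattern("strcpy(name, argv[1]);")
print(patched)
```

A generated patch like this still needs review: `sizeof(<buffer>)` only yields the buffer length when `<buffer>` is an array rather than a pointer, and `strncpy` does not null-terminate the destination when the source is too long.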

Using classification to decide the bug type

graph LR;
    A[Bug reports, e.g. Bugzilla]--> |Bag-of-words model| B[Real bug?]
    B[Real bug?]--> |Yes| C[Root Causes?]
    B[Real bug?]--> |No| D[ ]
    C[Root Causes?]-->E[Buffer overflow]
    C[Root Causes?]-->F[Null pointer]
    C[Root Causes?]-->G[Memory leak]
    C[Root Causes?]-->H[Not one of the three bug types]

90.3% of the classified ‘buffer overflow’ bug reports are true buffer overflows.
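The root-cause step of this pipeline can be sketched with a tiny bag-of-words classifier. The notes only say that a bag-of-words model is used, so the choice of multinomial Naive Bayes, the class name `NaiveBayes`, and the six training reports below are all assumptions invented for illustration. The sketch also skips the first stage that decides whether a report is a real bug.

```python
import math
import re
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lowercase the text and count word occurrences; word order is discarded."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words features (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of reports
        self.vocabulary = set()

    def train(self, text, label):
        bag = bag_of_words(text)
        self.word_counts[label].update(bag)
        self.doc_counts[label] += 1
        self.vocabulary.update(bag)

    def predict(self, text):
        bag = bag_of_words(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # class prior
            for word, count in bag.items():
                # Laplace smoothing so unseen words do not zero out a class.
                p = (self.word_counts[label][word] + 1) / (
                    total_words + len(self.vocabulary)
                )
                score += count * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training reports, not taken from any real bug tracker.
classifier = NaiveBayes()
for report, label in [
    ("buffer overflow in strcpy when name exceeds array size", "buffer overflow"),
    ("sprintf writes past end of buffer overflow", "buffer overflow"),
    ("crash on null pointer dereference in parser", "null pointer"),
    ("segfault dereference of null pointer after failed lookup", "null pointer"),
    ("memory leak: allocated block never freed on error path", "memory leak"),
    ("leak of memory when connection closed, malloc without free", "memory leak"),
]:
    classifier.train(report, label)

print(classifier.predict("strcpy overflow of fixed size buffer"))
```

A real setup would train on a much larger labeled corpus and add a fourth label for reports that match none of the three bug types.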

Extracting Web API Specifications from Documentation

Towards Extracting Web API Specifications from Documentation

image2

Challenges faced by consumers:

  • Finding “the right” APIs
  • Correctly using APIs
  • Detecting and reacting to API changes

The speaker proposed a method for automatically extracting specifications.

image3

Comparing the extracted results with the specifications on API Harmony and APIs.guru:

  • Our approach achieves reasonably precise results
  • Documentation may contain errors
  • There exist many inconsistencies between documentation and crawled specifications, e.g., due to API updates.
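The kind of comparison described above can be illustrated with a small sketch that pulls endpoints out of documentation text and diffs them against a reference specification. The notes do not describe the speaker's extraction method, so everything here is an assumption: the convention that documentation writes an HTTP verb followed by a path, the function names `extract_endpoints` and `compare`, and the sample text and reference set.

```python
import re

# Assumed documentation convention: an HTTP verb followed by a path template,
# for example "GET /users/{id}".
ENDPOINT = re.compile(r"\b(GET|POST|PUT|PATCH|DELETE)\s+(/[\w/{}\-.:]*)")


def extract_endpoints(doc_text):
    """Return the set of (method, path template) pairs mentioned in the text."""
    return {
        (m.group(1), m.group(2).rstrip("."))  # drop a sentence-ending period
        for m in ENDPOINT.finditer(doc_text)
    }


def compare(extracted, reference):
    """Split endpoints into those both sources share and those only one has."""
    return {
        "agree": extracted & reference,
        "only_in_docs": extracted - reference,
        "only_in_reference": reference - extracted,
    }


# Hypothetical documentation snippet and reference specification.
docs = "Use GET /users/{id} to fetch a user. Send POST /users to create one."
reference = {("GET", "/users/{id}"), ("DELETE", "/users/{id}")}

report = compare(extract_endpoints(docs), reference)
print(report)
```

Entries that appear on only one side are candidates for the inconsistencies mentioned above; deciding whether the documentation or the reference specification is outdated still takes manual inspection.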

Studying and improving testing practices

A Study of Oracle Approximations in Testing Deep Learning Libraries

image4

  • There exists a significant portion of oracle approximations in DL libraries
  • Developers constantly modify oracle approximations as the code evolves, especially by adjusting tolerances to avoid flaky tests and manage variability
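As an illustration of what an oracle approximation looks like in a test, the sketch below checks a floating-point result against expected values within a tolerance instead of requiring exact equality. The `softmax` function, the helper `assert_all_close`, and the tolerance values are assumptions chosen for this example, not code from any of the studied libraries.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def assert_all_close(actual, expected, rel_tol=1e-6, abs_tol=1e-9):
    """Approximate oracle: pass if every element is within the given tolerance."""
    assert len(actual) == len(expected), "length mismatch"
    for index, (a, e) in enumerate(zip(actual, expected)):
        assert math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol), (
            f"element {index}: {a!r} is not close to {e!r}"
        )


# The expected values are rounded to 8 digits, so an exact == check would fail.
expected = [0.09003057, 0.24472847, 0.66524096]
assert_all_close(softmax([1.0, 2.0, 3.0]), expected)
```

The tolerance is the part developers end up tuning: too tight and the test becomes flaky across hardware or library versions, too loose and it stops catching real regressions.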