Shape analysis (program analysis)
In program analysis, shape analysis is a static code analysis technique that discovers and verifies properties of linked, dynamically allocated data structures in (usually imperative) computer programs. It is typically used at compile time to find software bugs or to verify high-level correctness properties of programs. In Java programs, it can be used to ensure that a sort method correctly sorts a list. For C programs, it might look for places where a block of memory is not properly freed. ApplicationsShape analysis has been applied to a variety of problems:
ExampleShape analysis is a form of pointer analysis, although it is more precise than typical pointer analysis. Pointer analysis attempts to determine the set of objects to which a pointer can point (called the points-to set of the pointer). Unfortunately, these analysis are necessarily approximate (since a perfectly precise static analysis could solve the halting problem). Shape analysis can determine smaller (more precise) points-to sets. Consider the following simple C++ program. Item *items[10];
for (int i = 0; i < 10; ++i) {
items[i] = new Item(...); // line [1]
}
process_items(items); // line [2]
for (int i = 0; i < 10; ++i) {
delete items[i]; // line [3]
}
This program builds an array of objects, processes them in some arbitrary way, and then deletes them. Assuming that the Unfortunately, most pointer analyses have difficulty analyzing this program precisely. In order to determine points-to sets, a pointer analysis must be able to name a program's objects. In general, programs can allocate an unbounded number of objects; but in order to terminate, a pointer analysis can only use a finite set of names. A typical approximation is to give all the objects allocated on a given line of the program the same name. In the example above, all the objects constructed at line [1] would have the same name. Therefore, when the Summarization and materializationShape analysis overcomes the problems of pointer analysis by using a more flexible naming system for objects. Rather than giving an object the same name throughout a program, objects can change names depending on the program's actions. Sometimes, several distinct objects with different names may be summarized, or merged, so that they have the same name. Then, when a summarized object is about to be used by the program, it can be materialized—that is, the summarized object is split into two objects with distinct names, one representing a single object and the other representing the remaining summarized objects. The basic heuristic of shape analysis is that objects that are being used by the program are represented using unique materialized objects, while objects not in use are summarized. The array of objects in the example above is summarized in separate ways at lines [1], [2], and [3]. At line [1], the array has been only partly constructed. The array elements 0..i-1 contain constructed objects. The array element i is about to be constructed, and the following elements are uninitialized. Shape analysis can approximate this situation using a summary for the first set of elements, a materialized memory location for element i, and a summary for the remaining uninitialized locations, as follows:
After the loop terminates, at line [2], there is no need to keep anything materialized. The shape analysis determines at this point that all the array elements have been initialized:
At line [3], however, the array element
Notice that in this case, the analysis recognizes that the pointer at index See alsoReferences
Bibliography
Information related to Shape analysis (program analysis) |