Pointer Analysis Survey. Rupesh Nasre. Aug 24, 2007.

Outline.

● The problem.
● Background.
● Representative papers.
● Discussion: trends, similarities, differences.
● Directions for research.

The problem.

Statically determine groups of program variables such that all variables in a group may point to the same memory block during program execution.
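To make the problem statement concrete, here is a minimal sketch that derives such groups from a points-to map. The map, the variable names, and the helper `alias_groups` are all illustrative assumptions; how the map is computed is the subject of the rest of the survey.

```python
from collections import defaultdict


def alias_groups(points_to):
    """Group variables that may point to the same memory block.

    points_to maps a pointer variable to the set of blocks it may point to
    (a hypothetical input format chosen for this sketch).
    """
    by_block = defaultdict(set)
    for var, blocks in points_to.items():
        for block in blocks:
            by_block[block].add(var)
    # Only blocks reachable from two or more variables form an alias group.
    return {block: sorted(vs) for block, vs in by_block.items() if len(vs) > 1}


# Hypothetical analysis result: q may point to either a or b.
points_to = {"p": {"a"}, "q": {"a", "b"}, "r": {"b"}, "s": {"c"}}
groups = alias_groups(points_to)
```

Here `p` and `q` form one group (both may point to `a`), `q` and `r` form another, and `s` aliases nothing.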

Background (1 of 7).

● Static analysis.
➢ done on static representation of a program.
➢ does not require program execution.
➢ is conservative by definition.

● Dynamic analysis.
➢ done on traces of program executions.
➢ does not cover all possible behaviors.
➢ precise for a run of the program.


Background (2 of 7).

● Clients.
➢ program transformations that depend on pointer analysis.
➢ for instance, queries related to pointers and compiler optimizations.
➢ typically, query resolution time for clients is inversely proportional to pointer analysis time.

Background (3 of 7).

● Precision.
➢ a measure of how correct the information obtained from pointer analysis is.
➢ for pointer analysis, the required information is whether two pointers are aliases or non-aliases.
➢ dynamic analysis is precise with respect to the execution it observes.
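One way to picture precision is to compare the alias pairs reported by a static analysis against those observed in a single run. The sketch below is an illustration, not a metric taken from the surveyed papers; both points-to maps and the helper `alias_pairs` are made up for the example.

```python
from itertools import combinations


def alias_pairs(points_to):
    """Return all variable pairs whose points-to sets overlap."""
    pairs = set()
    for x, y in combinations(sorted(points_to), 2):
        if points_to[x] & points_to[y]:
            pairs.add((x, y))
    return pairs


# Hypothetical results: the static analysis over-approximates p's targets.
static_pts = {"p": {"a", "b"}, "q": {"a"}, "r": {"b"}}
dynamic_pts = {"p": {"a"}, "q": {"a"}, "r": {"b"}}  # observed in one run

static_pairs = alias_pairs(static_pts)
dynamic_pairs = alias_pairs(dynamic_pts)

# Pairs reported statically but never observed in this particular run.
unconfirmed = static_pairs - dynamic_pairs
```

A pair in `unconfirmed` is not necessarily a false positive: another execution might exhibit it, which is exactly why the dynamic result is precise only for the run it was collected from.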


Background (4 of 7).

● Efficiency.
➢ amount of time taken by an algorithm.

● Scalability.
➢ asymptotic time complexity of an algorithm.

An algorithm can be efficient, but not scalable.


Background (5 of 7).

● Flow-sensitivity.
➢ algorithm considers control flow in the program.

● Context-sensitivity.
➢ algorithm considers calling context of a function.

● Field-sensitivity.
➢ algorithm separates individual fields of an aggregate, from each other and from the aggregate itself.
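The effect of flow-sensitivity can be sketched on a three-statement, straight-line program. The statement encoding (`"addr"` for `p = &a`, `"copy"` for `p = q`) and both function names are assumptions of this sketch; a real flow-sensitive analysis also has to merge information at control-flow joins, which is omitted here.

```python
def flow_insensitive(stmts):
    """Ignore statement order: accumulate facts until nothing changes."""
    pts = {}
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            new = {rhs} if kind == "addr" else set(pts.get(rhs, set()))
            cur = pts.setdefault(lhs, set())
            if not new <= cur:
                cur |= new
                changed = True
    return pts


def flow_sensitive(stmts):
    """Follow statement order; an assignment overwrites the old facts."""
    pts = {}
    for kind, lhs, rhs in stmts:
        new = {rhs} if kind == "addr" else set(pts.get(rhs, set()))
        pts[lhs] = new  # strong update, valid for straight-line code
    return pts


# p = &a; q = p; p = &b;
program = [("addr", "p", "a"), ("copy", "q", "p"), ("addr", "p", "b")]
insensitive = flow_insensitive(program)
sensitive = flow_sensitive(program)
```

The flow-insensitive result reports that `p` and `q` may each point to `a` or `b`, while the flow-sensitive result at the end of the program keeps `p` pointing only to `b` and `q` only to `a`.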


Background (6 of 7).

● Unification-based.
➢ algorithm merges equivalence classes of variables in an assignment.
➢ less storage requirement.
➢ fast.
➢ low precision.
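The merging step can be sketched with a union-find structure in which every equivalence class has at most one pointee class. This is a simplified illustration in the spirit of unification-based analysis, not a full algorithm from any of the surveyed papers: it handles only `p = &a` and `p = q`, and the class name `Unifier` and the `$tN` placeholder names are assumptions of this sketch.

```python
class Unifier:
    def __init__(self):
        self.parent = {}   # union-find forest over variables and placeholders
        self.target = {}   # class representative -> a member of its pointee class
        self.fresh = 0

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        self.parent[ry] = rx
        ty = self.target.pop(ry, None)
        if ty is None:
            return
        tx = self.target.get(rx)
        if tx is None:
            self.target[rx] = ty
        else:
            self.union(tx, ty)  # merged classes must share one pointee class

    def pointee(self, x):
        r = self.find(x)
        if r not in self.target:
            self.fresh += 1
            self.target[r] = self.find(f"$t{self.fresh}")  # placeholder class
        return self.find(self.target[r])

    def address_of(self, p, a):  # p = &a
        self.union(self.pointee(p), a)

    def copy(self, p, q):        # p = q
        self.union(self.pointee(p), self.pointee(q))

    def points_to(self, p):
        r = self.find(p)
        if r not in self.target:
            return set()
        t = self.find(self.target[r])
        return {v for v in list(self.parent)
                if not v.startswith("$") and self.find(v) == t}


u = Unifier()
u.address_of("p", "a")   # p = &a
u.address_of("q", "b")   # q = &b
u.copy("p", "q")         # p = q  (merges the classes of a and b)
u.address_of("r", "c")   # r = &c
```

After `p = q`, the blocks `a` and `b` share a class, so `q` is reported as possibly pointing to `a` even though the program never assigns `&a` to `q`. That loss of precision is the price paid for the near-constant cost of each merge.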


Background (7 of 7).

● Inclusion-based (or subset-based or constraint-based).
➢ algorithm processes assignments directionally and each symbol is represented by a single node.
➢ more storage requirement.
➢ slower.
➢ high precision.
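Directional processing can be sketched as subset constraints solved by iterating to a fixed point. The statement encoding and the function name `inclusion_analysis` are assumptions of this sketch, and the naive loop stands in for the worklist and graph optimizations that practical implementations use.

```python
def inclusion_analysis(stmts):
    """Solve subset constraints for four statement kinds by fixed-point iteration.

    addr:  lhs = &rhs   ->  {rhs} is a subset of pts(lhs)
    copy:  lhs = rhs    ->  pts(rhs) is a subset of pts(lhs)
    load:  lhs = *rhs   ->  pts(t) is a subset of pts(lhs) for each t in pts(rhs)
    store: *lhs = rhs   ->  pts(rhs) is a subset of pts(t) for each t in pts(lhs)
    """
    pts = {}

    def get(v):
        return pts.setdefault(v, set())

    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":
                updates = [(lhs, {rhs})]
            elif kind == "copy":
                updates = [(lhs, set(get(rhs)))]
            elif kind == "load":
                merged = set()
                for t in list(get(rhs)):
                    merged |= get(t)
                updates = [(lhs, merged)]
            elif kind == "store":
                updates = [(t, set(get(rhs))) for t in list(get(lhs))]
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            for var, new in updates:
                cur = get(var)
                if not new <= cur:
                    cur |= new
                    changed = True
    return pts


# p = &a; q = &b; p = q; r = &p; *r = q; t = *r;
program = [
    ("addr", "p", "a"), ("addr", "q", "b"), ("copy", "p", "q"),
    ("addr", "r", "p"), ("store", "r", "q"), ("load", "t", "r"),
]
result = inclusion_analysis(program)
```

Because `p = q` only adds the targets of `q` to those of `p`, `q` still points to `b` alone. A unification-based analysis would merge `a` and `b` at that statement.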

Representative papers (1 of 4).

● Choi et al., Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects, POPL 1993.
● Andersen, PhD Thesis, 1994.
● Burke et al., Flow-insensitive interprocedural alias analysis in the presence of pointers, LCPC 1995.
● Reps et al., Precise interprocedural dataflow analysis via graph reachability, POPL 1995.


Representative papers (2 of 4).

● Steensgaard, Points-to analysis in almost linear time, POPL 1996.
● Ghiya et al., Is it a tree, a DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C, POPL 1996.
● Hind et al., Which pointer analysis should I use?, ISSTA 2000.


Representative papers (3 of 4).

● Cheng et al., Modular interprocedural pointer analysis using access paths: design, implementation, and evaluation, PLDI 2000.
● Liang et al., Evaluating the precision of static reference analysis using profiling, ISSTA 2002.
● Whaley et al., Cloning-based context-sensitive pointer alias analysis using binary decision diagrams, PLDI 2004.


Representative papers (4 of 4).

● Raman et al., Recursive data structure profiling, MSP 2005.
● Lattner et al., Making context-sensitive points-to analysis with heap cloning practical for the real world, PLDI 2007.

Discussion: similarities, differences.

● Flow-sensitive: Choi93, Ghiya96, Reps95, Whaley04.
● Context-sensitive: Andersen94, Cheng00, Ghiya96, Lattner07, Whaley04.
● Field-sensitive: Cheng00, Lattner07, Whaley04.
● Unification-based: Steensgaard96, Lattner07.
● Inclusion-based: Andersen94, Cheng00, Whaley04.

Discussion: trends (1 of 2).

● Recursion is handled using strongly-connected components.

● A recursive data structure is represented using a single representative node.

● Stack pointers are often treated in a different manner than heap pointers.

● For better precision, inclusion-based analyses are preferred. For better efficiency, unification-based analyses are preferred.
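The first trend can be illustrated by collapsing a call graph into strongly-connected components, so that mutually recursive functions are analyzed as one unit. The sketch below uses Tarjan's algorithm; the call graph and its function names are hypothetical.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each node to a list of successors."""
    index, low = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs


# Hypothetical call graph: parse and parse_expr are mutually recursive,
# eval calls itself.
call_graph = {
    "main": ["parse", "eval"],
    "parse": ["parse_expr"],
    "parse_expr": ["parse"],
    "eval": ["eval", "print"],
    "print": [],
}
sccs = strongly_connected_components(call_graph)
```

Tarjan's algorithm emits components with callees before their callers, so the resulting list can be processed front to back by an analysis that summarizes callees first.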

Discussion: trends (2 of 2).

● Flow-sensitivity does not improve precision significantly: pointers are typically not reassigned, and when they are, they usually point to another part of the same data structure, which is represented as a whole by a single node.

● Graph algorithms typically involve three phases: intraprocedural, bottom-up, and top-down.

● A single level of context-sensitivity proves sufficiently precise and efficient.
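What a single level of context-sensitivity buys can be seen on an identity function called from two sites. The following is a deliberately small toy model, assuming one callee `id(x)` that returns its argument and contexts named by call site; the function `analyze_identity_calls` and the site labels are made up for illustration.

```python
def analyze_identity_calls(calls, context_sensitive):
    """Toy model of calls of the form  result = id(&target).

    calls is a list of (call_site, result_var, target) triples.
    With context_sensitive=False all calls share one summary of id's
    parameter; with True, each call site gets its own (one level of context).
    """
    param = {}  # context -> points-to set of id's parameter
    for site, _, target in calls:
        ctx = site if context_sensitive else "any"
        param.setdefault(ctx, set()).add(target)

    result = {}
    for site, var, _ in calls:
        ctx = site if context_sensitive else "any"
        # id returns its parameter, so the result inherits its points-to set.
        result.setdefault(var, set()).update(param[ctx])
    return result


# p = id(&a) at call site cs1; q = id(&b) at call site cs2.
calls = [("cs1", "p", "a"), ("cs2", "q", "b")]
merged = analyze_identity_calls(calls, context_sensitive=False)
separated = analyze_identity_calls(calls, context_sensitive=True)
```

Without contexts, both `p` and `q` appear to point to either `a` or `b`; with one level of call-site context, the two calls stay separate.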

Discussion.

● Most of the papers differ in the techniques used to solve the pointer analysis problem.
● Representation of alias information differs a lot across techniques.
➢ matrices: Ghiya96.
➢ graphs: Das00, Lattner07, Raman05, Reps95, Steensgaard96.
➢ access-paths: Cheng00.
➢ ordered binary decision diagrams: Whaley04.

Directions for research (1 of 4).

● Complex data structures.
➢ most algorithms do not handle them well.
➢ occur when large hash tables, dictionaries, symbol tables form the main data structure of a program.
➢ need to characterize complexity of a data structure.
➢ adaptive algorithm depending on the complexity.


Directions for research (2 of 4).

● Out-of-order execution for multithreaded programs.
➢ some research done for multithreaded programs.
➢ none of the papers talk about the result of out-of-order execution of instructions on aliases in multithreaded programs.
➢ instructions may be reordered by compiler or hardware.


Directions for research (3 of 4).

● Combination of techniques.
➢ no single existing technique is best in all aspects.
➢ hybrid approaches are necessary.
➢ one way is to combine static pointer analysis with dynamic profile information.
➢ another way is to use an adaptive algorithm which internally uses the different sub-algorithms already invented.


Directions for research (4 of 4).

● Representation of alias information.
➢ history tells us that difference in the alias information representation often led to new algorithms.
➢ research on finding novel ways to represent aliases can be an interesting area to be explored.
