Understanding Exceptional Flow

15‑03‑2017 Tom den Braber 5 min.

Maybe you recognise the following situation. You are implementing a new feature, and you know that you can use a certain method, as it already covers some of the functionality you need. You briefly look at it, and you don’t see any exception handling constructs. The method documentation does not contain information about what exceptions can be thrown, for example via a @throws declaration. You conclude that there is no reason to think any exceptions are thrown or propagated by the method.
But can you be sure? To conclude that no exceptions can be thrown or propagated, you would need to trace every call that the method you are looking at could make, and repeat this process for every method you encounter along the way. You would also need to look up the definitions of all internal PHP functions and methods that are used, to see whether they throw any exceptions. This is an enormous amount of tedious work. It would be very helpful to have this process automated!
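
As a small illustration of the situation described above, consider the following sketch. The functions `averagePerItem()` and `describeOrder()` are hypothetical and only serve as an example; `intdiv()` is an internal PHP function that throws a `DivisionByZeroError` when the divisor is zero. Neither function contains a `throw` statement, a `try` block or a `@throws` tag, yet an exception can still propagate out of both.

```php
<?php

// Hypothetical helper: no throw statement, no try/catch, no @throws tag.
function averagePerItem(int $total, int $count): int
{
    // intdiv() is an internal PHP function; it throws a
    // DivisionByZeroError when the divisor is 0.
    return intdiv($total, $count);
}

// Hypothetical caller that looks just as "exception free" at first glance.
function describeOrder(int $total, int $count): string
{
    return 'Average price: ' . averagePerItem($total, $count);
}

try {
    $result = describeOrder(100, 0);
} catch (DivisionByZeroError $e) {
    // The error travelled through two functions before arriving here.
    $result = 'propagated: ' . get_class($e);
}

echo $result, PHP_EOL;
```

Finding this by hand means reading `describeOrder()`, then `averagePerItem()`, and then the documentation of `intdiv()`.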

That’s exactly why, as part of my MSc thesis project, I am currently working on a tool that models the exceptional flow of a program.
In this introductory post, the building blocks of this tool will be discussed briefly. In the posts to follow in this series, each of these building blocks will be covered in more depth.
The series will conclude with an overview of the results of using the tool on a number of open source projects.

Building blocks

The exception flow model consists of a few building blocks, which are visualised in the figure below.
[Figure: Building blocks for modelling the exception flow]

The system takes a complete PHP program as input. The code is parsed, which results in an Abstract Syntax Tree (AST). This AST serves as the basis for the complete analysis. First, the types of expressions are inferred and mapped back onto the AST. Next, the call graph of the program is created. Using the typed AST and the call graph together, the exception flow can be deduced.

Type inference

Because PHP is dynamically typed, the AST does not contain information about the types of expressions. These types are needed to construct the call graph and to detect which exceptions are thrown, so they have to be inferred. Before we can do type inference, however, we need a Control Flow Graph (CFG), as the paths that can be taken through the code during program execution determine what types a variable can have.
Note that a separate CFG is created for each function or method and that these CFGs are not connected by resolving the method and function calls.
When the CFGs are created, the types can be inferred. These types are mapped back to the AST. At this point, we have an AST which includes type information of expressions.
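
To illustrate why the paths through the code matter for type inference, here is a minimal sketch. Both `load()` and `mergeTypes()` are hypothetical names introduced for this example, and the merge step is a simplification rather than the algorithm used by the tool: it only shows that, where two paths in the CFG join, a variable can have any of the types it had on the incoming paths.

```php
<?php

// Hypothetical function under analysis: the type of $value depends on
// which branch of the if statement is executed.
function load(bool $fromCache)
{
    if ($fromCache) {
        $value = new ArrayObject();   // on this path: ArrayObject
    } else {
        $value = new SplStack();      // on this path: SplStack
    }

    return $value;                    // after the join: ArrayObject|SplStack
}

// Hypothetical helper: at a join point in the CFG, the possible types of a
// variable are the union of the types arriving over each incoming edge.
function mergeTypes(array ...$incoming): array
{
    $merged = array_unique(array_merge(...$incoming));
    sort($merged);

    return $merged;
}

$typesAtReturn = mergeTypes(['ArrayObject'], ['SplStack']);
echo implode('|', $typesAtReturn), PHP_EOL;
```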

Call graph construction

Because we want to know how exceptions can travel between functions and methods, we need to know for each method which calls it can make. Because we have done type inference, we can now decide for most expressions what type they have. If we encounter a statement like $a->m(), and we know the type of $a, we can limit the number of possible methods this expression resolves to. Polymorphism plays an important role here: if $a can also be an instance of a subclass, the overriding methods of that subclass are possible targets as well.
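
The effect of type information on call resolution can be sketched as follows. The classes `A`, `B` and `C` and the helper `resolveCall()` are hypothetical and exist only for this example; the helper uses PHP's runtime functions `is_a()` and `method_exists()` as a stand-in for what a static analysis would derive from the class hierarchy in the AST.

```php
<?php

class A { public function m(): string { return 'A'; } }
class B extends A { public function m(): string { return 'B'; } } // overrides A::m
class C { public function m(): string { return 'C'; } }           // unrelated to A

// Hypothetical helper: given the inferred type of the receiver and a method
// name, list the methods that a call like $a->m() may resolve to.
function resolveCall(string $receiverType, string $method, array $knownClasses): array
{
    $targets = [];
    foreach ($knownClasses as $class) {
        // Keep the receiver type itself and all of its subtypes.
        if (is_a($class, $receiverType, true) && method_exists($class, $method)) {
            $targets[] = $class . '::' . $method;
        }
    }

    return $targets;
}

// Knowing that $a is an A rules out C::m, but B::m stays a candidate
// because a B can be used wherever an A is expected.
$targets = resolveCall('A', 'm', ['A', 'B', 'C']);
echo implode(', ', $targets), PHP_EOL;
```

Without type information, every method named `m` would have to be treated as a possible target, including `C::m`.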

Inferring the exceptional flow

Now that we have the call graph and the AST with types, we can start inferring the exception flow. The analysis uses the notion of ‘scopes’ and ‘guarded scopes’ [1]. A scope in this context is a method or function, whereas a guarded scope is a try/catch/finally block. A guarded scope can be nested in another (guarded) scope.
An exception that is encountered within a (guarded) scope can originate from four different sources. To start with, the exception can be explicitly thrown using the throw statement. Secondly, the exception can be generated by a statement. This happens when the code causes an exception to occur without explicitly throwing it. For example, if a function declares that it returns an int but actually returns a non-numeric string, calling this function results in a TypeError. Thirdly, the exception can come from a call to a method or function that itself encounters an exception. The encountered exception is then propagated into the scope of the caller. Finally, an exception can be encountered in a scope because it was not caught in a nested guarded scope.
Using these sources, we can model the exceptional flow in a system. The exact algorithm will be covered in a later post.
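
The four sources described above can be illustrated with a small sketch. All function names below are hypothetical and chosen for this example only; the helper `encountered()` simply runs a scope and reports the class of whatever escapes from it.

```php
<?php

// Source 1: the exception is thrown explicitly.
function explicitThrow(): void
{
    throw new InvalidArgumentException('explicit');
}

// Source 2: the exception is generated by a statement. Returning a
// non-numeric string from a function declared to return int is a TypeError.
function generatedByStatement(): int
{
    return 'many';
}

// Source 3: the exception is propagated from a called function.
function propagatedFromCallee(): void
{
    explicitThrow();
}

// Source 4: the exception is not caught by a nested guarded scope.
function escapesGuardedScope(): void
{
    try {
        throw new RuntimeException('not handled below');
    } catch (LogicException $e) {
        // Never reached: RuntimeException is not a LogicException,
        // so the exception leaves the try block and enters this scope.
    }
}

// Hypothetical helper: report the class of the exception a scope lets escape.
function encountered(callable $scope): string
{
    try {
        $scope();
    } catch (Throwable $e) {
        return get_class($e);
    }

    return 'none';
}

$observed = [
    'explicit'   => encountered('explicitThrow'),
    'generated'  => encountered('generatedByStatement'),
    'propagated' => encountered('propagatedFromCallee'),
    'uncaught'   => encountered('escapesGuardedScope'),
];

foreach ($observed as $source => $class) {
    echo $source, ': ', $class, PHP_EOL;
}
```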

Wrapping up

In this post, all ingredients for building a tool that can model the exceptional flow were briefly discussed.
Stay tuned for the next post in this series, in which the subject of type inference will be covered in more depth.

References

[1] Robillard, M. P., & Murphy, G. C. (2003). Static analysis to support the evolution of exception structure in object-oriented systems. ACM Transactions on Software Engineering and Methodology (TOSEM), 12(2), 191-221.

