Demand-Driven Type Inference with Subgoal Pruning: Trading Precision ...

=0 cellpadding=0 cellspacing=0 width=100%>
Yahoo! is not affiliated with the authors of this page or responsible for its content.
Demand-Driven Type Inference with Subgoal Pruning: Trading Precision for Scalability
Demand-Driven Type Inference
with Subgoal Pruning:
Trading Precision for Scalability
S. Alexander Spoon and Olin Shivers
Georgia Institute of Technology
Abstract. After two decades of eort, type inference for dynamically
typed languages scales to programs of a few tens of thousands of lines
of code, but no further. For larger programs, this paper proposes using
a kind of demand-driven analysis where the number of active goals is
carefully restricted. To achieve this restriction, the algorithm occasionally
prunes goals by giving them solutions that are trivially true and thus
require no further subgoals to be solved; the previous subgoals of a newly
pruned goal may often be discarded from consideration, reducing the
total number of active goals. A specic algorithm DDP is described
which uses this approach. An experiment on DDP shows that it infers
precise types for roughly 30% to 45% of the variables in a program with
hundreds of thousands of lines; the percentage varies with the choice of
pruning threshold, a parameter of the algorithm. The time required varies
from an average of one-tenth of one second per variable to an unknown
maximum, again depending on the pruning threshold. These data suggest
that 50 and 2000 are both good choices of pruning threshold, depending
on whether speed or precision is more important.
1
Introduction
While dynamic programming languages have many advantages, e.g., supporting
productive environments such as Smalltalk [1, 2] and the Lisp Machine [3], they
share the fundamental weakness that less information about programs is imme-
diately apparent before the program runs. This lack of information aects both
programmers and their tools. Programmers require more time tracing through
a program in order to answer questions such as what kind of object is the
schema parameter of this method? Tools such as refactoring browsers [4] and
compilers [5, 6] are similarly hampered by the lack of type information. Refac-
toring browsers cannot make as many safe refactorings, and compilers cannot
optimize as much.
To counteract this lack of information, there have been a number of attempts
to infer types as often as possible for programs written in dynamically typed
languages [611]. The algorithms from this body of work are quite successful for
programs with up to tens of thousands of lines of code, but do not scale to larger
programs with hundreds of thousands of lines. For analyzing programs with hundreds of thousands of lines, this paper pro-
poses modifying existing algorithms in two ways: (1) make them demand-driven,
and (2) make them prune subgoals. The rest of this paper describes this ap-
proach in more detail, describing a specic algorithm DDP which instantiates
the approach, and giving experimental results showing DDPs eectiveness.
2
Previous Work
2.1
Type Inference in Dynamic Languages
The type-inference problem is similar for various dynamic languages, including
Smalltalk, Self, Scheme, Lisp, and Cecil. Algorithms that are eective in one
language are eective in the others. As pointed out by Shivers [12], all of these
languages share the diculty that the analysis of control and data ow is interde-
pendent; all of these language have and use dynamic dispatch on types and have
data-dependent control ow induced by objects or by higher-order functions.
Eorts on this problem date back at least two decades. Suzukis work in
1981 is the oldest published work that infers types in Smalltalk without any type
declarations [7]. More recently, in 1991, Shivers identied context selection, called
contour selection in his description, as a central technique and a central design
dierence among type-inference algorithms [6]. The core of Agesens Cartesian
Products Algorithm (CPA) is the insight of choosing contexts in a type-specic
way [9]. CPA selects contexts as tuples of parameter types; the algorithm gets
its name because the tuples are chosen as cartesian products of the possible
argument types for each method. More recently yet, Flanagan and Felleisen have
increased the speed and scalability of type-inference algorithms by isolating part
of the algorithm to work on individual modules; thus, their algorithm spends
less time analyzing the entire program [10]. Most recently of all, Garau has
developed a type inference algorithm for a subset of Smalltalk [11]. However,
the algorithm does not accurately model two important features of Smalltalk:
multiple variables may refer to the same object, and objects may reference each
other in cycles. Thus Garaus algorithm supports a much smaller subset of the
full language than is usual, and it cannot be directly compared to other work in
this area.
Unfortunately, the existing algorithms do not scale to dynamic programs
with hundreds of thousands of lines. Certainly, no successful results have been
reported for such large programs. Grove, et al., implemented a wide selection of
these algorithms as part of the Vortex project, and they report them to require
an unreasonable amount of time and memory for larger Cecil programs. [5] Their
experimental results for analyzing Cecil are summarized in Table 1. Clearly, 0-
CFA is the only algorithm of these that has any hope for larger programs; the
various context-sensitive algorithms tend to fail even on programs with twenty
thousand lines of code. However, even 0-CFA, a context-insensitive algorithm
that is faster but less precise, seems to require an additional order of magnitude
of improvement to be practical for most applications. With the single order of
magnitude in speed available today over the test machine of this study (a Sun
2 b-CPA
SCS
0-CFA 1,0-CFA 1,1-CFA 2,2-CFA 3,3-CFA
richards
4 sec
3 sec
3 sec
4 sec
5 sec
5 sec
4 sec
(0.4 klocs)
1.6 MB 1.6 MB
1.6 MB
1.6 MB
1.6 MB
1.6 MB
1.6 MB
deltablue
8 sec
7 sec
5 sec
6 sec
6 sec
8 sec
10 sec
(0.65 klocs) 1.6 MB 1.6 MB
1.6 MB
1.6 MB
1.6 MB
1.6 MB
1.6 MB
instr sched
146 sec 83 sec
67 sec
99 sec
109 sec
334 sec 1,795 sec
(2.0 klocs) 14.8 MB 9.6 MB
5.7 MB
9.6 MB
9.6 MB
9.6 MB 21.0 MB
typechecker
947 sec 13,254 sec (20.0 klocs) 45.1 MB 97.4 MB new-tc 1,193 sec 9,942 sec (23.5 klocs) 62.1 MB 115.4 MB compiler 11,941 sec
(50.0 klocs) 202.1 MB
Table 1. Type-inference performance data from Grove, et al. [5]. Each box gives the
running time and the amount of heap consumed for one algorithm applied to one
program. A entry means attempted executions that did not complete in 24 hours
on the test machine.
Ultra 1 running at 170 MHz), and with the cubic algorithmic complexity of
0-CFA[13], one can expect a program with one hundred thousand lines to be
analyzed in about 2.7 hours and to require 1600 MB of memory. A program
with two hundred thousand lines would require 21 hours and 13 GB of memory.
While one might quibble over the precise denition of scalable, 0-CFA is at
best near the limit. Grove, et al., agree with this assessment:
The analysis times and memory requirements for performing the var-
ious interprocedurally ow-sensitive algorithms on the larger Cecil pro-
grams strongly suggest that the algorithms do not scale to realistically
sized programs written in a language like Cecil [5].
2.2
Concrete Type Inference in Static Languages
Type inference (by which we mean concrete type inference or class analysis)
is a much easier problem in languages like Java and C which have static type
information available, and there are already scalable analyses for these languages.
It is not a trivial problem, because better types may be inferred than the type
system provides, but it is nevertheless a much easier problem.
One way the problem is easier is that context sensitivity is not necessary. For
example, Tip and Palsberg have developed an eective analysis that does not
need context [14]. In fact, Tip and Palsberg argue that existing context-sensitive
algorithms in Java should be abandoned as impractical:
In practice, the scalability of the algorithms at either end of the
spectrum is fairly clear. The CHA and RTA algorithms at the low end
3 of the range scale well and are widely used. The k-CFA algorithms (for
k > 0) at the high end seem not to scale well at all.[14]
It appears that the static types in static languages are helping by bounding
the distance that an imprecise type may ow. For some intuition about this
eect, consider the following example. Suppose we dene an identity method,
yourself, in Smalltalka simple polymorphic method that returns whatever is
passed to it. Consider this use of the yourself message:
x := x yourself
This statement has no eect on the value of x, much less its type. With a context-
free analysis, however, every invocation of yourself must be assigned the same
result type, and thus the presence of this statement causes the type of x to be
polluted by the types of every other sender of yourself in the entire program.
Contrast this result to a context-sensitive analysis. With CPA, the yourself
method is analyzed multiple times, once for each type of argument supplied to it.
With 1-CFA [6], the method is analyzed once for each place it