Typed Memory Management via Static Capabilities
ediate language, called the Capability Language (CL), that supports
region-based memory management and enjoys a provably safe type system. Unlike previous region-
based type systems, region lifetimes need not be lexically scoped, and yet the language may be
checked for safety without complex analyses. Therefore, our type system may be deployed in
settings such as extensible operating systems where both the performance and safety of untrusted
code is important. The central novelty of the language is the use of static capabilities to specify the
permissibility of various operations, such as memory access and deallocation. In order to ensure
capabilities are relinquished properly, the type system tracks aliasing information using a form of
bounded quantication. Moreover, unlike previous work on region-based type systems, the proof
of soundness of our type system is relatively simple, employing only standard syntactic techniques.
In order to show how our language may be used in practice, we show how to translate a variant of
Tofte and Memory-Management-via-Static-Capabilities/' >Talpins high-level type-and-eects system for region-based memory management into
our language. When combined with known region inference algorithms, this translation provides
a way to compile source-level languages to CL.
Categories and Subject Descriptors: D.3.1 [Programming Languages]: Formal Denitions and
TheorySemantics, Syntax; D.3.4 [Programming Languages]: ProcessorsCompilers; F.3.2
[Logics and Meanings of Programs]: Semantics of Programming LanguagesOperational
Semantics; F.3.3 [Logics and Meanings of Programs]: Studies of Program ConstructsType
Structure
General Terms: Languages, Theory, Verication
Additional Key Words and Phrases: Certied code, region-based memory management, type-
directed compilation, typed intermediate languages
This material is based on work supported in part by the AFOSR grant F49620-97-1-0013 and
the National Science Foundation under Grant No. EIA 97-03470. Any opinions, ndings, and
conclusions or recommendations expressed in this publication are those of the authors and do not
reect the views of these agencies. This research was performed while the rst and second authors
were at Cornell University.
Authors addresses: David Walker, Karl Crary, Carnegie Mellon University, Computer Science
Department, 5000 Forbes Avenue, Pittsburgh, PA 15213. Greg Morrisett, Cornell University,
Computer Science Department, Upson Hall, Ithaca, NY 14853.
Permission to make digital/hard copy of all or part of this material without fee for personal
or classroom use provided that the copies are not made or distributed for prot or commercial
advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and
notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish,
to post on servers, or to redistribute to lists requires prior specic permission and/or a fee.
c 2000 ACM 0164-0925/00/0500-0431 $5.00
ACM Transactions on Programming Languages and Systems, Vol. TBD, No. TDB, Month Year, Pages 171.
2
·
D. Walker, K. Crary, and G. Morrisett
1.
MOTIVATION AND BACKGROUND
A Memory-Management-via-Static-Capabilities/' target='blank' class='doin' >current trend in systems software is to allow untrusted extensions to be installed
in protected services, relying upon language technology to protect the integrity of
the service instead of hardware-based protection mechanisms [Lindholm and Yellin
1996; Wahbe et al. 1993; Bershad et al. 1995; Necula 1997; Morrisett et al. 1998;
Kozen 1998; Hawblitzel et al. 1998]. For example, the SPIN project [Bershad et al.
1995] relies upon the Modula-3 type system to protect an operating system kernel
from erroneous extensions. Similarly, Web browsers rely upon the Java Virtual
Machine byte-code verier [Lindholm and Yellin 1996] to protect users from mali-
cious Memory-Management-via-Static-Capabilities/' target='blank' class='doin' >applets. In both situations, the goal is to eliminate expensive communications
or boundary crossings by allowing extensions to directly access the resources they
require.
Recently, Necula and Lee [Necula and Lee 1996; Necula 1997] have proposed
Proof-Carrying Code (PCC) and Morrisett et al. [1999; 1998] have suggested Typed
Assembly Language (TAL) as language technologies that provide the security ad-
vantages of high-level languages, but without the overheads of interpretation or
just-in-time compilation. In both systems, low-level machine code can be heavily
optimized, by hand or by compiler, and yet be automatically veried through proof-
or type-checking.
However, in all of these systems (SPIN, JVM, TAL, and Touchstone [Necula
and Lee 1998], a compiler that generates PCC), there is one aspect over which
programmers and optimizing compilers have little or no control: memory man-
agement. In particular, their soundness depends on memory being reclaimed by
a trusted garbage collector. Hence, Memory-Management-via-Static-Capabilities/' target='blank' class='doin' >applets or kernel extensions may not perform
their own optimized memory management.
Furthermore, as garbage collectors
tend to be large, complicated pieces of unveried software, the degree of trust in
language-based protection mechanisms is diminished.
The goal of this work is to provide a high degree of control over memory man-
agement for programmers and compilers, but as in the PCC and TAL frameworks,
make verication of the safety of programs a straightforward task.
1.1
Regions
Tofte and Talpin [1994; 1997] suggest a type-and-eects system for verifying the
soundness of region-based memory management. In later work, Tofte and others
show how to infer region types and lifetimes and how to implement their the-
ory [Tofte and Birkedal 1998; Birkedal et al. 1993; Birkedal et al. 1996]. There are
several advantages to region-based memory management; from our point of view,
the two most important are:
(1) Region-based memory management can be implemented using relatively simple
constant-time routines.
(2) All memory operations are explicit in the program text, but safety is guaran-
teed.
The rst advantage has several implications. If regions are used in a secure system
then the simplicity of the implementation leads to a smaller trusted computing
base. Moreover, it may be possible to formally verify that the region operations are
implemented correctly. In contrast, a formal analysis of a garbage collector would
ACM Transactions on Programming Languages and Systems, Vol. TBD, No. TDB, Month Year.
Typed Memory Management via Static Capabilities
·
3
be a much more onerous task. Second, because region operations are constant-time
and do not trace the structure of the heap, programs do not suer from the pauses
that are associated with conventional garbage collectors. Consequently, region-
based memory management systems may be a practical alternative to real-time
garbage collectors [Baker 1978; Wilson 1992].
The second advantage gives programmers greater control over memory use. By
using a region-proler [Birkedal et al. 1993], programmers can quickly identify the
memory regions that are causing performance problems in their applications. Next,
because allocation and deallocation operations are explicit in the program text, pro-
grammers can use the proling data to accurately relate the run-time behavior of
programs to their static representation. In other words, given information about
the ways regions are used at run time, it is often straightforward to examine pro-
gram code, identify memory-intensive routines, and reason about the lifetimes of the
data structures allocated there. Once the trouble spots have been identied, pro-
grammers can concentrate their optimization eorts on a small portion of the code.
Most importantly, throughout the programming process, a type checker guarantees
that all memory operations are safe. More specically, it prevents dereferencing
a pointer to an object that has been deallocated so programmers do not have to
worry about programs crashing due to memory faults. It also ensures that every
memory region that is allocated in a program is later deallocated (assuming the
program does not enter an innite loop).
In order to ensure that regions are used safely, the Tofte-Talpin language includes
a lexically scoped expression letregion r in e end that delimits the lifetime
of a region r. A region is allocated when control enters the scope of the letregion
construct and is deallocated when control leaves the scope. Programs may allocate
values into live regions using the notation v at r. These values may be used until
the region is deallocated. For example,
.
.
.
Region lifetime
letregion r in
% Allocate region r
let x = v at r in
% Allocate value v in r
f (r,x)
% Function f may access r
end
% Deallocate r (and v)
.
.
.
Tofte and Talpin ensure that deallocated values are not accessed unsafely using a
type-and-eects system. Informally, whenever an expression uses a value in region
r, the type system expresses this fact using the eect access(r). However, eects
occuring within the scope of the letregion construct are masked. More specically,
if the expression e has eects access(r) (for some set of eects ) then the
overall eect of the expression letregion r in e end is simply . Hence, if there
is no overall eect for an entire program then every region access must have occured
within the scope of the corresponding letregion construct. In other words, values
in region r are used only during the lifetime of r and not before or after. If this
condition holds, we can conclude the program is safe.
The Tofte-Talpin language makes ecient use of memory provided that the life-
times of values coincide with the lexical structure of the program. However, if
ACM Transactions on Programming Languages and Systems, Vol. TBD, No. TDB, Month Year.