Architecture for a Next-Generation GCC
llspacing=0 width=100%>Yahoo! is not affiliated with the authors of this page or responsible for its content.
Architecture for a Next-Generation GCC
Architecture for a Next-Generation GCC
Chris Lattner
Vikram Adve
University of Illinois at Urbana, Champaign
{lattner, vadve}@cs.uiuc.edu
http://llvm.cs.uiuc.edu
Abstract
This paper presents a design and implementation
of a whole-program interprocedural optimizer built
in the GCC framework.
Through the introduc-
tion of a new language-independent intermediate
representation, we extend the current GCC archi-
tecture to include a powerful mid-level optimizer
and add link-time interprocedural analysis and op-
timization capabilities. This intermediate represen-
tation is an SSA-based, low-level, strongly-typed,
representation which is designed to support both
ecient global optimizations and high-level anal-
yses.
Because most of the program is available
at link-time, aggressive whole-program optimiza-
tions and analyses are possible, improving the time
and space requirements of compiled programs. The
nal proposed organization of GCC retains the im-
portant features which make it successful today, re-
quires almost no modication to either the front-
or back-ends of GCC, and is completely compatible
with user makeles.
1
Introduction
The GNU Compiler Collection (GCC) [15] is in
many ways the centerpiece of the Free Software
movement. It supports several source languages and
a plethora of back-ends for various targets, provid-
ing a unied target for free software.
GCC has
been successful because of its extreme portability,
stability, and because it is able to compile and op-
timize several popular source languages (C, C
++
,
Java, etc) to each target. Unfortunately, despite the
success of the GCC compiler suite as a whole, the
optimization infrastructure is still not competitive
with commercial compilers.
Over the years, the GCC optimizer has evolved from
compiling a statement at a time, to compiling and
optimizing entire functions at a time, to the (still
very new) support for unit-at-a-time compilation
(compiling and optimizing all of the functions in a
translation unit together). As the scope for analysis
and optimizations increases, the compiler is better
able to reduce the time and space requirements for
the generated code.
This paper proposes the next logical step for the
GCC optimizer: extend it to be able to analyze and
optimize whole programs at link-time
1
, enabling new
optimizations and making existing analyses and op-
timizations more powerful. For example:
inlining across translation units
whole-program alias analysis
interprocedural register allocation
interprocedural constant propagation
data layout optimizations
exception handling space optimizations
sorting initializer priorities at link-time
The key challenges to whole-program optimization
are to enable powerful transformations while keep-
ing compile times reasonable, and to keep the user-
visible development process unchanged (e.g. user
makeles).
The architecture that we propose is based on a new
language-independent low-level code representation
that preserves important type information from the
source code. The use of a low-level, SSA-based rep-
resentation allows the compiler to perform a variety
of optimizations at compile time, o-loading work
from the link-time optimizer. However, the link-
time optimizer can only perform meaningful opti-
mizations on the program if it has enough high-level
information about the program to prove that aggres-
sive optimizations are safe. Because of this, the low-
level code representation is typed (using a language-
independent constructive type system) and directly
1
This capability would be optional and could be enabled
only when the program is compiled at the -O4 level of op-
timization, for example.
Optimizing Linker
.o files
Front-End Compiler
GCC FE Mid-Level Opt
GCC FE Mid-Level Opt
.a files
Source File
Source File
Source File
GCC FE Mid-Level Opt
"LLVM"
Assembly
Link
Whole Program
Analysis and
Optimization
RTL Expansion
Native Codegen
.s file
Native
Assembler
native
.a /.so
files
Native
Linker
".exe"
file
Unmodified binutils Toolchain
Source File
GCC FE Mid-Level Opt
Figure 1: High-Level Compiler Architecture for Whole-Program Optimization
exposes information about structure and array ac-
cesses to the optimizer.
The link-time optimizer is designed to combine the
translation units of a program together and do the
nal whole-program optimization. After the pro-
gram is optimized, machine code is generated at
link-time for the entire program at once, allowing a
variety of interprocedural low-level code optimiza-
tions to be performed.
The Low-Level Virtual Machine (LLVM) [10] is an
implementation of the architecture and intermediate
representation [11] described in this paper, which al-
lows us to be more concrete when describing aspects
of the design. This system has served as the host
for several research projects [7, 13, 12] which require
whole-program information as well as a host for a
variety of traditional compiler optimizations.
We hope that the lessons learned by the LLVM
project will be useful to the GCC community, and
are willing to contribute as much code to the GNU
project as there is interest in. We are planning to
have our rst public release of LLVM, with a liberal
license, in the Summer of 2003. However, LLVM will
only be discussed when it helps clarify the ideas in
the proposed architecture, this paper is intended to
be a GCC paper, not an LLVM paper.
This paper is organized as follows: Section 2 de-
scribes the proposed high-level architecture in de-
tail, including modications that would need to be
made to the GCC infrastructure.
Section 3 de-
scribes important aspects of the proposed interme-
diate representation for the system. Section 4 de-
scribes LLVM, our existing implementation of the
proposed design. Section 5 describes other work re-
lated to the proposed design, and Section 6 wraps
up the paper.
2
High-Level Compiler Architecture
The proposed high-level architecture is illustrated in
Figure 1. The essential aspect of this design is that
it separates the current cc1 program into two com-
ponents: a front-end compiler and an optimizing
linker. The front-end retains all of the responsibili-
ties of current GCC front-ends (preprocessing, lex-
ical analysis, parsing, semantic analysis, etc..) and
should work unmodied in the new system. After
each function is parsed and checked for semantic
errors it is expanded from the tree representa-
tion to the new language-independent intermediate
representation (described in Section 3). Once the
entire translation unit has been translated (and if
no errors have occurred), a standard set of mid-level
optimizations are performed on the translated mod-
ule. After these optimizations are nished, a .o
le is emitted which contains IR assembly code for
the representation.
When the optimizing linker is invoked, it reads in all
of the translated IR les and any libraries compiled
to the intermediate representation. It links these
les together into a single-le representation of the
program, on which it can run whole-program analy-
ses and optimizations. Finally, once these analyses
and transformations are complete, the GCC back-
end is invoked to expand the intermediate repre-
sentation into RTL and use the congured target
description to produce a native .s le.
After the optimizing linker produces a native .s le,
the compilation process proceeds through the stan-
dard system assembler and linker (to resolve any
symbols in libraries that were not available in the
IR form), nally producing a native executable.
2.1
Compatibility and Implementation
One of the key features of this design is that it is
compatible with the standard compile and link
models of compilation, and is thus fully compati-
ble with existing makeles. In order to provide this
compatibility, the link phase of the gcc compiler
driver is extended to invoke the optimizing linker
and system assembler (if necessary) during the stan-
dard link step of the compile process. In this way,
any input les that are in the IR format are au-
tomatically linked together and optimized without
interfering with the compilation and linking of stan-
dard translation units and libraries. If no les in the
IR format are present, the entire invocation of the
optimizing linker is skipped.
Another important aspect of the design is how
the compiler works when whole-program optimiza-
tion is not enabled. If not enabled, each transla-
tion unit is either compiled a function at a time
or a unit at a time (depending on the setting of
the -funit-at-a-time switch), through the mid-
level optimizer, RTL expansion, and code genera-
tion phases of the compiler. This produces a native
.s le, which can be processed with the standard
system assembler and linker, as before.
For this approach to be feasible, a large amount of
code must be shared between the optimizing linker
and the compiler front-ends. This can either be
accomplished through the use of libraries that are
shared between the two (which would contain the
existing GCC back-end, and any shared optimiza-
tions on the IR), or by making both logical pieces be
part of the same binary. In either case, the actual
or