A Principled Approach to Operating System Construction in Haskell

-1 color=black>Yahoo! is not affiliated with the authors of this page or responsible for its content.
A Principled Approach to Operating System Construction in Haskell
A Principled Approach to
Operating System Construction in Haskell
Thomas Hallgren
Mark P Jones
OGI School of Science & Engineering
Oregon Health & Science University
http://www.cse.ogi.edu/~hallgren/
http://www.cse.ogi.edu/~mpj/
Rebekah Leslie
Andrew Tolmach
Department of Computer Science
Portland State University
http://www.cs.pdx.edu/~rebekah/
http://www.cs.pdx.edu/~apt/
Abstract
We describe a monadic interface to low-level hardware features that
is a suitable basis for building operating systems in Haskell. The
interface includes primitives for controlling memory management
hardware, user-mode process execution, and low-level device I/O.
The interface enforces memory safety in nearly all circumstances.
Its behavior is specied in part by formal assertions written in a
programming logic called P-Logic. The interface has been imple-
mented on bare IA32 hardware using the Glasgow Haskell Com-
piler (GHC) runtime system. We show how a variety of simple O/S
kernels can be constructed on top of the interface, including a sim-
ple separation kernel and a demonstration system in which the ker-
nel, window system, and all device drivers are written in Haskell.
Categories and Subject Descriptors
D.3.2 [Programming Lan-
guages]: Language ClassicationsApplicative (functional) lan-
guages; D.4.0 [Operating Systems]: Organization and Design;
D.4.5 [Operating Systems]: ReliabilityVerication
General Terms
Languages,Design,Verication
Keywords
Operating systems, Haskell, hardware interface, mon-
ads, programming logic, verication
1.
Introduction
Systems software often contains bugs that cause system failures,
security violations, or degraded performance. One reason for the
high bug rate is that most of this software is written in C or C++,
which lack strong static typing and memory safety. For example,
many security failures are due to buffer over-runs that could have
been avoided simply by using a programming language that en-
forced type safety and bounds checking.
Writing systems software in a relatively low-level implementa-
tion language makes it hard to assure that the software obeys key
specications. For example, we might wish to verify formally that
an operating system maintains strict data separation between its
processes. If the O/S is written in C, this will be a very challeng-
ing task, because reasoning about the program must be performed
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for prot or commercial advantage and that copies bear this notice and the full citation
on the rst page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specic permission and/or a fee.
ICFP 05
September 26-28, 2005, Tallinn, Estonia.
Copyright c 2005 ACM 1-59593-064-7/05/0009. . . $5.00.
at a very low level. Again, using a programming language with a
cleaner and safer semantics would ease the reasoning task.
Given these goals, Haskell is an attractive language for systems
programming. The core of Haskell is type-safe and memory-safe,
which prevents many classes of bugs, and also pure, which eases
reasoning about program behavior. In addition, Haskells highly
expressive type system can be used to capture important program
properties without the need for additional proofs.
Systems software needs to interact directly with machine hard-
ware, which can be accomplished in Haskell by using the built-in
IO monad and the Foreign Function Interface (FFI) extensions. Un-
fortunately, this extended system includes raw pointers and pointer
arithmetic, which allow writes to arbitrary memory locations and
can corrupt the Haskell heap. It also includes unsafePerformIO,
which can be used to manufacture unsafe type casts. Of course, just
as in C, these problems can be addressed by careful coding, but
we would like a safer infrastructure instead.
In this paper, we describe the design, implementation, and ap-
plication of a restricted monadic framework that achieves this goal.
The monad provides access to hardware facilities needed to build
an operating system on the Intel IA32 architecture [13], including
virtual memory management, protected execution of arbitrary user
binaries, and (low-level) I/O operations. The interface is memory-
safe in almost all circumstances; the only possible safety violations
are ones that occur via abuse of a device DMA controller. More-
over, unlike the full IO monad, it is small enough that we can enu-
merate useful properties about it as a basis for reasoning about the
behavior of its clients. For example, we can assert that executing
a program in user space has no impact on kernel data structures.
Such properties can be viewed as part of the specication of the in-
terface. We give them as formulas in P-Logic [18], a programming
logic for Haskell that has been developed as part of the Progra-
matica project [22], which is an on-going investigation into using
Haskell for high-assurance development.
We are using this hardware monad as the basis for some ex-
perimental operating system kernels that exploit Haskells strengths: House is a small operating system coded almost entirely in
Haskell. It builds on the previous hOp project conducted by
S磂bastien Carlier and Jeremy Bobbio [4]. The system includes
device drivers, a simple window system, a network protocol
stack, and a command shell window in which a.out les can
be loaded (via TFTP) and executed in user-mode. To our knowl-
edge, this is the rst functional-language operating system that
supports execution of arbitrary user binaries (not just programs
written in the functional language itself). Osker, currently in development, is a microkernel coded in
Haskell, based on L4 [25], for which we hope to prove some key security properties, including a strong notion of separation be-
tween multiple domains and/or virtual machines that are hosted
on a single computer. We expect these proofs to be based on a
combination of type-based and ad-hoc reasoning. The hardware
monad properties will be key axioms in the ad-hoc proofs. The
architecture of this kernel makes heavy use of Haskells type
classes, especially to describe monad transformers.
We have implemented the hardware monad directly on top of
IA32 hardware. Our implementation uses a modied version of
the Glasgow Haskell Compiler (GHC) runtime system [10], based
on that of hOp. We retain most services of the runtime system,
including the garbage collector and, optionally, the Concurrent
Haskell multi-threading primitives. In place of the usual le-based
I/O, we provide direct access to the IA32s I/O ports and memory-
mapped regions, and we add support for direct access to paging
hardware and protection mode switching. In effect, this combines
the runtime system and the operating system in a single entity.
In addition to this real implementation, we have developed a
model implementation of (most of) the hardware monad entirely in
pure Haskell 98. This version is built on an abstract model of mod-
ern CPUs that uses a combination of state and continuation mon-
ads to implement the mechanisms of virtual memory and protected
mode execution [16]. The pure Haskell implementation provides
us with a more formal setting in which to validate the consistency
of the properties that we have specied for the hardware monad.
For example, others in our group are working towards mechanized
proofs for many of these properties using Isabelle, and based on
the semantics of Haskell instead of informal assumptions about any
specic CPU platform. However, we omit further discussion of this
work from the current paper.
Related work
The idea of applying functional languages to sys-
tems programming has a long history. Early examples of operating
systems implemented in functional languages include Nebula [17]
and the Kent Applicative Operating System [24, 6].
A more recent project, Hello [9], implements an operating sys-
tem in Standard ML and addresses various language design and
efciency issues, like how to access hardware devices and how to
handle interrupts in a garbage-collected language. It builds on the
results of the Fox project,where Standard ML was used for systems
programming. In particular, it includes FoxNet, an efcient imple-
mentation of the TCP/IP protocol stack [2]. Compared to these
projects, one new feature of our hardware monad is its support
for controlling memory management hardware, which allows us to
run code written in other languages safely.
Functional languages have also been applied when real-time
constraints are an issue. Embedded Gofer [27] has an incremen-
tal garbage collector in addition to other language features needed
to program embedded controllers [28]. We have not seriously ad-
dressed real-time issues in our project yet, but if we do, then we
might nd it necessary to switch to a version of GHC that uses an
incremental garbage collector [5].
Our goal of making systems programming safer is shared by
projects using other high-level languages. SPIN [23] takes advan-
tage of the type safety of Modula-3 [3] to ensure safety in an ex-
tensible O/S kernel. Cyclone [14, 7] is a language deliberately de-
signed as a safe replacement for C.
Modern PC hardware is complex, and building an operat