Logic Emulation with Virtual Wires
Jonathan Babb, Russell Tessier, Matthew Dahl,
Silvina Hanono, David Hoki, and Anant Agarwal
MIT Laboratory for Computer Science
Cambridge, MA 02139
Abstract
Logic emulation enables designers to functionally verify
complex integrated circuits prior to chip fabrication. How-
ever, traditional FPGA-based logic emulators have poor
inter-chip communication bandwidth, commonly limiting
gate utilization to less than 20 percent. Global routing con-
tention mandates the use of expensive crossbar and PC-board
technology in a system of otherwise low-cost, commodity
parts. Even with crossbar technology, current emulators only
use a fraction of potential communication bandwidth be-
cause they dedicate each FPGA pin (physical wire) to a sin-
gle emulated signal (logical wire). Virtual Wires overcome
pin limitations by intelligently multiplexing each physical
wire among multiple logical wires and pipelining these con-
nections at the maximum clocking frequency of the FPGA.
The resulting increase in bandwidth allows effective use of
low dimension, direct interconnect. The size of the FPGA
array can be decreased as well, resulting in low cost logic
emulation.
This paper covers major contributions of the MIT Vir-
tual Wires project. In the context of a complete emula-
tion system, we analyze phase-based static scheduling and
routing algorithms, present Virtual Wires synthesis method-
ologies, and overview an operational prototype with 20K-
gate boards. Results, including in-circuit emulation of a
SPARC microprocessor, indicate that Virtual Wires elimi-
nate the need for expensive crossbar technology while in-
creasing FPGA utilization beyond 45 percent. Theoretical
analysis predicts that Virtual Wires emulation scales with
FPGA size and average routing distance, while traditional
emulation does not.
1 Introduction
Field Programmable Gate Array (FPGA) based logic emulators
are capable of emulating complex logic designs at clock
speeds four to six orders of magnitude faster than software
simulators. This performance is achieved by partitioning a
logic design, described by a netlist, across an interconnected
array of FPGAs. The netlist partition on each FPGA,
configured directly into logic circuitry, is then executed at near
hardware speeds.

Figure 1: Verification Alternatives
(Plot of compilation time against execution time, with ticks
from Minute through Hour, Day, Week, and Month to Year,
locating logic simulation, accelerated simulation, logic
emulation, and final silicon.)

Figure 1 compares logic emulation to other prototyping
methods, including simulation and accelerated simulation,
as well as to final silicon. The y-axis measures relative
time for compiling or constructing a hypothetical design,
while the x-axis measures relative time for executing one
set of test vectors on this design. As an example, consider
final silicon which takes months to construct and runs a set
of vectors in less than one minute. The same design and
vector set could be compiled for a logic simulator on the
order of minutes, but would take years to execute. Logic
emulation fills a wide gap between simulation and actual
silicon. With both a moderately fast compile time and a fast
execution time, emulation offers a compromise between the
programmability of software and the fast execution speed of
hardware.
Logic emulators are further characterized by interconnection
topology, target FPGA, and supporting software. The
interconnection topology describes the arrangement of FPGA
devices and routing resources. Example interconnects include
full crossbars and two-dimensional meshes. Important target
FPGA properties include gate count, pin count, and mapping
efficiency. Supporting software is extensive, combining
netlist translators, logic optimizers, technology mappers,
global and FPGA-specific partitioners, placers, and routers.

Figure 2: Partition Limitation Scenarios
  Not Limited:   unused FPGA pins, unused FPGA gates
  Gate Limited:  some unused pins, no unused gates
  Pin Limited:   no unused pins, some unused gates
  Balanced:      no unused pins, no unused gates

Traditional emulators are gate inefficient due to inherent
pin limitations in the FPGA devices. To reduce pin
limitations, these emulators supplement FPGAs with custom
crossbar chips and expensive PC-board and backplane
technology, further increasing the per-gate cost of emulation.
This paper suggests an alternative solution to pin limitations
based on multiplexing of FPGA resources.
1.1 Virtual Wires
In existing emulator architectures, both the logic
configuration and the network connectivity remain fixed for
the duration of the emulation. Every emulated partition of the
input design, one per FPGA, consists of a set of gates and
a set of signals communicating to other partitions. Each
emulated gate is mapped to one or more FPGA equivalent
gates and each inter-partition emulated signal is allocated to
a pair of pins between two FPGAs. Thus for a partition to be
feasible, the partition gate and pin requirements must be no
greater than the available FPGA resources. These constraints
yield four possible scenarios (Figure 2).
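
As a rough illustration of the feasibility condition and the four
scenarios of Figure 2, the following Python sketch classifies one
partition against one device. The names Fpga and classify_partition
and the capacity numbers are hypothetical, and treating "no unused"
resources as exact equality is an idealization made for this example,
not something the paper specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fpga:
    """Hypothetical device description (not taken from the paper)."""
    gates: int  # usable FPGA equivalent gates
    pins: int   # usable I/O pins


def classify_partition(gates_needed: int, pins_needed: int, fpga: Fpga) -> str:
    """Return the Figure 2 scenario for a partition, or 'infeasible'.

    A partition is feasible only if neither its gate requirement nor
    its pin requirement exceeds what the device offers.
    """
    if gates_needed > fpga.gates or pins_needed > fpga.pins:
        return "infeasible"
    gates_full = gates_needed == fpga.gates  # idealized: no unused gates
    pins_full = pins_needed == fpga.pins     # idealized: no unused pins
    if gates_full and pins_full:
        return "balanced"
    if pins_full:
        return "pin limited"    # gates left over, pins exhausted
    if gates_full:
        return "gate limited"   # pins left over, gates exhausted
    return "not limited"


if __name__ == "__main__":
    device = Fpga(gates=10_000, pins=200)  # assumed example capacities
    print(classify_partition(4_000, 200, device))
```

In practice a partitioner would use tolerances rather than exact
equality, but the ordering of the checks would stay the same.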
When typical circuits are mapped onto available FPGA
devices, partitions are predominantly pin limited. That is,
not all available FPGA gates can be utilized, due to a lack of
pin resources to support them. We demonstrate the resulting
bandwidth gap with a set of partitionings of the Sparcle
and CMMU benchmarks (see Section 5.1) for various gate
counts. Figure 3 shows the resulting curves, plotted on a
log-log scale. Partition gate count is scaled by a factor of
two to get FPGA equivalent gates with an assumed mapping
efficiency of 50%. On the same plot we show the pin and
gate capacity of target FPGAs: the Xilinx 3000 and 4000
series [40], the Altera Flex 8000 series [3], and the Atmel
6000 series [5]. For equal average gate counts in the
benchmark partitions and FPGA devices, the required average pin
counts for partitions are much greater than the available pin
capacity of the FPGAs.

Figure 3: Pin Count as a Function of FPGA Partition Size
(Log-log plot of FPGA / partition pin count, 100 to 1000,
against FPGA / partition gate count, 100 to 100000. Series:
Alewife Cache Controller partitions, Sparcle partitions,
Xilinx 3000 & 4000 FPGAs, Xilinx 4000H FPGAs, Atmel FPGAs,
Altera FPGAs, Altera MCM. An annotation marks the
"Bandwidth Gap".)

Pin limits set a hard upper bound on the usable gate count
that an FPGA can provide, whatever its gate capacity. Low
utilization of gate resources increases both the number of FPGAs
needed for emulation and the time required to emulate a
particular design. This discrepancy will only get worse as
technology scales; current trends indicate that available gate
counts are increasing faster than available pin counts.
Future breakthroughs in area I/O [27] may partially address this
problem for FPGA packaging, but will leave open the more
difficult issues of inter-board and system-level communication.
Additionally, any new technology will be challenged
to keep up as minimum feature size decreases faster than
required bonding area.
Virtual Wires eliminate the pin limitation problem of previous
emulators by intelligently multiplexing each physical
wire among multiple logical wires and pipelining these
connections at the maximum clocking frequency of the FPGA
(see footnote 1).
A Virtual Wire represents a simple connection between a
logical output on one FPGA and a logical input on another
FPGA. Established via a pipelined, statically-routed com-
munication network, these Virtual Wires increase available
off-chip communication bandwidth by multiplexing the use
of FPGA pin resources (physical wires) among multiple em-
ulation signals (logical wires).
Without Virtual Wires, one-to-one allocation of logical
wires to physical wires does not exploit available pin
bandwidth because:
- emulation clock frequencies are one or two orders of
  magnitude lower than the potential FPGA frequency;
- not all logical wires are active simultaneously.

Footnote 1: Although this paper focuses on logic emulation,
Virtual Wires can be applied to any multi-chip system.

Figure 4: Hard Wire Interconnect
(Diagram labels: FPGA #1, FPGA #2, logical inputs, logical
outputs, physical wire.)

However, by clocking physical wires at the maximum fre-
quency of the FPGA technology, several logical connections
can share the same physical resource. Figure 4 shows an
example of six logical wires allocated to six physical wires.
Figure 5 shows the same example with the six logical wires
sharing a single physical wire. The physical wire is multi-
plexed between two pipelined shift loops (Section 3). Each
register in the pipeline carries a single bit of information
from one logical output to the corresponding logical input in
the neighboring FPGA.
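
The sharing of one physical wire can be pictured with a small Python
sketch. It is a simplified, one-direction model written for this
example: the function names, the use of a deque as a shift register,
and the clock figures in the usage lines are assumptions, and the
paper's actual shift-loop circuitry is described in Section 3.

```python
from collections import deque


def transfer_over_one_wire(logical_outputs):
    """Send every logical output bit across a single physical wire.

    One bit crosses the wire per FPGA clock cycle. Returns the bits as
    seen at the receiving FPGA and the number of FPGA cycles used.
    """
    send_loop = deque(logical_outputs)  # sender-side shift register
    recv_loop = deque()                 # receiver-side shift register
    cycles = 0
    while send_loop:
        wire = send_loop.popleft()  # bit driven onto the pin this cycle
        recv_loop.append(wire)      # bit latched by the neighboring FPGA
        cycles += 1
    return list(recv_loop), cycles


def max_logical_wires_per_pin(fpga_clock_hz, emulation_clock_hz):
    """Upper bound on logical wires one pin can carry per emulation cycle.

    Assumes one bit per FPGA clock and ignores routing and setup overhead.
    """
    if emulation_clock_hz <= 0:
        raise ValueError("emulation clock must be positive")
    return fpga_clock_hz // emulation_clock_hz


if __name__ == "__main__":
    received, cycles = transfer_over_one_wire([1, 0, 1, 1, 0, 0])
    print(received, cycles)
    # Assumed clocks: a 20 MHz FPGA clock and a 1 MHz emulation clock.
    print(max_logical_wires_per_pin(20_000_000, 1_000_000))
```

With six logical wires the transfer occupies six FPGA clock cycles,
which fits inside one emulation clock period whenever the FPGA clock
is at least six times faster than the emulation clock.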
Systems based on Virtual Wires exploit several properties
of digital circuits to boost bandwidth from available pins