Dynamic Vt SRAM : A Leakage Tolerant Cache Memory for Low Voltage Microprocessors
Chris H. Kim
hyungil@ecn.purdue.edu
Kaushik Roy
kaushik@ecn.purdue.edu
Department of Electrical and Computer Engineering
Purdue University, West Lafayette, IN 47907, USA
ABSTRACT
This paper presents a Dynamic Vt SRAM (DTSRAM) architecture to reduce
the subthreshold leakage in cache memories. The Vt of each cache line is
controlled separately by means of body biasing. In order to minimize the
energy and delay overhead, a cache line is switched to high Vt only when
it is not likely to be accessed anymore. Simulation results from the
SimpleScalar framework show that even after considering the energy
overhead, the DTSRAM can save 72% of the cache leakage with a
performance loss of less than 1%. Layout of the DTSRAM shows that the
area penalty is minimal.
1. INTRODUCTION
Increasing on-chip integration and the large fraction of chip area
devoted to memory structures have resulted in an unacceptably large
leakage power dissipation for state-of-the-art microprocessor designs
[1, 2]. Recent energy estimates for 0.13 µm processes indicate that
leakage energy accounts for 30% of L1 cache energy and as much as 80% of
L2 cache energy [3].
This paper presents a Dynamic Vt SRAM (DTSRAM) architecture to reduce
the large leakage energy dissipation in memory structures. Body biasing
is used to reduce the subthreshold leakage without sacrificing data
stability [4]. A time-based dynamic Vt scheme is devised for the DTSRAM
which assigns a high Vt only to the cache lines that have not been
accessed for a certain time period (30 µs to 100 µs). A low Vt is
assigned to the cache lines that are in frequent use, to maintain high
performance. A Vt control circuit is designed which implements this
time-based leakage reduction strategy. The analog implementation enabled
us to reduce the leakage energy using very simple hardware. Optimal
design parameters for the DTSRAM are found by exploring their impact on
total leakage energy savings. This paper also evaluates in detail the
energy, performance, and area tradeoffs of the capacitor-discharging
circuit scheme using architectural and circuit-level simulations.
2. DYNAMIC VT SRAM
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
ISLPED02, August 12-14, 2002, Monterey, California, USA.
Copyright 2002 ACM 1-58113-475-4/02/0008 ...$5.00.
Figure 1: The two dominant leakage paths (Vdd to ground, bitline to
ground) for a 6T SRAM cell. Leakage through the two paths constitutes
93% of the total leakage.
Figure 2: Schematic of a dynamic Vt SRAM set.
2.1 Leakage in SRAM
Fig. 1 depicts the two dominant leakage paths for a conventional 6T SRAM
cell: i) the Vdd-to-ground path and ii) the bitline-to-ground path [6].
Together, they make up 93% of the total leakage. Substantial leakage
savings can be achieved by biasing only the NFETs, since most of the
leakage passes through the turned-off NFETs in Fig. 1. Of course,
reverse body biasing both the PFETs and NFETs gives the maximum leakage
savings. However, the additional leakage savings gained by biasing the
PMOS substrate are minute. The transition energy consumed while charging
(or discharging) the substrate, and the extra area required to separate
the substrate contacts, isolate the substrates, and globally route the
body bias network, can be halved by not implementing PMOS substrate
biasing. For these reasons, PMOS substrate biasing is not implemented in
our DTSRAM design. Fig. 2 shows the schematic of a DTSRAM cache line.
The NFET substrate can be switched to 0V for high performance. When the
cache line is not in use, the substrate can be switched to a negative
voltage Vbs to reduce leakage. The following section describes an
efficient strategy for turning the cache lines on and off.
2.2 A Time-Based Dynamic Vt Approach
SPICE simulations using TSMC 0.18 µm technology show that the energy
required for one Vt transition is larger than the leakage energy saved
during one clock cycle by more than 4 orders of magnitude. Hence, making
a Vt transition every cycle is disastrous in terms of energy savings.
The speed of the NFETs on the discharging path also decreases when a
negative body bias is applied. Clearly, the energy overhead and
performance loss due to reverse body biasing are considerable.
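The break-even point implied by this ratio can be estimated with a short calculation. The sketch below is illustrative only: the ratio of 1e4 is the lower bound quoted above, while the function name and the choice to charge two transitions (one into high Vt, one back to low Vt) to each idle period are assumptions made for the example, not figures from the SPICE results.

```python
def break_even_cycles(transition_to_saving_ratio, transitions_per_idle_period=2):
    """Idle cycles a line must spend in high Vt before the leakage saved
    repays the energy spent on switching.

    transition_to_saving_ratio: energy of one Vt transition divided by the
        leakage energy saved in one clock cycle (more than 1e4 per the text).
    transitions_per_idle_period: assumed number of transitions charged to
        one idle period (high Vt entry plus the later wake-up).
    """
    if transition_to_saving_ratio <= 0 or transitions_per_idle_period <= 0:
        raise ValueError("arguments must be positive")
    return transition_to_saving_ratio * transitions_per_idle_period


# With the quoted lower bound, a line has to stay in high Vt for at least
# 20,000 cycles before switching it pays off.
cycles = break_even_cycles(1e4)
```

Under these assumptions a line that is switched every cycle can never recover its transition energy, which is why the switching decision has to be tied to long idle periods.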
To tackle the above-mentioned energy and delay overheads, a time-based
approach is devised which intelligently turns off the cache lines. The
strategy is based on the general access pattern of a cache line. When
data is first brought in, it sees a burst of accesses. Then there is a
dead period between the last access and the point when the data is
replaced [7]. Leakage can be saved by turning off the cache line during
the dead period. While the cache line is experiencing a burst of
accesses, it is kept on to maintain performance. That is, rather than
turning off a cache line right after an access, we leave it on for a
certain time period (30 µs to 100 µs) so that upcoming accesses within
that period do not incur energy or delay penalties. Energy and delay
overhead is incurred only when there is an access to a cache line which
is in the high Vt state. However, this happens very rarely, since most
cache accesses are limited to the turned-on portion of the cache due to
locality of reference.
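The time-based policy can be illustrated with a small behavioral model of a single cache line. This is a minimal sketch, not the simulation setup used in this paper: the function name, the trace format, and the example timestamps are all hypothetical, and the dead period after the final access in the trace is deliberately not counted.

```python
def simulate_decay_policy(access_times, decay_time):
    """Model one cache line under the time-based Vt policy.

    access_times: sorted access timestamps (any consistent time unit).
    decay_time:   idle time after which the line is switched to high Vt.

    Returns (wakeups, high_vt_time): the number of accesses that found the
    line in high Vt, and the total time the line spent in high Vt between
    the first and last access of the trace.
    """
    wakeups = 0
    high_vt_time = 0.0
    for prev, curr in zip(access_times, access_times[1:]):
        gap = curr - prev
        if gap > decay_time:
            # The line went to high Vt after decay_time of idleness and
            # stayed there until this access woke it up.
            wakeups += 1
            high_vt_time += gap - decay_time
    return wakeups, high_vt_time


# Hypothetical trace in microseconds: a burst, a long idle gap, another burst.
trace = [0, 5, 10, 200, 205]
result = simulate_decay_policy(trace, decay_time=50)   # -> (1, 140.0)
```

In this example only the access at 200 pays the wake-up penalty, while the line leaks at the reduced rate for 140 of the 205 time units covered by the trace.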
3. CAPACITOR-DISCHARGING SCHEME FOR THE DYNAMIC VT SRAM
3.1 Overview
The schematic and waveforms of the Vt control circuit are shown in
Fig. 3 and Fig. 4, respectively. The circuit consists of an RC decay
circuit, a level converter to adjust the logic levels, and Vsub switches
which drive the body terminals. When a cache line is accessed, Vcap is
charged, immediately switching Vsub to 0V and making the corresponding
cache line low Vt. Vcap then discharges slowly, with a decay time
(30 µs to 100 µs) determined by the RC values. After this amount of time
elapses without any accesses, the SRAM cache line is switched to high
Vt. If the cache line is accessed before it is switched to high Vt, Vcap
is recharged and the cache line remains low Vt until there is an idle
period as long as the time Vcap takes to discharge completely.
The performance of the cache is not affected by the
capacitor-discharging scheme, since most of the accesses will be to the
fast, low Vt cache lines. Leakage energy is saved in the cache lines
which are in idle mode. Energy and delay overhead is imposed only when
these high Vt cache lines have to be woken up. However, this happens
very rarely, making the capacitor-discharging scheme profitable.
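The relationship between the RC values and the decay time can be sketched with an ideal first-order RC model. This is a simplification made for illustration: the actual control circuit is not claimed to follow this model exactly, and the voltage levels and time constant below are hypothetical values chosen only so that the result falls inside the 30 µs to 100 µs range mentioned above.

```python
import math


def vcap(t, v0, rc):
    """Capacitor voltage t seconds after the last access (ideal RC decay)."""
    return v0 * math.exp(-t / rc)


def time_to_high_vt(v0, v_switch, rc):
    """Idle time until Vcap falls to v_switch, the level at which the
    control circuit is assumed to switch the line to high Vt."""
    if not 0 < v_switch < v0:
        raise ValueError("v_switch must lie strictly between 0 and v0")
    return rc * math.log(v0 / v_switch)


# Hypothetical values, not taken from the design described in this paper.
V0 = 1.8         # volts, assumed fully charged level of Vcap
V_SWITCH = 0.9   # volts, assumed switching threshold
RC = 72e-6       # seconds, assumed time constant

idle = time_to_high_vt(V0, V_SWITCH, RC)   # RC * ln(2), roughly 50 µs
```

Every access restarts the decay from V0 in this model, so a line is switched to high Vt only after a full uninterrupted idle period of this length.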
3.2 Circuit Design
A separate low supply voltage is used for the inverters in the Vt
control circuit to reduce short circuit current. The short circuit
current in the inverters is caused by the intermediate voltage level
while Vcap is decaying. The lower supply voltage for the inverters
weakens the gate drive for the level converters. Larger PFETs are used
in the level converters to compensate for the weak drive signal. The
time constant of the capacitor decay can be changed by an analog control
voltage, Vdischarge. Our design shows a decay time of approximately 1ms
when Vdischarge is 0V.
Figure 3: Schematic diagram of the Vt control circuit using
capacitor-discharging scheme.
Figure 4: WL, Vcap and NMOS body bias voltage waveforms for the
capacitor-discharging scheme.
The Vsub switches in Fig. 3 are sized so that the switch from high Vt to
low Vt completes within 1 clock cycle. An extra clock cycle has to be
added for the Vt transition whenever a high Vt cache line is accessed.
This extra cycle is the delay penalty of the DTSRAM.
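The effect of this one-cycle penalty on average access latency can be written down directly. The sketch below is a back-of-the-envelope model with hypothetical names and inputs: the base latency and the fraction of accesses that find a line in high Vt are placeholders to be supplied by the reader, not measured values.

```python
def avg_access_cycles(base_cycles, high_vt_access_fraction, wake_penalty_cycles=1):
    """Average cache access latency in cycles when a fraction of accesses
    hit lines in high Vt and each such access costs extra wake-up cycles."""
    if not 0.0 <= high_vt_access_fraction <= 1.0:
        raise ValueError("high_vt_access_fraction must be in [0, 1]")
    return base_cycles + high_vt_access_fraction * wake_penalty_cycles


# Placeholder inputs: 2-cycle base latency, half of all accesses waking a line.
latency = avg_access_cycles(2, 0.5)   # -> 2.5
```

Because the penalty scales with the fraction of accesses that reach high Vt lines, keeping that fraction small is what keeps the overall performance loss small.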
3.3 DTSRAM Layout
Fig. 5 shows the layout of 4 cache lines, each having 96 SRAM cells. The
layout of the DTSRAM was done using TSMC 0.18 µm technology. Due to area
considerations, consecutive cache lines are flipped so that their
substrates can be shared. The area overhead of the Vt control circuit is
also reduced, since only one Vt control circuit is required for 2 cache
lines. A deep N-well layer was used to isolate the P-substrates between
each cache line. TSMC 0.18 µm design rules require a comparatively large
area margin on the edge of each deep N-well layer, and this leads to an
increase in the SRAM cell area [8]. Table 1 shows the layout and
area of a conventional SRAM cell and