Analysis and Modeling of Subthreshold Leakage of RT-Components under PTV and State Variation

 

Domenik Helms (1), Günter Ehmen (1), and Wolfgang Nebel (2)
(1) OFFIS Research Institute, (2) University of Oldenburg
D-26121 Oldenburg, Germany
helms@offis.de, ehmen@offis.de, nebel@offis.de

ABSTRACT
In this work we present a SPICE-based RTL subthreshold-leakage model analyzing components built in 70nm technology [1]. We present a separation approach regarding inter- and intra-die threshold variations, temperature, supply voltage, and state dependence. The body effect and differences between NMOS and PMOS introduce a leakage state dependence of one order of magnitude [2, 3]. We show that the leakage of RT-components still shows state dependencies between 20% and 80%. A leakage model that disregards the state can never be more accurate than this. The proposed state-aware model has an average error of 6.7% for the RT-components analyzed.

Categories and Subject Descriptors:
 B.8.2: Performance Analysis and Design Aids.
General Terms:
 Design.
Keywords:
 Leakage, Process Variation, State Dependence, Modeling.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ISLPED’06, October 4–6, 2006, Tegernsee, Germany.
Copyright 2006 ACM 1-59593-462-6/06/0010 ...$5.00.

1. INTRODUCTION
   In recent years, the motivation of a leakage paper would have read: leakage will become the most important source of power consumption. Today, leakage is the most important contributor to a system’s power consumption, and within the last 3 years there has been an incredible amount of scientific work in this area.
   The physics of leakage is well understood, and leakage can be estimated with sufficient accuracy at transistor level if the device geometry as well as the physical condition of the transistor is exactly known [4, 5].
   Process variations, which randomly and unpredictably affect the geometry, have been identified as the major hurdle for accurate leakage and performance estimation. Results of ongoing research have to be implemented in EDA tools [3, 6, 7].
   A huge number of anti-leakage techniques exist, which can be separated into 3 classes:

   • Always applicable techniques, which, if available, will be used for each design, e.g. well engineering [4] or high-k gate oxide [8].

   • Leakage-performance tradeoff techniques [9], enabling the choice between fast and low-leakage devices. The usefulness of this tradeoff depends on the design.

   • Power management techniques offering a high-performance mode and a low-leakage mode in which the component is either slow or non-functional, such as power gating or adaptive body biasing, both reviewed in [10]. Both techniques are implemented at the lowest levels of abstraction but have to be controlled system-wide.

   Both the tradeoff and the power management techniques have to be evaluated at a high level.
   There are several approaches abstracting from accurate BSIM models [5] to much faster gate level models. But as the impact of tradeoffs and power management has to be evaluated at system level, gate-level models are still too complex for system level tools, which usually do not even generate gate level details. Thus we developed two alternative RTL leakage macromodels: the simulation-based bottom-up model, abstracting from transistor level to RTL, is described in [11]. The top-down, characterization-based model, analytically describing the leakage of RT components, is presented in this work.
   Existing dynamic power, area and delay models for RT components have typical estimation errors on the order of 10%. To make leakage estimation accuracy comparable, our leakage models have to regard all known parameters influencing leakage [8].
   In the bottom-up model, the dynamic parameters temperature, supply voltage, and body voltage, and the variation parameters channel length, oxide thickness, and channel doping are explicit input parameters. In the analytical model, only supply voltage, temperature and component state directly enter the model. The bulk voltage, which deviates due to adaptive body biasing (ABB), enters our model indirectly by modifying the effective threshold voltage. As a huge number of transistors is modeled in one model, random process variations (intra-die variations) only enter through the non-linearity of the I_leak(V_th) relation, which is captured inside the models as presented in [3, 7, 12]. Finally, the inter-die variations, which result in systematic (not statistical) deviations of the threshold voltage, can be regarded by sequentially recomputing the models, resulting in a probability density function of the leakage power distribution.
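The sequential recomputation for inter-die variations can be illustrated with a minimal sketch. It is not the model of this work: `component_leakage` is a hypothetical stand-in that assumes the usual exponential subthreshold law (leakage falls as the threshold rises) together with an intra-die correction factor of the form used later in Equation 5. The values of `n`, the standard deviations, the number of dies and the bin count are illustrative assumptions.

```python
import math
import random

K_B_OVER_E = 8.617333e-5  # Boltzmann constant over elementary charge [V/K]


def component_leakage(vth_mean, sigma_intra, temperature=300.0, n=1.5):
    """Hypothetical relative leakage of one die (arbitrary units).

    vth_mean    : die-wide mean threshold voltage [V] (shifted by inter-die variation)
    sigma_intra : standard deviation of the random intra-die variation [V]
    """
    v_t = K_B_OVER_E * temperature
    nominal = math.exp(-vth_mean / (n * v_t))                    # assumed exponential law
    intra_die = math.exp(0.5 * (sigma_intra / (n * v_t)) ** 2)   # averaging over intra-die spread
    return nominal * intra_die


def leakage_pdf(vth_nominal, sigma_inter, sigma_intra, dies=5000, bins=20, seed=1):
    """Recompute the model once per sampled die and histogram the results."""
    rng = random.Random(seed)
    samples = [
        component_leakage(rng.gauss(vth_nominal, sigma_inter), sigma_intra)
        for _ in range(dies)
    ]
    low, high = min(samples), max(samples)
    width = (high - low) / bins
    counts = [0] * bins
    for value in samples:
        counts[min(int((value - low) / width), bins - 1)] += 1
    # Normalize so that the histogram integrates to one.
    density = [c / (dies * width) for c in counts]
    return low, width, density


if __name__ == "__main__":
    low, width, density = leakage_pdf(0.2, 0.02, 0.03)
    for i, d in enumerate(density):
        print(f"{low + i * width:.3e}  {d:.3e}")
```

Each sampled die shifts the mean threshold systematically, while the intra-die spread only enters through the fixed correction factor. This mirrors the split between systematic and random variation described above.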
   In this paper we thus address the emerging leakage problem by presenting an accurate leakage macromodel enabling leakage-aware RT synthesis. The problem of modeling PTV variations is discussed in several publications, as shown in Section 2. After presenting the simulation environment in Section 3, the separation approach itself is only briefly described in Section 4. In Section 5, the major contribution of this work, the state-dependent leakage model, is developed, and evaluation results are presented in Section 6.

2. RELATED WORK
   Chen et al. [13] presented a subthreshold leakage estimation methodology for computing lower and upper bounds by regarding the stacking effect. [14] then presented a first useful separation approach, dividing leakage current into transistor count N, supply voltage V_DD, leakage per device I_dev and a constant k as

      I_{leak} = k \cdot N \cdot V_{DD} \cdot I_{dev},    (1)

where k gives the average leakage current per device at nominal voltage. In our approach, we first introduce data dependency to the parameter k, then we show that k accurately separates from the other parameters for the Berkeley Predictive Technology Model (BPTM) 70nm technology, and finally we introduce some corrections, as leakage does not perfectly separate into these 4 parts.
   [2] analyzes the leakage distribution for 1- and 2-input gates, reporting a substantial state dependency. There are many approaches at this level enabling accurate leakage current prediction if all relevant parameters are exactly known per device [5]. But some parameters are not accurately predictable at the lower levels. The effect of parameter variations on leakage was analytically investigated in [15], whereas [16] regards the distribution of these variations.
   From a system level view, the dominant leakage parameters [8] described in Section 1 can be identified and predicted. In [17], a flow for modeling the thermal dependence of the leakage was presented, and in [18, 19], thermally dependent leakage estimation was combined with a chip-wide temperature prediction, thus also regarding the electro-thermal back-coupling introduced by the subthreshold current’s thermal dependence.
   [12] analyzes the impact of threshold voltage variations in order to handle intra-die process variations. In [6], a methodology is presented for estimating the probability density function of the leakage current due to process parameter variation. In [3], parameter sensitivity analysis is introduced, enabling estimation of the effect of a variation on the average leakage. In [7], this sensitivity analysis is extended to regard intra- and inter-die process variations.
   In order to combine the different parameters, an iterative approach is presented by [20], accurately modeling dynamic power and leakage power by regarding the interaction between temperature, supply voltage, and power consumption. They introduce a thermal system model handling the electro-thermal coupling as well as a supply-grid model handling the electro-electro coupling introduced by the finite capacitance of the supply system. The most complete high level leakage model was presented by [21], regarding all PTV variations, thus all parameters except for the state.
   To the best of our knowledge, the only other approach enabling PTV and state dependent RTL leakage analysis is our alternative approach presented in [11]. As can be seen in the evaluation section, the strengths of the analytical model are easy model characterization and very fast model evaluation. The advantages of [11] are higher modeling accuracy and more direct model parameters.

3. DESCRIPTION OF THE SIMULATION ENVIRONMENT
   Since there is no reliable higher level leakage estimation tool available, this work is based on the Berkeley SPICE simulator including the BSIM transistor model. The recent version BSIM4.40 is able to model various leakage effects like subthreshold current, gate tunneling and junction leakage, including the Drain Induced Barrier Lowering (DIBL) effect. Unfortunately, in our experiments the recent transistor model failed to converge for several circuits as soon as they exceeded a size of 100-1000 transistors. The older BSIM3.52 version does not show these convergence problems, so we decided to use this version, even though it is not capable of estimating leakage effects other than subthreshold current. This limits the application of our model to technologies where subthreshold leakage is the dominating source of leakage. These are technologies down to 70nm structure size, and smaller technologies that apply high-k gate dielectrics.
   The MOSFET model was characterized using the BPTM card for the 70nm NMOS and PMOS [1]. The model cards were created using the reference values for channel length, oxide thickness, threshold voltage and drain-source resistance.
   For model characterization and evaluation, we need SPICE simulations of RT components in which each transistor is properly configured. Hence, we synthesize each RT component using a commercial technology. Then we replace each gate of the component by a cloned SPICE version having the same behavior as the respective commercial gate. In Section 3.1 we present how to clone a commercial technology to SPICE, and in Section 3.2 we describe the flow creating RTL SPICE.

3.1 Creation of a Generic SPICE Library
   A commercial technology offers several hundred gates in different driving strengths. In order to limit the cloning effort, we prune the gates of the commercial technology to the relevant ones. Since, for instance, an AND gate is constructed from a NAND gate and an inverter, we pruned all logic gates that can be constructed from other cells resulting in the same transistor-level description. The difference between a native AND gate and a constructed one is that the interconnect of the native one is optimized at layout level. As long as the interconnect does not show relevant leakage effects, this pruning is valid for our work and reduces the cloning to 21 native gates in 4-10 driving strengths.
   As described in the next section, the gates of the SPICE library have to replace the gates of the commercial technology in a gate level description of an RT component. To make a replacement valid, our gates must have the same timing and power behavior. As accurate power figures for the commercial technology are not available, we can only ensure the same timing behavior. Thus, we create timing-equivalent RT-components for our model validation.
   The only degree of freedom left in the BPTM regarding the timing is the transistor width and the sequence of serial
transistors. Using a setup script interacting with SPICE and adapting the width of each transistor, we construct a SPICE sub-circuit for every pruned gate of the commercial library having the same delay behavior and the minimal total size (cf. Fig. 1, left side).

[Figure 1, flow diagram. Left, for each gate in the pruned library: greedy width adoption of a SPICE testbench against the commercial library's pin delays; delays are measured and the loop repeats until converged, yielding the SPICE library. Right, for each RT component: Design Compiler with the commercial library (constraints: use pruned part of the library, write flat Verilog, avoid tri-state) produces the RT component in flat Verilog; verilog2spice produces a SPICE gate list, which together with the SPICE library gives the RT component in SPICE/BSIM and its leakage currents.]
Figure 1: Left: Adapting the width of the 70nm BPTM technology transistors to generate gates having the same timing behavior. Right: RT components in Verilog can be simulated in SPICE using the SPICE library and the Verilog netlist.

3.2 Synthesis of RT-Components
   As presented on the right side of Fig. 1, we use Synopsys Design Compiler to synthesize the components. A script limits the synthesis to the pruned gates, forces non-hierarchical output, and avoids tri-state logic, both of which ease the conversion to SPICE. We obtain a Verilog gate netlist, which is automatically converted to SPICE, now instantiating our SPICE library instead of the commercial technology’s gates. We end up with professionally designed RT-components having the same timing behavior as the commercial technology and giving realistic leakage estimates.

4. PARAMETER SEPARATION
   In this work we regard the impact of process, temperature and voltage variations (PTV variations) together with state dependencies on leakage. This section describes a separation approach splitting the influences of process variations, temperature, and supply voltage.

4.1 Variation of the Threshold Voltage
   Since we only regard subthreshold currents, we reduce the process variations to the variation of the threshold voltage. We combine the separation approach [14] with a statistical process variation model [3]. For a transistor that is off, the subthreshold leakage results as

      I_{sub} = k V_T^2 \frac{W}{L} \exp\left(-\frac{V_{th}}{n V_T}\right)    (2)

where n = 1 + C_dm/C_ox is the subthreshold slope resulting from the ratio of the capacitances of the depletion layer and the oxide. V_T = k_B T/e is proportional to the absolute temperature T. Assuming that, due to process variations, the threshold voltage is Gaussian distributed,

      p(V_{th}) = \left(\sigma_V \sqrt{2\pi}\right)^{-1} \exp\left(-\frac{(V_{th} - \mu_V)^2}{2\sigma_V^2}\right),    (3)

the average leakage due to the threshold variation then results as the expectation value E(x) = \sum_i p(x_i) x_i:

      \mu_I = \int_{-\infty}^{\infty} dV_{th} \, p(V_{th}) \, I(V_{th})
            = \frac{k V_T^2 W}{L \sigma_V \sqrt{2\pi}} \int_{-\infty}^{\infty} dV_{th} \, \exp\left(-\frac{(V_{th} - \mu_V)^2}{2\sigma_V^2} - \frac{V_{th}}{n V_T}\right)
            = \frac{k V_T^2 W}{L \sigma_V \sqrt{2\pi}} \cdot \exp\left(-\frac{\mu_V}{n V_T} + \frac{\sigma_V^2}{2 n^2 V_T^2}\right) \cdot \int_{-\infty}^{\infty} dV_{th} \, \exp\left(-\frac{\left(V_{th} - \mu_V + \sigma_V^2/(n V_T)\right)^2}{2\sigma_V^2}\right)
            = k V_T^2 \frac{W}{L} \cdot \exp\left(-\frac{\mu_V}{n V_T} + \frac{\sigma_V^2}{2 n^2 V_T^2}\right) \cdot \int_{-\infty}^{\infty} dV_{th} \, p\left(V_{th} + \frac{\sigma_V^2}{n V_T}\right)
            = I_{sub}(\mu_V) \cdot f_P(\sigma_V, T) \cdot 1.    (4)

   Due to the nonlinear relation between temperature, threshold voltage and leakage, the threshold voltage cannot be completely separated, but remains coupled with the temperature as

      f_P(\sigma_V, T) = \exp\left(\frac{1}{2}\left(\frac{\sigma_V}{n V_T}\right)^2\right).    (5)

4.2 Separation of the Supply Voltage
   To first order, the supply voltage influences the subthreshold current linearly, but due to the drain induced barrier lowering (DIBL) effect, the threshold voltage depends on the supply voltage as

      \Delta V_{th}(V_{DD}) = -\frac{V_{DD}}{2\cosh(L/l_c) - 2}    (6)

with l_c being a technology constant. Thus, the supply voltage also influences the correction term (5), preventing an easy separation approach. We circumvent this by introducing a second order Taylor approximation at nominal values, building an effective supply voltage function f_V. Using the fact that e^{\alpha(x+\delta)} \approx e^{\alpha x} \cdot (1 + \alpha\delta) for small |\delta|, the supply voltage dependence can be approximated as

      f_V(V, T) = 1 + \alpha_V \cdot \frac{V - V^*}{T} + \beta_V \cdot \left(\frac{V - V^*}{T}\right)^2

      \alpha_V = \frac{T^*}{V^*} \left.\frac{\partial I}{\partial V}\right|_{T^*, V_{th}^*, V^*}, \quad \beta_V = \frac{T^{*2}}{2V^*} \left.\frac{\partial^2 I}{\partial V^2}\right|_{T^*, V_{th}^*, V^*}    (7)

   The fitting parameters α_V and β_V can easily be characterized from SPICE simulation results. For the 70nm BPTM, they resulted as α_V ≈ 188.5 K A V^-2 and β_V ≈ 8460 K^2 A^2 V^-3 (cf. Fig. 2).

4.3 Analysis of the Nonlinear Thermal Dependence
   As can be seen in Table 1, the subthreshold leakage significantly depends on temperature, doubling with each 19K
1.5                                                                                Component       N      Ileak   min.   max.   std.
                                                 NMOS @ 300K
                                                                                                                                               [nA]     [%]    [%]    [%]
                                                 NMOS @ 400K
                                                 model @ 300K
                                                                                                                        AddCla4       364     37.31    30.8   27.4   10.7
                                                 model @ 400K                                                           AddCla8       610     62.49    24.8   18.5    7.0
                                                                                                                        AddRpl4       330     34.50    35.2   29.6   14.2
 relative leakage I(V,T) / I(V*,T)

                                                                                                                        AddRpl8       656     76.63    33.6   22.7    7.8
                                                                                                                        DecRpl4        54      8.18    31.9   56.5   22.1
                                                                                                                        DecRpl8       214     29.06    53.0   75.6   25.6
                                      1                                                                                 IncRpl4        56      8.48    62.6   36.8   28.9
                                                                                                                        IncRpl8       156     20.63    39.4   22.9    9.5
                                                                                                                        MultCsa4      758     73.09    17.5   21.1    6.9
                                                                                                                        MultWall4    1034    110.14    29.9   32.5   14.7
                                                                                                                        Mux2 4         50     13.06    65.1   86.7   51.1
                                                                                                                        Mux2 8         98     26.38    65.8   83.0   51.6
                                                                                                                        Mux3 4        104     10.26    71.0   68.9   40.2
                                                                                                                        Mux3 8        208     20.82    70.8   65.6   39.8
                                     0.5                                                                                SubCla4       370     43.70    35.7   46.3   15.2
                                           0         0.2    0.4     0.6      0.8           1   1.2    1.4   1.6
                                                                                                                        SubCla8       696     77.14    34.1   36.0    9.8
                                                                      supply voltage [V]
                                                                                                                        SubRpl4       274     32.92    36.3   39.2   13.9
                                                                                                                        SubRpl8       802     80.85    25.5   38.5    8.1
Figure 2: Evaluation of the voltage separation func-
                                                                                                                  Table 2: Number of transistors N , average sub-
tion fV (V, T ). Within typical limits of supply voltage
                                                                                                                  threshold leakage, percentage of deviation for mini-
(VDD > 0.4V ) and temperature (300K - 400K), the
                                                                                                                  mum and maximum state and relative standard de-
separation error is below 1%
                                                                                                                  viation for RT components up to 1000 transistors.
  T [K]   I_NMOS [nA/µm]   I_PMOS [nA/µm]   error_N [%]   error_P [%]
  300          2.23             0.12             0.0          -13.3
  320          5.59             0.35             1.6           -6.1
  340          12.5             0.89             2.1            0.0
  360          25.7             2.05             1.9            5.0
  380          48.8             4.34             0.8            9.2
  400          86.6             8.56            -0.8           12.7

Table 1: Accuracy of the thermal leakage model for NMOS and PMOS devices. Negative errors mean an underestimation by the model.

(NMOS) and 16K (PMOS) of temperature increase. As we already handled the temperature-variation correlation in Section 4.1 and the temperature-voltage interaction in Section 4.2, the remainder of the thermal dependence of the subthreshold leakage strictly follows the analytical expression of Equation 2. The last two columns of Table 1 summarize the error made when modeling the subthreshold current of Equation 2 with a characterized value of n = 1.598, thus approximating

$$f_T(T) = I_0 \cdot e^{V_{th} / (1.598 \, V_T)} \tag{8}$$

The thermal dependence of the subthreshold leakage of an NMOS transistor can be approximated with roughly 2% error (at most 2.1% in Table 1) between 300K and 400K. The PMOS model is somewhat worse, with an error of up to 13.3% in magnitude in this range. But as the subthreshold leakage for PMOS is approximately one order of magnitude smaller, the expected overall error for large circuits remains below 2%.

4.4    Combination of the Separated Models
   The total leakage power of a component having N transistors can be modeled as

$$I_{sub} = I_0 \cdot k_{data} \cdot N \cdot f_P(\sigma_V, T) \cdot f_V(V_{DD}, T) \cdot f_T(T) \tag{9}$$

where the remainder $k_{data}$ models the state dependency and will be discussed in the following section.

5.    MODELING STATE DEPENDENCY
   In this section, we analyze the importance of state dependent leakage modeling and develop a $k_{data}$ model.

5.1    Analysis of the State Dependence
   Here, we perform a purely SPICE based minimum, maximum, and average leakage current analysis to evaluate the influence the state has on the leakage of a large RT structure. In Section 5.2, we then develop a model describing this state dependence.
   Determining the minimum and maximum leakage current is NP-hard [22]. Hence, we limited our analysis to components with up to 17 inputs (2 times 8 bit data plus 1 control), except for the multiplexers, where minimum and maximum can easily be determined due to their symmetry. In addition, we limited the analysis to components with fewer than 1000 transistors, for which the SPICE simulation time was still reasonable. Even with these limits, the total simulation time (for evaluation) was 6 weeks on an 8xPentium4 system running at 3GHz, as a single BSIM evaluation needs ≈ 2.5ms on our system and we had to perform 1.5 · 10^9 evaluations.^1
   In Table 2, we summarize our results: On average, the minimum and maximum leakage are 39.5% and 41.9% away from the mean leakage. The average standard deviation was 19.2%, but the relative standard deviation decreases with the number of transistors. Monte Carlo simulations of a 3490 transistor 8 bit Wallace tree multiplier show a relative standard deviation of 4.3%.

^1 This high simulation effort was needed for the detailed analysis of the state dependence. The final model can be characterized in a few seconds.
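The exhaustive state analysis of Section 5.1 can be illustrated with a minimal sketch. It enumerates every input vector of a small component and reports the minimum, maximum, mean, and relative standard deviation of the leakage. The names `state_statistics`, `toy_leakage`, and `simulation_weeks` are hypothetical, and `toy_leakage` is an arbitrary stand-in for a SPICE run, not data from the paper. The last helper only multiplies the two figures quoted in the text (1.5 · 10^9 evaluations at about 2.5ms each).

```python
from itertools import product
from math import sqrt


def state_statistics(leakage_of_state, n_inputs):
    """Enumerate all 2**n_inputs input vectors and summarize the leakage."""
    values = [leakage_of_state(bits) for bits in product((0, 1), repeat=n_inputs)]
    mean = sum(values) / len(values)
    # Population variance over the complete state space.
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "rel_std": sqrt(variance) / mean,
    }


def toy_leakage(bits):
    """Hypothetical stand-in for one SPICE evaluation (arbitrary units).

    Assumption for illustration only: an input at logic 1 contributes 1.0,
    an input at logic 0 contributes 3.0.
    """
    return sum(1.0 if b else 3.0 for b in bits)


def simulation_weeks(n_evaluations, seconds_per_evaluation):
    """Total evaluation time in weeks, assuming evaluations run back to back."""
    return n_evaluations * seconds_per_evaluation / (7 * 24 * 3600)


stats = state_statistics(toy_leakage, n_inputs=3)
weeks = simulation_weeks(1.5e9, 2.5e-3)
```

The cost of this enumeration doubles with every additional input: a component with 17 inputs already has 2**17 = 131072 states per parameter setting, which is why the analysis was restricted to small components.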
5.2    Development of a State Model
   The state dependency of leakage is the least important one of all parameters introduced in Equation 9, but with 19.2% on average, it still needs to be considered.
   We develop our model starting with an RT-level soft-macro that models the leakage state dependence by simplifying the transistor level equation for fixed temperature and supply, without variance, and limiting the body effect to an effective width^2:

$$I_{sub}(data) = \sum_{i=0}^{N-1} \left[ s_i \cdot (w_i^p I_p) + (1 - s_i) \cdot (w_i^n I_n) \right] \tag{10}$$

where $w_i^p$ and $w_i^n$ are the effective widths (in µm) of the ith PMOS and NMOS device, $s_i$ is the logic value, and $I_p$ and $I_n$ are the PMOS and NMOS leakage for an inverter with 1µm width. In order to reduce the complexity, we replace all $w_i^p$ and $w_i^n$ by the averages $\bar{w}_p$ and $\bar{w}_n$:

$$I_{sub} = \bar{w}_p I_p \cdot \sum_{i=0}^{N-1} s_i + \bar{w}_n I_n \cdot \sum_{i=0}^{N-1} (1 - s_i)$$

$$I_{sub} = N \bar{w}_n I_n + \sum_{i=0}^{N-1} s_i \cdot (\bar{w}_p I_p - \bar{w}_n I_n)$$

$$= N \left( \bar{w}_n I_n + p_{all} (\bar{w}_p I_p - \bar{w}_n I_n) \right) = N (\alpha^* + \beta^* p_{all}) \tag{11}$$

where $p_{all} = \frac{1}{N} \sum_{i=0}^{N-1} s_i$ is the signal probability of all internal nodes, $\alpha^* = \bar{w}_n I_n$, and $\beta^* = \bar{w}_p I_p - \bar{w}_n I_n$. Assuming that the internal nodes are correlated to the inputs and can be linearly approximated from the input signal probability $p_{input}$, $k_{data}$ can be modeled using

$$k_{data} = \alpha_D + \beta_D \cdot p_{input} \tag{12}$$

where the data dependence parameters $\alpha_D$ and $\beta_D$ can be fit by characterization. The number of transistors N appears as a factor of the leakage, thus it can be separated as suggested in Equation 9. As there is no proof that the state dependence also separates from the other parameters, we determined the separability experimentally. The term

$$I(data, T_A, V_A) / I(data, T_B, V_B)$$

varies by less than 0.1% when randomly selecting $T_A$, $T_B$, $V_A$ and $V_B$ and evaluating the term for each possible input state.

^2 In [11], we showed that leakage reduction due to the body-effect can be accurately modeled with a state-dependent effective transistor width.

6.    EVALUATION
   The set of characterization data and the set of evaluation data are both obtained as follows: We create the RT components shown in Table 2 and perform SPICE simulations measuring the leakage of these components while varying all parameters. When synthesizing the circuit, we randomly assign a threshold voltage variation of 0 ≤ σVth ≤ 50mV around mean values of 0.2V and −0.22V (for NMOS and PMOS, respectively). We supply this circuit with a voltage randomly chosen from 0.6V ≤ VDD ≤ 1.0V and set the ambient temperature to a random value from 300K ≤ T ≤ 400K. Then we measure the subthreshold leakage of the component for different input states.

  Compon.     w/o     w/D     Compon.      w/o     w/D
  AddCla4     10.9    3.83    AddCla8      7.35    6.63
  AddRpl4     14.4    7.40    AddRpl8      8.12    4.49
  DecRpl4     22.2    8.27    DecRpl8      25.7    11.4
  IncRpl4     29.0    18.1    IncRpl8      11.0    9.78
  MultCsa4    7.26    2.98    MultWall4    14.9    6.02
  SubCla4     15.4    4.60    SubCla8      10.1    3.00
  SubRpl4     14.1    4.36    SubRpl8      8.43    2.86
                              Average      14.2    6.69

Table 3: Evaluation of the relative standard deviation error [%] of the parameter separation approach with no data awareness (w/o) and the data aware approach (w/D).

  Component    #gates    std. [%]   max. [%]
  AddCla4        83        4.11       12.9
  AddRpl8       151        2.90       11.6
  MultWall8     767        2.33       11.1

Table 4: Evaluation of the bottom-up model: Relative standard deviation and maximum estimation error of three 45nm RT components against statistical SPICE simulation.

6.1    Analytical Model Evaluation
   We compute the relative standard deviation for three models. To evaluate how accurate the parameter-data separation is, we fix the input state and vary only the other parameters. We characterize the separation function parameters using random sampling points. The resulting model error for all components is between 2% and 3%. For the next model, called 'w/o' in Table 3, we vary the data and all parameters. We do not characterize the data dependency but assume that $k_{data} = 1$. As the model error of the parameter separation alone is very low, the resulting error of the 'w/o' model is dominated by the data-variance of the component as presented in Table 2. The average error of the 'w/o' model is 14.2%. The final model, called 'w/D' in Table 3, uses all parts of Equation 9. Except for the smallest components (incrementer and decrementer), the standard deviation is always below 8%, and the average error for all components is 6.7%.
   In comparison to the SPICE simulation time, the evaluation time of each of our models is negligible, as it is just the evaluation of analytical functions. In order to characterize the models, we took 3000 sampling points, needing one hour of computation time for all components together.
   In this evaluation, we neglect the effect of electro-thermal and electro-electro back-coupling described in Section 2. But as back-coupling aware modeling means iteratively evaluating a model working at fixed parameters and recomputing these parameters afterwards, our model is well applicable to back-coupling approaches.

6.2    Comparison to the Bottom-Up Model
   For comparison to our alternative model, Table 4 shows the evaluation results of the bottom-up model. As this model can estimate the effect of various parameter variations, the error of a single prediction is computed in comparison to a Monte Carlo SPICE simulation averaging over 1000 settings for intra-die variation of the parameters.
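For reference, the top-down state model that enters this comparison (Equations 10 to 12) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the currents `I_P` and `I_N` are placeholder values loosely borrowed from the 300K row of Table 1, and the fit of $\alpha_D$ and $\beta_D$ is assumed to be an ordinary least-squares line, which the paper does not specify.

```python
I_P = 0.12  # placeholder PMOS leakage [nA/um], illustrative only
I_N = 2.23  # placeholder NMOS leakage [nA/um], illustrative only


def leakage_per_device(states, widths_p, widths_n, i_p=I_P, i_n=I_N):
    """Equation 10: sum the leakage of each device for its logic value s_i."""
    return sum(
        s * w_p * i_p + (1 - s) * w_n * i_n
        for s, w_p, w_n in zip(states, widths_p, widths_n)
    )


def leakage_averaged(states, w_p_bar, w_n_bar, i_p=I_P, i_n=I_N):
    """Equation 11: N * (alpha* + beta* * p_all) with averaged widths."""
    n = len(states)
    p_all = sum(states) / n              # signal probability of all nodes
    alpha = w_n_bar * i_n
    beta = w_p_bar * i_p - w_n_bar * i_n
    return n * (alpha + beta * p_all)


def fit_k_data(p_inputs, k_values):
    """Equation 12: fit k_data = alpha_D + beta_D * p_input.

    Assumption: ordinary least squares over characterization samples.
    """
    n = len(p_inputs)
    mean_p = sum(p_inputs) / n
    mean_k = sum(k_values) / n
    sxx = sum((p - mean_p) ** 2 for p in p_inputs)
    sxy = sum((p - mean_p) * (k - mean_k) for p, k in zip(p_inputs, k_values))
    beta_d = sxy / sxx
    alpha_d = mean_k - beta_d * mean_p
    return alpha_d, beta_d


# With identical widths for all devices, Equations 10 and 11 must agree.
states = [1, 0, 1, 1]
exact = leakage_per_device(states, [2.0] * 4, [1.0] * 4)
averaged = leakage_averaged(states, 2.0, 1.0)

# Fitting samples that lie exactly on a line recovers its coefficients.
alpha_d, beta_d = fit_k_data([0.0, 0.5, 1.0], [1.0, 1.5, 2.0])
```

When the device widths differ, `leakage_averaged` is only an approximation of `leakage_per_device`. The 'w/o' model of Table 3 corresponds to skipping the fit and using $k_{data} = 1$.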
   As both models show comparable accuracy, which one is better strongly depends on the application. As the bottom-up model is characterized at transistor level, it avoids RTL SPICE simulation and directly models transistor geometry parameters (channel length and width, oxide thickness, channel doping), whose variability is easier to predict. But for characterization, several thousand SPICE simulations are needed and have to be saved inside the model. The advantages of the top-down approach presented here are a) the small amount of characterization data needed to characterize the few model parameters and b) the fast model evaluation.

7.   CONCLUSION
   On gate level, accurate leakage modeling is available, but as design for low leakage has to start at system level, models at higher abstraction are required. RTL models exist, but because state dependence has the smallest influence on leakage, it has not been regarded so far on these higher levels. We showed that leakage can be modeled with an average error of 6.7%, many orders of magnitude faster than SPICE. Without regarding state dependency, the average error would be 14.2%. Even so, some further improvements are possible:
   We are currently working on an extension to the top-down model enabling estimation for all 3 important sources of leakage: subthreshold currents, gate leakage, and junction leakage due to BTBT. When modeling gate and junction leakage, the variance of other parameters such as oxide thickness and channel doping is regarded, too. Further work may also analyze the effect of temperature and supply voltage variations inside a component, as we assumed both to be constant for all transistors inside one component.
   Finally, the model proposed here has to be embedded into a high level power estimation tool that determines the input parameters temperature and supply voltage by iteratively computing the local energy consumption and the resulting thermal increase and voltage drop. Having RT level leakage estimation is the key enabler for leakage aware RT synthesis.

8.   REFERENCES
 [1] Berkeley Predictive Technology Model: www-device.eecs.berkeley.edu/~ptm/
 [2] S Mukhopadhyay, A Raychowdhury, K Roy: Accurate Estimation of Total Leakage Current in Scaled CMOS Logic Circuits Based on Compact Current Modeling. DAC, 2003.
 [3] A Shrivastava, R Bai, D Blaauw, D Sylvester: Modeling and Analysis of Leakage Power Considering Within-Die Process Variations. ISLPED, 2002.
 [4] K Roy, S Mukhopadhyay, H Mahmoodi-Meimand: Leakage Current Mechanisms and Leakage Reduction Techniques in Deep-Submicrometer CMOS Circuits. Proc. of the IEEE Vol.91 No.2, 2003.
 [5] C Hu: BSIM Model for Circuit Design Using Advanced Technologies. 2001 Symposium on VLSI Circuit Digest of Technical Papers, 2001.
 [6] H Chang, S Sapatnekar: Full-Chip Analysis of Leakage Power under Process Variations, Including Spatial Correlations. DAC, 2005.
 [7] R Rao, A Shrivastava, D Blaauw, D Sylvester: Statistical Estimation of Leakage Current Considering Inter- and Intra-Die Process Variation. ISLPED, 2003.
 [8] D Helms, E Schmidt, W Nebel: Leakage in CMOS circuits - An Introduction. PATMOS, 2004.
 [9] D Lee, D Blaauw, D Sylvester: Static Leakage Reduction Through Simultaneous VT/Tox and State Assignment. IEEE Trans. on CAD of ICs and Systems Vol.24 No.7, 2004.
[10] S Narendra: Challenges and Design Choices in Nanoscale CMOS. ACM Journal on Emerging Techn. in Computing Systems Vol.1 No.1, 2005.
[11] D Helms, M Hoyer, W Nebel: Accurate PTV, State, and ABB Aware RTL Blackbox Modeling of Subthreshold, Gate, and PN-Junction Leakage. PATMOS, 2006.
[12] S Narendra, V De, S Borkar, D Antoniadis, A Chandrakasan: Full-Chip Sub-threshold Leakage Power Prediction Model for sub-0.18µm CMOS. ISLPED, 2002.
[13] Z Chen, M Johnson, L Wei, K Roy: Estimation of Standby Leakage Power in CMOS Circuits Considering Accurate Modeling of Transistor Stacks. ISLPED, 1998.
[14] J A Butts, G S Sohi: A static power model for architects. Proc. Int'l Symp. on Microarchitecture, 2000.
[15] S Mukhopadhyay, K Roy: Modeling and Estimation of Total Leakage Current in Nano-scaled CMOS Devices Considering the Effect of Parameter Variation. ISLPED, 2003.
[16] S Bhardwaj, S Vrudhula: Leakage Minimization of Nano-Scale Circuits in the Presence of Systematic and Random Variations. DAC, 2005.
[17] Y Zhang, D Parikh, M Stan, K Sankaranarayanan, K Skadron: HotLeakage: A Temperature-Aware Model of Subthreshold and Gate Leakage for Architects. Tech Report CS-2003-05, Univ. of Virginia Dept. of Computer Science, 2003.
[18] W Liao, F Li, L He: Microarchitecture Level Power and Thermal Simulation Considering Temperature Dependent Leakage Model. ISLPED, 2003.
[19] K Banerjee, S-C Lin, A Keshavarzi, S Narendra, V De: A Self-Consistent Junction Temperature Estimation Methodology for Nanometer Scale ICs with Implications for Performance and Thermal Management. IEEE, 2003.
[20] H Su, F Liu, A Devgan, E Acar, S Nassif: Full chip leakage estimation considering power supply and temperature variations. ISLPED, 2003.
[21] S Borkar, T Karnik, S Narendra, J Tschanz, A Keshavarzi, V De: Parameter Variations and Impact on Microarchitecture. DAC, 2003.
[22] F A Aloul, S Hassoun, K A Sakallah, D Blaauw: Robust SAT-Based Search Algorithm for Leakage Power Reduction. PATMOS, 2002.