UCLA Electrical Engineering Fall 2015: EEM216A
Design of VLSI Circuits and Systems Prof. Dejan Marković
[email protected]
Teaching Staff, Office Hours
Prof. Dejan Marković
• Office hours: Mon & Fri 11am-12:30pm
• MSOL: Tue 10-11am
• 56-147E Eng-IV Bldg.
Vahagn Hokhikyan
• Office hours: Thu 10am-12pm
• CAD tools, labs, project
MSOL: Yuta Toriyama
• Office hours: Wed 5-6pm
• MSOL discussions
1.2 D. Markovic / Slide 2
Elevator Pitch
Modeling and design of energy-delay optimal VLSI circuits and systems
[Figure: energy-delay tradeoff curve, E/Eref vs. D/Dref, both axes from 0 to 1.5; "65%" annotation; sensitivity labels (SW=22, SVth=22, SVdd=16), (SW, SVth=0.2, SVdd=1.5), (SW=1, SVth=1, SVdd=1)]
1.3 D. Markovic / Slide 3
Background
Familiarity with
• Digital ICs
• VLSI design
• CAD tools
1.4 D. Markovic / Slide 4
EE115C vs. EEM216A
115C (intro): circuits
• Simple transistor and circuit models
• Circuit design styles
• Logic gate design
• Custom blocks (adders)
216A (advanced): circuits + systems
• Several transistor and circuit models
• Constrained design (Power, Area, Speed)
• RTL, chip synthesis
• Test, packaging
1.5 D. Markovic / Slide 5
EEM216A Goals (1/2)
Understanding the basic building blocks of VLSI
• Transistors/Wires
• Logic Gates and Layout
• Datapath Blocks
Be able to conceptually model a system
• Logic Optimization
• State Machine Design (RTL)
1.6 D. Markovic / Slide 6
EEM216A Goals (2/2)
Be able to build a system (using a subset of the tools)
• Verilog Modeling
• Synthesis, Place and Route
Understanding the constraints and tradeoffs
• Delay analysis (gates and interconnects)
• Clocking methodology
• System integration issues (Power/Ground routing, Noise)
1.7 D. Markovic / Slide 7
Course Objective and Key Outcomes
Energy-performance optimal design:
• Outcome 1: energy and delay models
• Outcome 2: circuit energy-delay optimization
• Outcome 3: high-level description, chip synthesis
1.8 D. Markovic / Slide 8
VLSI Design Challenges
• Power-limited performance
• Limited technology improvements
• Methods for energy efficient design
• Flexibility (multi-mode, multi-standard)
1.9 D. Markovic / Slide 9
Course Outcomes
1. CMOS scaling
2. RC transistor model
3. Static CMOS logic gate design
4. Design with HDL (Verilog)
5. Dynamic and leakage power model
6. Power and delay calculation
7. Logical effort and gate sizing
8. Energy-delay tradeoff analysis
9. Clocking methodologies and timing analysis
10. Design automation using logic synthesis
11. State machine design (ASM or FSM)
1.10 D. Markovic / Slide 10
Online Resources
Three places to bookmark:
• EEWeb: Grades
• Piazza: Q&A
• Wiki: Course material
1.11 D. Markovic / Slide 11
Everything on the Wiki, Grades on EEWeb
Wiki: lecture notes, homeworks, tutorials, project, references
[Screenshots: classwiki, grades]
1.12 D. Markovic / Slide 12
icslwebs.ee.ucla.edu/dejan/classwiki
ee216a_student bruin2015 1.13 D. Markovic / Slide 13
[email protected]
• Submission of assignments
• Personal queries
▪ Before you email, consider why you can’t post the question on Piazza
1.14 D. Markovic / Slide 14
Course Material
• Lecture notes
• Homeworks
• CAD tutorials
• Class project
• Selected papers from IEEExplore (http://ieeexplore.ieee.org)
1.15 D. Markovic / Slide 15
Books (Optional)
EE115C textbook
• J. Rabaey, A. Chandrakasan, B. Nikolić, Digital Integrated Circuits: A Design Perspective (2nd Edition), Prentice Hall, 2003.
Another popular VLSI textbook
• N. Weste, D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective (3rd Edition), Addison Wesley, 2004.
1.16 D. Markovic / Slide 16
Journals and Conferences
Circuits
• IEEE Journal of Solid-State Circuits (JSSC)
• IEEE International Solid-State Circuits Conference (ISSCC)
• European Solid-State Circuits Conference (ESSCIRC)
• Symposium on VLSI Circuits (VLSI)
• Custom Integrated Circuits Conference (CICC)
• Other conferences and journals
CAD
• IEEE Transactions on Computer-Aided Design (TCAD)
• International Conference on Computer Aided Design (ICCAD)
• Design Automation Conference (DAC)
1.17 D. Markovic / Slide 17
Schedule and Syllabus
Intro, Scaling MOS, Delay Models Logic Design Logical Effort Adders Verilog 1 Latches and FFs Verilog 2 Clocking Methods Timing Analysis
Midterm Dynamic Power Leakage Power E-D Optimization Multi-VDD and Clk Physical Synthesis No lecture – projects Thanksgiving holiday Packaging and Test FPGA vs. ASIC Final
Logic Synthesis
1.18 D. Markovic / Slide 18
Grading Policy & Organization
• 5 homeworks: 15%
• 3 CAD labs: 6%
• Project: 30%
• Midterm: 24%
• Final: 25%
1.19 D. Markovic / Slide 19
Gantt Chart Hw #1 Hw #2 Hw #3 Hw #4 Hw #5 Lab #1 Lab #2 Lab #3 Project Midterm Final Exam 1.20 D. Markovic / Slide 20
Homework Topics
• Scaling, Models, Logical Effort
• Logic Design, Verilog ALU
• FFs, Verilog FSM
• Clk, Timing Analysis
• Power, E-D Optimization
1.21 D. Markovic / Slide 21
CAD Labs
• Verilog testbench
• PrimeTime, PrimePower
• UPF (Multi-VDD and Clk)
1.22 D. Markovic / Slide 22
Class Project
• Team project (2 partners)
▪ Start teaming up
• Topic TBD
▪ Details in Week 4
1.23 D. Markovic / Slide 23
Generic Technologies
45nm PDKs + library
• PDK: Cadence 45nm GPDK
• PDK + lib: Nangate open cell library (NCSU FreePDK, ASU PTM)
32/28nm EDK + libraries
• EDK + libs: Synopsys kit and libs
▪ Std cell, I/O, mem, PLL, ref. designs
1.24 D. Markovic / Slide 24
CAD Tools
Cadence
• Logic synthesis
• Physical synthesis
Synopsys
• Circuit simulation (HSPICE)
• Logic synthesis
• Physical synthesis
Mentor
• DRC and LVS
1.25 D. Markovic / Slide 25
Design Description: Gajski-Kuhn Y Chart
Behavioral: Systems, Algorithms, Register Transfer, Logic, Transfer Functions
Structural: Processor, Hardware Modules, ALUs/Registers, Gates/FFs, Transistors
Physical: Physical Partitions, Clusters, Floor Plans, Cell/Module Plans, Rectangles
Activities between domains: Synthesis, Verification, Physical Design
1.26 D. Markovic / Slide 26
Design Styles
[Figure: design styles FPGA, Gate Array, Cell-based, Fully Custom; axes: Ease of design vs. Development Time / Performance]
1.27 D. Markovic / Slide 27
Design Styles
Custom
• Manual PnR: small or regular designs
• Wiring space preassigned
• Long design cycle
Cell-based
• PnR simplified: rows separated by horiz. routing channels
• Wiring space not pre-assigned, cell size can vary
• Fab more complex than gate array
Array
• Mask or field prog. gate array
• Inter cell wiring using layout sw
• Quick to fabricate
FPGA
• Array of cells
• Personalization
▪ Soft: mem (Xilinx)
▪ Hard: a-fuse (Actel)
• Immediate spin
• Prototyping
1.28 D. Markovic / Slide 28
Design Styles: Comparison
Style            FPGA        Gate array   Std-cell       Full custom
Cell size        Fixed       Fixed        Fixed height   Variable
Cell type        Prog.       Fixed        Variable       Variable
Cell placement   Fixed       Fixed        In row         Variable
Interconnect     Prog.       Variable     Variable       Variable
Design time      Very fast   Fast         Medium         Slow
Could use a mix of styles
1.29 D. Markovic / Slide 29
Design Methodology
TOP-DOWN: Architecture, CAD, HDL
BOTTOM-UP: IC Design
[Figure: 7400 package pinout (pins 1-14, VCC on pin 14, GND on pin 7) next to a gate-level circuit with inputs A, B, C and output F = AB + C]
To design complex circuits in a simple way, simple components need to be analyzed in a complex way
1.30 D. Markovic / Slide 30
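The bottom-up view on this slide pairs a 7400 package with the function F = AB + C. As a minimal sketch, assuming the intent is to build F from the 7400's 2-input NAND gates only (the gate arrangement and the function names below are illustrative, not taken from the slide), the mapping can be checked exhaustively:

```python
from itertools import product


def nand(a: int, b: int) -> int:
    """2-input NAND, the gate type provided by a 7400 package."""
    return 1 - (a & b)


def f_nand_only(a: int, b: int, c: int) -> int:
    """F = AB + C using three NAND gates (hypothetical wiring via De Morgan)."""
    ab_n = nand(a, b)        # (AB)'
    c_n = nand(c, c)         # C', a NAND with tied inputs acts as an inverter
    return nand(ab_n, c_n)   # ((AB)' . C')' = AB + C


def truth_table():
    """All 8 input combinations with the NAND-only output."""
    return [(a, b, c, f_nand_only(a, b, c))
            for a, b, c in product((0, 1), repeat=3)]


if __name__ == "__main__":
    for row in truth_table():
        print(*row)
```

Three of the four gates in one package are used; the fourth is left spare.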
EEM216A Fall 2015
Introduction, CMOS Scaling
[email protected]
The First Digital Electronic Computer
Zuse Z3 (1941)
• Binary
• 5 – 10 Hz
• 22b words
• 2K relays
Device: Electromechanical relay
Image: Google images
1.32 D. Markovic / Slide 32
Five Years Later
ENIAC (1946)
• 150kW
• $500K ($6M today)
• Decimal
• 5M hand-soldered joints
• 18K tubes
Device: Vacuum tube
Image: Google images
1.33 D. Markovic / Slide 33
The First PC
Simon (1950)
• 4 ops: +, −x, >, S
• $600
• 2b Reg/ALU
Device: Electromechanical relay
Image: Google images
1.34 D. Markovic / Slide 34
What is the Machine’s Future? Mr. Berkeley's answer: "Simon has two futures. In first place Simon can grow. With another chassis and some wiring and engineering, the machine will be able to compute decimally. Perhaps in six months more, we may be able to have it working on real problems. In the second place, Simon may start a fad of building baby mechanical brains, similar to the hobby of building crystal radio sets that swept the country in the 1920's." [1956 Berkeley Enterprises Report]
1.35 D. Markovic / Slide 35
Squee (The Electronic Robot Squirrel)
[Figure: Squee, with its Eye and Hand labeled. Image: Google images]
[1956 Berkeley Enterprises Report]
1.36 D. Markovic / Slide 36
Integrated Electronics
• BJT: 1948 (Bell Labs)
• IC: 1958 (TI)
• μP: 4004, 60kHz, 1971 (Intel)
1.37 D. Markovic / Slide 37
1965
1.38 D. Markovic / Slide 38
Moore’s Law
• In 1965, Gordon Moore observed that the number of components on a chip was doubling roughly every year; the trend is now usually quoted as a doubling every 18 to 24 months
“The complexity for minimum component costs has increased at a rate of roughly a factor of two per year. Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years. That means by 1975, the number of components per integrated circuit for minimum cost will be 65,000.” [G. Moore, Electronics, 1965]
1.39 D. Markovic / Slide 39
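Moore's 65,000 figure follows from compounding one doubling per year over ten years. The sketch below reproduces that arithmetic; the 1965 starting point of 64 components is an assumption chosen for illustration (it is not stated on the slide), and the function name is hypothetical.

```python
def projected_components(year: int,
                         base_year: int = 1965,
                         base_count: float = 64,      # assumed 1965 starting point
                         doubling_years: float = 1.0  # "a factor of two per year"
                         ) -> float:
    """Exponential extrapolation: count doubles every `doubling_years`."""
    doublings = (year - base_year) / doubling_years
    return base_count * 2 ** doublings


if __name__ == "__main__":
    # Ten doublings from the assumed base: 64 * 2**10
    print(projected_components(1975))
    # Same base with a two-year doubling period grows far more slowly
    print(projected_components(1975, doubling_years=2.0))
```

Under these assumptions the 1975 projection is 65,536, in line with the rounded number in the quote.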
2005
1.40 D. Markovic / Slide 40
Transistors / cm²
Year    Feature size   Transistors/cm²
1972    10μm           2K
1982    5μm            50K
1992    1μm            2M
2002    130nm          100M
2012    20nm           4B
40 years: 2,000,000x improvement
Courtesy: Broadcom
1.41 D. Markovic / Slide 41
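The "2,000,000x in 40 years" figure can be converted into an average doubling period. This is a small sketch of that conversion; the function name is illustrative and the only inputs are the two numbers quoted on the slide.

```python
import math


def doubling_period_years(improvement: float, years: float) -> float:
    """Average time per doubling for a given total improvement factor."""
    doublings = math.log2(improvement)
    return years / doublings


if __name__ == "__main__":
    period = doubling_period_years(2_000_000, 40)
    print(f"{period:.2f} years per doubling ({12 * period:.0f} months)")
```

The result is a little under two years per doubling, which sits inside the commonly quoted 18 to 24 month range.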
Voltage: VDD, VT Size: W, L, tox
1.42 D. Markovic / Slide 42
Dennard’s Classical MOSFET Scaling (1974)
Device or Circuit Parameter          Scaling Factor
Device dimension tox, L, W           1/κ
Doping concentration Na              κ
Voltage V                            1/κ
Current I                            1/κ
Capacitance εA/tox                   1/κ
Delay time/circuit VC/I              1/κ
Power dissipation/circuit VI         1/κ²
Power density VI/A                   1
R. Dennard, JSSC, Oct 1974.
1.43 D. Markovic / Slide 43
Constant E-field Scaling
Voltage and size scale by the same factor, S (S > 1)
• E = V/L = constant
Outcomes:
• More transistors/area (area per transistor: 1/S²)
• Faster delay (1/S)
• Lower energy/op (1/S³)
Problem: VT scaling (exponential leakage)
1.44 D. Markovic / Slide 44
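The constant-field outcomes can be derived from the primitive scaling of dimensions and voltage. The sketch below assumes first-order relations (C ∝ εA/tox, I ∝ Cox·(W/L)·V², delay ∝ CV/I, energy/op ∝ CV²); the function and key names are illustrative, and exact fractions are used so the factors read as 1/S, 1/S², 1/S³.

```python
from fractions import Fraction


def constant_field_scaling(s) -> dict:
    """First-order scaling factors when sizes and voltages all shrink by S."""
    s = Fraction(s)
    dim = 1 / s                      # W, L, tox
    volt = 1 / s                     # VDD, VT, so E = V/L stays constant
    area = dim * dim                 # W * L
    cap = area / dim                 # eps * A / tox
    cox = 1 / dim                    # per-area oxide capacitance, 1/tox
    current = cox * (dim / dim) * volt ** 2   # Cox * (W/L) * V^2
    power = volt * current
    return {
        "area": area,
        "capacitance": cap,
        "current": current,
        "delay": cap * volt / current,        # C*V/I
        "energy_per_op": cap * volt ** 2,     # C*V^2
        "power": power,
        "power_density": power / area,
    }


if __name__ == "__main__":
    for name, factor in constant_field_scaling(2).items():
        print(f"{name:15s} {factor}")
```

With S = 2 this gives area 1/4, delay 1/2, energy/op 1/8, and an unchanged power density, which is why constant-field scaling was so attractive while VT could still be lowered.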
Constant E-field Scaling
Ended at the 130nm node
1.45 D. Markovic / Slide 45
Historical Scaling Trends
[Figure: clock frequency (MHz, log scale from 0.1 to 10000) vs. year (1970 to 2010) for 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, Pentium Pro, Pentium 4; scaling regimes marked Const VDD, Const E, General; annotations: power density, leakage power]
Courtesy: S. Borkar (Intel)
1.46 D. Markovic / Slide 46
Technology Scaling is Power Driven
[Figure: timeline marked 1970, 1985, 2000 showing successive technologies Bipolar, NMOS, CMOS, and ???, with "power wall" labels on NMOS, CMOS, and ???]
• CMOS delivered better cost performance
▪ It was more energy efficient
▪ It improved the integration level
1.47 D. Markovic / Slide 47
Power Wall
• Technologies: bipolar, nMOS, CMOS
• Constant voltage scaling: increasing power
[Figure: module heat flux (W/cm²) for Bipolar and CMOS]
Courtesy: Roger Schmidt (IBM)
1.48 D. Markovic / Slide 48
Scaling Scenarios: Fixed V, Fixed E, General
Parameter         Relation      Fixed V   Fixed E   General
W, L, tox                       1/S       1/S       1/S
VDD, VT                         1         1/S       1/U
Area/Device       WL            1/S²      1/S²      1/S²
Cox               1/tox         S         S         S
Cgate             Cox WL        1/S       1/S       1/S
kn, kp            Cox W/L       S         S         S
Isat              Cox WV        1         1/S       1/U
Current Density   Isat/Area     S²        S         S²/U
Ron               V/Isat        1         1         1
Intr. Delay       Ron Cgate     1/S       1/S       1/S
Power             Isat V        1         1/S²      1/U²
P Density         Power/Area    S²        1         S²/U²
1.49 D. Markovic / Slide 49
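Every row of the table follows from two primitives: dimensions scale as 1/S and voltages as 1/U. The sketch below derives each row from the "Relation" column by tracking exponents of S and U; Fixed V is the special case U = 1 and Fixed E is U = S. The helper names and dictionary keys are illustrative, not part of the course material.

```python
# A scaling factor S**a * U**b is stored as the exponent pair (a, b).
ONE = (0, 0)
DIM = (-1, 0)    # W, L, tox scale as 1/S
VOLT = (0, -1)   # VDD, VT scale as 1/U


def mul(x, y):
    return (x[0] + y[0], x[1] + y[1])


def div(x, y):
    return (x[0] - y[0], x[1] - y[1])


def general_exponents() -> dict:
    """Derive each table row from its relation, in the general scenario."""
    area = mul(DIM, DIM)                 # W * L
    cox = div(ONE, DIM)                  # 1 / tox
    cgate = mul(cox, area)               # Cox * W * L
    k = mul(cox, div(DIM, DIM))          # Cox * W / L
    isat = mul(mul(cox, DIM), VOLT)      # Cox * W * V
    ron = div(VOLT, isat)                # V / Isat
    power = mul(isat, VOLT)              # Isat * V
    return {
        "Area/Device": area, "Cox": cox, "Cgate": cgate, "kn, kp": k,
        "Isat": isat, "Current Density": div(isat, area), "Ron": ron,
        "Intr. Delay": mul(ron, cgate), "Power": power,
        "P Density": div(power, area),
    }


def scenario(s: float, u: float) -> dict:
    """Numeric factors for given S and U (Fixed V: u=1, Fixed E: u=s)."""
    return {name: s ** a * u ** b
            for name, (a, b) in general_exponents().items()}


if __name__ == "__main__":
    for label, u in (("Fixed V", 1), ("Fixed E", 2), ("General", 1.5)):
        print(label, scenario(2, u))
```

For S = 2, power density comes out as 4 with fixed voltage, 1 with fixed field, and about 1.78 with U = 1.5, which shows how power density climbs once voltage scaling lags size scaling.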
General Scaling
Size scaling S > Voltage scaling U
Voltage scaling slowing down
• VT determined by leakage
• tox also set by leakage
Current increasing by stressing silicon
1.50 D. Markovic / Slide 50
Strained Silicon (90nm)
• Increase current to make up the loss due to U < S
Courtesy: Intel
1.51 D. Markovic / Slide 51
High-K Metal-Gate (45nm)
• Restored tox scaling at the 45nm node
Courtesy: Intel
1.52 D. Markovic / Slide 52
45nm Interconnects
Global wires, low-R power grid
Local interconnects (max density) Courtesy: Intel 1.53 D. Markovic / Slide 53
45nm Interconnects
• M9: very-low-R power routing
Courtesy: Intel
1.54 D. Markovic / Slide 54
Changing Layout Styles 65nm layout
32nm layout
Courtesy: Intel 1.55 D. Markovic / Slide 55
Challenges in Scaling
Fixed E (Past)
• Scaling reduced cost
• Scaling increased performance
• Performance constrained
• Active power dominates
General (Now)
• Scaling reduces cost
• Materials increase performance
• Power constrained
• Standby power dominates
[Figure: technology nodes 130, 90, 65, 45, 32 (nm). Courtesy: Intel]
1.56 D. Markovic / Slide 56
Semiconductor Scaling
• Integration density continues to grow
• RC delay did not scale
▪ RC delay started to overtake gate delay
• VT did not scale
▪ To cope with leakage
• VDD did not scale
▪ To sustain performance growth
1.57 D. Markovic / Slide 57
The Limits
Theoretical (Physics)
Practical (Physical + manufacturing cost)
• System
• Circuit
• Device
• Material
• Fundamental
[J. Meindl, Proc. IEEE, 1995]
1.58 D. Markovic / Slide 58
Circuit Limits
• Logic levels (gain)
• Energy/transition
• Delay
• Global interconnect
1.59 D. Markovic / Slide 59
Logic Levels (Gain)
• Distinguish logic 0’s from 1’s
• Restore logic levels
Requirement: |Gain| > 1
VDD ≥ (2kT/q) [1 + …] ln(2 + …) ≈ 4kT/q ≈ 0.1 V (T = 300K)
[J. Meindl, Proc. IEEE, 1995]
1.60 D. Markovic / Slide 60
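The "≈ 4" and "≈ 0.1 (T = 300K)" figures on this slide are consistent with a supply-voltage floor of a few thermal voltages kT/q. The sketch below is a numeric check under that assumption; the multiplier of 4 is read from the slide, while the function names are illustrative.

```python
BOLTZMANN = 1.380649e-23          # J/K (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)


def thermal_voltage(temperature_k: float) -> float:
    """kT/q in volts."""
    return BOLTZMANN * temperature_k / ELEMENTARY_CHARGE


def min_supply_voltage(temperature_k: float, beta: float = 4.0) -> float:
    """Assumed floor of beta * kT/q; beta = 4 matches the slide's '≈ 4'."""
    return beta * thermal_voltage(temperature_k)


if __name__ == "__main__":
    print(f"kT/q at 300 K:   {thermal_voltage(300) * 1e3:.2f} mV")
    print(f"4 kT/q at 300 K: {min_supply_voltage(300):.3f} V")
```

At 300 K the thermal voltage is about 25.9 mV, so four thermal voltages is roughly 0.1 V.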
Circuit Limits (Cont.)
• Energy/transition
▪ Neglecting Estatic
• Delay
▪ Limited by IDSat
• Global interconnect
▪ Interconnect delay should not exceed gate delay
[Equations, symbols lost: … = 1/2 … ∝ (… − …); … = 1/2 … ∝ …; 2.3 … + … < 2.3 …]
1.61 D. Markovic / Slide 61
Practical Limits: Minimum Feature Size
~130nm is the most cost-effective technology (the last generation for which deep UV microlithography will suffice) [J. Meindl, Proc. IEEE, 1995]
1.62 D. Markovic / Slide 62
Practical Limits: Die Size
[Figure: die size vs. wafer size; labels: 25 mm (8” wafer), 40 mm (12” wafer), 50 mm (16” wafer)]
[J. Meindl, Proc. IEEE, 1995]
1.63 D. Markovic / Slide 63
Practical Limits: Packing Efficiency Packing efficiency = # transistors / min feature area
3D integration
Layout density
[J. Meindl, Proc. IEEE, 1995] 1.64 D. Markovic / Slide 64
Approaching Atomic Limits
• Hair: 200,000x
• 20nm FET: 80x
• Si atom (0.25nm): 1x
1.65 D. Markovic / Slide 65
We Have Reached the 130W Power Limit
N.B. Asadi, PhD Thesis, Stanford 2010. 1.66 D. Markovic / Slide 66
Moore’s Law and the Long Term What level?
1965
2005 1.67 D. Markovic / Slide 67
Moore’s Law and the Long Term What level?
Within your working life?
1965
2005
When? 1.68 D. Markovic / Slide 68
Scaling Toward 10nm Node
[Figure: roadmap from Bulk/SOI CMOS through Multi-gate CMOS to Post-Silicon devices across the 65nm, 45nm, 32nm, 22nm, 16nm, and 12nm nodes]
• Technology: alternative structures and materials, post-silicon devices
• Design: billion transistors, GHz operation
Source: K. Cao (ASU)
1.69 D. Markovic / Slide 69
CMOS Replacement?
• Replacing CMOS by another more energy efficient technology is a distant prospect now
• Low-power high-speed CMOS technology is becoming an indispensable, rather than desirable, technology
• Power is the main challenge we need to address
1.70 D. Markovic / Slide 70
Wiki / References Before next lecture:
Review EE115C Lectures 2-5
(2) MOS IV Model (3) MOS RC Model (4) Inverter VTC (5) Propagation Delay
1.71 D. Markovic / Slide 71