Design Of Dual Port SRAM Using Verilog HDL
TABLE OF CONTENTS
List of figures
List of tables
Chapter 1 INTRODUCTION
    1.1 Design Objectives
    1.2 ACCOMPLISHMENTS
Chapter 2 LITERATURE REVIEW
Chapter 3
    3.1 Design of SRAM
    3.2 SRAM Operation
Chapter 4 Introduction to FPGA Design Flow
    4.1 INTRODUCTION TO VLSI & FPGA DESIGN FLOW
        4.1.1 Design Entry
        4.1.2 Synthesis
        4.1.3 Implementation
            4.1.3.1 Translate
            4.1.3.2 Map
            4.1.3.3 Place and Route
        4.1.4 Device Programming
        4.1.5 Design Verification
        4.1.6 Behavioral Simulation
        4.1.7 Functional Simulation
        4.1.8 Static Timing Analysis
Chapter 5 VERILOG INTRODUCTION
Chapter 6 Simulation Results
Chapter 7 Conclusion and Future Scope
Bibliography and References
ABSTRACT
Low-power, low-area Static Random Access Memory (SRAM) is essential for System on Chip (SoC) technology. Dual-Port (DP) SRAM greatly reduces the power consumption by using full current-mode techniques for read/write operation, and reduces the area by using a Single-Port (SP) cell. An 8-bit DP-SRAM is proposed in this study. A negative bit-line technique during write has been utilized as the write-assist solution. The negative voltage is generated on-chip using capacitive coupling. The proposed circuit design topology does not affect the read operation for bit-interleaved architectures, enabling high-speed operation. The design was carried out in Xilinx ISE 14. Simulation results and a comparative study of the present scheme with state-of-the-art conventional schemes show that the proposed scheme is superior in terms of process-variation impact, area overhead, timings and dynamic power consumption. The proposed negative bit-line technique can be used to improve the write ability of 6T Single-Port (SP) as well as 8T DP and other multiport SRAM cells.
CHAPTER 1 INTRODUCTION
Most systems contain the following kinds of memories:
• PROM
• EPROM
• EEPROM/Flash
• DRAM
• SRAM
PROMs, EPROMs and Flash memories come under the category of nonvolatile memories. Nonvolatile memories are devices that will store data even when the power to the device is removed. PROMs, EPROMs and Flash memories differ in the technology used, in the method by which the user reprograms the device, and in the method by which the user erases the data in the memory device.
SRAM and DRAM are random access memories that can store data as long as power is applied to the device. If the power is ever removed, all data that was stored in the memory will be lost. Even when powered, data in DRAMs could be lost if it is not periodically refreshed, while in SRAMs the data can be stored without any kind of extra processing or refreshing. The data will remain in the SRAM once it has been written there, as long as the power supply to the device is maintained.
SRAMs are differentiated from their other memory counterparts by the type of the memory cell. Nearly all SRAMs use either a 4-transistor or a 6-transistor memory cell. These cell structures allow data to be stored for an indefinite amount of time in the device as long as it is powered. Figure 1 below shows the 4-transistor and the 6-transistor cell. (These are usually referred to as the 4T and the 6T cells respectively.) The SRAM cell is formed by two cross-coupled inverters. System evolution over time has led to the creation of different types of SRAMs. The next section goes into the details of the different kinds of SRAMs and their applications.
Figure: A six-transistor CMOS SRAM cell.
Tools Required:
Simulators: Modelsim 6.5b, Xilinx 12.1i ISim Simulator
Synthesis: Xilinx 14 XST (Xilinx Synthesis Technology) Synthesizer
FPGA Family: Xilinx Spartan 3E XC3S500E
1.2 ACCOMPLISHMENTS:
This section describes the work done in the project. The accomplishments are categorized into four main phases of the project, in chronological order:
1. Literature Review
   a. Surveyed the theory of memories.
   b. Studied the concept of static RAM.
2. Carried out initial pen-and-paper work and designed a top-level block diagram of the SRAM.
3. Design Phase
   a. Designed a top-level module of the SRAM in Verilog.
4. Verification Phase
   a. Wrote a self-checking test bench consisting of driver, monitor and checker components in Verilog.
   b. Applied different test cases for the SRAM in the driver section of the test bench and performed behavioral simulation of the top-level module.
5. Synthesis Phase
   a. Performed logic synthesis and the Translate, Map, and Place and Route processes, and generated the synthesis report.
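The driver, monitor and checker structure of the self-checking test bench can be sketched behaviorally. The following is a minimal Python stand-in, not the project's Verilog test bench: the names DualPortSram and run_self_checking_test, the 16 x 8 array size, the read-before-write behavior and the rule that port A wins when both ports write the same address are all assumptions made for illustration.

```python
import random


class DualPortSram:
    """Behavioral model: two independent ports sharing one storage array."""

    def __init__(self, addr_bits=4, data_bits=8):
        self.depth = 1 << addr_bits
        self.mask = (1 << data_bits) - 1
        self.mem = [0] * self.depth

    def cycle(self, port_a, port_b):
        """Apply one clock cycle; each port is a tuple (we, addr, din).

        Reads return the value stored before this cycle's writes
        (read-before-write, an assumption). If both ports write the same
        address, port A wins (also an assumption).
        """
        reads = tuple(self.mem[addr] for _, addr, _ in (port_a, port_b))
        # Apply port B first so that a port A write takes priority.
        for we, addr, din in (port_b, port_a):
            if we:
                self.mem[addr] = din & self.mask
        return reads


def run_self_checking_test(cycles=200, seed=1):
    """Driver, monitor and checker in one loop; returns the mismatch count."""
    rng = random.Random(seed)
    dut = DualPortSram()
    reference = [0] * dut.depth  # golden copy of the memory used by the checker
    errors = 0
    for _ in range(cycles):
        # Driver: random stimulus on both ports.
        a = (rng.random() < 0.5, rng.randrange(dut.depth), rng.randrange(256))
        b = (rng.random() < 0.5, rng.randrange(dut.depth), rng.randrange(256))
        # Monitor: capture what the model drives on its read outputs.
        dout_a, dout_b = dut.cycle(a, b)
        # Checker: compare against the reference, then update the reference.
        if (dout_a, dout_b) != (reference[a[1]], reference[b[1]]):
            errors += 1
        for we, addr, din in (b, a):
            if we:
                reference[addr] = din
    return errors
```

A run that returns zero mismatches only shows that the model and the reference agree on these assumptions; the collision and read-during-write rules of a real dual-port SRAM should be taken from the actual design.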
CHAPTER 2 LITERATURE REVIEW
Semiconductor memory is an electronic data storage device, often used as computer memory, implemented on a semiconductor-based integrated circuit. It is made in many different types and technologies. Semiconductor memory has the property of random access, which means that it takes the same amount of time to access any memory location, so data can be efficiently accessed in any random order.[1] This contrasts with data storage media such as hard disks and CDs, which read and write data consecutively, so that the data can only be accessed in the same sequence in which it was written. Semiconductor memory also has much faster access times than other types of data storage; a byte of data can be written to or read from semiconductor memory within a few nanoseconds, while access time for rotating storage such as hard disks is in the range of milliseconds. For these reasons it is used for main computer memory (primary storage), to hold data the computer is currently working on, among other uses. Shift registers, processor registers, data buffers and other small digital registers that have no memory address decoding mechanism are not considered memory, although they also store digital data.
1.1 Description
In a semiconductor memory chip, each bit of binary data is stored in a tiny circuit called a memory cell, consisting of one to several transistors. The memory cells are laid out in rectangular arrays on the surface of the chip. The 1-bit memory cells are grouped in small units called words, which are accessed together as a single memory address. Memory is manufactured in word lengths that are usually a power of two, typically N = 1, 2, 4 or 8 bits. Data is accessed by means of a binary number called a memory address applied to the chip's address pins, which specifies which word in the chip is to be accessed. If the memory address consists of M bits, the number of addresses on the chip is 2^M, each containing an N-bit word. Consequently, the amount of data stored in each chip is N x 2^M bits.[1] The data capacity is usually a power of two: 2, 4, 8, 16, 32, 64, 128, 256 and 512, measured in kibibits, mebibits, gibibits or tebibits, etc. Currently (2014) the largest semiconductor memory chips hold a few gibibits of data, but higher capacity memory is constantly being developed. By combining several integrated circuits, memory can be arranged into a larger word length and/or address space than what is offered by each chip, often but not necessarily a power of two.[1]
The two basic operations performed by a memory chip are "read", in which the data contents of a memory word are read out (nondestructively), and "write", in which data is stored in a memory word, replacing any data that was previously stored there. To increase data rate, in some of the latest types of memory chips such as DDR SDRAM, multiple words are accessed with each read or write operation.
In addition to standalone memory chips, blocks of semiconductor memory are integral parts of many computer and data processing integrated circuits. For example, the microprocessor chips that run computers contain cache memory to store instructions awaiting execution.
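The capacity relation above (2^M addresses of N bits each) can be checked numerically. This is a small illustrative sketch; the function name memory_capacity_bits and the example values (11 address bits, 8-bit words) are assumptions chosen for the example.

```python
def memory_capacity_bits(address_bits, word_bits):
    """Return (number of words, total bits) for M address bits and N-bit words."""
    words = 2 ** address_bits          # 2^M addressable words
    return words, words * word_bits    # N x 2^M bits in total


words, bits = memory_capacity_bits(address_bits=11, word_bits=8)
print(words, bits)  # prints: 2048 16384
```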
"$(
1.2 Types
Figure: RAM chips for computers usually come on removable memory modules like these. Additional memory can be added to the computer by plugging in additional modules.
RAM (Random access memory) has become a generic term for any semiconductor memory that can be written to, as well as read from, in contrast to ROM (below), which can only be read. All semiconductor memory, not just RAM, has the property of random access.
• DRAM (Dynamic random-access memory), which uses memory cells consisting of one capacitor and one transistor to store each bit. This is the cheapest and highest in density, so it is used for the main memory in computers. However, the electric charge that stores the data in the memory cells slowly leaks off, so the memory cells must be periodically refreshed (rewritten), requiring additional circuitry. The refresh process is automatic and transparent to the user.
  o FPM DRAM (Fast page mode DRAM): An older type of asynchronous DRAM that improved on previous types by allowing repeated accesses to a single "page" of memory to occur at a faster rate. Used in the mid-1990s.
  o EDO DRAM (Extended data out DRAM): An older type of asynchronous DRAM which had faster access time than earlier types by being able to initiate a new memory access while data from the previous access was still being transferred. Used in the later part of the 1990s.
  o VRAM (Video random-access memory): memory once used for the frame buffers of video adapters (video cards).
  o SDRAM (Synchronous dynamic random-access memory): This was a reorganization of the DRAM memory chip which added a clock line to enable it to operate in synchronism with the computer's memory bus clock. The data on the chip is divided into banks so it can work on several memory accesses simultaneously, in separate banks. It became the dominant type of computer memory by about the year 2000.
  o DDR SDRAM (Double data rate SDRAM): This was an increased data rate modification, enabling the chip to transfer twice the memory data (two consecutive words) on each clock cycle by double pumping, that is, transferring data on both the leading and trailing edges of the clock pulse. Extensions of this idea are the current (2012) technique being used to increase memory access rate and bandwidth. Since it is proving difficult to further increase the internal clock speed of memory chips, these chips increase data rate by transferring data in larger blocks:
    - DDR2 SDRAM transfers 4 consecutive words per internal clock cycle.
    - DDR3 SDRAM transfers 8 consecutive words per internal clock cycle.
    - DDR4 SDRAM transfers 16 consecutive words per internal clock cycle. It is scheduled to debut in 2012.
  o RDRAM (Rambus DRAM): an alternate double data rate memory standard that was used on some Intel systems but ultimately lost out to DDR SDRAM.
  o SGRAM (Synchronous graphics RAM): a specialized type of SDRAM made for graphics adaptors (video cards). It can perform graphics-related operations such as bit masking and block write, and can open two pages of memory at once.
  o PSRAM (Pseudostatic RAM): This is DRAM which has circuitry to perform memory refresh on the chip, so that it acts like SRAM, allowing the external memory controller to be shut down to save energy. It is used in a few portable game controllers such as the Wii.
• SRAM (Static random-access memory), which relies on several transistors forming a digital flip-flop to store each bit. This is less dense and more expensive per bit than DRAM, but faster, and does not require memory refresh. It is used for smaller cache memories in computers.
• Content-addressable memory: This is a specialized type in which, instead of accessing data using an address, a data word is applied and the memory returns the location if the word is stored in the memory. It is mostly incorporated in other chips such as microprocessors, where it is used for cache memory.
Nonvolatile memory preserves the data stored in it during periods when the power to the chip is turned off. Therefore it is used for the memory in portable devices, which don't have disks, and for removable memory cards, among other uses. Major types are:[2][3]
• ROM (Read-only memory): This is designed to hold permanent data, and in normal operation is only read from, not written to. Although many types can be written to, the writing process is slow and usually all the data in the chip must be rewritten at once. It is usually used to store system software which must be immediately accessible to the computer, such as the BIOS program which starts the computer, and the software (microcode) for portable devices and embedded computers such as microcontrollers.
  o Mask programmed ROM: In this type the data is programmed into the chip during manufacture, so it is only used for large production runs. It cannot be rewritten with new data.
  o PROM (Programmable read-only memory): In this type the data is written into the chip before it is installed in the circuit, but it can only be written once. The data is written by plugging the chip into a device called a PROM programmer.
  o EPROM (Erasable programmable read-only memory): In this type the data can be rewritten by removing the chip from the circuit board, exposing it to ultraviolet light to erase the existing data, and plugging it into a PROM programmer. The IC package has a small transparent "window" in the top to admit the UV light. It is often used for prototypes and small production run devices, where the program in it may have to be changed at the factory.
    Figure: 4M EPROM, showing the transparent window used to erase the chip.
  o EEPROM (Electrically erasable programmable read-only memory): In this type the data can be rewritten electrically, while the chip is on the circuit board, but the writing process is slow. This type is used to hold firmware, the low-level microcode which runs hardware devices, such as the BIOS program in most computers, so that it can be updated.
• NVRAM (Flash memory): In this type the writing process is intermediate in speed between EEPROMs and RAM memory; it can be written to, but not fast enough to serve as main memory. It is often used as a semiconductor version of a hard disk, to store files. It is used in portable devices such as PDAs, USB flash drives, and removable memory cards used in digital cameras and cellphones.
Static random-access memory (SRAM or static RAM) is a type of semiconductor memory that uses bistable latching circuitry to store each bit. The term static differentiates it from dynamic RAM (DRAM), which must be periodically refreshed. SRAM exhibits data remanence,[1] but it is still volatile in the conventional sense that data is eventually lost when the memory is not powered.
1.3 Applications and uses
Figure: SRAM cells on the die of an STM32F103.
Figure: Comparison image of 180 nanometre SRAM cells on an STM32F103.
1.3.1 Characteristics
SRAM is more expensive and less dense than DRAM and is therefore not used for high-capacity, low-cost applications such as the main memory in personal computers.
1.3.1.1 Clock rate and power
The power consumption of SRAM varies widely depending on how frequently it is accessed; it can be as power-hungry as dynamic RAM when used at high frequencies, and some ICs can consume many watts at full bandwidth. On the other hand, static RAM used at a somewhat slower pace, such as in applications with moderately clocked microprocessors, draws very little power and can have a nearly negligible power consumption when sitting idle, in the region of a few microwatts.
Static RAM exists primarily as:
• general purpose products
  o with asynchronous interface, such as the ubiquitous 28-pin 8K x 8 and 32K x 8 chips (often but not always named something along the lines of 6264 and 62C256 respectively), as well as similar products up to 16 Mbit per chip
  o with synchronous interface, usually used for caches and other applications requiring burst transfers, up to 18 Mbit (256K x 72) per chip
• integrated on chip
  o as RAM or cache memory in microcontrollers (usually from around 32 bytes up to 128 kilobytes)
  o as the primary caches in powerful microprocessors, such as the x86 family, and many others (from 8 KB up to many megabytes)
  o to store the registers and parts of the state machines used in some microprocessors (see register file)
  o on application-specific ICs, or ASICs (usually in the order of kilobytes)
  o in FPGAs and CPLDs
1.3.1.2 Embedded use
• Many categories of industrial and scientific subsystems, automotive electronics, and similar, contain static RAM.
• Some amount (kilobytes or less) is also embedded in practically all modern appliances, toys, etc. that implement an electronic user interface.
• Several megabytes may be used in complex products such as digital cameras, cell phones, synthesizers, etc.
SRAM in its dual-ported form is sometimes used for real-time digital signal processing circuits.
1.3.1.3 In computers
SRAM is also used in personal computers, workstations, routers and peripheral equipment: CPU register files, internal CPU caches and external burst mode SRAM caches, hard disk buffers, router buffers, etc. LCD screens and printers also normally employ static RAM to hold the image displayed (or to be printed).
1.3.1.4 Hobbyists
Hobbyists, specifically homebuilt processor enthusiasts,[2] often prefer SRAM due to the ease of interfacing. It is much easier to work with than DRAM, as there are no refresh cycles and the address and data buses are directly accessible rather than multiplexed. In addition to buses and power connections, SRAM usually requires only three controls: Chip Enable (CE), Write Enable (WE) and Output Enable (OE). In synchronous SRAM, Clock (CLK) is also included.
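How the three control signals gate a simple asynchronous SRAM can be sketched as follows. This is an illustrative model only: the class name AsyncSram, the 2K x 8 default size and the use of active-high booleans for CE, WE and OE are assumptions (real parts commonly use active-low pins, so the data sheet of the actual device should be consulted).

```python
class AsyncSram:
    """Behavioral model of an asynchronous SRAM with CE, WE and OE controls.

    Controls are modeled as active-high booleans for readability
    (an assumption; many real devices use active-low pins).
    """

    def __init__(self, addr_bits=11, data_bits=8):
        self.mask = (1 << data_bits) - 1
        self.mem = [0] * (1 << addr_bits)

    def access(self, ce, we, oe, addr, data_in=0):
        """Return the value driven on the data bus, or None for high impedance."""
        if not ce:
            return None                    # chip deselected: bus released
        if we:
            self.mem[addr] = data_in & self.mask
            return None                    # write cycle: outputs disabled
        if oe:
            return self.mem[addr]          # read cycle: drive the stored word
        return None                        # selected, but outputs disabled


ram = AsyncSram()
ram.access(ce=True, we=True, oe=False, addr=5, data_in=0xA5)   # write
print(ram.access(ce=True, we=False, oe=True, addr=5))          # prints: 165
```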
"$+ "$+$"
1.4 Types of SRAM
1.4.1 Non-volatile SRAM
Non-volatile SRAMs, or nvSRAMs, have standard SRAM functionality, but they save the data when the power supply is lost, ensuring preservation of critical information. nvSRAMs are used in a wide range of situations (networking, aerospace, and medical, among many others[3]) where the preservation of data is critical and where batteries are impractical.
1.4.2 Asynchronous SRAM
Asynchronous SRAMs are available from 4 Kb to 64 Mb. The fast access time of SRAM makes asynchronous SRAM appropriate as main memory for small cache-less embedded processors used in everything from industrial electronics and measurement systems to hard disks and networking equipment, among many other applications. They are used in various applications such as switches and routers, IP phones, IC testers, DSLAM cards and automotive electronics.
1.4.3 By transistor type
• Bipolar junction transistor (used in TTL and ECL): very fast but consumes a lot of power
• MOSFET (used in CMOS): low power and very common today
1.4.4 By function
• Asynchronous: independent of clock frequency; data in and data out are controlled by address transition
• Synchronous: all timings are initiated by the clock edge(s). Address, data in and other control signals are associated with the clock signals
1.4.5 By feature
• ZBT (ZBT stands for zero bus turnaround): the turnaround is the number of clock cycles it takes to change access to the SRAM from write to read and vice versa. The turnaround for ZBT SRAMs, or the latency between read and write cycles, is zero.
• syncBurst (syncBurst SRAM or synchronous-burst SRAM): features synchronous burst write access to the SRAM to speed up write operations
• DDR SRAM: synchronous, single read/write port, double data rate I/O
• Quad Data Rate SRAM: synchronous, separate read and write ports, quadruple data rate I/O
CHAPTER 3 DESIGN OF SRAM IN VERILOG
3.1 Design of SRAM
A typical SRAM cell is made up of six MOSFETs. Each bit in an SRAM is stored on four transistors (M1, M2, M3, M4) that form two cross-coupled inverters. This storage cell has two stable states, which are used to denote 0 and 1. Two additional access transistors serve to control the access to a storage cell during read and write operations. In addition to such six-transistor (6T) SRAM, other kinds of SRAM chips use 4, 8, 10 (4T, 8T, 10T SRAM), or more transistors per bit.[4][5][6] Four-transistor SRAM is quite common in stand-alone SRAM devices (as opposed to SRAM used for CPU caches), implemented in special processes with an extra layer of polysilicon, allowing for very high-resistance pull-up resistors.[7] The principal drawback of using 4T SRAM is increased static power due to the constant current flow through one of the pull-down transistors.
Four-transistor SRAM provides advantages in density at the cost of manufacturing complexity. The resistors must have small dimensions and large values. Additional transistors per cell are sometimes used to implement more than one (read and/or write) port, which may be useful in certain types of video memory and register files implemented with multi-ported SRAM circuitry. Generally, the fewer transistors needed per cell, the smaller each cell can be. Since the cost of processing a silicon wafer is relatively fixed, using smaller cells, and so packing more bits on one wafer, reduces the cost per bit of memory.
Figure: A six-transistor CMOS SRAM cell.
Memory cells that use fewer than four transistors are possible, but such 3T[8][9] or 1T cells are DRAM, not SRAM (even the so-called 1T-SRAM). Access to the cell is enabled by the word line (WL in the figure), which controls the two access transistors M5 and M6, which in turn control whether the cell should be connected to the bit lines, BL and BL-bar (the complement of BL). They are used to transfer data for both read and write operations. Although it is not strictly necessary to have two bit lines, both the signal and its inverse are typically provided in order to improve noise margins.
During read accesses, the bit lines are actively driven high and low by the inverters in the SRAM cell. This improves SRAM bandwidth compared to DRAMs: in a DRAM, the bit line is connected to storage capacitors, and charge sharing causes the bit line to swing upwards or downwards. The symmetric structure of SRAMs also allows for differential signaling, which makes small voltage swings more easily detectable. Another difference with DRAM that contributes to making SRAM faster is that commercial chips accept all address bits at a time. By comparison, commodity DRAMs have the address multiplexed in two halves, i.e. higher bits followed by lower bits, over the same package pins in order to keep their size and cost down.
The size of an SRAM with m address lines and n data lines is 2^m words, or 2^m x n bits. The most common word size is 8 bits, meaning that a single byte can be read or written to each of 2^m different words within the SRAM chip. Several common SRAM chips have 11 address lines (thus a capacity of 2^11 = 2,048 = 2k words) and an 8-bit word, so they are referred to as "2k x 8 SRAM".
3.2 SRAM operation
An SRAM cell has three different states. It can be in: standby (the circuit is idle), reading (the data has been requested) and writing (updating the contents). To operate in read mode and write mode, the SRAM should have "readability" and "write stability" respectively. The three different states work as follows:
3.2.1 Standby
If the word line is not asserted, the access transistors M5 and M6 disconnect the cell from the bit lines. The two cross-coupled inverters formed by M1 to M4 will continue to reinforce each other as long as they are connected to the supply.
3.2.2 Reading
Assume that the content of the memory is a 1, stored at Q. The read cycle is started by precharging both the bit lines to a logical 1, then asserting the word line WL, enabling both the access transistors. The second step occurs when the values stored in Q and Q-bar are transferred to the bit lines by leaving BL at its precharged value and discharging BL-bar through M1 and M5 to a logical 0 (i.e. eventually discharging through the transistor M1, as it is turned on because Q is logically set to 1). On the BL side, the transistors M4 and M6 pull the bit line toward VDD, a logical 1.
3.2.3 Writing
A write cycle begins by applying the value to be written to the bit lines. If we wish to write a 0, we would apply a 0 to the bit lines, i.e. setting BL-bar to 1 and BL to 0. This is similar to applying a reset pulse to an SR-latch, which causes the flip-flop to change state. A 1 is written by inverting the values of the bit lines. WL is then asserted and the value that is to be stored is latched in. Note that the reason this works is that the bit line input drivers are designed to be much stronger than the relatively weak transistors in the cell itself, so that they can easily override the previous state of the cross-coupled inverters. Careful sizing of the transistors in an SRAM cell is needed to ensure proper operation.
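The three cell states can be summarized in a logic-level sketch. This is a simplified illustration in Python, not a circuit simulation: it ignores transistor sizing, noise margins and sense amplification, and the class and method names are hypothetical.

```python
class SixTransistorCell:
    """Logic-level model of a 6T SRAM cell (no analog effects)."""

    def __init__(self, q=0):
        self.q = q  # storage node Q; Q-bar is always its complement

    def standby(self):
        """WL low: access transistors are off and the inverters hold the value."""
        return self.q

    def read(self):
        """Precharge both bit lines to 1, assert WL, return (BL, BL-bar)."""
        bl, bl_bar = 1, 1          # precharge phase
        if self.q == 1:
            bl_bar = 0             # the side storing 0 discharges its bit line
        else:
            bl = 0
        return bl, bl_bar

    def write(self, value):
        """Drive BL = value and BL-bar = complement, then assert WL."""
        bl, bl_bar = value, 1 - value
        self.q = bl                # strong bit line drivers override the latch
        return bl, bl_bar


cell = SixTransistorCell()
cell.write(1)
print(cell.read())  # prints: (1, 0)
```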
Bus behavior
RAM with an access time of 70 ns will output valid data within 70 ns from the time that the address lines are valid. The data will also remain valid for a hold time (5 to 10 ns). Rise and fall times also influence valid timeslots, by approximately 5 ns. By reading the lower part of an address range in sequence (page cycle), one can read with a significantly shorter access time (30 ns).
CHAPTER 4 INTRODUCTION TO VLSI & FPGA DESIGN FLOW
Introduction to VLSI
4.1 Historical Perspective
The electronics industry has achieved phenomenal growth over the last two decades, mainly due to the rapid advances in integration technologies and large-scale systems design; in short, due to the advent of VLSI. [...]
This trend is expected to continue, with very important implications on [...] transistors, depending on the function. State-of-the-art examples of ULSI chips, such as the DEC Alpha or the INTEL Pentium, contain 3 to 6 million transistors.

ERA                                     DATE    COMPLEXITY (number of logic blocks per chip)
Single transistor                       1959    less than 1
Unit logic (one gate)                   1960    1
Multi-function                          1962    2 - 4
Complex function                        1964    5 - 20
Medium Scale Integration (MSI)          1967    20 - 200
Large Scale Integration (LSI)           1972    200 - 2000
Very Large Scale Integration (VLSI)     1978    2000 - 20000
Ultra Large Scale Integration (ULSI)    1989    20000 - ?

Table-3.1: Evolution of logic complexity in integrated circuits.
The most important message here is that the logic complexity per chip has been (and still is) increasing exponentially. The monolithic integration of a large number of functions on a single chip usually provides:
• Less area/volume and therefore compactness
• Less power consumption
• Less testing requirements at system level
• Higher reliability, mainly due to improved on-chip interconnects
• Higher speed, due to significantly reduced interconnection length
• Significant cost savings
Figure-3.2: Evolution of integration density and minimum feature size, as seen in the early 1980s.
Therefore, the current trend of integration will also continue in the foreseeable future. Advances in device manufacturing technology, and especially the steady reduction of minimum feature size (minimum length of a transistor or an interconnect realizable on chip), support this trend. Figure 3.2 shows the history and forecast of chip complexity, and minimum feature size, over time, as seen in the early 1980s. At that time, a minimum feature size of 0.3 microns was expected around the year 2000. The actual development of the technology, however, has far exceeded these expectations. A minimum size of 0.25 microns was readily achievable by the year 1995. As a direct result of this, the integration density has also exceeded previous expectations: the first 64 Mbit DRAM, and the INTEL Pentium microprocessor chip containing more than 3 million transistors, were already available by 1994, pushing the envelope of integration density.
When comparing the integration density of integrated circuits, a clear distinction must be made between memory chips and logic chips. Figure 3.3 shows the level of integration over time for memory and logic chips, starting in 1970. It can be observed that, in terms of transistor count, logic chips contain significantly fewer transistors in any given year, mainly due to the large consumption of chip area for complex interconnects. Memory circuits are highly regular, and thus more cells can be integrated with much less area for interconnects.
Figure-3.3: Level of integration over time, for memory chips and logic chips.
Generally speaking, logic chips such as microprocessor chips and digital signal processing (DSP) chips contain not only large arrays of memory (SRAM) cells, but also many different functional units. As a result, their design complexity is considered much higher than that of memory chips, although advanced memory chips contain some sophisticated logic functions. The design complexity of logic chips increases almost exponentially with the number of transistors to be integrated. This is translated into an increase in the design cycle time, which is the time period from the start of the chip development until the mask-tape delivery time. However, in order to make the best use of the current technology, the chip development time has to be short enough to allow the maturing of chip manufacturing and timely delivery to customers. As a result, the level of actual logic integration tends to fall short of the integration level achievable with the current processing technology. Sophisticated computer-aided design (CAD) tools and methodologies are developed and applied in order to manage the rapidly increasing design complexity.
4.2 VLSI Design Flow
The design process, at various levels, is usually evolutionary in nature. It starts with a given set of requirements. An initial design is developed and tested against the requirements. When requirements are not met, the design has to be improved. If such improvement is either not possible or too costly, then the revision of requirements and its impact analysis must be considered. The Y-chart (first introduced by D. Gajski) shown in Fig. 3.4 illustrates a design flow for most logic chips, using design activities on three different axes (domains) which resemble the letter Y.
Figure-3.4: Typical VLSI design flow in three domains (Y-chart representation).
The Y-chart consists of three major domains, namely:
• behavioral domain,
• structural domain,
• geometrical layout domain.
The design flow starts from the algorithm that describes the behavior of the target chip. The corresponding architecture of the processor is first defined. It is mapped onto the chip surface by floorplanning. The next design evolution in the behavioral domain defines finite state machines (FSMs), which are structurally implemented with functional modules such as registers and arithmetic logic units (ALUs). These modules are then geometrically placed onto the chip surface using CAD tools for automatic module placement, followed by routing, with a goal of minimizing the interconnect area and signal delays. The third evolution starts with a behavioral module description. Individual modules are then implemented with leaf cells. At this stage the chip is described in terms of logic gates (leaf cells), which can be placed and interconnected by using a cell placement and routing program. The last evolution involves a detailed Boolean description of leaf cells, followed by a transistor-level implementation of leaf cells and mask generation. In standard-cell based design, leaf cells are already pre-designed and stored in a library for logic design use.
Figure-3.5: A more simplified view of VLSI design flow.
Figure 3.5 provides a more simplified view of the VLSI design flow [...] to fit the architecture into the allowable chip area, some functions may have to be removed and the design process must be repeated. Such changes may require significant modification of the original requirements. Thus, it is very important to feed forward low-level information to higher levels (bottom up) as early as possible.
In the following, we will examine design methodologies and structured approaches which have been developed over the years to deal with both complex hardware and software projects. Regardless of the actual size of the project, the basic principles of structured design will improve the prospects of success. Some of the classical techniques for reducing the complexity of IC design are: hierarchy, regularity, modularity and locality.
4.3 Design Hierarchy
The use of hierarchy, or the "divide and conquer" technique, involves dividing a module into sub-modules and then repeating this operation on the sub-modules until the complexity of the smaller parts becomes manageable. This approach is very similar to the software case, where large programs are split into smaller and smaller sections until simple subroutines with well-defined functions and interfaces can be written. In Section 4.2, we have seen that the design of a VLSI [...] layout) domain, resulting in a simple floorplan. This physical view describes the external geometry of the adder, the locations of input and output pins, and how pin locations allow some signals (in this case the carry signals) to be transferred from one sub-block to the other without external routing. At lower levels of the physical hierarchy, the internal mask [...]
Figure-3.6: Structural decomposition of a four-bit adder circuit, showing the hierarchy down to gate level.
Figure8*$47 Regular design of a 8-, ME a D44 and an adder using in&erters and
tri-state !uffers. *$+ VLS# Design St0les
Several design styles can be considered for chip implementation of specified algorithms or logic functions. Each design style has its own merits and shortcomings, and thus a proper choice has to be made by designers in order to provide the functionality at low cost.

3.4.1 Field Programmable Gate Array (FPGA)

Fully fabricated FPGA chips containing thousands of logic gates or even more, with programmable interconnects, are available to users for their custom hardware programming to realize desired functionality. This design style provides a means for fast prototyping and also for cost-effective chip design, especially for low-volume applications. A typical field programmable gate array (FPGA) chip consists of I/O buffers, an array of configurable logic blocks (CLBs), and programmable interconnect structures. The programming of the interconnects is implemented by programming of RAM cells whose output terminals are connected to the gates of MOS pass transistors. A general architecture of an FPGA from XILINX is shown in Fig. 3.8. A more detailed view showing the locations of switch matrices used for interconnect routing is given in Fig. 3.9. A simple CLB (model XC2000 from XILINX) is shown in Fig. 3.10. It consists of four signal input terminals (A, B, C, D), a clock signal terminal, user-programmable multiplexers, an SR-latch, and a look-up table (LUT). The LUT is a digital memory that stores the truth table of the Boolean function. Thus, it can generate any function of up to four variables or any two functions of three variables. The control terminals of multiplexers are not shown explicitly in Fig. 3.10.

The CLB is configured such that many different logic functions can be realized by programming its array. More sophisticated CLBs have also been introduced to map complex functions. The typical design flow of an FPGA chip starts with the behavioral description of its functionality, using a hardware description language such as VHDL. The synthesized architecture is then technology-mapped (or partitioned) into circuits or logic cells. At this stage, the chip design is completely described in terms of available logic cells. Next, the placement and routing step assigns individual logic cells to FPGA sites (CLBs) and determines the routing patterns among the cells in accordance with the netlist. After routing is completed, the on-chip performance of the design can be simulated and verified before downloading the design for programming of the FPGA chip. The programming of the chip remains valid as long as the chip is powered on, or until new programming is done. In most cases, full utilization of the FPGA chip area is not possible - many cell sites may remain unused.

Figure-3.8: General architecture of Xilinx FPGAs.

Figure-3.9: Detailed view of switch matrices and interconnection routing between CLBs.

Figure-3.10: XC2000 CLB of the Xilinx FPGA.

The largest advantage of FPGA-based design is the very short turn-around time, i.e., the time required from the start of the design process until a functional chip is available. Since no physical manufacturing step is necessary for customizing the FPGA chip, a functional sample can be obtained almost as soon as the design is mapped into a specific technology. The typical price of FPGA chips is usually higher than other realization alternatives (such as gate array or standard cells) of the same design, but for small-volume production of ASIC chips and for fast prototyping, FPGA offers a very valuable option.

3.4.2 Gate Array Design
In view of the fast prototyping capability, the gate array (GA) comes after the FPGA. While the design implementation of the FPGA chip is done with user programming, that of the gate array is done with metal mask design and processing. Gate array implementation requires a two-step manufacturing process: the first phase, which is based on generic (standard) masks, results in an array of uncommitted transistors on each GA chip. These uncommitted chips can be stored for later customization, which is completed by defining the metal interconnects between the transistors of the array (Fig. 3.11). Since the patterning of metallic interconnects is done at the end of the chip fabrication, the turn-around time can still be short, a few days to a few weeks. Figure 3.12 shows a corner of a gate array chip which contains bonding pads on its left and bottom edges, diodes for I/O protection, nMOS transistors and pMOS transistors for chip output driver circuits in the neighboring areas of bonding pads, arrays of nMOS transistors and pMOS transistors, underpass wire segments, and power and ground buses along with contact windows.

Figure-3.11: Basic processing steps required for gate array implementation.

Figure-3.12: A corner of a typical gate array chip.

Figure 3.13 shows a magnified portion of the internal array with metal mask design (metal lines highlighted in dark) to realize a complex logic function. Typical gate array platforms allow dedicated areas, called channels, for intercell routing between rows or columns of MOS transistors, as shown in Figs. 3.12 and 3.13. The availability of these routing channels simplifies the interconnections, even using one metal layer only. The interconnection patterns to realize basic logic gates can be stored in a library, which can then be used to customize rows of uncommitted transistors according to the netlist. While most gate array platforms only contain rows of uncommitted transistors separated by routing channels, some other platforms also offer dedicated memory (RAM) arrays to allow a higher density where memory functions are required. Figure 3.14 shows the layout views of a conventional gate array and a gate array platform with two dedicated memory banks.

With the use of multiple interconnect layers, the routing can be achieved over the active cell areas; thus, the routing channels can be removed as in Sea-of-Gates (SOG) chips. Here, the entire chip surface is covered with uncommitted nMOS and pMOS transistors. As in the gate array case, neighboring transistors can be customized using a metal mask to form basic logic gates. For intercell routing, however, some of the uncommitted transistors must be sacrificed. This approach results in more flexibility for interconnections, and usually in a higher density. The basic platform of a SOG chip is shown in Fig. 3.15. Figure 3.16 offers a brief comparison between the channeled (GA) and the channelless (SOG) approaches.

Figure-3.13: Metal mask design to realize a complex logic function on a channeled GA platform.

Figure-3.14: Layout views of a conventional GA chip and a gate array with two memory banks.

Figure-3.15: The platform of a Sea-of-Gates (SOG) chip.

In general, the GA chip utilization factor, as measured by the used chip area divided by the total chip area, is higher than that of the FPGA and so is the chip speed, since more customized design can be achieved with metal mask designs. The current gate array chips can implement as many as hundreds of thousands of logic gates.

Figure-3.16: Comparison between the channeled (GA) vs. the channelless (SOG) approaches.

3.4.3 Standard-Cells Based Design
The standard-cells based design is one of the most prevalent full custom design styles which requires development of a full custom mask set. The standard cell is also called the polycell. In this design style, all of the commonly used logic cells are developed, characterized, and stored in a standard cell library. A typical library may contain a few hundred cells including inverters, NAND gates, NOR gates, complex AOI and OAI gates, D-latches, and flip-flops. Each gate type can have multiple implementations to provide adequate driving capability for different fanouts. For instance, the inverter gate can have standard size transistors, double size transistors, and quadruple size transistors so that the chip designer can choose the proper size to achieve high circuit speed and layout density. The characterization of each cell is done for several different categories. It consists of

• delay time vs. load capacitance
• circuit simulation model
• timing simulation model
• fault simulation model
• cell data for place-and-route
• mask data
To enable automated placement of the cells and routing of inter-cell connections, each cell layout is designed with a fixed height, so that a number of cells can be abutted side-by-side to form rows. The power and ground rails typically run parallel to the upper and lower boundaries of the cell; thus, neighboring cells share a common power and ground bus. The input and output pins are located on the upper and lower boundaries of the cell. Figure 3.17 shows the layout of a typical standard cell. Notice that the nMOS transistors are located closer to the ground rail while the pMOS transistors are placed closer to the power rail.

Figure-3.17: A standard cell layout example.

Figure 3.18 shows a floorplan for standard-cell based design. Inside the I/O frame, which is reserved for I/O cells, the chip area contains rows or columns of standard cells. Between cell rows are channels for dedicated inter-cell routing. As in the case of Sea-of-Gates, with over-the-cell routing the channel areas can be reduced or even removed, provided that the cell rows offer sufficient routing space. The physical design and layout of logic cells ensure that when cells are placed into rows, their heights are matched and neighboring cells can be abutted side-by-side, which provides natural connections for power and ground lines in each row. The signal delay, noise margins, and power consumption of each cell should also be optimized with proper sizing of transistors using circuit simulation.

Figure-3.18: A simplified floorplan of standard-cells-based design.

If a number of cells must share the same input and/or output signals, a common signal bus structure can also be incorporated into the standard-cell-based chip layout. Figure 3.19 shows the simplified symbolic view of a case where a signal bus has been inserted between the rows of standard cells. Note that in this case the chip consists of two blocks, and power/ground routing must be provided from both sides of the layout area. Standard-cell based designs may consist of several such macro-blocks, each corresponding to a specific unit of the system architecture such as ALU, control logic, etc.

Figure-3.19: Simplified floorplan consisting of two separate blocks and a common signal bus.

After chip logic design is done using standard cells in the library, the most challenging task is to place individual cells into rows and interconnect them in a way that meets stringent design goals in circuit speed, chip area, and power consumption. Many advanced CAD tools for place-and-route have been developed and used to achieve such goals. Also, from the chip layout, circuit models which include interconnect parasitics can be extracted and used for timing simulation and analysis to identify timing critical paths. For timing critical paths, proper gate sizing is often practiced to meet the timing requirements. The availability of dedicated memory blocks also reduces the area, since the realization of memory elements using standard cells would occupy a larger area.

Figure-3.20: Mask layout of a standard-cell-based chip with a single block of cells and three memory banks.

3.4.4 Full Custom Design
Although the standard-cells based design is often called full custom design, in a strict sense it is somewhat less than fully custom, since the cells are pre-designed for general use and the same cells are utilized in many different chip designs. In a fuller custom design, the entire mask design is done anew without use of any library. However, the development cost of such a design style is becoming prohibitively high. Thus, the concept of design reuse is becoming popular in order to reduce design cycle time and development cost. The most rigorous full custom design can be the design of a memory cell, be it static or dynamic. Since the same layout design is replicated, there would not be any alternative to high density memory chip design. For logic chip design, a good compromise can be achieved by using a combination of different design styles on the same chip, such as standard cells, data-path cells and PLAs. In real full-custom layout, in which the geometry, orientation and placement of every transistor is done individually by the designer, design productivity is usually very low - typically 10 to 20 transistors per day, per designer.

Figure-3.21: Overview of VLSI design styles.

4.1 FPGA DESIGN FLOW
An FPGA contains a two-dimensional array of logic blocks and interconnections between the logic blocks. Both the logic blocks and the interconnects are programmable. Logic blocks are programmed to implement a desired function, and the interconnects are programmed using the switch boxes to connect the logic blocks. To be more clear, if we want to implement a complex design (a CPU, for instance), then the design is divided into small sub-functions and each sub-function is implemented using one logic block. Now, to get our desired design (CPU), all the sub-functions implemented in logic blocks must be connected, and this is done by programming the interconnects. The internal structure of an FPGA is depicted in the following figure.

FPGAs, as an alternative to custom ICs, can be used to implement an entire System On one Chip (SOC). The main advantage of an FPGA is the ability to reprogram it. The user can reprogram an FPGA to implement a design, and this is done after the FPGA is manufactured. This gives rise to the name "Field Programmable". Custom ICs are expensive and take a long time to design, so they are useful when produced in bulk amounts. FPGAs, by contrast, are easy to implement within a short time with the help of Computer Aided Design (CAD) tools (because there is no physical layout process, no mask making, and no IC manufacturing). Some disadvantages of FPGAs are that they are slower than custom ICs, cannot handle very complex designs, and draw more power.

A Xilinx logic block consists of one Look Up Table (LUT) and one flip-flop. An LUT is used to implement a number of different functions. The input lines to the logic block go into the LUT and enable it. The output of the LUT gives the result of the logic function that it implements, and the output of the logic block is the registered or unregistered output from the LUT. SRAM is used to implement a LUT. A k-input logic function is implemented using a 2^k x 1 SRAM. The number of different possible functions for a k-input LUT is 2^(2^k). The advantage of such an architecture is that it supports the implementation of very many logic functions; the disadvantage is the unusually large number of memory cells required to implement such a logic block when the number of inputs is large. The figure below shows a 4-input LUT based implementation of a logic block.

LUT based design provides for better logic block utilization. A k-input LUT based logic block can be implemented in a number of different ways, with a trade-off between performance and logic density.
An n-LUT can be shown as a direct implementation of a function truth-table. Each of the latches holds the value of the function corresponding to one input combination. For example, a 2-LUT can be used to implement 16 types of functions like AND, OR, A + not B, etc.

    A   B   AND   OR   NAND   ......
    0   0    0     0     1
    0   1    0     1     1
    1   0    0     1     1
    1   1    1     1     0
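
The truth-table view of a LUT can be sketched in Verilog as a small memory addressed by the inputs. The block below is only an illustration and is not taken from any vendor library: the module names lut2 and lut2_tb, the parameter name INIT, and the choice of {a, b} as the address (A as the most significant bit) are all assumptions made for this example. With that bit order, the INIT values 4'b1000, 4'b1110 and 4'b0111 correspond to the AND, OR and NAND columns of the table above.

```verilog
// Illustrative 2-input LUT: a 4 x 1 memory whose address is {a, b}.
// INIT (a hypothetical parameter name) holds the truth table, where
// bit i is the output for input combination i.
module lut2 #(parameter [3:0] INIT = 4'b1000) (
    input  wire a,
    input  wire b,
    output wire out
);
    wire [3:0] truth_table = INIT;      // the stored function values
    assign out = truth_table[{a, b}];   // inputs select one stored bit
endmodule

// Small testbench that walks through all four input combinations.
module lut2_tb;
    reg  a, b;
    wire y_and, y_or, y_nand;
    integer i;

    lut2 #(.INIT(4'b1000)) u_and  (.a(a), .b(b), .out(y_and));
    lut2 #(.INIT(4'b1110)) u_or   (.a(a), .b(b), .out(y_or));
    lut2 #(.INIT(4'b0111)) u_nand (.a(a), .b(b), .out(y_nand));

    initial begin
        for (i = 0; i < 4; i = i + 1) begin
            {a, b} = i[1:0];            // drive A and B from the loop index
            #1 $display("A=%b B=%b AND=%b OR=%b NAND=%b",
                        a, b, y_and, y_or, y_nand);
        end
        $finish;
    end
endmodule
```

Because INIT is 4 bits wide, there are 2^4 = 16 possible settings, which matches the 16 functions of a 2-LUT mentioned above.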
Interconnects

A wire segment can be described as two end points of an interconnect with no programmable switch between them. A sequence of one or more wire segments in an FPGA can be termed a track.

Typically an FPGA has logic blocks, interconnects and switch blocks (Input/Output blocks). Switch blocks lie in the periphery of logic blocks and interconnect. Wire segments are connected to logic blocks through switch blocks. Depending on the required design, one logic block is connected to another and so on.

FPGA DESIGN FLOW

In this part of the tutorial we give a short introduction to the FPGA design flow. A simplified version of the design flow is given in the following diagram.

4.1.1 Design Entry

There are different techniques for design entry: schematic based, Hardware Description Language (HDL) based, a combination of both, and so on. The selection of a method depends on the design and the designer. If the designer wants to deal more with hardware, then schematic entry is the better choice. When the design is complex, or the designer thinks of the design in an algorithmic way, then HDL is the better choice. Language based entry is faster but lags in performance and density. HDLs represent a level of abstraction that can isolate the designers from the details of the hardware implementation. Schematic based entry gives designers much more visibility into the hardware. It is the better choice for those who are hardware oriented. Another method, though rarely used, is state machines. It is the better choice for designers who think of the design as a series of states, but the tools for state machine entry are limited. In this documentation we deal with HDL based design entry.
4.1.2 Synthesis

Synthesis is the process which translates VHDL or Verilog code into a netlist for each design element. The synthesis process checks the code syntax and analyzes the hierarchy of the design, which ensures that the design is optimized for the design architecture the designer has selected. The resulting netlist(s) is saved to an NGC (Native Generic Circuit) file (for Xilinx Synthesis Technology (XST)).

4.1.3 Implementation

This process consists of a sequence of three steps:

1. Translate
2. Map
3. Place and Route

4.1.3.1 Translate

This process combines all the input netlists and constraints into a logic design file. This information is saved as an NGD (Native Generic Database) file. This can be done using the NGDBuild program. Here, defining constraints means assigning the ports in the design to the physical elements (e.g. pins, switches, buttons) of the targeted device and specifying the timing requirements of the design. This information is stored in a file named UCF (User Constraints File). Tools used to create or modify the UCF include PACE and the Constraint Editor.

4.1.3.2 Map

This process divides the whole circuit with logical elements into sub-blocks such that they can be fit into the FPGA logic blocks. That is, the map process fits the logic defined by the NGD file into the targeted FPGA elements (Configurable Logic Blocks (CLB), Input Output Blocks (IOB)) and generates an NCD (Native Circuit Description) file which physically represents the design mapped to the components of the FPGA. The MAP program is used for this purpose.

4.1.3.3 Place and Route

The PAR program is used for this process. The place and route process places the sub-blocks from the map process into logic blocks according to the constraints and connects the logic blocks. For example, if a sub-block is placed in a logic block which is very near to an IO pin, it may save time but it may affect some other constraint. So the trade-off between all the constraints is taken into account by the place and route process. The PAR tool takes the mapped NCD file as input and produces a completely routed NCD file as output. The output NCD file contains the routing information.
4.1.4 Device Programming

Now the design must be loaded on the FPGA. But the design must first be converted to a format that the FPGA can accept. The BITGEN program deals with the conversion. The routed NCD file is given to the BITGEN program to generate a bit stream (a .BIT file) which can be used to configure the target FPGA device. This can be done using a cable. The selection of the cable depends on the design.

4.1.5 Design Verification

4.1.6 Behavioral Simulation

Behavioral simulation is the first of the simulation steps that are encountered throughout the hierarchy of the design flow. This simulation is performed before the synthesis process to verify the RTL (behavioral) code and to confirm that the design is functioning as intended. Behavioral simulation can be performed on either VHDL or Verilog designs.

4.1.7 Functional simulation (Post Translate Simulation)

Functional simulation gives information about the logic operation of the circuit. The designer can verify the functionality of the design using this process after the Translate process. If the functionality is not as expected, then the designer has to make changes in the code and follow the design flow steps again.

4.1.8 Static Timing Analysis

This can be done after the MAP or PAR processes. The post-MAP timing report lists signal path delays of the design derived from the design logic. The post Place and Route timing report incorporates timing delay information to provide a comprehensive timing summary of the design.
CHAPTER 5

Introduction to Verilog

In the semiconductor and electronic design industry, Verilog is a hardware description language (HDL) used to model electronic systems. Verilog HDL, not to be confused with VHDL (a competing language), is most commonly used in the design, verification, and implementation of digital logic chips at the register-transfer level of abstraction. It is also used in the verification of analog and mixed-signal circuits.

1.5 Overview
Hardware description languages such as Verilog differ from software programming languages because they include ways of describing the propagation of time and signal dependencies (sensitivity). There are two assignment operators, a blocking assignment (=) and a non-blocking (<=) assignment. The non-blocking assignment allows designers to describe a state-machine update without needing to declare and use temporary storage variables (in any general programming language we need to define some temporary storage spaces for the operands to be operated on subsequently; those are temporary storage variables). Since these concepts are part of Verilog's language semantics, designers can write descriptions of large circuits in a relatively compact form.

Verilog's syntax is similar to that of the C programming language. It is case-sensitive and has a basic preprocessor (though less sophisticated than that of ANSI C/C++), equivalent control flow keywords (if/else, for, while, case, etc.), and compatible operator precedence. Syntactic differences include variable declaration (Verilog requires bit-widths on net/reg types), demarcation of procedural blocks (begin/end instead of curly braces {}), and many other minor differences.

A Verilog design consists of a hierarchy of modules. Modules encapsulate design hierarchy and communicate with other modules through a set of declared input, output, and bidirectional ports. Internally, a module can contain any combination of the following: net/variable declarations (wire, reg, integer, etc.), concurrent and sequential statement blocks, and instances of other modules (sub-hierarchies). Sequential statements are placed inside a begin/end block and executed in sequential order within the block. But the blocks themselves are executed concurrently, qualifying Verilog as a dataflow language. When a wire has multiple drivers, the wire's (readable) value is resolved by a function of the source drivers and their strengths. A subset of statements in the Verilog language is synthesizable.

1.6 History

1.6.1 Beginning

1.6.2 Verilog-95

With the increasing success of VHDL at the time, Cadence decided to make the language available for open standardization. Cadence transferred Verilog into the public domain under the Open Verilog International (OVI) organization. Verilog was later submitted to IEEE and became IEEE Standard 1364-1995, commonly referred to as Verilog-95.

1.6.3 Verilog 2001

Extensions to Verilog-95 were submitted back to IEEE to cover the deficiencies that users had found in the original Verilog standard. These extensions became IEEE Standard 1364-2001, known as Verilog-2001.

1.6.4 Verilog 2005

Not to be confused with SystemVerilog, Verilog 2005 (IEEE Standard 1364-2005) consists of minor corrections, specification clarifications, and a few new language features.
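SystemVerilog

SystemVerilog is a superset of Verilog-2005, with many new features and capabilities to aid design verification and design modeling. The most valuable benefit of SystemVerilog is that it lets the user build verification environments in a consistent syntax that can be reused across projects. Typical verification features include:

• Constrained-random stimulus generation
• Functional coverage
• Higher-level structures, especially Object Oriented Programming
• Multi-threading and interprocess communication
• Support for HDL types such as Verilog's 4-state values
• Tight integration with the event simulator for control of the design

There are many other useful features, but these allow you to create test benches at a higher level of abstraction than you are able to achieve with an HDL or a programming language such as C. SystemVerilog also provides a framework for coverage-driven verification, whose purpose is to:

• Eliminate the effort and time spent creating hundreds of tests.
• Ensure thorough verification using up-front goal setting.
• Receive early error notifications and deploy run-time checking and error analysis to simplify debugging.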
6.3 Examples

Ex1: A hello world program looks like this:

    module main;
      initial
        begin
          $display("Hello world!");
          $finish;
        end
    endmodule

Ex2: A simple example of two flip-flops follows:

    module toplevel(clock, reset);
      input clock;
      input reset;

      reg flop1;
      reg flop2;

      always @ (posedge reset or posedge clock)
        if (reset)
          begin
            flop1 <= 0;
            flop2 <= 1;
          end
        else
          begin
            flop1 <= flop2;
            flop2 <= flop1;
          end
    endmodule
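
The behaviour of the two flip-flops in Ex2 can be checked with a small simulation. The testbench below is a sketch written for this report and is not part of the original example: the module name swap_tb, the register names nb1/nb2 and bl1/bl2, and the clock period are assumptions. It runs the same update twice, once with non-blocking (<=) and once with blocking (=) assignments, so the two results can be compared side by side.

```verilog
// Sketch: compare non-blocking and blocking versions of the Ex2 update.
module swap_tb;
    reg clock = 0;
    reg reset = 0;
    reg nb1, nb2;   // pair updated with non-blocking assignments
    reg bl1, bl2;   // pair updated with blocking assignments

    always #5 clock = ~clock;          // free-running clock, period 10

    // Same structure as Ex2: right-hand sides are sampled first,
    // then both registers are updated, so the values swap.
    always @(posedge reset or posedge clock)
        if (reset) begin
            nb1 <= 0;
            nb2 <= 1;
        end else begin
            nb1 <= nb2;
            nb2 <= nb1;
        end

    // Blocking version: bl1 is overwritten before bl2 reads it,
    // so both registers end up holding the old value of bl2.
    always @(posedge reset or posedge clock)
        if (reset) begin
            bl1 = 0;
            bl2 = 1;
        end else begin
            bl1 = bl2;
            bl2 = bl1;
        end

    initial begin
        #1 reset = 1;                  // asynchronous reset pulse
        #1 reset = 0;
        repeat (3) begin
            @(posedge clock);
            #1 $display("t=%0t nonblocking: %b %b  blocking: %b %b",
                        $time, nb1, nb2, bl1, bl2);
        end
        $finish;
    end
endmodule
```

In this sketch the non-blocking pair alternates between the two reset values on every clock edge, while the blocking pair holds equal values after the first edge and never swaps.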
The "<=" operator in Verilog is the non-blocking assignment: all right-hand sides are evaluated first and the targets are updated afterwards, so the order of the assignments is irrelevant and flop1 and flop2 swap values on every clock. When the blocking "=" assignment is used, for the purposes of logic, the target variable is updated immediately. In the above example, had the statements used the "=" blocking operator instead of "<=", flop1 and flop2 would not have been swapped. Instead, as in traditional programming, the compiler would understand to simply set flop1 equal to flop2 (and subsequently ignore the redundant logic to set flop2 equal to flop1).

Ex3: An example counter circuit follows:

    module Div20x (rst, clk, cet, cep, count, tc);
    // TITLE 'Divide-by-20 Counter with enables'
    // enable CEP is a clock enable only
    // enable CET is a clock enable and
    // enables the TC output
    // a counter using the Verilog language

    parameter size = 5;
    parameter length = 20;

    input rst;                 // These inputs/outputs represent
    input clk;                 // connections to the module.
    input cet;
    input cep;

    output [size-1:0] count;
    output tc;

    reg [size-1:0] count;      // Signals assigned
                               // within an always
                               // (or initial) block
                               // must be of type reg

    wire tc;                   // Other signals are of type wire

    // The always statement below is a parallel
    // execution statement that
    // executes any time the signals
    // rst or clk transition from low to high

    always @ (posedge clk or posedge rst)
      if (rst)                 // This causes reset of the cntr
        count <= {size{1'b0}};
      else
      if (cet && cep)          // Enables both true
        begin
          if (count == length-1)
            count <= {size{1'b0}};
          else
            count <= count + 1'b1;
        end

    // the value of tc is continuously assigned
    // the value of the expression
    assign tc = (cet && (count == length-1));

    endmodule

Ex4: An example of delays:

    ...
    reg a, b, c, d;
    wire e;
    ...
    always @(b or e)
      begin
        a = b & e;
        b = a | b;
        #5 c = b;
        d = #6 c ^ e;
      end

The always clause above illustrates the other type of method of use, i.e. the always clause executes any time any of the entities in the list change, i.e. when b or e change. When one of these changes, a is immediately assigned a new value, and due to the blocking assignment b is assigned a new value afterward (taking into account the new value of a). After a delay of 5 time units, c is assigned the value of b and the value of c ^ e is tucked away in an invisible store. Then after 6 more time units, d is assigned the value that was tucked away.

Signals that are driven from within a process (an initial or always block) must be of type reg. Signals that are driven from outside a process must be of type wire. The keyword reg does not necessarily imply a hardware register.

6.3 Constants
The definition of constants in Verilog supports the addition of a width parameter. The basic syntax is:

    <Width in bits>'<base letter><number>

Examples:

    12'h123 - Hexadecimal 123 (using 12 bits)
    20'd44  - Decimal 44 (using 20 bits - 0 extension is automatic)
    4'b1010 - Binary 1010 (using 4 bits)
    6'o77   - Octal 77 (using 6 bits)
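
The sized constants listed above can be tried out in a simulator. The following testbench is a minimal sketch: the module name const_tb and the register names are assumptions chosen for this illustration. Each register is declared with the same width as the constant assigned to it, and the values are printed in binary so that the automatic zero extension of the 20-bit decimal constant is visible.

```verilog
// Sketch: assign the four example constants and print them in binary.
module const_tb;
    reg [11:0] hex_val;   // 12 bits for 12'h123
    reg [19:0] dec_val;   // 20 bits for 20'd44
    reg [3:0]  bin_val;   // 4 bits for 4'b1010
    reg [5:0]  oct_val;   // 6 bits for 6'o77

    initial begin
        hex_val = 12'h123;
        dec_val = 20'd44;
        bin_val = 4'b1010;
        oct_val = 6'o77;

        // %b prints every bit of the register, including leading zeros
        $display("12'h123 = %b", hex_val);
        $display("20'd44  = %b", dec_val);
        $display("4'b1010 = %b", bin_val);
        $display("6'o77   = %b", oct_val);
        $finish;
    end
endmodule
```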
6.4 Synthesizable Constructs

There are several statements in Verilog that have no analog in real hardware, e.g. $display. The examples presented here are from the classic subset of the language that has a direct mapping to real gates.

    // Mux examples - three ways to do the same thing.

    // The first example uses continuous assignment
    wire out;
    assign out = sel ? a : b;

    // the second example uses a procedure
    // to accomplish the same thing.
    reg out;
    always @(a or b or sel)
      begin
        case(sel)
          1'b0: out = b;
          1'b1: out = a;
        endcase
      end

    // Finally - you can use if/else in a
    // procedural structure.
    reg out;
    always @(a or b or sel)
      if (sel)
        out = a;
      else
        out = b;

The next interesting structure is a transparent latch; it will pass the input to the output when the gate signal is set for "pass-through", and captures the input and stores it upon transition of the gate signal to "hold". The output will remain stable regardless of the input signal while the gate is set to "hold". In the example below the "pass-through" level of the gate is when the value of the if clause is true, i.e. gate = 1. This is read "if gate is true, din is fed to out continuously." Once the if clause is false, the last value at out will remain and is independent of the value of din.

Ex6:

    // Transparent latch example
    reg out;
    always @(gate or din)
      if (gate)
        out = din; // Pass through state
        // Note that the else isn't required here. The variable
        // out will follow the value of din while gate is high.
        // When gate goes low, out will remain constant.

The flip-flop is the next significant template; in Verilog, the D-flop is the simplest, and it can be modeled as:

    reg q;
    always @(posedge clk)
      q <= d;

The significant thing to notice in the example is the use of the non-blocking assignment. A basic rule of thumb is to use <= when there is a posedge or negedge statement within the always clause.

A variant of the D-flop is one with an asynchronous reset; there is a convention that the reset state will be the first if clause within the statement.

    reg q;
    always @(posedge clk or posedge reset)
      if (reset)
        q <= 0;
      else
        q <= d;

The next variant includes both an asynchronous reset and an asynchronous set condition; again the convention comes into play, i.e. the reset term is followed by the set term.

    reg q;
    always @(posedge clk or posedge reset or posedge set)
      if (reset)
        q <= 0;
      else if (set)
        q <= 1;
      else
        q <= d;

Note: If this model is used to model a Set/Reset flip flop then simulation errors can result. Consider the following test sequence of events: 1) reset goes high, 2) clk goes high, 3) set goes high, 4) clk goes high again, 5) reset goes low, followed by 6) set going low. Assume no setup and hold violations.

In this example the always @ statement would first execute when the rising edge of reset occurs, which would place q at a value of 0. The next time the always block executes would be the rising edge of clk, which again would keep q at a value of 0. The always block then executes when set goes high, which, because reset is high, forces q to remain at 0. This condition may or may not be correct depending on the actual flip flop. However, this is not the main problem with this model. Notice that when reset goes low, set is still high. In a real flip flop this will cause the output to go to 1. However, in this model it will not occur because the always block is triggered by rising edges of set and reset - not levels. A different approach may be necessary for set/reset flip flops.

Note that there are no "initial" blocks mentioned in this description. There is a split between FPGA and ASIC synthesis tools on this structure. FPGA tools allow initial blocks where reg values are established instead of using a "reset" signal. ASIC synthesis tools don't support such a statement. The reason is that an FPGA's initial state is something that is downloaded into the memory tables of the FPGA. An ASIC is an actual hardware implementation.
6.5 Initial Vs Always: There are two separate ways of declaring a procedural block: with the initial keyword or with the always keyword. In fact, it is better to think of the initial block as a special case of the always block, one which terminates after it completes for the first time.

//Examples:
initial
  begin
    a = 1; // Assign a value to reg a at time 0
    #1; // Wait 1 time unit
    b = a; // Assign the value of reg a to reg b
  end

always @(a or b) // Any time a or b CHANGE, run the process
begin
  if (a)
    c = b;
  else
    d = ~b;
end // Done with this block, now return to the top (i.e. the @ event-control)

always @(posedge a) // Run whenever reg a has a low to high change
  a <= b;

These are the classic uses for these two keywords, but there are two significant additional uses. The most common of these is an always keyword without the @(...) sensitivity list. It is possible to use always as shown below:

always
  begin // Always begins executing at time 0 and NEVER stops
    clk = 0; // Set clk to 0
    #1; // Wait for 1 time unit
    clk = 1; // Set clk to 1
    #1; // Wait 1 time unit
  end // Keeps executing - so continue back at the top of the begin
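A free-running process like the one above never terminates on its own, so a simulation needs a separate $finish to stop. The sketch below wraps the clock in a module and, for comparison, generates the same waveform with initial and forever. The names clk_demo, clk_a and clk_b and the 10-unit run time are assumptions chosen only for illustration.

```verilog
// Illustrative only: clk_demo, clk_a and clk_b are hypothetical names.
module clk_demo;
  reg clk_a, clk_b;

  // Form 1: always without a sensitivity list. The #1 delays are what
  // keep the block from looping in zero simulation time.
  always begin
    clk_a = 0;
    #1;
    clk_a = 1;
    #1;
  end

  // Form 2: initial combined with forever produces the same waveform.
  initial begin
    clk_b = 0;
    forever #1 clk_b = ~clk_b;
  end

  // Stop the otherwise endless simulation after 10 time units.
  initial begin
    $monitor("t=%0t clk_a=%b clk_b=%b", $time, clk_a, clk_b);
    #10 $finish;
  end
endmodule
```

Both clocks toggle every time unit with a period of two units. Which form to use is largely a matter of style.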
The always keyword acts similar to the C construct while(1) {..} in the sense that it will execute forever. The other interesting exception is the use of the initial keyword with the addition of the forever keyword.
6.6 Race Condition
The order of execution isn't always guaranteed within Verilog, as the following example shows:

initial
  a = 0;
initial
  b = a;
initial
  begin
    #1;
    $display(...);
  end

What will be printed out for the values of a and b? Depending on the order of execution of the initial blocks, it could be zero and zero, or alternately zero and some other arbitrary uninitialized value. The $display statement will always execute after both assignment blocks have completed, due to the #1 delay.
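The race described above can be reproduced, and avoided, with a small self-contained module. This is a sketch under stated assumptions: the module name race_demo, the second register pair a2/b2, and the $display format strings are not from the original text.

```verilog
// Illustrative sketch; race_demo, a2 and b2 are hypothetical names.
module race_demo;
  reg a, b;     // pair assigned from two competing initial blocks
  reg a2, b2;   // pair assigned in a fixed order

  // Racy: the simulator may run these two initial blocks in either order.
  initial a = 0;
  initial b = a;

  // Race-free: both assignments are ordered inside a single block.
  initial begin
    a2 = 0;
    b2 = a2;
  end

  initial begin
    #1;  // let every time-0 assignment finish first
    $display("racy:      a=%b b=%b", a, b);    // b is 0 or x, tool dependent
    $display("race-free: a2=%b b2=%b", a2, b2);
  end
endmodule
```

Keeping dependent assignments in one procedural block is one way to remove the dependence on simulator scheduling order.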
6.7 Operators
Note: These operators are not shown in order of precedence.
Bitwise
Logical
Reduction
Arithmetic
Relational
Shift
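As a quick illustration of the six operator categories listed above, the following sketch applies one standard Verilog operator from each category to two 4-bit values. The module name operators_demo and the operand values are assumptions chosen for illustration.

```verilog
// Illustrative sketch; operators_demo and the operand values are hypothetical.
module operators_demo;
  reg [3:0] a, b;

  initial begin
    a = 4'b0110;  // decimal 6
    b = 4'b0011;  // decimal 3
    $display("bitwise    a & b  = %b",  a & b);   // bit-by-bit AND: 0010
    $display("logical    a && b = %b",  a && b);  // both operands nonzero: 1
    $display("reduction  ^a     = %b",  ^a);      // XOR of all bits of a: 0
    $display("arithmetic a + b  = %0d", a + b);   // 6 + 3 = 9
    $display("relational a > b  = %b",  a > b);   // 6 > 3: 1
    $display("shift      a >> 1 = %b",  a >> 1);  // logical right shift: 0011
  end
endmodule
```

The bitwise and logical forms look similar but differ in result width: a & b yields a 4-bit vector, while a && b yields a single bit.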
6.8 System Tasks: System tasks are available to handle simple I/O and various design measurement functions. All system tasks are prefixed with $ to distinguish them from user tasks and functions. This section presents a short list of the most often used tasks. It is by no means a comprehensive list.
$display - Print to screen a line followed by an automatic newline.
$write - Write to screen a line without the newline.
$swrite - Print to variable a line without the newline.
$sscanf - Read from variable a format-specified string. (Verilog-2001)
$fopen - Open a handle to a file (read or write).
$fdisplay - Write to file a line followed by an automatic newline.
$fwrite - Write to file a line without the newline.
$fscanf - Read from file a format-specified string. (Verilog-2001)
$fclose - Close and release an open file handle.
$readmemh - Read hex file content into a memory array.
$readmemb - Read binary file content into a memory array.
$monitor - Print out all the listed variables when any of them changes value.
$time - Value of the current simulation time.
$dumpfile - Declare the VCD (Value Change Dump) format output file name.
$dumpvars - Turn on and dump the variables.
$dumpports - Turn on and dump the variables in Extended-VCD format.
$random - Return a random value.
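Several of the tasks listed above are typically used together in a test bench. The following is a minimal sketch. The module name systasks_demo, the file names systasks_demo.vcd and systasks_demo.log, and the four-entry memory are assumptions made for illustration.

```verilog
// Illustrative sketch; module, file and signal names are hypothetical.
module systasks_demo;
  reg [7:0] mem [0:3];
  reg [7:0] value;
  integer   fd, i;

  initial begin
    $dumpfile("systasks_demo.vcd");   // name the VCD output file
    $dumpvars(0, systasks_demo);      // dump all variables in this module
    $monitor("t=%0t value=%h", $time, value);

    // Fill the memory with random bytes, one per time unit.
    for (i = 0; i < 4; i = i + 1) begin
      value  = $random;               // low 8 bits of the 32-bit result
      mem[i] = value;
      #1;
    end

    // Write the memory contents to a log file.
    fd = $fopen("systasks_demo.log", "w");
    for (i = 0; i < 4; i = i + 1)
      $fdisplay(fd, "mem[%0d] = %h", i, mem[i]);
    $fclose(fd);

    $write("done at t=%0t", $time);   // no newline is added by $write
    $display("");                     // emit the newline explicitly
    $finish;
  end
endmodule
```

$monitor needs to be called only once; it then prints at the end of every time step in which one of its arguments changed.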
CHAPTER 5
XILINX ISE 12.1 Design Suite Tutorial
1. Click on the Xilinx ISE Design Suite 12.1 icon on the desktop.
2. The ISE Project Navigator (M.53d) and ISE Design Suite Info Center windows will be opened.
3. Press OK. Then go to the File menu and select New Project.
1. A new pop-up window named New Project Wizard appears.
2. Enter the project name in the Name field.
3. Select the location where you want to store the project in the Location field.
4. The Working Directory is automatically set to the same location.
5. Select HDL in Top Level Source Type, present at the bottom of the window.
6. Then press NEXT.
7. It then goes to Project Settings.
8. In the project settings you can select the product details of the device onto which the program will be dumped.
9. After selecting the values, click NEXT.
10. It goes to the Project Summary tab. Then click Finish.
11. After that it comes back to the ISE Project Navigator (M.53d) window.
12. Go to the Project tab and open New Source.
13. The New Source Wizard pop-up will be opened.
14. Here we can select Verilog Module.
15. Then enter the file name for the new source.
Verification using test bench
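As a rough illustration of test-bench verification, the sketch below writes a sequence into a small dual-port memory through one port and reads it back through the other. It is not the design presented in this report: the module dp_sram, its port names, the write-port/read-port split, and the 16 x 8 size are all assumptions made for the example.

```verilog
// Hypothetical dual-port SRAM: port A writes, port B reads (assumed interface).
module dp_sram #(parameter AW = 4, DW = 8) (
  input               clk,
  input               we_a,
  input  [AW-1:0]     addr_a,
  input  [DW-1:0]     din_a,
  input  [AW-1:0]     addr_b,
  output reg [DW-1:0] dout_b
);
  reg [DW-1:0] mem [0:(1<<AW)-1];

  always @(posedge clk) begin
    if (we_a)
      mem[addr_a] <= din_a;     // synchronous write on port A
    dout_b <= mem[addr_b];      // synchronous read on port B
  end
endmodule

// Hypothetical test bench: store a sequence, then read it back and compare.
module dp_sram_tb;
  reg        clk = 0;
  reg        we_a;
  reg  [3:0] addr_a, addr_b;
  reg  [7:0] din_a;
  wire [7:0] dout_b;
  integer    i, errors;

  dp_sram dut (.clk(clk), .we_a(we_a), .addr_a(addr_a), .din_a(din_a),
               .addr_b(addr_b), .dout_b(dout_b));

  always #5 clk = ~clk;         // 10-unit clock period

  initial begin
    errors = 0; we_a = 0; addr_a = 0; addr_b = 0; din_a = 0;

    // Write 16 values; inputs change on the falling edge, away from the
    // rising edge on which the memory samples them.
    for (i = 0; i < 16; i = i + 1) begin
      @(negedge clk);
      we_a = 1; addr_a = i; din_a = 8'hA0 + i;
    end
    @(negedge clk) we_a = 0;

    // Read back: set the address, wait one rising edge, then check.
    for (i = 0; i < 16; i = i + 1) begin
      @(negedge clk) addr_b = i;
      @(negedge clk);
      if (dout_b !== 8'hA0 + i) begin
        errors = errors + 1;
        $display("MISMATCH at %0d: got %h", i, dout_b);
      end
    end

    $display("test finished with %0d error(s)", errors);
    $finish;
  end
endmodule
```

The comparison uses !== so that an x or z on dout_b counts as a mismatch instead of silently passing.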
Chapter 6 Results and discussions
VII. CONCLUSION
In this paper we have presented the design and implementation of the SRAM. The design has been simulated in MODELSIM Altera 6.6d and synthesized using XILINX ISE 14.4i, targeted to a Spartan 3E FPGA. The given input sequence has been stored in the SRAM and returned as output.
REFERENCES