1. INTRODUCTION
The exponential gains in digital circuit performance that came from parameter scaling are expected to evaporate soon [4-8]. Alternative technologies, such as multiple-valued logic, threshold logic gates (TLGs), and others, can extend parallel processing capabilities [4-8]. Current mode, the monostable-bistable transition logic element (MOBILE), neuron MOS, and single-electron technology are a few examples of TLG implementations [6, 9, 10, 13]. Some of these methodologies are CMOS-based. As methodologies for implementing circuits with TLGs become available, the synthesis of efficient TLG-based circuits becomes feasible, and TLGs become an attractive alternative for implementing digital circuits.
An open issue is how to optimize the performance of a TLG by selecting appropriate transistor sizes. An alternative to time-consuming exhaustive SPICE simulation is presented and evaluated. It is based on an analytical method capable of providing near-optimum sensor sizes for the circuit implementing the TLG. It is also capable of providing the expected gate delay without time-consuming simulation steps, thus improving the performance of TLG-based synthesis methodologies.
A threshold logic gate (TLG) is an N-input device that calculates the weighted sum of its inputs [3]. A basic TLG consists of N inputs, a weight value for each input, and a threshold weight. The weighted sum of the inputs is compared with the threshold weight. If it is greater than the threshold weight, the digital output of the TLG is logic high, and if it is less, the output is logic zero [3]. In the CMOS-based implementation considered in this paper, when the weighted sum of the inputs is equal to the threshold weight, the gate is in an undefined state. Weights are selected so that this case is avoided.
The equation representing the output of a TLG is given as

$$
f(x_1, \ldots, x_N) =
\begin{cases}
1, & \text{if } \sum_{i=1}^{N} w_i x_i > w_T \\
0, & \text{if } \sum_{i=1}^{N} w_i x_i < w_T
\end{cases}
\tag{1}
$$

where wi is the weight of the ith input, xi is the value applied to the ith input, and wT is the threshold weight for the function f of a TLG. The input weights can be either positive or negative, but the threshold weight is always positive. In this paper, an N-input function with P positive weights is denoted as {w1, . . . , wP : wT, wP+1, . . . , wN}.
Example 1: Consider a function f = x1 + x2 + x3 with weight configuration {w1, w2, w3 : wT}, where w1, w2, and w3 correspond to the weights of the inputs x1, x2, and x3, respectively, and wT is the threshold weight. A possible weight configuration is {w1, w2, w3 : wT} = {4, 4, 4 : 3}, where all the input weights are positive. When applying the input pattern {x1, x2, x3} = {0, 0, 1}, the weighted sum of inputs is 4 · 0 + 4 · 0 + 4 · 1 = 4 > 3, and, according to (1), f = 1. See also Fig 1.1. Function f is denoted as {4, 4, 4 : 3}.
Fig 1.1 output functionality of a TLG for a given weight configuration and input pattern
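The evaluation rule in (1) and Example 1 can be sketched in a few lines of Python. This is only an illustrative behavioral model, not a circuit description; the function name `tlg_output` and the choice to return `None` for the undefined case (weighted sum equal to the threshold weight) are assumptions made for this sketch.

```python
from typing import Optional, Sequence


def tlg_output(weights: Sequence[int], inputs: Sequence[int], threshold: int) -> Optional[int]:
    """Behavioral model of a threshold logic gate (hypothetical helper).

    Returns 1 if the weighted sum exceeds the threshold weight,
    0 if it is below it, and None when they are equal (undefined state).
    """
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    if weighted_sum > threshold:
        return 1
    if weighted_sum < threshold:
        return 0
    return None  # equal to the threshold weight: undefined, avoided by weight selection


# Example 1: {4, 4, 4 : 3} with input pattern {0, 0, 1}; weighted sum is 4 > 3
print(tlg_output([4, 4, 4], [0, 0, 1], 3))
```

With the weights of Example 1, every input pattern except {0, 0, 0} produces a weighted sum of at least 4, so the gate behaves as a three-input OR.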
This system considers implementations of threshold logic functions using current mode. This is a popular CMOS-based approach. All current mode implementation methods considered in this paper consist of two parts: the differential part and the sensor part. The number of transistors in the sensor part is constant and does not depend on the implemented function. The number of transistors in the differential part depends on the sum of input weights and the threshold weight.
There exist two approaches for implementing current mode TLGs: the current mode TLG (CMTLG) [1] and differential current mode logic (DCML) [11]. Section II reviews these two approaches. Section III presents a new implementation, which we call dual clock current mode logic (DCCML), which improves both speed and switching energy [power-delay product (PDP)] over the approaches in [1] and [11]. All three implementations consist of two parts: the differential part and the sensor part. All the PMOS transistors in the sensor part have the same size S, which we call the sensor size. The sensor size affects the performance of all three current mode implementations for any threshold logic function. Obtaining the optimum sensor size through iterative SPICE simulations, one simulation per candidate sensor size, is very time-consuming. Section IV presents the second contribution of this paper, an analytical approach that quickly and accurately determines the appropriate sensor size S for a given function under any existing current mode approach, such as those in [1] and [11] and the implementation proposed in Section III. Section V presents simulation results that demonstrate the accuracy of the optimum sensor identification method of Section IV. It also presents results showing that the current mode approach of Section III consistently outperforms those in [1] and [11] in delay as well as switching energy. Finally, Section VI concludes.
2. LITERATURE SURVEY
Logic optimization and timing estimation are basic tasks for digital circuit designers. The logical effort (LE) method was first presented by Sutherland for easy and fast evaluation and optimization of delay in CMOS logic paths. Because of its elegance, the LE method has become a very popular tool for design and education, and it has been adopted as the basis for several computer-aided design tools. Although LE is mainly used for standard CMOS logic, it has also been shown to be useful for other logic families, such as pass transistor logic. We proposed the novel dual mode logic (DML), which provides the designer with a very high level of flexibility. It allows on-the-fly switching between two modes of operation: 1) static and 2) dynamic. In the static mode, DML gates achieve very low power dissipation, with some degradation in performance, as compared with standard CMOS. On the other hand, dynamic operation of DML gates achieves very high speed at the expense of increased power dissipation. A basic DML gate is composed of any static logic family gate, which can be a conventional CMOS gate, and an additional transistor. DML gates have a very simple and intuitive structure, but they require an unconventional sizing methodology to achieve the desired performance. The conventional LE methodology cannot be used with the DML family because it does not consider its unconventional sizing rules and topology. The objective of this paper is to develop a simple method for minimizing delays and achieving an optimized number of stages in logic paths containing CMOS-based DML gates. A unified LE method is introduced for the delay evaluation and optimization of logic paths constructed with DML gates. DML-LE answers complete design problems, which can be solved numerically, and simplifies these problems to a straightforward and easy computational problem [approximate and semi-approximate (SA) solutions] with a unified analytic model.
With this model, we can estimate the minimum-to-maximum error of the delay approximation and the error in the target optimum number of stages for a given logic function. The efficiency of the developed method is shown by comparing the theoretical results obtained with the proposed method against simulation results from the Cadence Virtuoso optimizer tool using a standard 40-nm technology.
CMTLG and DCML implementations of a threshold logic function
The CMTLG is a CMOS-based implementation of a TLG, shown in Figure 3.1 [1]. The CMTLG can be divided into two parts, the differential part and the sensor part. The differential part can be subdivided into two parts, the threshold part and the input part. In both the threshold part and the input part, all the transistors are connected in parallel. The transistors in the threshold part are always ON, and the total current flowing through the threshold part is the threshold current IT. The number of active (ON) PMOS transistors in the input part depends on the applied input pattern. The total current passing through the input side for a particular input pattern is the active current IA. The nodes connecting the differential part and the sensor part on the input side and the threshold side are M1 and M2, respectively, and nodes O and OB are the output nodes. The sensor part has three PMOS transistors P1, P2, and P3, and four NMOS transistors N1, N2, N3, and N4. If the size of the sensor is S, then all the PMOS transistors in the sensor part have size S μm and all the NMOS transistors in the sensor part have a size smaller than S μm.
The operation of the CMTLG is divided into two phases [1]: the equalization phase and the evaluation phase. These phases are explained with the help of Figs. 3.1 and 3.2. When the clock (clk) applied to the CMTLG is high, the circuit is in the equalization phase. When clk is low, the circuit is in the evaluation phase [1]. In the equalization phase, transistors N1 and N2 are ON, nodes M1 and M2 have the same voltage because of transistor N1, and nodes O and OB have the same voltage because of transistor N2. In the evaluation phase, transistors N1 and N2 are OFF, and if the threshold current is less than the active current, then the voltage at node O rises faster than that at node OB [1]. If during the evaluation phase the threshold current exceeds the active current, then the voltage at node OB rises faster than that at node O [1]. This system derives an analytical formula for the optimum sensor size, which is used to obtain the minimum delay for a given threshold and number of inputs. The CMTLG is then designed using the optimum sensor size. With appropriate sensor sizing, and only then, the fan-in can reach 150, assuming all inputs have minimum weight.
1. In “S. Bobba and I. N. Hajj, “Current-mode threshold logic gates,” in Proc. IEEE ICCD, Sep. 2000, pp. 235-240” In this paper, we present low-power and high-performance logic gates called the current-mode threshold logic (CMTL) gates. Low power dissipation is achieved by limiting the voltage swing on the interconnects and the internal nodes of the CMTL gates. High performance is achieved by the use of transistor configurations that sense a small difference in current and set the differential outputs to the correct values. The realization of NAND, NOR, AND, OR logic gates and other logic functions using the CMTL gates is presented. We also present several implementations of CMTL gates and describe the relative advantages and limitations of these implementations. SPICE simulation results for a 1.5 V, 0.18 μm CMOS technology are also presented for the different circuit configurations described in the paper. The design of circuits using different logic families results in circuits with different performance and power dissipation. A description of the different logic families and their strengths and weaknesses can be found in [1]. The functionality of a logic gate in the non-clocked logic families, such as static CMOS, differential cascode voltage switch logic (DCVS) [2], and pass-gate logic, and in the clocked logic families, such as single-rail domino, dual-rail domino, latched domino structures [3-5], and clocked pass-gate logic, is based on the use of the transistor as an ON/OFF switch. For a certain input assignment, some transistors turn ON (or OFF), which connects (or disconnects) the output node of the logic gate to one of the supply rails (Vdd/Gnd) through the ON (or OFF) transistors. In order to completely turn ON or turn OFF a transistor, the voltage swing at the gate input of the transistor has to be from Gnd to Vdd or from Vdd to Gnd.
Since the input to a logic gate is also the output of some other logic gate, the output nodes of logic gates must also make full voltage swings. The power dissipated by a circuit is proportional to the square of the supply voltage. To reduce the power dissipation, the supply voltage can be reduced, which results in a lower voltage swing. A lower supply voltage also results in reduced drive strength and performance degradation for the logic gates. An alternative approach is the design of logic gates in which the transistors are not required to turn completely ON or OFF. In [6], the logic gates are operated in the sub-threshold region and the transistors are not completely turned ON. However, the slow operating speed of these logic gates limits their application to low-speed circuits. In this paper, we use low-swing input voltages and impose the design constraint that a transistor cannot be turned OFF.
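The quadratic dependence of power on supply voltage mentioned above can be illustrated with the widely used first-order model of dynamic switching power, P = a · C · Vdd² · f. The model, the function names, and the numeric values below are assumptions chosen for illustration; they are not taken from the cited paper.

```python
def dynamic_power(activity: float, capacitance_f: float, vdd_v: float, freq_hz: float) -> float:
    """First-order dynamic switching power in watts: a * C * Vdd^2 * f (assumed model)."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def power_ratio(vdd_new_v: float, vdd_old_v: float) -> float:
    """Factor by which dynamic power changes when only the supply voltage changes."""
    return (vdd_new_v / vdd_old_v) ** 2


# Halving the supply voltage (illustrative values) cuts dynamic power to one quarter
print(power_ratio(0.75, 1.5))
```

The same model also shows why a reduced voltage swing on internal nodes saves power even when the supply itself is unchanged, which is the motivation behind the low-swing approach described in the text.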
The functional dependence of the ON-transistor properties (transconductance) on the input voltage needs to be used in the design of these logic gates. This requires a high-gain stage that can sense the difference in the transconductance, or in the current driven through the transistors, to set the output to the correct values. This can be obtained by using the latched domino structure. The latched domino structures [3-5] can provide high performance by using the latch structure as a sense amplifier. The differences between the existing work and the proposed work are one or more of the following. Most latched domino circuits are used with a differential NMOS tree in which only one of the output nodes is pulled down through the tree. In the proposed work, the logic circuitry is not differential and both sets of transistors are always ON. In latched domino circuits, the NMOS logic circuit and the latch are connected to the output nodes in parallel. This implies that there are multiple paths to charge or discharge the output nodes. In the proposed work, the logic network appears on the path from Vdd to the output node. By modulating the current that charges the output node, we set the output to the correct value. This increases the sensitivity of the logic gate and gives it the capability to sense small differences in current. In latched domino circuits, the logic network consists of NMOS transistors with full input swings. In the proposed work, we use PMOS transistors in the logic network with small voltage swings.
Fig 2.1 Circuit with CMTL gates
2. In “T. Ogawa, T. Hirose, T. Asai, and Y. Amemiya, “Threshold-logic devices consisting of subthreshold CMOS circuits,” IEICE Trans. Fundam. Electron., Commun. Comput. Sci., vol. E92-A, no. 2, pp. 436-442, 2009.” A threshold-logic gate device consisting of subthreshold MOSFET circuits is proposed. The gate device performs threshold-logic operation using the technique of current-mode addition and subtraction. Sample digital subsystems, i.e., adders and morphological operation cells based on threshold logic, are designed using the gate devices, and their operations are confirmed by computer simulation. The device has a simple structure and operates at low power dissipation, so it is suitable for constructing cell-based, parallel processing LSIs such as cellular-automaton and neural-network LSIs.
3. In “W. Prost et al., “Manufacturability and robust design of nanoelectronic logic circuits based on resonant tunnelling diodes,” Int. J. Circuit Theory Appl., vol. 28, no. 6, pp. 537-552, Nov./Dec. 2000.” The manufacturability of logic circuits based on quantum tunneling devices, namely double-barrier resonant tunneling diodes (RTD), is studied in detail. The homogeneity and reproducibility of III/V mesa technology-based devices is experimentally evaluated and interpreted using multiple I-V characteristic simulations. The experimental sensitivity of the RTD I-V parameters on well and barrier thickness is compared with multiple I-V simulations. With shrinking minimum feature size, the fluctuations in the peak current can be directly attributed to an RTD area variation caused by the increasing impact of lithography and etching on lateral dimensions. These results prove that the III/V technology fulfils the requirements for large-scale integration of RTD devices. A nanoelectronic circuit architecture based on an improved MOBILE threshold logic gate is presented. Detailed SPICE simulations using the experimental data show that clock and supply voltage fluctuations are tolerated up to ±0.1 V at a supply voltage of 0.7 V. Very strong local peak voltage variations of 15 per cent in opposite directions would be necessary to have a critical impact on the circuit functionality. Smaller deviations only affect the timing without degrading the reliability of the circuit. Consequently, the design of a stable power supply and clocking scheme is more important for the overall circuit performance than the small relative deviations of the RTD peak voltage.
4. In “S. Muroga, Threshold Logic and Its Applications. New York, NY, USA: Wiley, 1971” As an approach to clarifying the basic properties of threshold logic, the completely monotonic function is investigated. Its testing procedure, functional form, etc., are discussed by using a new concept, mutual monotonicity.
5. In “S. Leshner, K. Berezowski, X. Yao, G. Chalivendra, S. Patel, and S. Vrudhula, “A low power, high performance threshold logic-based standard cell multiplier in 65 nm CMOS,” in Proc. IEEE Comput. Soc. Annu. Symp. VLSI, Lixouri, Greece, Jul. 2010, pp. 210-215” In this paper we describe the design, simulation, fabrication, and test of a 32-bit 2's complement integer multiplier constructed from a combination of CMOS standard cells and threshold logic elements in a 65 nm low power process. As compared to a multiplier designed solely using CMOS standard cells, the threshold logic based multiplier is 1.23x smaller and consumes 1.41x less dynamic power and 2.5x less leakage power at the same process corner. Increasing demand for greater functionality, faster response, and longer battery life in embedded mobile applications continues to push aggressively for higher performance and lower power. Meanwhile, the competitive marketplace is forcing greater flexibility, reduced costs, and faster time-to-market, making broad use of standard cell-based design automation inevitable. As such, automated design tools have become extremely sophisticated, capable of addressing low power design in a variety of ways. Consequently, continuing to push the boundaries of power efficiency in semi-custom ICs requires a lateral shift in the value delivered by the underlying standard cells. In this work, we address this challenge through a design technique that utilizes a clocked logic style that implements threshold logic [5]. Threshold logic enables the compression of complex computations into fewer transistors, providing improvements in area, speed, and/or power consumption. Additionally, threshold logic gates can be constructed to be consistently reliable under noisy and inconsistent operating conditions, and as such retain compatibility with standard cell design flows. The prototype circuit described in this work capitalizes on merging the advantages of traditional static CMOS logic and clocked threshold logic, yielding substantial reductions in power consumption for high performance embedded applications.
Fig 2.2 absorption of the function a(b(c+d+e) + c(d+e) + de) + bcde into a single threshold gate with inputs {a, b, c, d, e}, weights {2, 1, 1, 1, 1}, and threshold 4
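The equivalence stated in the caption of Fig 2.2 can be checked exhaustively over all 32 input patterns. The sketch below assumes the gate outputs 1 when the weighted sum is greater than or equal to the threshold, which is the convention under which the caption's weights and threshold reproduce the Boolean function; this differs from the strict inequality used in (1). The function names are hypothetical.

```python
from itertools import product


def reference_function(a: int, b: int, c: int, d: int, e: int) -> bool:
    """Boolean function from the caption: a(b(c+d+e) + c(d+e) + de) + bcde."""
    at_least_two_of_bcde = (b and (c or d or e)) or (c and (d or e)) or (d and e)
    return bool((a and at_least_two_of_bcde) or (b and c and d and e))


def threshold_gate(a: int, b: int, c: int, d: int, e: int) -> bool:
    """Single gate with weights {2, 1, 1, 1, 1} and threshold 4 (>= convention assumed)."""
    return 2 * a + b + c + d + e >= 4


def functions_match() -> bool:
    """Compare both descriptions on every one of the 2^5 input patterns."""
    return all(
        reference_function(*bits) == threshold_gate(*bits)
        for bits in product((0, 1), repeat=5)
    )


print(functions_match())
```

Intuitively, with a = 1 the gate needs at least two of b, c, d, e to be high, and with a = 0 it needs all four, which is exactly what the two terms of the Boolean expression say.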
6. In “M. Sharad, D. Fan, and K. Roy, “Ultra-low energy, high performance dynamic resistive threshold logic,” 2013.” We propose dynamic resistive threshold-logic (DRTL) design based on non-volatile resistive memory. A threshold logic gate (TLG) performs summation of multiple inputs multiplied by a fixed set of weights and compares the sum with a threshold. DRTL employs resistive memory elements to implement the weights and the thresholds, while a compact dynamic CMOS latch is used for the comparison operation. The resulting DRTL gate acts as a low-power, configurable dynamic logic unit and can be used to build fully pipelined, high performance programmable computing blocks. Multiple stages in such a DRTL design can be connected using energy-efficient low-swing programmable interconnect networks based on resistive switches. Owing to memory-based compact logic and interconnect design and high-speed dynamic-pipelined operation, DRTL can achieve more than two orders of magnitude improvement in energy-delay product as compared to look-up table based CMOS FPGA. In recent years several computing schemes have been explored based on nano-scale programmable resistive elements, generally categorized under the term ‘memristor’ [1-3]. Of special interest are those that are amenable to integration with state-of-the-art CMOS technology, like memristors based on Ag-Si filaments [2, 3] or spintronic memristors based on domain-wall magnets and magnetic tunnel junctions [9, 10]. Such devices can be integrated into metallic crossbars to obtain high-density crossbar memory arrays. Some of these devices can facilitate the design of multi-level, non-volatile memory [3, 4].
The device technologies for non-volatile resistive memory have led to possibilities of implementing programmable computing hardware that combines logic with memory. One such scheme is threshold logic [5-8]. It consists of a summation of weighted inputs followed by a threshold operation, as given in eq. 1:

$$
Y = \operatorname{sign}\Big(\sum_i In_i W_i + b_i\Big) \tag{1}
$$

Here, Ini, Wi, and bi are the inputs, weights, and thresholds, respectively. While a memristor array can be employed to perform current-mode analog summation of input signals, the thresholding operation requires a comparator circuit. In recent proposals [5-7], such circuits have been designed using analog CMOS units that can be complex in terms of area and inefficient in terms of power consumption. Such circuits, based on analog amplifiers and current mirrors, may suffer from stringent mismatch constraints and hence may not be scalable. Recently, the design of resistive threshold logic gates (RTLG) based on Ag-Si memristors was demonstrated [8], where the authors employed a simple CMOS flip-flop as a comparator. However, such a scheme would require the application of a large voltage and static current across the memristors in order to achieve enough sensing margin, leading to large power consumption. In this work we propose the design of dynamic resistive threshold logic using programmable resistive elements. Noting that threshold logic synthesis with small fan-in TLGs requires lower comparator resolution, we employ a compact, low-power, and high-speed dynamic CMOS latch for the thresholding operation. Such hybrid dynamic RTLGs can be pipelined to achieve high performance, thereby leading to energy-efficient computation.
Design of Dynamic Resistive Threshold Logic Gate:
For the design of the DRTL gate (DRTLG) we exploit the fact that threshold logic synthesis using TLGs with small fan-in (2 to 3) needs fewer levels in the input weights as well as reduced comparator resolution (the minimum percentage difference between threshold and input summation to be detected) [11]. The set of weight levels needed for different fan-in restrictions is depicted in Fig 2.3(b), which shows that for a fan-in restriction of 2, only two weight levels are required. The number of levels in the threshold was found to be 4 in this case. A lower number of weight levels implies higher variation tolerance for the weights and a relaxed resolution constraint for the comparison operation. For instance, for ideal 2-level weights, a 2-input TLG requires a comparator of only 25% resolution. The figure also shows that the increase in the number of nodes when reducing the fan-in restriction from 4 to 2 is only marginal. Hence, owing to the aforementioned advantages offered by lower fan-in restrictions, in this work we limit our discussion to the 2-input DRTLG.
Fig 2.3 (a) A threshold logic gate, (b) Change in number of TLG required for a given logic block for different TLG restrictions
7. In “P. Celinski, J. F. López, S. Al-Sarawi, and D. Abbott, “Low power, high speed, charge recycling CMOS threshold logic gate,” Electron. Lett., vol. 37, no. 17, pp. 1067-1069, Aug. 2001.” A new implementation of a threshold gate based on a capacitive input, charge recycling differential sense amplifier latch is presented. Simulation results indicate that the proposed structure has very low power dissipation and high operating speed, as well as robustness under process, temperature, and supply voltage variations, and is therefore highly suitable as an element in digital integrated circuit design. As the demand for higher performance, very large scale integration processors with increased sophistication grows, continuing research is focused on improving the performance, area efficiency, and functionality of the arithmetic and other units contained therein. Low power dissipation has become a major issue demanded by the high performance processor market to meet the high density requirements of advanced VLSI processors. The importance of low power is also evident in portable and aerospace applications, and is related to issues of reliability, packaging, cooling, and cost. Threshold logic (TL) was introduced over four decades ago, and over the years has promised much in terms of reduced logic depth and gate count compared to traditional AND-OR-NOT (AON) logic-gate based design. However, lack of efficient physical realizations has meant that TL has, until recently, had little impact on VLSI. Efficient TL gate realizations have recently become available, and a number of applications based on TL gates have demonstrated its ability to achieve high operating speed and significantly reduced area [1]. Both static and dynamic TL gate implementations have been devised. Purely static gates such as neuron-MOS suffer from limited fan-in [1], typically less than 12 inputs. In addition, some of the existing dynamic gates have relatively high static power dissipation, and some require multiple clock phases [1, 2], introducing the drawbacks associated with clock signal routing cost, clock skew, and clock power dissipation. Although the dynamic approach proposed in [3] dissipates no static power, it will be shown that its dynamic power dissipation is comparable to the total power dissipation of other existing approaches. In this Letter we propose a new realization for CMOS threshold gates which operates on a single phase clock, is capable of high-speed operation, is suitable for high fan-in gate implementation, and has a very low overall power dissipation. Threshold logic: A threshold logic gate is functionally similar to a hard limiting neuron. The gate takes n binary inputs x1, x2, ..., xn and produces a single binary output Y. The weighted sum of the binary inputs is computed, followed by a thresholding operation. The Boolean function computed by such a gate is called a threshold function, and it is specified by the gate threshold T and the weights w1, w2, ..., wn, where wi is the weight corresponding to the ith input variable xi [4].
Any threshold function may be computed with positive integral weights and a positive real threshold, and all Boolean functions can be realised by a threshold gate network of depth at most two [4]. A TL gate can be programmed to realise many distinct Boolean functions by adjusting the threshold T. For example, an n-input TL gate with T = n will realise an n-input AND gate, and by setting T = n/2, the gate computes a majority function.
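The effect of adjusting the threshold T on a unit-weight gate can be sketched as follows. The sketch assumes the gate outputs 1 when the weighted sum is greater than or equal to T, and it checks the majority case only for odd n, where T = n/2 cannot produce a tie; both choices, and all function names, are assumptions made for illustration.

```python
from itertools import product
from typing import Optional, Sequence


def tl_gate(inputs: Sequence[int], threshold: float, weights: Optional[Sequence[int]] = None) -> int:
    """Threshold gate: 1 if the weighted sum is >= threshold (assumed convention), else 0."""
    if weights is None:
        weights = [1] * len(inputs)  # unit weights by default
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)


def matches_and(n: int) -> bool:
    """With T = n and unit weights, the gate should behave as an n-input AND."""
    return all(tl_gate(bits, n) == int(all(bits)) for bits in product((0, 1), repeat=n))


def matches_majority(n: int) -> bool:
    """With T = n/2 and unit weights, the gate should output 1 when more than half the inputs are 1."""
    return all(
        tl_gate(bits, n / 2) == int(sum(bits) > n / 2)
        for bits in product((0, 1), repeat=n)
    )


print(matches_and(4), matches_majority(5))
```

For even n, an input pattern with exactly n/2 ones sits on the threshold, so how the tie resolves depends on the comparison convention of the particular gate implementation.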
Fig 2.4 Proposed CRTL gate structure
Charge recycling threshold logic (CRTL): Fig 2.4 shows the proposed circuit structure for implementing a threshold gate with positive weights and threshold. It is based on the charge recycling asynchronous sense differential logic (ASDL) developed by Bai-Sun.
The main element is the sense amplifier (cross-coupled transistors M1-M4), which generates output Y and its complement Ȳ. Pre-charge and evaluate are specified by the dual enable clock signals E and its complement Ē. The inputs xi are capacitively coupled onto the floating gate of M5, and the threshold is set by the gate voltage T of M6. Weight values are thus realised by setting the capacitors Ci to appropriate values. Typically, these capacitors are implemented between the polysilicon 1 and polysilicon 2 layers, although alternatives, such as trench capacitors available in DRAM processes, may also be used. The ASDL comparator architecture from which the proposed CRTL gate is derived implements high performance, energy efficient operation by recycling charge that has already been drawn from the supply. The enable signal E controls the pre-charge and activation of the sense circuit. Transistors M8 and M9 equalize the outputs. The logic gate has two phases of operation, the evaluate phase and the equalize phase. When Ē is high, the output voltages are equalized. When E is high, the outputs are disconnected and the differential circuit (M5-M7) draws different currents from the equalized nodes Y and Ȳ. The sense amplifier is activated after the delay of the enable inverters and amplifies the difference in potential now present between Y and Ȳ, accelerating the transition. In this way the circuit structure evaluates whether the weighted sum of the inputs is greater or less than the threshold T, and a TL gate is realized. To ensure reliable operation, the gate layout must be symmetrical to minimize transistor mismatches, and interconnects must be of similar length and width to eliminate interconnect-related mismatch. The delay of the enable inverter needs to be sufficiently large so that the output nodes have sufficient voltage difference at the start of sensing.
Fig 2.5 Power dissipation against frequency
To evaluate and compare the performance of the proposed CRTL gate against other CMOS TL gate implementations, a 20-input majority gate (T = 10, achieved by setting voltage T = Vdd/2) was designed in a 0.25 μm process. The 20-input majority function was also implemented using clocked-neuron-MOS [1], CMOS capacitor coupling logic (CCCL) [2], and the TL structure reported in [3] (LPTL). The unit capacitance value used in each implementation was 5 fF. To compare the power dissipation, each of the gates was designed to have similar delay and output rise and fall times, and was loaded by equally sized inverters. All transistors were of minimum length for each implementation, and transistor widths were selected to achieve the above timing requirements. All inputs to each gate were switched such that during each evaluation cycle the minimum majority or minority was achieved (11 out of 20 inputs were high or low, respectively). Also, the power dissipated in the inverters driving the clock and data inputs was included in the total power dissipation measured for each gate. CRTL improves power dissipation by between 15 and 30% over the other CMOS threshold gate implementations.
Fig 2.6 Input, enable and output simulation results
To ensure correct behavior under process and operating point variations, the proposed gate was tested at 45 corners (Vdd at 2, 2.5, and 3 V; process Slow-Slow, Slow-Fast, Fast-Slow, Fast-Fast, and Typical-Typical; and temperature at -25, 75, and 125°C). Fig 2.6 shows the transient waveform results from the HSPICE simulation for the 2 V, typical, 75°C corner at 300 MHz. Simulation results of the 20-input majority gate also indicate that the CRTL gate can operate even at frequencies over 400 MHz with low power dissipation under worst case conditions.
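The figure of 45 corners follows from combining three supply voltages, five process corners, and three temperatures. A short enumeration makes the count explicit; the abbreviated corner labels (SS, SF, FS, FF, TT) and the function name are conventions assumed for this sketch rather than names from the cited letter.

```python
from itertools import product

SUPPLY_VOLTAGES_V = (2.0, 2.5, 3.0)
PROCESS_CORNERS = ("SS", "SF", "FS", "FF", "TT")  # abbreviated labels, assumed here
TEMPERATURES_C = (-25, 75, 125)


def enumerate_corners():
    """Return every (Vdd, process, temperature) combination as a list of tuples."""
    return list(product(SUPPLY_VOLTAGES_V, PROCESS_CORNERS, TEMPERATURES_C))


corners = enumerate_corners()
print(len(corners))  # 3 voltages x 5 process corners x 3 temperatures
```

A list like this could drive a batch of simulation runs, one per tuple, although how each corner is passed to the simulator depends on the tool flow in use.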
8. In “THRESHOLD LOGIC ELEMENT HAVING LOW LEAKAGE POWER AND HIGH PERFORMANCE” Embodiments of a threshold logic element are provided. Preferably, embodiments of the threshold logic element discussed herein have low leakage power and high performance characteristics. In the preferred embodiment, the threshold logic element is a threshold logic latch (TLL). The TLL is a dynamically operated current-mode threshold logic cell that provides fast and efficient implementation of digital logic functions. The TLL can be operated synchronously or asynchronously and is fully compatible with standard Complementary Metal-Oxide-Semiconductor (CMOS) technology.
3. EXISTING SYSTEM
3.1 CMTLG AND DCML IMPLEMENTATIONS OF A THRESHOLD LOGIC FUNCTION
CMTLG is a CMOS-based implementation of a TLG, shown in Fig 3.1 [1]. The CMTLG can be divided into two parts, the differential part and the sensor part. The differential part can be subdivided into two parts, the threshold part and the inputs part. In both the threshold part and the inputs part, all the transistors are connected in parallel. The transistors in the threshold part are always ON, and the total current flowing through the threshold part is the threshold current IT. The number of active (ON) PMOS transistors in the inputs part depends on the input pattern applied. The total current passing through the input side for a particular input pattern is the active current IA. The nodes connecting the differential part and the sensor part on the input side and the threshold side are M1 and M2, respectively, and nodes O and OB are the output nodes, as shown in Fig 3.1. The sensor part has three pMOS transistors P1, P2, and P3, and four nMOS transistors N1, N2, N3, and N4, as shown in Fig 3.1. If the size of the sensor is S, then all the pMOS transistors in the sensor part have a size of S μm and all the nMOS transistors in the sensor part have a size smaller than S μm.
The operation of the CMTLG is divided into two phases [1]: the equalization phase and the evaluation phase. These phases are explained with the help of Fig 3.1 and Fig 3.2. When the clock (clk) applied to the CMTLG is high, the circuit is in the equalization phase. When clk is low, the circuit is in the evaluation phase [1]. In the equalization phase, transistors N1 and N2 are ON, nodes M1 and M2 have the same voltage because of transistor N1, and nodes O and OB have the same voltage because of transistor N2. In the evaluation phase, transistors N1 and N2 are OFF, and if the threshold current is less than the active current, then the voltage at node O rises faster than that at node OB [1]. If during the evaluation phase the threshold current exceeds the active current, then the voltage at node OB rises faster than that at node O [1]. This system derives an analytical formula for the optimum sensor size, which is used to obtain the minimum delay for a given threshold and number of inputs.
The CMTLG is then designed using the optimum sensor size. With appropriate sensor sizing, the fan-in can reach 150, assuming that all inputs have minimum weight.
Fig 3.1 Current Mode Threshold Logic Gate
Fig 3.2 shows the two phases of the clock, the voltage at output nodes O and OB, and the voltage difference between nodes O and OB (dV). The delay of a CMTLG can be divided into two phases, the activation time and the boosting time. The first phase is the time taken by the CMTLG to develop a small voltage difference (200 μV) across the output nodes O and OB. In this phase, the difference between IA and IT leads to a gradually increasing voltage difference between the nodes M1 and M2. The time taken by the CMTLG to develop the initial voltage difference is the activation time TA. The activation time depends mainly on the differential part. The second phase is the time taken by the sensor part (the back-to-back connected inverters) to boost the initial voltage difference to a logic state at the output nodes. This time is referred to as the boosting time TB. The boosting time depends mainly on the sensor part.
Fig 3.2 Output voltages and their difference in the two clock phases for CMTLG.
An alternative differential clock threshold logic implementation is presented in [11], and it is referred to as the differential current mode logic (DCML) approach. Its block diagram is shown in Fig 3.3. It is also divided into the differential part and the sensor part. The currents through the threshold part and the inputs part are also denoted by IT and IA, respectively. The sensor part consists of four pMOS transistors, labeled P1 to P4, and six nMOS transistors, labeled N1 to N6. The load capacitance CL is applied to both the output nodes O and OB.
Fig 3.3 Block diagram of differential current mode logic
The applied clock is divided into two phases: when the clock is high the TLG is in the equalization phase, and when it is low it operates in the evaluation phase. In the equalization phase, NMOS transistors N1, N2, N3, and N6 are active. Transistor N1 equalizes the voltage at nodes M1 and M2. Similarly, transistor N2 equalizes the voltage at nodes M3 and M4. Transistors N3 and N6 provide a discharge path for nodes O and OB of Fig 3.3. If there is a voltage difference between nodes O and OB during the evaluation phase, then the sensor part will detect the voltage difference and boost the voltage at the output nodes O and OB to the desired level. When the active current IA is greater than the threshold current IT, the voltage at the output node O rises faster than the voltage at node OB. As a result, a high voltage is obtained at node O and a low voltage is obtained at node OB.
When IT is greater than IA, the voltage at OB rises faster than the voltage at O, so a high voltage results at OB and a low voltage results at O.
Fig 3.4 Output voltages and their difference in the two clock phases for DCML
Fig 3.4 shows the two phases of the clock, the voltage at nodes O and OB, and the voltage difference between O and OB (dV). The delay of DCML is divided into the activation time TA and the boosting time TB.
A method to obtain optimum sensor sizes: The CMTLG of [1] assumes that all inputs have minimum weights. If a TLG requires weight wi > 1 (greater than minimum weight) for some input i, then input i can instead be implemented with wi minimum-weight inputs in the CMTLG of Fig 3.1. We consider an N-input CMTLG of [1] that can be used to implement different TLG functions for a given value of N and T. This section shows how to identify the optimum sensor size for the N-input CMTLG of [1], so that the delay of any TLG implemented by the N-input CMTLG is minimized.
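The replacement of a weight wi > 1 by wi minimum-weight inputs can be illustrated with a short sketch. The function names are hypothetical, and the code assumes positive integer weights only; it models the logical bookkeeping, not the transistor-level circuit.

```python
def expand_to_unit_inputs(weights, inputs):
    """Replicate each input x_i exactly w_i times so every copy has unit weight.

    Assumption: weights are positive integers (negative weights are not handled here).
    """
    expanded = []
    for w, x in zip(weights, inputs):
        if w < 1:
            raise ValueError("this sketch handles positive integer weights only")
        expanded.extend([x] * w)
    return expanded


def active_count(weights, inputs):
    """Number of active unit-weight inputs, equal to the weighted sum of the inputs."""
    return sum(expand_to_unit_inputs(weights, inputs))


if __name__ == "__main__":
    weights = [5, 4, 1, 1]   # illustrative weight configuration
    pattern = [1, 0, 1, 0]   # x1 and x3 are high
    print(expand_to_unit_inputs(weights, pattern))
    print(active_count(weights, pattern))
```

After expansion the gate has as many unit-weight inputs as the sum of the original weights, which is why a single N-input CMTLG can stand in for many weighted configurations.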
4. PROPOSED SYSTEM
4.1 Design Considerations in Integrated Circuits
After guaranteeing correct digital functionality, the primary consideration for system designers has always been speed. A circuit is specified to operate within a particular delay, otherwise the entire system may not work; further reduction is beneficial but not strictly necessary. Other factors may have equal or greater importance than power dissipation; implementation area and reliability are issues the designer must also take into account. It is worth noting that power reduction techniques are not necessarily negatively correlated with delay reduction. For example, one method to reduce delay in a circuit's critical path is to upsize the driving strength of gates, which results in increased power dissipation. However, reducing interconnect capacitance, which is another way to lower delay, reduces both power and delay. Generally, great power savings can be achieved if delay is not an issue, but optimizing power without considering delay is of little value.
4.2 Why Low Power?
Power dissipation limitations come in two forms. The first is related to cooling considerations when implementing high performance systems. High-speed circuits dissipate large amounts of energy in a short amount of time, generating a great deal of heat. This heat needs to be removed by the package on which the integrated circuits are mounted. Heat removal may become a limiting factor if the package cannot sufficiently dissipate this heat or if the required thermal components are too expensive for the application. The second limitation of high-power circuits relates to the increasing popularity of portable electronic devices. Laptop computers, portable video players and cellular phones all use batteries as a power source. These devices provide a limited time of operation before they require recharging. To extend battery life, low power operation is desirable in integrated circuits.
4.3 Circuit Design
Circuit design plays an important role in the design of digital circuits such as multipliers. First, to guarantee that the multiplier works at the desired clock rate, the designer has to know the delay of the critical path and the time required to insert a pipeline stage. Second, to reduce the area of the multiplier, several adder architectures are investigated. Circuit analysis helps the designer verify the function and performance of the adders. The architecture of the adder has to be determined first. Then the number of pipeline stages can be decided from the speed of the adder. The size of the multiplier should be as small as possible, provided all the requirements are met. Fast arithmetic requires fast circuits. Fast circuits require small size, to minimize the delay effects of wires. Small size implies a single-chip implementation, which minimizes wire delays and makes it possible to implement these fast circuits as part of a larger single-chip system, minimizing input/output delays. The increasing demand for low-power VLSI calls for, among other things, power-efficient logic styles. Performance criteria for logic styles are circuit speed, circuit size, power dissipation, and wiring complexity, as well as ease of use and generality of gates in cell-based design techniques. Dynamic logic styles are often a good choice for high-speed, but not for low-power, circuit implementations because of their high node activity and large clock loads. This chapter reviews various logic styles suitable for low power.
4.4 CMOS Logic Structure
Today CMOS (Complementary Metal Oxide Semiconductor) is the primary technology in the semiconductor industry. Most high speed microprocessors are implemented using CMOS. Contemporary CMOS technology is characterized by:
Small minimum-sized transistors, allowing for dense layouts, although the interconnect limits the density.
Low quiescent power: the power consumption of conventional CMOS circuits is largely determined by the AC power caused by the charge and discharge of capacitances.
4.4.1 Low Power And High-Speed Dual-Clock-Based Current Mode TLG Implementation
A new TLG implementation is proposed. It is called DCCML. As the name indicates, two clocks are used to achieve low power consumption and high speed. The approach consists of two steps. First, the set of functions that can be implemented using CMTLG for a given input configuration (number of inputs N and threshold T) is grouped into equivalence classes. We show that when T + 1 inputs are active on the input side, the TLG exhibits its worst delay. The block diagram of DCCML is shown in Fig 4.1. As in previous approaches, the DCCML is divided into two basic blocks: the differential block and the sensor block. The differential block is further divided into four blocks: the positive threshold, the negative inputs, the negative threshold, and the positive inputs. All the transistors in the differential block are equal-sized PMOS transistors and are connected in parallel, as shown in Fig 4.1. The sensor block consists of six PMOS transistors P1 to P6 and three NMOS transistors N1, N2, and N3. The gates of transistors P1 and N1 are connected to Clk1, and the gates of transistors P2, P5, and P6 are connected to Clk2. Transistor N1 acts as an equalizing transistor and equalizes the voltage at nodes OP and OPB. Transistors P5 and P6 isolate the differential block from the sensor block. The transistors in the positive threshold and negative threshold blocks are always active. Transistors in the positive and negative inputs blocks are active depending upon the input pattern applied.
The input pattern applied to the positive inputs block is denoted by {x1, x2, ..., xI}. Let N denote the number of inputs, and I denote the number of positive inputs. Then the number of negative inputs is N − I. The input pattern applied to the negative inputs block is denoted by {xI+1, xI+2, ..., xN}. Consider a function f with a possible weight configuration {w1, w2 : wT, w3, w4} = {2, 2 : 3, −1, −1}. In the given weight configuration, we have two positive weights w1 and w2 and two negative weights w3 and w4. Weights w1 and w2 are implemented in the positive inputs section, and weights w3 and w4 are implemented in the negative inputs section. The threshold weight wT is implemented in the positive threshold section.
The currents through the four blocks (positive threshold, negative inputs, negative threshold, and positive inputs) are denoted by IPT, INI, INT, and IPI, respectively. The currents through transistors P5 and P6 are denoted by IP5 and IP6. Here, IP5 = IPT + INI and IP6 = INT + IPI. Nodes OP and OPB are the output nodes. The load capacitance is denoted by CL.
The operation is divided into three phases: the equalization phase, the pre-evaluation phase, and the final-evaluation phase. When clocks Clk1 and Clk2 are high, the circuit is in the equalization phase. When clocks Clk1 and Clk2 are low, the circuit is in the pre-evaluation phase.
When Clk1 is low and Clk2 is high, the circuit is in the final-evaluation phase. See also Fig 4.2. It is noted that the operation of the gate is not affected when the two clocks are not completely aligned. The possible cases of misalignment are: 1) the falling edge of Clk2 comes before the falling edge of Clk1 and 2) the falling edge of Clk2 comes after the falling edge of Clk1.
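The current relations IP5 = IPT + INI and IP6 = INT + IPI, and the mapping from clock levels to phases, can be sketched as a small behavioral model. This is only an illustration under stated assumptions: every active transistor is taken to carry one unit of current per unit of weight, the negative threshold weight is passed in explicitly (set to 0 for the example configuration, which the text does not specify), and a tie between the two currents is reported as undefined. All function names are hypothetical.

```python
def dccml_currents(pos_weights, neg_weights, w_pt, w_nt, x_pos, x_neg):
    """Return (I_P5, I_P6) in multiples of one assumed unit transistor current."""
    i_pi = sum(w for w, x in zip(pos_weights, x_pos) if x)       # positive inputs block
    i_ni = sum(abs(w) for w, x in zip(neg_weights, x_neg) if x)  # negative inputs block
    i_pt = w_pt  # positive threshold block, always active
    i_nt = w_nt  # negative threshold block, always active
    return i_pt + i_ni, i_nt + i_pi


def dccml_output(i_p5, i_p6):
    """Name the output node that rises faster, or flag a tie as undefined."""
    if i_p6 > i_p5:
        return "OP"
    if i_p5 > i_p6:
        return "OPB"
    return "undefined"


def dccml_phase(clk1, clk2):
    """Map the two clock levels to the operating phase named in the text."""
    if clk1 and clk2:
        return "equalization"
    if not clk1 and not clk2:
        return "pre-evaluation"
    if not clk1 and clk2:
        return "final-evaluation"
    # Clk1 high with Clk2 low is not a named phase; it can occur briefly
    # when the falling edge of Clk2 arrives before that of Clk1.
    return "misaligned"


if __name__ == "__main__":
    # Weight configuration {2, 2 : 3, -1, -1} with both positive inputs high.
    i_p5, i_p6 = dccml_currents([2, 2], [-1, -1], 3, 0, [1, 1], [0, 0])
    print(i_p5, i_p6, dccml_output(i_p5, i_p6))
    print(dccml_phase(False, True))
```

The model ignores timing, capacitance and transistor sizing; it only captures which side of the differential block carries more current for a given input pattern.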
Fig 4.1 Block diagram of DCCML TLG
Fig 4.2 Clocks in DCCML
In the first case, the current from the differential part is equalized because of transistor N1, and the evaluation phase starts after the falling edge of Clk1. In the second case, there will be no current from the differential part as Clk2 is not active yet. Hence, the pre-evaluation phase starts after the falling edge of Clk2. The implementation avoids a very early arrival of Clk1, because in that case an unstable signal might result in an erroneous output. If the current IP6 through the PMOS transistor P6 is greater than the current IP5 through the PMOS transistor P5, then the voltage at the output node OP rises faster than at the output node OPB. As a result, a high voltage is obtained at output node OP and a low voltage at output node OPB. Otherwise, the voltage at the output node OPB rises faster than at the output node OP. As a result, a high voltage is obtained at the output node OPB and a low voltage at node OP. In DCCML, the PMOS transistors P1, P2, P5, P6 and the PMOS transistors in the differential block are used to provide the initial voltage at the output nodes OP and OPB. Using Clk2, we stop the current flowing from the differential block to the sensor block once the initial voltage difference is established at the nodes OP and OPB. In this way, Clk2 minimizes the power consumption of the circuit. Transistors P5 and P6 are also used to isolate the high-capacitance circuit block (the differential block) from the output nodes. Hence, in the final-evaluation phase the sensor block drives the load capacitance as well as the capacitance of a single transistor, P5 or P6. Delay is reduced because the duration of the final-evaluation phase is small. The voltage at the output nodes OP and OPB and the voltage difference (dV) between them are shown in Fig 4.3 for the three clock phases.
Fig 4.3 Voltage at output nodes OP and OPB and dV during the three clock phases
In particular, the delay of the DCCML is divided into two time phases: the activation time and the boosting time. The activation time is the time taken by the circuit to develop an initial voltage difference at the output nodes OP and OPB. The boosting time is the time taken by the DCCML to bring the initial voltage to the correct voltage at the output nodes OP and OPB. In the pre-evaluation phase, both the differential part and the sensor part are active, and therefore the activation time is not affected. In the final evaluation phase, the differential part is kept inactive using Clk2. Therefore, the effect of internal capacitance due to the differential part is isolated. Hence, it takes very little time to boost the outputs to the final value. The power is also reduced due to the isolation of the differential part.
4.5 DELAY MINIMIZATION BY AN APPROPRIATE SENSOR SIZE SELECTION
This section presents an analytical formula to compute the sensor size that minimizes the gate delay. Let N denote the sum of all positive input weights (the number of inputs when every input has unit weight), and T the sum of the threshold weight and negative input weights. Our analysis assumes that all the input weights are connected in parallel, and that each weight wi can be implemented by wi unit-width PMOS transistors connected in parallel. This is an accurate assumption. We have implemented TLG weights using a smaller number of wider PMOS transistors connected in parallel, and SPICE simulations showed no difference in the performance of the TLG. This is further explained in the example below.
Example 2: Consider a threshold function where N, the sum of positive input weights, is 11. Let T, the sum of the threshold weight and negative input weights, be 4. In this function, we have (N, T) = (11, 4). Gates {11 : 4}, {6, 5 : 4}, {5, 5, 1 : 4}, {5, 4, 1, 1 : 4}, {4, 4, 1, 1, 1 : 4}, and {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 : 4} were implemented in the 45-nm technology. SPICE simulation shows an identical delay of 297 ps.
In the following, no distinction is made among functions for which the sum of all positive input weights is N and the sum of the negative input weights and threshold weight is T. Since all these threshold functions exhibit the same delay, they will be denoted by the pair (N, T). The remaining focus is on how to determine the optimum sensor size that minimizes the delay of any (N, T) function. The work considers that the TLG operates under an input pattern that exhibits the worst case propagation delay, and then focuses on deriving an analytical model that expresses TLG delay in terms of the sensor size S in that setting. In a first step, we identify the pattern that gives the highest delay for the function. In a second step, we consider this worst case scenario, and the delay is expressed as a function of the sensor size S. Then, we operate on that function in order to optimize the sensor size S.
In the first step, it is shown that when T + 1 inputs are active, the TLG exhibits its worst delay. Let NA = ∑ wi, summed over all inputs i with xi = 1. Such inputs are called active, and the respective PMOS transistors are also called active. Assume that the initial current flowing through an active minimum-sized PMOS is Ip. Then the current flowing through the threshold side of the TLG is T · Ip, and the current flowing through the input side when NA inputs are active is NA · Ip.
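The grouping of gates into (N, T) pairs can be checked with a short sketch. The helper name is hypothetical, and the treatment of negative weights (adding their magnitudes to the threshold weight) is an assumption about how T is formed, made here for illustration.

```python
def nt_pair(positive_weights, threshold_weight, negative_weights=()):
    """Return the (N, T) pair of a gate.

    N is the sum of positive input weights. T is the threshold weight plus the
    magnitudes of any negative input weights (an assumption of this sketch).
    """
    n = sum(positive_weights)
    t = threshold_weight + sum(abs(w) for w in negative_weights)
    return n, t


if __name__ == "__main__":
    # The six gates of Example 2, written as (positive weights, threshold weight).
    gates = [
        ([11], 4),
        ([6, 5], 4),
        ([5, 5, 1], 4),
        ([5, 4, 1, 1], 4),
        ([4, 4, 1, 1, 1], 4),
        ([1] * 11, 4),
    ]
    for weights, threshold in gates:
        print(weights, nt_pair(weights, threshold))
```

All six gates map to the same pair, which is the property the text relies on when it treats them as a single (N, T) function.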
To obtain the worst case delay for logic 1 at the output node O, the current difference IA − IT should be minimum. For logic 0, the magnitude of this current difference should also be minimum. Since the transistors on the threshold side are always ON, the maximum delay for a rising transition of the output is obtained when we have T + 1 active transistors. Likewise, T − 1 active transistors give the worst case delay for a falling transition at the output. However, it is known that the worst case delay occurs for the rising output transition [1]. Hence, a worst case delay pattern is one that gives the least current difference at nodes M1 and M2. The following is an example where SPICE simulations confirm this analysis.
Example 3: Consider a CMTLG implementation of a function with T = 4, N = 11, and sensor size S = 10. The input pattern that has T + 1 active inputs gives the worst delay. Hence, the highest delay is encountered when NA = 5. Fig 4.4 shows the delay of the TLG obtained using SPICE in 45-nm technology. When NA varies in the range [5, 11], the output transition is rising, and the highest rising delay occurs when NA = 5. When NA is in the range (0, 5), the transition at the output is falling, and in that case the delay is less.
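The worst case pattern can be located by a small search over NA. The sketch below assumes, as the text argues, that delay is largest when the current difference, taken as proportional to |NA − T|, is smallest; the function name is hypothetical and no SPICE data is involved.

```python
def worst_case_active_count(n, t, rising=True):
    """Return the number of active inputs NA that minimizes |NA - T|.

    rising=True searches patterns with NA > T (output rises);
    rising=False searches patterns with NA < T (output falls).
    NA == T is excluded because the gate output is undefined there.
    """
    if rising:
        candidates = [na for na in range(n + 1) if na > t]
    else:
        candidates = [na for na in range(n + 1) if na < t]
    if not candidates:
        raise ValueError("no input pattern produces this transition")
    return min(candidates, key=lambda na: abs(na - t))


if __name__ == "__main__":
    n, t = 11, 4  # the function of Example 3
    print(worst_case_active_count(n, t, rising=True))
    print(worst_case_active_count(n, t, rising=False))
```

For N = 11 and T = 4 the search returns T + 1 for the rising case and T − 1 for the falling case, matching the analysis above.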
Fig 4.4 CMTLG delay with N = 11 and T = 4 as NA varies
Similar behavior has been observed for different values of T and N. Furthermore, extensive SPICE simulations have confirmed that the worst case delay of DCML gates is obtained when NA = T + 1 and also occurs when the output is rising.
In the second step of the proposed method, it is shown how to obtain an analytical expression that approximates the time delay TD as a function of the sensor size S, given N and T. The delay time TD is divided into two phases: the activation time TA and the boosting time TB. The tradeoff between the two phases is analyzed by varying the sensor size S and keeping all the other parameters N, T, and NA constant.
During the activation time, the major current component is the current from the differential part. From the schematics in Fig 3.1 and Fig 4.1, due to the voltage difference at nodes O and OB, we conclude that |IA − IT| is proportional to |NA − T|. The activation time is inversely proportional to this current. It is proportional to a charge that depends on two components: the voltage difference that is required at the end of the activation phase and the capacitance that the differential part is driving. This capacitance is the sum of three terms: the differential capacitance N · Ci' − T · CT', where Ci' and CT' are the unit capacitances of the input part and the threshold part; the sensor capacitance S · Cs', where Cs' is the unit capacitance of the sensor part; and the output capacitance CL. The overall time required for activation is therefore proportional to (N · Ci' − T · CT' + S · Cs' + CL)/|NA − T|. The term (N · Ci' − T · CT' + CL)/|NA − T| is invariant to the sensor size, and the term S · Cs'/|NA − T| is proportional to S.
During the boosting time, the delay depends on the current provided by the sensor. This current is proportional to the sensor size. The capacitance to be charged is the same as in the activation time. (The voltage is different, but it does not depend on the sensor size S.) Hence, the boosting time is proportional to (N · Ci' − T · CT' + S · Cs' + CL)/S. The numerator is an approximation of the overall capacitance connected to the outputs O and OB. The boosting time consists of S · Cs'/S = Cs', which is invariant to the size of the sensor, and (N · Ci' − T · CT' + CL)/S, which is inversely proportional to the sensor size.
We conclude that the gate delay consists of three components T0, T1, and T2, defined below. Component T0 is invariant to S and is the sum of the invariant components of the activation time and of the boosting time, i.e., T0 = Cs' + (N · Ci' − T · CT' + CL)/|NA − T|.
Component T1 is proportional to the sensor size S and occurs during the activation time, i.e., T1 = Cs'/|NA − T|. Finally, component T2, which is inversely proportional to the sensor size, occurs during the boosting time and is equal to N · Ci' − T · CT' + CL. Concluding, the overall time TD is estimated as TD = T0 + T1 · S + T2/S.
By applying regression analysis on SPICE simulations, TD is rewritten as equation (3), in which a, b, c, and d are constants with values a = 1e−9, b = 1 for CMTLG and DCML and b = 0.1 for DCCML, c = 1e−11, and d = 3.86e−2. Equation (3) gives the gate delay for different sensor sizes for fixed values of N, T, NA, and CL.
The final step of the proposed method operates on (3) in order to derive the sensor size Sopt that gives the minimum gate delay. Sopt is derived by taking the first derivative of (3) with respect to S and equating it to zero in order to find the minimum value of TD.
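The derivative step can be illustrated on the generic three-component form described above, a delay made of a term invariant to S, a term proportional to S and a term inversely proportional to S. The sketch below is not the regression-fitted equation (3); the coefficient values in the usage lines are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def delay(s, t0, t1, t2):
    """Generic delay model: constant term + term proportional to S + term in 1/S."""
    return t0 + t1 * s + t2 / s


def optimum_sensor_size(t1, t2):
    """Solve d(delay)/dS = t1 - t2 / S**2 = 0 for S > 0, giving S = sqrt(t2 / t1)."""
    if t1 <= 0 or t2 <= 0:
        raise ValueError("both coefficients must be positive for a minimum to exist")
    return math.sqrt(t2 / t1)


if __name__ == "__main__":
    # Hypothetical coefficients in arbitrary units, for illustration only.
    t0, t1, t2 = 100.0, 2.0, 200.0
    s_opt = optimum_sensor_size(t1, t2)
    print(s_opt, delay(s_opt, t0, t1, t2))
    # Delay is higher on either side of the optimum.
    print(delay(s_opt - 1, t0, t1, t2), delay(s_opt + 1, t0, t1, t2))
```

Because the second derivative 2 · t2 / S**3 is positive for S > 0, the stationary point is a minimum, which is consistent with the corollaries that follow.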
The remainder of this section presents the corollaries obtained from (3).
Corollary 1: The delay TD decreases with an increase in S, reaches an optimum value for some consecutive values of S, and then increases as S increases. The actual values of S at the minimum depend on N and T.
Corollary 2: For a sensor size that is smaller than the optimum sensor size Sopt, the activation time TA is low and the boosting time TB is high. The activation time is low because the sensor presents less capacitance, and this small capacitance can be driven faster to develop an initial voltage difference. The back-to-back connected inverters that boost the initial voltage difference are small, however. Hence, the boosting time is high.
Corollary 3: For a sensor size that is larger than the optimum sensor size Sopt, the activation time TA is high and the boosting time TB is low. The activation time is high because the sensor presents a large capacitance and the output is slow to develop an initial voltage difference. Large back-to-back connected inverters boost the initial voltage difference quickly.
Corollary 4: TD decreases as S approaches Sopt and then increases as S grows larger than Sopt. The corollary is justified because the total delay TD of the TLG is the sum of TA and TB.
4.6 Microwind
The Microwind program allows the student to design and simulate an integrated circuit. The package itself contains a library of common logic and analog ICs to view and simulate. MICROWIND includes all the commands of a mask editor as well as original tools not previously gathered in a single module. Circuit simulation is accessed by pressing a single key. The electrical extraction of the circuit is performed automatically, and the analog simulator produces voltage and current curves immediately. A specific command displays the characteristics of PMOS and NMOS devices, where the size of the device and the process parameters can be changed very easily. Altering the MOS model parameters and then seeing the effects on the Vds and Ids curves constitutes a good interactive tutorial on devices. The Process Simulator shows the layout in a vertical perspective, as when fabrication has been completed. This feature is a significant aid to supplement the descriptions of fabrication found in most textbooks. The Logic Cell Compiler is a particularly sophisticated tool enabling the automatic design of a CMOS circuit corresponding to a logic description in VERILOG. The DSCH software, which is a user-friendly schematic editor and logic simulator presented in a companion manual, is used to generate this Verilog description. The cell is created in compliance with the environment, design rules and fabrication specifications. A set of CMOS processes ranging from 1.2 µm down to state-of-the-art 0.25 µm is available. The chapters of this manual have been summarized below.
A quick reference sheet, the complete list of files and the instructor guide are reported at the end of the present manual. The major updates of MICROWIND compared to the DOS version concern the support of advanced technologies, improvements in editing commands, the possibility to handle very complex designs and the VERILOG compilation from high-level description into layout. The new software, DSCH, for logic editing and simulation is now part of the package.
4.6 License
MICROWIND is licensed software. It is licensed in France by Language Informatique Inc, Toulouse, and by INSA in all other countries. The single license authorizes the use of one copy of the software and includes one copy of the documentation. The site license authorizes the use of ten copies of the software and includes one copy of the documentation. An authorized copy of the program is required for each computer running the program simultaneously. The user may not transfer, sell, or distribute the software. MICROWIND is recommended by EURO-PRACTICE and the American Society for Engineering Education (ASEE), and is supported by the National Committee for Micro-Electronics Education (CNFM). The MICROWIND display window is shown in Fig 4.5. It includes four main windows: the main menu, the layout display window, the icon menu and the layer palette. The cursor appears in the middle of the layout window and is controlled by using the mouse. The layout window features a grid that represents the current scale of the drawing, scaled in lambda (λ) units and in microns. The lambda unit is fixed to half of the minimum available lithography of the technology. The default technology is a 0.8 µm technology; consequently lambda is 0.4 µm.
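The relation between lambda and the technology's minimum feature size is simple arithmetic and can be written as a short helper. The function names are hypothetical and are not part of MICROWIND.

```python
def lambda_um(min_feature_um):
    """Lambda is half of the minimum lithography of the technology, in microns."""
    return min_feature_um / 2.0


def lambda_to_um(n_lambda, min_feature_um):
    """Convert a dimension expressed in lambda units to microns."""
    return n_lambda * lambda_um(min_feature_um)


if __name__ == "__main__":
    print(lambda_um(0.8))          # default 0.8 µm technology
    print(lambda_to_um(10, 0.8))   # a 10-lambda dimension in the same technology
```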
Fig 4.5 The MICROWIND window as it appears at the initialization stage
4.6 The MOS device
The MOS symbols are reported in Fig 4.6. The n-channel MOS is built using polysilicon as the gate material and N+ diffusion to build the source and drain. The p-channel MOS is built using polysilicon as the gate material and P+ diffusion to build the source and drain.
Fig 4.6 nMOS and pMOS symbols
4.7 Manual Design
By using the following procedure, you can create a manual design of the n-channel MOS. The default icon is the drawing icon shown above. It permits box editing. The display window is empty. The palette is located in the lower right corner of the screen. A red color indicates the current layer. Initially the selected layer in the palette is polysilicon. The first two steps are illustrated in Fig 4.7.
1. Fix the first corner of the box with the mouse.
2. While keeping the mouse button pressed, move the mouse to the opposite corner of the box.
3. Release the button.
This creates a box in the polysilicon layer as shown in Fig 4.8. The box width should not be less than 2 λ, which is the minimum width of the polysilicon box.
Fig 4.7 Creating a polysilicon box
Change the current layer to N+ diffusion by clicking the Diffusion N+ button on the palette. Make sure that the red layer is now the N+ diffusion. Draw an n-diffusion box at the bottom of the drawing as in Figure 3. N-diffusion boxes are represented in green. The intersection between diffusion and polysilicon creates the channel of the nMOS device.
Fig 4.8 Creating the N-channel MOS transistor
4.8 Process Simulation
Click on this icon to access process simulation. The cross-section is defined by clicking the mouse at the first point and releasing it at the second point. In the example below (Fig 4.9), three nodes appear in the cross-section of the n-channel MOS device: the gate (red), the left diffusion called source (green) and the right diffusion called drain (green), over a substrate (gray). The gate is isolated by a thin oxide called the gate oxide. Various oxidation steps have led to a thick oxide on top of the gate.
Fig 4.9 The cross-section of the nMOS devices
The physical properties of the source and of the drain are exactly the same. Theoretically, the source is the origin of the channel carriers. In the case of this nMOS device, the channel carriers are electrons. Therefore, the source is the diffusion area with the lowest voltage. The polysilicon gate floats over the channel and splits the diffusion into two zones, the source and the drain. The gate controls the current flow between the drain and the source, in both directions. A high voltage on the gate attracts electrons below the gate, creates an electron channel and enables current to flow. A low voltage disables the channel.
4.10 MOS Characteristics
Click on the MOS characteristics icon. The screen shown in Fig 4.10 appears. It represents the Id/Vd simulation of the nMOS device.
Fig 4.10 N-Channel MOS characteristics
The MOS size (width and length of the channel situated at the intersection of the polysilicon gate and the diffusion) has a strong influence on the value of the current. In Fig 4.10, the MOS width is 12.8 µm and the length is 1.2 µm. A high gate voltage (Vg = 5.0 V) corresponds to the highest Id/Vd curve. For Vg = 0, no current flows. The maximum current is obtained for Vg = 5.0 V, Vd = 5.0 V, with Vs = 0.0 V. The MOS parameters correspond to SPICE Level 3. The user can alter the value of the parameters, switch to Level 1, switch to pMOS, or add measurements to fit the simulation. Finally, the system can simulate devices with other sizes from the proposed list. Click on OK to return to the editor.
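To make the dependence of the current on W, L, and Vg concrete, the following is a minimal sketch of the textbook SPICE Level 1 (Shichman-Hodges) equations, without channel-length modulation or body effect. It is not the Level 3 model the tool uses, so it will not reproduce the curves of Fig 4.10. The transconductance parameter `kp` and the threshold `vt` are assumed illustrative values, not parameters taken from the report; only W = 12.8 µm and L = 1.2 µm come from the text.

```python
def level1_id(vgs: float, vds: float, w_um: float, l_um: float,
              kp: float = 50e-6, vt: float = 0.8) -> float:
    """Drain current in amperes from the simplified Level 1 model.

    kp (A/V^2) and vt (V) are assumed values for illustration only.
    """
    vov = vgs - vt                      # gate overdrive voltage
    if vov <= 0:
        return 0.0                      # cut-off: no channel, no current
    beta = kp * w_um / l_um             # current scales with W/L
    if vds < vov:
        # Linear (triode) region
        return beta * (vov * vds - 0.5 * vds ** 2)
    # Saturation region
    return 0.5 * beta * vov ** 2


W_UM, L_UM = 12.8, 1.2                  # device size quoted for Fig 4.10

# One Id/Vd curve per gate voltage, sampled every 0.5 V from 0 to 5 V.
vd_points = [0.5 * i for i in range(11)]
curves = {vg: [level1_id(vg, vd, W_UM, L_UM) for vd in vd_points]
          for vg in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0)}

for vg, ids in curves.items():
    print(f"Vg = {vg:.1f} V: Id at Vd = 5 V is {ids[-1] * 1e3:.3f} mA")
```

Under these assumptions the sketch shows the same qualitative behavior as described above: zero current at Vg = 0, the highest curve at Vg = 5.0 V, and a current that grows in proportion to W/L.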
4.11 Add Properties for Simulation
Properties must be added to the layout to activate the MOS device. The most convenient way to operate the MOS is to apply a clock to the gate, apply another clock to the source, and observe the drain. A summary of the available properties is given below.
VDD property
VSS property
Pulse property
Clock property
Node visible
1. Apply a clock to the drain. Click on the Clock icon, then click on the left diffusion. The Clock menu appears, as shown in Fig 4.11. Change the name to « drain » and click on OK. A default clock with a 3 ns period is generated. The Clock property is sent to the node and appears at the right-hand side of the desired location with the name « drain ».
Fig 4.11 The clock menu
2. Apply a clock to the gate. Click on the Clock icon, then click on the polysilicon gate. The Clock menu appears again. Change the name to « gate » and click on OK to apply a clock with a 6 ns period.
3. Watch the output. Click on the Visible icon, then click on the right diffusion. The window shown in Fig 4.12 appears. Click on OK. The Visible property is then sent to the node. The associated text « s1 » is in italic. The waveform of this node will appear at the next simulation.
Fig 4.12 The visible node menu
4.12 Save before Simulation
Click on File in the main menu. Move the cursor to Save as ... and click on it. A new window appears, in which you enter the design name. Type, for example, myMos, and press Enter. Then click on OK. After a confirmation question, the design is saved under that filename.
IMPORTANT: Always save BEFORE any simulation!
4.13.1 Analog Simulation
Click on Simulate in the main menu. The timing diagrams of the nMOS device appear, as shown in Fig 4.13.
Fig 4.13 Analog simulation of the MOS device
When the gate is at zero, no channel exists, so the node s1 is disconnected from the drain. When the gate is on, the source copies the drain. It can be observed that the nMOS device drives well at zero but poorly at the high voltage. The final value is 4.2 V, that is, VDD minus the threshold voltage. Click on More to perform more simulations. Click on Stop to return to the editor.
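The behavior described above can be summarized in an idealized steady-state sketch of the nMOS used as a pass device. It assumes VDD = 5.0 V (the maximum gate voltage quoted earlier) and an effective threshold of 0.8 V, which is inferred from the 4.2 V final value rather than stated in the report. The function name `s1_level` and the assumption that a disconnected node simply holds its previous value are illustrative simplifications.

```python
def s1_level(gate_high: bool, drain_v: float, previous_v: float = 0.0,
             vdd: float = 5.0, vt: float = 0.8) -> float:
    """Idealized steady-state voltage on node s1 of an nMOS pass device.

    vdd and vt are assumptions: vt = 0.8 V is inferred from 5.0 V - 4.2 V.
    """
    if not gate_high:
        # No channel: s1 is disconnected and (ideally) keeps its old value.
        return previous_v
    # Channel on: s1 follows the drain, but cannot rise above VDD - Vt,
    # because the channel turns off once Vgs drops to the threshold.
    return min(drain_v, vdd - vt)


# Gate on, drain low: the nMOS passes a clean zero.
print(s1_level(True, 0.0))
# Gate on, drain high: the output is limited to VDD - Vt.
print(s1_level(True, 5.0))
# Gate off: the node holds whatever value it had before.
print(s1_level(False, 0.0, previous_v=4.2))
```

This captures only the final levels. The actual waveform in Fig 4.13 also includes charging and discharging transients, which this sketch does not model.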
5. ADVANTAGES & APPLICATIONS
5.1 ADVANTAGES
Delay is low.
Energy consumption is low.
5.2 APPLICATIONS
Electronic systems in cars.
Digital electronic control of VCRs.
Transaction processing systems, such as ATMs.
Personal computers and Workstations.
Medical electronic systems.
6. SIMULATION RESULTS
6.1 CMTL
Fig 6.1 CMTL
6.2 Layout
Fig 6.2 Layout
6.3 Simulation Wave Form
Fig 6.3 Simulation Wave Form
6.4 DCCML
Fig 6.4 DCCML
6.5 Layout
Fig 6.5 Layout
6.6 Simulation Wave Form
Fig 6.6 Simulation Wave Form
7. CONCLUSION
An analytical method has been proposed to quickly identify the transistor size in the sensor component of a current mode implementation that ensures very low gate delay (very close to the minimum), independent of the current mode method used to implement the threshold logic function. A new current mode implementation method was also proposed that outperforms existing implementations in both gate delay and energy.
9. FUTURE SCOPE
A new implementation of a threshold gate based on a capacitive input, charge recycling differential sense amplifier latch is presented. Simulation results indicate that the proposed structure has very low power dissipation and high operating speed, as well as robustness under process, temperature, and supply voltage variations, and is therefore highly suitable as an element in digital integrated circuit design.
8. REFERENCES
[1] S. Bobba and I. N. Hajj, “Current-mode threshold logic gates,” in Proc. IEEE ICCD, Sep. 2000, pp. 235–240.
[2] T. Ogawa, T. Hirose, T. Asai, and Y. Amemiya, “Threshold-logic devices consisting of subthreshold CMOS circuits,” IEICE Trans. Fundam. Electron., Commun. Comput. Sci., vol. E92-A, no. 2, pp. 436–442, 2009.
[3] S. Muroga, Threshold Logic and Its Applications. New York, NY, USA: Wiley, 1971.
[4] W. Prost et al., “Manufacturability and robust design of nanoelectronic logic circuits based on resonant tunnelling diodes,” Int. J. Circuit Theory Appl., vol. 28, no. 6, pp. 537–552, Nov./Dec. 2000.
[5] S. Leshner, K. Berezowski, X. Yao, G. Chalivendra, S. Patel, and S. Vrudhula, “A low power, high performance threshold logic-based standard cell multiplier in 65 nm CMOS,” in Proc. IEEE Comput. Soc. Annu. Symp. VLSI, Lixouri, Greece, Jul. 2010, pp. 210–215.
[6] M. Sharad, D. Fan, and K. Roy. (2013). “Ultra-low energy, high-performance dynamic resistive threshold logic.” [Online]. Available: http://arxiv.org/abs/1308.4672
[7] P. Celinski, J. F. López, S. Al-Sarawi, and D. Abbott, “Low power, high speed, charge recycling CMOS threshold logic gate,” Electron. Lett., vol. 37, no. 17, pp. 1067–1069, Aug. 2001.
[8] S. Leshner and S. Vrudhula, “Threshold logic element having low leakage power and high performance,” WO Patent 2009 102 948, Aug. 20, 2009.
[9] T. Shibata and T. Ohmi, “A functional MOS transistor featuring gate-level weighted sum and threshold operations,” IEEE Trans. Electron Devices, vol. 39, no. 6, pp. 1444–1455, Jun. 1992.
[10] V. Beiu, J. M. Quintana, and M. J. Avedillo, “VLSI implementations of threshold logic - A comprehensive survey,” IEEE Trans. Neural Netw., vol. 14, no. 5, pp. 1217–1243, Sep. 2003.
[11] T. Gowda, S. Leshner, S. Vrudhula, and S. Kim, “Threshold logic gene regulatory networks,” in Proc. IEEE Int. Workshop GENSIPS, Jun. 2007, pp. 1–4.
[12] A. K. Palaniswamy and S. Tragoudas, “A scalable threshold logic synthesis method using ZBDDs,” in Proc. 22nd Great Lakes Symp. VLSI, 2012, pp. 307–310.
[13] C. B. Dara, T. Haniotakis, and S. Tragoudas, “Delay analysis for an N-input current mode threshold logic gate,” in Proc. IEEE Comput. Soc. Annu. Symp. VLSI (ISVLSI), Aug. 2012, pp. 344–349.
[14] A. K. Palaniswamy, T. Haniotakis, and S. Tragoudas, “ATPG for delay defects in current mode threshold logic circuits,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. PP, no. 99, pp. 1–1.
[15] A. Neutzling, J. M. Matos, A. Mishchenko, R. Ribas, and A. I. Reis, “Threshold logic synthesis based on cut pruning,” in Proc. ICCAD, Nov. 2015, pp. 494–499.