235 WHAMPOA - An Interdisciplinary Journal 51(2006) 235-251
Logic Gate Analysis with Feed Forward Neural Network
David C.T. Tai¹, Shu-hui Wu²
¹Dept. of Computer Information Science, ROC Military Academy
²Dept. of Mechanical Engineering, ROC Military Academy
Abstract

Logic gates are the main components used to design and assemble an integrated circuit part. A multifunction electronic part is often composed of a great many basic gates and derived gates, and analyzing such a part is a very tedious job. With the advent of knowledge and techniques in neural networks, this complexity can be reduced to a remarkable extent. The purpose of this study is to show how to establish a feed forward neural network to analyze a complex logic gate. First, the mathematical background of a neural network model is reviewed. Then, the neural network toolbox in MatLAB® is described. Finally, a simple hybrid logic gate is used as an illustrative example. To generate the truth-values for the gate, LabVIEW™ is employed to produce the dataset for training, validating and testing the neural network.

Keywords: Feed Forward Neural Network, Logic Gate, MatLAB®, LabVIEW™
I. Introduction
Logic gates process signals which represent true or false. The basic gates are AND, NAND, OR and XOR gates [4]. An Integrated Circuit (IC) is composed of many such basic gates and some derived gates. Figure 1 shows the traditional symbols for the basic gates.

Figure 1 Traditional Symbols for Basic Gates (AND, NAND, OR, NOR)
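The behaviour of these basic gates can be sketched directly in code. The following minimal Python snippet is illustrative only (the function names and the 0/1 signal encoding are our choices); it reproduces the gates shown in Figure 1:

```python
# Basic logic gates from Figure 1, with signals encoded as
# 0 (false) and 1 (true).
def AND(a, b):
    return a & b

def NAND(a, b):
    return 1 - (a & b)

def OR(a, b):
    return a | b

def NOR(a, b):
    return 1 - (a | b)

# Print the truth table for all four gates.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", AND(a, b), NAND(a, b), OR(a, b), NOR(a, b))
```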
In digital system design, analyzing the input and output status of a combined gate is a very important yet tedious job. With the advent of knowledge and techniques in neural networks, a feed forward neural network can be employed to help the analysis. In this paper, the mathematical background for neural networks is reviewed first. Then, the Neural Network Toolbox in MatLAB® is used to develop the neural network that assists in the analysis of a combined logic gate. Finally, a simple hybrid logic gate is constructed with LabVIEW™, which generates the dataset for training, validating and testing the established feed forward neural network [3].
II. Mathematical Background for Feed Forward Neural Network

1. Introduction to Neural Networks
Neural networks are computer systems, modelled on biological neurons, composed of nonlinear computational elements operating in parallel. The brain is a collection of about 10 billion interconnected neurons. Figure 2 shows the scheme of a neuron.

Figure 2 Organization of a Neuron¹
¹ Source from the web site, http://vv.carleton.ca/~neil/neural/neuron-a.html

Each neuron is a cell that uses biochemical reactions to receive, process and transmit information. A neuron's dendrite tree is connected to a thousand neighboring neurons. As one of those neurons fires, a positive or negative charge is received by one of the dendrites. The strengths of all the received charges are added together through the processes of spatial and temporal summation. Spatial summation occurs when several weak signals are converted into a single large one, while temporal summation converts a rapid series of weak pulses from one source into one large signal. The aggregate input is then passed to the soma. The soma and the enclosed nucleus do not play a significant role in the processing of incoming and outgoing data; their primary function is to perform the continuous maintenance required to keep the neuron functioning. The part of the soma that does concern itself with the signal is the axon hillock. If the aggregate input is greater than the axon hillock's threshold value, then the neuron fires, and an output signal is transmitted down the axon (Fraser, 2000)².
² Fraser, N., 1998, Training Neural Networks, Web site: http://vv.carleton.ca/~neil/neural/neuron-d.htm

Neural networks consist of nodes (neurons) and synaptic connections that connect these
nodes. Each connection is assigned a relative weight, also called connection strength or synaptic strength. The output at each node depends on the specified threshold, also called bias or offset, and on a transfer (activation) function. In mathematical terms, a neural network model can be represented by the following parameters:

• A state variable s_i associated with each node i.
• A weight w_ij associated with the connection between a node i and a node j that node i is connected to, such that signals flow from j to i.
• A bias v_i associated with each node i.
• A transfer function f_i(s_j, w_ij, v_i) for each node i, which determines the state of the node as a function of its bias and the summed output from all the nodes connected to node i.

The state variable s_i of node i is given by

s_i = f_i( Σ_{j=1}^{J} w_ij s_j − v_i )    (1)

Note that transfer functions and bias terms are absent for the input nodes.

The sigmoid function is the most commonly used transfer function (Russell and Norvig, 1995). It is given by

f(α) = 1 / (1 + e^{−2βα})    (2)

where β determines the steepness. Generally, β is set to unity, which results in a sigmoid transfer function of the form

f(α) = 1 / (1 + e^{−α})    (3)

A graphical representation of the sigmoid function is shown in Figure 3.
Figure 3 Plot of the Sigmoid Function
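As a quick numerical check of Eqs. (2) and (3), the sigmoid can be evaluated directly. The short Python sketch below (standard library only; the variable names are ours, not the paper's) also verifies the derivative identity of Eq. (4) and the tanh relationship of Eq. (5):

```python
import math

def sigmoid(alpha, beta=1.0):
    # Eq. (2); beta controls the steepness.
    return 1.0 / (1.0 + math.exp(-2.0 * beta * alpha))

def logistic(alpha):
    # Eq. (3): the standard logistic sigmoid (beta = 1, input halved).
    return 1.0 / (1.0 + math.exp(-alpha))

# Derivative identity used in Eq. (4): f'(h) = f(h) * (1 - f(h)),
# checked against a central finite difference.
h, eps = 0.7, 1e-6
numeric = (logistic(h + eps) - logistic(h - eps)) / (2 * eps)
analytic = logistic(h) * (1 - logistic(h))
print(abs(numeric - analytic) < 1e-8)   # True

# Relation to tanh (Eq. 5): tanh(a) = 2 * logistic(2a) - 1.
a = 1.3
print(abs(math.tanh(a) - (2 * logistic(2 * a) - 1)) < 1e-12)  # True
```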
These functions are monotonic and have a finite derivative. If y_i = f(h_i), where h_i is the summed input to node i, then

f'(h_i) = y_i (1 − y_i)    (4)

Another popular activation function is the hyperbolic tangent function, tanh, which is given by

f(α) = (e^α − e^{−α}) / (e^α + e^{−α})    (5)

This activation function is equivalent to the sigmoid function if we apply a linear transformation α̃ = α/2 to the input and a linear transformation f̃ = 2f − 1 to the output.

2. Three-layered Feed-forward Network

Multi-layered networks are those with input, output and inner (or hidden) neuron layers that intervene between the input and output layers. The input-output relation defines a mapping, and the neural network provides a representation of this mapping. The number of hidden layers and the number of nodes in each hidden layer depend on the complexity of the problem and vary, to a large extent, from problem to problem. Increasing the number of hidden
layers increases the complexity of a neural network and may or may not result in better network performance, but it will lead to a longer training time. Based on previous experience, one or two hidden layers provide good performance while not requiring extensive training time [2]. The bias or threshold of a node in the hidden layer can be treated as one additional input node, generally with the value −1; the bias of a node in the output layer has the same property as that of a node in the hidden layer. The weighted sum over the nodes in the input layer, including the bias, is transformed into the output value of each node in the hidden layer, and the weighted sum over the nodes in the hidden layer, including the bias, is transformed into the output value of each node in the output layer.

A three-layer feed-forward network consists of an input layer, an output layer and a hidden layer, resulting in two layers of weights: one connects the input to the hidden layer and the other connects the hidden layer to the output layer. Figure 4 illustrates a three-layered feed-forward implementation architecture for evaluating the training breaks, training cycle numbers and training performance, with the initial performance and reliability parameters, for this study.

[Figure 4 node labels — inputs: Threshold (H), Training Breaks, Initial Performance, Cycle Numbers; outputs: Reliability, Performance]
Figure 4 A Three-layered Feed-forward Network

3. The Back-Propagation Algorithm

Learning for a neural network is accomplished through an adaptive procedure, known as a learning rule or algorithm. Learning algorithms indicate how the weights connecting nodes between two layers should be incrementally adapted to improve a predefined performance measure. Learning can be viewed as a search in multidimensional weight space for a solution which gradually optimizes a predefined objective function [1]. Back-propagation is the most extensively used neural network training method. The back-propagation algorithm is an iterative gradient-based algorithm developed to introduce synaptic corrections (or weight adjustments) by minimizing the sum of squared errors (the objective function). Figure 5 shows the architecture of the three-layer network used to derive the back-propagation method.
The node indexes i, j and k represent nodes in the input, hidden and output layers, respectively. w_ij is the weight or synaptic strength connecting node i in the input layer to node j in the hidden layer, while w_jk is the weight or synaptic strength from node j in the hidden layer to node k in the output layer. The bias or threshold for hidden-layer nodes is represented as node 0 in the input layer, while the bias for output-layer nodes is represented as node 0 in the hidden layer. The numbers of nodes in the input, hidden and output layers are l, m and n, respectively.

Figure 5 Architecture of Three-layered Network
Then, the node output values for the hidden and output layers, o_j and o_k, are represented as follows:

o_j = 1 / (1 + e^{−net_j})    (6)

net_j = Σ_{i=0}^{l} w_ij o_i    (7)

o_k = 1 / (1 + e^{−net_k})    (8)

net_k = Σ_{j=0}^{m} w_jk o_j    (9)

where the sums run from 0 to include the bias nodes. The back-propagation method is derived by minimizing the error on the output units over all the patterns, using the following formula for this error E:

E = (1/2) Σ_p Σ_k (t_pk − o_pk)²    (10)

where p is the subscript for the pattern and k is the subscript for the output units. t_pk is the target value of output unit k for pattern p, and o_pk is the actual output value of output unit k for pattern p. This is by far the most commonly used error function; however, from time to time people have tried other error functions. Notice that E is the sum over all the patterns; however, it is assumed that by minimizing the error for each pattern individually, E will also be minimized. Therefore the subscript p will
be dropped from here on, and the weight change formulas for just one pattern will be derived. The first part of the problem is to find how E changes as the weight w_jk, leading into an output unit, changes. The second part of the problem is to find out how E changes as the weight w_ij, leading into a hidden-layer unit j, changes. By the chain rule, the partial derivative of E with respect to one weight, w_jk, is

∂E/∂w_jk = (∂E/∂o_k)(∂o_k/∂net_k)(∂net_k/∂w_jk) = −(t_k − o_k) o_k (1 − o_k) o_j    (11)

The plot of an error function looks something like the curve in Figure 6. This partial derivative is the slope of the error curve for one weight, w_jk. If the slope is positive, we need to decrease the weight by a small amount to lower the error, and if the slope is negative, we need to increase the weight by a small amount³.
³ McAuley, D., 1997, The BackPropagation Network: Learning by Examples, URL: http://www2.psy.uq.edu.au/~brainwav/Manual/BackProp.html

Figure 6 Plot of an Error Function
Hence, the weight adjustment can be given by

w_jk(t+1) = w_jk(t) + (−η)(−(t_k − o_k) o_k (1 − o_k) o_j)    (12)

where w_jk(t+1) and w_jk(t) represent the weight after and before the adjustment, respectively. A concise form is given by

Δw_jk = η δ_k o_j    (13)

where δ_k = (t_k − o_k) o_k (1 − o_k).

Next, the relation between the change in E and the weight w_ij will be derived. In this case, a change to the weight w_ij changes o_j, and this changes the input into each unit k in the output layer. The change in E with a change in w_ij is therefore the sum of the changes to each of the output units. Again applying the chain rule, the result is obtained as
∂E/∂w_ij = Σ_k (∂E/∂o_k)(∂o_k/∂net_k)(∂net_k/∂o_j)(∂o_j/∂net_j)(∂net_j/∂w_ij)
         = Σ_k −(t_k − o_k) o_k (1 − o_k) w_jk o_j (1 − o_j) o_i
         = Σ_k −δ_k w_jk o_j (1 − o_j) o_i
         = −o_i o_j (1 − o_j) Σ_k δ_k w_jk    (14)

The weight change can then be written as

Δw_ij = η δ_j o_i    (15)

where δ_j = o_j (1 − o_j) Σ_k δ_k w_jk. Continuing this procedure, a general formula of the weight change for networks with more than three layers can be obtained.
In order to use the mathematical analysis above, a training set (or training examples) containing several pairs of input and output values (training patterns) has to be prepared. Using the known input/output values, the training rule of the back-propagation algorithm can be stated in pseudo code as in Figure 7.

Repeat
    For each training pattern
        Train on that pattern
    End of for loop
Until the error comes to an acceptable level

Figure 7 Pseudo Codes for Back-propagation Algorithm
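To make the update rules of Eqs. (6)-(15) concrete, here is a minimal NumPy sketch of per-pattern back-propagation for a three-layer network. The two-input XOR pattern set, the layer sizes, the learning rate and the random seed are illustrative choices of ours, not values from the paper; following the paper's convention, the bias is treated as an extra "node 0" with constant output −1.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Logistic transfer function, as in Eqs. (6) and (8).
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative training patterns: two-input XOR.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

l, m, n = 2, 3, 1                       # input, hidden, output node counts
eta = 0.5                               # learning rate
W1 = rng.uniform(-1, 1, (l + 1, m))     # w_ij (last row: bias "node 0")
W2 = rng.uniform(-1, 1, (m + 1, n))     # w_jk (last row: bias "node 0")

def predict(x):
    o_j = f(np.append(x, -1.0) @ W1)     # Eqs. (6)-(7), bias node = -1
    return f(np.append(o_j, -1.0) @ W2)  # Eqs. (8)-(9)

def error():
    # Sum-of-squares error over all patterns, Eq. (10).
    return 0.5 * sum(float(((t - predict(x)) ** 2).sum()) for x, t in zip(X, T))

e_before = error()
for epoch in range(20000):
    for x, t in zip(X, T):
        o_i = np.append(x, -1.0)
        o_j = f(o_i @ W1)
        o_jb = np.append(o_j, -1.0)
        o_k = f(o_jb @ W2)
        delta_k = (t - o_k) * o_k * (1 - o_k)           # delta of Eq. (13)
        delta_j = o_j * (1 - o_j) * (W2[:m] @ delta_k)  # delta of Eq. (15)
        W2 += eta * np.outer(o_jb, delta_k)             # Δw_jk = η δ_k o_j
        W1 += eta * np.outer(o_i, delta_j)              # Δw_ij = η δ_j o_i
e_after = error()
print(e_before, "->", e_after)
```

With enough epochs the error typically falls low enough that rounding the outputs reproduces the XOR table, though convergence from a given random start is not guaranteed.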
The main procedure for the back-propagation algorithm may be expanded into the following steps:
1. Randomly generate the weights from the input to the hidden layer and from the hidden to the output layer, including the bias weights of the nodes in the hidden and output layers.
2. For each training pattern, calculate the node output values and delta values for the hidden and output layers.
3. Update the weights leading into the hidden layer and into the output layer.
4. Continue steps 2 and 3 until all patterns have been trained.
5. If the error is small enough, training stops; otherwise training goes on, that is, return to steps 2, 3 and 4.

A cycle from step 2 to step 4 is typically called an epoch (or iteration). A lower acceptable error level will generally require a larger number of epochs.

III. Neural Network ToolBox in MatLAB®
The Neural Network ToolBox, an accessory to MatLAB®, is widely applied in industry and business, including aerospace, automotive, banking, electronics and entertainment. The toolbox module needs to be run in the MatLAB® environment, and it can be used for anything from a single neuron to multilayer networks. Three popular transfer functions are provided: the Hard-Limit Transfer Function, the Linear Transfer Function and the Log-Sigmoid Transfer Function. Figure 8 describes the three functions.

Figure 8 Three Popular Transfer Functions Offered

The first step in training a feed forward network is to create the network object. The toolbox offers the function newff to create a feed forward network. It requires four inputs and returns the network object. The usage of this function is given as:
net=newff([-1 2; 0 5],[3,1],{'tansig','purelin'},'traingd');

The first input is an R-by-2 matrix of minimum and maximum values for each of the R elements of the input vector. The second input is an array containing the sizes of each layer. The third input is a cell array containing the names of the transfer functions to be used in each layer. The final input contains the name of the training function to be used.

There are seven training parameters associated with traingd:
• epochs
• show
• goal
• time
• min_grad
• max_fail
• lr
There are lots of training functions provided with the toolbox; in this study, Batch Gradient Descent (traingd) is used. The newff command creates the network object and also initializes the weights and biases of the network, so the network is ready for training. There are times when you might want to reinitialize the weights, or to perform a custom initialization. The following command can be used to initialize or reinitialize a net:

net = init(net);

The learning rate lr is multiplied by the negative of the gradient to determine the changes to the weights and biases. The larger the learning rate, the bigger the step. If the learning rate is made too large, the algorithm becomes unstable; if it is set too small, the algorithm takes a long time to converge. The training status is displayed every show (here 50) iterations of the algorithm. The other parameters determine when the training stops: training stops if the number of iterations exceeds epochs, if the performance function drops below goal, if the magnitude of the gradient is less than min_grad, or if the training time is longer than time seconds. The max_fail parameter is associated with the early stopping technique.

The following code creates a training set of inputs p and targets t. For batch training, all the input vectors are placed in one matrix.

p = [-1 -1 2 2;0 5 0 5];
t = [-1 -1 1 1];

Create the feedforward network. Here the function minmax is used to determine the range of the inputs to be used in creating the network.

net=newff(minmax(p),[3,1],{'tansig','purelin'},'traingd');

Some of the default training parameters can be modified as below:

net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-5;

If you want to use the default training parameters, the preceding commands are not necessary.

In addition to traingd, there is another batch algorithm for feed forward networks that often provides faster convergence: traingdm, steepest descent with momentum. Momentum allows a network to respond not only to the local gradient, but also to recent trends in the error surface. Acting like a lowpass filter, momentum allows the network to ignore small features in the error surface. Without momentum a network can get stuck in a shallow local minimum; with momentum it can slide through such a minimum. A momentum constant mc, which can be any number between 0 and 1, mediates the magnitude of the effect that the last weight change is allowed to have. When the momentum constant is 0, a weight change is based solely on the gradient; when it is 1, the new weight change is set equal to the last weight change and the gradient is simply ignored.

Based on the functions offered, an example is provided in the next section to demonstrate the details of developing a feed forward neural network.

IV. Illustrated Example

In this section, a combined logic gate is constructed, and a feed forward neural network is developed to analyze its input-output truth-values. National Instruments LabVIEW™ [6] is employed to define the logic gate. Figure 7 depicts the front panel for interaction, and the graphical programming block diagram is shown in Figure 8. As the A, B and C buttons are pressed, their status changes; that is, the truth-value is toggled from 1 to 0 or vice versa. Each status brings a different byte-value for button Q. With this system, eight patterns of truth-values can be used as the dataset for training, validating and testing the feed forward neural network. Table 1 is the truth table generated by the system.
Figure 7 Front Panel of Logic Gate in LabVIEW™
Figure 8 Block Diagram in LabVIEW™
Table 1 Truth Table for the Combined Logic Gate

A B C | Q
0 0 0 | 1
0 1 0 | 0
0 0 1 | 1
1 0 0 | 0
1 1 0 | 0
0 1 1 | 1
1 0 1 | 0
1 1 1 | 1

There are generally six steps to develop a neural network with MatLAB® [5]:
1. Prepare the dataset.
2. Create the network object.
3. Initialize the network.
4. Set the network training parameters.
5. Train the network.
6. Simulate the response with test data.

The script commands for the network, with the functions provided by the toolbox, are given in Figure 9. In the program, there are three neurons in the hidden layer, and one neuron composes the output layer. In this model, log-sigmoid transfer functions are used by the neurons in both layers, and the batch gradient descent with momentum training style is employed. The performance goal (mean square error) is set to 0.001. In Figure 10, the training session is displayed every 50 iterations; the training came to a stop when it met the performance goal.
Figure 9 MatLAB® Script File

Figure 11 depicts the relationship between the performance goal and epochs. In this study, since the size of the dataset is small, validation is not conducted. For a larger dataset, 20% of the total data can be used for validation to check whether the model is over-trained, and another 20% of the data can be used for testing the model; hence, only 60% of the data would be used to train the neural network. For more precision, increasing the training epochs and setting the performance goal nearly to zero can yield a good result. However, this consumes much computing resource as well as time.

Figure 10 Part Result of Training Session

Figure 11 Relationships between Performance Goal and Epochs
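For readers without the MatLAB® toolbox, the illustrated example can be approximated in Python/NumPy. The sketch below is our reconstruction, not the paper's Figure 9 script: it uses the Table 1 patterns, three log-sigmoid hidden neurons, one log-sigmoid output neuron, batch gradient descent with a momentum term (one common formulation; traingdm's exact update may differ), and the 0.001 mean-square-error goal. The learning rate, momentum constant and seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def logsig(x):
    # Log-sigmoid transfer function, used in both layers.
    return 1.0 / (1.0 + np.exp(-x))

# Table 1: inputs A, B, C and target Q.
X = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0],
              [1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
T = np.array([[1], [0], [1], [0], [0], [1], [0], [1]], dtype=float)

Xb = np.c_[X, -np.ones(len(X))]          # bias node fixed at -1
W1 = rng.uniform(-0.5, 0.5, (4, 3))      # 3 inputs + bias -> 3 hidden
W2 = rng.uniform(-0.5, 0.5, (4, 1))      # 3 hidden + bias -> 1 output
dW1 = np.zeros_like(W1)
dW2 = np.zeros_like(W2)
lr, mc, goal = 0.1, 0.9, 1e-3            # illustrative settings

def forward():
    H = logsig(Xb @ W1)
    Hb = np.c_[H, -np.ones(len(H))]
    return H, Hb, logsig(Hb @ W2)

H, Hb, O = forward()
mse0 = float(np.mean((T - O) ** 2))      # error before training

mse = mse0
for epoch in range(50000):
    H, Hb, O = forward()
    mse = float(np.mean((T - O) ** 2))
    if mse < goal:
        break
    d_out = (T - O) * O * (1 - O)                # output-layer deltas
    d_hid = H * (1 - H) * (d_out @ W2[:3].T)     # hidden-layer deltas
    dW2 = mc * dW2 + lr * (Hb.T @ d_out)         # momentum-smoothed steps
    dW1 = mc * dW1 + lr * (Xb.T @ d_hid)
    W2 += dW2
    W1 += dW1

print("epochs used:", epoch, "final mse:", round(mse, 5))
print("outputs:", np.round(forward()[2].ravel(), 2))
```

Whether the 0.001 goal is reached within the epoch budget depends on the random start and on the learning rate and momentum values; as noted above, a tighter goal costs more epochs and computing time.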
V. Conclusion

In this study, the mathematical background for feed forward neural networks is reviewed. National Instruments LabVIEW™ is used to construct a combined logic gate and to generate a dataset for training the neural network. The Neural Network Toolbox in MatLAB® is employed to establish the multilayer neural network model. The log-sigmoid transfer function is used by the neurons in each layer, and the Batch Gradient Descent with Momentum method is used to train the network. The dataset in this research is small, so it is used only for training the net. To construct a more precise model, many more epochs are needed and the mean square error should be nearly equal to zero; however, this consumes considerable time and computing resources. Finding more data for training, validating and testing, together with searching for optimal network training parameters such as the learning rate and momentum, will be an interesting topic for future study.

References

[1] Huang, C. L., et al., "The Construction of Production Performance Prediction System for Semiconductor Manufacturing with Artificial Neural Network", International Journal of Production Research, Vol. 37, No. 6, 1387-1402 (1999).
[2] Hassoun, M. H., Fundamentals of Artificial Neural Networks, The MIT Press, Cambridge, Massachusetts; Proceedings of the 5th Annual International Conference on Industrial Engineering-Theory, Applications and Practice, Hsinchu, Taiwan (1995).
[3] Kehtarnavaz, N. and Kim, N., Digital Signal Processing System-Level Design Using LabVIEW, Elsevier, New York (2005).
[4] Marek, J. P., Janos, L. and Koster, K., "Digital Fuzzy Logic Controller: Design and Implementation", IEEE Transactions on Fuzzy Systems, Vol. 4, No. 4, 439-459 (1996).
[5] Martin, T. H., Howard, B. D. and Mark, H. B., Neural Network Design, University of Colorado Bookstore (2006).
[6] Neural Network User Guide, available from http://www.mathworks.com/access/helpdesk/help/toolbox/nnet/ (2006); LabVIEW Fundamentals, available from http://sine.ni.com/manuals/ (2006).
Abstract (Chinese version, translated): Logic gates are important components in the design and fabrication of integrated circuits. A multifunction digital electronic part is composed of many basic logic gates, and analyzing such a part is a time-consuming and tedious task. With the maturing of neural network technology and knowledge, applying it to the analysis of logic gates simplifies the work to a considerable extent. The main purpose of this study is to construct a neural network for the analysis of complex logic gates. The paper first reviews the mathematical background of neural networks; a neural network object is then constructed with the neural network toolbox functions provided by MatLAB; to train the network, a combined logic gate is designed with National Instruments LabVIEW and a set of training data is generated.
Keywords: Feed Forward Neural Network, Logic Gate, MatLAB®, LabVIEW™