235 WHAMPOA - An Interdisciplinary Journal 51(2006) 235-251
Logic Gate Analysis with Feed Forward Neural Network
David C.T. Tai¹, Shu-hui Wu²
¹Dept. of Computer Information Science, ROC Military Academy
²Dept. of Mechanical Engineering, ROC Military Academy
Abstract

Logic gates are the main components used to design and assemble an integrated circuit part. A multifunction electronic part is often composed of a great many basic gates and derived gates, and analyzing such a part is a very tedious job. With the advent of knowledge and techniques in neural networks, this complexity can be reduced to a remarkable extent. The purpose of this study is to show how to establish a feed forward neural network to analyze a complex logic gate. First, the mathematical background of a neural network model is reviewed. Then, the neural network toolbox in MatLAB® is described. Finally, a simple hybrid logic gate is used as an illustrative example. To generate the truth-values for the gate, LabVIEW™ is employed to produce the dataset for training, validating and testing the neural network.

Keywords: Feed Forward Neural Network, Logic Gate, MatLAB®, LabVIEW™
I. Introduction
Logic gates process signals which represent true or false. The basic gates are AND, NAND, OR and XOR gates [4]. An Integrated Circuit (IC) is composed of many such basic gates and some derived gates. Figure 1 shows the traditional symbols for the basic gates.

Figure 1 Traditional Symbols for Basic Gates (AND, NAND, OR, NOR)
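The behaviour of these basic gates can be sketched directly in code. The following minimal Python snippet is illustrative only (the function names and the 0/1 signal encoding are our choices); it reproduces the gates shown in Figure 1:

```python
# Basic logic gates from Figure 1, with signals encoded as
# 0 (false) and 1 (true).
def AND(a, b):
    return a & b

def NAND(a, b):
    return 1 - (a & b)

def OR(a, b):
    return a | b

def NOR(a, b):
    return 1 - (a | b)

# Print the truth table for all four gates.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", AND(a, b), NAND(a, b), OR(a, b), NOR(a, b))
```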
In digital system design, analyzing the input and output status of a combined gate is a very important yet tedious job. With the advent of knowledge and techniques in neural networks, a feed forward neural network can be employed to help the analysis. In this paper, the mathematical background for neural networks is reviewed first. Then, the Neural Network Toolbox in MatLAB® is used to develop the neural network that assists in the analysis of a combined logic gate. Finally, a simple hybrid logic gate is constructed with LabVIEW™, which generates the dataset for training, validating and testing the established feed forward neural network [3].
II. Mathematical Background for Feed Forward Neural Network

1. Introduction to Neural Networks
Neural networks are computer systems, modelled on biological neurons, composed of nonlinear computational elements operating in parallel. The brain is a collection of about 10 billion interconnected neurons. Figure 2 shows the scheme of a neuron.

Figure 2 Organization of a Neuron¹
¹ Source from the web site, http://vv.carleton.ca/~neil/neural/neuron-a.html

Each neuron is a cell that uses biochemical reactions to receive, process and transmit information. A neuron's dendrite tree is connected to a thousand neighboring neurons. As one of those neurons fires, a positive or negative charge is received by one of the dendrites. The strengths of all the received charges are added together through the processes of spatial and temporal summation. Spatial summation occurs when several weak signals are converted into a single large one, while temporal summation converts a rapid series of weak pulses from one source into one large signal. The aggregate input is then passed to the soma. The soma and the enclosed nucleus do not play a significant role in the processing of incoming and outgoing data; their primary function is to perform the continuous maintenance required to keep the neuron functioning. The part of the soma that does concern itself with the signal is the axon hillock. If the aggregate input is greater than the axon hillock's threshold value, then the neuron fires, and an output signal is transmitted down the axon (Fraser, 2000)².
² Fraser, N., 1998, Training Neural Networks, Web site: http://vv.carleton.ca/~neil/neural/neuron-d.htm

Neural networks consist of nodes (neurons) and synaptic connections that connect these
nodes. Each connection is assigned a relative weight, also called connection strength or synaptic strength. The output at each node depends on the specified threshold, also called bias or offset, and on a transfer (activation) function. In mathematical terms, a neural network model can be represented by the following parameters:

• A state variable s_i associated with each node i.
• A weight w_ij associated with the connection between a node i and a node j that node i is connected to, such that signals flow from j to i.
• A bias v_i associated with each node i.
• A transfer function f_i(s_j, w_ij, v_i) for each node i, which determines the state of the node as a function of its bias and the summed output from all the nodes connected to node i.

The state variable s_i of node i is given by

s_i = f_i( Σ_{j=1}^{J} w_ij s_j − v_i )    (1)

Note that transfer functions and bias terms are absent for the input nodes.

The sigmoid function is the most commonly used transfer function (Russell and Norvig, 1995). It is given by

f(α) = 1 / (1 + e^{−2βα})    (2)

where β determines the steepness. Generally, β is set to unity, which results in a sigmoid transfer function of the form

f(α) = 1 / (1 + e^{−α})    (3)

A graphical representation of the sigmoid function is shown in Figure 3.
Figure 3 Plot of the Sigmoid Function
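As a quick numerical check of Eqs. (2) and (3), the sigmoid can be evaluated directly. The short Python sketch below (standard library only; the variable names are ours, not the paper's) also verifies the derivative identity of Eq. (4) and the tanh relationship of Eq. (5):

```python
import math

def sigmoid(alpha, beta=1.0):
    # Eq. (2); beta controls the steepness.
    return 1.0 / (1.0 + math.exp(-2.0 * beta * alpha))

def logistic(alpha):
    # Eq. (3): the standard logistic sigmoid (beta = 1, input halved).
    return 1.0 / (1.0 + math.exp(-alpha))

# Derivative identity used in Eq. (4): f'(h) = f(h) * (1 - f(h)),
# checked against a central finite difference.
h, eps = 0.7, 1e-6
numeric = (logistic(h + eps) - logistic(h - eps)) / (2 * eps)
analytic = logistic(h) * (1 - logistic(h))
print(abs(numeric - analytic) < 1e-8)   # True

# Relation to tanh (Eq. 5): tanh(a) = 2 * logistic(2a) - 1.
a = 1.3
print(abs(math.tanh(a) - (2 * logistic(2 * a) - 1)) < 1e-12)  # True
```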
These functions are monotonic and have a finite derivative. If y_i = f(h_i), where h_i is the summed input to node i, then

f'(h_i) = y_i (1 − y_i)    (4)

Another popular activation function is the hyperbolic tangent function, tanh, which is given by

f(α) = (e^α − e^{−α}) / (e^α + e^{−α})    (5)

This activation function is equivalent to the sigmoid function if we apply a linear transformation α̃ = α/2 to the input and a linear transformation f̃ = 2f − 1 to the output.

2. Three-layered Feed-forward Network

Multi-layered networks are those with input, output and inner (or hidden) neuron layers that intervene between the input and output layers. The input-output relation defines a mapping, and the neural network provides a representation of this mapping. The number of hidden layers and the number of nodes in each hidden layer depend on the complexity of the problem and vary, to a large extent, from problem to problem. Increasing the number of hidden
layers increases the complexity of a neural network and may or may not result in better network performance, but it will lead to a longer training time. Based on previous experience, one or two hidden layers provide good performance while not requiring extensive training time [2]. The bias or threshold of a node in the hidden layer can be treated as one additional input node, generally with the value −1; the bias of a node in the output layer has the same property as that of a node in the hidden layer. The weighted sum over the nodes in the input layer, including the bias, is transformed into the output value of each node in the hidden layer, and the weighted sum over the nodes in the hidden layer, including the bias, is transformed into the output value of each node in the output layer.

A three-layer feed-forward network consists of an input layer, an output layer and a hidden layer, resulting in two layers of weights: one connects the input to the hidden layer and the other connects the hidden layer to the output layer. Figure 4 illustrates a three-layered feed-forward implementation architecture for evaluating the training breaks, training cycle numbers and training performance, with the initial performance and reliability parameters, for this study.

[Figure 4 node labels — inputs: Threshold (H), Training Breaks, Initial Performance, Cycle Numbers; outputs: Reliability, Performance]
Figure 4 A Three-layered Feed-forward Network

3. The Back-Propagation Algorithm

Learning for a neural network is accomplished through an adaptive procedure, known as a learning rule or algorithm. Learning algorithms indicate how the weights connecting nodes between two layers should be incrementally adapted to improve a predefined performance measure. Learning can be viewed as a search in multidimensional weight space for a solution which gradually optimizes a predefined objective function [1]. Back-propagation is the most extensively used neural network training method. The back-propagation algorithm is an iterative gradient-based algorithm developed to introduce synaptic corrections (or weight adjustments) by minimizing the sum of squared errors (the objective function). Figure 5 shows the architecture of the three-layer network used to derive the back-propagation method.
The node indexes i, j and k represent nodes in the input, hidden and output layers, respectively. w_ij is the weight or synaptic strength connecting node i in the input layer to node j in the hidden layer, while w_jk is the weight or synaptic strength from node j in the hidden layer to node k in the output layer. The bias or threshold for hidden-layer nodes is represented as node 0 in the input layer, while the bias for output-layer nodes is represented as node 0 in the hidden layer. The numbers of nodes in the input, hidden and output layers are l, m and n, respectively.

Figure 5 Architecture of Three-layered Network
Then, the node output values for the hidden and output layers, o_j and o_k, are represented as follows:

o_j = 1 / (1 + e^{−net_j})    (6)

net_j = Σ_{i=0}^{l} w_ij o_i    (7)

o_k = 1 / (1 + e^{−net_k})    (8)

net_k = Σ_{j=0}^{m} w_jk o_j    (9)

where the sums run from 0 to include the bias nodes. The back-propagation method is derived by minimizing the error on the output units over all the patterns, using the following formula for this error E:

E = (1/2) Σ_p Σ_k (t_pk − o_pk)²    (10)

where p is the subscript for the pattern and k is the subscript for the output units. t_pk is the target value of output unit k for pattern p, and o_pk is the actual output value of output unit k for pattern p. This is by far the most commonly used error function; however, from time to time people have tried other error functions. Notice that E is the sum over all the patterns; however, it is assumed that by minimizing the error for each pattern individually, E will also be minimized. Therefore the subscript p will
be dropped from here on, and the weight change formulas for just one pattern will be derived. The first part of the problem is to find how E changes as the weight w_jk, leading into an output unit, changes. The second part of the problem is to find out how E changes as the weight w_ij, leading into a hidden-layer unit j, changes. By the chain rule, the partial derivative of E with respect to one weight, w_jk, is

∂E/∂w_jk = (∂E/∂o_k)(∂o_k/∂net_k)(∂net_k/∂w_jk) = −(t_k − o_k) o_k (1 − o_k) o_j    (11)

The plot of an error function looks something like the curve in Figure 6. This partial derivative is the slope of the error curve for one weight, w_jk. If the slope is positive, we need to decrease the weight by a small amount to lower the error, and if the slope is negative, we need to increase the weight by a small amount³.
³ McAuley, D., 1997, The BackPropagation Network: Learning by Examples, URL: http://www2.psy.uq.edu.au/~brainwav/Manual/BackProp.html

Figure 6 Plot of an Error Function
Hence, the weight adjustment can be given by

w_jk(t+1) = w_jk(t) + (−η)(−(t_k − o_k) o_k (1 − o_k) o_j)    (12)

where w_jk(t+1) and w_jk(t) represent the weight after and before the adjustment, respectively. A concise form is given by

Δw_jk = η δ_k o_j    (13)

where δ_k = (t_k − o_k) o_k (1 − o_k).

Next, the relation between the change in E and the weight w_ij will be derived. In this case, a change to the weight w_ij changes o_j, and this changes the input into each unit k in the output layer. The change in E with a change in w_ij is therefore the sum of the changes to each of the output units. Again applying the chain rule, the result is obtained as
∂E/∂w_ij = Σ_k (∂E/∂o_k)(∂o_k/∂net_k)(∂net_k/∂o_j)(∂o_j/∂net_j)(∂net_j/∂w_ij)
         = Σ_k −(t_k − o_k) o_k (1 − o_k) w_jk o_j (1 − o_j) o_i
         = Σ_k −δ_k w_jk o_j (1 − o_j) o_i
         = −o_i o_j (1 − o_j) Σ_k δ_k w_jk    (14)

The weight change can then be written as

Δw_ij = η δ_j o_i    (15)

where δ_j = o_j (1 − o_j) Σ_k δ_k w_jk. Continuing this procedure, a general formula of the weight change for networks with more than three layers can be obtained.
In order to use the mathematical analysis above, a training set (or training examples) containing several pairs of input and output values (training patterns) has to be prepared. Using the known input/output values, the training rule of the back-propagation algorithm can be stated in pseudo code as in Figure 7.

Repeat
    For each training pattern
        Train on that pattern
    End of for loop
Until the error comes to an acceptable level

Figure 7 Pseudo Codes for Back-propagation Algorithm
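To make the update rules of Eqs. (6)-(15) concrete, here is a minimal NumPy sketch of per-pattern back-propagation for a three-layer network. The two-input XOR pattern set, the layer sizes, the learning rate and the random seed are illustrative choices of ours, not values from the paper; following the paper's convention, the bias is treated as an extra "node 0" with constant output −1.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Logistic transfer function, as in Eqs. (6) and (8).
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative training patterns: two-input XOR.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

l, m, n = 2, 3, 1                       # input, hidden, output node counts
eta = 0.5                               # learning rate
W1 = rng.uniform(-1, 1, (l + 1, m))     # w_ij (last row: bias "node 0")
W2 = rng.uniform(-1, 1, (m + 1, n))     # w_jk (last row: bias "node 0")

def predict(x):
    o_j = f(np.append(x, -1.0) @ W1)     # Eqs. (6)-(7), bias node = -1
    return f(np.append(o_j, -1.0) @ W2)  # Eqs. (8)-(9)

def error():
    # Sum-of-squares error over all patterns, Eq. (10).
    return 0.5 * sum(float(((t - predict(x)) ** 2).sum()) for x, t in zip(X, T))

e_before = error()
for epoch in range(20000):
    for x, t in zip(X, T):
        o_i = np.append(x, -1.0)
        o_j = f(o_i @ W1)
        o_jb = np.append(o_j, -1.0)
        o_k = f(o_jb @ W2)
        delta_k = (t - o_k) * o_k * (1 - o_k)           # delta of Eq. (13)
        delta_j = o_j * (1 - o_j) * (W2[:m] @ delta_k)  # delta of Eq. (15)
        W2 += eta * np.outer(o_jb, delta_k)             # Δw_jk = η δ_k o_j
        W1 += eta * np.outer(o_i, delta_j)              # Δw_ij = η δ_j o_i
e_after = error()
print(e_before, "->", e_after)
```

With enough epochs the error typically falls low enough that rounding the outputs reproduces the XOR table, though convergence from a given random start is not guaranteed.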
The main procedure for the back-propagation algorithm may be expanded into the following steps:
1. Randomly generate the weights from the input to the hidden layer and from the hidden to the output layer, including the bias weights of the nodes in the hidden and output layers.
2. For each training pattern, calculate the node output values and delta values for the hidden and output layers.
3. Update the weights leading into the hidden layer and into the output layer.
4. Continue steps 2 and 3 until all patterns have been trained.
5. If the error is small enough, training stops; otherwise training goes on, that is, return to steps 2, 3 and 4.

A cycle from step 2 to step 4 is typically called an epoch (or iteration). A lower acceptable error level will generally require a larger number of epochs.

III. Neural Network ToolBox in MatLAB®
The Neural Network ToolBox, an accessory to MatLAB®, is widely applied in industry and business, including aerospace, automotive, banking, electronics and entertainment. The toolbox module needs to be run in the MatLAB® environment, and it can be used for anything from a single neuron to multilayer networks. Three popular transfer functions are provided: the Hard-Limit Transfer Function, the Linear Transfer Function and the Log-Sigmoid Transfer Function. Figure 8 describes the three functions.

Figure 8 Three Popular Transfer Functions Offered

The first step in training a feed forward network is to create the network object. The toolbox offers the function newff to create a feed forward network. It requires four inputs and returns the network object. The usage of this function is given as:
net=newff([-1 2; 0 5],[3,1],{'tansig','purelin'},'traingd');

The first input is an R-by-2 matrix of minimum and maximum values for each of the R elements of the input vector. The second input is an array containing the sizes of each layer. The third input is a cell array containing the names of the transfer functions to be used in each layer. The final input contains the name of the training function to be used.

There are seven training parameters associated with traingd:
• epochs
• show
• goal
• time
• min_grad
• max_fail
• lr
There are lots of training functions provided with the toolbox; in this study, Batch Gradient Descent (traingd) is used. The newff command creates the network object and also initializes the weights and biases of the network, so the network is ready for training. There are times when you might want to reinitialize the weights, or to perform a custom initialization. The following command can be used to initialize or reinitialize a net:

net = init(net);

The learning rate lr is multiplied by the negative of the gradient to determine the changes to the weights and biases. The larger the learning rate, the bigger the step. If the learning rate is made too large, the algorithm becomes unstable; if it is set too small, the algorithm takes a long time to converge. The training status is displayed every show (here 50) iterations of the algorithm. The other parameters determine when the training stops: training stops if the number of iterations exceeds epochs, if the performance function drops below goal, if the magnitude of the gradient is less than min_grad, or if the training time is longer than time seconds. The max_fail parameter is associated with the early stopping technique.

The following code creates a training set of inputs p and targets t. For batch training, all the input vectors are placed in one matrix.

p = [-1 -1 2 2;0 5 0 5];
t = [-1 -1 1 1];

Create the feedforward network. Here the function minmax is used to determine the range of the inputs to be used in creating the network.

net=newff(minmax(p),[3,1],{'tansig','purelin'},'traingd');

Some of the default training parameters can be modified as below:

net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-5;

If you want to use the default training parameters, the preceding commands are not necessary.

In addition to traingd, there is another batch algorithm for feed forward networks that often provides faster convergence: traingdm, steepest descent with momentum. Momentum allows a network to respond not only to the local gradient, but also to recent trends in the error surface. Acting like a lowpass filter, momentum allows the network to ignore small features in the error surface. Without momentum a network can get stuck in a shallow local minimum; with momentum it can slide through such a minimum. A momentum constant mc, which can be any number between 0 and 1, mediates the magnitude of the effect that the last weight change is allowed to have. When the momentum constant is 0, a weight change is based solely on the gradient; when it is 1, the new weight change is set equal to the last weight change and the gradient is simply ignored.

Based on the functions offered, an example is provided in the next section to demonstrate the details of developing a feed forward neural network.

IV. Illustrated Example

In this section, a combined logic gate is constructed, and a feed forward neural network is developed to analyze its input-output truth-values. National Instruments LabVIEW™ [6] is employed to define the logic gate. Figure 7 depicts the front panel for interaction, and the graphical programming block diagram is shown in Figure 8. As the A, B and C buttons are pressed, their status changes; that is, the truth-value is toggled from 1 to 0 or vice versa. Each status brings a different byte-value for button Q. With this system, eight patterns of truth-values can be used as the dataset for training, validating and testing the feed forward neural network. Table 1 is the truth table generated by the system.
Figure 7 Front Panel of Logic Gate in LabVIEW™
Figure 8 Block Diagram in LabVIEW™
Table 1 Truth Table for the Combined Logic Gate

A B C | Q
0 0 0 | 1
0 1 0 | 0
0 0 1 | 1
1 0 0 | 0
1 1 0 | 0
0 1 1 | 1
1 0 1 | 0
1 1 1 | 1

There are generally six steps to develop a neural network with MatLAB® [5]:
1. Prepare the dataset.
2. Create the network object.
3. Initialize the network.
4. Set the network training parameters.
5. Train the network.
6. Simulate the response with test data.

The script commands for the network, with the functions provided by the toolbox, are given in Figure 9. In the program, there are three neurons in the hidden layer, and one neuron composes the output layer. In this model, log-sigmoid transfer functions are used by the neurons in both layers, and the batch gradient descent with momentum training style is employed. The performance goal (mean square error) is set to 0.001. In Figure 10, the training session is displayed every 50 iterations; the training came to a stop when it met the performance goal.
Figure 9 MatLAB® Script File

Figure 11 depicts the relationship between the performance goal and epochs. In this study, since the size of the dataset is small, validation is not conducted. For a larger dataset, 20% of the total data can be used for validation to check whether the model is over-trained, and another 20% of the data can be used for testing the model; hence, only 60% of the data would be used to train the neural network. For more precision, increasing the training epochs and setting the performance goal nearly to zero can yield a good result. However, this consumes much computing resource as well as time.

Figure 10 Part Result of Training Session

Figure 11 Relationships between Performance Goal and Epochs
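For readers without the MatLAB® toolbox, the illustrated example can be approximated in Python/NumPy. The sketch below is our reconstruction, not the paper's Figure 9 script: it uses the Table 1 patterns, three log-sigmoid hidden neurons, one log-sigmoid output neuron, batch gradient descent with a momentum term (one common formulation; traingdm's exact update may differ), and the 0.001 mean-square-error goal. The learning rate, momentum constant and seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def logsig(x):
    # Log-sigmoid transfer function, used in both layers.
    return 1.0 / (1.0 + np.exp(-x))

# Table 1: inputs A, B, C and target Q.
X = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0],
              [1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
T = np.array([[1], [0], [1], [0], [0], [1], [0], [1]], dtype=float)

Xb = np.c_[X, -np.ones(len(X))]          # bias node fixed at -1
W1 = rng.uniform(-0.5, 0.5, (4, 3))      # 3 inputs + bias -> 3 hidden
W2 = rng.uniform(-0.5, 0.5, (4, 1))      # 3 hidden + bias -> 1 output
dW1 = np.zeros_like(W1)
dW2 = np.zeros_like(W2)
lr, mc, goal = 0.1, 0.9, 1e-3            # illustrative settings

def forward():
    H = logsig(Xb @ W1)
    Hb = np.c_[H, -np.ones(len(H))]
    return H, Hb, logsig(Hb @ W2)

H, Hb, O = forward()
mse0 = float(np.mean((T - O) ** 2))      # error before training

mse = mse0
for epoch in range(50000):
    H, Hb, O = forward()
    mse = float(np.mean((T - O) ** 2))
    if mse < goal:
        break
    d_out = (T - O) * O * (1 - O)                # output-layer deltas
    d_hid = H * (1 - H) * (d_out @ W2[:3].T)     # hidden-layer deltas
    dW2 = mc * dW2 + lr * (Hb.T @ d_out)         # momentum-smoothed steps
    dW1 = mc * dW1 + lr * (Xb.T @ d_hid)
    W2 += dW2
    W1 += dW1

print("epochs used:", epoch, "final mse:", round(mse, 5))
print("outputs:", np.round(forward()[2].ravel(), 2))
```

Whether the 0.001 goal is reached within the epoch budget depends on the random start and on the learning rate and momentum values; as noted above, a tighter goal costs more epochs and computing time.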
V. Conclusion

In this study, the mathematical background for feed forward neural networks is reviewed. National Instruments LabVIEW™ is used to construct a combined logic gate and to generate a dataset for training the neural network. The Neural Network Toolbox in MatLAB® is employed to establish the multilayer neural network model. The log-sigmoid transfer function is used by the neurons in each layer, and the Batch Gradient Descent with Momentum method is used to train the network. The dataset in this research is small, so it is used only for training the net. To construct a more precise model, many more epochs are needed and the mean square error should be nearly equal to zero; however, this consumes considerable time and computing resources. Finding more data for training, validating and testing, together with searching for optimal network training parameters such as the learning rate and momentum, will be an interesting topic for future study.

References

[1] Huang, C. L., et al., "The Construction of Production Performance Prediction System for Semiconductor Manufacturing with Artificial Neural Network", International Journal of Production Research, Vol. 37, No. 6, 1387-1402 (1999).
[2] Hassoun, M. H., Fundamentals of Artificial Neural Networks, The MIT Press, Cambridge, Massachusetts; Proceedings of the 5th Annual International Conference on Industrial Engineering-Theory, Applications and Practice, Hsinchu, Taiwan (1995).
[3] Kehtarnavaz, N. and Kim, N., Digital Signal Processing System-Level Design Using LabVIEW, Elsevier, New York (2005).
[4] Marek, J. P., Janos, L. and Koster, K., "Digital Fuzzy Logic Controller: Design and Implementation", IEEE Transactions on Fuzzy Systems, Vol. 4, No. 4, 439-459 (1996).
[5] Martin, T. H., Howard, B. D. and Mark, H. B., Neural Network Design, University of Colorado Bookstore (2006).
[6] Neural Network User Guide, available from http://www.mathworks.com/access/helpdesk/help/toolbox/nnet/ (2006); LabVIEW Fundamentals, available from http://sine.ni.com/manuals/ (2006).
Abstract (Chinese version, translated): Logic gates are important components in the design and fabrication of integrated circuits. A multifunction digital electronic part is composed of many basic logic gates, and analyzing such a part is a time-consuming and tedious task. With the maturing of neural network technology and knowledge, applying it to the analysis of logic gates simplifies the work to a considerable extent. The main purpose of this study is to construct a neural network for the analysis of complex logic gates. The paper first reviews the mathematical background of neural networks; a neural network object is then constructed with the neural network toolbox functions provided by MatLAB; to train the network, a combined logic gate is designed with National Instruments LabVIEW and a set of training data is generated.
Keywords: Feed Forward Neural Network, Logic Gate, MatLAB®, LabVIEW™