ASSIGNMENT THREE

Q1. Describe the different levels of knowledge that are necessary for natural language processing.

Solution: There are 5 main types of knowledge representation in Artificial Intelligence.
1. Meta knowledge - knowledge about knowledge, and about how to acquire it.
2. Heuristic knowledge - represents the knowledge of some expert in a field or subject.
3. Procedural knowledge - gives information/knowledge about how to achieve something.
4. Declarative knowledge - statements that describe a particular object and its attributes, including some behavior in relation with it.
5. Structural knowledge - describes what relationship exists between concepts/objects.
Q2. Define Augmented Transition Network.

Solution: An augmented transition network (ATN) is a type of graph-theoretic structure used in the operational definition of formal languages, used especially in parsing relatively complex natural languages, and having wide application in artificial intelligence. An ATN can, theoretically, analyze the structure of any sentence, however complicated. ATNs build on the idea of using finite state machines (Markov models) to parse sentences. W. A. Woods in "Transition Network Grammars for Natural Language Analysis" claims that by adding a recursive mechanism to a finite state model, parsing can be achieved much more efficiently. Instead of building an automaton for a particular sentence, a collection of transition graphs is built. A grammatically correct sentence is parsed by reaching a final state in any state graph. Transitions between these graphs are simply subroutine calls from one state to any initial state on any graph in the network. A sentence is determined to be grammatically correct if a final state is reached by the last word in the sentence.

This model meets many of the goals set forth by the nature of language in that it captures the regularities of the language. That is, if there is a process that operates in a number of environments, the grammar should encapsulate the process in a single structure. Such encapsulation not only simplifies the grammar, but has the added bonus of efficiency of operation. Another advantage of such a model is the ability to postpone decisions. Many grammars use guessing when an ambiguity comes up. This means that not enough is yet known about the sentence. By the use of recursion, ATNs solve this inefficiency by postponing decisions until more is known about a sentence.
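The subroutine-call idea above can be sketched in code. The following is a minimal recursive-transition-network recognizer, the mechanism that ATNs build on: each network is a small finite-state graph, and an arc may "call" another network as a subroutine (the recursion Woods added). The toy grammar, state names, and lexicon are invented for illustration; a real ATN additionally attaches registers and tests to the arcs.

```python
# Minimal recursive transition network (RTN) recognizer.
GRAMMARS = {
    # state -> list of (label, next_state); a label is a word category,
    # the name of a subnetwork (a subroutine call), or None (a jump arc).
    "S":  {"start": [("NP", "v")],
           "v":     [("verb", "obj")],
           "obj":   [("NP", "end"), (None, "end")],
           "end":   []},
    "NP": {"start": [("det", "n"), ("noun", "end")],
           "n":     [("noun", "end")],
           "end":   []},
}
LEXICON = {"the": "det", "a": "det", "dog": "noun",
           "cat": "noun", "saw": "verb", "sleeps": "verb"}

def traverse(net, state, words):
    """Return every possible remainder of `words` after traversing `net`."""
    if state == "end":                    # final state reached
        return {words}
    out = set()
    for label, nxt in GRAMMARS[net][state]:
        if label is None:                 # jump arc: consume nothing
            out |= traverse(net, nxt, words)
        elif label in GRAMMARS:           # subroutine call into a subnetwork
            for rest in traverse(label, "start", words):
                out |= traverse(net, nxt, rest)
        elif words and LEXICON.get(words[0]) == label:
            out |= traverse(net, nxt, words[1:])
    return out

def grammatical(sentence):
    """A sentence parses if some traversal of S consumes every word."""
    return () in traverse("S", "start", tuple(sentence.split()))
```

Note how ambiguity is handled without guessing: `traverse` returns the set of all reachable remainders, so a decision between arcs is effectively postponed until the rest of the sentence settles it.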
Q3. What are the characteristics of an expert system? Write the various application areas of expert systems.

Solution: Characteristics of an Expert System:
• Separates knowledge from control.
• Possesses expert, task-specific knowledge.
• Focuses expertise.
• Reasons with symbols (which is what all computation is - can relate to original problems of machine translation).
• Reasons heuristically (cf. expertise above).
• Permits inexact reasoning (not essential).
• Limited to solvable problems (or the procedure might not terminate).
Expert systems address areas where the combinatorics is enormous:
• Highly interactive or conversational applications, IVR, voice servers.
• Fault diagnosis, medical diagnosis.
• Decision support in complex systems, process control, interactive user guides.
• Educational and tutorial software.
• Logic simulation of machines or systems.
• Knowledge management.
• Constantly changing software.
They can also be used in software engineering for rapid prototyping applications (RAD). Indeed, an expert system quickly developed in front of the expert shows them whether the future application should be programmed.
Q4. Explain the rule based system architecture.

Solution: Conventional problem-solving computer programs make use of well-structured algorithms, data structures, and crisp reasoning strategies to find solutions. For the difficult problems with which expert systems are concerned, it may be more useful to employ heuristics: strategies that often lead to the correct solution, but that also sometimes fail. Expert knowledge is often represented in the form of rules or as data within the computer. Depending upon the problem requirement, these rules and data can be recalled to solve problems. Rule-based expert systems have played an important role in modern intelligent systems and their applications in strategic goal setting, planning, design, scheduling, fault monitoring, diagnosis and so on. A typical rule-based system has four basic components:
• A list of rules or rule base, which is a specific type of knowledge base.
• An inference engine or semantic reasoner, which infers information or takes action based on the interaction of input and the rule base. The interpreter executes a production system program by performing the following match-resolve-act cycle:
  – Match: In this first phase, the left-hand sides of all productions are matched against the contents of working memory. As a result a conflict set is obtained, which consists of instantiations of all satisfied productions. An instantiation of a production is an ordered list of working memory elements that satisfies the left-hand side of the production.
  – Conflict-Resolution: In this second phase, one of the production instantiations in the conflict set is chosen for execution. If no productions are satisfied, the interpreter halts.
  – Act: In this third phase, the actions of the production selected in the conflict-resolution phase are executed. These actions may change the contents of working memory. At the end of this phase, execution returns to the first phase.
• Temporary working memory.
• A user interface or other connection to the outside world through which input and output signals are received and sent.
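The match-resolve-act cycle above can be made concrete with a toy forward-chaining production system. The rules and facts below are invented for illustration, and conflict resolution is deliberately the simplest possible strategy (first satisfied rule wins); real engines use recency, specificity, or salience instead.

```python
# A toy forward-chaining production system: match, resolve, act.
RULES = [
    # (name, conditions that must all be in working memory, facts to add)
    ("r1", {"has_fur", "gives_milk"}, {"mammal"}),
    ("r2", {"mammal", "eats_meat"}, {"carnivore"}),
    ("r3", {"carnivore", "tawny", "dark_spots"}, {"cheetah"}),
]

def run(initial_facts):
    wm = set(initial_facts)                       # temporary working memory
    while True:
        # Match: rules whose left-hand sides hold and that add new facts.
        conflict_set = [(name, conds, adds) for name, conds, adds in RULES
                        if conds <= wm and not adds <= wm]
        if not conflict_set:                      # nothing satisfied: halt
            return wm
        # Conflict resolution: pick the first instantiation.
        name, conds, adds = conflict_set[0]
        # Act: change working memory, then return to the match phase.
        wm |= adds

facts = run({"has_fur", "gives_milk", "eats_meat", "tawny", "dark_spots"})
# the engine chains r1 -> r2 -> r3, adding mammal, carnivore, cheetah
```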
Q5. Describe and compare the different types of problems solved by these expert systems: DENDRAL, MYCIN, PROSPECTOR and R1.

Solution:
DENDRAL
Heuristic Dendral is a program that uses mass spectra or other experimental data, together with a knowledge base of chemistry, to produce a set of possible chemical structures that may be responsible for producing the data. It is an influential pioneer project in artificial intelligence (AI) of the 1960s, and the computer software expert system that it produced. Its primary aim was to help organic chemists in identifying unknown organic molecules, by analyzing their mass spectra and using knowledge of chemistry. A mass spectrum of a compound is produced by a mass spectrometer, and is used to determine its molecular weight, the sum of the masses of its atomic constituents. For example, the compound water (H2O) has a molecular weight of 18, since hydrogen has a mass of 1.01 and oxygen 16.00, and its mass spectrum has a peak at 18 units. Heuristic Dendral would use this input mass and the knowledge of atomic mass numbers and valence rules to determine the possible combinations of atomic constituents whose mass would add up to 18. As the weight increases and the molecules become more complex, the number of possible compounds increases drastically. Thus, a program that is able to reduce this number of candidate solutions through the process of hypothesis formation is essential.
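The hypothesis-formation step can be sketched as a search for atom combinations whose masses sum to the observed molecular weight. This is only an illustration of the combinatorial explosion DENDRAL must tame: the atom set and integer masses are simplified, and the valence-rule filtering that the real system applies is omitted.

```python
# Enumerate atom multisets whose rounded atomic masses sum to a target
# molecular weight (no valence checking -- illustration only).
ATOMS = {"H": 1, "C": 12, "N": 14, "O": 16}

def candidates(target, atoms=("C", "N", "O", "H"), partial=()):
    """Yield atom tuples whose masses add up exactly to `target`."""
    if target == 0:
        yield partial
        return
    if not atoms:
        return
    atom, rest = atoms[0], atoms[1:]
    for count in range(target // ATOMS[atom] + 1):
        yield from candidates(target - count * ATOMS[atom], rest,
                              partial + (atom,) * count)

hits = list(candidates(18))
# among the hits is ("O", "H", "H"), i.e. water; valence rules would
# then prune chemically impossible candidates such as 18 lone hydrogens
```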
MYCIN
MYCIN was an early expert system that used artificial intelligence to identify bacteria causing severe infections, such as bacteremia and meningitis, and to recommend antibiotics, with the dosage adjusted for the patient's body weight. The Mycin system was also used for the diagnosis of blood clotting diseases. MYCIN operated using a fairly simple inference engine and a knowledge base of about 600 rules. It would query the physician running the program via a long series of simple yes/no or textual questions. At the end, it provided a list of possible culprit bacteria ranked from high to low based on the probability of each diagnosis, its confidence in each diagnosis' probability, the reasoning behind each diagnosis (that is, MYCIN would also list the questions and rules which led it to rank a diagnosis a particular way), and its recommended course of drug treatment.

PROSPECTOR
A consultation system to assist geologists working in mineral exploration. It attempts to represent the knowledge and reasoning processes of experts in the geological domain, and the system has been kept domain independent. It matches data from a site against models describing regional and local characteristics favorable for specific ore deposits. The input data are assumed to be incomplete and uncertain. PROSPECTOR performs a consultation to determine such things as:
– Which model best fits the data
– Where the most favorable drilling sites are located
– What additional data would be most helpful in reaching firmer conclusions
– What is the basis for these conclusions and recommendations
R1
• Rule-based system developed by DEC and CMU to configure VAX computer systems.
• Output is a corrected order with diagrams showing component layout and wiring suggestions.
R1 proceeds through the following steps:
1. Check order for missing/mismatched pieces
2. Layout processor cabinets
3. Put boxes in input/output cabinets and put components in boxes
4. Put panels in input/output cabinets
5. Layout floor plan
6. Indicate cabling
Q6. What is an expert system shell?

Solution: It is a tool for building an expert system: a software package that includes an inference engine, a knowledge representation language, a user interface, and all the code used by an expert system regardless of the domain. All we have to do is add the knowledge, i.e., the rules and facts used by an expert to solve problems in a certain domain. CLIPS is an example of an expert system shell.
Q10. Give the advantages of expert system architecture based on decision trees over those of production rules. What are the main disadvantages?

Solution: Amongst decision support tools, decision trees (and influence diagrams) have several advantages. Decision trees:
• Are simple to understand and interpret. People are able to understand decision tree models after a brief explanation.
• Have value even with little hard data. Important insights can be generated based on experts describing a situation (its alternatives, probabilities, and costs) and their preferences for outcomes.
• Allow possible scenarios to be added.
• Allow worst, best and expected values to be determined for different scenarios.
• Use a white box model: if a given result is provided by the model, the explanation for that result can be traced directly through the tree.
• Can be combined with other decision techniques (for example, Net Present Value analysis).
Disadvantages of decision trees:
• For data including categorical variables with different numbers of levels, information gain in decision trees is biased in favor of those attributes with more levels.
• Calculations can get very complex, particularly if many values are uncertain and/or if many outcomes are linked.
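The "worst, best and expected values" advantage above amounts to rolling the tree back from its leaves. The sketch below computes the expected value of a tiny decision tree; the payoffs, probabilities, and node encoding are invented for illustration.

```python
# Roll back a decision tree: chance nodes average their branches by
# probability, decision nodes take the best branch, leaves are payoffs.
tree = ("decision", {
    "launch":  ("chance", [(0.3, 100.0),    # strong market payoff
                           (0.7, -20.0)]),  # weak market payoff
    "abandon": 0.0,
})

def expected_value(node):
    if isinstance(node, (int, float)):          # leaf payoff
        return node
    kind, body = node
    if kind == "chance":                        # probability-weighted average
        return sum(p * expected_value(sub) for p, sub in body)
    return max(expected_value(sub) for sub in body.values())  # best choice

best = expected_value(tree)
# launch: 0.3 * 100 + 0.7 * (-20) = 16.0, which beats abandoning (0.0)
```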
Q12. Give an example of the use of Meta knowledge in expert system inference.

Solution: A specific example of an expert system using Meta knowledge is PXDES, which diagnoses pneumoconiosis (a lung disease) from X-rays. This expert system incorporates an inference engine to examine the shadows on the X-ray. The shadows are used to determine the type and the degree of pneumoconiosis. This system also includes three other modes: the knowledge base, the explanation interface, and the knowledge acquisition modes. The knowledge base mode contains the data of X-ray representations of various stages of the disease. These elements are in the form of fuzzy production rules discussed in the previous paragraphs. The explanation interface details the conclusions based on the Meta knowledge, and the knowledge acquisition mode allows medical experts to add or change information in the system. This phase also adds details into the system based on the Meta knowledge. This system hence uses Meta knowledge to infer.
ASSIGNMENT FOUR

Q1. How does an artificial neural network model the brain? Describe two major classes of learning paradigms: supervised and unsupervised learning. What are the features that distinguish these paradigms from each other?

Solution: In particular, the most basic element of the human brain is a specific type of cell which, unlike the rest of the body, doesn't appear to regenerate. Because this type of cell is the only part of the body that isn't slowly replaced, it is assumed that these cells are what provide us with our abilities to remember, think, and apply previous experiences to our every action. These cells, all 100 billion of them, are known as neurons. Each of these neurons can connect with up to 200,000 other neurons, although 1,000 to 10,000 is typical.
The power of the human mind comes from the sheer numbers of these basic components and the multiple connections between them. It also comes from genetic programming and learning. The individual neurons are complicated. They have a myriad of parts, subsystems, and control mechanisms. They convey information via a host of electrochemical pathways. There are over one hundred different classes of neurons, depending on the classification method used. Together these neurons and their connections form a process which is not binary, not stable, and not synchronous. In short, it is nothing like the currently available electronic computers, or even artificial neural networks. These artificial neural networks try to replicate only the most basic elements of this complicated, versatile, and powerful organism, and they do it in a primitive way.

Supervised learning incorporates an external teacher, so that each output unit is told what its desired response to input signals ought to be. During the learning process, global information may be required. Paradigms of supervised learning include error-correction learning, reinforcement learning and stochastic learning. An important issue concerning supervised learning is the problem of error convergence, i.e. the minimization of error between the desired and computed unit values. The aim is to determine a set of weights which minimizes the error. One well-known method, which is common to many learning paradigms, is least mean square (LMS) convergence.

Unsupervised learning uses no external teacher and is based upon only local information. It is also referred to as self-organization, in the sense that it self-organizes data presented to the network and detects their emergent collective properties. Paradigms of unsupervised learning are Hebbian learning and competitive learning.
Another aspect of learning concerns whether or not there is a separate phase during which the network is trained, followed by a subsequent operation phase. We say that a neural network learns off-line if the learning phase and the operation phase are distinct. A neural network learns on-line if it learns and operates at the same time. Usually, supervised learning is performed off-line, whereas unsupervised learning is performed on-line.
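The supervised, error-correction paradigm can be shown in a few lines. The sketch below trains a single linear unit with the LMS (delta) rule: the teacher supplies a desired output for each input, and the weights are nudged to shrink the desired-minus-computed error. The training data (a noiseless line y = 2x + 1), learning rate, and iteration count are invented for illustration.

```python
# Supervised error-correction (LMS) learning on one linear unit.
import random

random.seed(0)
w, b, lr = 0.0, 0.0, 0.05                 # weight, bias, learning rate
data = [(x, 2.0 * x + 1.0) for x in [0.0, 0.5, 1.0, 1.5, 2.0]]

for step in range(2000):                  # training (learning) phase
    x, target = random.choice(data)       # input and teacher's desired output
    y = w * x + b                         # computed unit value
    error = target - y                    # desired minus computed
    w += lr * error * x                   # LMS / delta-rule weight update
    b += lr * error
# w and b converge toward the teacher's line: w -> 2, b -> 1
```

An unsupervised paradigm such as Hebbian or competitive learning would differ precisely in the lines marked "teacher": there would be no `target`, and the update would use only locally available activity.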
Q2. What is meant by an activation function in an artificial neuron model? Describe the various activation functions that are employed and compare their merits and demerits.

Solution: In computational networks, the activation function of a node defines the output of that node given an input or set of inputs. Most units in a neural network transform their net inputs by using a scalar-to-scalar function called an activation function, yielding a value called the unit's activation. Except possibly for output units, the activation value is fed to one or more other units. Activation functions with a bounded range are often called squashing functions. Some of the most commonly used activation functions are:

1. Identity function (Figure 2-2)

f(x) = x

It is obvious that the input units use the identity function. Sometimes a constant is multiplied by the net input to form a linear function.
Figure 2-2 Identity function

2. Binary step function (Figure 2-3)
Also known as the threshold function or Heaviside function. The output of this function is limited to one of two values:

f(x) = 1 if x >= θ; f(x) = 0 if x < θ   (2.4)

This kind of function is often used in single layer networks.
Figure 2-3 Binary step function

3. Sigmoid function (Figure 2-4)

f(x) = 1 / (1 + e^(-σx))   (2.5)

This function is especially advantageous for use in neural networks trained by back-propagation, because it is easy to differentiate, and thus can dramatically reduce the computational burden for training. It applies to applications whose desired output values are between 0 and 1.
Figure 2-4 Sigmoid function

4. Bipolar sigmoid function (Figure 2-5)

f(x) = 2 / (1 + e^(-σx)) - 1   (2.6)

This function has similar properties to the sigmoid function. It works well for applications that yield output values in the range [-1, 1].
Figure 2-5 Bipolar sigmoid function

Activation functions for the hidden units are needed to introduce nonlinearity into the networks. The reason is that a composition of linear functions is again a linear function. However, it is the non-linearity (i.e., the capability to represent nonlinear functions) that makes multi-layer networks so powerful. Almost any nonlinear function does the job, although for back-propagation learning it must be differentiable, and it helps if the function is bounded (see Section 3.4). The sigmoid functions are the most common choices.
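The four functions above, with σ as the slope parameter of equations (2.5) and (2.6), can be written out directly:

```python
# The activation functions of equations (2.4)-(2.6), plus the identity.
import math

def identity(x):
    return x

def binary_step(x, theta=0.0):            # equation (2.4): outputs 0 or 1
    return 1.0 if x >= theta else 0.0

def sigmoid(x, sigma=1.0):                # equation (2.5): range (0, 1)
    return 1.0 / (1.0 + math.exp(-sigma * x))

def bipolar_sigmoid(x, sigma=1.0):        # equation (2.6): range (-1, 1)
    return 2.0 / (1.0 + math.exp(-sigma * x)) - 1.0
```

Evaluating at x = 0 shows the difference in output ranges: the sigmoid is centered at 0.5, while the bipolar sigmoid is centered at 0, which is why the latter suits targets in [-1, 1].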
Q3. What is memory based learning? Discuss it.

Solution: In neural networks, instance-based learning or memory-based learning is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. Instance-based learning is a kind of lazy learning. It is called instance-based because it constructs hypotheses directly from the training instances themselves. This means that the hypothesis complexity can grow with the data: in the worst case, a hypothesis is a list of n training items, and the computational complexity of classifying a single new instance is O(n). One advantage that instance-based learning has over other methods of machine learning is its ability to adapt its model to previously unseen data. Where other methods generally require the entire set of training data to be re-examined when one instance is changed, instance-based learners may simply store a new instance or throw an old instance away. A simple example of an instance-based learning algorithm is the k-nearest neighbor algorithm.
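A bare-bones k-nearest-neighbor classifier makes the "lazy" character concrete: training just stores the instances, and all the work happens at query time, in O(n) per query as noted above. The tiny 2-D dataset is invented for illustration.

```python
# Memory-based (instance-based) learning: k-nearest neighbors.
from collections import Counter

def knn_classify(train, query, k=3):
    """train: list of ((x, y), label). Returns the majority label of the
    k stored instances closest to `query` (squared Euclidean distance)."""
    dist = lambda item: ((item[0][0] - query[0]) ** 2 +
                         (item[0][1] - query[1]) ** 2)
    nearest = sorted(train, key=dist)[:k]         # scan the whole memory
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
```

Adapting to new data is just `train.append(((7, 7), "b"))` - no retraining pass over the existing instances, which is the advantage over eager learners described above.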
Q4. Discuss how neural networks help in solving AI problems. Discuss the different types of functions of an ANN.
Solution: Neural networks are used to solve a wide variety of problems, some of which have been solved by existing statistical methods, and some of which have not. These applications fall into one of the following three categories:
• Forecasting: predicting one or more quantitative outcomes from both quantitative and nominal input data,
• Classification: classifying input data into one of two or more categories, or
• Statistical pattern recognition: uncovering patterns, typically spatial or temporal, among a set of variables.

Forecasting, pattern recognition and classification problems are not new. They existed years before the discovery of neural network solutions in the 1980's. What is new is that neural networks provide a single framework for solving so many traditional problems and, in some cases, extend the range of problems that can be solved. Traditionally, these problems were solved using a variety of widely known statistical methods:
• linear regression and general least squares,
• logistic regression and discrimination,
• principal component analysis,
• discriminant analysis,
• k-nearest neighbor classification, and
• ARMA and NARMA time series forecasts.
In many cases, simple neural network configurations yield the same solution as many traditional statistical applications. For example, a single-layer, feedforward neural network with linear activation for its output perceptron is equivalent to a general linear regression fit. Neural networks can provide more accurate and robust solutions for problems where traditional methods do not completely apply. A neural network is defined not only by its architecture and flow, or interconnections, but also by the computations used to transmit information from one node or input to another node. These computations are determined by network weights. The process of fitting a network to existing data to determine these weights is referred to as training the network, and the data used in this process are referred to as patterns. Individual network inputs are referred to as attributes and outputs are referred to as classes. The table below lists terms used to describe neural networks that are synonymous to common statistical terminology.
Types of functions:

Step Function
A step function is a function like that used by the original Perceptron. The output is a certain value, A1, if the input sum is above a certain threshold, and A0 if the input sum is below a certain threshold. The values used by the Perceptron were A1 = 1 and A0 = 0.

These kinds of step activation functions are useful for binary classification schemes.
Linear Combination
A linear combination is where the weighted sum of the neuron's inputs plus a linearly dependent bias becomes the system output. Specifically:

y = b + Σ_i w_i x_i

In these cases, the sign of the output is considered to be equivalent to the 1 or 0 of the step function systems, which makes the two methods equivalent if the bias is taken to be the negation of the step function's threshold.
Continuous Log-Sigmoid Function
A log-sigmoid function, also known as a logistic function, is given by the relationship:

f(x) = 1 / (1 + e^(-σx))

where σ is a slope parameter. This is called the log-sigmoid because a sigmoid can also be constructed using the hyperbolic tangent function instead of this relation, in which case it would be called a tan-sigmoid. Here, we will refer to the log-sigmoid as simply "sigmoid".
Softmax Function
The softmax activation function is useful predominantly in the output layer of a clustering system. Softmax functions convert a raw value into a posterior probability, which provides a measure of certainty. The softmax activation function is given as:

y_i = exp(x_i) / Σ_{j in L} exp(x_j)

where L is the set of neurons in the output layer.
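The softmax formula above can be implemented directly. The max-subtraction below is a standard numerical-stability trick (it leaves the result unchanged but avoids overflow for large raw values); the input vector is invented for illustration.

```python
# Softmax over the raw outputs of a layer: exp(x_i) / sum_j exp(x_j).
import math

def softmax(x):
    m = max(x)                              # shift for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# probs sums to 1, and the largest raw value gets the largest probability,
# which is what lets the outputs be read as posterior probabilities
```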