A Tutorial on Hidden Markov Models

KAIST, February 1-3, 2006
February 2, 2006
Jin-Young Ha ([email protected])
Dept. of Computer Science, Kangwon National University
Contents
• Introduction
• Markov Model
• Hidden Markov Model (HMM)
• Three algorithms of HMM
  – Model evaluation
  – Most probable path decoding
  – Model training
• Pattern classification using HMMs
• HMM applications and software
• Summary
• References
Sequential Data
• Examples
  – Speech data ("하나 둘 셋", i.e., "one, two, three")
  – Handwriting data
Characteristics of Such Data
• Data are sequentially generated according to time or index
• Spatial information along time or index
• Often highly variable, but with an embedded structure
• Information is contained in the structure
Advantages of HMM for Sequential Data
• Natural model structure: doubly stochastic process
  – Transition parameters model temporal variability
  – Output distributions model spatial variability
• Efficient and good modeling tool for
  – Sequences with temporal constraints
  – Spatial variability along the sequence
  – Real-world complex processes
• Efficient evaluation, decoding and training algorithms
  – Mathematically strong
  – Computationally efficient
• Proven technology!
  – Success stories in many applications
• Tools already exist
  – HTK (Hidden Markov Model Toolkit)
  – HMM toolbox for Matlab
Successful Application Areas of HMM
• On-line handwriting recognition
• Speech recognition and segmentation
• Gesture recognition
• Language modeling
• Motion video analysis and tracking
• Protein sequence/gene sequence alignment
• Stock price prediction
• …
What's HMM?

Hidden Markov Model = "Hidden" + "Markov model"
• What is 'Markov model'?
• What is 'hidden'?
Markov Model
• Scenario
• Graphical representation
• Definition
• Sequence probability
• State probability
Markov Model: Scenario
• Classify the weather into three states
  – State 1: rain or snow
  – State 2: cloudy
  – State 3: sunny
• By carefully examining the weather of some city for a long time, we found the following weather change pattern:

  Today \ Tomorrow   Rain/Snow   Cloudy   Sunny
  Rain/Snow            0.4        0.3      0.3
  Cloudy               0.2        0.6      0.2
  Sunny                0.1        0.1      0.8

• Assumption: tomorrow's weather depends only on today's weather!
Markov Model: Graphical Representation
• Visual illustration with a state diagram
  [Diagram: three states (1: rain, 2: cloudy, 3: sunny) with self-loops 0.4, 0.6, 0.8 and cross transitions 0.3, 0.3, 0.2, 0.2, 0.1, 0.1]
• Each state corresponds to one observation
• The weights of the outgoing edges of each state sum to one
Markov Model: Definition
• Observable states: {1, 2, …, N}
• Observed sequence: q_1, q_2, …, q_T
• 1st-order Markov assumption:
  P(q_t = j | q_{t-1} = i, q_{t-2} = k, …) = P(q_t = j | q_{t-1} = i)
  (In the Bayesian network representation, each q_t depends only on q_{t-1}.)
• Stationarity:
  P(q_t = j | q_{t-1} = i) = P(q_{t+l} = j | q_{t+l-1} = i)
Markov Model: Definition (Cont.)
• State transition matrix

        | a_11  a_12  …  a_1N |
  A  =  | a_21  a_22  …  a_2N |
        |  ⋮     ⋮          ⋮  |
        | a_N1  a_N2  …  a_NN |

  – where a_ij = P(q_t = j | q_{t-1} = i),  1 ≤ i, j ≤ N
  – with constraints a_ij ≥ 0 and Σ_{j=1}^{N} a_ij = 1
• Initial state probability
  π_i = P(q_1 = i),  1 ≤ i ≤ N
Markov Model: Sequence Probability
• Conditional probability: P(A, B) = P(A | B) P(B)
• Sequence probability of a Markov model:
  P(q_1, q_2, …, q_T)
  = P(q_1) P(q_2 | q_1) … P(q_{T-1} | q_1, …, q_{T-2}) P(q_T | q_1, …, q_{T-1})   (chain rule)
  = P(q_1) P(q_2 | q_1) … P(q_{T-1} | q_{T-2}) P(q_T | q_{T-1})   (1st-order Markov assumption)
Markov Model: Sequence Probability (Cont.)
• Question: What is the probability that the weather for the next 7 days will be "sun-sun-rain-rain-sun-cloudy-sun", when today is sunny?
  (S_1: rain/snow, S_2: cloudy, S_3: sunny)

  P(O | model) = P(S_3, S_3, S_3, S_1, S_1, S_3, S_2, S_3 | model)
  = P(S_3) · P(S_3|S_3) · P(S_3|S_3) · P(S_1|S_3) · P(S_1|S_1) · P(S_3|S_1) · P(S_2|S_3) · P(S_3|S_2)
  = π_3 · a_33 · a_33 · a_31 · a_11 · a_13 · a_32 · a_23
  = 1 · (0.8)(0.8)(0.1)(0.4)(0.3)(0.1)(0.2)
  = 1.536 × 10⁻⁴
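A minimal sketch of this computation in NumPy (the 0-based state indexing and the helper name are illustrative choices, not from the slides):

```python
import numpy as np

# Weather Markov chain from the slides: states 0=rain/snow, 1=cloudy, 2=sunny.
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

def sequence_probability(states, A, pi):
    """P(q_1, ..., q_T) under a first-order Markov chain."""
    p = pi[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= A[prev, cur]
    return p

pi = np.array([0.0, 0.0, 1.0])           # "today is sunny": all mass on state 2
obs = [2, 2, 2, 0, 0, 2, 1, 2]           # sun-sun-sun-rain-rain-sun-cloudy-sun
print(sequence_probability(obs, A, pi))  # 1.536e-04
```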
Markov Model: State Probability
• State probability at time t: P(q_t = i)
• Simple but slow algorithm:
  – Probability of one path that ends at state i at time t:
    Q_t(i) = (q_1, q_2, …, q_t = i)
    P(Q_t(i)) = π_{q_1} ∏_{k=2}^{t} P(q_k | q_{k-1})
  – Summation of the probabilities of all paths that end at i at time t:
    P(q_t = i) = Σ_{all Q_t(i)'s} P(Q_t(i))
  – Exponential time complexity: O(N^t)
Markov Model: State Probability (Cont.)
• Efficient algorithm (lattice algorithm)
  – Each lattice node stores the sum of the probabilities of the partial paths reaching it
  – Recursive path probability calculation:
    P(q_t = i) = Σ_{j=1}^{N} P(q_{t-1} = j, q_t = i)
               = Σ_{j=1}^{N} P(q_{t-1} = j) P(q_t = i | q_{t-1} = j)
               = Σ_{j=1}^{N} P(q_{t-1} = j) · a_ji
  – Time complexity: O(N²t)
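The recursion is a single vector-matrix product per time step; a sketch (the uniform initial distribution is an assumption for illustration):

```python
import numpy as np

def state_probability(pi, A, t):
    """P(q_t = i) for all i via the lattice recursion; O(N^2 t) total work.
    t is 1-based: t = 1 returns pi itself."""
    p = pi.copy()
    for _ in range(t - 1):
        p = p @ A        # p[i] <- sum_j p[j] * A[j, i]
    return p

A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])
pi = np.ones(3) / 3      # assumed uniform start, for illustration only
print(state_probability(pi, A, 10))   # approaches the chain's stationary distribution
```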
Hidden Markov Model
• Example
• Generation process
• Definition
• Model evaluation algorithm
• Path decoding algorithm
• Training algorithm
Time Series Example
• Representation
  – X = x_1 x_2 x_3 x_4 x_5 … x_{T-1} x_T
      = s φ p iy iy iy φ φ ch ch ch ch
Analysis Methods
• Probability-based analysis?
  P(s φ p iy iy iy φ φ ch ch ch ch) = ?
• Method I
  P(s) P(φ)³ P(p) P(iy)³ P(ch)⁴
  – Observations are assumed independent; no time/order information
  – A poor model of temporal structure
  – Model size = |V| = N
Analysis Methods (Cont.)
• Method II
  P(s) P(φ|s) P(p|φ) P(iy|p) P(iy|iy)² P(φ|iy) P(φ|φ) P(ch|φ) P(ch|ch)³
  – A simple model of an ordered sequence
  – A symbol depends only on the immediately preceding one:
    P(x_t | x_1 x_2 x_3 … x_{t-1}) = P(x_t | x_{t-1})
  – A |V|×|V| matrix model
    • 50×50 – not very bad …
    • 10⁵×10⁵ – doubly outrageous!!
The Problem
• "What you see is the truth"?
  – Not quite a valid assumption
  – There are often errors or noise
    • Noisy sound, sloppy handwriting, ungrammatical sentences
  – There may be some underlying "truth" process
    • An underlying hidden sequence
    • Obscured by the incomplete observation
Another Analysis Method
• Method III
  – What you see is a clue to what lies behind, which is not known a priori
    • The source that generated the observation
    • The source evolves and generates characteristic observation sequences

  q_0 → q_1 → q_2 → ⋯ → q_T

  P(s, q_1) P(s, q_2 | q_1) P(φ, q_3 | q_2) ⋯ P(ch, q_T | q_{T-1}) = ∏_t P(x_t, q_t | q_{t-1})

  Σ_Q P(s, q_1) P(s, q_2 | q_1) P(φ, q_3 | q_2) ⋯ P(ch, q_T | q_{T-1}) = Σ_Q ∏_t P(x_t, q_t | q_{t-1})
The Auxiliary Variable
• q_t ∈ S = {1, …, N}
• N is also conjectured
• {q_t : t ≥ 0} is conjectured, not visible
  – Q = q_1 q_2 … q_T
  – Q is Markovian:
    P(q_1 q_2 … q_T) = P(q_1) P(q_2 | q_1) … P(q_T | q_{T-1})
  – a "Markov chain"
Summary of the Concept

P(X) = Σ_Q P(X, Q)
     = Σ_Q P(Q) P(X | Q)
     = Σ_Q P(q_1 q_2 … q_T) P(x_1 x_2 … x_T | q_1 q_2 … q_T)
     = Σ_Q [ ∏_{t=1}^{T} P(q_t | q_{t-1}) ] [ ∏_{t=1}^{T} p(x_t | q_t) ]
          (Markov chain process)       (output process)

(with the convention P(q_1 | q_0) ≡ P(q_1))
Hidden Markov Model
• An HMM is a doubly stochastic process
  – A stochastic chain process: {q(t)}
  – An output process: {f(x|q)}
• It is also called
  – a hidden Markov chain
  – a probabilistic function of a Markov chain
HMM Characterization
• λ = (A, B, π)
  – A: state transition probability, { a_ij | a_ij = p(q_{t+1} = j | q_t = i) }
  – B: symbol output/observation probability, { b_j(v) | b_j(v) = p(x = v | q_t = j) }
  – π: initial state distribution, { π_i | π_i = p(q_1 = i) }
• Model evaluation under λ:
  P(X | λ) = Σ_Q P(Q | λ) P(X | Q, λ)
           = Σ_Q π_{q_1} a_{q_1 q_2} a_{q_2 q_3} … a_{q_{T-1} q_T} b_{q_1}(x_1) b_{q_2}(x_2) … b_{q_T}(x_T)
Graphical Example

π = [ 1.0  0  0  0 ]

[Diagram: a four-state left-to-right model over symbols s, p, iy, ch, with self-loops 0.6, 0.5, 0.7, 1.0 and forward transitions 0.4, 0.5, 0.3]

A (state transition):
        1    2    3    4
  1 |  0.6  0.4  0.0  0.0 |
  2 |  0.0  0.5  0.5  0.0 |
  3 |  0.0  0.0  0.7  0.3 |
  4 |  0.0  0.0  0.0  1.0 |

B (symbol output; columns s, p, iy, ch, …):
        s    p    iy   ch   …
  1 |  0.6  0.2  0.2  0.0  … |
  2 |  0.2  0.5  0.3  0.0  … |
  3 |  0.0  0.1  0.8  0.1  … |
  4 |  0.0  0.2  0.2  0.6  … |
Data Interpretation

P(s s p p iy iy iy ch ch ch | λ) = Σ_Q P(s s p p iy iy iy ch ch ch, Q | λ)
                                 = Σ_Q P(Q | λ) p(s s p p iy iy iy ch ch ch | Q, λ)

Let Q = 1 1 2 2 3 3 3 4 4 4. Then

P(Q | λ) p(s s p p iy iy iy ch ch ch | Q, λ)
  = (1×.6) × (.6×.6) × (.4×.5) × (.5×.5) × (.5×.8) × (.7×.8)² × (.3×.6) × (1.0×.6)²
  ≅ 0.0000878

Summing over all N^T paths this way costs about 2T·N^T multiplications – too many!
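Both the single-path term and the brute-force sum over all N^T paths can be checked directly. In this sketch the B entries beyond those shown on the slide are assumed to be zero:

```python
import numpy as np
from itertools import product

# Left-to-right model from the "Graphical Example" slide; symbols 0=s, 1=p, 2=iy, 3=ch.
pi = np.array([1.0, 0.0, 0.0, 0.0])
A = np.array([[0.6, 0.4, 0.0, 0.0],
              [0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 0.7, 0.3],
              [0.0, 0.0, 0.0, 1.0]])
B = np.array([[0.6, 0.2, 0.2, 0.0],     # unshown symbol columns assumed zero
              [0.2, 0.5, 0.3, 0.0],
              [0.0, 0.1, 0.8, 0.1],
              [0.0, 0.2, 0.2, 0.6]])

X = [0, 0, 1, 1, 2, 2, 2, 3, 3, 3]      # s s p p iy iy iy ch ch ch

def joint(X, Q):
    """P(X, Q | lambda) for one state path Q."""
    p = pi[Q[0]] * B[Q[0], X[0]]
    for t in range(1, len(X)):
        p *= A[Q[t-1], Q[t]] * B[Q[t], X[t]]
    return p

print(joint(X, [0, 0, 1, 1, 2, 2, 2, 3, 3, 3]))   # ~8.78e-05, as above

# Brute-force evaluation: sum over all 4^10 paths (feasible only at toy sizes).
print(sum(joint(X, Q) for Q in product(range(4), repeat=len(X))))
```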
Issues in HMM
• Intuitive decisions
  1. Number of states (N)
  2. Topology (state interconnection)
  3. Number of observation symbols (V)
• Difficult problems
  4. Efficient computation methods
  5. Probability parameters (λ)
The Number of States
• How many states?
  – Model size
  – Model topology/structure
• Factors
  – Pattern complexity/length and variability
  – The number of samples
• Example observation: r r g b b g b b b r
(1) The Simplest Model
• Model I
  – N = 1, a_11 = 1.0
  – B = [1/3, 1/6, 1/2]   (output probabilities of r, g, b)

  P(r r g b b g b b b r | λ₁)
  = 1×(1/3) × 1×(1/3) × 1×(1/6) × 1×(1/2) × 1×(1/2) × 1×(1/6) × 1×(1/2) × 1×(1/2) × 1×(1/2) × 1×(1/3)
  = (1/3)³ (1/6)² (1/2)⁵ ≅ 0.0000322   (< 0.0000338)
(2) Two-State Model
• Model II
  – N = 2

    A = | 0.6  0.4 |      B = | 1/2  1/3  1/6 |   (rows: states; columns: r, g, b)
        | 0.6  0.4 |          | 1/6  1/6  2/3 |

  P(r r g b b g b b b r | λ₂)
  = .5×(1/2) × .6×(1/2) × .6×(1/3) × .4×(2/3) × .4×(2/3) × .6×(1/3) × .4×(2/3) × .4×(2/3) × .4×(2/3) × .6×(1/2)
    + ⋯ (the contributions of all the other state paths)
  = ?
(3) Three-State Models
• N = 3:
  [Diagrams: two alternative three-state topologies with their transition weights (self-loops 0.6, 0.7 and cross transitions 0.2, 0.3, 0.5, 0.1, …)]
The Criterion Is …
• Obtaining the best model λ̂ that maximizes P(X | λ̂)
• The best topology comes from insight and experience ← the number of classes/symbols/samples
A Trained HMM

[Diagram: a three-state model over symbols R, G, B]

π = [ 1.  0.  0. ]

        1    2    3              R    G    B
  1 |  .5   .4   .1 |      1 |  .6   .2   .2 |
A = 2 |  .0   .6   .4 |  B = 2 |  .2   .5   .3 |
  3 |  .0   .0   .0 |      3 |  .0   .3   .7 |
Hidden Markov Model: Example

[Diagram: three pots connected by transition arcs with weights 0.6, 0.3, 0.2, 0.1, 0.6, …]

• N pots containing colored balls
• M distinct colors
• Each pot contains a different mixture of colored balls
HMM: Generation Process
• Sequence generating algorithm
  – Step 1: Pick an initial pot according to some random process
  – Step 2: Randomly pick a ball from the pot and then replace it
  – Step 3: Select another pot according to a random selection process
  – Step 4: Repeat steps 2 and 3

• Markov process: {q(t)};  output process: {f(x|q)}
HMM: Hidden Information
• Now, what is hidden?
  – We can see only the chosen balls
  – We cannot see which pot is selected at each time
  – So the pot selection (state transition) information is hidden
HMM: Formal Definition
• Notation: λ = (A, B, π)
  (1) N: number of states
  (2) M: number of symbols observable in states
      V = { v_1, …, v_M }
  (3) A: state transition probability distribution
      A = { a_ij },  1 ≤ i, j ≤ N
  (4) B: observation symbol probability distribution
      B = { b_i(v_k) },  1 ≤ i ≤ N,  1 ≤ k ≤ M
  (5) π: initial state distribution
      π_i = P(q_1 = i),  1 ≤ i ≤ N
Three Problems
1. Model evaluation problem
   – What is the probability of the observation?
   – Forward algorithm
2. Path decoding problem
   – What is the best state sequence for the observation?
   – Viterbi algorithm
3. Model training problem
   – How to estimate the model parameters?
   – Baum-Welch reestimation algorithm
Solution to the Model Evaluation Problem
(Forward algorithm; backward algorithm)

• Definition
  – Given a model λ and an observation sequence X = x_1, x_2, …, x_T:  P(X | λ) = ?
  – P(X | λ) = Σ_Q P(X, Q | λ) = Σ_Q P(X | Q, λ) P(Q | λ)
    (A path or state sequence: Q = q_1, …, q_T)
• Solution
  – Easy but slow: exhaustive enumeration
    P(X | λ) = Σ_Q P(X, Q | λ) = Σ_Q P(X | Q, λ) P(Q | λ)
             = Σ_Q π_{q_1} a_{q_1 q_2} a_{q_2 q_3} … a_{q_{T-1} q_T} b_{q_1}(x_1) b_{q_2}(x_2) … b_{q_T}(x_T)
    Exhaustive enumeration = combinatorial explosion: O(N^T)
  – Does a smarter solution exist?
    • Yes! Dynamic programming
    • Lattice-structure-based computation
    • Highly efficient: linear in the frame length
Forward Algorithm
• Key idea
  – Span a lattice of N states and T times
  – Keep the sum of the probabilities of all the paths coming into each state i at time t
• Forward probability
  α_t(j) = P(x_1 x_2 … x_t, q_t = S_j | λ)
         = Σ_{Q_t} P(x_1 x_2 … x_t, Q_t = q_1 … q_t | λ)
         = Σ_{i=1}^{N} α_{t-1}(i) a_ij b_j(x_t)
Forward Algorithm
• Initialization:  α_1(i) = π_i b_i(x_1),  1 ≤ i ≤ N
• Induction:       α_t(j) = [ Σ_{i=1}^{N} α_{t-1}(i) a_ij ] b_j(x_t),  1 ≤ j ≤ N,  t = 2, 3, …, T
• Termination:     P(X | λ) = Σ_{i=1}^{N} α_T(i)
Numerical Example: P(RRGB | λ)

Using the trained HMM above (π = [1 0 0]ᵀ), the forward lattice for X = R R G B:

  t=1 (R):  α_1 = (1×.6, 0×.2, 0×.0) = (.6, .0, .0)
  t=2 (R):  α_2 = (.6×.5×.6, .6×.4×.2, .6×.1×.0) = (.18, .048, .0)
  t=3 (G):  α_3(1) = .18×.5×.2 = .018
            α_3(2) = (.18×.4 + .048×.6)×.5 = .0504
            α_3(3) = (.18×.1 + .048×.4)×.3 = .01116
  t=4 (B):  α_4(1) = .018×.5×.2 = .0018
            α_4(2) = (.018×.4 + .0504×.6)×.3 = .01123
            α_4(3) = (.018×.1 + .0504×.4)×.7 = .01537

  P(RRGB | λ) = .0018 + .01123 + .01537 ≅ .0284
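A compact implementation of the forward pass, reproducing the lattice above (function and variable names are illustrative):

```python
import numpy as np

# Trained HMM from the slides: symbols 0=R, 1=G, 2=B.
pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.5, 0.4, 0.1],
              [0.0, 0.6, 0.4],
              [0.0, 0.0, 0.0]])   # state 3 has no outgoing transitions on this slide
B = np.array([[0.6, 0.2, 0.2],
              [0.2, 0.5, 0.3],
              [0.0, 0.3, 0.7]])

def forward(X, pi, A, B):
    """Forward algorithm: returns P(X | lambda) and the alpha lattice."""
    T, N = len(X), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, X[0]]                      # initialization
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, X[t]]    # induction
    return alpha[-1].sum(), alpha                   # termination

p, alpha = forward([0, 0, 1, 2], pi, A, B)          # X = R R G B
print(alpha)   # rows: (.6, 0, 0), (.18, .048, 0), (.018, .0504, .01116), ...
print(p)       # ~0.0284
```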
Backward Algorithm
• Key idea
  – Span a lattice of N states and T times
  – Keep the sum of the probabilities of all the outgoing paths from each state i at time t
• Backward probability
  β_t(i) = P(x_{t+1} x_{t+2} … x_T | q_t = S_i, λ)
         = Σ_{Q_{t+1}} P(x_{t+1} x_{t+2} … x_T, Q_{t+1} = q_{t+1} … q_T | q_t = S_i, λ)
         = Σ_{j=1}^{N} a_ij b_j(x_{t+1}) β_{t+1}(j)
• Initialization:  β_T(i) = 1,  1 ≤ i ≤ N
• Induction:       β_t(i) = Σ_{j=1}^{N} a_ij b_j(x_{t+1}) β_{t+1}(j),  1 ≤ i ≤ N,  t = T-1, T-2, …, 1
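A matching backward pass; as a consistency check, P(X | λ) can also be read off at t = 1 from β (a sketch reusing the model of the forward example):

```python
import numpy as np

pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.5, 0.4, 0.1], [0.0, 0.6, 0.4], [0.0, 0.0, 0.0]])
B = np.array([[0.6, 0.2, 0.2], [0.2, 0.5, 0.3], [0.0, 0.3, 0.7]])
X = [0, 0, 1, 2]                                   # R R G B

def backward(X, A, B):
    """Backward algorithm: beta[t, i] = P(x_{t+1} ... x_T | q_t = i, lambda)."""
    T, N = len(X), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                 # initialization
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, X[t+1]] * beta[t+1])   # induction
    return beta

beta = backward(X, A, B)
print((pi * B[:, X[0]] * beta[0]).sum())           # ~0.0284, same as the forward pass
```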
Solution to the Path Decoding Problem
(State sequence; optimal path; Viterbi algorithm; sequence segmentation)

The Most Probable Path
• Given a model λ and an observation sequence X = x_1, x_2, …, x_T:  P(X, Q* | λ) = ?
  Q* = argmax_Q P(X, Q | λ) = argmax_Q P(X | Q, λ) P(Q | λ)
  (A path or state sequence: Q = q_1, …, q_T)
Viterbi Algorithm
• Purpose
  – An analysis of the internal processing result
  – The best, most likely state sequence
  – Internal segmentation
• Viterbi algorithm
  – Alignment of observations and state transitions
  – Dynamic programming technique
Viterbi Path Idea
• Key idea
  – Span a lattice of N states and T times
  – Keep the probability and the previous node of the most probable path coming into each state i at time t
• Recursive path selection
  – Path probability: δ_{t+1}(j) = max_{1≤i≤N} δ_t(i) a_ij b_j(x_{t+1})
  – Path node:        ψ_{t+1}(j) = argmax_{1≤i≤N} δ_t(i) a_ij
Viterbi Algorithm
• Initialization:
  δ_1(i) = π_i b_i(x_1),  ψ_1(i) = 0,  1 ≤ i ≤ N
• Recursion:
  δ_{t+1}(j) = max_{1≤i≤N} δ_t(i) a_ij b_j(x_{t+1}),  ψ_{t+1}(j) = argmax_{1≤i≤N} δ_t(i) a_ij,  1 ≤ t ≤ T-1,  1 ≤ j ≤ N
• Termination:
  P* = max_{1≤i≤N} δ_T(i),  q*_T = argmax_{1≤i≤N} δ_T(i)
• Path backtracking:
  q*_t = ψ_{t+1}(q*_{t+1}),  t = T-1, …, 1
Numerical Example: P(RRGB, Q* | λ)

Same model as before; now each lattice node keeps the best (max) incoming path instead of the sum:

  t=1 (R):  δ_1 = (1×.6, 0×.2, 0×.0) = (.6, .0, .0)
  t=2 (R):  δ_2 = (.6×.5×.6, .6×.4×.2, .6×.1×.0) = (.18, .048, .0)
  t=3 (G):  δ_3(1) = .18×.5×.2 = .018
            δ_3(2) = max(.18×.4, .048×.6)×.5 = .036
            δ_3(3) = max(.18×.1, .048×.4)×.3 = .00576
  t=4 (B):  δ_4(1) = .018×.5×.2 = .0018
            δ_4(2) = max(.018×.4, .036×.6)×.3 = .00648
            δ_4(3) = max(.018×.1, .036×.4)×.7 = .01008

  P* = max(.0018, .00648, .01008) = .01008, and backtracking gives Q* = 1 1 2 3.
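A sketch of the Viterbi recursion with backtracking, reproducing the result above:

```python
import numpy as np

pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.5, 0.4, 0.1], [0.0, 0.6, 0.4], [0.0, 0.0, 0.0]])
B = np.array([[0.6, 0.2, 0.2], [0.2, 0.5, 0.3], [0.0, 0.3, 0.7]])

def viterbi(X, pi, A, B):
    """Viterbi algorithm: returns P* and the most probable state path Q*."""
    T, N = len(X), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, X[0]]                      # initialization
    for t in range(1, T):
        scores = delta[t-1][:, None] * A            # scores[i, j] = delta_{t-1}(i) a_ij
        psi[t] = scores.argmax(axis=0)              # best predecessor of each state j
        delta[t] = scores.max(axis=0) * B[:, X[t]]  # recursion
    q = [int(delta[-1].argmax())]                   # termination
    for t in range(T - 1, 0, -1):                   # path backtracking
        q.append(int(psi[t][q[-1]]))
    return delta[-1].max(), q[::-1]

p_star, q_star = viterbi([0, 0, 1, 2], pi, A, B)    # X = R R G B
print(p_star)                                       # 0.01008
print([s + 1 for s in q_star])                      # [1, 1, 2, 3], 1-based as on the slide
```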
Solution to the Model Training Problem
(HMM training algorithm; maximum likelihood estimation; Baum-Welch reestimation)

HMM Training Algorithm
• Given an observation sequence X = x_1, x_2, …, x_T
• Find the model parameters λ* = (A, B, π) such that P(X | λ*) ≥ P(X | λ) for all λ
  – Adapt the HMM parameters maximally to the training samples
  – Likelihood of a sample: P(X | λ) = Σ_Q P(X | Q, λ) P(Q | λ)
  – The state transition sequence is hidden!
• No analytical solution
• Baum-Welch reestimation (EM)
  – An iterative procedure that locally maximizes P(X | λ)
  – Convergence proven
  – MLE statistical estimation
Maximum Likelihood Estimation
• MLE "selects those parameters that maximize the probability function of the observed sample."
• [Definition] Maximum likelihood estimate
  – Θ: a set of distribution parameters
  – Given X, Θ* is the maximum likelihood estimate of Θ if f(X | Θ*) = max_Θ f(X | Θ)
MLE Example
• Scenario
  – Known: 3 balls inside a pot (some red, some white)
  – Unknown: R = the number of red balls
  – Observation: two red balls drawn
• Two models
  – P(2 reds | R=2) = C(2,2)·C(1,0) / C(3,2) = 1/3
  – P(2 reds | R=3) = C(3,2) / C(3,2) = 1
• Which model?
  – L(λ_{R=3}) > L(λ_{R=2})
  – Model(R=3) is our choice

MLE Example (Cont.)
• Model(R=3) is the more likely strategy, unless we have a priori knowledge of the system.
• However, without the observation of two red balls, there is no reason to prefer P(λ_{R=3}) to P(λ_{R=2}).
• The ML method chooses the set of parameters that maximizes the likelihood of the given observation.
• It makes the parameters maximally adapted to the training data.
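The two likelihoods are simple counting ratios (drawing without replacement); a quick check:

```python
from math import comb

def likelihood(R, reds_drawn=2, total=3):
    """P(drawing reds_drawn reds | pot holds R reds out of total balls)."""
    return comb(R, reds_drawn) * comb(total - R, 0) / comb(total, reds_drawn)

for R in (2, 3):
    print(R, likelihood(R))   # R=2 -> 0.333..., R=3 -> 1.0; the MLE picks R=3
```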
EM Algorithm for Training
• With λ(t) = <{a_ij}, {b_ik}, π_i>, estimate the EXPECTATION of the following quantities:
  – Expected number of visits to state i
  – Expected number of transitions from i to j
• With those expected quantities, obtain the MAXIMUM-LIKELIHOOD parameters
  λ(t+1) = <{a'_ij}, {b'_ik}, π'_i>
Expected Number of Visits to S_i

γ_t(i) = P(q_t = S_i | X, λ)
       = P(q_t = S_i, X | λ) / P(X | λ)
       = α_t(i) β_t(i) / Σ_j α_t(j) β_t(j)
Expected Number of Transitions

ξ_t(i, j) = P(q_t = S_i, q_{t+1} = S_j | X, λ)
          = α_t(i) a_ij b_j(x_{t+1}) β_{t+1}(j) / Σ_i Σ_j α_t(i) a_ij b_j(x_{t+1}) β_{t+1}(j)
Parameter Reestimation
• MLE parameter estimation

  â_ij = Σ_{t=1}^{T-1} ξ_t(i, j) / Σ_{t=1}^{T-1} γ_t(i)

  b̂_j(v_k) = Σ_{t=1, s.t. x_t = v_k}^{T} γ_t(j) / Σ_{t=1}^{T} γ_t(j)

  π̂_i = γ_1(i)

  – Iterative: P(X | λ(t+1)) ≥ P(X | λ(t))
  – Convergence proven
  – Arrives at a local optimum
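A minimal, unscaled sketch of one Baum-Welch iteration for a single discrete observation sequence (real implementations add scaling or log arithmetic, pool multiple sequences, and guard against zero state occupancy):

```python
import numpy as np

def baum_welch_step(X, pi, A, B):
    """One EM reestimation step; returns (new_pi, new_A, new_B, P(X|lambda))."""
    T, N, M = len(X), len(pi), B.shape[1]
    # E-step: forward-backward lattices.
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * B[:, X[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, X[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, X[t+1]] * beta[t+1])
    px = alpha[-1].sum()
    gamma = alpha * beta / px                       # gamma[t, i] = P(q_t = i | X)
    xi = np.array([alpha[t][:, None] * A * (B[:, X[t+1]] * beta[t+1]) / px
                   for t in range(T - 1)])          # xi[t, i, j]
    # M-step: reestimate the parameters from the expected counts.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]   # assumes occupancy > 0
    new_B = np.zeros((N, M))
    for k in range(M):
        new_B[:, k] = gamma[np.array(X) == k].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B, px
```

Iterating this step never decreases P(X | λ), which is exactly the convergence property stated above.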
Other Issues
• Other training methods
  – MAP (Maximum A Posteriori) estimation, for adaptation
  – MMI (Maximum Mutual Information) estimation
  – MDI (Minimum Discrimination Information) estimation
  – Viterbi training
  – Discriminant/reinforcement training
• Other types of parametric structure
  – Continuous-density HMM (CHMM)
    • More accurate, but many more parameters to train
  – Semi-continuous HMM
    • A mix of CHMM and DHMM, using parameter sharing
  – State-duration HMM
    • More accurate temporal behavior
• Other extensions
  – HMM+NN, autoregressive HMM
  – 2D models: MRF, hidden mesh model, pseudo-2D HMM
Graphical DHMM and CHMM
• Models for '5' and '2'
  [Figure: DHMM and CHMM models trained for the digits '5' and '2']
Pattern Classification Using HMMs
• Pattern classification
• Extension of HMM structure
• Extension of HMM training method
• Practical issues of HMM
• HMM history
Pattern Classification
• Construct one HMM per class k: λ_1, …, λ_N
• Train each HMM λ_k with its samples D_k
  – Baum-Welch reestimation algorithm
• Calculate the model likelihoods of λ_1, …, λ_N for the observation X
  – Forward algorithm: P(X | λ_k)
• Find the model with the maximum a posteriori probability:
  λ* = argmax_k P(λ_k | X) = argmax_k P(λ_k) P(X | λ_k) / P(X) = argmax_k P(λ_k) P(X | λ_k)
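A sketch of the resulting classifier, using a scaled forward pass for numerical stability (the model-list format and function names are illustrative assumptions):

```python
import numpy as np

def log_forward(X, pi, A, B):
    """Forward algorithm with per-frame scaling; returns log P(X | lambda)."""
    alpha = pi * B[:, X[0]]
    logp = 0.0
    for t in range(1, len(X) + 1):
        c = alpha.sum()                       # scaling coefficient for frame t
        logp += np.log(c)
        alpha = alpha / c
        if t < len(X):
            alpha = (alpha @ A) * B[:, X[t]]
    return logp

def classify(X, models, log_priors):
    """MAP decision: argmax_k log P(lambda_k) + log P(X | lambda_k).
    models is a list of (pi, A, B) tuples, one per class."""
    scores = [lp + log_forward(X, *m) for m, lp in zip(models, log_priors)]
    return int(np.argmax(scores))
```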
Extension of HMM Structure
• Extension of the state transition parameters
  – Duration-modeling HMM
    • More accurate temporal behavior
  – Transition-output HMM
    • Output functions are attached to transitions rather than states
• Extension of the observation parameters
  – Segmental HMM
    • More accurate modeling of trajectories at each state, but more computational cost
  – Continuous-density HMM (CHMM)
    • The output distribution is modeled with a mixture of Gaussians
  – Semi-continuous HMM (tied-mixture HMM)
    • A mix of continuous and discrete HMM, sharing Gaussian components
Extension of HMM Training Method
• Maximum Likelihood Estimation (MLE)*
  – Maximize the probability of the observed samples
• Maximum Mutual Information (MMI) method
  – Information-theoretic measure
  – Maximize the average mutual information:
    I* = max_λ Σ_{v=1}^{V} [ log P(X^v | λ_v) − log Σ_{w=1}^{V} P(X^v | λ_w) ]
  – Maximizes discrimination power by training the models together
• Minimum Discrimination Information (MDI) method
  – Minimize the discrimination information (cross-entropy) between pd(signal) and pd(HMM)
  – Uses the generalized Baum algorithm
Practical Issues of HMM
• Architectural and behavioral choices
  – The unit of modeling: a design choice
  – Type of model: ergodic, left-right, parallel path
  – Number of states
  – Observation symbols: discrete or continuous; mixture number
• Initial estimates
  – A, π: random or uniform initial values are adequate
  – B: good initial estimates are essential for CHMM

[Diagrams: ergodic, left-right, and parallel-path topologies]
Practical Issues of HMM (Cont.)
• Scaling
  α_t(i) = [ ∏_{s=1}^{t-1} a_{q_s q_{s+1}} ] [ ∏_{s=1}^{t} b_{q_s}(x_s) ]
  – heads exponentially to zero: scale it (or use log-likelihoods); see the sketch below
• Multiple observation sequences
  – Accumulate the expected frequencies with weight P(X^(k) | λ)
• Insufficient training data
  – Deleted interpolation with a desired model and a smaller model
  – Output probability smoothing (by local perturbation of symbols)
  – Output probability tying between different states
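As an alternative to coefficient scaling, the forward recursion can be run entirely in the log domain; a sketch using SciPy's logsumexp:

```python
import numpy as np
from scipy.special import logsumexp

def forward_log(X, pi, A, B):
    """Log-domain forward pass; avoids underflow without scaling coefficients."""
    with np.errstate(divide='ignore'):        # log(0) -> -inf is acceptable here
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    log_alpha = log_pi + log_B[:, X[0]]
    for t in range(1, len(X)):
        log_alpha = logsumexp(log_alpha[:, None] + log_A, axis=0) + log_B[:, X[t]]
    return logsumexp(log_alpha)               # log P(X | lambda)
```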
Practical Issues of HMM (Cont.)
• HMM topology optimization
  – What to optimize
    • Number of states
    • Number of Gaussian mixtures per state
    • Transitions
  – Methods
    • Heuristic methods
      – Number of states from the average (or mode) length of the input frames
    • Split/merge
      – Number of states from iterative splitting/merging
    • Model selection criteria
      – Number of states and mixtures at the same time
      – ML (maximum likelihood)
      – BIC (Bayesian information criterion)
      – HBIC (HMM-oriented BIC)
      – DIC (discriminative information criterion)
      – …
HMM Applications and Software
• On-line handwriting recognition
• Speech applications
• HMM toolbox for Matlab
• HTK (Hidden Markov Model Toolkit)
HMM Applications
• On-line handwriting recognition
  – BongNet: an HMM network-based handwriting recognition system
• Speech applications
  – CMU Sphinx: a speech recognition toolkit
  – 언어과학 Dr.Speaking: an English pronunciation correction system
BongNet
• Developed within the CAIR (Center for Artificial Intelligence Research) consortium at KAIST
  – The name "BongNet" comes from its main inventor, Bong-Kee Shin
• Prominent performance for unconstrained on-line Hangul recognition
• Modeling of Hangul handwriting
  – Considers ligatures between letters as well as consonants and vowels
    • (initial consonant) + (ligature) + (vowel)
    • (initial consonant) + (ligature) + (vowel) + (ligature) + (final consonant)
  – Connects letter models and ligature models using the Hangul composition principle
  – Further improvements:
    • BongNet+: incorporating structural information explicitly
    • Circular BongNet: successive character recognition
    • Unified BongNet: Hangul and alphanumeric recognition
    • Dictionary look-up

• Network structure
  [Figure: the BongNet letter/ligature model network]
A Modification to BongNet
• 16-direction chain code → structure code generation
• Structure code sequence
  – Carries structural information not easily acquired from a chain code sequence
  – Includes length, direction, and bending
[Table: per-segment distance, rotation, straightness, and direction measurements with the resulting structure codes]
Dr.Speaking
[Screenshots: the Dr.Speaking user interface]
System Architecture (시스템 구조)

[Diagram: speech → feature extraction & acoustic analysis → decoder → acoustic score → score estimation → evaluation score]

Components:
• Target speech DB spoken by native speakers
• Target speech DB spoken by non-native speakers (mis-pronunciations)
• Acoustic model (phoneme units of native speech)
• Acoustic model (phoneme units of non-native speech)
• Language model (phoneme units)
• Target pronunciation dictionary
• Target mis-pronunciation dictionary (from analysis of non-native speech patterns)
• Acoustic modeling
  – Separate native and non-native HMMs for each phoneme
  [Diagram: standard phoneme models A, B, C and their error variants]
• Language modeling
  [Diagram: standard vs. error pronunciation networks, with replacement error modeling, deletion error modeling, and insertion error modeling]
• Word-level pronunciation correction: phoneme-level error pattern detection
  – Substituted, inserted, and deleted phones; diphthong splitting; stress errors; length errors
• Sentence pronunciation practice
  – Visual correction feedback and level-based evaluation:
    1. Accuracy: evaluated against standard pronunciation patterns and various types of erroneous pronunciation patterns
    2. Intonation: intonation-related speech signals are extracted and evaluated against standard and error patterns
    3. Fluency: evaluated on various factors such as liaison, pausing, and utterance intervals
Software Tools for HMM
• HMM toolbox for Matlab
  – Developed by Kevin Murphy
  – Freely downloadable software written in Matlab (hmm… Matlab itself is not free!)
  – Easy to use: flexible data structures and fast prototyping in Matlab
  – Somewhat slow performance due to Matlab
  – Download: http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html
• HTK (Hidden Markov Model Toolkit)
  – Developed by the Speech Vision and Robotics Group of Cambridge University
  – Freely downloadable software written in C
  – Useful for speech recognition research: a comprehensive set of programs for training, recognizing and analyzing speech signals
  – Powerful and comprehensive, but somewhat complicated
  – Download: http://htk.eng.cam.ac.uk/
What is HTK?
• Hidden Markov Model Toolkit
• A set of tools for training and evaluating HMMs
• Primarily used in automatic speech recognition and economic modeling
• Modular implementation, (relatively) easy to extend
HTK Software Architecture
– HShell: user input/output and interaction with the OS
– HLabel: label files
– HLM: language models
– HNet: networks and lattices
– HDic: dictionaries
– HVQ: VQ codebooks
– HModel: HMM definitions
– HMem: memory management
– HGraf: graphics
– HAdapt: adaptation
– HRec: main recognition processing functions
Generic Properties of an HTK Tool
• Designed to run with a traditional command-line style interface
• Each tool has a number of required arguments plus optional arguments

  HFoo -T 1 -f 34.3 -a -s myfile file1 file2

  – This tool has two main arguments, file1 and file2, plus four optional arguments
  – -f: real number, -T: integer, -s: string, -a: no following value

  HFoo -C config -f 34.3 -a -s myfile file1 file2

  – HFoo will load the parameters stored in the configuration file config during its initialization procedures
  – Configuration parameters can sometimes be used as an alternative to command-line arguments
The Toolkit
• There are 4 main phases
  – Data preparation, training, testing and analysis
• The toolkit accordingly provides
  – Data preparation tools
  – Training tools
  – Recognition tools
  – Analysis tools
  [Figure: HTK processing stages]
Data Preparation Tools
• A set of speech data files and their associated transcriptions is required
• The data must be converted into an appropriate parametric form
• HSLab: used both to record speech and to manually annotate it with the required transcriptions
• HCopy: performs the required encoding while simply copying each file
• HList: used to check the contents of any speech file
• HLEd: edits label files and can output them to a single Master Label File (MLF), which is usually more convenient for subsequent processing
• HLStats: gathers and displays statistics on label files
• HQuant: used to build a VQ codebook in preparation for building a discrete-probability HMM system
Training Tools
• If some speech data are available for which the locations of the sub-word boundaries have been marked, they can be used as bootstrap data
• HInit and HRest provide isolated-word-style training using the fully labeled bootstrap data
• Each of the required HMMs is generated individually
Training Tools (cont'd)
• HInit: iteratively computes an initial set of parameter values using a segmental k-means procedure
• HRest: processes fully labeled bootstrap data using Baum-Welch re-estimation
• HCompV: initializes all phone models identically, with state means and variances equal to the global speech mean and variance
• HERest: performs a single Baum-Welch re-estimation of the whole set of HMM phone models simultaneously
• HHEd: applies a variety of parameter tyings and increments the number of mixture components in specified distributions
• HEAdapt: adapts HMMs to better model the characteristics of particular speakers using a small amount of training or adaptation data
Recognition Tools
• HVite: uses the token-passing algorithm to perform Viterbi-based speech recognition
• HBuild: allows sub-networks to be created and used within higher-level networks
• HParse: converts an EBNF grammar into the equivalent word network
• HSGen: computes the empirical perplexity of the task (and can generate example sentences from a word network)
• HDMan: dictionary management tool
Analysis Tools
• HResults
  – Uses dynamic programming to align the two transcriptions and counts substitution, deletion and insertion errors
  – Provides speaker-by-speaker breakdowns, confusion matrices and time-aligned transcriptions
  – Computes Figure of Merit scores and Receiver Operating Curve information
HTK Example
• Isolated word recognition
  [Figures: isolated word recognition walkthrough]
Speech Recognition Example Using HTK
• A recognizer for a voice dialing application
  – Goal of the system
    • Provide a voice-operated interface for phone dialing
  – Recognizer
    • Digit strings and a limited set of names
    • Sub-word (phone) based
Step 1: Create the gram file.
The gram file defines the grammar to be used; it describes the overall task scenario.

---------------------------------------------- gram ------------------------------------------------
$digit = 일 | 이 | 삼 | 사 | 오 | ..... | 구 | 공;
$name = 철수 | 만수 | ..... | 길동;
( SENT-START ( 누르기 <$digit> | 호출 $name ) SENT-END )
-----------------------------------------------------------------------------------------------------

Each line beginning with $ defines a word group ($digit lists the Korean digit words "one" through "nine" plus "zero"; $name lists person names; 누르기 = "dial", 호출 = "call"). The bottom line is the grammar itself: < > marks repeatable content, | means "or", and every sentence starts with SENT-START and ends with SENT-END.

Step 2: Run "HParse gram wdnet".
HParse reads the gram file and generates the word network wdnet.
Step 3: Create dict.
The dictionary defines the phone sequence of each word:

----------------------------------- dict -----------------------------------
SENT-END    []  sil
SENT-START  []  sil
공   kc oxc ngc sp
구   kc uxc sp
....
영희  jeoc ngc hc euic sp
....
팔   phc axc lc sp
호출  hc oxc chc uxc lc sp
------------------------------------------------------------------------------
Step 4: Run "HSGen -l -n 200 wdnet dict".
Using wdnet and dict, HSGen generates 200 legal example sentences for training.

Step 5: Record the training sentences produced by HSGen, using HSLab or an ordinary recording tool.
Step 6: Create words.mlf.
words.mlf collects the word-level transcriptions of the recorded speech files:

--------------------------------------- words.mlf ---------------------------------------
#!MLF!#
"*/s0001.lab"
누르기
공
이
칠
공
구
일
.
"*/s0002.lab"
호출
영희
.
.....
-------------------------------------------------------------------------------------------
Step 7: Create mkphones0.led.
mkphones0.led stores the edit commands used when replacing each word in words.mlf with its phones:

------------------------- mkphones0.led -------------------------
EX
IS sil sil
DE sp
-------------------------------------------------------------------

These commands expand words into phones (EX), insert sil at both ends of each sentence (IS), and delete sp (DE).
Step 8: Run "HLEd -d dict -i phones0.mlf mkphones0.led words.mlf".
HLEd applies mkphones0.led to words.mlf and produces phones0.mlf, a transcription file in which every word has been converted to phone symbols:

------------------------------------ phones0.mlf ------------------------------------
#!MLF!#
"*/s0001.lab"
sil
nc
uxc
rc
...
oxc
ngc
kc
.
"*/s0002.lab"
.....
---------------------------------------------------------------------------------------
Step 9: Create config.
The config file is the set of options used when converting the speech data into MFC (feature) data:

------------------------- config -------------------------
TARGETKIND = MFCC_0
TARGETRATE = 100000.0
SOURCEFORMAT = NOHEAD
SOURCERATE = 1250
WINDOWSIZE = 250000.0
......
------------------------------------------------------------
Step 10: Create codetr.scp.
codetr.scp lists, side by side, each recorded speech file and the name of the *.mfc file it will be converted to:

------------------------ codetr.scp ------------------------
DB\s0001.wav DB\s0001.mfc
DB\s0002.wav DB\s0002.mfc
...
DB\s0010.wav DB\s0010.mfc
......
--------------------------------------------------------------

Step 11: Run "HCopy -T 1 -C config -S codetr.scp".
HCopy uses config and codetr.scp to convert the speech files into MFC files; each MFC file holds the feature values extracted from the speech according to the config options.
Step 12: Create proto and train.scp.
The proto file defines the model topology for HMM training; for a phone-based system, a 3-state left-right topology is defined:

----------------------------------- proto -----------------------------------
~o <VecSize> 39 <MFCC_0_D_A>
~h "proto"
<BeginHMM>
  <NumStates> 5
  <State> 2
    <Mean> 39
      0.0 0.0 0.0 0.0 ....
    <Variance> 39
      ...
  ...
  <TransP> 5
    ....
<EndHMM>
-------------------------------------------------------------------------------

train.scp is the file listing the generated MFC files.
Step 13: Create config1.
For HMM training, copy config and change the option MFCC_0 to MFCC_0_D_A (adding delta and acceleration coefficients).

Step 14: Run "HCompV -C config1 -f 0.01 -m -S train.scp -M hmm0 proto".
HCompV creates proto and vFloors files in the hmm0 folder. These are used to build the macros and hmmdefs files; hmmdefs is created by replicating the proto definition for each phone:

-------------------------------------- hmmdefs --------------------------------------
~h "axc"
...
~h "chc"
...
----------------------------------------------------------------------------------------
Add ~o to the vFloors file (from hmm0/vFloors, following the proto header) to create the macros file:

--------------------------------- macros ---------------------------------
~o <VecSize> 39 <MFCC_0_D_A>
~v "varFloor1"
  <Variance> 39
    ...
-----------------------------------------------------------------------------
Step 15: Run
"HERest -C config1 -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm0\macros -H hmm0\hmmdefs -M hmm1 monophones0".
HERest writes re-estimated macros and hmmdefs files into the hmm1 folder. Run HERest again to produce hmm2 from hmm1, and repeat for hmm3, hmm4, ….

Step 16: Recognize the test data:
"HVite -H hmm7/macros -H hmm7/hmmdefs -S test.scp -l '*' -i recout.mlf -w wdnet -p 0.0 -s 5.0 dict monophones"
Summary
• Markov model
  – 1st-order Markov assumption on state transitions
  – 'Visible': the observation sequence determines the state transition sequence
• Hidden Markov model
  – 1st-order Markov assumption on state transitions
  – 'Hidden': an observation sequence may result from many possible state transition sequences
  – Fits very well to the modeling of spatially and temporally variable signals
  – Three algorithms: model evaluation, most probable path decoding, model training
• HMM applications and software
  – Handwriting and speech applications
  – HMM toolbox for Matlab
  – HTK
• Acknowledgement
  – In preparing this HMM tutorial, I thank Prof. Bong-Kee Shin (신봉기) of Pukyong National University and Dr. Sung-Jung Cho (조성정) of Samsung Advanced Institute of Technology for permitting the use of substantial portions of their earlier tutorial materials.
References
• Hidden Markov Model
  – L.R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proc. IEEE, Vol. 77, No. 2, pp. 257-286, 1989.
  – L.R. Bahl et al., "A Maximum Likelihood Approach to Continuous Speech Recognition", IEEE Trans. PAMI, pp. 179-190, 1983.
  – M. Ostendorf, "From HMM's to Segment Models: a Unified View of Stochastic Modeling for Speech Recognition", IEEE Trans. Speech and Audio Processing, pp. 360-378, Sep. 1996.
• HMM Tutorials
  – 신봉기 (Bong-Kee Shin), "HMM Theory and Applications", tutorial at the 2003 Spring Workshop of the Korean Computer Vision and Pattern Recognition Society (in Korean).
  – 조성정 (Sung-Jung Cho), ILVB Tutorial, Korean Information Science Society, Seoul, April 16, 2005 (in Korean).
  – Sam Roweis, "Hidden Markov Models (SCIA Tutorial 2003)", http://www.cs.toronto.edu/~roweis/notes/scia03h.pdf
  – Andrew Moore, "Hidden Markov Models", http://www-2.cs.cmu.edu/~awm/tutorials/hmm.html
• HMM Applications
  – B.-K. Sin, J.-Y. Ha, S.-C. Oh, Jin H. Kim, "Network-Based Approach to Online Cursive Script Recognition", IEEE Trans. Systems, Man, and Cybernetics, Part B, Vol. 29, No. 2, pp. 321-328, 1999.
  – J.-Y. Ha, "Structure Code for HMM Network-Based Hangul Recognition", 18th Int. Conf. on Computer Processing of Oriental Languages, pp. 165-170, 1999.
  – 김무중, 김효숙, 김선주, 김병기, 하진영, 권철홍, "Development and Performance Evaluation of an English Pronunciation Correction System for Koreans", 말소리 (Malsori), No. 46, pp. 87-102, 2003 (in Korean).
• HMM Topology Optimization
  – H. Singer and M. Ostendorf, "Maximum likelihood successive state splitting", ICASSP, pp. 601-604, 1996.
  – A. Stolcke and S. Omohundro, "Hidden Markov model induction by Bayesian model merging", Advances in NIPS, pp. 11-18, San Mateo, CA: Morgan Kaufmann, 1993.
  – A. Biem, J.-Y. Ha, J. Subrahmonia, "A Bayesian Model Selection Criterion for HMM Topology Optimization", ICASSP, pp. I-989 - I-992, IEEE Signal Processing Society, 2002.
  – A. Biem, "A Model Selection Criterion for Classification: Application to HMM Topology Optimization", ICDAR 2003, pp. 204-210, 2003.
• HMM Software
  – Kevin Murphy, "HMM toolbox for Matlab", freely downloadable software written in Matlab, http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html
  – Speech Vision and Robotics Group, Cambridge University, "HTK (Hidden Markov Model Toolkit)", freely downloadable software written in C, http://htk.eng.cam.ac.uk/
  – CMU Sphinx, http://cmusphinx.sourceforge.net/html/cmusphinx.php