A Tutorial on Hidden Markov Models

KAIST, February 1-3, 2006
February 2, 2006
Jin-Young Ha ([email protected])
Dept. of Computer Science, Kangwon National University
Contents
• Introduction
• Markov Model
• Hidden Markov Model (HMM)
• Three algorithms of HMM
  – Model evaluation
  – Most probable path decoding
  – Model training
• Pattern classification using HMMs
• HMM applications and software
• Summary
• References
Sequential Data
• Examples
  – Speech data ("하나 둘 셋", i.e., "one, two, three")
  – Handwriting data
Characteristics of Such Data
• Data are sequentially generated according to time or index
• Spatial information along time or index
• Often highly variable, but with an embedded structure
• Information is contained in the structure
Advantages of HMM for Sequential Data
• Natural model structure: doubly stochastic process
  – Transition parameters model temporal variability
  – Output distributions model spatial variability
• Efficient and good modeling tool for
  – Sequences with temporal constraints
  – Spatial variability along the sequence
  – Real-world complex processes
• Efficient evaluation, decoding and training algorithms
  – Mathematically strong
  – Computationally efficient
• Proven technology!
  – Success stories in many applications
• Tools already exist
  – HTK (Hidden Markov Model Toolkit)
  – HMM toolbox for Matlab
Successful Application Areas of HMM
• On-line handwriting recognition
• Speech recognition and segmentation
• Gesture recognition
• Language modeling
• Motion video analysis and tracking
• Protein sequence/gene sequence alignment
• Stock price prediction
• …
What's HMM?

Hidden Markov Model = "Hidden" + "Markov model"
• What is 'Markov model'?
• What is 'hidden'?
Markov Model
• Scenario
• Graphical representation
• Definition
• Sequence probability
• State probability
Markov Model: Scenario
• Classify the weather into three states
  – State 1: rain or snow
  – State 2: cloudy
  – State 3: sunny
• By carefully examining the weather of some city for a long time, we found the following weather change pattern:

  Today \ Tomorrow   Rain/Snow   Cloudy   Sunny
  Rain/Snow            0.4        0.3      0.3
  Cloudy               0.2        0.6      0.2
  Sunny                0.1        0.1      0.8

• Assumption: tomorrow's weather depends only on today's weather!
Markov Model: Graphical Representation
• Visual illustration with a state diagram
  [Diagram: three states (1: rain, 2: cloudy, 3: sunny) with self-loops 0.4, 0.6, 0.8 and cross transitions 0.3, 0.3, 0.2, 0.2, 0.1, 0.1]
• Each state corresponds to one observation
• The weights of the outgoing edges of each state sum to one
Markov Model: Definition
• Observable states: {1, 2, …, N}
• Observed sequence: q_1, q_2, …, q_T
• 1st-order Markov assumption:
  P(q_t = j | q_{t-1} = i, q_{t-2} = k, …) = P(q_t = j | q_{t-1} = i)
  (In the Bayesian network representation, each q_t depends only on q_{t-1}.)
• Stationarity:
  P(q_t = j | q_{t-1} = i) = P(q_{t+l} = j | q_{t+l-1} = i)
Markov Model: Definition (Cont.)
• State transition matrix

        | a_11  a_12  …  a_1N |
  A  =  | a_21  a_22  …  a_2N |
        |  ⋮     ⋮          ⋮  |
        | a_N1  a_N2  …  a_NN |

  – where a_ij = P(q_t = j | q_{t-1} = i),  1 ≤ i, j ≤ N
  – with constraints a_ij ≥ 0 and Σ_{j=1}^{N} a_ij = 1
• Initial state probability
  π_i = P(q_1 = i),  1 ≤ i ≤ N
Markov Model: Sequence Probability
• Conditional probability: P(A, B) = P(A | B) P(B)
• Sequence probability of a Markov model:
  P(q_1, q_2, …, q_T)
  = P(q_1) P(q_2 | q_1) … P(q_{T-1} | q_1, …, q_{T-2}) P(q_T | q_1, …, q_{T-1})   (chain rule)
  = P(q_1) P(q_2 | q_1) … P(q_{T-1} | q_{T-2}) P(q_T | q_{T-1})   (1st-order Markov assumption)
Markov Model: Sequence Probability (Cont.)
• Question: What is the probability that the weather for the next 7 days will be "sun-sun-rain-rain-sun-cloudy-sun", when today is sunny?
  (S_1: rain/snow, S_2: cloudy, S_3: sunny)

  P(O | model) = P(S_3, S_3, S_3, S_1, S_1, S_3, S_2, S_3 | model)
  = P(S_3) · P(S_3|S_3) · P(S_3|S_3) · P(S_1|S_3) · P(S_1|S_1) · P(S_3|S_1) · P(S_2|S_3) · P(S_3|S_2)
  = π_3 · a_33 · a_33 · a_31 · a_11 · a_13 · a_32 · a_23
  = 1 · (0.8)(0.8)(0.1)(0.4)(0.3)(0.1)(0.2)
  = 1.536 × 10⁻⁴
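A minimal sketch of this computation in NumPy (the 0-based state indexing and the helper name are illustrative choices, not from the slides):

```python
import numpy as np

# Weather Markov chain from the slides: states 0=rain/snow, 1=cloudy, 2=sunny.
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

def sequence_probability(states, A, pi):
    """P(q_1, ..., q_T) under a first-order Markov chain."""
    p = pi[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= A[prev, cur]
    return p

pi = np.array([0.0, 0.0, 1.0])           # "today is sunny": all mass on state 2
obs = [2, 2, 2, 0, 0, 2, 1, 2]           # sun-sun-sun-rain-rain-sun-cloudy-sun
print(sequence_probability(obs, A, pi))  # 1.536e-04
```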
Markov Model: State Probability
• State probability at time t: P(q_t = i)
• Simple but slow algorithm:
  – Probability of one path that ends at state i at time t:
    Q_t(i) = (q_1, q_2, …, q_t = i)
    P(Q_t(i)) = π_{q_1} ∏_{k=2}^{t} P(q_k | q_{k-1})
  – Summation of the probabilities of all paths that end at i at time t:
    P(q_t = i) = Σ_{all Q_t(i)'s} P(Q_t(i))
  – Exponential time complexity: O(N^t)
Markov Model: State Probability (Cont.)
• Efficient algorithm (lattice algorithm)
  – Each lattice node stores the sum of the probabilities of the partial paths reaching it
  – Recursive path probability calculation:
    P(q_t = i) = Σ_{j=1}^{N} P(q_{t-1} = j, q_t = i)
               = Σ_{j=1}^{N} P(q_{t-1} = j) P(q_t = i | q_{t-1} = j)
               = Σ_{j=1}^{N} P(q_{t-1} = j) · a_ji
  – Time complexity: O(N²t)
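The recursion is a single vector-matrix product per time step; a sketch (the uniform initial distribution is an assumption for illustration):

```python
import numpy as np

def state_probability(pi, A, t):
    """P(q_t = i) for all i via the lattice recursion; O(N^2 t) total work.
    t is 1-based: t = 1 returns pi itself."""
    p = pi.copy()
    for _ in range(t - 1):
        p = p @ A        # p[i] <- sum_j p[j] * A[j, i]
    return p

A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])
pi = np.ones(3) / 3      # assumed uniform start, for illustration only
print(state_probability(pi, A, 10))   # approaches the chain's stationary distribution
```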
Hidden Markov Model
• Example
• Generation process
• Definition
• Model evaluation algorithm
• Path decoding algorithm
• Training algorithm
Time Series Example
• Representation
  – X = x_1 x_2 x_3 x_4 x_5 … x_{T-1} x_T
      = s φ p iy iy iy φ φ ch ch ch ch
Analysis Methods
• Probability-based analysis?
  P(s φ p iy iy iy φ φ ch ch ch ch) = ?
• Method I
  P(s) P(φ)³ P(p) P(iy)³ P(ch)⁴
  – Observations are assumed independent; no time/order information
  – A poor model of temporal structure
  – Model size = |V| = N
Analysis Methods (Cont.)
• Method II
  P(s) P(φ|s) P(p|φ) P(iy|p) P(iy|iy)² P(φ|iy) P(φ|φ) P(ch|φ) P(ch|ch)³
  – A simple model of an ordered sequence
  – A symbol depends only on the immediately preceding one:
    P(x_t | x_1 x_2 x_3 … x_{t-1}) = P(x_t | x_{t-1})
  – A |V|×|V| matrix model
    • 50×50 – not very bad …
    • 10⁵×10⁵ – doubly outrageous!!
The Problem
• "What you see is the truth"?
  – Not quite a valid assumption
  – There are often errors or noise
    • Noisy sound, sloppy handwriting, ungrammatical sentences
  – There may be some underlying "truth" process
    • An underlying hidden sequence
    • Obscured by the incomplete observation
Another Analysis Method
• Method III
  – What you see is a clue to what lies behind, which is not known a priori
    • The source that generated the observation
    • The source evolves and generates characteristic observation sequences

  q_0 → q_1 → q_2 → ⋯ → q_T

  P(s, q_1) P(s, q_2 | q_1) P(φ, q_3 | q_2) ⋯ P(ch, q_T | q_{T-1}) = ∏_t P(x_t, q_t | q_{t-1})

  Σ_Q P(s, q_1) P(s, q_2 | q_1) P(φ, q_3 | q_2) ⋯ P(ch, q_T | q_{T-1}) = Σ_Q ∏_t P(x_t, q_t | q_{t-1})
The Auxiliary Variable
• q_t ∈ S = {1, …, N}
• N is also conjectured
• {q_t : t ≥ 0} is conjectured, not visible
  – Q = q_1 q_2 … q_T
  – Q is Markovian:
    P(q_1 q_2 … q_T) = P(q_1) P(q_2 | q_1) … P(q_T | q_{T-1})
  – a "Markov chain"
Summary of the Concept

P(X) = Σ_Q P(X, Q)
     = Σ_Q P(Q) P(X | Q)
     = Σ_Q P(q_1 q_2 … q_T) P(x_1 x_2 … x_T | q_1 q_2 … q_T)
     = Σ_Q [ ∏_{t=1}^{T} P(q_t | q_{t-1}) ] [ ∏_{t=1}^{T} p(x_t | q_t) ]
          (Markov chain process)       (output process)

(with the convention P(q_1 | q_0) ≡ P(q_1))
Hidden Markov Model
• An HMM is a doubly stochastic process
  – A stochastic chain process: {q(t)}
  – An output process: {f(x|q)}
• It is also called
  – a hidden Markov chain
  – a probabilistic function of a Markov chain
HMM Characterization
• λ = (A, B, π)
  – A: state transition probability, { a_ij | a_ij = p(q_{t+1} = j | q_t = i) }
  – B: symbol output/observation probability, { b_j(v) | b_j(v) = p(x = v | q_t = j) }
  – π: initial state distribution, { π_i | π_i = p(q_1 = i) }
• Model evaluation under λ:
  P(X | λ) = Σ_Q P(Q | λ) P(X | Q, λ)
           = Σ_Q π_{q_1} a_{q_1 q_2} a_{q_2 q_3} … a_{q_{T-1} q_T} b_{q_1}(x_1) b_{q_2}(x_2) … b_{q_T}(x_T)
Graphical Example

π = [ 1.0  0  0  0 ]

[Diagram: a four-state left-to-right model over symbols s, p, iy, ch, with self-loops 0.6, 0.5, 0.7, 1.0 and forward transitions 0.4, 0.5, 0.3]

A (state transition):
        1    2    3    4
  1 |  0.6  0.4  0.0  0.0 |
  2 |  0.0  0.5  0.5  0.0 |
  3 |  0.0  0.0  0.7  0.3 |
  4 |  0.0  0.0  0.0  1.0 |

B (symbol output; columns s, p, iy, ch, …):
        s    p    iy   ch   …
  1 |  0.6  0.2  0.2  0.0  … |
  2 |  0.2  0.5  0.3  0.0  … |
  3 |  0.0  0.1  0.8  0.1  … |
  4 |  0.0  0.2  0.2  0.6  … |
Data Interpretation

P(s s p p iy iy iy ch ch ch | λ) = Σ_Q P(s s p p iy iy iy ch ch ch, Q | λ)
                                 = Σ_Q P(Q | λ) p(s s p p iy iy iy ch ch ch | Q, λ)

Let Q = 1 1 2 2 3 3 3 4 4 4. Then

P(Q | λ) p(s s p p iy iy iy ch ch ch | Q, λ)
  = (1×.6) × (.6×.6) × (.4×.5) × (.5×.5) × (.5×.8) × (.7×.8)² × (.3×.6) × (1.0×.6)²
  ≅ 0.0000878

Summing over all N^T paths this way costs about 2T·N^T multiplications – too many!
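Both the single-path term and the brute-force sum over all N^T paths can be checked directly. In this sketch the B entries beyond those shown on the slide are assumed to be zero:

```python
import numpy as np
from itertools import product

# Left-to-right model from the "Graphical Example" slide; symbols 0=s, 1=p, 2=iy, 3=ch.
pi = np.array([1.0, 0.0, 0.0, 0.0])
A = np.array([[0.6, 0.4, 0.0, 0.0],
              [0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 0.7, 0.3],
              [0.0, 0.0, 0.0, 1.0]])
B = np.array([[0.6, 0.2, 0.2, 0.0],     # unshown symbol columns assumed zero
              [0.2, 0.5, 0.3, 0.0],
              [0.0, 0.1, 0.8, 0.1],
              [0.0, 0.2, 0.2, 0.6]])

X = [0, 0, 1, 1, 2, 2, 2, 3, 3, 3]      # s s p p iy iy iy ch ch ch

def joint(X, Q):
    """P(X, Q | lambda) for one state path Q."""
    p = pi[Q[0]] * B[Q[0], X[0]]
    for t in range(1, len(X)):
        p *= A[Q[t-1], Q[t]] * B[Q[t], X[t]]
    return p

print(joint(X, [0, 0, 1, 1, 2, 2, 2, 3, 3, 3]))   # ~8.78e-05, as above

# Brute-force evaluation: sum over all 4^10 paths (feasible only at toy sizes).
print(sum(joint(X, Q) for Q in product(range(4), repeat=len(X))))
```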
Issues in HMM
• Intuitive decisions
  1. Number of states (N)
  2. Topology (state interconnection)
  3. Number of observation symbols (V)
• Difficult problems
  4. Efficient computation methods
  5. Probability parameters (λ)
The Number of States
• How many states?
  – Model size
  – Model topology/structure
• Factors
  – Pattern complexity/length and variability
  – The number of samples
• Example observation: r r g b b g b b b r
(1) The Simplest Model
• Model I
  – N = 1, a_11 = 1.0
  – B = [1/3, 1/6, 1/2]   (output probabilities of r, g, b)

  P(r r g b b g b b b r | λ₁)
  = 1×(1/3) × 1×(1/3) × 1×(1/6) × 1×(1/2) × 1×(1/2) × 1×(1/6) × 1×(1/2) × 1×(1/2) × 1×(1/2) × 1×(1/3)
  = (1/3)³ (1/6)² (1/2)⁵ ≅ 0.0000322   (< 0.0000338)
(2) Two-State Model
• Model II
  – N = 2

    A = | 0.6  0.4 |      B = | 1/2  1/3  1/6 |   (rows: states; columns: r, g, b)
        | 0.6  0.4 |          | 1/6  1/6  2/3 |

  P(r r g b b g b b b r | λ₂)
  = .5×(1/2) × .6×(1/2) × .6×(1/3) × .4×(2/3) × .4×(2/3) × .6×(1/3) × .4×(2/3) × .4×(2/3) × .4×(2/3) × .6×(1/2)
    + ⋯ (the contributions of all the other state paths)
  = ?
(3) Three-State Models
• N = 3:
  [Diagrams: two alternative three-state topologies with their transition weights (self-loops 0.6, 0.7 and cross transitions 0.2, 0.3, 0.5, 0.1, …)]
The Criterion Is …
• Obtaining the best model λ̂ that maximizes P(X | λ̂)
• The best topology comes from insight and experience ← the number of classes/symbols/samples
A Trained HMM

[Diagram: a three-state model over symbols R, G, B]

π = [ 1.  0.  0. ]

        1    2    3              R    G    B
  1 |  .5   .4   .1 |      1 |  .6   .2   .2 |
A = 2 |  .0   .6   .4 |  B = 2 |  .2   .5   .3 |
  3 |  .0   .0   .0 |      3 |  .0   .3   .7 |
Hidden Markov Model: Example

[Diagram: three pots connected by transition arcs with weights 0.6, 0.3, 0.2, 0.1, 0.6, …]

• N pots containing colored balls
• M distinct colors
• Each pot contains a different mixture of colored balls
HMM: Generation Process
• Sequence generating algorithm
  – Step 1: Pick an initial pot according to some random process
  – Step 2: Randomly pick a ball from the pot and then replace it
  – Step 3: Select another pot according to a random selection process
  – Step 4: Repeat steps 2 and 3

• Markov process: {q(t)};  output process: {f(x|q)}
HMM: Hidden Information
• Now, what is hidden?
  – We can see only the chosen balls
  – We cannot see which pot is selected at each time
  – So the pot selection (state transition) information is hidden
HMM: Formal Definition
• Notation: λ = (A, B, π)
  (1) N: number of states
  (2) M: number of symbols observable in states
      V = { v_1, …, v_M }
  (3) A: state transition probability distribution
      A = { a_ij },  1 ≤ i, j ≤ N
  (4) B: observation symbol probability distribution
      B = { b_i(v_k) },  1 ≤ i ≤ N,  1 ≤ k ≤ M
  (5) π: initial state distribution
      π_i = P(q_1 = i),  1 ≤ i ≤ N
Three Problems
1. Model evaluation problem
   – What is the probability of the observation?
   – Forward algorithm
2. Path decoding problem
   – What is the best state sequence for the observation?
   – Viterbi algorithm
3. Model training problem
   – How to estimate the model parameters?
   – Baum-Welch reestimation algorithm
Solution to the Model Evaluation Problem
(Forward algorithm; backward algorithm)

• Definition
  – Given a model λ and an observation sequence X = x_1, x_2, …, x_T:  P(X | λ) = ?
  – P(X | λ) = Σ_Q P(X, Q | λ) = Σ_Q P(X | Q, λ) P(Q | λ)
    (A path or state sequence: Q = q_1, …, q_T)
• Solution
  – Easy but slow: exhaustive enumeration
    P(X | λ) = Σ_Q P(X, Q | λ) = Σ_Q P(X | Q, λ) P(Q | λ)
             = Σ_Q π_{q_1} a_{q_1 q_2} a_{q_2 q_3} … a_{q_{T-1} q_T} b_{q_1}(x_1) b_{q_2}(x_2) … b_{q_T}(x_T)
    Exhaustive enumeration = combinatorial explosion: O(N^T)
  – Does a smarter solution exist?
    • Yes! Dynamic programming
    • Lattice-structure-based computation
    • Highly efficient: linear in the frame length
Forward Algorithm
• Key idea
  – Span a lattice of N states and T times
  – Keep the sum of the probabilities of all the paths coming into each state i at time t
• Forward probability
  α_t(j) = P(x_1 x_2 … x_t, q_t = S_j | λ)
         = Σ_{Q_t} P(x_1 x_2 … x_t, Q_t = q_1 … q_t | λ)
         = Σ_{i=1}^{N} α_{t-1}(i) a_ij b_j(x_t)
Forward Algorithm
• Initialization:  α_1(i) = π_i b_i(x_1),  1 ≤ i ≤ N
• Induction:       α_t(j) = [ Σ_{i=1}^{N} α_{t-1}(i) a_ij ] b_j(x_t),  1 ≤ j ≤ N,  t = 2, 3, …, T
• Termination:     P(X | λ) = Σ_{i=1}^{N} α_T(i)
Numerical Example: P(RRGB | λ)

Using the trained HMM above (π = [1 0 0]ᵀ), the forward lattice for X = R R G B:

  t=1 (R):  α_1 = (1×.6, 0×.2, 0×.0) = (.6, .0, .0)
  t=2 (R):  α_2 = (.6×.5×.6, .6×.4×.2, .6×.1×.0) = (.18, .048, .0)
  t=3 (G):  α_3(1) = .18×.5×.2 = .018
            α_3(2) = (.18×.4 + .048×.6)×.5 = .0504
            α_3(3) = (.18×.1 + .048×.4)×.3 = .01116
  t=4 (B):  α_4(1) = .018×.5×.2 = .0018
            α_4(2) = (.018×.4 + .0504×.6)×.3 = .01123
            α_4(3) = (.018×.1 + .0504×.4)×.7 = .01537

  P(RRGB | λ) = .0018 + .01123 + .01537 ≅ .0284
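A compact implementation of the forward pass, reproducing the lattice above (function and variable names are illustrative):

```python
import numpy as np

# Trained HMM from the slides: symbols 0=R, 1=G, 2=B.
pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.5, 0.4, 0.1],
              [0.0, 0.6, 0.4],
              [0.0, 0.0, 0.0]])   # state 3 has no outgoing transitions on this slide
B = np.array([[0.6, 0.2, 0.2],
              [0.2, 0.5, 0.3],
              [0.0, 0.3, 0.7]])

def forward(X, pi, A, B):
    """Forward algorithm: returns P(X | lambda) and the alpha lattice."""
    T, N = len(X), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, X[0]]                      # initialization
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, X[t]]    # induction
    return alpha[-1].sum(), alpha                   # termination

p, alpha = forward([0, 0, 1, 2], pi, A, B)          # X = R R G B
print(alpha)   # rows: (.6, 0, 0), (.18, .048, 0), (.018, .0504, .01116), ...
print(p)       # ~0.0284
```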
Backward Algorithm
• Key idea
  – Span a lattice of N states and T times
  – Keep the sum of the probabilities of all the outgoing paths from each state i at time t
• Backward probability
  β_t(i) = P(x_{t+1} x_{t+2} … x_T | q_t = S_i, λ)
         = Σ_{Q_{t+1}} P(x_{t+1} x_{t+2} … x_T, Q_{t+1} = q_{t+1} … q_T | q_t = S_i, λ)
         = Σ_{j=1}^{N} a_ij b_j(x_{t+1}) β_{t+1}(j)
• Initialization:  β_T(i) = 1,  1 ≤ i ≤ N
• Induction:       β_t(i) = Σ_{j=1}^{N} a_ij b_j(x_{t+1}) β_{t+1}(j),  1 ≤ i ≤ N,  t = T-1, T-2, …, 1
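A matching backward pass; as a consistency check, P(X | λ) can also be read off at t = 1 from β (a sketch reusing the model of the forward example):

```python
import numpy as np

pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.5, 0.4, 0.1], [0.0, 0.6, 0.4], [0.0, 0.0, 0.0]])
B = np.array([[0.6, 0.2, 0.2], [0.2, 0.5, 0.3], [0.0, 0.3, 0.7]])
X = [0, 0, 1, 2]                                   # R R G B

def backward(X, A, B):
    """Backward algorithm: beta[t, i] = P(x_{t+1} ... x_T | q_t = i, lambda)."""
    T, N = len(X), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                 # initialization
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, X[t+1]] * beta[t+1])   # induction
    return beta

beta = backward(X, A, B)
print((pi * B[:, X[0]] * beta[0]).sum())           # ~0.0284, same as the forward pass
```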
Solution to the Path Decoding Problem
(State sequence; optimal path; Viterbi algorithm; sequence segmentation)

The Most Probable Path
• Given a model λ and an observation sequence X = x_1, x_2, …, x_T:  P(X, Q* | λ) = ?
  Q* = argmax_Q P(X, Q | λ) = argmax_Q P(X | Q, λ) P(Q | λ)
  (A path or state sequence: Q = q_1, …, q_T)
Viterbi Algorithm
• Purpose
  – An analysis of the internal processing result
  – The best, most likely state sequence
  – Internal segmentation
• Viterbi algorithm
  – Alignment of observations and state transitions
  – Dynamic programming technique
Viterbi Path Idea
• Key idea
  – Span a lattice of N states and T times
  – Keep the probability and the previous node of the most probable path coming into each state i at time t
• Recursive path selection
  – Path probability: δ_{t+1}(j) = max_{1≤i≤N} δ_t(i) a_ij b_j(x_{t+1})
  – Path node:        ψ_{t+1}(j) = argmax_{1≤i≤N} δ_t(i) a_ij
Viterbi Algorithm
• Initialization:
  δ_1(i) = π_i b_i(x_1),  ψ_1(i) = 0,  1 ≤ i ≤ N
• Recursion:
  δ_{t+1}(j) = max_{1≤i≤N} δ_t(i) a_ij b_j(x_{t+1}),  ψ_{t+1}(j) = argmax_{1≤i≤N} δ_t(i) a_ij,  1 ≤ t ≤ T-1,  1 ≤ j ≤ N
• Termination:
  P* = max_{1≤i≤N} δ_T(i),  q*_T = argmax_{1≤i≤N} δ_T(i)
• Path backtracking:
  q*_t = ψ_{t+1}(q*_{t+1}),  t = T-1, …, 1
Numerical Example: P(RRGB, Q* | λ)

Same model as before; now each lattice node keeps the best (max) incoming path instead of the sum:

  t=1 (R):  δ_1 = (1×.6, 0×.2, 0×.0) = (.6, .0, .0)
  t=2 (R):  δ_2 = (.6×.5×.6, .6×.4×.2, .6×.1×.0) = (.18, .048, .0)
  t=3 (G):  δ_3(1) = .18×.5×.2 = .018
            δ_3(2) = max(.18×.4, .048×.6)×.5 = .036
            δ_3(3) = max(.18×.1, .048×.4)×.3 = .00576
  t=4 (B):  δ_4(1) = .018×.5×.2 = .0018
            δ_4(2) = max(.018×.4, .036×.6)×.3 = .00648
            δ_4(3) = max(.018×.1, .036×.4)×.7 = .01008

  P* = max(.0018, .00648, .01008) = .01008, and backtracking gives Q* = 1 1 2 3.
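A sketch of the Viterbi recursion with backtracking, reproducing the result above:

```python
import numpy as np

pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.5, 0.4, 0.1], [0.0, 0.6, 0.4], [0.0, 0.0, 0.0]])
B = np.array([[0.6, 0.2, 0.2], [0.2, 0.5, 0.3], [0.0, 0.3, 0.7]])

def viterbi(X, pi, A, B):
    """Viterbi algorithm: returns P* and the most probable state path Q*."""
    T, N = len(X), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, X[0]]                      # initialization
    for t in range(1, T):
        scores = delta[t-1][:, None] * A            # scores[i, j] = delta_{t-1}(i) a_ij
        psi[t] = scores.argmax(axis=0)              # best predecessor of each state j
        delta[t] = scores.max(axis=0) * B[:, X[t]]  # recursion
    q = [int(delta[-1].argmax())]                   # termination
    for t in range(T - 1, 0, -1):                   # path backtracking
        q.append(int(psi[t][q[-1]]))
    return delta[-1].max(), q[::-1]

p_star, q_star = viterbi([0, 0, 1, 2], pi, A, B)    # X = R R G B
print(p_star)                                       # 0.01008
print([s + 1 for s in q_star])                      # [1, 1, 2, 3], 1-based as on the slide
```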
Solution to the Model Training Problem
(HMM training algorithm; maximum likelihood estimation; Baum-Welch reestimation)

HMM Training Algorithm
• Given an observation sequence X = x_1, x_2, …, x_T
• Find the model parameters λ* = (A, B, π) such that P(X | λ*) ≥ P(X | λ) for all λ
  – Adapt the HMM parameters maximally to the training samples
  – Likelihood of a sample: P(X | λ) = Σ_Q P(X | Q, λ) P(Q | λ)
  – The state transition sequence is hidden!
• No analytical solution
• Baum-Welch reestimation (EM)
  – An iterative procedure that locally maximizes P(X | λ)
  – Convergence proven
  – MLE statistical estimation
Maximum Likelihood Estimation
• MLE "selects those parameters that maximize the probability function of the observed sample."
• [Definition] Maximum likelihood estimate
  – Θ: a set of distribution parameters
  – Given X, Θ* is the maximum likelihood estimate of Θ if f(X | Θ*) = max_Θ f(X | Θ)
MLE Example
• Scenario
  – Known: 3 balls inside a pot (some red, some white)
  – Unknown: R = the number of red balls
  – Observation: two red balls drawn
• Two models
  – P(2 reds | R=2) = C(2,2)·C(1,0) / C(3,2) = 1/3
  – P(2 reds | R=3) = C(3,2) / C(3,2) = 1
• Which model?
  – L(λ_{R=3}) > L(λ_{R=2})
  – Model(R=3) is our choice

MLE Example (Cont.)
• Model(R=3) is the more likely strategy, unless we have a priori knowledge of the system.
• However, without the observation of two red balls, there is no reason to prefer P(λ_{R=3}) to P(λ_{R=2}).
• The ML method chooses the set of parameters that maximizes the likelihood of the given observation.
• It makes the parameters maximally adapted to the training data.
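The two likelihoods are simple counting ratios (drawing without replacement); a quick check:

```python
from math import comb

def likelihood(R, reds_drawn=2, total=3):
    """P(drawing reds_drawn reds | pot holds R reds out of total balls)."""
    return comb(R, reds_drawn) * comb(total - R, 0) / comb(total, reds_drawn)

for R in (2, 3):
    print(R, likelihood(R))   # R=2 -> 0.333..., R=3 -> 1.0; the MLE picks R=3
```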
EM Algorithm for Training
• With λ(t) = <{a_ij}, {b_ik}, π_i>, estimate the EXPECTATION of the following quantities:
  – Expected number of visits to state i
  – Expected number of transitions from i to j
• With those expected quantities, obtain the MAXIMUM-LIKELIHOOD parameters
  λ(t+1) = <{a'_ij}, {b'_ik}, π'_i>
Expected Number of Visits to S_i

γ_t(i) = P(q_t = S_i | X, λ)
       = P(q_t = S_i, X | λ) / P(X | λ)
       = α_t(i) β_t(i) / Σ_j α_t(j) β_t(j)
Expected Number of Transitions

ξ_t(i, j) = P(q_t = S_i, q_{t+1} = S_j | X, λ)
          = α_t(i) a_ij b_j(x_{t+1}) β_{t+1}(j) / Σ_i Σ_j α_t(i) a_ij b_j(x_{t+1}) β_{t+1}(j)
Parameter Reestimation
• MLE parameter estimation

  â_ij = Σ_{t=1}^{T-1} ξ_t(i, j) / Σ_{t=1}^{T-1} γ_t(i)

  b̂_j(v_k) = Σ_{t=1, s.t. x_t = v_k}^{T} γ_t(j) / Σ_{t=1}^{T} γ_t(j)

  π̂_i = γ_1(i)

  – Iterative: P(X | λ(t+1)) ≥ P(X | λ(t))
  – Convergence proven
  – Arrives at a local optimum
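A minimal, unscaled sketch of one Baum-Welch iteration for a single discrete observation sequence (real implementations add scaling or log arithmetic, pool multiple sequences, and guard against zero state occupancy):

```python
import numpy as np

def baum_welch_step(X, pi, A, B):
    """One EM reestimation step; returns (new_pi, new_A, new_B, P(X|lambda))."""
    T, N, M = len(X), len(pi), B.shape[1]
    # E-step: forward-backward lattices.
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * B[:, X[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, X[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, X[t+1]] * beta[t+1])
    px = alpha[-1].sum()
    gamma = alpha * beta / px                       # gamma[t, i] = P(q_t = i | X)
    xi = np.array([alpha[t][:, None] * A * (B[:, X[t+1]] * beta[t+1]) / px
                   for t in range(T - 1)])          # xi[t, i, j]
    # M-step: reestimate the parameters from the expected counts.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]   # assumes occupancy > 0
    new_B = np.zeros((N, M))
    for k in range(M):
        new_B[:, k] = gamma[np.array(X) == k].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B, px
```

Iterating this step never decreases P(X | λ), which is exactly the convergence property stated above.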
Other Issues
• Other training methods
  – MAP (Maximum A Posteriori) estimation, for adaptation
  – MMI (Maximum Mutual Information) estimation
  – MDI (Minimum Discrimination Information) estimation
  – Viterbi training
  – Discriminant/reinforcement training
• Other types of parametric structure
  – Continuous-density HMM (CHMM)
    • More accurate, but many more parameters to train
  – Semi-continuous HMM
    • A mix of CHMM and DHMM, using parameter sharing
  – State-duration HMM
    • More accurate temporal behavior
• Other extensions
  – HMM+NN, autoregressive HMM
  – 2D models: MRF, hidden mesh model, pseudo-2D HMM
Graphical DHMM and CHMM
• Models for '5' and '2'
  [Figure: DHMM and CHMM models trained for the digits '5' and '2']
Pattern Classification Using HMMs
• Pattern classification
• Extension of HMM structure
• Extension of HMM training method
• Practical issues of HMM
• HMM history
Pattern Classification
• Construct one HMM per class k: λ_1, …, λ_N
• Train each HMM λ_k with its samples D_k
  – Baum-Welch reestimation algorithm
• Calculate the model likelihoods of λ_1, …, λ_N for the observation X
  – Forward algorithm: P(X | λ_k)
• Find the model with the maximum a posteriori probability:
  λ* = argmax_k P(λ_k | X) = argmax_k P(λ_k) P(X | λ_k) / P(X) = argmax_k P(λ_k) P(X | λ_k)
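A sketch of the resulting classifier, using a scaled forward pass for numerical stability (the model-list format and function names are illustrative assumptions):

```python
import numpy as np

def log_forward(X, pi, A, B):
    """Forward algorithm with per-frame scaling; returns log P(X | lambda)."""
    alpha = pi * B[:, X[0]]
    logp = 0.0
    for t in range(1, len(X) + 1):
        c = alpha.sum()                       # scaling coefficient for frame t
        logp += np.log(c)
        alpha = alpha / c
        if t < len(X):
            alpha = (alpha @ A) * B[:, X[t]]
    return logp

def classify(X, models, log_priors):
    """MAP decision: argmax_k log P(lambda_k) + log P(X | lambda_k).
    models is a list of (pi, A, B) tuples, one per class."""
    scores = [lp + log_forward(X, *m) for m, lp in zip(models, log_priors)]
    return int(np.argmax(scores))
```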
Extension of HMM Structure
• Extension of the state transition parameters
  – Duration-modeling HMM
    • More accurate temporal behavior
  – Transition-output HMM
    • Output functions are attached to transitions rather than states
• Extension of the observation parameters
  – Segmental HMM
    • More accurate modeling of trajectories at each state, but more computational cost
  – Continuous-density HMM (CHMM)
    • The output distribution is modeled with a mixture of Gaussians
  – Semi-continuous HMM (tied-mixture HMM)
    • A mix of continuous and discrete HMM, sharing Gaussian components
Extension of HMM Training Method
• Maximum Likelihood Estimation (MLE)*
  – Maximize the probability of the observed samples
• Maximum Mutual Information (MMI) method
  – Information-theoretic measure
  – Maximize the average mutual information:
    I* = max_λ Σ_{v=1}^{V} [ log P(X^v | λ_v) − log Σ_{w=1}^{V} P(X^v | λ_w) ]
  – Maximizes discrimination power by training the models together
• Minimum Discrimination Information (MDI) method
  – Minimize the discrimination information (cross-entropy) between pd(signal) and pd(HMM)
  – Uses the generalized Baum algorithm
Practical Issues of HMM
• Architectural and behavioral choices
  – The unit of modeling: a design choice
  – Type of model: ergodic, left-right, parallel path
  – Number of states
  – Observation symbols: discrete or continuous; mixture number
• Initial estimates
  – A, π: random or uniform initial values are adequate
  – B: good initial estimates are essential for CHMM

[Diagrams: ergodic, left-right, and parallel-path topologies]
Practical Issues of HMM (Cont.)
• Scaling
  α_t(i) = [ ∏_{s=1}^{t-1} a_{q_s q_{s+1}} ] [ ∏_{s=1}^{t} b_{q_s}(x_s) ]
  – heads exponentially to zero: scale it (or use log-likelihoods); see the sketch below
• Multiple observation sequences
  – Accumulate the expected frequencies with weight P(X^(k) | λ)
• Insufficient training data
  – Deleted interpolation with a desired model and a smaller model
  – Output probability smoothing (by local perturbation of symbols)
  – Output probability tying between different states
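As an alternative to coefficient scaling, the forward recursion can be run entirely in the log domain; a sketch using SciPy's logsumexp:

```python
import numpy as np
from scipy.special import logsumexp

def forward_log(X, pi, A, B):
    """Log-domain forward pass; avoids underflow without scaling coefficients."""
    with np.errstate(divide='ignore'):        # log(0) -> -inf is acceptable here
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    log_alpha = log_pi + log_B[:, X[0]]
    for t in range(1, len(X)):
        log_alpha = logsumexp(log_alpha[:, None] + log_A, axis=0) + log_B[:, X[t]]
    return logsumexp(log_alpha)               # log P(X | lambda)
```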
Practical Issues of HMM (Cont.)
• HMM topology optimization
  – What to optimize
    • Number of states
    • Number of Gaussian mixtures per state
    • Transitions
  – Methods
    • Heuristic methods
      – Number of states from the average (or mode) length of the input frames
    • Split/merge
      – Number of states from iterative splitting/merging
    • Model selection criteria
      – Number of states and mixtures at the same time
      – ML (maximum likelihood)
      – BIC (Bayesian information criterion)
      – HBIC (HMM-oriented BIC)
      – DIC (discriminative information criterion)
      – …
HMM Applications and Software
• On-line handwriting recognition
• Speech applications
• HMM toolbox for Matlab
• HTK (Hidden Markov Model Toolkit)
HMM Applications
• On-line handwriting recognition
  – BongNet: an HMM network-based handwriting recognition system
• Speech applications
  – CMU Sphinx: a speech recognition toolkit
  – 언어과학 Dr.Speaking: an English pronunciation correction system
BongNet
• Developed within the CAIR (Center for Artificial Intelligence Research) consortium at KAIST
  – The name "BongNet" comes from its main inventor, Bong-Kee Shin
• Prominent performance for unconstrained on-line Hangul recognition
• Modeling of Hangul handwriting
  – Considers ligatures between letters as well as consonants and vowels
    • (initial consonant) + (ligature) + (vowel)
    • (initial consonant) + (ligature) + (vowel) + (ligature) + (final consonant)
  – Connects letter models and ligature models using the Hangul composition principle
  – Further improvements:
    • BongNet+: incorporating structural information explicitly
    • Circular BongNet: successive character recognition
    • Unified BongNet: Hangul and alphanumeric recognition
    • Dictionary look-up

• Network structure
  [Figure: the BongNet letter/ligature model network]
A Modification to BongNet
• 16-direction chain code → structure code generation
• Structure code sequence
  – Carries structural information not easily acquired from a chain code sequence
  – Includes length, direction, and bending
[Table: per-segment distance, rotation, straightness, and direction measurements with the resulting structure codes]
Dr.Speaking
[Screenshots: the Dr.Speaking user interface]
System Architecture (시스템 구조)

[Diagram: speech → feature extraction & acoustic analysis → decoder → acoustic score → score estimation → evaluation score]

Components:
• Target speech DB spoken by native speakers
• Target speech DB spoken by non-native speakers (mis-pronunciations)
• Acoustic model (phoneme units of native speech)
• Acoustic model (phoneme units of non-native speech)
• Language model (phoneme units)
• Target pronunciation dictionary
• Target mis-pronunciation dictionary (from analysis of non-native speech patterns)
• Acoustic modeling
  – Separate native and non-native HMMs for each phoneme
  [Diagram: standard phoneme models A, B, C and their error variants]
• Language modeling
  [Diagram: standard vs. error pronunciation networks, with replacement error modeling, deletion error modeling, and insertion error modeling]
• Word-level pronunciation correction: phoneme-level error pattern detection
  – Substituted, inserted, and deleted phones; diphthong splitting; stress errors; length errors
• Sentence pronunciation practice
  – Visual correction feedback and level-based evaluation:
    1. Accuracy: evaluated against standard pronunciation patterns and various types of erroneous pronunciation patterns
    2. Intonation: intonation-related speech signals are extracted and evaluated against standard and error patterns
    3. Fluency: evaluated on various factors such as liaison, pausing, and utterance intervals
Software Tools for HMM
• HMM toolbox for Matlab
  – Developed by Kevin Murphy
  – Freely downloadable software written in Matlab (hmm… Matlab itself is not free!)
  – Easy to use: flexible data structures and fast prototyping in Matlab
  – Somewhat slow performance due to Matlab
  – Download: http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html
• HTK (Hidden Markov Model Toolkit)
  – Developed by the Speech Vision and Robotics Group of Cambridge University
  – Freely downloadable software written in C
  – Useful for speech recognition research: a comprehensive set of programs for training, recognizing and analyzing speech signals
  – Powerful and comprehensive, but somewhat complicated
  – Download: http://htk.eng.cam.ac.uk/
What is HTK?
• Hidden Markov Model Toolkit
• A set of tools for training and evaluating HMMs
• Primarily used in automatic speech recognition and economic modeling
• Modular implementation, (relatively) easy to extend
HTK Software Architecture
– HShell: user input/output and interaction with the OS
– HLabel: label files
– HLM: language models
– HNet: networks and lattices
– HDic: dictionaries
– HVQ: VQ codebooks
– HModel: HMM definitions
– HMem: memory management
– HGraf: graphics
– HAdapt: adaptation
– HRec: main recognition processing functions
Generic Properties of an HTK Tool
• Designed to run with a traditional command-line style interface
• Each tool has a number of required arguments plus optional arguments

  HFoo -T 1 -f 34.3 -a -s myfile file1 file2

  – This tool has two main arguments, file1 and file2, plus four optional arguments
  – -f: real number, -T: integer, -s: string, -a: no following value

  HFoo -C config -f 34.3 -a -s myfile file1 file2

  – HFoo will load the parameters stored in the configuration file config during its initialization procedures
  – Configuration parameters can sometimes be used as an alternative to command-line arguments
The Toolkit
• There are 4 main phases
  – Data preparation, training, testing and analysis
• The toolkit accordingly provides
  – Data preparation tools
  – Training tools
  – Recognition tools
  – Analysis tools
  [Figure: HTK processing stages]
Data Preparation Tools
• A set of speech data files and their associated transcriptions is required
• The data must be converted into an appropriate parametric form
• HSLab: used both to record speech and to manually annotate it with the required transcriptions
• HCopy: performs the required encoding while simply copying each file
• HList: used to check the contents of any speech file
• HLEd: edits label files and can output them to a single Master Label File (MLF), which is usually more convenient for subsequent processing
• HLStats: gathers and displays statistics on label files
• HQuant: used to build a VQ codebook in preparation for building a discrete-probability HMM system
Training Tools
• If some speech data are available for which the locations of the sub-word boundaries have been marked, they can be used as bootstrap data
• HInit and HRest provide isolated-word-style training using the fully labeled bootstrap data
• Each of the required HMMs is generated individually
Training Tools (cont'd)
• HInit: iteratively computes an initial set of parameter values using a segmental k-means procedure
• HRest: processes fully labeled bootstrap data using Baum-Welch re-estimation
• HCompV: initializes all phone models identically, with state means and variances equal to the global speech mean and variance
• HERest: performs a single Baum-Welch re-estimation of the whole set of HMM phone models simultaneously
• HHEd: applies a variety of parameter tyings and increments the number of mixture components in specified distributions
• HEAdapt: adapts HMMs to better model the characteristics of particular speakers using a small amount of training or adaptation data
Recognition Tools
• HVite: uses the token-passing algorithm to perform Viterbi-based speech recognition
• HBuild: allows sub-networks to be created and used within higher-level networks
• HParse: converts an EBNF grammar into the equivalent word network
• HSGen: computes the empirical perplexity of the task (and can generate example sentences from a word network)
• HDMan: dictionary management tool
Analysis Tools
• HResults
  – Uses dynamic programming to align the two transcriptions and counts substitution, deletion and insertion errors
  – Provides speaker-by-speaker breakdowns, confusion matrices and time-aligned transcriptions
  – Computes Figure of Merit scores and Receiver Operating Curve information
HTK Example
• Isolated word recognition
  [Figures: isolated word recognition walkthrough]
Speech Recognition Example Using HTK
• A recognizer for a voice dialing application
  – Goal of the system
    • Provide a voice-operated interface for phone dialing
  – Recognizer
    • Digit strings and a limited set of names
    • Sub-word (phone) based
Step 1: Create the gram file.
The gram file defines the grammar to be used; it describes the overall task scenario.

---------------------------------------------- gram ------------------------------------------------
$digit = 일 | 이 | 삼 | 사 | 오 | ..... | 구 | 공;
$name = 철수 | 만수 | ..... | 길동;
( SENT-START ( 누르기 <$digit> | 호출 $name ) SENT-END )
-----------------------------------------------------------------------------------------------------

Each line beginning with $ defines a word group ($digit lists the Korean digit words "one" through "nine" plus "zero"; $name lists person names; 누르기 = "dial", 호출 = "call"). The bottom line is the grammar itself: < > marks repeatable content, | means "or", and every sentence starts with SENT-START and ends with SENT-END.

Step 2: Run "HParse gram wdnet".
HParse reads the gram file and generates the word network wdnet.
Step 3: Create dict.
The dictionary defines the phone sequence of each word:

----------------------------------- dict -----------------------------------
SENT-END    []  sil
SENT-START  []  sil
공   kc oxc ngc sp
구   kc uxc sp
....
영희  jeoc ngc hc euic sp
....
팔   phc axc lc sp
호출  hc oxc chc uxc lc sp
------------------------------------------------------------------------------
Step 4: Run "HSGen -l -n 200 wdnet dict".
Using wdnet and dict, HSGen generates 200 legal example sentences for training.

Step 5: Record the training sentences produced by HSGen, using HSLab or an ordinary recording tool.
Step 6: Create words.mlf.
words.mlf collects the word-level transcriptions of the recorded speech files:

--------------------------------------- words.mlf ---------------------------------------
#!MLF!#
"*/s0001.lab"
누르기
공
이
칠
공
구
일
.
"*/s0002.lab"
호출
영희
.
.....
-------------------------------------------------------------------------------------------
Step 7: Create mkphones0.led.
mkphones0.led stores the edit commands used when replacing each word in words.mlf with its phones:

------------------------- mkphones0.led -------------------------
EX
IS sil sil
DE sp
-------------------------------------------------------------------

These commands expand words into phones (EX), insert sil at both ends of each sentence (IS), and delete sp (DE).
Step 8: Run "HLEd -d dict -i phones0.mlf mkphones0.led words.mlf".
HLEd applies mkphones0.led to words.mlf and produces phones0.mlf, a transcription file in which every word has been converted to phone symbols:

------------------------------------ phones0.mlf ------------------------------------
#!MLF!#
"*/s0001.lab"
sil
nc
uxc
rc
...
oxc
ngc
kc
.
"*/s0002.lab"
.....
---------------------------------------------------------------------------------------
Step 9: Create config.
The config file is the set of options used when converting the speech data into MFC (feature) data:

------------------------- config -------------------------
TARGETKIND = MFCC_0
TARGETRATE = 100000.0
SOURCEFORMAT = NOHEAD
SOURCERATE = 1250
WINDOWSIZE = 250000.0
......
------------------------------------------------------------
Step 10: Create codetr.scp.
codetr.scp lists, side by side, each recorded speech file and the name of the *.mfc file it will be converted to:

------------------------ codetr.scp ------------------------
DB\s0001.wav DB\s0001.mfc
DB\s0002.wav DB\s0002.mfc
...
DB\s0010.wav DB\s0010.mfc
......
--------------------------------------------------------------

Step 11: Run "HCopy -T 1 -C config -S codetr.scp".
HCopy uses config and codetr.scp to convert the speech files into MFC files; each MFC file holds the feature values extracted from the speech according to the config options.
Step 12: Create proto and train.scp.
The proto file defines the model topology for HMM training; for a phone-based system, a 3-state left-right topology is defined:

----------------------------------- proto -----------------------------------
~o <VecSize> 39 <MFCC_0_D_A>
~h "proto"
<BeginHMM>
  <NumStates> 5
  <State> 2
    <Mean> 39
      0.0 0.0 0.0 0.0 ....
    <Variance> 39
      ...
  ...
  <TransP> 5
    ....
<EndHMM>
-------------------------------------------------------------------------------

train.scp is the file listing the generated MFC files.
Step 13: Create config1.
For HMM training, copy config and change the option MFCC_0 to MFCC_0_D_A (adding delta and acceleration coefficients).

Step 14: Run "HCompV -C config1 -f 0.01 -m -S train.scp -M hmm0 proto".
HCompV creates proto and vFloors files in the hmm0 folder. These are used to build the macros and hmmdefs files; hmmdefs is created by replicating the proto definition for each phone:

-------------------------------------- hmmdefs --------------------------------------
~h "axc"
...
~h "chc"
...
----------------------------------------------------------------------------------------
Add ~o to the vFloors file (from hmm0/vFloors, following the proto header) to create the macros file:

--------------------------------- macros ---------------------------------
~o <VecSize> 39 <MFCC_0_D_A>
~v "varFloor1"
  <Variance> 39
    ...
-----------------------------------------------------------------------------
Step 15: Run
"HERest -C config1 -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm0\macros -H hmm0\hmmdefs -M hmm1 monophones0".
HERest writes re-estimated macros and hmmdefs files into the hmm1 folder. Run HERest again to produce hmm2 from hmm1, and repeat for hmm3, hmm4, ….

Step 16: Recognize the test data:
"HVite -H hmm7/macros -H hmm7/hmmdefs -S test.scp -l '*' -i recout.mlf -w wdnet -p 0.0 -s 5.0 dict monophones"
Summary
• Markov model
  – 1st-order Markov assumption on state transitions
  – 'Visible': the observation sequence determines the state transition sequence
• Hidden Markov model
  – 1st-order Markov assumption on state transitions
  – 'Hidden': an observation sequence may result from many possible state transition sequences
  – Fits very well to the modeling of spatially and temporally variable signals
  – Three algorithms: model evaluation, most probable path decoding, model training
• HMM applications and software
  – Handwriting and speech applications
  – HMM toolbox for Matlab
  – HTK
• Acknowledgement
  – In preparing this HMM tutorial, I thank Prof. Bong-Kee Shin (신봉기) of Pukyong National University and Dr. Sung-Jung Cho (조성정) of Samsung Advanced Institute of Technology for permitting the use of substantial portions of their earlier tutorial materials.
References
• Hidden Markov Model
  – L.R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proc. IEEE, Vol. 77, No. 2, pp. 257-286, 1989.
  – L.R. Bahl et al., "A Maximum Likelihood Approach to Continuous Speech Recognition", IEEE Trans. PAMI, pp. 179-190, 1983.
  – M. Ostendorf, "From HMM's to Segment Models: a Unified View of Stochastic Modeling for Speech Recognition", IEEE Trans. Speech and Audio Processing, pp. 360-378, Sep. 1996.
• HMM Tutorials
  – 신봉기 (Bong-Kee Shin), "HMM Theory and Applications", tutorial at the 2003 Spring Workshop of the Korean Computer Vision and Pattern Recognition Society (in Korean).
  – 조성정 (Sung-Jung Cho), ILVB Tutorial, Korean Information Science Society, Seoul, April 16, 2005 (in Korean).
  – Sam Roweis, "Hidden Markov Models (SCIA Tutorial 2003)", http://www.cs.toronto.edu/~roweis/notes/scia03h.pdf
  – Andrew Moore, "Hidden Markov Models", http://www-2.cs.cmu.edu/~awm/tutorials/hmm.html
• HMM Applications
  – B.-K. Sin, J.-Y. Ha, S.-C. Oh, Jin H. Kim, "Network-Based Approach to Online Cursive Script Recognition", IEEE Trans. Systems, Man, and Cybernetics, Part B, Vol. 29, No. 2, pp. 321-328, 1999.
  – J.-Y. Ha, "Structure Code for HMM Network-Based Hangul Recognition", 18th Int. Conf. on Computer Processing of Oriental Languages, pp. 165-170, 1999.
  – 김무중, 김효숙, 김선주, 김병기, 하진영, 권철홍, "Development and Performance Evaluation of an English Pronunciation Correction System for Koreans", 말소리 (Malsori), No. 46, pp. 87-102, 2003 (in Korean).
• HMM Topology Optimization
  – H. Singer and M. Ostendorf, "Maximum likelihood successive state splitting", ICASSP, pp. 601-604, 1996.
  – A. Stolcke and S. Omohundro, "Hidden Markov model induction by Bayesian model merging", Advances in NIPS, pp. 11-18, San Mateo, CA: Morgan Kaufmann, 1993.
  – A. Biem, J.-Y. Ha, J. Subrahmonia, "A Bayesian Model Selection Criterion for HMM Topology Optimization", ICASSP, pp. I-989 - I-992, IEEE Signal Processing Society, 2002.
  – A. Biem, "A Model Selection Criterion for Classification: Application to HMM Topology Optimization", ICDAR 2003, pp. 204-210, 2003.
• HMM Software
  – Kevin Murphy, "HMM toolbox for Matlab", freely downloadable software written in Matlab, http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html
  – Speech Vision and Robotics Group, Cambridge University, "HTK (Hidden Markov Model Toolkit)", freely downloadable software written in C, http://htk.eng.cam.ac.uk/
  – CMU Sphinx, http://cmusphinx.sourceforge.net/html/cmusphinx.php