Distributed Database Management Systems
Derived Horizontal Fragmentation(DHF)
2
•
Defined on a member relation of a link according to a selection operation specified on its owner
3
•
Two important points: – Each
link is an equi-join.
– Equijoin can
be implemented by means of
semi-joins
4
•
So we are interested in defining the partitions of member based on fragmentation of its owner, but want to see attributes only from member, so
5
Ri = R ⋉ Si, 1≤ i ≤ w where w is the maximum number of fragments that will be defined on R and Si = Fi (S), where Fi is formula for PHF on S
6
DHF Example
7
PAY
title, sal L1
EMP
eNo, Name, titke
ASIGN
jNo, jName, budget, loc
eNo, jNo, resp, dur
8
PROJ
Considering the link L1 above: owner (L1) = PAY member (L1) = EMP • • We want to group employees on the basis of their salaries one with salary less than or equal to 30,000/and other more than that •
9
eNo
eName
title
E1
T Khan
Elec Eng
E2
W Shah
Sys Ana
E3
R Dar
Mech Eng
E4
K Butt
Programme
E5
F Sahbai
Sys Ana
E6
A Haq
Elec Eng
E7
S Farhana
Mech Eng
E8
M Daud
Sys Ana
10
Title
Sal
Elect. Eng
40000
Sys Analyst
34000
Mech. Eng
27000
Programmer
24000
11
eNo
eName
title
E1
T Khan
Elec Eng
E2
W Shah
Sys Ana
E5
F Sahbai
Sys Ana
E6
A Haq
Elec Eng
E8
M Daud
Sys Ana
E3
R Dar
Mech Eng
E4
K Butt
Programme
E7
S Farhana
Mech Eng
(PAY) PAY2 = σ sal > 30000 (PAY) EMP1 = EMP ⋉ PAY1 EMP2 = EMP ⋉ PAY2
• PAY1 = σ sal ≤ 30000 • • •
DHF •
The inputs required for DHF – The set
of partitions for owner
– Member
relation
– Semi-join predicates between owner and
member
14
DHA
Care in case of multiple owners, like ASIGN • Fragmentation selection depends: •
1- One with better Join Characteristics 2- One used in more applications •
Second one is straight forward, we should try to facilitate heavy users; the first one needs more considerations
15
DHF •
For the first point; Join is performed on smaller relations, that increases efficiency – The join can be performed in parallel in case of simple graphs, that improves efficiency as well; simple graph means –
PAY1
PAY2
EMP1
EMP2
16
DHF •
demonstrates two things
1-Derived fragmentation may follow a chain, like PAY-EMP-ASIGN 2-Typically, more than one fragmentation options are there, which one adopted is an allocation problem discussed later
Checking for Correctness
Completeness: for PHF depends on Pr’, and in DHF, completeness of owner Pr’, and the referential integrity constraint Reconstruction: Involves Union in both cases Disjointness: Simple in PHF if the p i in Pr’ are mutually exclusive; in DHF, guaranteed in case of simple join graph, however in case of partitioned join graph it is hard to establish 17
Checking for Correctness
Completeness: for PHF depends on Pr’, and in DHF, completeness of owner Pr’, and the referential integrity constraint Let R be member S be owner Fs = { S1,S2,…Sn} ‘A’ the common attribute t[A] = t’ [A]
18
Reconstruction: Involves Union in both cases FR = {R1,R2,….Rn}
R = U Ri ∀ Ri ∈FR
19
Disjoint ness: Simple in PHF if the pi in Pr’ are mutually exclusive; in DHF, guaranteed in case of simple join graph, however in case of partitioned join graph it is hard to establish
20
Vertical Fragmentation (VF) •
Vertical subset of relation
•
A VF of a relation produces fragments R1, R2, …. Rn, each of which contains subset of attributes of R and PK of R.
•
Objective is to produce smaller relations, so that most of the applications run on smaller relations; so they become fast. 21
Vertical Fragment •
Vertical fragmentation is more complicated, since more alternatives exist.
•
VF is mainly based on heuristics
22
Example of VF CUST
A/C# Na me Delta = ΠA/C#, Name, Branch (CUST) Beta =
ΠA/C#, Bal
(CUST)
Delta A/C# Na me
Bran ch
AB10 Sae MTN 1 ed AB20 Lae LHR 2
Bal
AB10 Sae 4535 1 ed
Bran ch MTN
AB20 Lae 45632 LHR 2 eq .34 AB20 Sal 3 ma
67839 LHR .87
AB10 Sha 45.32 MTN 9 an
23
Two Alternatives of VF •
Grouping: Starting with single attribute VFs and then combining different attributes
24
Two Alternatives of VF •
Splitting: Starting from the whole relation and then breaking it down analyzing the nature of applications • Later suits better to DDB environment; results non-overlapping fragments; so discussed here
25
Thanks