2.1. Entity Relationship Modeling

Introduction

The Entity-Relationship model (or ER model) is a way of graphically representing the logical relationships of objects in order to create a database. Creating an ER diagram is the first step in designing a database. It helps the designers to understand and specify the desired components of the database and the relationships among those components. An ER model is a graphical representation that contains entities (or "items"), relationships among the entities, and attributes of the entities and relationships.

The following are the three basic elements in the ER model.
• Entities: any objects or items
• Attributes: an attribute is a property of an entity
• Relationships: the links between various entities
Let us take a University database as an example and try to understand how an ER model is arrived at.

Example: A university consists of a number of departments. Each department offers several courses. Each course includes a number of modules. Students enroll in a particular course and study modules towards the completion of that course. Each module is taught by a lecturer from the appropriate department, and each lecturer teaches a group of students.

Entities

Entities are real-world items or concepts that exist on their own and are represented as objects or things of interest. An entity type is a collection of entities that share a common definition.

Identifying all the nouns in our university example shows that the scenario consists of students, lecturers, modules, courses and departments. Physical things (things that exist in the world and that we can touch or feel), such as students and lecturers, and abstract things (ideas or concepts that cannot be physically touched, smelled, heard, tasted or seen), such as modules and departments, each make an entity type. If we take students as an entity type, then each student in the university is an entity. Entities appear as nouns in the description because they are objects or things.

We can touch an entity of a physical thing and only conceive of an entity of an abstract thing, but an entity type is always simply an idea. Student is an idea of physical things (an entity type), while Scott, Nancy, Lindsey and Mackenzie are touchable (the named students are entities). Department is an idea of abstract things (an entity type), while IT, CSE, ECE and CIVIL are entities.

Entity Diagrams
• In an E-R Diagram, an entity is usually drawn as a rectangle.
• The box is labeled with the name of the entity type. The entities identified in our example are shown in Figure 2.1.
Figure 2.1 : Entities

Weak Entity

If an entity depends on another existing entity, it is considered weak. A weak entity cannot be identified by its own attributes. A weak entity is represented by double rectangles in an E-R diagram.

Example: SubModule is a good example of a weak entity. A SubModule is meaningless without a Module entity, so it depends on the existence of Module, as shown in Figure 2.2.
Figure 2.2 : Weak Entity

Attributes

Attributes represent properties, facts, aspects or details of an entity. Each entity is described by its own attributes, or particular properties. In our University database each student will have a Student ID, Name, Course taken etc. Similarly, each lecturer will have his/her own properties of ID, Name, department etc. Each attribute has a name and an associated entity, and describes a property of that entity. Attributes are often nouns as well.

Attributes in ER diagram
• In an E-R Diagram attributes are represented by an oval.
• A line is used to link an attribute to its entity.

The figure below represents the entities and their corresponding attributes in the University database.
Figure 2.3 : Entities and Attributes

Multivalued Attribute

A multivalued attribute is an attribute that has more than one value attached to it. For instance, if phone number and graduating degree are attributes of an entity called Person, then those attributes could have multiple values, as a person could have multiple phone numbers or hold multiple degrees. We represent a multivalued attribute by a double oval in an E-R diagram.

Single Valued Attribute: An attribute that holds a single value. In our example, attributes of Students such as Roll number, Age, Date of Birth, City etc. can have only a single value. A Student can, however, have multiple phone numbers, so Phone number is a multivalued attribute.
Figure 2.4 : Multivalued Attributes

Relationships

The association between two or more entities is called a relationship. In our University database, each Student studies several Modules and each Lecturer teaches several Students. Here the entity types Student - Modules and Lecturer - Students have a relationship. Verbs most often describe relationships between entities.

Identifying the verbs (relationships) in our University database example gives: consists of, offers, includes, enroll, study, is taught by and teaches.

Each relationship has a name, a set of entities that participate in it, a degree and a cardinality ratio. The degree is the number of entities that participate in the relationship (most have degree 2). For example, in Figure 2.3 each Lecturer teaches several Students, so this relationship has degree 2 because two entities are related by it.

Relationships in an ER diagram
Relationships denote links between two entities.
• The name of the relationship is given in a diamond box (for example, "Belongs to", as shown in Figure 5.1).
Cardinality Ratio

Each entity can be involved in three types of relationships, as shown:

One to One (1:1)
• Each student belongs to one University. We can illustrate this ratio by writing ones on the lines indicating the relationship, as shown in Figure 2.5.

Figure 2.5 : One-one Mapping

• The notation for the 1:1 relationship is shown in Figure 2.6.

Figure 2.6 : One-one Mapping

One to Many (1:M)
• A lecturer teaches many students, and this One to Many relationship is illustrated in Figure 2.7.

Figure 2.7 : One-Many

• The notation for the 1:M relationship is shown in Figure 2.8.

Figure 2.8 : One-Many

Many to Many (M:M)
• Each student takes many modules, and each module is taken by many students, as shown in Figure 2.9.
Figure 2.9 : Many-Many

Making ER Models

So far we have seen how to identify the basic elements in an ER Diagram. Finally, to make an E-R model you need to identify:
• Entities
• Attributes
• Relationships
• Cardinality ratios
Now let's see what an ER model looks like when all these elements are put together. The final ER Model of our University database is shown in Figure 2.10. The figure shows the entities and the relationships between them, which together depict the complete ER model of a University. Here Department, Course, Module, Lecturer and Student are the entities. The relationships in Figure 2.10 are defined as follows:
• Department offers many Courses (One(1) to Many(n)).
• Department assigns many Lecturers (One(1) to Many(n)).
• Each Lecturer teaches many Students (One(1) to Many(n)).
• Every Student takes several Modules (Many(n) to Many(n)).
• Every Module includes many Courses (Many(n) to Many(n)).
• Course is enrolled by many Students (One(1) to Many(n)).

The ER Model for the above example is given below

The complete ER Model for our University database is shown in the diagram below. It is an integrated ER model containing the entities and relationships for a University database.
Figure 2.10 : University ER Model

Summary
• ER Diagrams play a major role in database design.
• ER Diagrams act as a non-technical communication tool.
• This tool is used by both technical and non-technical users.
• Entities represent real-world things; they can be conceptual, such as a transaction, or physical, such as a bank.

Figure 2.11 : ER Model Summary
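As a rough illustration of how the cardinality ratios in this section could later be carried into tables, the following sketch uses Python's built-in sqlite3 module. The column names (DeptNo, CourseId, ModuleId and so on), the linking table StudentModule and the sample rows are assumptions made for the example; only the entity and relationship names come from the university scenario. A one-to-many relationship is modelled here with a reference column on the "many" side, and a many-to-many relationship with a separate linking table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DeptNo TEXT PRIMARY KEY NOT NULL, DeptName TEXT);
-- One-to-many (Department offers Courses): the "many" side stores the DeptNo.
CREATE TABLE Course (
    CourseId TEXT PRIMARY KEY NOT NULL,
    Title    TEXT,
    DeptNo   TEXT NOT NULL REFERENCES Department(DeptNo)
);
CREATE TABLE Student (StudentId TEXT PRIMARY KEY NOT NULL, Name TEXT);
CREATE TABLE Module  (ModuleId  TEXT PRIMARY KEY NOT NULL, Title TEXT);
-- Many-to-many (Student takes Modules): one row per student/module pair.
CREATE TABLE StudentModule (
    StudentId TEXT NOT NULL REFERENCES Student(StudentId),
    ModuleId  TEXT NOT NULL REFERENCES Module(ModuleId),
    PRIMARY KEY (StudentId, ModuleId)
);
-- Hypothetical sample rows.
INSERT INTO Department VALUES ('D001', 'CSE');
INSERT INTO Course VALUES ('C1', 'Course one', 'D001'), ('C2', 'Course two', 'D001');
INSERT INTO Student VALUES ('101', 'Scott'), ('102', 'Nancy');
INSERT INTO Module VALUES ('M1', 'Module one'), ('M2', 'Module two');
INSERT INTO StudentModule VALUES ('101', 'M1'), ('101', 'M2'), ('102', 'M1');
""")

# One department, many courses.
courses_in_dept = conn.execute(
    "SELECT COUNT(*) FROM Course WHERE DeptNo = 'D001'").fetchone()[0]
# One student, many modules ...
modules_of_scott = [r[0] for r in conn.execute(
    "SELECT ModuleId FROM StudentModule WHERE StudentId = '101' ORDER BY ModuleId")]
# ... and one module, many students.
students_on_m1 = [r[0] for r in conn.execute(
    "SELECT StudentId FROM StudentModule WHERE ModuleId = 'M1' ORDER BY StudentId")]
print(courses_in_dept, modules_of_scott, students_on_m1)
```

The linking table is one common design choice for many-to-many relationships; the ER diagram itself does not prescribe it.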
2.2. Normalization - First Normal Form, Second Normal Form and Third Normal Form

Normalization is the database design technique used to organize tables in a manner that reduces redundancy and dependency of data. It is a systematic process of decomposing complex tables (relations) into smaller, more easily manageable tables. Normalization helps data to be accessed accurately from the database. Without normalization, database systems can be inaccurate, redundant, slow and inefficient, and might not produce the data that is expected. The advantages of normalization are listed below.

Advantages
• Smaller, simpler and well-structured relations.
• Avoids unnecessary duplication of data. That is, it helps to reduce redundancy.
• Provides data integrity.
• Helps to avoid update anomalies. That is, it isolates data so that additions, deletions and modifications of a field can be made in just one table. The changes are then propagated to the rest of the database through the defined relationships.
• Saves storage space.
Edgar Codd invented the relational model and proposed the theory of normalization with the introduction of First Normal Form. He continued to extend the theory with Second and Third Normal Forms. Later Edgar Codd joined with Raymond F. Boyce to develop the theory of Boyce-Codd Normal Form (BCNF). The theory of normalization is still developing. For example, discussions on 6th Normal Form are in progress. However, in most practical applications normalization achieves its best in Third Normal Form. The evolution of normalization theories is illustrated below:

Figure 2.12 : Normalization Evolution

Let's understand a few things before we proceed.
What is a KEY?
A KEY is a value used to uniquely identify a row in a table. It could be a single column or a combination of multiple columns.
Note: The columns in a table that are NOT used to uniquely identify a record or row are called non-key columns.

What is a primary key?
A primary key is a column value that is used to uniquely identify a database record. It is typically a single column.

Figure 2.13 : Primary Key

• The primary key column in a table must always have a value.
• The primary key column in a table cannot have duplicate values. Each primary key value must be unique.
• The primary key values should not be modified once assigned.
• The primary key column should have a value when a new record is inserted into the table.
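The primary key rules above can be observed with a small sketch using Python's built-in sqlite3 module. The table layout, the helper try_insert and the sample rows are assumptions made for the illustration. NOT NULL is declared explicitly so that a missing key value is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (StudentId TEXT PRIMARY KEY NOT NULL, Name TEXT)")
conn.execute("INSERT INTO Student VALUES ('101', 'Scott')")

def try_insert(student_id, name):
    """Return True if the row was accepted, False if a key rule rejected it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?)", (student_id, name))
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = try_insert("101", "Nancy")   # key value already in use
missing_ok = try_insert(None, "Lindsey")    # no key value supplied
fresh_ok = try_insert("102", "Nancy")       # new, unique key value
print(duplicate_ok, missing_ok, fresh_ok)
```

Only the third insert succeeds, because it is the only one whose key is both present and unique.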
Example: The table below contains the details of students. Here studentId is the primary key, which is used to uniquely identify the details of a student in the table.

Figure 2.14 : Primary Key Illustration

Composite Key
If two or more columns are used to uniquely identify a record, the combination of those columns constitutes a composite key. In the Student table given below, we have StudentId, TestId and Mark. Here one student can take multiple tests and one test can be taken by multiple students. In this case, in order to uniquely identify the mark of a student in a test, we require both StudentId and TestId. This is a composite key.

Student Table

Table 2.1

Functional Dependency
In simple terms, functional dependency can be explained as follows: if knowing the value of one attribute lets you determine the value of another attribute, the second attribute is functionally dependent on the first. In the Student table given below, we can get the attribute 'Name' if we know the attribute 'StudentId', so Name is functionally dependent on StudentId. Here StudentId is the determinant and Name is the dependent.

For example, let's consider the Student table given below. Table 2.2 stores student details (StudentId, Name, Languages Known), the student's department details (DeptNo, DeptName) and lecturer details (LecturerInCharge, Designation) for students. In this approach, we keep repeating the languages known and the department details for all the students in the same field. This is called an unnormalized table. Instead of storing the same data again and again, we could normalize the data and create related tables.
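The ideas of composite key and functional dependency can be checked mechanically on sample data. The sketch below is a minimal illustration in Python; the helper name determines and the marks are made up for the example and are not taken from Table 2.1.

```python
def determines(rows, lhs, rhs):
    """True if rows that agree on the lhs columns always agree on the rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key and returns it later.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical marks: each student takes two tests.
marks = [
    {"StudentId": "101", "TestId": "T1", "Mark": 78},
    {"StudentId": "101", "TestId": "T2", "Mark": 64},
    {"StudentId": "102", "TestId": "T1", "Mark": 78},
    {"StudentId": "102", "TestId": "T2", "Mark": 91},
]

composite_holds = determines(marks, ["StudentId", "TestId"], ["Mark"])
student_only = determines(marks, ["StudentId"], ["Mark"])
test_only = determines(marks, ["TestId"], ["Mark"])
print(composite_holds, student_only, test_only)
```

With this data only the combination of StudentId and TestId determines Mark. A check like this can show that a dependency is violated by the data at hand, but it cannot prove that a dependency holds in general; that remains a design decision.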
Let's see how we can normalize the table, create related tables and learn the normal forms using the Student table (which is not normalized):

Student Table (UnNormalized Table)
Table 2.2

First Normal Form
To move from unnormalized form to first normal form, all multi-valued attributes (called repeating groups) must be eliminated. All attributes must be atomic.

Table 2.2 is not in 1NF since there are repeating groups (more than one value in a field). The column "Languages Known" has (English, Hindi and Tamil) in Row (Tuple) 1 and (English and Hindi) in Row (Tuple) 2. To satisfy 1NF we can create separate rows for each value in Languages Known by duplicating the values in the remaining columns. Table 2.3 represents the same.

1NF Rules
• Each column in a table should contain a single value.
• Each record needs to be unique, as shown in Table 2.3.
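The step of creating one row per language can be sketched in a few lines of Python. The student IDs and names below are assumptions for illustration; only the language lists (English, Hindi, Tamil and English, Hindi) follow the description of Table 2.2, and the helper to_1nf is hypothetical.

```python
# Unnormalized rows: "LanguagesKnown" holds several values in one field.
unnormalized = [
    {"StudentId": "101", "Name": "Scott", "LanguagesKnown": "English, Hindi, Tamil"},
    {"StudentId": "102", "Name": "Nancy", "LanguagesKnown": "English, Hindi"},
]

def to_1nf(rows, multi_col, out_col):
    """Emit one row per value of multi_col, copying the remaining columns."""
    result = []
    for row in rows:
        for value in row[multi_col].split(","):
            new_row = {k: v for k, v in row.items() if k != multi_col}
            new_row[out_col] = value.strip()
            result.append(new_row)
    return result

first_normal_form = to_1nf(unnormalized, "LanguagesKnown", "Language")
for row in first_normal_form:
    print(row)
```

Every field now holds a single value, at the cost of repeating StudentId and Name on several rows; the later normal forms deal with that repetition.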
Table 2.3 : 1NF Form

Second Normal Form
Partial functional dependencies must be removed. If two attributes of a table are combined to form a composite key, then the non-key attributes of that table must depend on both attributes of the composite key. They must not depend on only one of the attributes that forms part of the composite key.

2NF Rules
• Rule 1 - The table should be in 1NF.
• Rule 2 - Every non-key attribute must depend on the whole primary key. A table whose primary key is a single column meets this rule automatically.
• A relation in 1NF will be in second normal form (2NF) if there are no partial dependencies.
Partial dependency
It is the functional dependency on part of the primary key instead of the entire primary key. We cannot bring our simple database into second normal form unless we partition the columns in Table 2.3. Here, assume that StudentId and DeptNo together act as the key (composite key). As per 2NF, all non-key attributes must be dependent on the whole key (StudentId+DeptNo). In Table 2.3, however, the attribute 'DeptName' can be obtained from DeptNo alone, and all other column attributes can be identified by just providing 'StudentId'. Each of these attributes therefore depends on only part of the key, which is a partial dependency. For the student columns StudentId acts as the primary key, and for the department columns DeptNo does. So split the table as given below to satisfy 2NF.

Student

Table 2.4

Department

Table 2.5

Languages

Table 2.6

Introducing Foreign Key
A foreign key is a field in a table that matches the primary key column of another table. Cross-references between tables are achieved through foreign keys. In Table 2.7, DeptNo is the foreign key.

Table 2.7
Figure 2.15 : Foreign Key

A foreign key refers to the primary key of another table. It helps to connect the two tables.
• The foreign key column need not contain every value of the primary key it references, but each value it does contain must match an existing primary key value.
• The foreign key ensures that a row in a table is mapped to a corresponding row in another table.
• A foreign key does not have to be unique; most often it is not unique.
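The foreign key properties listed above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table layouts and the sample rows (student names, department codes D001 to D003) are assumptions for illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
CREATE TABLE Department (DeptNo TEXT PRIMARY KEY NOT NULL, DeptName TEXT);
CREATE TABLE Student (
    StudentId TEXT PRIMARY KEY NOT NULL,
    Name      TEXT,
    DeptNo    TEXT REFERENCES Department(DeptNo)  -- foreign key
);
INSERT INTO Department VALUES ('D001', 'CSE'), ('D002', 'IT');
-- Two students share DeptNo D001: a foreign key need not be unique.
INSERT INTO Student VALUES ('101', 'Scott', 'D001'), ('102', 'Nancy', 'D001');
""")

try:
    # D003 has no matching row in Department, so this insert should fail.
    conn.execute("INSERT INTO Student VALUES ('103', 'Lindsey', 'D003')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(rejected, student_count)
```

The insert for department D003 is refused, so the Student table keeps only rows that point at an existing department.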
Foreign Key

Figure 2.16 : Foreign Key Illustration

Why do you need a foreign key?
A foreign key is required in an RDBMS to support the concept of referential integrity.

Referential integrity
It is a concept used in databases to ensure that there is consistency in table relationships. If one table has a foreign key to another table, referential integrity states that you cannot add a record to the table that contains the foreign key unless there is a corresponding record in the linked table.

For example, consider Figure 2.16 given above, where DeptNo in the Student table is a foreign key referring to DeptNo in the Department table. Let's try to add a student with StudentId "103" and DeptNo "D003" to the Student table as shown below. The entry for DeptNo "D003" is not present in the Department table, which means we would have added a student to a department that does not exist. This leads to inconsistency of data across related tables. Hence an RDBMS enforces referential integrity, which does not allow a record to be added to the table that contains the foreign key unless there is a corresponding record in the table to which it is linked.

Student

Table 2.8

Department

Table 2.9

Transitive functional dependencies
When changing a non-key column might cause any of the other non-key columns to change, it is called a transitive functional dependency. Attributes that are not part of the key must not depend on any non-key attribute. Consider Table 2.9. Changing the non-key column Lecturer In Charge may change Designation. Here DeptNo acts as the key. All other columns are non-key attributes. As per 3NF, non-key attributes should not depend on any other non-key attributes, but 'Designation' is dependent on 'Lecturer In Charge'. Both Lecturer In Charge and Designation are non-key attributes, so this forms a transitive dependency. To satisfy 3NF, we will split the table shortly.

Third Normal Form
Third normal form (3NF) is the third step in database normalization and it builds on the first (1NF) and second (2NF) normal forms. 3NF states that all column references in the referenced data that are not dependent on the primary key should be removed. Another way of putting this is that only foreign key columns should be used to reference another table, and the other columns from the parent table should not exist in the referencing table.
Second normal form (2NF) covers the case of multi-column primary keys. 3NF also covers single-column keys, as mentioned under transitive functional dependencies above.

3NF Rules
• Rule 1 - The table should be in 2NF.
• Rule 2 - The table has no transitive functional dependencies, as explained above.

We need to divide our table to move it from second normal form (2NF) to third normal form (3NF). In the Department table (Table 2.9), DeptNo acts as the key. All other columns are non-key attributes. As per third normal form, non-key attributes should not depend on any other non-key attributes. 'Designation' is dependent on 'Lecturer In Charge', and both are non-key attributes, which forms a transitive dependency. So, to satisfy 3NF, split the table as follows.

Student
Table 2.10

Department

Table 2.11

Lecturer

Table 2.12

Languages

Table 2.13

The example given above cannot be decomposed further to attain higher forms of normalization because it is already normalized to the highest level. Normally only complex databases would need the next levels of normalization.
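The four tables produced by the 3NF split can be sketched with Python's built-in sqlite3 module. The exact columns of Tables 2.10 to 2.13 are not reproduced here, so the layout below, the use of the lecturer's name as the Lecturer key, and all sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Lecturer (
    LecturerName TEXT PRIMARY KEY NOT NULL,
    Designation  TEXT
);
CREATE TABLE Department (
    DeptNo           TEXT PRIMARY KEY NOT NULL,
    DeptName         TEXT,
    LecturerInCharge TEXT REFERENCES Lecturer(LecturerName)
);
CREATE TABLE Student (
    StudentId TEXT PRIMARY KEY NOT NULL,
    Name      TEXT,
    DeptNo    TEXT REFERENCES Department(DeptNo)
);
CREATE TABLE Languages (
    StudentId TEXT NOT NULL REFERENCES Student(StudentId),
    Language  TEXT NOT NULL,
    PRIMARY KEY (StudentId, Language)
);
-- Hypothetical sample rows.
INSERT INTO Lecturer VALUES ('Lecturer A', 'Assistant Professor');
INSERT INTO Department VALUES ('D001', 'CSE', 'Lecturer A');
INSERT INTO Student VALUES ('101', 'Scott', 'D001'), ('102', 'Nancy', 'D001');
INSERT INTO Languages VALUES ('101', 'English'), ('101', 'Hindi'), ('102', 'English');
""")

# The designation is stored in exactly one row, so one UPDATE changes it everywhere.
conn.execute(
    "UPDATE Lecturer SET Designation = 'Professor' WHERE LecturerName = 'Lecturer A'")

# Joining the tables reassembles the original wide view of the data.
wide_rows = conn.execute("""
    SELECT s.StudentId, s.Name, d.DeptName, l.LecturerName, l.Designation
    FROM Student s
    JOIN Department d ON s.DeptNo = d.DeptNo
    JOIN Lecturer l   ON d.LecturerInCharge = l.LecturerName
    ORDER BY s.StudentId
""").fetchall()
print(wide_rows)
```

Because Designation lives only in the Lecturer table, the single update is reflected for every student of the department, which is the update anomaly that normalization is meant to avoid.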
2.3. Joins

What are Joins?
A join is a technique where records from two or more tables are retrieved through a single SQL query and shown as a single output. As the result forms a set, it can be saved as a table or used as it is. A join is a means of combining columns from two tables by using values common to both tables. It allows us to combine data from more than one table into a single result set. A join condition is used in the WHERE clause of select, update and delete queries.

Note: If the join condition is omitted, the query returns the Cartesian product of the two tables (the Cartesian product is defined as all possible combinations of rows in all tables): each row of the first table is joined with all rows of the second table. For example, if the first table has 30 rows and the second table has 10 rows, the result will be 30 x 10, or 300 rows. Such a query will take a long time to execute.

Let's take the two tables below to explain the join conditions.
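The row-count arithmetic in the note above can be checked with a short sketch using Python's built-in sqlite3 module. The tables A and B and their integer columns are placeholders invented for this check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (a INTEGER)")
conn.execute("CREATE TABLE B (b INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in range(30)])  # 30 rows: 0..29
conn.executemany("INSERT INTO B VALUES (?)", [(i,) for i in range(10)])  # 10 rows: 0..9

# No join condition: every row of A is paired with every row of B.
cartesian_rows = conn.execute("SELECT COUNT(*) FROM A, B").fetchone()[0]

# With a join condition in the WHERE clause only matching pairs remain.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM A, B WHERE A.a = B.b").fetchone()[0]
print(cartesian_rows, joined_rows)
```

The first query counts 300 combinations, while the join condition reduces the result to the 10 pairs whose values are equal.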
Table 'Student'

Table 2.14

Table 'Department'

Table 2.15

In the above example the column that is common to both tables is DeptNo. Using DeptNo, the Student and Department tables can be joined to combine data from both tables, as shown below.
Figure 2.17 : Joining two tables

Let's consider a scenario to retrieve the details of the students who belong to the 'CSE' department. We have to join the two tables based on the common column present in both tables.

Figure 2.18 : Mapping data

Result: After joining the two tables:

Table 2.16
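The 'CSE' scenario can be reproduced with Python's built-in sqlite3 module. The rows below are assumptions standing in for Tables 2.14 and 2.15, whose actual contents are not reproduced here; the join condition on DeptNo is placed in the WHERE clause, as described at the start of this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DeptNo TEXT PRIMARY KEY NOT NULL, DeptName TEXT);
CREATE TABLE Student (StudentId TEXT PRIMARY KEY NOT NULL, Name TEXT, DeptNo TEXT);
-- Hypothetical sample rows.
INSERT INTO Department VALUES ('D001', 'CSE'), ('D002', 'IT');
INSERT INTO Student VALUES ('101', 'Scott', 'D001'),
                           ('102', 'Nancy', 'D002'),
                           ('103', 'Lindsey', 'D001');
""")

cse_students = conn.execute("""
    SELECT s.StudentId, s.Name, d.DeptName
    FROM Student s, Department d
    WHERE s.DeptNo = d.DeptNo      -- join condition on the common column
      AND d.DeptName = 'CSE'       -- filter for the department of interest
    ORDER BY s.StudentId
""").fetchall()
print(cse_students)
```

Each result row combines columns from both tables: the student details come from Student and the department name from Department.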
2.4. Summary
• The Entity-Relationship model (or ER model) is a way of graphically representing the logical relationships of objects in order to create a database.
• An ER model is a graphical representation that contains entities (or "items"), relationships among them, and attributes of the entities and the relationships.
• The database design technique used to organize tables in a manner that reduces redundancy and dependency of data is called normalization.
• The three normal forms covered here are First Normal Form (1NF), Second Normal Form (2NF) and Third Normal Form (3NF).
• A key is a value used to uniquely identify a row in a table. One or more columns could be used to form a key for a table.
• A primary key is a column value used to identify a database record uniquely.
• A composite key is a primary key derived by combining multiple columns and is used to identify a record uniquely.
• The field in a table that matches the primary key column of another table is called a foreign key. Cross-references between tables are achieved through foreign keys.
• First Normal Form - The multi-valued attributes (called repeating groups) must be eliminated. All attributes must be atomic.
• Second Normal Form - Partial functional dependencies must be removed. The attributes that are not part of the key should be dependent on the entire key for that entity.
• Third Normal Form - All column references in referenced data that are not dependent on the primary key (transitive dependencies) should be removed.
• A join is a means of combining fields from two tables by using values common to both. It allows us to combine data from more than one table into a single result set.
ADDITIONAL DATA

SQL JOIN
An SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them.

The most common type of join is SQL INNER JOIN (simple join). An SQL INNER JOIN returns all rows from multiple tables where the join condition is met.

Let's look at a selection from the "Orders" table:

OrderID | CustomerID | OrderDate
10308 | 2 | 1996-09-18
10309 | 37 | 1996-09-19
10310 | 77 | 1996-09-20
Then, have a look at a selection from the "Customers" table:

CustomerID | CustomerName | ContactName | Country
1 | Alfreds Futterkiste | Maria Anders | Germany
2 | Ana Trujillo Emparedados y helados | Ana Trujillo | Mexico
3 | Antonio Moreno Taquería | Antonio Moreno | Mexico
Notice that the "CustomerID" column in the "Orders" table refers to the "CustomerID" in the "Customers" table. The relationship between the two tables above is the "CustomerID" column.

Then, if we run the following SQL statement (that contains an INNER JOIN):
Example
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID=Customers.CustomerID;

it will produce something like this:

OrderID | CustomerName | OrderDate
10308 | Ana Trujillo Emparedados y helados | 9/18/1996
10365 | Antonio Moreno Taquería | 11/27/1996
10383 | Around the Horn | 12/16/1996
10355 | Around the Horn | 11/15/1996
10278 | Berglunds snabbköp | 8/12/1996
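The INNER JOIN statement above can be tried against the two table excerpts using Python's built-in sqlite3 module. This is only a sketch: it loads just the three Orders rows and three Customers rows shown in the selections, and the column types are assumptions, so the result is shorter than the sample output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT,
    ContactName  TEXT,
    Country      TEXT
);
CREATE TABLE Orders (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER,
    OrderDate  TEXT
);
INSERT INTO Customers VALUES
    (1, 'Alfreds Futterkiste', 'Maria Anders', 'Germany'),
    (2, 'Ana Trujillo Emparedados y helados', 'Ana Trujillo', 'Mexico'),
    (3, 'Antonio Moreno Taquería', 'Antonio Moreno', 'Mexico');
INSERT INTO Orders VALUES
    (10308, 2,  '1996-09-18'),
    (10309, 37, '1996-09-19'),
    (10310, 77, '1996-09-20');
""")

inner_join_rows = conn.execute("""
    SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
    FROM Orders
    INNER JOIN Customers
    ON Orders.CustomerID = Customers.CustomerID
""").fetchall()
print(inner_join_rows)
```

Only order 10308 appears in the result, because customers 37 and 77 are not among the loaded Customers rows and an INNER JOIN keeps only rows that have a match in both tables.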
Different SQL JOINs
Before we continue with examples, we will list the types of the different SQL JOINs you can use:
• INNER JOIN: Returns all rows when there is at least one match in BOTH tables
• LEFT JOIN: Return all rows from the left table, and the matched rows from the right table
• RIGHT JOIN: Return all rows from the right table, and the matched rows from the left table
• FULL JOIN: Return all rows when there is a match in ONE of the tables
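The difference between INNER JOIN and LEFT JOIN can be seen on a cut-down version of the Orders and Customers excerpts, again using Python's built-in sqlite3 module. The reduced column set is an assumption made to keep the sketch short. RIGHT JOIN is not run here; swapping the two tables in a LEFT JOIN gives the same rows as a RIGHT JOIN would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
INSERT INTO Customers VALUES (1, 'Alfreds Futterkiste'),
                             (2, 'Ana Trujillo Emparedados y helados');
INSERT INTO Orders VALUES (10308, 2), (10309, 37);
""")

# INNER JOIN: only orders that have a matching customer.
inner_rows = conn.execute("""
    SELECT o.OrderID, c.CustomerName
    FROM Orders o INNER JOIN Customers c ON o.CustomerID = c.CustomerID
    ORDER BY o.OrderID
""").fetchall()

# LEFT JOIN: every order, with NULL (None in Python) where no customer matches.
left_rows = conn.execute("""
    SELECT o.OrderID, c.CustomerName
    FROM Orders o LEFT JOIN Customers c ON o.CustomerID = c.CustomerID
    ORDER BY o.OrderID
""").fetchall()

# Swapped LEFT JOIN: every customer, with NULL where the customer has no order.
swapped_rows = conn.execute("""
    SELECT c.CustomerName, o.OrderID
    FROM Customers c LEFT JOIN Orders o ON o.CustomerID = c.CustomerID
    ORDER BY c.CustomerID
""").fetchall()
print(inner_rows, left_rows, swapped_rows)
```

A FULL JOIN would return the union of the last two results: all orders and all customers, paired where a match exists.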