2. Dr. NGPASC
COIMBATORE | INDIA
Contents
• Relational Approach: CODD’s Rules-Relational Data Structure- Relation, Domain,
Attributes. Keys- Primary Key , Foreign key and Candidate key. Relational Algebra:
Introduction- Traditional Set Operations – UNION, UNION ALL, INTERSECT and
MINUS. Special Relational Operations- Selection, Projection, Division and Join.
Join Operators: Inner Join, Natural Join and Outer Join.
3. 1. RELATIONAL APPROACH: RELATIONAL DATA STRUCTURE
•The relational model has established itself as the primary data
model for commercial data-processing applications.
• The first database systems were based on either the network model or the
hierarchical model.
• The relational model is now used in numerous applications
outside the domain of traditional data processing.
•The relational data model, like all data models, consists of three basic
components:
• a set of domains and a set of relations
• operations on relations
• integrity rules
4. • Relation: In the relational model, a relation is represented as a table with columns and rows.
Each row is known as a tuple. Each column of the table has a name, called an
attribute.
• A relation schema R, denoted by R(A1, A2, …, An), is made up of a relation
name R and a list of attributes A1, A2, …, An. Each attribute Ai is the name of a
role played by some domain D in the relation schema R. D is called the domain
of Ai and is denoted by dom(Ai). A relation schema is used to describe a relation;
R is called the name of this relation. The degree of a relation is the number of
attributes n of its relation schema.
5. • Relation: A table with columns and rows. Each row is known as a tuple,
and each column of the table has a name, called an attribute.
• Domain: It contains a set of atomic values that an attribute can take.
• Attribute: The name of a column in a particular table. Each
attribute Ai must have a domain, dom(Ai).
• Relational instance: In the relational database system, the relational
instance is represented by a finite set of tuples. Relation instances do not
have duplicate tuples.
• Relational schema: A relational schema contains the name of the relation
and the names of all its columns or attributes.
• Relational key: One or more attributes whose values
identify each row in the relation uniquely.
7. • Example: STUDENT Relation
NAME     ROLL_NO   PHONE_NO     ADDRESS     AGE
Ram      14795     7305758992   Noida       24
Shyam    12839     9026288936   Delhi       35
Laxman   33289     8583287182   Gurugram    20
Mahesh   27857     7086819134   Ghaziabad   27
Ganesh   17282     9028913988   Delhi       40
In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and
AGE are the attributes.
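The definitions above (schema, degree, instance as a set of tuples) can be sketched in Python. This is only an illustration: the class name `RelationSchema` is an assumption of this sketch, and only the first four rows of the STUDENT table are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationSchema:
    """A relation name plus an ordered list of attribute names (illustrative)."""
    name: str
    attributes: tuple

    @property
    def degree(self) -> int:
        # The degree of a relation is the number of attributes in its schema.
        return len(self.attributes)


student_schema = RelationSchema(
    "STUDENT", ("NAME", "ROLL_NO", "PHONE_NO", "ADDRESS", "AGE")
)

# A relation instance is a finite set of tuples, so duplicates cannot occur.
student = {
    ("Ram", 14795, "7305758992", "Noida", 24),
    ("Shyam", 12839, "9026288936", "Delhi", 35),
    ("Laxman", 33289, "8583287182", "Gurugram", 20),
    ("Mahesh", 27857, "7086819134", "Ghaziabad", 27),
}

# Adding a tuple that is already present leaves the set unchanged.
student.add(("Ram", 14795, "7305758992", "Noida", 24))

print(student_schema.degree)  # 5
print(len(student))           # 4
```

Using a Python `set` for the instance mirrors the rule that a relation instance has no duplicate tuples and no inherent tuple order.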
8. PROPERTIES OF RELATIONS
The name of the relation is distinct from the names of all other relations.
Each cell of the relation contains exactly one atomic (single) value.
Each attribute has a distinct name.
No two tuples in the relation are identical.
The order of the tuples is not significant; they may appear in any sequence.
1. Keys
• Keys play an important role in a relational database: they are used to
identify rows of a table uniquely and to establish relationships among
tables.
9. RELATIONAL DATA INTEGRITY
• Most relations have an attribute that can uniquely identify each tuple in
the relation.
• In some cases more than one attribute can, on its own, uniquely identify
each tuple in the relation. Each such attribute is called a candidate key.
• If there is more than one such attribute, every one of them is eligible to be
a candidate key.
• One of the candidate keys is arbitrarily designated as the primary key, and
the others are called secondary or alternate keys.
• A key is a minimal set of attributes that guarantees the members of the
relation can be told apart. When more than one key exists, a primary key is selected.
10. Example:
In the above table (ELEMENT_TABLE), symbol, name and atomic number can each uniquely identify a
row, so any one of them can be a candidate key; that is, the ELEMENT_TABLE has three candidate
keys. Let R be a relation with attributes A1, A2, …, An. A set of attributes
K = (Ai, Aj, …, Ak) of R is said to be a candidate key of R if and only if the following
two properties are satisfied:
11. • Uniqueness – At any given point of time, no two distinct tuples of R have
the same value for Ai, the same value for Aj, …, and the same value for Ak.
• Minimality – No proper subset of the set (Ai, Aj, …, Ak) has the uniqueness
property.
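The two properties can be checked mechanically against a given set of rows. The sketch below is illustrative only: the function names and the three sample rows are assumptions, not the original ELEMENT_TABLE. A check on one instance can refute a candidate key, but it cannot prove one, because a key must hold for every valid state of the relation.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """Uniqueness: no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """Uniqueness plus minimality: no proper subset is itself unique."""
    if not is_unique(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False
    return True


# Hypothetical sample rows for an element table.
elements = [
    {"Symbol": "H", "Name": "Hydrogen", "Atomic_Number": 1},
    {"Symbol": "He", "Name": "Helium", "Atomic_Number": 2},
    {"Symbol": "Ag", "Name": "Silver", "Atomic_Number": 47},
]

print(is_candidate_key(elements, ("Symbol",)))         # True
print(is_candidate_key(elements, ("Symbol", "Name")))  # False: unique but not minimal
```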
Let us take a look at another relation, SHIPMENT_TABLE
12. • In the ELEMENT_TABLE the attribute Symbol, and in the SHIPMENT_TABLE
the attribute Item, take the same data values.
• A given value for that attribute, say Item ‘Ag’, should be permitted to
appear in the database only if the same value appears as a value of the primary key
‘Symbol’ in the relation ELEMENT_TABLE.
• Such an attribute is a foreign key. A foreign key is an attribute or attribute
combination of one relation whose values are required to match those of the
primary key of some other relation.
• Also the foreign key and the primary key should be defined on the same
underlying domain.
14. INTEGRITY CONSTRAINTS
• Relational model includes several types of constraints whose purpose is to
maintain the accuracy and integrity of the data in the database.
• The major types of integrity constraints are:
• Domain Constraints
• Entity Integrity
• Referential Integrity
• Operational Constraints
Domain Constraints
• All the values that appear in a column of a relation must be taken from the same
domain. A domain usually consists of the following components.
1. Domain Name
2. Meaning
3. Data Type
4. Size or length
5. Allowable values or allowable range (if applicable)
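The five components of a domain listed above map naturally onto a small record type. The following is a minimal sketch under stated assumptions: the class `Domain`, its field names, and the 0 to 150 age range are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Domain:
    """One field per component: name, meaning, data type, size, allowable values."""
    name: str
    meaning: str
    data_type: type
    size: int                              # maximum length of the printed value
    is_allowed: Callable[[object], bool]   # allowable values or range

    def contains(self, value) -> bool:
        # A value belongs to the domain only if every component accepts it.
        return (
            isinstance(value, self.data_type)
            and len(str(value)) <= self.size
            and self.is_allowed(value)
        )


# Assumed range for illustration: ages from 0 to 150.
age = Domain("AGE", "Age of a student in years", int, 3, lambda v: 0 <= v <= 150)

print(age.contains(24))    # True
print(age.contains("24"))  # False: wrong data type
print(age.contains(200))   # False: outside the allowable range
```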
15. Entity Integrity
• The Entity Integrity rule is designed to assure that every relation has a primary key
and that the data values for the primary key are all valid.
• Entity integrity guarantees that every primary key attribute is non null.
• No attribute participating in the primary key of a base relation is allowed to contain
nulls.
• The primary key performs the unique identification function in the relational model.
Referential Integrity
• In the relational model the association between the tables is defined using foreign
keys.
• The association between the SHIPMENT and ELEMENT tables is defined by
including the Symbol attribute as a foreign key in the SHIPMENT table.
• This implies that before we insert a row in the SHIPMENT table, the element for that
shipment must already exist in the ELEMENT table.
• A referential integrity constraint is a rule that maintains consistency among the rows
of two tables or relations.
• The rule states that if there is a foreign key in one relation, each foreign
key value must either match a primary key value in the other table or
be null.
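The referential integrity rule can be expressed as a check that returns the offending rows. This is a sketch, not a DBMS implementation: the function name, the column name `Shipment_No`, and the sample rows are assumptions made for illustration.

```python
def referential_violations(child_rows, fk, parent_rows, pk):
    """Return child rows whose foreign key is neither null nor a parent key value."""
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row for row in child_rows
        if row[fk] is not None and row[fk] not in parent_keys
    ]


# Hypothetical sample data for the ELEMENT and SHIPMENT tables.
element = [{"Symbol": "Ag"}, {"Symbol": "Au"}]
shipment = [
    {"Shipment_No": 1, "Symbol": "Ag"},   # matches a primary key value
    {"Shipment_No": 2, "Symbol": None},   # null is permitted by the rule
    {"Shipment_No": 3, "Symbol": "Xx"},   # no such element: a violation
]

bad = referential_violations(shipment, "Symbol", element, "Symbol")
print([row["Shipment_No"] for row in bad])  # [3]
```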
16. Operational Constraints
• These are the constraints enforced in the database by the business rules or real
world limitations.
• For example, if the retirement age of the employees in an organization is 60, then the
age column of the employee table can have the constraint “Age should be less than or
equal to 60”.
• These kinds of constraints enforced by the business and the environment are called
operational constraints.
17. • Keys
• Keys play an important role in a relational database: they are used to
identify rows of a table uniquely and to establish relationships among tables.
Types of keys:
Primary Key :
• A primary key is a minimal set of attributes (columns) in a table that uniquely
identifies tuples (rows) in that table.
Example:
In the following table, there are three attributes: Stu_Id, Stu_Name and Stu_Age. Out
of these three attributes, one attribute or a set of more than one attribute can be a
primary key.
18. Stu_Id Stu_Name Stu_Age
101 Steve 23
102 John 24
103 Robert 28
104 Steve 29
105 Carl 29
Table Name: STUDENT
Attribute Stu_Name alone cannot be a primary key, as more than one student can have the
same name. Attribute Stu_Age alone cannot be a primary key, as more than one student
can have the same age. Attribute Stu_Id alone is a primary key, as each student has a unique
id that can identify the student record in the table.
19. Candidate Key: A minimal set of attributes that can uniquely identify a tuple is
known as a candidate key. For example, STUD_NO in the STUDENT relation.
• It is a minimal super key.
• A super key with no redundant attributes is called a candidate key.
• It must contain unique values.
• In SQL implementations, a candidate key enforced only by a uniqueness constraint may contain NULL values.
• Every table must have at least one candidate key.
• A table can have multiple candidate keys but only one primary key (the primary
key cannot have a NULL value, so a candidate key with a NULL value can’t be
the primary key).
20. •Candidate Key Example
• Let's take an example of a table “Employee”. This table has three
attributes: Emp_Id, Emp_Number and Emp_Name. Here Emp_Id and
Emp_Number have unique values, while Emp_Name can have
duplicate values, as more than one employee can have the same name.
Emp_Id   Emp_Number   Emp_Name
E01      2264         Steve
E22      2278         Ajeet
E23      2288         Chaitanya
E45      2290         Robert
21. E.g.: Studid, Roll No, and email are candidate keys.
• The value of a candidate key is unique for every tuple; a candidate key chosen as the primary key must also be non-null.
• There can be more than one candidate key in a relation. For example,
STUD_NO is a candidate key for the relation STUDENT.
• The candidate key can be simple (having only one attribute) or composite as
well. For Example, {STUD_NO, COURSE_NO} is a composite candidate key
for relation STUDENT_COURSE.
22. Super Key: The set of attributes that can uniquely identify a tuple is known as Super
Key. For Example, STUD_NO, (STUD_NO, STUD_NAME), etc.
Equivalently, a super key is a set of one or more attributes that identifies rows in a table.
• It supports NULL values. Example: SNO+PHONE is a super key.
• Adding zero or more attributes to a candidate key generates a super key.
• Every candidate key is a super key, but not every super key is a candidate key.
23. Alternate Key – Out of all candidate keys, only one is selected as the primary key; the remaining
keys are known as alternate or secondary keys.
Example: Here we have a table Employee with three attributes: Emp_Id,
Emp_Number and Emp_Name.
• Table: Employee
Emp_Id   Emp_Number   Emp_Name
E01      2264         Steve
E22      2278         Ajeet
E23      2288         Chaitanya
E45      2290         Robert
24. • There are two candidate keys in the above table:
• {Emp_Id}
• {Emp_Number}
• The DBA (database administrator) can choose any of the above keys as the primary key. Let's
say Emp_Id is chosen as the primary key.
• Since we have selected Emp_Id as the primary key, the remaining key Emp_Number
is called the alternate or secondary key.
25. Composite Key
A key that consists of more than one attribute to uniquely identify rows (also known
as records or tuples) in a table is called a composite key.
Note:
• Any key, such as a super key, primary key or candidate key, can be called a
composite key if it has more than one attribute.
• Example: Let's consider a table Sales. This table has four columns (attributes):
cust_Id, order_Id, product_code and product_count.
26. • Table – Sales
cust_Id   order_Id   product_code   product_count
C01       O001       P007           23
C02       O123       P007           19
C02       O123       P230           82
C01       O001       P890           42
• None of these columns alone can play the role of a key in this table.
o Column cust_Id alone cannot be a key, as the same customer can place multiple
orders, so the same customer can have multiple entries.
o Column order_Id alone cannot be a primary key, as the same order can contain
multiple products, so the same order_Id can be present multiple times.
27. o Column product_code alone cannot be a primary key, as more than one customer
can place an order for the same product.
o Column product_count alone cannot be a primary key, because two orders
can be placed with the same product count.
• Based on this, it is safe to assume that the key should have more than
one attribute:
o Key in the above table: {cust_Id, product_code}
o This is a composite key, as it is made up of more than one attribute.
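The reasoning above can be checked against the four Sales rows. The helper name `has_duplicates` is an assumption of this sketch. Note that product_count happens to be unique in this small sample; it is ruled out as a key by its meaning (two orders may have the same count), not by the sample data.

```python
from collections import Counter

columns = ("cust_Id", "order_Id", "product_code", "product_count")
sales = [
    ("C01", "O001", "P007", 23),
    ("C02", "O123", "P007", 19),
    ("C02", "O123", "P230", 82),
    ("C01", "O001", "P890", 42),
]


def has_duplicates(rows, positions):
    """True if two rows share the same values in the given column positions."""
    counts = Counter(tuple(row[i] for i in positions) for row in rows)
    return any(n > 1 for n in counts.values())


# Single columns: the first three repeat values in this sample.
for i, name in enumerate(columns):
    print(name, has_duplicates(sales, (i,)))

# The composite {cust_Id, product_code} has no repeated combination.
print(has_duplicates(sales, (0, 2)))  # False
```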
28. Foreign Key
• Foreign keys are the columns of a table that point to the primary key of
another table. They act as a cross-reference between tables.
• Example: In the example below, the Stu_Id column in the Course_enrollment
table is a foreign key, as it points to the primary key of the Student table.
Student table:
Stu_Id   Stu_Name    Stu_Age
101      Chaitanya   22
102      Arya        26
103      Bran        25
104      Jon         21
29. Relational Algebra
•Relational algebra is a collection of operations to manipulate relations.
• These operations include join (to combine related tuples from two
relations), selection (to select particular tuples of a relation) and projection
(to select particular attributes of a relation).
• Relational algebra is a procedural language. It specifies the operations
to be performed on existing relations to derive result relations.
• It takes instances of relations as input and yields instances of relations
as output.
• The output of these operations is a new relation, which might be formed
from one or more input relations.
30. Relational operators are classified into two types:
• Traditional Set Operators
• Special Operators
31. Traditional set operations
• The basic operations are the traditional set operations: union, difference,
intersection, and Cartesian product.
• Union:
• In mathematical set theory, the union of two sets is the set of all elements
belonging to either set or to both.
• The set that results from the union must not contain duplicate
elements. It is denoted by ∪. Thus the union of the sets
S1 = {1, 2, 3, 4, 5} and S2 = {4, 5, 6, 7, 8}
would be the set {1, 2, 3, 4, 5, 6, 7, 8}.
32. • A union operation on two relational tables follows the same basic principle
but is more complex in practice.
• In order to perform the union operation, both operand relations must be
union-compatible, i.e. they must have the same number of columns, with corresponding columns drawn from
the same domain (that is, of the same data type).
• Suppose that two tables, R and S, have the following tuples at some
instant in time, and that their header parts are as shown below:
33. These can certainly be combined into one table containing a valid relation by
the relational union operator (R ∪ S) as follows:
34. • Intersection: In mathematics, the intersection of two sets produces a set
that contains all the elements common to both sets. It is
denoted by ∩. Thus the intersection of the two sets
S1 = {1, 2, 3, 4, 5} and
S2 = {4, 5, 6, 7, 8}
would be {4, 5}.
35. In the above example, both tables are union-compatible, so they can be
intersected. The intersection operation on the R and S tables (R ∩ S) defined above
would return:
The intersection operator is used in a similar fashion to the union operator, but
provides an ‘and’ function.
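Union and intersection on relations, including the union-compatibility check, can be sketched as follows. The headers, rows and function names here are hypothetical; they are not the R and S tables of the slides. A relation is modelled as a pair of a header (column name and type) and a set of tuples.

```python
import operator


def union_compatible(r_header, s_header):
    """Same number of columns, with matching column types position by position."""
    return len(r_header) == len(s_header) and all(
        r_type is s_type
        for (_, r_type), (_, s_type) in zip(r_header, s_header)
    )


def set_operation(r, s, op):
    """Apply a set operator to the bodies of two union-compatible relations."""
    (r_header, r_rows), (s_header, s_rows) = r, s
    if not union_compatible(r_header, s_header):
        raise ValueError("relations are not union-compatible")
    return r_header, op(r_rows, s_rows)


# Hypothetical relations sharing one header.
header = (("ID", int), ("Name", str))
R = (header, {(1, "Asha"), (2, "Bala")})
S = (header, {(2, "Bala"), (3, "Chitra")})

_, r_union_s = set_operation(R, S, operator.or_)        # R ∪ S, an 'or' function
_, r_intersect_s = set_operation(R, S, operator.and_)   # R ∩ S, an 'and' function

print(sorted(r_union_s))      # [(1, 'Asha'), (2, 'Bala'), (3, 'Chitra')]
print(sorted(r_intersect_s))  # [(2, 'Bala')]
```

Because the bodies are Python sets, the duplicate tuple (2, 'Bala') appears only once in the union.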
Difference:
• In mathematics, the difference between two sets S1 and S2 produces a set
that contains all the members of one set that are not in the other.
• It is denoted by “-”.
36. • The order in which the difference is taken is, obviously, significant. Thus the
difference between the two sets:
S1 = { 1,2,3,4,5}
Minus
S2 = {4,5,6,7,8}
would be {1,2,3} and between
S2 = {4,5,6,7,8}
Minus
S1 = {1,2,3,4,5}
would be {6,7,8}.
37. • As for the other set operations discussed so far, the difference operation can only be
performed on tables that are union compatible. The difference operation on the R and S
(R – S) defined above would return:
38. • Minus is not associative
• To show this, consider three sets A, B and C with the following
members:
• A = {1, 2, 3, 4, 5}
• B = {2, 3}
• C = {1, 4}
• (A MINUS B) MINUS C = {1, 4, 5} MINUS {1, 4} = {5}
• A MINUS (B MINUS C) = {1, 2, 3, 4, 5} MINUS ({2, 3} MINUS {1, 4}) =
{1, 2, 3, 4, 5} MINUS {2, 3} = {1, 4, 5}
• The two cases give different results, so minus is not an associative operator.
39. • Minus is not commutative
• This means that A MINUS B is, in general, different from B MINUS A. To show this, we again
take the above values of A and B.
• A MINUS B = {1, 4, 5}
• B MINUS A is the empty set, because there is no value that is in B but not in A.
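Both properties can be confirmed with Python's built-in set difference, using the same sets A, B and C as in the slides:

```python
A = {1, 2, 3, 4, 5}
B = {2, 3}
C = {1, 4}

# Not associative: the grouping changes the result.
left = (A - B) - C    # {1, 4, 5} - {1, 4}
right = A - (B - C)   # {1, 2, 3, 4, 5} - {2, 3}
print(left)           # {5}
print(sorted(right))  # [1, 4, 5]

# Not commutative: swapping the operands changes the result.
print(sorted(A - B))  # [1, 4, 5]
print(B - A)          # set(), the empty set
```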
40. • Cartesian product: In mathematics, the Cartesian product of two sets is the set
of all ordered pairs of elements such that the first element in each pair belongs
to the first set and the second element in each pair belongs to the second set. It
is denoted by a cross (×). For example, given two sets
• S1 = {1, 2, 3}
• and
• S2 = {4, 5, 6}
• the Cartesian product S1 × S2 is the set
• {(1,4), (1,5), (1,6), (2,4), (2,5), (2,6), (3,4), (3,5), (3,6)}
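The same product can be computed with `itertools.product` from the Python standard library. Lists are used here instead of sets only so that the pairs print in the order shown above.

```python
from itertools import product

S1 = [1, 2, 3]
S2 = [4, 5, 6]

# Every ordered pair (a, b) with a taken from S1 and b taken from S2.
pairs = list(product(S1, S2))

print(pairs)
# [(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]
print(len(pairs))  # 9, that is len(S1) * len(S2)
```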
41. Assume that the tables refer to male and female staff respectively. In order to obtain
all possible inter-staff marriages, the Cartesian product can be taken.
In order to preserve unique names for attributes, the original attribute names have
been concatenated with the original table names. The new table has also been
given an identity.
43. • Selection: The selection operator yields a ‘horizontal’ subset of a given relation,
that is, the subset of tuples (rows) of the
given relation for which a particular condition is satisfied.
• Thus, in the following example:
• S1 = {1,2,3,4,5}
• S2 = {2,3,4}
• S2 is a subset of S1.
• Since the body part of a table is a set, it is possible for it to have subsets, that
is, a selection from its tuples can be used to form another relation.
44. • Restriction is achieved using comparison operators such as equal to (=), not
equal to (<>), greater than (>), less than (<), greater than or equal to (>=) and less
than or equal to (<=).
• Example: Consider a database having the following tables:
Format:
σ <selection condition> (R). Choose tuples that satisfy the selection condition.
E.g.: σ Major = ‘CS’ (Students)
Student table
SID Name GPA Major
456 JAY 3.2 CS
457 AJAY 3.7 CS
45. Result table
SID Name GPA Major
456 JAY 3.2 CS
457 AJAY 3.7 CS
Note: The relational algebra select operation chooses
tuples from a relation.
Here, the desired output is the tuples of the
students who have taken CS as their Major. The selection condition is a
Boolean expression that may include =, ≠, <, ≤, >, ≥, and, or, not.
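Selection can be sketched as a filter over the rows of a relation. The function name `select` is an assumption of this sketch, and the third student row is hypothetical, added only so that the condition actually excludes something.

```python
def select(relation, predicate):
    """σ: keep the rows for which the selection condition holds."""
    return [row for row in relation if predicate(row)]


students = [
    {"SID": 456, "Name": "JAY", "GPA": 3.2, "Major": "CS"},
    {"SID": 457, "Name": "AJAY", "GPA": 3.7, "Major": "CS"},
    {"SID": 458, "Name": "RAVI", "GPA": 3.5, "Major": "EE"},  # hypothetical row
]

# σ Major = 'CS' (Students)
cs_students = select(students, lambda row: row["Major"] == "CS")
print([row["SID"] for row in cs_students])  # [456, 457]

# A compound Boolean condition: Major = 'CS' and GPA > 3.5
top_cs = select(students, lambda row: row["Major"] == "CS" and row["GPA"] > 3.5)
print([row["SID"] for row in top_cs])  # [457]
```

The result keeps every attribute of the selected rows; only the number of rows changes.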
46. Projection:
The projection yields a 'vertical' subset of a given relation, that is, the subset
obtained by selecting specified attributes, in a specified left-to-right order, and then
eliminating duplicate tuples within the attributes selected.
BookID ReturnDate
q-110 20-May-2008
w-990 21-Jun-2008
f-100 23-Jun-2008
r-800 27-Jun-2008
q-501 15-Jul-2008
Example: Issue[BookId,ReturnDate]
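Projection, including the duplicate-elimination step, can be sketched as below. The function name `project` and the `MemberID` column with its values are hypothetical additions; the BookID and ReturnDate values are taken from the first three rows of the table above.

```python
def project(relation, attributes):
    """Keep only the named attributes, in the given order, dropping duplicates."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attributes)
        if projected not in seen:
            seen.add(projected)
            result.append(projected)
    return result


# MemberID is an assumed extra column of the Issue relation.
issue = [
    {"BookID": "q-110", "MemberID": "M01", "ReturnDate": "20-May-2008"},
    {"BookID": "w-990", "MemberID": "M01", "ReturnDate": "21-Jun-2008"},
    {"BookID": "f-100", "MemberID": "M02", "ReturnDate": "23-Jun-2008"},
]

# Issue[BookID, ReturnDate]
print(project(issue, ("BookID", "ReturnDate")))

# Projecting onto MemberID alone collapses the repeated value M01.
print(project(issue, ("MemberID",)))  # [('M01',), ('M02',)]
```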
48. Division
The division operator divides a dividend relation "r" of degree (m+n) by a
divisor relation "s" of degree (m) and produces a resultant relation of degree "n".
REPRESENTATION: Let r(R) and s(S) be relations.
r ÷ s:
The result consists of the restrictions of tuples in r to the attribute names unique to
R, i.e. in the header of r but not in the header of s, for which it holds that all their
combinations with tuples in s are present in r.
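A small sketch of division may help. It assumes, purely for illustration, that the attributes shared with the divisor are the trailing columns of each tuple in r; the function name `divide` and the student and course data are hypothetical.

```python
def divide(r, s, divisor_degree):
    """r ÷ s, assuming the divisor's attributes are the last columns of r."""
    # Restrict each tuple of r to the attributes that are not in s.
    candidates = {row[:-divisor_degree] for row in r}
    # Keep a candidate only if it is paired in r with every tuple of s.
    return {x for x in candidates if all(x + y in r for y in s)}


# Hypothetical data: r(Student, Course) and s(Course).
r = {
    ("Ann", "DBMS"), ("Ann", "OS"),
    ("Bob", "DBMS"),
}
s = {("DBMS",), ("OS",)}

# Students who are enrolled in every course listed in s.
print(divide(r, s, 1))  # {('Ann',)}
```

Here r has degree 2 and s has degree 1, so the result has degree 1, in line with the degree rule stated above.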
51. Join
• Joins are used to restrict the number of rows obtained from the Cartesian product.
• A join is performed on two relations having one or more attributes in common, and
these attributes should have the same data type.
• It is a binary operation that combines certain selections and a Cartesian
product into one operation.
• It is denoted by |X| (⋈).
• It is associative.
REPRESENTATION: (P) |X| (JOIN CONDITION) (Q)
where P and Q are the names of the relations and the join condition is of the form pi θ qi,
where pi is an attribute of the relation P, qi is an attribute of the relation Q, and θ is a comparison operator.
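The description of a join as a selection applied to a Cartesian product can be sketched directly. The function name `theta_join`, the enrollment rows and the column name `Enr_Stu_Id` are hypothetical, and the sketch assumes the two relations use distinct attribute names so that merged rows do not overwrite each other.

```python
from itertools import product


def theta_join(p, q, condition):
    """Cartesian product of p and q, restricted to pairs satisfying the condition."""
    return [
        {**p_row, **q_row}
        for p_row, q_row in product(p, q)
        if condition(p_row, q_row)
    ]


students = [
    {"Stu_Id": 101, "Stu_Name": "Chaitanya"},
    {"Stu_Id": 102, "Stu_Name": "Arya"},
]
# Hypothetical enrollment rows.
enrollments = [
    {"Enr_Stu_Id": 101, "Course": "DBMS"},
    {"Enr_Stu_Id": 101, "Course": "OS"},
    {"Enr_Stu_Id": 103, "Course": "DBMS"},
]

# Join condition Stu_Id = Enr_Stu_Id, i.e. θ is the equality operator.
joined = theta_join(
    students, enrollments,
    lambda s, e: s["Stu_Id"] == e["Enr_Stu_Id"],
)

print(len(list(product(students, enrollments))))        # 6 rows in the full product
print([(row["Stu_Name"], row["Course"]) for row in joined])
# [('Chaitanya', 'DBMS'), ('Chaitanya', 'OS')]
```

The join keeps 2 of the 6 rows of the Cartesian product, which is the restriction described in the first bullet.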