For more Https://www.ThesisScientist.com
Unit 5
Normalization
Normalization is a technique for organizing the contents of tables in transactional
databases and data warehouses.
First Normal Form :
Looking at the data in the book's example, or otherwise assuming that every attribute holds
atomic values, we find that the table is in 1NF.
Second Normal Form :
From the FDs we find that the PK for the table is a composite one comprising empId and
projName. We did not include the determinant of the fourth FD, empDept, in the PK
because empDept is dependent on empId, and empId is already included in our proposed PK.
However, with this PK (empId, projName) we get partial dependencies in the table
through FDs 1 and 3, where some attributes are determined by a subset of the PK,
which violates the requirement for 2NF. So we split our table based on
FDs 1 and 3 as follows:
PROJECT (projName, projMgr, startDate)
EMPLOYEE (empId, empName, salary, empMgr, empDept)
WORK (projName, empId, hours, rating)
All the above three tables are in 2NF since they are in 1NF and there is no partial dependency
in them.
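The partial-dependency check can be sketched in a few lines of Python. The FD list below is an assumption reconstructed from the tables produced in these notes (the original FD list is not reproduced here), and the function names `closure` and `partial_dependencies` are illustrative only.

```python
from itertools import combinations

# Assumed FDs, reconstructed from the resulting tables (not given verbatim in the notes).
FDS = [
    (frozenset({"empId"}), frozenset({"empName", "salary", "empDept"})),   # FD 1
    (frozenset({"empId", "projName"}), frozenset({"hours", "rating"})),    # FD 2
    (frozenset({"projName"}), frozenset({"projMgr", "startDate"})),        # FD 3
    (frozenset({"empDept"}), frozenset({"empMgr"})),                       # FD 4
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    key = set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - key
            if determined:
                found[subset] = determined
    return found


violations = partial_dependencies({"empId", "projName"}, FDS)
for subset, attrs in violations.items():
    print(subset, "->", sorted(attrs))
```

Under these assumed FDs, both empId and projName on their own determine non-key attributes, which is why the table is split along those two subsets of the key.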
Third Normal Form
From the four FDs we find that the tables are in 2NF and that there is no transitive
dependency in the PROJECT and WORK tables, so these two tables are in 3NF. However, there is
a transitive dependency in the EMPLOYEE table, since FD 1 says empId → empDept and FD 4 says
empDept → empMgr. To remove this transitive dependency we further split the EMPLOYEE
table into the following two:
EMPLOYEE (empId, empName, salary, empDept)
DEPT (empDept, empMgr)
Hence we finally get four tables:
PROJECT (projName, projMgr, startDate)
EMPLOYEE (empId, empName, salary, empDept)
WORK (projName, empId, hours, rating)
DEPT (empDept, empMgr)
These four tables are in 3NF with respect to the given FDs, hence the database has been
normalized up to 3NF.
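As a rough illustration, the four 3NF tables could be declared as follows using Python's built-in sqlite3 module. The key attributes are read here as projName, empId, and empDept, and every column type is an assumption, since the notes do not specify data types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column types are assumed for illustration; the notes give only attribute names.
conn.executescript("""
CREATE TABLE DEPT (
    empDept TEXT PRIMARY KEY,
    empMgr  TEXT
);
CREATE TABLE EMPLOYEE (
    empId   TEXT PRIMARY KEY,
    empName TEXT,
    salary  REAL,
    empDept TEXT REFERENCES DEPT(empDept)
);
CREATE TABLE PROJECT (
    projName  TEXT PRIMARY KEY,
    projMgr   TEXT,
    startDate TEXT
);
CREATE TABLE WORK (
    projName TEXT REFERENCES PROJECT(projName),
    empId    TEXT REFERENCES EMPLOYEE(empId),
    hours    REAL,
    rating   INTEGER,
    PRIMARY KEY (projName, empId)
);
""")

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Each non-key attribute now depends only on the key of its own table, and the manager of a department is stored once in DEPT instead of once per employee.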
Physical Database Design
After completing the logical database design and normalizing it, we have to establish the
physical database design. Throughout conceptual design, logical design, and normalization,
the primary objectives have been storage efficiency and the consistency of the database,
so we have been following good design principles. In physical database design, however,
the focus shifts from storage efficiency to efficiency of execution, and we may
deliberately violate some of the rules studied earlier. This shift in focus must never
lead to an incorrect state of the database; correctness has to be maintained in every
case. Departing from good design principles makes it harder to maintain the consistency
or correctness of the database, but because the violation is deliberate, we know both the
reasons for it and the dangers it brings, so we have to guard against the possible threats
and adopt appropriate measures. Finally, there are different design possibilities, and we
as designers have to choose among them based on clearly stated objectives.
The physical DB design:
• Transforms the logical DB design into technical specifications for storing and
retrieving data
• Does not include actually implementing the design; however, tool-specific decisions
are involved
It requires the following input:
• Normalized relations (the process performed just before)
• Definitions of each attribute (the purpose or objective of each attribute, normally
stored in some form of data dictionary or CASE tool, or possibly on paper)
• Descriptions of data usage (how and by whom the data will be used)
• Requirements for response time, data security, backup, etc.
• The tool to be used
Decisions that are made during this process are:
• Choosing data types (the precise data types depend on the tool to be used)
• Grouping attributes (although normalized)
• Deciding file organizations
• Selecting structures
• Preparing strategies for efficient access
That is all for today's lecture; the discussion continues in the next lecture.
Summary
In today's lecture we summarized the normalization process and worked through an example
that applies it in practice. We have introduced our next topic, physical DB design, which
we will discuss in the lectures that follow.
The Physical Database Design Considerations and Implementation
The physical design of the database is one of the most important phases in the
computerization of any organization. It involves a number of important steps, which are
carried out in sequence and need to be performed precisely, so that the result of each
step can be used properly as input to the next. Before moving on to the physical database
design, the design of the database should have gone through the following steps:
Normalization of relations
Volume estimate
Definition of each attribute
Description of where and when data is used (with frequencies)
Expectation or requirements of response time and data security.
Description of the technologies.
For the physical database design we need to examine how the data is used, in terms of its
size and access frequency. These decisions are critical to ensuring that proper structures
are used and that the database is optimized for performance and efficiency.
The following steps are necessary once the prerequisites are complete:
Selecting the appropriate attributes and a corresponding data type for each attribute.
Selecting the attributes to be placed in a specific relation in the physical design needs
considerable care, as it is one of the most important and basic aspects of creating the
database.
Grouping attributes in logical order, so that each relation is created in such a way that
no information is missing from it and no redundant or unnecessary information is placed in
it. When the logical design is transformed into a physical design, there may be cases
where information that was combined logically in the logical design looks odd in the
physical one.
Arranging similar records in secondary memory (hard disk). The storage scheme on disk is
important because it determines how efficiently the data can be accessed and managed.
Different data access mechanisms are available for rapid access, storage, and modification
of data, and different structures can be used for placing data on disk. Managing the data
through indexes and choosing a suitable database architecture are vital and lead to better
retrieval and recovery of records.
Preparing queries and handling strategies for proper usage of the database, so that any
input or output operation performed on it is executed in an optimized and efficient way.
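To make the role of indexes concrete, here is a minimal sketch using Python's sqlite3 module. The table contents, the index name `idx_work_emp`, and the query are all hypothetical; the point is only that the query planner can use an index on the searched column instead of scanning every record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE WORK (projName TEXT, empId TEXT, hours REAL, rating INTEGER)")

# Hypothetical sample data: 1000 work records.
conn.executemany(
    "INSERT INTO WORK VALUES (?, ?, ?, ?)",
    [(f"P{i % 10}", f"E{i}", 8.0, 3) for i in range(1000)],
)

# An index on the column used for lookups.
conn.execute("CREATE INDEX idx_work_emp ON WORK (empId)")

# Ask SQLite how it would execute the lookup; the last column holds the plan text.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT hours FROM WORK WHERE empId = ?", ("E42",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

The exact wording of the plan differs between SQLite versions, but it should name the index rather than report a full table scan.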
DESIGNING FIELDS
A field is the smallest unit of application data recognized by system software, such as a
programming language or a database management system.
Designing fields in the physical design of a database, as discussed earlier, is a major
issue and needs to be handled with great care and accuracy. Data types are the structures
defined for placing data in attributes. Each data type is appropriate for use with a
certain kind of data.
The four major objectives for using data types when specifying attributes in a database
are:
*Minimize usage of storage space
*Represent all possible values
*Improve data integrity
*Support all data manipulation
Selecting the correct data type and a proper domain for each attribute is essential, as it
provides a number of benefits.
The most common data types in today's DBMSs share the following set of common
attributes.
CODING AND COMPRESSION TECHNIQUES :
Some attributes take values from a sparse set, and these values are awkward to express
directly in any data type; codes are used for this purpose. Because codes defined by the
database administrator or the programmer consume less space, they are better suited to
situations with a large number of records, where wasting a small amount of space in each
record adds up to a large loss of storage space and thus lowers database efficiency.
STID   STNAME    HOBBY
S1020  Sonam     Reading
S1038  narendra  Gardening
S1015  Tarun     Reading
S1015  ajay      Movie
S1018  naveen    Reading
Coding techniques are also useful for compressing the values that appear in the data: by
replacing those values with smaller codes we can further reduce the space the data needs
for storage in the database.
The following tables show the use of codes in the database environment.
Coding Example:
Student
STID   STNAME    HOBBY
S1020  Sonam     R
S1038  narendra  G
S1015  Tarun     R
S1015  ajay      M
S1018  naveen    R
Hobby Table
CODE  HOBBY
R     Reading
G     Gardening
M     Movies
In the above example the codes replace the data in the actual table: we allocated a code
to each hobby and then stored the codes in the table instead of writing out the hobby
names.
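The coded tables can be tried out with Python's sqlite3 module. This is only a sketch: the table and column names (`STUDENT`, `HOBBY`, `hobby_code`) are chosen for illustration, and STID is not declared as a key because the sample data repeats S1015.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE HOBBY (code TEXT PRIMARY KEY, hobby TEXT NOT NULL);
CREATE TABLE STUDENT (
    stid       TEXT,
    stname     TEXT,
    hobby_code TEXT REFERENCES HOBBY(code)
);
INSERT INTO HOBBY VALUES ('R', 'Reading'), ('G', 'Gardening'), ('M', 'Movies');
INSERT INTO STUDENT VALUES
    ('S1020', 'Sonam', 'R'),
    ('S1038', 'narendra', 'G'),
    ('S1015', 'Tarun', 'R'),
    ('S1015', 'ajay', 'M'),
    ('S1018', 'naveen', 'R');
""")

# Decode the one-letter codes back into hobby names with a join.
rows = conn.execute("""
    SELECT s.stid, s.stname, h.hobby
    FROM STUDENT AS s
    JOIN HOBBY AS h ON h.code = s.hobby_code
    ORDER BY s.rowid
""").fetchall()
for row in rows:
    print(row)
```

Each student row stores a one-character code, and the full hobby name is stored once in the lookup table and recovered by the join when needed.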
We get a number of benefits from the use of data types, and these benefits come in several
dimensions.
Default value
Default values are values associated with a specific attribute. They reduce the chance of
inserting incorrect values into the attribute and help prevent the attribute from being
left empty.
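A default value can be sketched with a DEFAULT clause in SQLite, driven from Python. The choice of 'Reading' as the default hobby is purely hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 'Reading' as the default is an assumption made only for this example.
conn.execute(
    "CREATE TABLE STUDENT (stid TEXT, stname TEXT, hobby TEXT DEFAULT 'Reading')")

# The hobby column is omitted, so the DBMS fills in the default.
conn.execute("INSERT INTO STUDENT (stid, stname) VALUES ('S1020', 'Sonam')")

hobby = conn.execute(
    "SELECT hobby FROM STUDENT WHERE stid = 'S1020'").fetchone()[0]
print(hobby)
```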
Range Control
Range control over the data can be achieved easily through the data type, because the data
type restricts the data entered in the field to the limits of that type.
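Beyond the limits of the data type itself, many DBMSs also allow an explicit range to be declared. The sketch below uses a CHECK constraint in SQLite, which is a different mechanism from the data type; the range of 0 to 24 hours is an assumed example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The 0..24 range is hypothetical; CHECK is used here because SQLite's
# column types alone do not restrict the range of stored values.
conn.execute(
    "CREATE TABLE WORK (empId TEXT, hours REAL CHECK (hours BETWEEN 0 AND 24))")

conn.execute("INSERT INTO WORK VALUES ('E1', 8)")  # inside the range: accepted

try:
    conn.execute("INSERT INTO WORK VALUES ('E2', 30)")  # outside the range
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("out-of-range insert rejected:", rejected)
```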
Null Value Control
As we already know, a null value is an empty value, distinct from zero and from spaces.
Databases can implement null value control through the data types or through their
built-in mechanisms.
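One built-in mechanism for null value control is the NOT NULL constraint. The following sketch, with hypothetical table and column choices, shows a null being accepted in an optional column and rejected in a mandatory one, and shows that comparing NULL with zero or with an empty string does not yield true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (stid TEXT NOT NULL, stname TEXT NOT NULL, hobby TEXT)")

# hobby is optional, so a NULL is accepted there.
conn.execute("INSERT INTO STUDENT VALUES ('S1020', 'Sonam', NULL)")

try:
    # stname is declared NOT NULL, so this insert is refused.
    conn.execute("INSERT INTO STUDENT VALUES ('S1038', NULL, 'Gardening')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Comparing NULL with 0 or '' gives NULL (None in Python), not true.
comparison = conn.execute("SELECT NULL = 0, NULL = ''").fetchone()
print(rejected, comparison)
```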
Referential Integrity
Referential integrity means keeping the values entered for a specific attribute within the
set of values that appear in another attribute of the same or of another relation.
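A foreign key is the usual way to express this. The sketch below assumes a DEPT and an EMPLOYEE table with made-up rows; note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

conn.execute("CREATE TABLE DEPT (empDept TEXT PRIMARY KEY, empMgr TEXT)")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        empId   TEXT PRIMARY KEY,
        empName TEXT,
        empDept TEXT REFERENCES DEPT(empDept)
    )
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO DEPT VALUES ('Sales', 'M1')")
conn.execute("INSERT INTO EMPLOYEE VALUES ('E1', 'Sonam', 'Sales')")  # department exists

try:
    # 'Finance' does not appear in DEPT, so the reference is invalid.
    conn.execute("INSERT INTO EMPLOYEE VALUES ('E2', 'Tarun', 'Finance')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("dangling reference rejected:", rejected)
```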

Normalisation in Database management System (DBMS)
