Disclaimer: This is an example of a student written essay.

Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of UKEssays.com.

Relational Databases: Functional Dependency and Normalization

 ✅ Paper Type: Free Essay ✅ Subject: Data Analysis ✅ Wordcount: 3975 words ✅ Published: 8th Feb 2020

Abstract

Functional dependencies and Normalization play an important role in relational database design. Functional dependencies are key to establishing the relationship between key and non-key attributes in a relation. Normalization process works towards removing anomalies in a relation and prevents data redundancy. This paper, intended to be a graduate research paper, establishes the definitions of these concepts. The paper introduces Functional Dependency and explains its inference rules. The paper also introduces Normalization and various normal forms 1NF thru 5NF including the BCNF. The paper also explains how Functional Dependencies and Normalization are related, why they are important with regards to relational databases, and advantages of designing a normalized database.

Relational Databases: Functional Dependency and Normalization

Definitions and Concepts

Functional Dependency

A functional dependency is a constraint between two sets of attributes from the database. A functional dependency, represented by X → Y, between two sets of attributes X and Y that are subsets of a relation R specifies a constraint that, for any two tuples t1 and t2 in R that have t1[X] = t2[X],they must also have t1[Y] = t2[Y].

This means the values of the set of attributes Y of a tuple in R are determined by the set of attributes X. In other words, the values of the set of attributes X functionally determine the values of the set of attributes Y. We can also say that Y is functionally dependent on Y.

The set of attributes X on the left-hand side of the functional dependency is called determinant and the set of attributes Y on the right-hand side of the functional dependency is called dependent. Despite the mathematical definition a functional dependency cannot be determined automatically. It is a property of the semantics of attributes – the database designers will have to understand how the attributes are related to each other to specify a functional dependency. (Elmasri, Ramez and Shamkant B. Navathe. 2006)

Example

Consider the example of an SSN (Social Security Number) database. Every individual has a unique SSN. So, the other attributes of the relation, like name, address etc. can be determined using the SSN. That makes SSN a determinant, and name, address, the dependents – thus establishing the functional dependencies:

SSN  name

Inference Rules for Functional Dependency

Armstrong’s axioms are a set of inference rules used to infer all the functional dependencies on a relational database. They were developed by William W. Armstrong.

If Y is a subset of X then X  Y

Axiom of augmentation: if X determines Y, then XZ determines YZ for any Z

If X  Y, then XZ  YZ

Axiom of transitivity: if X determines Y and Y determines Z, then X must determine Z

If X  Y and Y  Z, then X  Z

Union: if X determines Y and X determines Z then X must also determine Y and Z

If X  Y and X  Z, then X  YZ

Decomposition: if X determines Y and Z, then X determines Y and X determines Z separately

If X  YZ, then X  Y and X  Z

Normalization

Database Normalization is a process that allows the storage of data without unnecessary redundancy and thereby eliminate data inconsistency. A normalized database eliminates anomalies in updating, inserting, and deleting data, which improves the efficiency and effectiveness of the database. Users can maintain and retrieve data from a normalized database without difficulty. Data Normalization can be used by the designer of a database to identify and group together related data elements (attributes) and establish relationships between the groups.

Database Normalization concept and its ‘Normal Forms’ were originally invented by Edgar Codd, the inventor of the relational model. The ‘Normal Forms’ provide the criteria for determining a table’s degree of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is.

First Normal Form (1NF)

An entity type or table is in 1NF when each of its attributes contain simple values which are atomic and contains no repeating groups of data. The domain of an attribute in an 1NF table must include only atomic (simple, indivisible) values and that the value of any attribute in a tuple must be a single value from the domain of that attribute.

Example

Consider the address attribute in a sales database. It is not an atomic attribute, because it is made up of atomic attributes as street, city, state and zip. For the relation to be in 1NF, the appropriate database design should have the atomic attributes street, city, state and zip instead of an address attribute.

Un-Normalized: sales (date, order_no, product_no, product_description, price, quantity_sold, cust_name, cust_address)

1NF: sales (date, order_no, product_no, product_description, price, quantity_sold, cust_name, cust_street, cust_city, cust_state, cust_zip)

Second Normal Form (2NF)

An entity type or table is in 2NF when it is in 1NF and all its non-key attributes depend on the whole key (i.e., functional dependency). There cannot be partial dependencies.

Example

Continuing with the sales database, the order_no and the product_no form the composite key for the table. There are partial dependencies – date is dependent on order_no, but not product_no – which violates the requirement for 2NF. The product_description is dependent on product_no and not on order_no. Removing these partial dependencies will result in 2NF.

1NF: sales (date, order_no, product_no, product_description, price, quantity_sold, customer_name, customer_street, customer_city, customer_state, customer_zip)

2NF: order (date, order_no, cust_no);

product (product_no, product_description, price);

order_detail (order_no, product_no, quantity_sold);

customer (cust_no, cust_name, cust_street, cust_city, cust_state, cust_zip)

Third Normal Form (3NF)

An entity type or table is in 3NF when it is in 2NF and non-key attributes do not depend on other non-key attributes (i.e., there is no transitive dependency).

Example

Continuing with the sales database, the non-key attributes cust_city and cust_state are dependent on cust_zip which is a non-key attribute. Creating a separate zip table will transform the design into 3NF, where in there are no more dependencies between non-key attributes.

2NF: order (date, order_no, cust_no);

product (product_no, product_description, price);

order_detail (order_no, product_no, quantity_sold);

customer (cust_no, cust_name, cust_street, cust_city, cust_state, cust_zip)

3NF: order (date, order_no, cust_no);

product (product_no, product_description, price);

order_detail (order_no, product_no, quantity_sold);

customer (cust_no, cust_name, cust_street, zip_code);

zip (zip_code, city, state)

Boyce Codd Normal Form (BCNF)

An entity type or table is in BCNF when it is in 3NF and all candidate keys defined for the relation satisfy the test for third normal form.

Example

Continuing with the sales database, all the candidate keys already satisfy the 3NF requirements.

Fourth Normal Form (4NF)

An entity type or table is in 4NF when it is in BCNF and there are no non-trivial multi-valued dependencies. To move from BCNF to 4NF, remove any independently multi-valued components of the primary key to two new parent entities.

Example

For example, a professor can teach multiple subjects and can also mentor multiple students. To be in 4NF, the professor to subjects should be a separate relation and professor to students should be a separate relation – since they are independent of each other

Fifth Normal Form (5NF)

To be in 5NF, a relation decomposed into two relations must have lossless-join property, which ensures that no spurious tuples are generated when relations are reunited through a natural join.

Example

In the sales database example, when the sales database was split into order and product, the natural join of those two tables does not result in loss of data (tuples).

(Russell, Gordon. Chapter 4; Nguyen Kim Anh, Relational Design Theory)

Importance of Functional Dependency and Normalization to Relational Model

How are they related

Normalization theory draws heavily on the theory of functional dependencies. When a database designer sets out to design a database, it is essential to understand the semantics of the data – how the attributes are related to one another. This helps in establishing the functional dependencies between attributes. Once the functional dependencies are identified, the design the database in to a ‘normal form’ of the highest order possible is easier. Rules for each normal form, starting from the 1NF are invariably framed around maintaining the functional dependencies and are also based on the inference rules for functional dependencies (refer Inference Rules section). For example, to be in 2NF the non-key attributes should be dependent on the whole-key, which means the functional dependencies should be satisfied. Similarly, to be in 3NF, transitive dependency should be removed, which can be done if the functional dependencies are established correctly.

In other words, database normalization process ensures an efficient organization of data in database tables, which results in guaranteeing that data dependencies make sense, and also reducing the space occupied by the database via eliminating redundant data.

Why are they necessary for Relational Database model?

Functional dependencies play an important role in relational database design. They are used to establish keys that are used to define normal forms for relations. In addition, they help in deriving constraints based on the relationships between attributes. As a database grows in size and complexity it is essential that order and organization be maintained to control these complexities and minimize errors and redundancy in the associated data. This goal is managed by normalization. Database normalization minimizes data duplication to safeguard databases against logical and structural problems, such as data anomalies.

If you need assistance with writing your essay, our professional essay writing service is here to help!

Normalization can help keep the data free of errors and can also help ensure that the size of the database doesn’t grow large with duplicated data. Normalization permits us to design our relational database tables so that they “(1) contain all the data necessary for the purposes that the database is to serve, (2) have as little redundancy as possible, (3) accommodate multiple values for types of data that require them, (4) permit efficient updates of the data in the database, and (5) avoid the danger of losing data unknowingly (Wyllys, R. E., 2002).”

The resulting normalized database is highly efficient, which can be characterized by –

• Increased Consistency: Information is stored in one place and one place only, reducing the possibility of inconsistent data.
• Easier object-to-data mapping: Highly-normalized data schemas in general are closer conceptually to object-oriented schemas because the object-oriented goals of promoting high cohesion and loose coupling between classes results in similar solutions.

Moreover, a normalized database is advantageous when operations will be write-intensive or when ACID (Atomicity, Consistency, Isolation, Durability) compliance is required. Some advantages include:

• Updates run quickly since no data being duplicated in multiple locations.
• Inserts run quickly since there is only a single insertion point for a piece of data and no duplication is required.
• Tables are typically smaller than the tables found in non-normalized databases. This usually allows the tables to fit into the buffer, thus offering faster performance.
• Data integrity and consistency is an absolute must if the database must be ACID compliant. A normalized database helps immensely with such an undertaking.
• Searching, sorting, and creating indexes can be faster, since tables are narrower, and more rows fit on a data page.
• Minimizes/avoids data modification issues.

Summary

The paper defined the concept of functional dependency, which is the basic tool for analyzing relational schemas, and discussed some of its properties. Functional dependencies specify semantic constraints among the attributes of a relation schema. Next it described the normalization process for achieving good designs It presented examples to illustrate how by using the general definition of the normal forms, a given relation may be analyzed and decomposed to eventually yield a set of relations in 3NF. The paper also touches not often used BCNF, 4NF and 5NF normal forms.

Then the paper explains how functional dependencies and normalization are inter-related in the design of a relational model database. It explains the importance of functional dependency and normalization in the design of a relational database. A normalized database is highly efficient and has many advantages.

References

• Wyllys, R. E., 2002. Database management principles and applications
• Elmasri, Ramez and Shamkant B. Navathe. 2006. Fundamentals of Database Systems. 5th ed. Reading, MA: Addison-Wesley
• Russell, Gordon. Chapter 4 – Normalization. Database eLearning
• Nguyen Kim Anh, Relational Design Theory. OpenStax CNX
• Gaikwad, A.S., Kadri, F.A., Khandagle, S.S., Tava, N.I. (2017) Review on Automation Tool for ERD Normalization. International Research Journal of Engineering and Technology (IRJET) [Online]. 4 (2), pp. 1323-1325. [Accessed 07 May 2017]. Available from: https://www.irjet.net/archives/V4/i2/IRJET-V4I2259.pdf
• https://en.wikipedia.org/wiki/ACID_(computer_science)

Tables

Un-normalized Table: sales

sales

 date order_no cust_no product_description price quantity_sold cust_name cust_address 12/12/2018 1001 A320 MP3 10.00 8 Tom 1 Main St, Hartford, CT 06106 12/12/2015 1001 B101 Ipod 100.00 4 Tom 1 Main St, Hartford, CT 06106 01/05/2019 1002 C101 Blu Ray 80.00 3 Aaron 1 Holy Lane, Manchester, 06040

1NF Table: sales

sales

 date order_no product_no product_description price quantity_sold cust_name …. 12/12/2018 1001 A320 MP3 10.00 8 Tom 12/12/2015 1001 B101 Ipod 100.00 4 Tom 01/05/2019 1002 C101 Blu Ray 80.00 3 Aaron

Continued..

sales

 …. cust_street cust_city cust_state cust_zip 1 Main St Hartford CT 06106 1 Main St Hartford CT 06106 1 Holy Lane Manchester CT 06040

2NF Table: order

order

 date order_no cust_no 12/12/2018 1001 101 01/05/2019 1002 102

2NF Table: product

product

 product_no product_description price A320 MP3 10.00 B101 Ipod 100.00 C101 Blu Ray 80.00

2NF Table: order_detail

order_detail

 order_no product_no quantity_sold 1001 A320 8 1001 B101 4 1002 C101 3

2NF Table: customer

customer

 cust_no cust_name cust_street cust_city cust_state cust_zip 101 Tom 1 Main Street Hartford CT 06106 102 Aaron 1 Holy Lane Manchester CT 06040

3NF Table: customer

customer

 cust_no cust_name cust_street zip_code 101 Tom 1 Main Street 06106 102 Aaron 1 Holy Lane 06040

3NF Table: zip

zip

 zip_code city State 06106 Hartford CT 06040 Manchester CT

View all

DMCA / Removal Request

If you are the original writer of this essay and no longer wish to have your work published on UKEssays.com then please:

Related Services

Prices from

SGD205.84

Approximate costs for:

• 1000 words
• 7 day delivery

Humanity University

Dedicated to your worth and value as a human being!

Related Lectures

Freelance Writing Jobs

Looking for a flexible role?
Do you have a 2:1 degree or higher?

Study Resources

Free resources to assist you with your university studies!