Computer Sciene of Udayana State University

April 26, 2009

Database Design and Normalization (Normalization Quiz)

Filed under: Computer Science — ignaga @ 2:41 AM

u Database design consists in grouping attributes that form “good” relation schemas.

u Normalization: a design process to reduce redundancies and update anomalies in a relational schema.

Result: a set of decomposed relations that meet certain normal form tests

Four most commonly used normal forms are first (1NF), second (2NF) and third (3NF) normal forms, and Boyce–Codd normal form (BCNF)

A process that is based on keys and functional dependencies among the attributes of a relation

Advertisements

Normalized Databases

Filed under: Computer Science — ignaga @ 2:34 AM

  • Normalized databases have a design that reflects the true dependencies between tracked quantities, allowing quick updates to data with little risk of introducing inconsistencies.
  • Instead of attempting to lump all information into one table, data is spread out logically into many tables.
  • Normalizing the data is decomposing a single relation into a set of smaller relations which satisfy the constraints of the original relation.
  • Redundancy can be solved by decomposing the tables. However certain new problems are caused by decomposition.
  • Normalization helps us to make a conscious decision to avoid redundancy keeping the pros and cons in mind.
  • One can only describe a database as having a normal form if the relationships between quantities have been rigorously defined.
  • It is possible to use set theory to express this knowledge once a problem domain has been fully understood, but most database designers model the relationships in terms of an “idealized schema”. (The mathematical support came back into play in proofs regarding the process of transforming from one form to another.)
  • The transformation of conceptual model to computer representation format is known as Normalization.

Functional Dependency (FD)

Filed under: Computer Science — ignaga @ 2:27 AM

u Important concept for normalization.

u Functional Dependency (FD)

Are constraints that are derived from the meaning and interrelationships of the data attributes

If A and B are attributes of relation R, B is functionally dependent on A (denoted A –> B), or A functionally determines B, if each value of A in R is associated with exactly one value of B in R.

Each value of A determines a value of B

A and B can be sets of attributes.

u FDs are derived from the real-world constraints on the attributes

u Social security number determines employee name

SSN –> ENAME

u Project number determines project name and location

PNUMBER –> {PNAME, PLOCATION}

u Employee ssn and project number determines the hours per week that the employee works on the project

{SSN, PNUMBER} –> HOURS

u An FD is a property of the attributes in the schema R.

u The constraint must hold on every relation instance.

u If K is a key of R, then K functionally determines all attributes in R

(since we never have two distinct tuples with t1[K]=t2[K])

u FDs are nontrivial. (StaffNo–>StaffNo is trivial)

example

Given TEXT we know the COURSE.

TEXT –> COURSE

u TEXT maps to a single value of COURSE

Relationship Between Normal Forms

Filed under: Computer Science — ignaga @ 2:23 AM

relationship-between-normal-forms

Unnormalized Form (UNF)

Filed under: Computer Science — ignaga @ 2:19 AM
  • A table that contains one or more repeating groups.
  • To create an unnormalized table:

Transform data from information source (e.g. form) into table format with columns and rows.

First Normal Form (1NF)

Filed under: Computer Science — ignaga @ 2:16 AM
  • A relation in which intersection of each row and column contains one and only one value.

It does not allow non-atomic attributes.

  • An unnormalized relation must be converted to a 1NF relation

First identify a primary key, then

place repeating data along with copy of the original key attribute(s) into a separate relation.

Making relation to satisfy 4NF

Second Normal Form (2NF)

Filed under: Computer Science — ignaga @ 2:13 AM
  • Based on concept of full functional dependency:

A and B are attributes of a relation,

B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.

  • 2NF – A relation that is in 1NF and every non-prime attribute is fully functionally dependent on the primary key.

Prime attribute: An attribute that is member of the primary key K

Example: EMP_PROJ(SSN, Pnum, Hours, Ename, Pname, Ploc)

{SSN,Pnum}–>Hours, SSN–>Ename, Pnum–>{Pname, Ploc}

* Relation is not in 2NF

1NF to 2NF

Filed under: Computer Science — ignaga @ 2:11 AM
  • Identify FD’s in the relation.
  • If partial dependencies exist on the primary key remove them by placing them in a new relation along with copy of their determinant.

EMP_PROJ(SSN, Pnum, Hours, Ename, Pname, Ploc)

{SSN,Pnum}–>Hours, SSN–>Ename, Pnum–>{Pname, Ploc}

EMP_PROJ decomposed into:

EMPLOYEE(SSN, Pnum, Hours) , {SSN,Pnum}–>Hours

WORKS_ON(SSN, Ename) , SSN–>Ename

PROJECT(Pnum,Pname, Ploc) , Pnum–>{Pname, Ploc}

Third Normal Form (3NF)

Filed under: Computer Science — ignaga @ 2:09 AM
  • Based on concept of transitive dependency:

A, B and C are attributes of a relation such that if A –> B and B –> C,

then C is transitively dependent on A through B. (Provided that A is not functionally dependent on B or C).

  • 3NF – A relation that is in 1NF and 2NF and in which no non-prime attribute is transitively dependent on the primary key.

2NF to 3NF

Filed under: Computer Science — ignaga @ 2:07 AM
  • Identify FD’s in the relation.
  • If transitive dependencies exist on the primary key remove them by placing them in a new relation along with copy of their determinant.

EMP_DEPT(Enum, Ename, Sal, Dnum, Dname, Mgr)

Enum–>{Ename, Sal, Dnum}, Dnum–>{Dname, Mgr}

second FD is transitive through Dnum

Decompose EMP_DEPT into

EMP(Enum, Ename, Sal, Dnum) and DEPT(Dnum, Dname, Mgr)

Next Page »

Create a free website or blog at WordPress.com.