Question

In: Computer Science

Part #4: Data normalization: Background: Data in a relational database is stored in a normalized form....

Part #4: Data normalization:

Background: Data in a relational database is stored in a normalized form. Data normalization or just normalization is a strategy used to organize data into multiple related tables to reduce data redundancy while preserving data integrity.

Exercise: Normalize the student data in the University table (sample data is shown further below) into 3 tables, namely, Student, Department, and Course. You do not need to populate data. Just illustrate the schema for the 3 tables. You may make suitable assumptions to normalize the data. Ensure you introduce additional columns as needed to preserve relationships between the 3 tables.

University table

StudentID

Name

Address

Dept_ID

Dept

CRN

Course

Room

+0119144

Kodi

3 S. East..

CSE

Compu..

241433

Intro to..

Ben 010

+0235422

Mary

10 Cherr..

CSE

Compu..

241433

Intro to..

Ben 010

+0717113

Alex

64 Rose..

CSE

Compu..

241433

Intro to..

Ben 010

+0115144

Jones

3 Talaw..

CSE

Compu..

241433

Intro to..

Ben 010

+0334455

Tommy

22 Dow..

CSE

Compu..

241433

Intro to..

Ben 010

Another 95 rows of this table for CRN 241433 not shown for brevity

Show just the schema (without any data) for the normalized form of the data with 3 tables in the space below:


Estimating benefit of normalization

The objective of normalization is to reduce redundant data in a database. Assume the University table has 100 rows -- that is 100 students in 1 course taught by 1 department. Each row has 8 columns. So the total number of columns in the database (which has just 1 University table) is 100 * 8 = 800 cells.

Now, compute similar information using the normalized version of your tables. In the normalized tables, indicate how many rows & total number of columns that would be present and fill-in the following table:

Table name

Number of rows

Cols per row

Number of cells in the table

Student

  

  

Department

Course

The total number of cells in the normalized version (with 3 tables) of the database is:

If each cell occupies 100 bytes, what would be the percentage difference in size of the size of the two databases? Compute the answer by completing the data in the following tables:

Size of un-normalized University database:

Size of normalized database:

Raw difference in database size:

Percentage difference in database size:

Solutions

Expert Solution

As per data given in the university table we will distribute the data and seperate it into three tables student, course and department.

In the student table each students should have unique student id, name and address.

In case of department we have unique department id and department name.

Course table has unique CRN(course reference number),Course and room.

Now we will make relationship between these tables to preserve data integrity. Here, we will use foreign key relationship between tables.

Relationship between course and department table:

We know that a department have many courses. So, we will add one column to course table 'depID' with foreign key constraint.

So, now the columns of course table will be:

Course table has unique CRN(course reference number) ,Course,room and depID.

Relationship between student and course table:

Note: We will assume that a student can join a single course.

Then, we need to add one column 'courseID' to student table with foreign key contraints.

So, now the columns of student table will be:

Student table should have unique student id, name, address and courseID.

--------------------------------------------------------------------------------------------

Student table:

Number of rows: Assume 100 students

Number of columns: 4 (studentID, name, address, courseID)

Cols per row:   4

Number of cell in table: 400 (Number of rows multiply by cols per row)

-----------------------------------------------------------------------------------------------

Course table:

Note: As given we have only one course that is 'Intro to ...'. So, we have only one course. Thus, number of rows in course table will be 1.

Number of rows: 1

Number of columns: 4 (CRN, course, room,depID)

Cols per row: 4

Number of cell in table: 4 (Number of rows multiply by cols per row)

-------------------------------------------------------------------------------------------

Department table:

Note: As given we have only one department that is 'CSE'. So, we have only one department. Thus, number of rows in course table will be 1.

Number of rows: 1

Number of columns: 2 (depID, depName)

Cols per row: 2

Number of cell in table: 2 (Number of rows multiply by cols per row)

----------------------------------------------------------------------------------------------

The total number of cells in the normalized version (with 3 tables) of the database is: (400+4+2) = 406

-----------------------------------------------------------------------------------------------

If each cell occupies 100 bytes, what would be the percentage difference in size of the size of the two databases? Compute the answer by completing the data in the following tables:

Size of un-normalized University database:

Number of cells X space occupy by each cell

800 X 100 = 80000 bytes

Size of normalized database:

Number of cells X space occupy by each cell

= 406 X 100 = 40600 bytes

Raw difference in database size:

80000 - 40600 = 39400 bytes

-----------------------------------------------

Percentage difference in database size:

Calculate percentage difference
between V1 = 80000 and V2 = 40600

|V1−V2|[(V1+V2)/2]×100

=|80000−40600|[(80000+40600)/2]×100

=|80000−40600|[(80000+40600)/2]×100

=(39400/(120600/2))×100

=(39400/60300)×100

=0.6534×100

=65.34%difference


Related Solutions

The advantages of a relational database include limiting the amount of redundant data that are stored...
The advantages of a relational database include limiting the amount of redundant data that are stored in a database. Why is this an important advantage? What can go wrong when redundant data are stored?
The following tables form part of a database (Flights Database) held in a relational DBMS: employee...
The following tables form part of a database (Flights Database) held in a relational DBMS: employee (empNo, empName, empSalary, empPosition) aircraft (aircraftNo, acName, acModel, acFlyingRange) flight (flightNo, aircraftNo, fromAirport, toAirport, flightDistance, departTime, arriveTime) certified (empNo, aircraftNo) Where:  employee contains details of all employees (pilots and non-pilots) and empNo is the primary key;  aircraft contains details of aircraft and C is the primary key.  flight contains details of flights and (flightNo, aircraftNo) form the primary key.  certified...
One of the characteristics of good relational database design is normalized tables. Discuss two ways in...
One of the characteristics of good relational database design is normalized tables. Discuss two ways in which normalization helps minimize data redundancy and anomalies such as insertion, deletion, and update anomalies. What are some other characteristics of good database design?
Task 2 (1 mark) Transformation of data stored in the relational tables into data stored in...
Task 2 (1 mark) Transformation of data stored in the relational tables into data stored in BSON collection.
Data Modeling and Database Design (Database Concepts, Eighth Edition) The relational model is the most important...
Data Modeling and Database Design (Database Concepts, Eighth Edition) The relational model is the most important standard in database processing today. Why do you feel that this model has continued to be successful in the world of IT? What would happen if large corporations decided to reject this theory, and store their data using a non-relational model? Are there any success stories where this has happened?
Database administrators use backups and replication to protect data stored in a database. Compare and contrast...
Database administrators use backups and replication to protect data stored in a database. Compare and contrast the two operations. Describe a business application scenario, and provide and discuss a sample query of a common backup or restore operation.
For this part, assume you are entering information about transactions into their relational database. You will...
For this part, assume you are entering information about transactions into their relational database. You will be entering financial information, but you will also need to consider the other types of information Big Marker would want to know about that event. You will also utilize what you have learned to identify internal controls specific to each transaction. The transactions are as follows: a. Billed 30 communities for monthly dues of $600 (each). b. Borrowed $10,000 from First National Bank with...
Compare the Data Modeling difference between relational database and NoSQL MongoDB.
Compare the Data Modeling difference between relational database and NoSQL MongoDB.
The following table is a non-normalized table for a Rental Property Management Database.  Data is included to...
The following table is a non-normalized table for a Rental Property Management Database.  Data is included to help you to perform the normalization.  One client may have multiple rental records. Identify the functional dependencies and normalize the table to 1NF, 2NF, and 3NF respectively. You also need to show the steps to reach normal form and populate each relation with sample data. As a result, you can see that duplicate data will be removed through the normalization process. CLIENT_RENTAL ClientNo ClientName PropertyNo...
Experiment with SQL queries on 2 tables: Background: Normalized databases will involve data that is organized...
Experiment with SQL queries on 2 tables: Background: Normalized databases will involve data that is organized into 2 or more tables. However, to answer queries, the data from 2 (or more) tables will need to be joined together to get the relevant information. Join operations in SQL are accomplished in the following manner: Indicate the tables involved in the query in the FROM clause Specify relationship/condition between columns in from the tables. Exercise: For this part of the exercise the...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT