Chapter 1: Database Management System (DBMS)
1.1 Introduction : Data, Information, Database and DBMS
Information is the backbone of any organization. It is the most critical resource of the organization. Different methods and techniques have been adopted to keep up-to-date information. Before the invention of the computer, people used to maintain database using different devices and used to keep records in the paper. With the development of computer system, methods of processing, techniques of keeping data and information have been completely changed. Computer has replaced millions of files and folders. It can store a large amount of data in a small place of a disk. Let us discuss some important terms of database.
Data: Data is raw facts which are composed of alphabets, digits and other symbols. It may or may not give any sense For example: 1, Rubina, 5.4, 2, Rasbina etc.
Information: When data are processed using a database program or software, they are converted to the meaningful result, called information. In other words, the output of data processing is called information. It is an organized collection of related data, which gives a complete sense. Example: Information about weight of 3 students.
Height of students | ||
---|---|---|
S.N. | Name | Height |
1 | Rubina | 5.4 |
2 | Rasbina | 5.6 |
3 | Rahul | 5.8 |
Database: Database is a collection of interrelated data of objects or entities stored in tabular form. Database gives very useful information for an organization during data manipulation and decision making. It provides a base or foundation for managing large volume of data in well organized manner. For example: Phone Diary, Result Sheet, Customer Records and Price List etc.
Database Management System (DBMS)
A database management system is computerized records keeping system. It is a software that defines, manipulates and manages the database. It allows to access the files, update the records and retrieve data as requested. In other words, DBMS is defined as the collection of interrelated data and set of programs to access those data.
POINTS TO REMEMBER
DBMS is defined os the collection of interrelated data and set of programs to access those data. The collection of data is usually database which contains the Information about any particular organization and set of programs is special type language to manage those database.
The primary goal of a DBMS is to provide an efficient and effective environment for both data retrieval and storing data in database. Commercially database management system is called database packages or also called database engine, Some examples are FoxPro, DBase, Sybase, MS Access, My SQL, MS SQL Server, Oracle and DB2 etc.
The above figure shows the logical structure of database management system. Users may be persons or application from external environment that supplies various requests to the system. Database requests mean the various activities from users. The database system checks for the requests and performs various activities as per the need of users.
Some major database activities are:
- Adding new file to the database
- Inserting data into the database file
- Retrieving/Viewing data from the database
- Updating data in existing database file
- Deleting data from the database file
- Removing files from the database
POINT TO REMEMBER
Database is a collection of interrelated data of entities or objects which is stored in a computer in such a way that it can be easily accessed by user.
Assignment-1
1. Define database.
2. Compare data and database.
1.2. Field, Record, Objects, Primary Key, Alternate key, Candidate key
- Field: A field is the property or attributes of a table. Examples: SN, Name, Weight are fields.
- Record: A record is the collection of interrelated fields. Example: there are three records in above table.
- Table: A table is the arrangements of rows and columns. Each table must have unique name and must be simple. A row defines a record and a column defines a field in a table
Objects: Objects are used in database to store or reference data.So, objects holds data for manipulation and display it in the proper format. Main components of DBMS are field, record and table. Example of components of objects are:
Height of students | ||
---|---|---|
SN | Name | Height |
1 | Rubina | 5.4 |
2 | Rasbina | 5.6 |
3 | Rahul | 5.8 |
Primary Key: The field or fields that contain the unique value can be set as primary key. It does not permit duplicate or null values. For example, the registration number of a student is a primary key.
Table:Student | |
---|---|
Reg_Number | Name |
1001 | Hari |
1002 | Sita |
1003 | Ram |
Here, Reg_Number is the primary key.
Alternate Key: Alternate is not a primary key but combining one or more columns it make unique. So, alternate key associated key with one or more columns whose values uniquely identify every row in the table. Each alternate key can generate a unique index or a unique constraint in a target database. For example, where the primary key for a table may be the student id, the alternate key might combine the first, middle, and last names of the student.
Candidate Key: Occasionally we may encounter a relation in which there s more than one attributes processing the unique identification property. So, all attribute combinations inside a relation that can provide the unique record in a relation are called candidate keys.
1.3 Advantages and Disadvantages of DBMS
Advantages of DBMS
-
Sharing Data: Data stored in a database can be shared. It refers to the capacity that makes data simultaneously accessible by many users without any interference.
-
Reduced data redundancy: The same data may be duplicated at many times or places, is called data redundancy. DBMS reduces such type of duplication of data from database.
-
Data backup and recovery: DBMS provides backup facilities to store data for future use. If any files or data lost in any computer, it is possible to restore them from database server.
-
Inconsistency avoided: When the same data is duplicated and changes are made at one site, and mot on other sites, it causes data inconsistency. DBMS avoided such type of data inconsistency.
-
Data integrity: Data integrity means data accuracy, consistency and up to date. A DBMS should provide capabilities for defining and enforcing constraints for data integrity.
-
Data Security: In database system, an unauthorized person cannot access data from database. Although various departments may share data in database, access to specific information that can be limited to selected users.
-
Data independence: Description of the data(Schema) is stored in one central place. Therefore, applications do not have to recompile when the format of the data changes.
-
Multiple user interfaces: DBMS provides variety of interfaces for various users. It provides query language interface, forms and command interfaces so that users interact easily with the database.
-
Process complex query: It provides different methods such as view, trigger, index etc. to process complex queries.
Disadvantages of DBMS
- Expensive: Database software is very expensive for large computer systems. It also requires overhead costs for maintaining and integrity functions.
- Changing technology: It is fast changing technology.
- Needs Technical Training: It is complex to understand and implement. So, proper training is required for staff to work properly in the database system.
- Backup is needed: It needs to explicit backup. This adds costs as new storage space are needed to hold the data.
POINT TO REMEMBER
A DBMS is a set of programs that manages the database files. It allows accessing the files, updating the records and retrieving data as requested.
Assignment-2
1. Describe primary key, alternate key and candidate key with examples.
2. Write advantages of DBMS.
1.4 DDL(Data Definition Language) and DML(Data Manipulation Language)
SQL stands for Structured Query Language. It is an international standard database query language for accessing and managing data in the database. SQL was introduced and developed by IBM in early 1970s. IBM was able to demonstrate SQL which could be used for communicating with database. It provides platform which allows the user to query a database without getting depth knowledge of the design of the underlying tables SOL has statements for data definition, data manipulation and data control. A query is a request to the DBMS for the retrieval, modification, insertion and deletion of the data from the database.
SQL (Structured Query Language) Pronounced an "See"-"Quell" is made of three sub languages DDL, DML and DCL.
Basic Structure of SQL
To create a table
CREATE table table name (field1 data_type, field2 data type... );
For example:
CREATE table Student (Sid NUMBER (4), Name CHAR (25));
To insert records into a table Student
INSERT into table name
VALUES (List of values);
For example: INSERT into Student
VALUES (22, 'Ram');
To display records from a table
SELECT field1, field2......
FROM tables name
For example: SELECT SId, Name form Student
1. DDL (Data Definition Language): DDL is used by the database designers and programmers to specify the content and structure of the table. It is used to define the physical characteristics of records. It includes commands that manipulate the structure of objects such as Views, Tables, and Indexes etc. Examples of these Commands are Create, Alter, Drop, Rename etc.
To create a table
CREATE table table name (field1 data type, field2 data_type. .. );
For example:
CREATE table Student (Sid NUMBER (4), Name CHAR (25);
2. DML (Data Manipulation Language): DML is related with manipulation of records such as retrieval, sorting, display and deletion of records or data. It helps user to use query and display reports of the table. So, it provides techniques for processing the database. It includes commands to manipulate the information stored in the databases. Examples of these Commands are Insert, Delete, Select and Update etc.
To insert records into a table Student
INSERT into table name
VALUES (List of values);
For example: INSERT into Student
VALUES (22, 'Ram');
POINT TO REMEMBER
DDL is used to define the physical characteristics of records, DML is method with manipulation of records such as retrieval, sorting, display and deletion of records or data.
Assignment-3
1. Differentiate between DDL and DML.
1.5 Database Model: Network Model, Hierarchical Model, Relational database Model
There are different forms of database management system, each characterized by the way where data are defined and structured, called database model. It is the organizing principles of records in secondary storage. Following are the different architectures: -
- Hierarchical Database Model
- Network Database Model
- Relational Database Model
Hierarchical Database Model
Hierarchical database model is one of the oldest type database models., In this model data is represented in the form of records. Each record has multiple fields/attributes. All records are arranged in database as tree like structure. The relationship between the records is called parent child relationship in which any child record relates to only a single parent type record. In the figure below, there are parent records at the top level with high privileges and child records at bottom Jevel. There are three types of records: students, management and department chief records.
Advantages of Hierarchical Database Model
- It is the easiest model.
- It has one or more attributes.
- The searching is fast and easy, if parent is known.
- It supports one-to-one and one-to-many relationship.
Disadvantages of Hierarchical Database Model
- It is old fashioned, outdated database model.
- It does not support many-to-many relationship.
- The dependency on parent node is not beneficial always.
- It increases redundancy because same data is to be repeated in different places.
(Image)
Assignment 4
1. Draw a figure to show example of hierarchical database model.
Network Database Model
The network database model replaced hierarchical database model due to some limitations on the model. Suppose if an employee relates to two departments, then the hierarchical database model cannot able to arrange records in proper place. So network database model was emerged to arranged non- hierarchical database. The structure of database is more like graph rather than tree structure. A network database model consists of collection of records which are inter-related to each other with the help of relationship. Each record has multiple fields and each field has only one data value. In figure below, exam records relate to management and students, and similarly employee records relate to department chief as well as management. In this model, each record in the database can have multiple parents that is the relationships among data elements can have many to many relationships.
(image)
Advantages of Network Database Model
- It accepts many-to-many relationship, so it is more flexible.
- The searching is faster because of multidirectional pointers.
- The network model is simple and easy to design.
- It reduces redundancy because data shouldn't be repeated if same data is needed.
Disadvantages of Network Database Model
- It is difficult to handle the relationship in complex programs.
- There is less security because of sharing data.
- It increases the processing overhead due to the complex relationship
Relational Database Model
In relational database model, the data is organized into tables which contain multiple rows and columns. These tables are called relations. A row in a table represents a relationship among a set of values. Since a table is a collection of such relationships, it is generally referred to the mathematical term relation, from which the relational database model derives its name. It is also known as RDBMS.
POINT TO REMEMBER
The database system which stores and displays data in tabular format of rows and columns, like spreadsheet, is known as Relational Database Management System (RDBMS).
Examples of RDBMS are Oracle, SQL, MS Access, MySQL etc.
We notice from below table (Student and Subject), here each student has a unique roll number and has marks of Math, English and Computer. Here, roll number makes relation between these two tables.
Table :Student | |
---|---|
Roll | Name |
1 | Hari |
2 | Sita |
3 | Ram |
Table :Subject | |||
---|---|---|---|
Roll | Math | English | Computer |
1 | 80 | 90 | 95 |
2 | 90 | 80 | 85 |
3 | 95 | 95 | 95 |
For example, if we make relation between Student and Subject, we can get following.
Roll | Name | Math | English | Computer |
---|---|---|---|---|
1 | Hari | 80 | 90 | 95 |
2 | Sita | 90 | 80 | 85 |
3 | Ram | 95 | 95 | 95 |
Advantages of Relational Database Model
- The breaking of complex database table into simple database table becomes possible.
- Database processing is faster than other model.
- There is very less redundancy (reputition of data).
- The integrity rules can easily be implemented.
Disadvantages of Relational Database Model
- It is more complex than other models.
- There are too many rules because of complex relationships.
- It needs more powerful computers and data storage devices.
Assignment-5
1. Compare between Network Database model and Relational database model with diagram.
1.6 Concept of Normalization: 1NF, 2NF, 3NF
Normalization is a database design process in which complex database table is broken down into simple separate tables. It makes data model more flexible and easier to maintain.
E.F. Codd has introduced some rules for normalizing the database which is known as the normal forms.
In other words, there are different steps in Data Normalization, which are called normal forms. This process minimizes and controls the duplication of data in the database and also provides a rapid search for data from the database. It reduces the complexity of database.
There are two goals of the normalization process: Eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table).
Advantages / Benefits of Normalization:
- The dependency between the data fields is identified.
- The redundancy in database is minimized.
- Data model is made more flexible and easier to maintain.
- It improves faster storing and index creation.
- It improves the performance of the database system.
- It simplifies the structures of tables.
- It avoids the loss of information.
Assignment-6
1. What is normalization? Write advantages of normalization.
Database Normalization With Examples
Database Normalization Example can be easily understood with the help of a case study. Assume, a Book library maintains a database of Books rented out. Without any normalization in database, all information is stored in one table as shown below.
Let's understand Normalization database with normalization example with solution:
Full Names | Physical Address | Books Rented | Salutation |
---|---|---|---|
Alex Patel | Kalanki, Ktm | Alchemist, 1984 |
Ms. |
Robert Greene | Chabahil, Ktm | The 48 Laws of Power, The Art of Seduction | Mr. |
Robert Greene | Dillibazar, ktm | The 33 Strategies of War | Mr. |
Books Rented column has multiple values.
Now let's move into 1st Normal Forms:
First Normal Form (1NF)
1NF (First Normal Form) Rules
- The data field should be an atomic.
- It eliminates duplicate rows and columns from the same table.
- It minimized the data redundancy in the database table.
The above table in 1NF:
Full Names | Physical Address | Books Rented | Salutation |
---|---|---|---|
Alex Patel | Kalanki, Ktm | Alchemist | Ms. |
Alex Patel | Kalanki, Ktm | 1984 | Ms. |
Robert Greene | Chabahil, Ktm | The 48 Laws of Power | Mr. |
Robert Greene | Lainchaur, Ktm | The Art of Seduction | Mr. |
Robert Greene | Dilli Bazar, Ktm | The 33 Strategies of War | Mr. |
Before we proceed let's understand a few things:
What is a key in SQL?
A key in SQL is a value that is used to identify records in a table uniquely. It can be a single column or a combination of multiple columns. Keys are used to prevent duplicate data in a table, and they also help to establish relationships between tables in a database.What is a primary key?
A primary key is a special type of key that uniquely identifies each record in a table.
- It cannot be NULL.
- It must be unique.
- It should rarely be changed.
- It must be specified when a new record is inserted.
What is a composite key?
A composite key is a primary key that is composed of multiple columns. This is used when a single column is not enough to uniquely identify a record. For example, a table of students might have a composite key of student_id and last_name. This ensures that no two students can have the same student ID and last name.Here are some examples of keys in SQL:
- A table of products might have a primary key of product_id.
- A table of customers might have a composite key of customer_id and email_address.
- A table of orders might have a composite key of order_id and customer_id.
Keys are an important part of database design. They help to ensure that data is accurate and consistent, and they also make it easier to query and manipulate data.
In our database, we have two people with the same name Robert Greene, but they live in different places.
Robert Phil | Chabahil, Ktm | The 48 Laws of Power, The Art of Seduction | Mr. |
Robert Phil | Dillibazar, Ktm | The 33 Strategies of War | Mr. |
Names are common. Hence you need name as well as address to uniquely identify a record. Hence, we require both Full Name and Address to identify a record uniquely. That is a composite key.
Second Normal Form (2NF)
2NF (Second Normal Form) Rules
- It should be in the first normal form.
- It identifies data dependencies.
- Non Key attributes are functionally depends on key attribute (Primary Key)
It is clear that we can't move forward to make our simple database in 2nd Normalization form unless we partition the table above.
Membership_ID | Books Rented |
---|---|
1 | Alchemist |
1 | 1984 |
2 | The 48 Laws of Power |
2 | The Art of Seduction |
3 | The 33 Strategies of War |
We have divided our 1NF table into two tables:
- Table 3: This table contains member information. The primary key of this table is Membership_ID. This means that each record in this table must have a unique Membership_ID value.
- Table 4: This table contains information on books rented. The foreign key of this table is Membership_ID. This means that each record in this table must have a Membership_ID value that exists in the Membership_ID column of Table 3
Foreign keys in DBMS
A foreign key is a column or set of columns in one table that references the primary key of another table. It is used to establish a relationship between two tables in a database.Foreign keys are important for several reasons:
- They help to ensure the accuracy of data. When a foreign key value is inserted into a table, the database checks to make sure that the value exists in the primary key column of the referenced table. This prevents users from accidentally inserting invalid data into the table.
- They help to maintain referential integrity. Referential integrity is a database constraint that ensures that the data in one table is consistent with the data in another table. When a foreign key value is deleted from the referenced table, the database will automatically delete the corresponding record from the referencing table. This prevents users from accidentally deleting data that is still referenced by another table.
- They help to improve performance. Foreign keys can be used to optimize queries that join two tables. For example, if a query needs to find all records in Table 4 where the Membership_ID value is 101, the database can use the foreign key to quickly find the corresponding record in Table 3. This can significantly improve the performance of the query.
Why do you need a foreign key?
Foreign keys are important for maintaining the accuracy and consistency of data in a database. They also help to improve the performance of queries that join two tables.
In the example above, If we try to insert a record into Table 4 with a Membership_ID value of 5. However, there is no record in Table 3 with a Membership_ID value of 5. This would violate referential integrity, because the data in Table 4 would no longer be consistent with the data in Table 3.
If the foreign key constraint is enabled, the database will not allow the novice to insert the record into Table 4. This will help to prevent them from accidentally inserting invalid data into the database.
Membership_ID | Book Rented |
---|---|
5 | Computer Science Basics |
But Membership ID 5 is not present in table 3.
Membership_ID | Full Names | Physical Address | Salutation |
---|---|---|---|
1 | Alex Patel | Kalanki, Nepal | Ms. |
2 | Robert Greene | Chabahil, Nepal | Mr. |
3 | Robert Greene | Dillibazar, Nepal | Mr. |
Referential integrity is a database constraint that ensures that the data in one table is consistent with the data in another table. This is done by establishing a relationship between the two tables using a foreign key.
In the example you provided, the Membership_ID column in Table 4 is a foreign key that references the Membership_ID column in Table 3. This means that the Membership_ID value in each record in Table 4 must also exist in a record in Table 3.
If someone tries to insert a record into Table 4 with a Membership_ID value that does not exist in Table 3, the database will throw an error. This is because the foreign key constraint would be violated.
The foreign key constraint helps to ensure that the data in Table 4 is always consistent with the data in Table 3. This is important for maintaining the accuracy and integrity of the data in the database.
Transitive functional dependencies are a type of functional dependency that occurs when a non-key column depends on another non-key column through an intermediate key column.
In other words, a transitive functional dependency is when changing a non-key column, might cause any of the other non-key columns to change.
For example, let's say we have a table of students with the following columns:Student_ID (primary key)
Name
Course_1
Course_2
The Course_1 and Course_2 columns are non-key columns. However, they are functionally dependent on the Student_ID column. This means that if we change the Student_ID value for a record, we will also need to change the Course_1 and Course_2 values for that record.
Transitive functional dependencies can be a problem in databases, because they can lead to data anomalies. For example, if we delete a record with a Student_ID value of 100, we will also need to delete the records for all of the courses that the student was enrolled in.
To avoid data anomalies, it is important to identify and eliminate transitive functional dependencies in databases. This can be done by normalizing the database tables.
Consider the Table 3, Changing the non-key column Full Name may change Salutation.
Membership_ID | Full Names | Physical Address | Salutation |
---|---|---|---|
1 | Alex Patel | Kalanki, Nepal | Ms. |
2 | Robert Greene | Chabahil, Nepal | Mr. |
3 | Robert Greene Sharma | Dillibazar, Nepal | Mr. |
Change in Name may change salutation.
Third Normal Form(3NF)
3NF (Third Normal Form) Rules
- It should be in the second normal form.
- It removes transitive dependencies in a table.
To move our 2NF table into 3NF, we again need to divide our table.
Below is a 3NF example in SQL database:
Membership_ID | Full Names | Physical Address | Salutation_ID |
---|---|---|---|
1 | Alex Patel | Kalanki, Ktm | 2 |
2 | Robert Greene | Chabahil, Ktm | 1 |
3 | Robert Greene | Dillibazar, Ktm | 1 |
Membership_ID | Movie Rented |
---|---|
1 | Alchemist |
1 | 1984 |
2 | The 48 Laws of Power |
2 | The Art of Seduction |
3 | The 33 Strategies of War |
Salutation_ID | Salutation |
---|---|
1 | Mr. |
2 | Ms. |
3 | Mrs. |
4 | Dr. |
- We have again divided our tables and created a new table which stores Salutations.
- There are no transitive functional dependencies, and hence our table is in 3NF.
- In Table 7 Salutation ID is primary key, and in Table 5 Salutation ID is foreign key to primary key in Table 7.
Chapter 2: Data Communication and Networking
Chapter 5: Object-Oriented Programming (OOP)
Chapter 6: Software Process Model (SPM)
Chapter 7: Recent Trends in Technology