SlideShare a Scribd company logo
DATABASE MANAGEMENT
SYSTEMS
COURSE CODE : 24CA1202
UNIT- 1
• INTRODUCTION, DATABASE DESIGN AND RELATIONAL MODEL
• Introduction
• Database System Applications,
• Purpose of Database Systems,
• View of Data
• Database Languages,
• Database and Application Architecture
• Database Users and Administrators
• Database Schema
• Keys
• Schema Diagrams
• ER diagrams
• Mapping Cardinalities
• Alternatives Notations for Modeling Data – Data Flow Diagram.
UNIT I INTRODUCTION
Database & Database Users. Characteristics of the Database
Approach advantages of usine DBMS. Data Models, Schemas
& Instances. DBMS Architecture & Data Independence.
System Architecture for DBMS and Data Dictionary,
Database Users Data Base languages & Interfaces. Data
Modeling using the Entity-Relationship Model -Entity types,
Entity Sets, Attributes and Keys, Relationship, Relationship
Types, Week Entity Types, Structural Constraints, Enhanced
ER Model- Specialization Generalization, Constraints on
Specialization Generalization- Data flow diagram.
DATA
• Data refers to raw facts
• Data may be numerical data which may be integers or floating
point numbers, and non-numerical data such as characters,
date and etc.,
Database Management system : UNit I Helping to understand basics of dbms operation
Prepared by : M.Nirmala / AP / MCA
Example
98
89
87
92
phy
chem
maths
biology
The above numbers may be anything:
It may be distance in kms or amount
in rupees or no of days or marks in
each subject etc.,
Prepared by : M.Nirmala / AP / MCA
Information
• Processed Data is referred as Information. It delivers some
meanings to the user
98
89
87
92
phy
chem
maths
biology
Prepared by : M.Nirmala / AP / MCA
Database
• Database is an organized, collection of related
information
• Eg. telephone directory, dictionary, and college
database.
98
89
87
92
phy
chem
maths
biology
76
87
79
88
phy
chem
maths
biology
91
67
87
77
phy
chem
maths
biology
86
80
79
88
phy
chem
maths
biology
Prepared by : M.Nirmala / AP / MCA
Database Management Systems
• A database management system or DBMS is software designed
to store, modify, maintaining and utilizing large collection of data.
• A database-management system (DBMS) is a collection of interrelated
data and a set of programs to access those data
• Database Applications:
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
• Databases touch all aspects of our lives
Why do u need a DBMS?
• A Database Management System Is an Extension of Human Logic
• Computers Can Quickly Answer Lots of Questions
• Some Questions Can Be Really Complicated
• We Are Easily Overwhelmed With Information
• Automation Is the Key to Efficiency
• A DBMS Is Better Than Manual Processes in so Many Ways
DATABASE APPLICATIONS
ENTERPRISE INFORMATION
• Sales: For customer, product, and purchase information.
• Accounting: For payments, receipts, account balances, assets and other
accounting information.
• Human resources: For information about employees, salaries, payroll taxes,
and benefits, and for generation of paychecks.
• Manufacturing: For management of the supply chain and for tracking
production of items in factories, inventories of items inwarehouses and
stores, and orders for items.
• Online retailers: For sales data noted above plus online order tracking,
generation of recommendation lists, and maintenance of online product
evaluations
Banking and Finance
• ◦ Banking: For customer information, accounts, loans, and banking
transactions.
• ◦ Credit card transactions: For purchases on credit cards and generation of
• monthly statements.
• ◦ Finance: For storing information about holdings, sales, and purchases of
• financial instruments such as stocks and bonds; also for storing real-time
• market data to enable online trading by customers and automated trading
• by the firm.
Universities
• For student information, course registrations, and grades (in addition
to standard enterprise information such as human resources and
accounting).
• Airlines: For reservations and schedule information. Airlines were
among the first to use databases in a geographically distributed
manner.
Telecommunication: For keeping records of calls made, generating
monthly bills, maintaining balances on prepaid calling cards, and storing
information about the communication networks.
Prepared by : M.Nirmala / AP / MCA
Database Systems Vs File Systems
• FILE BASED DATA MANAGEMENT
• Before advent of database , file system—manual file system
was used
• 1) Collection of file, folders each tagged and kept in a filing
cabinet
• 2) When data was relatively small—few reporting requirements
—manual system was ok
• 3) Requirements grew—manual system complex (files, folders
collection growing—time consuming & cumbersome
Prepared by : M.Nirmala / AP / MCA
• The conventional file system was much faster and efficient
than manual counterparts but not flexibly than modern DBMS.
• DISADVANTAGE OF FILE BASE SYSTEM
• Need for programming and skilled personnel
• Need different application programs
• to extract records
• to add records etc
• For Each query an application program has to be written
• For different reports to be generate, different programs to be
written
• Lack of data security
• Password security alone is maintained and it is not sufficient
when different users have access permissions to different
subsets of data
• To safeguard confidentiality was difficult
• To lock part of files or part of the system not possible
Prepared by : M.Nirmala / AP / MCA
• Information compartmentalization
• Promote data ownership and thus promoting the
storage of same data in different location
• Professional says as information islands or
information compartmentalization, data at different
locations are to be consistently updated, islands
of information often contains different versions of
data
• Structural Dependency
• Change in file structure/addition/deletion of a field
requires modification of all programs using that
files
• It exhibits structural dependency
• Changing from integer to decimal will make
change in program
Prepared by : M.Nirmala / AP / MCA
• Data Dependency
• File data characteristic change
• Normal representation is ASCII
• Specify opening of a specific file type either random or
sequential
• How to store and how to retrieve is to be clearly defined
• Data Redundancy
• Employee details stored in personnel dept ,account
department, production planning department etc
• Data Inconsistency
• Data stored at different location / departments and
modification / deletion / addition can create data
inconsistencies
• Eg:-employee leaves organization should be properly
updated in all departments
Prepared by : M.Nirmala / AP / MCA
Concurrent access by multiple users
• Concurrent accessed needed for performance
• Uncontrolled concurrent accesses can lead to
inconsistencies
• Example: Two people reading a balance
and updating it at the same time
Security problems
• Hard to provide user access to some, but not
all, data
Prepared by : M.Nirmala / AP / MCA
ADVANTAGE OF DBMS
• Data independence
• Application programs should be independent as possible from
details of data representation & storage.
• Efficient data access
• Data can be effectively retrieved when data is stored on external
storage device
• Data integrity & Security
• DBMS can enforce integrity constraints on data
• DBMS can check, before inserting salary for an employee the dept
budget not exceeded.
• Security restrictions can be applied
• Access controls is enforced
• What data is visible to different classes of users
• Data administration
• When several users share data centralizing the administration of
data can offer significant improvements
Prepared by : M.Nirmala / AP / MCA
• Concurrent access and crash recovery
• Concurrent access to data, in such a manner that users can
think of the data being accessed by only one user at a time
• DBMS protects users from effects of system failures
• Reduced application development time
• It supports many important functions that are common to
much application accessing data stored in the DBMS
• Describing and storing data in a DBMS
• A data model is a collection of high level data description
that hide many low level storage details.
• Conflicting requirements can be balanced
• Knowing the overall requirements helps database designers
in creating a database design that is best for the
organization
View of Data / Database Architecture
Prepared by : M.Nirmala / AP / MCA
DBMS Architecture
• The DBMS architecture describes how data in the database is
viewed by the users.
• Database abstraction also hides the implementation details of the
data from the user
• The major purpose of a database system is to provide users with an
abstract view of the data i.e.
• The system hides certain details of how the data are stored and
maintained.
Example to Understand Abstraction
• User Level
• A layman is walking on the street in front of your college.
• When he looks at your college what all does he knows about the
college!
• He knows the name of the college then he might know that it is a
Science or Arts college, it is run by so a so society etc.
Logical Level
• You are a student of the college
• You know more about the person on the street
• You know the number of courses held in the college, most of the
teachers in the college, the course fees of the course which you are
learning, the different buildings of the college etc.
Physical Level
• Now, consider the administrative staff of the college.
• These people know even more information about the college than a
student of the college
• like how courses are conducted,
• how teachers are appointed,
• what is the fee structure of the courses
• what new courses are to be started etc.
• Now, these 3 people are looking at the same thing (i.e. the college)
from 3 different levels of abstraction.
Prepared by : M.Nirmala / AP / MCA
3 Levels of Abstraction
• The architecture is defined at 3 levels. This architecture is
proposed by ANSI/SPARC (American National Standards
Institute/Standards Planning and Requirements Committee) and
hence, is also known as ANSI/SPARC architecture
• Physical level / Internal Level
• Logical level / Conceptual
• View or external level
Prepared by : M.Nirmala / AP / MCA
DBMS Architecture Diagram
Physical level
• Lowest level of abstraction.
• It describes how the data are actually stored.
• It describes data structures in detail.
• At this level achieve various aspects are considered for optimal
runtime performance and storage space utilization.
• These aspects include storage space allocation techniques for
data and indexes, access paths such as indexes, data
compression and encryption techniques, and record placement
Logical level
• Next level of abstraction.
• It describes what data are stored in database and what
relationship exists between those data.
• Defines all database entities, their attributes, and their
relationships
• Security and integrity information
Prepared by : M.Nirmala / AP / MCA
View or external level
• It describes only some part of entire database.
• It access only database and not concerned about
physical storage location and data structures.
Prepared by : M.Nirmala / AP / MCA
• Student information it contains
• studentid
• stuname
• stuage
• stusex
• stugrade
• Physical level:
• Student information are stored as block of consecutive storage
locations. 79 memory locations are needed.
• Logical level:
• Char stuid [25], stuname [50], stusex, and stugrade;
• int age;
• External level:
• It uses set of application program to retrive data from database
• Select * from student where stuid=’1001’;
Prepared by : M.Nirmala / AP / MCA
Instances and Schemas
• Similar to types and variables in programming languages
• Schema –Overall design of a database is called as schema
• Example: The database consists of information about a set of
customers and accounts and the relationship between them)
• Analogous to type information of a variable in a program
• Physical schema: database design at the physical level
• Logical schema: database design at the logical level
• Instance
• Collection of information stored in the database at a particular
moment is called an instance of a database.
• Analogous to the value of a variable
• Physical Data Independence – the ability to modify the physical
schema without changing the logical schema
• Applications depend on the logical schema
• In general, the interfaces between the various levels and
components should be well defined so that changes in some
parts do not seriously influence others
Prepared by : M.Nirmala / AP / MCA
• Eg of instance & schema:
• In a program, various variable declarations are
referred as schema.
• Every variable has a particular value at a given
instance of database schema
• Database schemas
• 3 types of schemas:
• 1. Conceptual schema.
• 2. Physical schema/ internal.
• 3. Logical schema/external.
Prepared by : M.Nirmala / AP / MCA
Database Schemas
External schema 1 External schema 2 External schema 3
Conceptual schema
Physical schema
disk
Prepared by : M.Nirmala / AP / MCA
Instances and Schemas
• DB Schema Diagram for a Company:
Employee:
Eno Ename Salary Address
Prepared by : M.Nirmala / AP / MCA
Instances and Schemas
• DB Schema Diagram for a Company:
Department:
Dno Dname Dlocation
Project:
Pno Pname Hours
Prepared by : M.Nirmala / AP / MCA
Instances and Schemas
• Instance Example:
Eno Ename Salary Address
1
2
3
A
B
C
10,000
20,000
30,000
First street
Second street
Third street
Prepared by : M.Nirmala / AP / MCA
CONCEPTUAL SCHEMA
• It describes all relations that are stored in the
database
• Students(sid :strings,name :strings,age:integer)
• Faculty(fid :string,fname:string,sal:real)
• Courses(cid:string,cname:string
• Rooms(rno:integer,address:string,capacity:integer)
• Enrolled(sid:string,cid:string,grade:string)
• Teachers(fid:string,cid:string)
• Meets-in(cid:string,rno:integer,time:string)
Prepared by : M.Nirmala / AP / MCA
Physical Schema
• It specifies additional storage details.
• The relations described in conceptual schema are actually
stored in a secondary storage devices such as disks, tapes etc.
• What file org is used to store the relations
Prepared by : M.Nirmala / AP / MCA
External Schema
• It allows data access to be customized at the level of individual
users.
• Only one conceptual schema and one physical schema but may
have several external schemas each tailored to particular group
of users.
• A view is conceptually a relation but the records in a view are
not stored in DBMS.
• Suited for end user requirement.
Prepared by : M.Nirmala / AP / MCA
Mapping
• Whenever a user specifies a request to generate a new external
view, the DBMS must transform the request specified at
external level into a request at conceptual level, and then into a
request at physical level.
• This process of transforming the requests and results between
various levels of DBMS architecture is known as mapping.
Prepared by : M.Nirmala / AP / MCA
Data Independence
• The main advantage of three-schema architecture is that it
provides data independence.
• Data independence is the ability to change the schema at one
level of the database system without having to change the
schema at the other levels.
• Data independence is of two types, namely, logical data
independence and physical data independence.
Prepared by : M.Nirmala / AP / MCA
Three Levels of online Book Database
Prepared by : M.Nirmala / AP / MCA
Logical Data Independence
• It is the ability to change the conceptual schema without affecting the
external schemas or application programs.
• The conceptual schema may be changed due to change in
constraints or addition of new data item or removal of existing data
item, etc., from the database.
• The separation of the external level from the conceptual level enables
the users to make changes at the conceptual level without affecting
the external level or the application programs.
• For example, if a new data item, say Edition is added to the BOOK
file, the two views (view 1 and view 2 shown in the above Figure) are
not affected.
Prepared by : M.Nirmala / AP / MCA
Physical Data Independence
• Physical data independence: It is the ability to change the
internal schema without affecting the conceptual or external
schema.
• An internal schema may be changed due to several reasons
such as for creating additional access structure, changing the
storage structure, etc.
• The separation of internal schema from the conceptual schema
facilitates physical data independence.
Prepared by : M.Nirmala / AP / MCA
• Logical data independence is more difficult to achieve than the
physical data independence because the application programs
are always dependent on the logical structure of the database.
• Therefore, the change in the logical structure of the database
may require change in the application programs.
Prepared by : M.Nirmala / AP / MCA
Data Models
• A data model is an organizing principle that specifies particular
mechanisms for data storage and retrieval
• The primary difference between the various database models
lie in the method of expressing relationships and constraints
among data elements
RELATIONAL MODEL
• Relational Databases store data in relations i.e. tables.
• Each relation must have a name.
• Relation can be otherwise called as object or Table
Tuple/Row/Record
• A single entry in a table is called a Tuple or Record or Row.
• A tuple in a table represents a set of related data. For example, the
above Employee table has 4 tuples/records/rows.
• A record is a set of related data in a table.
ATTRIBUTES
• A table consists of several records(row), each record can be broken down into
several smaller parts of data known as Attributes
• An attribute is a named column of a relation. It stores a specific information
about an object / Table e.g. salary.
• Name the Attributes
• The above Employee table consist of five attributes, ID, ENAME, SALARY,
BONUS, DEPT
Attribute Domain : Attribute is defined to hold only a certain type of values, which
is known as Attribute Domain.
The attribute Name will hold the name of employee for every tuple. If we save
employee's address there, it will be violation of the Relational database model.
ATTRIBUTES
CARDINALITY OF THE RELATION
DEGREE OF THE RELATION
NULL ATTRIBUTE
• The attribute value that is currently unknown is called Null Attribute
Attribute Domain
• Attribute is defined to hold only a certain type of values, which is
known as Attribute Domain.
Database Management system : UNit I Helping to understand basics of dbms operation
Data Integrity and Constraints / Keys
• Integrity constraints are a set of rules. Integrity constraints ensure that
the data insertion, updating, and other processes have to be performed
in such a way that data integrity is not affected.
• Thus, integrity constraint is used to guard against accidental damage to
the database.
• Data integrity refers to maintaining and assuring the accuracy and
consistency of data over its entire life-cycle.
• Database Systems ensure data integrity through constraints which are
used to restrict data that can be entered or modified in the database.
• Database Systems offer three types of integrity constraints:
Types of Integrity
Integrity
Types Definition
Enforced
Through
Entity
Integrity
Each table must have a column or a set of columns through
which we can uniquely identify a row. These column(s)
cannot have empty (null) values.
PRIMARY
KEY
Domain
Integrity
All attributes in a table must have a defined domain i.e. a
finite set of values which have to be used. When we assign a
data type to a column we limit the values that it can contain.
In addition we can also have value restriction as per business
rules e.g. Gender must be M or F.
DATA TYPES,
CHECK
CONSTRAINT
Referentia
l Integrity
Every value of a column in a table must exist as a value of
another column in a different (or the same) table.
FOREIGN KEY
Keys
• Super Key
• Candidate Key
• Primary Key
Super Key
• A super key is a set of one or more attributes (columns), which can
uniquely identify a row in a table.
• Often DBMS beginners get confused between super key
and candidate key
• Let’s take an example to understand this:
• Table: Employee
Emp_SSN Emp_Number Emp_Name
123456789 226227 Steve
999999321 227 Ajeet
999999322 228 Chaitanya
777778888 229 Robert
• Super keys: All of the following sets of super key are able to uniquely
identify a row of the employee table.
• {Emp_SSN}
• {Emp_Number}
• {Emp_SSN, Emp_Number}
• {Emp_SSN, Emp_Name}
• {Emp_SSN, Emp_Number, Emp_Name}
• {Emp_Number, Emp_Name}
• 1. {Emp_Id} – No redundant attribute
• 2. {Emp_Number} – No redundant attributes
• 3. {Emp_Id, Emp_Number} – No Redundant attribute. Either of those
attributes can be a minimal super key as both of these columns have
unique values.
• 4. {Emp_Id, Emp_Name} – Redundant attribute Emp_Name
• 5. {Emp_Id, Emp_Number, Emp_Name} – Redundant attributes. Emp_Id
or Emp_Number alone are sufficient enough to uniquely identify a row of
Employee table
• 6. {Emp_Number, Emp_Name} – Redundant attribute Emp_Name.
• A super key with no redundant attribute is known as candidate key.
• Candidate keys are selected from the set of super keys, the only thing
we take care while selecting candidate key is that the candidate key
should not have any redundant attributes.
• That’s the reason they are also termed as minimal super key
• A Candidate Key is a minimal set of columns/attributes that can be
used to uniquely identify a single tuple in a relation.
Lets select the candidate keys from the above
set of super keys.
• The candidate keys we have selected are:
{Emp_Id}
{Emp_Number}
Primary key
Mandatory Desired
must uniquely identify a tuple should not change with time
must not allow NULL values should have short size e.g. numeric data types
Primary key is the candidate key that is selected to uniquely identify a
tuple in a relation.
A primary key is a minimal set of attributes (columns) in a table that
uniquely identifies tuples (rows) in that table.
A primary key is selected from the set of candidate keys. That means we
can either have Emp_Id or Emp_Number as primary key. The decision is
made by DBA (Database administrator)
Primary Key Example in DBMS
• Lets take an example to understand the concept of primary key. In the
following table, there are three attributes: Stu_ID, Stu_Name &
Stu_Age. Out of these three attributes, one attribute or a set of more
than one attributes can be a primary key.
• Attribute Stu_Name alone cannot be a primary key as more than one
students can have same name.
• Attribute Stu_Age alone cannot be a primary key as more than one
students can have same age.
• Attribute Stu_Id alone is a primary key as each student has a unique id
that can identify the student record in the table.
• We denote usually denote it by underlining the attribute name (column name).
• The value of primary key should be unique for each row of the table. The
column(s) that makes the key cannot contain duplicate values.
• The attribute(s) that is marked as primary key is not allowed to have null values.
• Primary keys are not necessarily to be a single attribute (column). It can be a set
of more than one attributes (columns). For example {Stu_Id, Stu_Name}
collectively can identify the tuple in the above table, but we do not choose it as
primary key because Stu_Id alone is enough to uniquely identifies rows in a
table and we always go for minimal set. Having that said, we should choose
more than one columns as primary key only when there is no single column
that can uniquely identify the tuple in table.
Key Points Primary key
Composite Primary Key – More than one Key
• Consider this table ORDER, this table keeps the daily record of the
purchases made by the customer.
• This table has three attributes: Customer_ID, Product_ID &
Order_Quantity.
• Customer_ID alone cannot be a primary key as a single customer can
place more than one order thus more than one rows of same
Customer_ID value. As we see in the following example that customer
id 1011 has placed two orders with product if 9023 and 9111.
• Product_ID alone cannot be a primary key as more than one
customers can place a order for the same product thus more than one
rows with same product id. In the following table, customer id 1011 &
1122 placed an order for the same product (product id 9023).
• Order_Quantity alone cannot be a primary key as more more than
one customers can place the order for the same quantity.
• Since none of the attributes alone were able to become a primary key,
lets try to make a set of attributes that plays the role of it.
• {Customer_ID, Product_ID} together can identify the rows uniquely in
the table so this set is the primary key for this table.
Order Example
Order
Customer_ID Product_ID Order_Quantity
1011 9023 10
1122 9023 15
1099 9031 20
1177 9031 18
1011 9111 50
Consider the Following Table
StudentID StudentName Department DOB
1 Raja Computer Science & Engg 10-12-1990
2 Geetha Computer Science & Engg 12-12-1990
3 Dev Information Technology 5-7-1990
4 Preetha Information Technology 12-5-1998
5 Hari Computer Science & Engg 10-3-1998
6 Vani Computer Science & Engg 10-12-1990
7 Krishna Electrical&Electronics Engg 12-12-1990
8 Ajith Electrical&Electronics Engg 5-7-1990
9 Ram Information Technology 12-5-1998
10 Kumar Computer Science & Engg 10-3-1998
11 Sandhya Automobile Engg 10-12-1990
12 Uma Automobile Engg 12-12-1990
• Deptname Gets Repeated more than once as more than one student
belong to the same dept.
• Also the size of the Department name is also big
• Instead of the Department name with an Deptid can be represented
as a separate table and the Deptid which is a primary key in the Dept
table will be a Foreign key in the Student Table
Dept Table
DeptID Department
1 Computer Science & Engg
2 Information Technology
3 Electrical & Electronics Engg
4 Automobile Engg
5 Mechanical Engg
StudentID StudentName DeptID DOB
1 Raja 1 10-12-1990
2 Geetha 1 12-12-1990
3 Dev 2 5-7-1990
4 Preetha 2 12-5-1998
5 Hari 1 10-3-1998
6 Vani 1 10-12-1990
7 Krishna 3 12-12-1990
8 Ajith 3 5-7-1990
9 Ram 2 12-5-1998
10 Kumar 1 10-3-1998
11 Sandhya 4 10-12-1990
12 Uma 4 12-12-1990
Dept
ID
Department
1 Computer Science & Engg
2 Information Technology
3 Electrical & Electronics Engg
4 Automobile Engg
5 Mechanical Engg
DeptID is Primary Key in the
Dept Table is acting as a Foreign
key in the Student Table
Student
Department
• Foreign keys are the columns of a table that points to the primary
key of another table. They act as a cross-reference between tables.
• Note: Practically, the foreign key has nothing to do with the primary
key tag of another table, if it points to a unique column (not
necessarily a primary key) of another table then too, it would be a
foreign key. So, a correct definition of foreign key would be: Foreign
keys are the columns of a table that points to the candidate key of
another table.
Properties of Foreign Key
• It states that if a foreign key exists in a relation then either the foreign key value must
match a primary key value of some tuple in its home relation or the foreign key value
must be null.
• The rules are:
• You can't delete a record from a primary table if matching records exist in a related
table.
• You can't change a primary key value in the primary table if that record has related
records.
• You can't enter a value in the foreign key field of the related table that doesn't exist in
the primary key of the primary table.
• However, you can enter a Null value in the foreign key, specifying that the records are
unrelated.
ENTITY RELATIONSHIP DIAGRAM
• Jack is part of database team and he needs to present the database
design to business users. The business users are non-technical and it's
difficult for them to read a verbose design document.
• Jack needs to use an Entity Relation (ER) Model.
• ER model is a graphical representation of entities and their
relationships which helps in understanding data independent of the
actual database implementation.
Prepared by : M.Nirmala / AP / MCA
• The ER model consists of basic objects called entities and a
relationship among these objects.
• Proposed by Peter Pin-Shan Chen (1976).
• Three basic notations in ER model are
• Entity sets
• Relationship sets
• Attributes
Prepared by : M.Nirmala / AP / MCA
Entity
• Objects in the miniworld about which information has to be
stored. E.g. persons, books.
• It must be possible to distinguish among entities
• An entity may be
• Concrete -- person /Book.
• Abstract -- loan/holiday/Concept
Prepared by : M.Nirmala / AP / MCA
Entity set
• An entity set is set of entities of the same type that share the same
properties/ Attributes.
• Eg: set of all persons who are customers at a given bank can be defined
as entity set customer.
• Eg: set of all loans awarded by a bank is referred as entity set loan.
• Note: entity sets do not need to be disjoint.
• It is possible to define the entity set of all employees of a bank(employee)
and the entity set of all customers of the bank (customer).
• A person entity may be an employee entity, a customer entity, both or
neither
Prepared by : M.Nirmala / AP / MCA
Entity set loan and Customer
LOAN NO AMOUNT
L17 1000
L23 3000
L15 1000
L17 1500
NAME SOCIAL
SECURITYNO
STREET
Jones 321-12-3123 Main
Smith 019-28-5611 North
Hayes 912-51-1239 north
Jack 335-57-1298 dupont
Prepared by : M.Nirmala / AP / MCA
Attributes
• An entity is represented by a set of attributes.
• Attributes are descriptive properties possessed by each
member of entity set
• Possible attributes of the customer entity set are
• customer-name
• social security no
• street
• Attributes of loan entity
• loan no
• amount
Prepared by : M.Nirmala / AP / MCA
Domain
• For each attributes there is a set of permitted values called domain or
value set
• Domain of attribute customer name
• set of all text strings of certain length
• Domain of attribute loanno
• all positive integer in the form “L-n” eg L1, L2
• Attributes of an entity set is a function that maps from the entity set in to a
domain
• Mapping between entity and domain
• Entity is described by (attributes,datavalue)pairs {(name,jones), (social
security,312-12-5123)…. ((street,main)}
Prepared by : M.Nirmala / AP / MCA
Attributes Types
• Simple and composite attributes
• Single valued and multi valued attributes
• Null attributes
• Derived attributes
Simple attributes
• The attributes simple in its structure, which
cannot be further sub divided are referred as
simple attributes. Eg: sex,etc.
Prepared by : M.Nirmala / AP / MCA
Hierarchy of Composite Attributes
Composite attributes:
The attribute may be composed of several components. It
can be further sub divided Address(Apt#, House#, Street,
City, State, ZipCode, Country)
Name(FirstName, MiddleName, LastName).
Prepared by : M.Nirmala / AP / MCA
Single values attributes
• Attributes have single value for particular entity
eg: loan number attribute has a single loan
number
Multivalued attributes
• Attributes has a set of values for a specific entity
consider an employee entity set dependent-name
an employee can have zero, one or more
dependents
• Eg : In bio data-references-names
• Educational qualifications sslc,hsc,ug,pg
Prepared by : M.Nirmala / AP / MCA
Null attributes
• A null value is used when an entity does not have a value for an
attributes
• Eg: One may have no middle name.
Derived attribute
• The value for this type of attribute can be derived from the values of
other related attributes/entities
• Eg 1: entity : customer
• Attributes : loans-held which represents how many no of loans a
customer had from the bank. To derive the value this attribute, we
can count the number of loans entities associated with that
customer
• Eg 2 entity : employee
• Attributes: start-date; employment length
• Employment length – total length of time the employee has worked.
Value of employment –length can be derived from the value of start-
date & current-date. Start-date is referred as base attributes
Prepared by : M.Nirmala / AP / MCA
Relationship sets
• Relationship is an association among several entities.
• Relationship set is the relationship of the same type.
• E.g. “X teaches course Y”.
Let us understand some key terms used in ER Modelling
Term Definition Examples
Entity Real world objects which have an independent
existence and about which we intend to collect data.
Employee,
Computer
Attribute A property that describes an entity. Name, Salary
Different Forms
ID ENAME SALARY DEPT
EMPLOYEE
Customer Account
cusname
SSN
Cus-city
Cus-street
A/cno
balance
Deposit
or
More than one relationship between entities
• There can be more than one relationship between entities, e.g. an
Employee works in a Department while the head of department (also
an employee) manages a Department.
Prepared by : M.Nirmala / AP / MCA
Recursive Relationship
• If the same entity participates more than once in a relationship it
is known as a recursive relationship. In the below example an
employee can be a supervisor and be supervised, so there is a
recursive relationship.
• Eg: A women can take up the role of a wife, mother, daughter-
in-law etc.
Prepared by : M.Nirmala / AP / MCA
specifying that each employee has zero or one bosses (assuming the
CEO doesn‘t have any boss), and that each employee may supervise
zero or more other employees would look like
Prepared by : M.Nirmala / AP / MCA
Degree of Relationships
• The number of roles in the relationship
• Binary – links two entity sets; set of ordered pairs (most common)
• Ternary – links three entity sets; ordered triples (rare). If a relationship
exists among the three entities, all three must be present
• Ternary relationship set involving the entity sets CUSTOMER, LOAN,
BRANCH.. customer jones, loan – L-17 and the perryridge branch.
• N-ary – links n entity sets; ordered n-tuples (very rare). If a
relationship exists among the entities, then all must be present.
Cannot represesnt subsets.
Prepared by : M.Nirmala / AP / MCA
E-R Diagram with a Ternary Relationship
Prepared by : M.Nirmala / AP / MCA
One - one
• An entity in A is associated with at most one entity in B, and an
entity in B is associated with at most one entity in A.
• Borrower in a particular bank
• A loan can belong to only customer.
• 1 Department one HOD
Prepared by : M.Nirmala / AP / MCA
1-1
Prepared by : M.Nirmala / AP / MCA
Cardinality Constraints
• We express cardinality constraints by drawing either a
directed line (), signifying “one,” or an undirected line
(—), signifying “many,” between the relationship set and
the entity set.
• E.g.: One-to-one relationship:
• A customer is associated with at most one loan via the
relationship borrower
• A loan is associated with at most one customer via borrower
Prepared by : M.Nirmala / AP / MCA
1-Many
• An entity is A is associated with any number of entities in B. an entity in
B, however can be associated with almost one entity in A.
a1
a2
a3
b1
b2
b3
b4
b5
Depositor
1.customer may have
several accounts.
Prepared by : M.Nirmala / AP / MCA
One-To-Many Relationship
• In the one-to-many relationship a loan is
associated with at most one customer via
borrower, a customer is associated with
several (including 0) loans via borrower
Prepared by : M.Nirmala / AP / MCA
Many - 1
• An entity in A is associated with atmost one entity
in B. AN entity in B, however can be associated
with any number of entities in A.
a1
a2
a4
a3
b1
b2
a5
b3
Prepared by : M.Nirmala / AP / MCA
Many-To-One Relationships
• In a many-to-one relationship a loan is
associated with several (including 0)
customers via borrower, a customer is
associated with at most one loan via borrower
Prepared by : M.Nirmala / AP / MCA
Many - Many
• An entity in A is associated with any number of entities in B, and
an entity in B is associated with any no of entities in A.
a1
a2
a4
a3
b1
b2
b4
b3
Loan can belong to
several customers can
loans can be taken jointly
by several business
partners
Prepared by : M.Nirmala / AP / MCA
Many-To-Many Relationship
• A customer is associated with several (possibly
0) loans via borrower
• A loan is associated with several (possibly 0)
customers via borrower
Prepared by : M.Nirmala / AP / MCA
Prepared by : M.Nirmala / AP / MCA
Prepared by : M.Nirmala / AP / MCA
Prepared by : M.Nirmala / AP / MCA
Prepared by : M.Nirmala / AP / MCA
Prepared by : M.Nirmala / AP / MCA
E-R Diagram With Composite, Multivalued, and Derived
Attributes
Composite Attribute – name, address , street
Multivalued Attribute = phone number depicted by double ellipse
Derived attribute = age depicted by dashed ellipse
How to Create an Entity Relationship Diagram
(ERD)
Scenario
• In a university, a Student enrolls in Courses. A student must be
assigned to at least one or more Courses. Each course is taught by a
single Professor. To maintain instruction quality, a Professor can
deliver only one course
Step 1) Entity Identification
• Student
• Course
• Professor
Step 2) Relationship Identification
• Identify the association between the entities
• The student is assigned a course
• Professor delivers a course
Step 3) Cardinality Identification
• A student can be assigned multiple courses
• A Professor can deliver only one course
Step 4) Identify Attributes
Entity Primary Key Attribute
Student Student_ID StudentName
Professor Employee_ID ProfessorName
Course Course_ID CourseName
Database Management system : UNit I Helping to understand basics of dbms operation
• ER DIAGRAM
•“Patients are treated in a single ward by the
doctors assigned to them.
•Usually each patient will be assigned a single
doctor, but in rare cases they will have two.
Healthcare assistants also attend to the
patients, a number of these are associated with
each ward.
Step 1) Entity Identification
• Patients
• Doctor
• healthcareAssistants
• ward
Step 2) Relationship Identification
• Identify the association between the entities
• Patients are treated in ward
• Patient are assigned to doctor
• Healthcare assistants attends to patients
Step 3) Cardinality Identification
• Patients are treated in single ward (M:1)
• Patient are assigned to doctor(1:M)
• Healthcare assistants attends to patients (M:M)
Database Management system : UNit I Helping to understand basics of dbms operation
DFD Examples
Why Data Flow Diagram
• A picture is worth a thousand words.
• A Data Flow Diagram (DFD) is a traditional way to visualize the
information flows within a system.
• It shows
• how information enters the System
• How Information leaves the system
• what changes the information
• where information is stored.
• Good DFDs are critical to the process
• Building DFDs == understanding the system
• Analyzing DFDs == understanding the threats
DFD Types of Notation
• Yourdon & Coad
• Gene & Sarson are the two main methods of notation used in DFDs
Elements of DFD
External Entity
• An external agent is a source or destination of data
• The sharp cornered rectangles(or simply boxes) in a
DFD indicates entities.
• An external entity, which are also known as
terminators, sources, sinks, or actors, are an outside
system or process that sends or receives data to and
from the diagrammed system
• Entities are people things, organizations etc of data from the system
Process
• A process is a unit of work that operates on the data
• The rounded cornered rectangles or circles in a DFD indicate
processes
• Process is a procedure that manipulates the data and its flow by
taking incoming data, changing it, and producing an output with it
Examples of Process
Data Store
• Opened sided rectangles in DFD indicates data store.
• The Data Store symbol represents data that is not moving (delayed
data at rest).
• A Data Store is a logical repository of data.
• Data can be written into the data store. This is depicted by an
incoming arrow.
• Data can be read from a data store. This is depicted by an outgoing
arrow.
Data store
• Two data stores cannot be connected by a data flow.
• External entity cannot read or write to the data store.
Examples of data store
Data flow
• Data flow is the path the system’s information takes from external
entities through processes and data stores. With arrows and succinct
labels, the DFD can show you the direction of the data flow.
• Arrow symbol in DFD indicate data flow
• The Data Flow symbol represents movement of data
•
Examples of Data Flow
Rules of Data Flow
• Data can flow from
• External Entity to Process
• Process to External Entity
• Process to Data Store and back
• Process to Process
Rules of Data Flow
• Data cannot flow from
• External entity to External Entity
• External entity to store
• Data store to External Entity
• Data store to Data Store
Four rules of thumb to create a valid DFD.
1. Each process should have at least one input and one output.
2. Each data store should have at least one data flow in and data flow
out.
3. A system’s stored data must go through a process.
4. All processes in a DFD must link to another process or data store.
Creating Data Flow Diagrams
Steps:
1. Create a list of activities
2. Construct Context Level DFD
(identifies external entities and processes)
3. Construct Level 0 DFD
(identifies manageable sub process )
4. Construct Level 1- n DFD
(identifies actual data flows and data stores )
5. Check against rules of DFD
DFD Naming Guidelines
• External Entity  Noun
• Data Flow  Names of data
• Process  verb phrase
• a system name
• a subsystem name
• Data Store  Noun
Creating Data Flow Diagrams
Lemonade Stand Example
Creating Data Flow Diagrams
Steps:
1. Create a list of activities
• New way: use Use-Case Diagram
2. Construct Context Level DFD
(identifies sources and sink)
3. Construct Level 0 DFD
(identifies manageable sub processes )
4. Construct Level 1- n DFD
(identifies actual data flows and data stores )
Example
The operations of a simple
lemonade stand will be used
to demonstrate the creation
of dataflow diagrams.
Creating Data Flow Diagrams
1. Create a list of activities
Example
Think through the activities
that take place at a
lemonade stand.
Customer Order
Produce Product
Serve Product
Collect Payment
Store Product
Creating Data Flow Diagrams
Example
Also think of the additional
activities needed to support
the basic activities.
Customer Order
Produce Product
Serve Product
Collect Payment
Store Product
Order Raw Materials
Pay for Raw Materials
Pay for Labor
1. Create a list of activities
Creating Data Flow Diagrams
Example
Group these activities in
some logical fashion,
possibly functional areas.
Customer Order
Serve Product
Collect Payment
Produce Product
Store Product
Order Raw Materials
Pay for Raw Materials
Pay for Labor
1. Create a list of activities
Process
Customer Order Sale
Serve Product
Collect Payment
Produce Product (product
preparation)
Production
Store Product
Order Raw Materials Procurement
Pay for Raw Materials
Pay for Labor payroll
Context Level DFD
• Basic form of DFD
• Called as Level 0
• Shows how the system works at a Glance
• There is only one process(Main Process) in the system and all the data
flows either into or out of this process.
• Context level DFD’s demonstrates the interactions between the
process and external entities.
• They do not contain Data Stores.
Creation of Context Level DFD
• Identify the main system (Lemonade)
• Identify the external people who interact with the system
• Customer
• Employee (Seller)
• Vendor (for purchasing Raw Materials)
• Decide what data these entities will enter into the system
• Determine what these entities expect as output from the system
Creating Data Flow Diagrams
Context Level DFD
Example
Create a context level
diagram identifying the
sources and sinks (users).
Customer Order
Serve Product
Collect Payment
Produce Product
Store Product
Order Raw Materials
Pay for Raw Materials
Pay for Labor
2. Construct Context Level DFD
(identifies sources and sink)
0.0
Lemonade
System
EMPLOYEE
CUSTOMER
Pay
Payment
Order
VENDOR
Payment
Purchase Order
Produce Product
Received Goods
Time Worked
Sales Forecast
Product Served
Creating Data Flow Diagrams
Level 0 DFD
Example
Create a level 0 diagram
identifying the logical
subsystems that may exist.
Customer Order
Serve Product
Collect Payment
Produce Product
Store Product
Order Raw Materials
Pay for Raw Materials
Pay for Labor
3. Construct Level 0 DFD
(identifies manageable sub processes )
2.0
Production
EMPLOYEE
Produce
Product
1.0
Sale
3.0
Procure-
ment
Sales Forecast
Product Ordered
CUSTOMER
Pay
Payment
Customer Order
VENDOR
Payment
Purchase Order
Order
Decisions
Received Goods
Time Worked
Inventory
Product Served
4.0
Payroll
• Create a level 1 decomposing the processes in level 0 and identifying
data stores
• In the Level 0 with respect to 1.0 Sale
Main Process Sub Process Data Stores
Sale Customer Order Customer , Order
Payment Order, Payment
Sales forecast Payment
Creating Data Flow Diagrams
Level 1 DFD
Example
Create a level 1
decomposing the processes
in level 0 and identifying
data stores.
4. Construct Level 1- n DFD
(identifies actual data flows and data stores )
1.3
Produce
Sales
Forecast
Sales Forecast
Payment
Customer Order
Serve Product
Collect Payment
Produce Product
Store Product
Order Raw Materials
Pay for Raw Materials
Pay for Labor
1.1
Record
Order
Customer Order
ORDER
1.2
Receive
Payment
PAYMENT
Severed Order
Request for Forecast
CUSTOMER
Creating Data Flow Diagrams
Level 1 DFD
Example
Create a level 1
decomposing the processes
in level 0 and identifying
data stores.
4. Construct Level 1 (continued)
Customer Order
Serve Product
Collect Payment
Produce Product
Store Product
Order Raw Materials
Pay for Raw Materials
Pay for Labor
2.1
Serve
Product
Product Order
ORDER
2.2
Produce
Product
INVENTORTY
Quantity Severed
Production
Schedule
RAW
MATERIALS
2.3
Store
Product
Quantity Produced &
Location Stored
Quantity Used
Production Data
Creating Data Flow Diagrams
Level 1 DFD
Example
Create a level 1
decomposing the processes
in level 0 and identifying
data stores.
4. Construct Level 1 (continued)
Customer Order
Serve Product
Collect Payment
Produce Product
Store Product
Order Raw Materials
Pay for Raw Materials
Pay for Labor
3.1
Produce
Purchase
Order
Order Decision
PURCHASE
ORDER
3.2
Receive
Items
Received
Goods
RAW
MATERIALS
3.3
Pay
Vendor
Quantity
Received
Quantity On-Hand
RECEIVED
ITEMS
VENDOR
Payment Approval
Payment
Creating Data Flow Diagrams
Level 1 DFD
Example
Create a level 1
decomposing the processes
in level 0 and identifying
data stores.
4. Construct Level 1 (continued)
Time Worked
Customer Order
Serve Product
Collect Payment
Produce Product
Store Product
Order Raw Materials
Pay for Raw Materials
Pay for Labor
4.1
Record
Time
Worked
TIME CARDS
4.2
Calculate
Payroll
Payroll Request
EMPLOYEE
4.3
Pay
Employe
e
Employee ID
PAYROLL
PAYMENTS
Payment Approval
Payment
Unpaid time cards
Process Decomposition
4.1
Record
Time
Worked
4.2
Calculate
Payroll
4.3
Pay
Employe
e
3.1
Produce
Purchase
Order
3.2
Receive
Items
3.3
Pay
Vendor
2.1
Serve
Product
2.2
Produce
Product
2.3
Store
Product
1.1
Record
Order
1.2
Receive
Payment
2.0
Production
1.0
Sale
3.0
Procure-
ment
4.0
Payroll
0.0
Lemonade
System
Level 0 Level 1
Context Level
Let us try to identify some candidate
keys for this relation
Key Rationale
EmployeeNo This seems to be a good candidate key as companies usually issue a unique
number for each employee.
AadharNo This seems to be a good candidate key for a company based in India. However
we have assumed that every employee has an Aadhar number which may not
be true. In addition for a multinational firm with employees across the globe
this will not work at all.
Name,
DateOfBirth
This might work for a small firm of 10 people as a combination of Name and
Date of Birth is likely to be unique.
Salary This is not a good candidate as salary is generally paid uniformly to people at
same level.
EmployeeNo,
DateOfBirth
It is not a candidate key as EmployeeNo alone is unique. By definition only
minimal set of attributes can be candidate key.
Thus the choice of candidate key depends upon the business context.
Key Rationale
EmployeeNo Good candidate as it is numeric, cannot be null
and does not change with time.
AadharNo It will be null for people who do not have
Aadhar number. Hence it cannot be considered
as primary key.
Name, DateOfBirth Both Name and DateOfBirth cannot be null.
However even if uniqueness is guaranteed, it is
not a good choice due to large size.
When two or more columns together identify the unique row then
it's referred to as Composite Primary Key. The combination of
Name and DateOfBirth if selected as a primary key would be a
composite primary key.
Database Users and Administrators
• Primary goal of the database system is to retrieve information and
store new information in the database
• 2 Types of Users
• Database Users
• Database Administrators
Database Users
• 4 Types of Database Users
• Naive users
• Application Programmers
• Sophisticated Users
• Specialized Users
Naïve Users
• They are unsophisticated users who interact with the system by
invoking one of the application program that have been written
previously.
• Simple User, who operates the applications
• Does not write any code
Application Programmers
• They are computer professionals who write application programs.
• Application programmers can choose from many tools to develop
user interfaces.
Sophisticated Users
•They interact with the system without writing
programs.
•Instead they write their requests in a database
query language
Specialized Users
•They are sophisticated users who write
specialized database applications that do not fit
in to the traditional data-processing frame work.
Database Administrator
• DBA is the person or group that is responsible for supervising
both the data base and the use of the DBMS
• Roles of DBA
• Policy Formulation and Implementation
• Access privileges
• Security
• Disaster planning
• Archives
• Data Dictionary Management
• Training
• DBMS Support
• DBMS evaluation and selection
• DBMS responsibility
• Data Base Design
DBA formulates policies and communicates
these policies to users.
Among these policies are those covering access
• privilege,
• security,
• disaster planning,
• archives.
AccessPrivilege
• Access to every table and field in the database is not a necessity for
every user.
• User A Access denide Access permitted
Security
• Security refers to the prevention of unauthorized access to the
database
• Once access privilege have been specified and security features are in
place,
• DBA draws up policies to explain the security privileges and then
distributes these policies to authorized users.
Planning for Disaster
• A database may be harmed by some physical problems or by some
natural disasters.
• For this the DBA’s responsibility is to establish and implement backup
and recovery procedures.
Archives
• This tells that data need to be kept in the database for only a limited
time.
• A data archive is a place where a record of certain corporate data is
kept
• Data Dictionary Management (Data About Data)
• Data dictionary contains a wider range of information, including at the
very least, information on tables, indexes, and programs.
• The creation and distribution of appropriate reports from the data
dictionary is another of DBA”s responsibility.
UNIT I
INTRODUCTION
Data Base and Database Users
 A database is a collection of related data. By data, we mean known facts that can be
recorded and that have implicit meaning.
 For example, consider the names, telephone numbers, and addresses of the people you
know.
 Nowadays, this data is typically stored in mobile phones, which have their own simple
database software.
 This data can also be recorded in an indexed address book or stored on a hard drive,
using a personal computer and software such as Microsoft Access or Excel.
 This collection of related data with an implicit meaning is a database.
Properties
 A database represents some aspect of the real world, sometimes called the
miniworld or the universe of discourse (UoD). Changes to the miniworld are
reflected in the database.
 A database is a logically coherent collection of data with some inherent meaning.
A random assortment of data cannot correctly be referred to as a database.
 A database is designed, built, and populated with data for a specific purpose. It
has an intended group of users and some preconceived applications in which
these users are interested.
 A database management system (DBMS) is a computerized system that enables users to
create and maintain a database.
 The DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications.
 Defining a database involves specifying the data types, structures, and constraints of the data
to be stored in the database.
 The database definition or descriptive information is also stored by the DBMS in the form of
a database catalog or dictionary; it is called meta-data.
 Constructing the database is the process of storing the data on some storage
medium that is controlled by the DBMS.
 Manipulating a database includes functions such as querying the database to
retrieve specific data, updating the database to reflect changes in the mini world,
and generating reports from the data.
 Sharing a database allows multiple users and programs to access the database
simultaneously.
Database Environment
Database Example
Database Management system : UNit I Helping to understand basics of dbms operation
Characteristics of the Database Approach
 A number of characteristics distinguish the database approach from the much older approach of writing
customized programs to access data stored in files.
 In traditional file processing, each user defines and implements the files needed for a specific software
application as part of programming the application.
 For example, one user, the grade reporting office, may keep files on students and their grades.
 Programs to print a student’s transcript and to enter new grades are implemented as part of the
application.
 A second user, the accounting office, may keep track of students’ fees and their payments. Although
both users are interested in data about students, each user maintains separate files—and programs to
manipulate these files—because each requires some data not available from the other user’s files.
 The main characteristics of the database approach versus the file-processing
approach are the following:
 Self-describing nature of a database system
 Insulation between programs and data, and data abstraction
 Support of multiple views of the data
 Sharing of data and multiuser transaction processing
Self-Describing Nature of a Database
System
 A fundamental characteristic of the database approach is that the database system
contains not only the database itself but also a complete definition or description of
the database structure and constraints.
 This definition is stored in the DBMS catalog, which contains information such as the
structure of each file, the type and storage format of each data item, and various
constraints on the data.
 The information stored in the catalog is called meta-data, and it describes the
structure of the primary database
 It is important to note that some newer types of database systems, known as NOSQL systems,
do not require meta-data.
 Rather the data is stored as self-describing data that includes the data item names and data
values together in one structure
 The catalog is used by the DBMS software and also by database users who need information
about the database structure.
 A general-purpose DBMS software package is not written for a specific database application.
 Therefore, it must refer to the catalog to know the structure of the files in a specific database,
such as the type and format of data it will access.
Example
Insulation between Programs and Data,
and Data Abstraction
 In traditional file processing, the structure of data files is embedded in the application
programs, so any changes to the structure of a file may require changing all programs
that access that file.
 By contrast, DBMS access programs do not require such changes in most cases.
 The structure of data files is stored in the DBMS catalog separately from the access
programs.
 We call this property program-data independence.
 An operation (also called a function or method) is specified in two
parts.
 The interface (or signature) of an operation includes the operation
name and the data types of its arguments (or parameters).
 The implementation (or method) of the operation is specified separately
and can be changed without affecting the interface.
 User application programs can operate on the data by invoking these
operations through their names and arguments, regardless of how the
operations are implemented. This may be termed program-operation
independence.
Support of Multiple Views of the Data
 A database typically has many types of users, each of whom may require a
different perspective or view of the database.
 A view may be a subset of the database or it may contain virtual data that is
derived from the database files but is not explicitly stored.
 Some users may not need to be aware of whether the data they refer to is stored or
derived.
 A multiuser DBMS whose users have a variety of distinct applications must
provide facilities for defining multiple views.
Sharing of Data and Multiuser Transaction
Processing
 A multiuser DBMS, as its name implies, must allow multiple users to access the
database at the same time.
 This is essential if data for multiple applications is to be integrated and
maintained in a single database.
 The DBMS must include concurrency control software to ensure that several
users trying to update the same data do so in a controlled manner so that the result
of the updates is correct.
Two views from Database
Advantages of Using the DBMS Approach
 Controlling Redundancy
 Restricting Unauthorized Access
 Providing Persistent Storage for Program Objects
 Providing Storage Structures and Search Techniques for Efficient Query Processing
 Providing Backup and Recovery
 Providing Multiple User Interfaces
 Representing Complex Relationships among Data
 Enforcing Integrity Constraints
 Permitting Inferencing and Actions Using Rules and Triggers
 Additional Implications of Using the Database Approach
 Potential for Enforcing Standards.
 The database approach permits the DBA to define and enforce standards among database
users in a large organization. This facilitates communication and cooperation among various
departments, projects, and users within the organization. Standards can be defined for names
and formats of data elements, display formats, report structures, terminology, and so on.
 Reduced Application Development Time.
 A prime selling feature of the database approach is that developing a new application—such
as the retrieval of certain data from the database for printing a new report—takes very little
time. Designing and implementing a large multiuser database from scratch may take more
time than writing a single specialized file application.
 Flexibility.
 It may be necessary to change the structure of a database as requirements change. For
example, a new user group may emerge that needs information not currently in the
database. In response, it may be necessary to add a file to the database or to extend the
data elements in an existing file.
 Availability of Up-to-Date Information.
 A DBMS makes the database available to all users. As soon as one user’s update is
applied to the database, all other users can immediately see this update. This
availability of up-to-date information is essential for many transaction-processing
applications, such as reservation systems or banking databases, and it is made possible
by the concurrency control and recovery subsystems of a DBMS.
 Economies of Scale.
 The DBMS approach permits consolidation of data and applications, thus reducing the
amount of wasteful overlap between activities of data-processing personnel in
different projects or departments as well as redundancies among applications.
 This enables the whole organization to invest in more powerful processors, storage
devices, or networking gear, rather than having each department purchase its own
(lower performance) equipment.
 This reduces overall costs of operation and management.`
Data Models, Schemas, and Instances
 A data model—a collection of concepts that can be used to describe the structure of a
database—provides the necessary means to achieve this abstraction.
 By structure of a database we mean the data types, relationships, and constraints that apply
to the data.
 Most data models also include a set of basic operations for specifying retrievals and
updates on the database.
Categories of Data Models
 Many data models have been proposed, which we can categorize according to the types of
concepts they use to describe the database structure.
 High-level or conceptual data models provide concepts that are close to the way many
users perceive data, whereas low-level or physical data models provide concepts that
describe the details of how data is stored on the computer storage media, typically magnetic
disks.
 Concepts provided by physical data models are generally meant for computer specialists,
not for end users.
 Between these two extremes is a class of representational (or implementation) data
models, which provide concepts that may be easily understood by end users but that are not
too far removed from the way data is organized in computer storage.
 Conceptual data models use concepts such as entities, attributes, and relationships.
 An entity represents a real-world object or concept, such as an employee or a project from
the mini-world that is described in the database.
 An attribute represents some property of interest that further describes an entity, such as
the employee’s name or salary.
 A relationship among two or more entities represents an association among the entities,
for example, a works-on relationship between an employee and a project.
 Representational or implementation data models are the models used most frequently in
traditional commercial DBMSs.
 These include the widely used relational data model, as well as the so-called legacy
data models—the network and hierarchical models—that have been widely used in the
past. Part 3 of the text is devoted to the relational data model, and its constraints,
operations, and languages.
 Representational data models represent data by using record structures and hence are
sometimes called record-based data model
Schema diagram for the database
 The description of a database is called
the database schema, which is specified
during database design and is not
expected to change frequently.
 Most data models have certain
conventions for displaying schemas as
diagrams.
 A displayed schema is called a schema
diagram
Architecture and Data Independence
 The internal level has an internal schema, which describes the physical storage structure of
the database.
 The internal schema uses a physical data model and describes the complete details of data
storage and access paths for the database.
 The conceptual level has a conceptual schema, which describes the structure of the whole
database for a community of users.
 The conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and constraints.
 Usually, a representational data model is used to describe the conceptual schema when a
database system is implemented.
 This implementation conceptual schema is often based on a conceptual schema design in a
high-level data model.
 The external or view level includes a number of external schemas or user views.
 Each external schema describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group.
 As in the previous level, each external schema is typically implemented using a
representational data model, possibly based on an external schema design in a high-
level conceptual data model.
 The three-schema architecture is a convenient tool with which the user can visualize the
schema levels in a database system.
 Most DBMSs do not separate the three levels completely and explicitly, but they support the
three-schema architecture to some extent.
 Some older DBMSs may include physical-level details in the conceptual schema.
 The three-level ANSI architecture has an important place in database technology
development because it clearly separates the users’ external level, the database’s conceptual
level, and the internal storage level for designing a database.
 It is very much applicable in the design of DBMSs, even today.
 In most DBMSs that support user views, external schemas are specified in the same data
model that describes the conceptual-level information
Data Independence
 Logical data independence is the capacity to change the conceptual schema without having to
change external schemas or application programs.
 We may change the conceptual schema to expand the database (by adding a record type or data
item), to change constraints, or to reduce the database (by removing a record type or data item).
 In the last case, external schemas that refer only to the remaining data should not be affected.
 Only the view definition and the mappings need to be changed in a DBMS that supports logical
data independence.
 After the conceptual schema undergoes a logical reorganization, application programs that
reference the external schema constructs must work as before.
 Changes to constraints can be applied to the conceptual schema without affecting the external
schemas or application programs.
 Physical data independence is the capacity to change the internal schema without having
to change the conceptual schema. Hence, the external schemas need not be changed as
well.
 Changes to the internal schema may be needed because some physical files were
reorganized—for example, by creating additional access structures—to improve the
performance of retrieval or update.
 If the same data as before remains in the database, we should not have to change the
conceptual schema.
Database Languages and Interfaces
 Once the design of a database is completed and a DBMS is chosen to implement the
database, the first step is to specify conceptual and internal schemas for the database and
any mappings between the two.
 In many DBMSs where no strict separation of levels is maintained, one language, called
the data definition language (DDL), is used by the DBA and by database designers to
define both schemas.
 The DBMS will have a DDL compiler whose function is to process DDL statements in
order to identify descriptions of the schema constructs and to store the schema description
in the DBMS catalog.
 In DBMSs where a clear separation is maintained between the conceptual and internal levels, the
DDL is used to specify the conceptual schema only.
 Another language, the storage definition language (SDL), is used to specify the internal
schema.
 The mappings between the two schemas may be specified in either one of these languages. In
most relational DBMSs today, there is no specific language that performs the role of SDL.
 Instead, the internal schema is specified by a combination of functions, parameters, and
specifications related to storage of files. These permit the DBA staff to control indexing choices
and mapping of data to storage.
 For a true three-schema architecture, we would need a third language, the view definition
language (VDL), to specify user views and their mappings to the conceptual schema, but in
most DBMSs the DDL is used to define both conceptual and external schemas.
 There are two main types of DMLs.
 A high-level or nonprocedural DML can be used on its own to specify complex database
operations concisely.
 Many DBMSs allow high-level DML statements either to be entered interactively from a display
monitor or terminal or to be embedded in a general-purpose programming language.
 A low-level or procedural DML must be embedded in a general-purpose programming language.
This type of DML typically retrieves individual records or objects from the database and processes
each separately.
 Therefore, it needs to use programming language constructs, such as looping, to retrieve and process
each record from a set of records.
 Low-level DMLs are also called record-at-a-time DMLs because of this property. High-
level DMLs, such as SQL, can specify and retrieve many records in a single DML
statement; therefore, they are called set-at-a-time or set-oriented DMLs.
 A query in a high-level DML often specifies which data to retrieve rather than how to
retrieve it; therefore, such languages are also called declarative.
 Whenever DML commands, whether high level or low level, are embedded in a general-
purpose programming language, that language is called the host language and the DML is
called the data sublanguage.
 On the other hand, a high-level DML used in a standalone interactive manner is called a
query language.
DBMS Interfaces
 Menu-based Interfaces for Web Clients or Browsing. These interfaces present the
user with lists of options (called menus) that lead the user through the formulation of a
request.
 Menus do away with the need to memorize the specific commands and syntax of a
query language; rather, the query is composed step-by-step by picking options from a
menu that is displayed by the system.
 Pull-down menus are a very popular technique in Web-based user interfaces.
 They are also often used in browsing interfaces, which allow a user to look through the
contents of a database in an exploratory and unstructured manner.
Apps for Mobile Devices.
 These interfaces present mobile users with access to their data.
 For example, banking, reservations, and insurance companies, among many others,
provide apps that allow users to access their data through a mobile phone or mobile
device.
 The apps have built-in programmed interfaces that typically allow users to login using
their account name and password; the apps then provide a limited menu of options for
mobile access to the user data, as well as options such as paying bills (for banks) or
making reservations (for reservation Web sites).
 Forms-based Interfaces.
 A forms-based interface displays a form to each user.
 Users can fill out all of the form entries to insert new data, or they can fill out
only certain entries, in which case the DBMS will retrieve matching data for the
remaining entries.
 Forms are usually designed and programmed for naive users as interfaces to
canned transactions.
 Many DBMSs have forms specification languages, which are special languages
that help programmers specify such forms.
 SQL*Forms is a form-based language that specifies queries using a form
designed in conjunction with the relational database schema.
 Oracle Forms is a component of the Oracle product suite that provides an
extensive set of features to design and build applications using forms.
 Some systems have utilities that define a form by letting the end user
interactively construct a sample form on the screen.
 Graphical User Interfaces. A GUI typically displays a schema to the user in
diagrammatic form.
 The user then can specify a query by manipulating the diagram. In many cases,
GUIs utilize both menus and forms.
 Natural Language Interfaces
 These interfaces accept requests written in English or some other language and attempt to
understand them.
 A natural language interface usually has its own schema, which is similar to the database
conceptual schema, as well as a dictionary of important words.
 The natural language interface refers to the words in its schema, as well as to the set of
standard words in its dictionary, that are used to interpret the request.
 If the interpretation is successful, the interface generates a high-level query corresponding to
the natural language request and submits it to the DBMS for processing; otherwise, a
dialogue is started with the user to clarify the request.
 Keyword-based Database Search.
 These are somewhat similar to Web search engines, which accept strings of natural
language (like English or Spanish) words and match them with documents at
specific sites (for local search engines) or Web pages on the Web at large (for
engines like Google or Ask).
 They use predefined indexes on words and use ranking functions to retrieve and
present resulting documents in a decreasing degree of match. Such “free form”
textual query interfaces are not yet common in structured relational databases,
although a research area called keyword-based querying has emerged recently for
relational databases.
 Speech Input and Output.
 Limited use of speech as an input query and speech as an answer to a question or result of
a request is becoming commonplace.
 Applications with limited vocabularies, such as inquiries for telephone directory, flight
arrival/departure, and credit card account information, are allowing speech for input and
output to enable customers to access this information.
 The speech input is detected using a library of predefined words and used to set up the
parameters that are supplied to the queries.
 For output, a similar conversion from text or numbers into speech takes place.
 Interfaces for Parametric Users.
 Parametric users, such as bank tellers, often have a small set of operations that they must perform
repeatedly. For example, a teller is able to use single function keys to invoke routine and repetitive
transactions such as account deposits or withdrawals, or balance inquiries.
 Systems analysts and programmers design and implement a special interface for each known class of naive
users.
 Usually a small set of abbreviated commands is included, with the goal of minimizing the number of
keystrokes required for each request.
 Interfaces for the DBA. Most database systems contain privileged commands that can be used only by the
DBA staff.
 These include commands for creating accounts, setting system parameters, granting account authorization,
changing a schema, and reorganizing the storage structures of a database.
Data Modeling Using the Entity– Relationship (ER)
Model
 The first step shown is requirements collection and analysis.
 During this step, the database designers interview prospective database users to
understand and document their data requirements.
 The result of this step is a concisely written set of users’ requirements.
 These requirements should be specified in as detailed and complete a form as possible.
In parallel with specifying the data requirements, it is useful to specify the known
functional requirements of the application.
 These consist of the user defined operations (or transactions) that will be applied to the
database, including both retrievals and updates. In software design, it is common to use
data flow diagrams, sequence diagrams, scenarios, and other techniques to specify
functional requirements.
 Once the requirements have been collected and analyzed, the next step is to create a conceptual schema
for the database, using a high-level conceptual data model.
 This step is called conceptual design. The conceptual schema is a concise description of the data
requirements of the users and includes detailed descriptions of the entity types, relationships, and
constraints; these are expressed using the concepts provided by the high-level data model.
 Because these concepts do not include implementation details, they are usually easier to understand and
can be used to communicate with nontechnical users.
 The high-level conceptual schema can also be used as a reference to ensure that all users’ data
requirements are met and that the requirements do not conflict.
 This approach enables database designers to concentrate on specifying the properties
of the data, without being concerned with storage and implementation details, which
makes it is easier to create a good conceptual database design.
 During or after the conceptual schema design, the basic data model operations can be
used to specify the high-level user queries and operations identified during functional
analysis.
 This also serves to confirm that the conceptual schema meets all the identified
functional requirements.
 Modifications to the conceptual schema can be introduced if some functional
requirements cannot be specified using the initial schema.
 The next step in database design is the actual implementation of the database,
using a commercial DBMS.
 Most current commercial DBMSs use an implementation data model—such as the
relational (SQL) model—so the conceptual schema is transformed from the high-
level data model into the implementation data model.
 This step is called logical design or data model mapping; its result is a database
schema in the implementation data model of the DBMS.
 Data model mapping is often automated or semi-automated within the
database design tools.
 The last step is the physical design phase, during which the internal
storage structures, file organizations, indexes, access paths, and physical
design parameters for the database files are specified.
 In parallel with these activities, application programs are designed and
implemented as database transactions corresponding to the high-level
transaction specifications.
Entity Types, Entity Sets, Attributes,
and Keys
Two Entities
Entities and Attributes
 Entities and Their Attributes.
 The basic concept that the ER model represents is an entity, which is a thing
or object in the real world with an independent existence.
 An entity may be an object with a physical existence (for example, a particular
person, car, house, or employee) or it may be an object with a conceptual
existence (for instance, a company, a job, or a university course).
 Each entity has attributes—the particular properties that describe it.
Single-Valued versus Multivalued Attributes
 Most attributes have a single value for a particular entity; such attributes are called single-
valued. For example,
 Age is a single-valued attribute of a person. In some cases an attribute can have a set of
values for the same entity—for instance, a Colors attribute for a car, or a College_degrees
attribute for a person.
 Cars with one color have a single value, whereas two-tone cars have two color values.
Similarly, one person may not have any college degrees, another person may have one, and
a third person may have two or more degrees; therefore, different people can have different
numbers of values for the College_degrees attribute. Such attributes are called multivalued.
 A multivalued attribute may have lower and upper bounds to constrain the number of
values allowed for each individual entity.
 For example, the Colors attribute of a car may be restricted to have between one and
two values, if we assume that a car can have two colors at most.
Stored versus Derived Attributes
 In some cases, two (or more) attribute values are related—for example, the Age and
Birth_date attributes of a person.
 For a particular person entity, the value of Age can be determined from the current
(today’s) date and the value of that person’s Birth_date.
 The Age attribute is hence called a derived attribute and is said to be derivable from
the Birth_date attribute, which is called a stored attribute.
 Some attribute values can be derived from related entities; for example, an attribute
Number_of_employees of a DEPARTMENT entity can be derived by counting the
number of employees related to (working for) that department.
NULL Values
 In some cases, a particular entity may not have an applicable value for an attribute.
For example, the Apartment_number attribute of an address applies only to addresses
that are in apartment buildings and not to other types of residences, such as single-
family homes. Similarly, a College_degrees attribute applies only to people with
college degrees.
 For such situations, a special value called NULL is created.
 The meaning of the former type of NULL is not applicable, whereas the meaning of
the latter is unknown.
 The unknown category of NULL can be further classified into two cases. The first
case arises when it is known that the attribute value exists but is missing—for
instance, if the Height attribute of a person is listed as NULL.
Complex Attributes
 Notice that, in general, composite and multivalued attributes can be nested
arbitrarily.
 We can represent arbitrary nesting by grouping components of a composite
attribute between parentheses ( ) and separating the components with commas,
and by displaying multivalued attributes between braces { }.
 Such attributes are called complex attributes. For example, if a person can have
more than one residence and each residence can have a single address and
multiple phones, an attribute Address_phone for a person can be specified as
Both Phone and Address are themselves composite attributes.
Entity Types, Entity Sets, Keys, and Value Sets
 Entity Types and Entity Sets. A database
usually contains groups of entities that are
similar.
 For example, a company employing hundreds
of employees may want to store similar
information concerning each of the employees.
 These employee entities share the same
attributes, but each entity has its own value(s)
for each attribute.
 An entity type defines a collection (or set) of
entities that have the same attributes.
 Each entity type in the database is described by
its name and attributes.
 Database at any point in time is called an entity set or entity collection;
the entity set is usually referred to using the same name as the entity
type, even though they are two separate concepts.
 For example, EMPLOYEE refers to both a type of entity as well as the
current collection of all employee entities in the database.
 It is now more common to give separate names to the entity type and
entity collection; for example in object and object-relational data models
Key Attributes of an Entity Type
 An important constraint on the entities of an entity type is the key or
uniqueness constraint on attributes.
 An entity type usually has one or more attributes whose values are distinct
for each individual entity in the entity set.
 Such an attribute is called a key attribute, and its values can be used to
identify each entity uniquely.
 Sometimes several attributes together form a key, meaning that the combination of the
attribute values must be distinct for each entity.
 If a set of attributes possesses this property, the proper way to represent this in the ER
model that describe here is to define a composite attribute and designate it as a key
attribute of the entity type.
 Specifying that an attribute is a key of an entity type means that the preceding
uniqueness property must hold for every entity set of the entity type.
 Hence, it is a constraint that prohibits any two entities from having the same value for
the key attribute at the same time.
 It is not the property of a particular entity set; rather, it is a constraint on any entity set
of the entity type at any point in time.
Database Management system : UNit I Helping to understand basics of dbms operation
Value Sets (Domains) of Attributes
 Each simple attribute of an entity type is associated with a value set (or domain of
values), which specifies the set of values that may be assigned to that attribute for each
individual entity.
 If the range of ages allowed for employees is between 16 and 70, we can specify the value
set of the Age attribute of EMPLOYEE to be the set of integer numbers between 16 and
70.
 Similarly, we can specify the value set for the Name attribute to be the set of strings of
alphabetic characters separated by blank characters, and so on.
 Value sets are not typically displayed in basic ER diagrams and are similar to the basic
data types available in most programming languages, such as integer, string, Boolean,
Relationship Types, Relationship Sets,
Roles, and Structural Constraints
 For example, the attribute Manager of DEPARTMENT refers to an employee who
manages the department; the attribute Controlling_department of PROJECT refers to the
department that controls the project; the attribute
 Supervisor of EMPLOYEE refers to another employee (the one who supervises this
employee); the attribute Department of EMPLOYEE refers to the department for which
the employee works; and so on.
 In the ER model, these references should not be represented as attributes but as
relationships.
 A relationship type R among n entity types E1, E2, . . . , En defines a set of
associations— or a relationship set—among entities from these entity types.
 Similar to the case of entity types and entity sets, a relationship type and its
corresponding relationship set are customarily referred to by the same name, R.
 Mathematically, the relationship set R is a set of relationship instances ri, where
each ri associates n individual entities (e1, e2, . . . , en), and each entity ej in ri is a
member of entity set Ej, 1 ≤ j ≤ n. Hence, a relationship set is a mathematical relation
on E1, E2, . . . , En; alternatively, it can be defined as a subset of the Cartesian
product of the entity sets
 E1 × E2 × . . . × En. Each of the entity types E1, E2, . . . , En is said to participate in
the relationship type R; similarly, each of the individual entities e1, e2, . . . , en is said
Database Management system : UNit I Helping to understand basics of dbms operation
 EMPLOYEE and DEPARTMENT, which associates each employee with the department
for which the employee works.
 Each relationship instance in the relationship set WORKS_FOR associates one
EMPLOYEE entity and one DEPARTMENT entity.
 Figure illustrates this example, where each relationship instance ri is shown connected to
the EMPLOYEE and DEPARTMENT entities that participate in ri.
 In the mini-world represented by Figure, the employees e1, e3, and e6 work for department
d1; the employees e2 and e4 work for department d2; and the employees e5 and e7 work
for department d3.
Relationship Degree, Role Names, and
Recursive Relationships
 Degree of a Relationship Type
 The degree of a relationship type is the number of participating entity types. Hence, the
WORKS_FOR relationship is of degree two.
 A relationship type of degree two is called binary, and one of degree three is called ternary.
 An example of a ternary relationship is SUPPLY, shown in Figure where each relationship
instance ri associates three entities—a supplier s, a part p, and a project j—whenever s supplies
part p to project j.
 Relationships can generally be of any degree, but the ones most common are binary relationships.
 Higher-degree relationships are generally more complex than binary relationships
Relationships as Attributes
 One can think of an attribute called Department of the EMPLOYEE entity type, where the value of
Department for each EMPLOYEE entity is (a reference to) the DEPARTMENT entity for which that
employee works.
 Hence, the value set for this Department attribute is the set of all DEPARTMENT entities, which is
the DEPARTMENT entity set.
 This is what we did in Figure when we specified the initial design of the entity type EMPLOYEE
for the COMPANY database.
 However, when we think of a binary relationship as an attribute, we always have two options or two
points of view.
 In this example, the alternative point of view is to think of a multivalued attribute Employees of the
entity type DEPARTMENT whose value for each DEPARTMENT entity is the set of EMPLOYEE
entities who work for that department.
 The value set of this Employees attribute is the power set of the EMPLOYEE entity set. Either of
these two attributes—Department of EMPLOYEE or Employees of DEPARTMENT—can
represent the WORKS_FOR relationship type.
 Role Names and Recursive Relationships. Each entity type that participates in a relationship type
plays a particular role in the relationship.
 The role name signifies the role that a participating entity from the entity type plays
in each relationship instance, and it helps to explain what the relationship means.
 For example, in the WORKS_FOR relationship type, EMPLOYEE plays the role of
employee or worker and DEPARTMENT plays the role of department or employer.
 In such cases the role name becomes essential for distinguishing the meaning of the
role that each participating entity plays. Such relationship types are called recursive
relationships or self-referencing relationships.
Database Management system : UNit I Helping to understand basics of dbms operation
Weak Entity Types
 Entity types that do not have key attributes of their own are called weak entity types.
 In contrast, regular entity types that do have a key attribute—which include all the examples
discussed so far—are called strong entity types.
 Entities belonging to a weak entity type are identified by being related to specific entities from
another entity type in combination with one of their attribute values.
 We call this other entity type the identifying or owner entity type,10 and we call the
relationship type that relates a weak entity typeto its owner the identifying relationship of the
weak entity type.
 A weak entity type always has a total participation constraint (existence dependency)
with respect to its identifying relationship because a weak entity cannot be identified
without an owner entity.
 However, not every existence dependency results in a weak entity type. For example,
a DRIVER_LICENSE entity cannot exist unless it is related to a PERSON entity, even
though it has its own key (License_number) and hence is not a weak entity.
 A weak entity type normally has a partial key, which is the attribute that
can uniquely identify weak entities that are related to the same owner
entity.
 In our example, if we assume that no two dependents of the same
employee ever have the same first name, the attribute Name of
DEPENDENT is the partial key.
 In the worst case, a composite attribute of all the weak entity’s attributes
will be the partial key.
ER Diagram Notations
Database Management system : UNit I Helping to understand basics of dbms operation
Database Management system : UNit I Helping to understand basics of dbms operation
Enhanced ER Model
 The EER model includes all the modeling concepts of the ER model that were presented. In addition,
it includes the concepts of subclass and superclass and the related concepts of specialization and
generalization
 Another concept included in the EER model is that of a category or union type which is used to
represent a collection of objects (entities) that is the union of objects of different entity types.
 Associated with these concepts is the important mechanism of attribute and relationship
inheritance.
 Unfortunately, no standard terminology exists for these concepts, so we use the most common
terminology.
 Alternative terminology is given in footnotes. We also describe a diagrammatic technique for
displaying these concepts when they arise in an EER schema. We call the resulting schema diagrams
enhanced ER or EER diagrams.
 For example, the entity type EMPLOYEE describes the type (that is, the attributes and
relationships) of each employee entity, and also refers to the current set of EMPLOYEE
entities in the COMPANY database.
 In many cases an entity type has numerous subgroupings or subtypes of its entities that are
meaningful and need to be represented explicitly because of their significance to the database
application.
 For example, the entities that are members of the EMPLOYEE entity type may be
distinguished further into SECRETARY, ENGINEER, MANAGER, TECHNICIAN,
SALARIED_EMPLOYEE, HOURLY_EMPLOYEE, and so on.
 The set or collection of entities in each of the latter groupings is a subset of the entities that
belong to the EMPLOYEE entity set, meaning that every entity that is a member of one of
these subgroupings is also an employee.
Specialization and Generalization
 Specialization is the process of defining a set of subclasses of an entity type; this entity type is
called the superclass of the specialization. The set of subclasses that forms a specialization is
defined on the basis of some distinguishing characteristic of the entities in the superclass.
 For example, the set of subclasses {SECRETARY, ENGINEER, TECHNICIAN} is a specialization
of the superclass EMPLOYEE that distinguishes among employee entities based on the job type of
each employee.
 We may have several specializations of the same entity type based on different distinguishing
characteristics.
 For example, another specialization of the EMPLOYEE entity type may yield the set of subclasses
{SALARIED_EMPLOYEE, HOURLY_EMPLOYEE}; this specialization distinguishes among
employees based on the method of pay.
Database Management system : UNit I Helping to understand basics of dbms operation
 There are two main reasons for including class/subclass relationships and specializations.
 The first is that certain attributes may apply to some but not all entities of which these attributes apply.
 The members of the subclass may still share the majority of their attributes with the other members of
the superclass.
 The SECRETARY subclass has the specific attribute Typing_speed, whereas the ENGINEER subclass
has the specific attribute Eng_type, but SECRETARY and ENGINEER share their other inherited
attributes from the EMPLOYEE entity type.
 The second reason for using subclasses is that some relationship types may be participated in only by
entities that are members of the subclass.
 For example, if only HOURLY_EMPLOYEES can belong to a trade union, we can represent that fact by
creating the subclass HOURLY_EMPLOYEE of EMPLOYEE and relating the subclass to an entity type
TRADE_UNION via the BELONGS_TO relationship type, as illustrated in Figure
Generalization
 Notice that the generalization process can be viewed as being functionally the inverse of
the specialization process; we can view {CAR, TRUCK} as a specialization of
VEHICLE rather than viewing VEHICLE as a generalization of CAR and TRUCK.
 A diagrammatic notation to distinguish between generalization and specialization is used
in some design methodologies.
 An arrow pointing to the generalized superclass represents a generalization process,
whereas arrows pointing to the specialized subclasses represent a specialization process.
 We will not use this notation because the decision as to which process was followed in a
particular situation is often subjective.
Constraints and Characteristics of Specialization
and Generalization Hierarchies
 If all subclasses in a specialization have their membership condition on the
same attribute of the superclass, the specialization itself is called an
attribute-defined specialization, and the attribute is called the defining
attribute of the specialization.
 In this case, all the entities with the same value for the attribute belong to the
same subclass. We display an attribute-defined specialization by placing the
defining attribute name next to the arc from the circle to the superclass
Database Management system : UNit I Helping to understand basics of dbms operation
 Two other constraints may apply to a specialization. The first is the disjointness
constraint, which specifies that the subclasses of the specialization must be disjoint sets.
 This means that an entity can be a member of at most one of the subclasses of the
specialization.
 A specialization that is attribute-defined implies the dis jointness constraint (if the
attribute used to define the membership predicate is single valued).
 Figure illustrates this case, where the d in the circle stands for disjoint. The d notation
also applies to user-defined subclasses of a specialization that must be disjoint, as
illustrated by the specialization {HOURLY_EMPLOYEE, SALARIED_EMPLOYEE} in
Figure
Database Management system : UNit I Helping to understand basics of dbms operation
 The second constraint on specialization is called the completeness (or totalness)
constraint, which may be total or partial.
 A total specialization constraint specifies that every entity in the superclass must be a
member of at least one subclass in the specialization.
 For example, if every EMPLOYEE must be either an HOURLY_EMPLOYEE or a
SALARIED_EMPLOYEE, then the specialization {HOURLY_EMPLOYEE,
SALARIED_EMPLOYEE} in Figure 4.1 is a total specialization of EMPLOYEE.
 This is shown in EER diagrams by using a double line to connect the superclass to the
circle. A single line is used to display a partial specialization, which allows an entity not to
belong to any of the subclasses.
Utilizing Specialization and Generalization in
Refining Conceptual Schemas
 In the specialization process, the database designers typically start with an entity type and
then define subclasses of the entity type by successive specialization; that is, they
repeatedly define more specific groupings of the entity type.
 For example, when designing the specialization lattice in Figure, we may first specify an
entity type PERSON for a university database.
 Then we discover that three types of persons will be represented in the database: university
employees, alumni, and students and we create the specialization {EMPLOYEE,
ALUMNUS, STUDENT}.
 The overlapping constraint is chosen because a person may belong to more than one of the
subclasses.
 We specialize EMPLOYEE further into {STAFF, FACULTY, STUDENT_ASSISTANT},
and specialize STUDENT into {GRADUATE_STUDENT,
UNDERGRADUATE_STUDENT}.
 Finally, we specialize STUDENT_ASSISTANT into {RESEARCH_ASSISTANT,
TEACHING_ASSISTANT}.
 This process is called top-down conceptual refinement.
 So far, we have a hierarchy; then we realize that STUDENT_ASSISTANT is a shared
subclass, since it is also a subclass of STUDENT, leading to the lattice.
 The final design of hierarchies or lattices resulting from either process may
be identical;
 the only difference relates to the manner or order in which the schema
super classes and subclasses were created during the design process.
 In practice, it is likely that a combination of the two processes is
employed.
Database Management system : UNit I Helping to understand basics of dbms operation
Database Management system : UNit I Helping to understand basics of dbms operation
Ad

More Related Content

Similar to Database Management system : UNit I Helping to understand basics of dbms operation (20)

01-Database Administration and Management.pdf
01-Database Administration and Management.pdf01-Database Administration and Management.pdf
01-Database Administration and Management.pdf
TOUSEEQHAIDER14
 
DATABASE MANAGEMENT SYSTEMS CS 3492.pptx
DATABASE MANAGEMENT SYSTEMS CS 3492.pptxDATABASE MANAGEMENT SYSTEMS CS 3492.pptx
DATABASE MANAGEMENT SYSTEMS CS 3492.pptx
venigkrish89
 
Data concepts
Data conceptsData concepts
Data concepts
Sachidananda M H
 
Unit 1_a3a8ea5ab7e65a96a4c41a4868546494.pptx
Unit 1_a3a8ea5ab7e65a96a4c41a4868546494.pptxUnit 1_a3a8ea5ab7e65a96a4c41a4868546494.pptx
Unit 1_a3a8ea5ab7e65a96a4c41a4868546494.pptx
AshishHiwale1
 
PHP/MySQL First Session Material
PHP/MySQL First Session MaterialPHP/MySQL First Session Material
PHP/MySQL First Session Material
National IT Professionals Association of Afghanistan
 
BCA Database Management Systems Unit - 1.pptx
BCA Database Management Systems  Unit - 1.pptxBCA Database Management Systems  Unit - 1.pptx
BCA Database Management Systems Unit - 1.pptx
jinkhatima
 
Unit1 dbms
Unit1 dbmsUnit1 dbms
Unit1 dbms
gowrivageesan87
 
CST204 DBMS Module-1
CST204 DBMS Module-1CST204 DBMS Module-1
CST204 DBMS Module-1
Jyothis Menon
 
DBMS Introduction 5 .pptx
DBMS Introduction 5 .pptxDBMS Introduction 5 .pptx
DBMS Introduction 5 .pptx
NitinKashyap89
 
Intro Duction of Database and its fundamentals .ppt
Intro Duction of Database and its fundamentals .pptIntro Duction of Database and its fundamentals .ppt
Intro Duction of Database and its fundamentals .ppt
Faisal Khan
 
UNIT-1.pptx discusses about introduction to dbms
UNIT-1.pptx discusses about introduction to dbmsUNIT-1.pptx discusses about introduction to dbms
UNIT-1.pptx discusses about introduction to dbms
DrRBullibabu
 
Introduction to Database Management System.pdf
Introduction to Database Management System.pdfIntroduction to Database Management System.pdf
Introduction to Database Management System.pdf
biswajit62002
 
dbms Unit 1.pdf arey bhai teri maa chodunga
dbms Unit 1.pdf arey bhai teri maa chodungadbms Unit 1.pdf arey bhai teri maa chodunga
dbms Unit 1.pdf arey bhai teri maa chodunga
VaradKadtan1
 
Database Management Systems (DBMS) are software systems used to store, retrie...
Database Management Systems (DBMS) are software systems used to store, retrie...Database Management Systems (DBMS) are software systems used to store, retrie...
Database Management Systems (DBMS) are software systems used to store, retrie...
mayurjagdale4
 
Dbms introduction about data models,data
Dbms introduction about data models,dataDbms introduction about data models,data
Dbms introduction about data models,data
kngaravind54
 
Unit 2 DATABASE ESSENTIALS.pptx
Unit 2 DATABASE ESSENTIALS.pptxUnit 2 DATABASE ESSENTIALS.pptx
Unit 2 DATABASE ESSENTIALS.pptx
Nirmalavenkatachalam
 
CS3270 - DATABASE SYSTEM - Lecture (1)
CS3270 - DATABASE SYSTEM -  Lecture (1)CS3270 - DATABASE SYSTEM -  Lecture (1)
CS3270 - DATABASE SYSTEM - Lecture (1)
Dilawar Khan
 
Database and Database Users_Dr.Balamurugan M.pdf
Database and Database Users_Dr.Balamurugan M.pdfDatabase and Database Users_Dr.Balamurugan M.pdf
Database and Database Users_Dr.Balamurugan M.pdf
Dr. Balamurugan M
 
CST204 DBMSMODULE1 PPT (1).pptx
CST204 DBMSMODULE1 PPT (1).pptxCST204 DBMSMODULE1 PPT (1).pptx
CST204 DBMSMODULE1 PPT (1).pptx
MEGHANA508383
 
6134DBMS Introduction 5 sem.pptx
6134DBMS Introduction 5 sem.pptx6134DBMS Introduction 5 sem.pptx
6134DBMS Introduction 5 sem.pptx
Danishompi
 
01-Database Administration and Management.pdf
01-Database Administration and Management.pdf01-Database Administration and Management.pdf
01-Database Administration and Management.pdf
TOUSEEQHAIDER14
 
DATABASE MANAGEMENT SYSTEMS CS 3492.pptx
DATABASE MANAGEMENT SYSTEMS CS 3492.pptxDATABASE MANAGEMENT SYSTEMS CS 3492.pptx
DATABASE MANAGEMENT SYSTEMS CS 3492.pptx
venigkrish89
 
Unit 1_a3a8ea5ab7e65a96a4c41a4868546494.pptx
Unit 1_a3a8ea5ab7e65a96a4c41a4868546494.pptxUnit 1_a3a8ea5ab7e65a96a4c41a4868546494.pptx
Unit 1_a3a8ea5ab7e65a96a4c41a4868546494.pptx
AshishHiwale1
 
BCA Database Management Systems Unit - 1.pptx
BCA Database Management Systems  Unit - 1.pptxBCA Database Management Systems  Unit - 1.pptx
BCA Database Management Systems Unit - 1.pptx
jinkhatima
 
CST204 DBMS Module-1
CST204 DBMS Module-1CST204 DBMS Module-1
CST204 DBMS Module-1
Jyothis Menon
 
DBMS Introduction 5 .pptx
DBMS Introduction 5 .pptxDBMS Introduction 5 .pptx
DBMS Introduction 5 .pptx
NitinKashyap89
 
Intro Duction of Database and its fundamentals .ppt
Intro Duction of Database and its fundamentals .pptIntro Duction of Database and its fundamentals .ppt
Intro Duction of Database and its fundamentals .ppt
Faisal Khan
 
UNIT-1.pptx discusses about introduction to dbms
UNIT-1.pptx discusses about introduction to dbmsUNIT-1.pptx discusses about introduction to dbms
UNIT-1.pptx discusses about introduction to dbms
DrRBullibabu
 
Introduction to Database Management System.pdf
Introduction to Database Management System.pdfIntroduction to Database Management System.pdf
Introduction to Database Management System.pdf
biswajit62002
 
dbms Unit 1.pdf arey bhai teri maa chodunga
dbms Unit 1.pdf arey bhai teri maa chodungadbms Unit 1.pdf arey bhai teri maa chodunga
dbms Unit 1.pdf arey bhai teri maa chodunga
VaradKadtan1
 
Database Management Systems (DBMS) are software systems used to store, retrie...
Database Management Systems (DBMS) are software systems used to store, retrie...Database Management Systems (DBMS) are software systems used to store, retrie...
Database Management Systems (DBMS) are software systems used to store, retrie...
mayurjagdale4
 
Dbms introduction about data models,data
Dbms introduction about data models,dataDbms introduction about data models,data
Dbms introduction about data models,data
kngaravind54
 
CS3270 - DATABASE SYSTEM - Lecture (1)
CS3270 - DATABASE SYSTEM -  Lecture (1)CS3270 - DATABASE SYSTEM -  Lecture (1)
CS3270 - DATABASE SYSTEM - Lecture (1)
Dilawar Khan
 
Database and Database Users_Dr.Balamurugan M.pdf
Database and Database Users_Dr.Balamurugan M.pdfDatabase and Database Users_Dr.Balamurugan M.pdf
Database and Database Users_Dr.Balamurugan M.pdf
Dr. Balamurugan M
 
CST204 DBMSMODULE1 PPT (1).pptx
CST204 DBMSMODULE1 PPT (1).pptxCST204 DBMSMODULE1 PPT (1).pptx
CST204 DBMSMODULE1 PPT (1).pptx
MEGHANA508383
 
6134DBMS Introduction 5 sem.pptx
6134DBMS Introduction 5 sem.pptx6134DBMS Introduction 5 sem.pptx
6134DBMS Introduction 5 sem.pptx
Danishompi
 

More from SanthanalakshmiSelva2 (9)

data communication and networkig fibre optical cable and non fibre cable
data communication and networkig fibre optical cable and  non fibre cabledata communication and networkig fibre optical cable and  non fibre cable
data communication and networkig fibre optical cable and non fibre cable
SanthanalakshmiSelva2
 
Engineering Computer network data communication and networkig fibre optical c...
Engineering Computer network data communication and networkig fibre optical c...Engineering Computer network data communication and networkig fibre optical c...
Engineering Computer network data communication and networkig fibre optical c...
SanthanalakshmiSelva2
 
computer network and data communication and networkig fibre optical cable and...
computer network and data communication and networkig fibre optical cable and...computer network and data communication and networkig fibre optical cable and...
computer network and data communication and networkig fibre optical cable and...
SanthanalakshmiSelva2
 
unit 3 Chapter 4 - privaacy.pptx : This is in ethichs. helping to understand ...
unit 3 Chapter 4 - privaacy.pptx : This is in ethichs. helping to understand ...unit 3 Chapter 4 - privaacy.pptx : This is in ethichs. helping to understand ...
unit 3 Chapter 4 - privaacy.pptx : This is in ethichs. helping to understand ...
SanthanalakshmiSelva2
 
Ethics in Business World imporant. Ethucs required in business which help to ...
Ethics in Business World imporant. Ethucs required in business which help to ...Ethics in Business World imporant. Ethucs required in business which help to ...
Ethics in Business World imporant. Ethucs required in business which help to ...
SanthanalakshmiSelva2
 
Unit-5 - 19 community mobilization in waste management
Unit-5 - 19 community mobilization in waste managementUnit-5 - 19 community mobilization in waste management
Unit-5 - 19 community mobilization in waste management
SanthanalakshmiSelva2
 
unit-1- 2 - challenges. waste management subject it is environmental sciences
unit-1- 2 - challenges. waste management subject it is environmental sciencesunit-1- 2 - challenges. waste management subject it is environmental sciences
unit-1- 2 - challenges. waste management subject it is environmental sciences
SanthanalakshmiSelva2
 
Software Engineering use many phases include integrity, consistency and accou...
Software Engineering use many phases include integrity, consistency and accou...Software Engineering use many phases include integrity, consistency and accou...
Software Engineering use many phases include integrity, consistency and accou...
SanthanalakshmiSelva2
 
SOFTWARE DEVELOPMENT AND PROCESS MODELS.pptx
SOFTWARE DEVELOPMENT AND PROCESS MODELS.pptxSOFTWARE DEVELOPMENT AND PROCESS MODELS.pptx
SOFTWARE DEVELOPMENT AND PROCESS MODELS.pptx
SanthanalakshmiSelva2
 
data communication and networkig fibre optical cable and non fibre cable
data communication and networkig fibre optical cable and  non fibre cabledata communication and networkig fibre optical cable and  non fibre cable
data communication and networkig fibre optical cable and non fibre cable
SanthanalakshmiSelva2
 
Engineering Computer network data communication and networkig fibre optical c...
Engineering Computer network data communication and networkig fibre optical c...Engineering Computer network data communication and networkig fibre optical c...
Engineering Computer network data communication and networkig fibre optical c...
SanthanalakshmiSelva2
 
computer network and data communication and networkig fibre optical cable and...
computer network and data communication and networkig fibre optical cable and...computer network and data communication and networkig fibre optical cable and...
computer network and data communication and networkig fibre optical cable and...
SanthanalakshmiSelva2
 
unit 3 Chapter 4 - privaacy.pptx : This is in ethichs. helping to understand ...
unit 3 Chapter 4 - privaacy.pptx : This is in ethichs. helping to understand ...unit 3 Chapter 4 - privaacy.pptx : This is in ethichs. helping to understand ...
unit 3 Chapter 4 - privaacy.pptx : This is in ethichs. helping to understand ...
SanthanalakshmiSelva2
 
Ethics in Business World imporant. Ethucs required in business which help to ...
Ethics in Business World imporant. Ethucs required in business which help to ...Ethics in Business World imporant. Ethucs required in business which help to ...
Ethics in Business World imporant. Ethucs required in business which help to ...
SanthanalakshmiSelva2
 
Unit-5 - 19 community mobilization in waste management
Unit-5 - 19 community mobilization in waste managementUnit-5 - 19 community mobilization in waste management
Unit-5 - 19 community mobilization in waste management
SanthanalakshmiSelva2
 
unit-1- 2 - challenges. waste management subject it is environmental sciences
unit-1- 2 - challenges. waste management subject it is environmental sciencesunit-1- 2 - challenges. waste management subject it is environmental sciences
unit-1- 2 - challenges. waste management subject it is environmental sciences
SanthanalakshmiSelva2
 
Software Engineering use many phases include integrity, consistency and accou...
Software Engineering use many phases include integrity, consistency and accou...Software Engineering use many phases include integrity, consistency and accou...
Software Engineering use many phases include integrity, consistency and accou...
SanthanalakshmiSelva2
 
SOFTWARE DEVELOPMENT AND PROCESS MODELS.pptx
SOFTWARE DEVELOPMENT AND PROCESS MODELS.pptxSOFTWARE DEVELOPMENT AND PROCESS MODELS.pptx
SOFTWARE DEVELOPMENT AND PROCESS MODELS.pptx
SanthanalakshmiSelva2
 
Ad

Recently uploaded (20)

LDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living WorkshopLDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDM Mia eStudios
 
Rock Art As a Source of Ancient Indian History
Rock Art As a Source of Ancient Indian HistoryRock Art As a Source of Ancient Indian History
Rock Art As a Source of Ancient Indian History
Virag Sontakke
 
CNS infections (encephalitis, meningitis & Brain abscess
CNS infections (encephalitis, meningitis & Brain abscessCNS infections (encephalitis, meningitis & Brain abscess
CNS infections (encephalitis, meningitis & Brain abscess
Mohamed Rizk Khodair
 
Overview Well-Being and Creative Careers
Overview Well-Being and Creative CareersOverview Well-Being and Creative Careers
Overview Well-Being and Creative Careers
University of Amsterdam
 
How to Manage Amounts in Local Currency in Odoo 18 Purchase
How to Manage Amounts in Local Currency in Odoo 18 PurchaseHow to Manage Amounts in Local Currency in Odoo 18 Purchase
How to Manage Amounts in Local Currency in Odoo 18 Purchase
Celine George
 
How to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo SlidesHow to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo Slides
Celine George
 
Transform tomorrow: Master benefits analysis with Gen AI today webinar, 30 A...
Transform tomorrow: Master benefits analysis with Gen AI today webinar,  30 A...Transform tomorrow: Master benefits analysis with Gen AI today webinar,  30 A...
Transform tomorrow: Master benefits analysis with Gen AI today webinar, 30 A...
Association for Project Management
 
The role of wall art in interior designing
The role of wall art in interior designingThe role of wall art in interior designing
The role of wall art in interior designing
meghaark2110
 
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast BrooklynBridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
i4jd41bk
 
E-Filing_of_Income_Tax.pptx and concept of form 26AS
E-Filing_of_Income_Tax.pptx and concept of form 26ASE-Filing_of_Income_Tax.pptx and concept of form 26AS
E-Filing_of_Income_Tax.pptx and concept of form 26AS
Abinash Palangdar
 
Cultivation Practice of Onion in Nepal.pptx
Cultivation Practice of Onion in Nepal.pptxCultivation Practice of Onion in Nepal.pptx
Cultivation Practice of Onion in Nepal.pptx
UmeshTimilsina1
 
How to Clean Your Contacts Using the Deduplication Menu in Odoo 18
How to Clean Your Contacts Using the Deduplication Menu in Odoo 18How to Clean Your Contacts Using the Deduplication Menu in Odoo 18
How to Clean Your Contacts Using the Deduplication Menu in Odoo 18
Celine George
 
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE  BY sweety Tamanna Mahapatra MSc PediatricAPGAR SCORE  BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
SweetytamannaMohapat
 
How to Share Accounts Between Companies in Odoo 18
How to Share Accounts Between Companies in Odoo 18How to Share Accounts Between Companies in Odoo 18
How to Share Accounts Between Companies in Odoo 18
Celine George
 
Ajanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of HistoryAjanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of History
Virag Sontakke
 
Botany Assignment Help Guide - Academic Excellence
Botany Assignment Help Guide - Academic ExcellenceBotany Assignment Help Guide - Academic Excellence
Botany Assignment Help Guide - Academic Excellence
online college homework help
 
Form View Attributes in Odoo 18 - Odoo Slides
Form View Attributes in Odoo 18 - Odoo SlidesForm View Attributes in Odoo 18 - Odoo Slides
Form View Attributes in Odoo 18 - Odoo Slides
Celine George
 
Myasthenia gravis (Neuromuscular disorder)
Myasthenia gravis (Neuromuscular disorder)Myasthenia gravis (Neuromuscular disorder)
Myasthenia gravis (Neuromuscular disorder)
Mohamed Rizk Khodair
 
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and GuestsLDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDM Mia eStudios
 
MCQ PHYSIOLOGY II (DR. NASIR MUSTAFA) MCQS)
MCQ PHYSIOLOGY II (DR. NASIR MUSTAFA) MCQS)MCQ PHYSIOLOGY II (DR. NASIR MUSTAFA) MCQS)
MCQ PHYSIOLOGY II (DR. NASIR MUSTAFA) MCQS)
Dr. Nasir Mustafa
 
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living WorkshopLDMMIA Reiki Yoga S5 Daily Living Workshop
LDMMIA Reiki Yoga S5 Daily Living Workshop
LDM Mia eStudios
 
Rock Art As a Source of Ancient Indian History
Rock Art As a Source of Ancient Indian HistoryRock Art As a Source of Ancient Indian History
Rock Art As a Source of Ancient Indian History
Virag Sontakke
 
CNS infections (encephalitis, meningitis & Brain abscess
CNS infections (encephalitis, meningitis & Brain abscessCNS infections (encephalitis, meningitis & Brain abscess
CNS infections (encephalitis, meningitis & Brain abscess
Mohamed Rizk Khodair
 
Overview Well-Being and Creative Careers
Overview Well-Being and Creative CareersOverview Well-Being and Creative Careers
Overview Well-Being and Creative Careers
University of Amsterdam
 
How to Manage Amounts in Local Currency in Odoo 18 Purchase
How to Manage Amounts in Local Currency in Odoo 18 PurchaseHow to Manage Amounts in Local Currency in Odoo 18 Purchase
How to Manage Amounts in Local Currency in Odoo 18 Purchase
Celine George
 
How to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo SlidesHow to Create Kanban View in Odoo 18 - Odoo Slides
How to Create Kanban View in Odoo 18 - Odoo Slides
Celine George
 
Transform tomorrow: Master benefits analysis with Gen AI today webinar, 30 A...
Transform tomorrow: Master benefits analysis with Gen AI today webinar,  30 A...Transform tomorrow: Master benefits analysis with Gen AI today webinar,  30 A...
Transform tomorrow: Master benefits analysis with Gen AI today webinar, 30 A...
Association for Project Management
 
The role of wall art in interior designing
The role of wall art in interior designingThe role of wall art in interior designing
The role of wall art in interior designing
meghaark2110
 
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast BrooklynBridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
Bridging the Transit Gap: Equity Drive Feeder Bus Design for Southeast Brooklyn
i4jd41bk
 
E-Filing_of_Income_Tax.pptx and concept of form 26AS
E-Filing_of_Income_Tax.pptx and concept of form 26ASE-Filing_of_Income_Tax.pptx and concept of form 26AS
E-Filing_of_Income_Tax.pptx and concept of form 26AS
Abinash Palangdar
 
Cultivation Practice of Onion in Nepal.pptx
Cultivation Practice of Onion in Nepal.pptxCultivation Practice of Onion in Nepal.pptx
Cultivation Practice of Onion in Nepal.pptx
UmeshTimilsina1
 
How to Clean Your Contacts Using the Deduplication Menu in Odoo 18
How to Clean Your Contacts Using the Deduplication Menu in Odoo 18How to Clean Your Contacts Using the Deduplication Menu in Odoo 18
How to Clean Your Contacts Using the Deduplication Menu in Odoo 18
Celine George
 
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE  BY sweety Tamanna Mahapatra MSc PediatricAPGAR SCORE  BY sweety Tamanna Mahapatra MSc Pediatric
APGAR SCORE BY sweety Tamanna Mahapatra MSc Pediatric
SweetytamannaMohapat
 
How to Share Accounts Between Companies in Odoo 18
How to Share Accounts Between Companies in Odoo 18How to Share Accounts Between Companies in Odoo 18
How to Share Accounts Between Companies in Odoo 18
Celine George
 
Ajanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of HistoryAjanta Paintings: Study as a Source of History
Ajanta Paintings: Study as a Source of History
Virag Sontakke
 
Botany Assignment Help Guide - Academic Excellence
Botany Assignment Help Guide - Academic ExcellenceBotany Assignment Help Guide - Academic Excellence
Botany Assignment Help Guide - Academic Excellence
online college homework help
 
Form View Attributes in Odoo 18 - Odoo Slides
Form View Attributes in Odoo 18 - Odoo SlidesForm View Attributes in Odoo 18 - Odoo Slides
Form View Attributes in Odoo 18 - Odoo Slides
Celine George
 
Myasthenia gravis (Neuromuscular disorder)
Myasthenia gravis (Neuromuscular disorder)Myasthenia gravis (Neuromuscular disorder)
Myasthenia gravis (Neuromuscular disorder)
Mohamed Rizk Khodair
 
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and GuestsLDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDMMIA Reiki News Ed3 Vol1 For Team and Guests
LDM Mia eStudios
 
MCQ PHYSIOLOGY II (DR. NASIR MUSTAFA) MCQS)
MCQ PHYSIOLOGY II (DR. NASIR MUSTAFA) MCQS)MCQ PHYSIOLOGY II (DR. NASIR MUSTAFA) MCQS)
MCQ PHYSIOLOGY II (DR. NASIR MUSTAFA) MCQS)
Dr. Nasir Mustafa
 
Ad

Database Management system : UNit I Helping to understand basics of dbms operation

  • 2. UNIT- 1 • INTRODUCTION, DATABASE DESIGN AND RELATIONAL MODEL • Introduction • Database System Applications, • Purpose of Database Systems, • View of Data • Database Languages, • Database and Application Architecture • Database Users and Administrators • Database Schema • Keys • Schema Diagrams • ER diagrams • Mapping Cardinalities • Alternatives Notations for Modeling Data – Data Flow Diagram.
  • 3. UNIT I INTRODUCTION Database & Database Users. Characteristics of the Database Approach advantages of usine DBMS. Data Models, Schemas & Instances. DBMS Architecture & Data Independence. System Architecture for DBMS and Data Dictionary, Database Users Data Base languages & Interfaces. Data Modeling using the Entity-Relationship Model -Entity types, Entity Sets, Attributes and Keys, Relationship, Relationship Types, Week Entity Types, Structural Constraints, Enhanced ER Model- Specialization Generalization, Constraints on Specialization Generalization- Data flow diagram.
  • 4. DATA • Data refers to raw facts • Data may be numerical data which may be integers or floating point numbers, and non-numerical data such as characters, date and etc.,
  • 6. Prepared by : M.Nirmala / AP / MCA Example 98 89 87 92 phy chem maths biology The above numbers may be anything: It may be distance in kms or amount in rupees or no of days or marks in each subject etc.,
  • 7. Prepared by : M.Nirmala / AP / MCA Information • Processed Data is referred as Information. It delivers some meanings to the user 98 89 87 92 phy chem maths biology
  • 8. Prepared by : M.Nirmala / AP / MCA Database • Database is an organized, collection of related information • Eg. telephone directory, dictionary, and college database. 98 89 87 92 phy chem maths biology 76 87 79 88 phy chem maths biology 91 67 87 77 phy chem maths biology 86 80 79 88 phy chem maths biology
  • 9. Prepared by : M.Nirmala / AP / MCA Database Management Systems • A database management system or DBMS is software designed to store, modify, maintaining and utilizing large collection of data. • A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data • Database Applications: • Banking: all transactions • Airlines: reservations, schedules • Universities: registration, grades • Sales: customers, products, purchases • Online retailers: order tracking, customized recommendations • Manufacturing: production, inventory, orders, supply chain • Human resources: employee records, salaries, tax deductions • Databases touch all aspects of our lives
  • 10. Why do u need a DBMS? • A Database Management System Is an Extension of Human Logic • Computers Can Quickly Answer Lots of Questions • Some Questions Can Be Really Complicated • We Are Easily Overwhelmed With Information • Automation Is the Key to Efficiency • A DBMS Is Better Than Manual Processes in so Many Ways
  • 12. ENTERPRISE INFORMATION • Sales: For customer, product, and purchase information. • Accounting: For payments, receipts, account balances, assets and other accounting information. • Human resources: For information about employees, salaries, payroll taxes, and benefits, and for generation of paychecks. • Manufacturing: For management of the supply chain and for tracking production of items in factories, inventories of items inwarehouses and stores, and orders for items. • Online retailers: For sales data noted above plus online order tracking, generation of recommendation lists, and maintenance of online product evaluations
  • 13. Banking and Finance • ◦ Banking: For customer information, accounts, loans, and banking transactions. • ◦ Credit card transactions: For purchases on credit cards and generation of • monthly statements. • ◦ Finance: For storing information about holdings, sales, and purchases of • financial instruments such as stocks and bonds; also for storing real-time • market data to enable online trading by customers and automated trading • by the firm.
  • 14. Universities • For student information, course registrations, and grades (in addition to standard enterprise information such as human resources and accounting).
  • 15. • Airlines: For reservations and schedule information. Airlines were among the first to use databases in a geographically distributed manner. Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information about the communication networks.
  • 16. Prepared by : M.Nirmala / AP / MCA Database Systems Vs File Systems • FILE BASED DATA MANAGEMENT • Before advent of database , file system—manual file system was used • 1) Collection of file, folders each tagged and kept in a filing cabinet • 2) When data was relatively small—few reporting requirements —manual system was ok • 3) Requirements grew—manual system complex (files, folders collection growing—time consuming & cumbersome
  • 17. Prepared by : M.Nirmala / AP / MCA • The conventional file system was much faster and efficient than manual counterparts but not flexibly than modern DBMS. • DISADVANTAGE OF FILE BASE SYSTEM • Need for programming and skilled personnel • Need different application programs • to extract records • to add records etc • For Each query an application program has to be written • For different reports to be generate, different programs to be written • Lack of data security • Password security alone is maintained and it is not sufficient when different users have access permissions to different subsets of data • To safeguard confidentiality was difficult • To lock part of files or part of the system not possible
  • 18. Prepared by : M.Nirmala / AP / MCA • Information compartmentalization • Promote data ownership and thus promoting the storage of same data in different location • Professional says as information islands or information compartmentalization, data at different locations are to be consistently updated, islands of information often contains different versions of data • Structural Dependency • Change in file structure/addition/deletion of a field requires modification of all programs using that files • It exhibits structural dependency • Changing from integer to decimal will make change in program
  • 19. Prepared by : M.Nirmala / AP / MCA • Data Dependency • File data characteristic change • Normal representation is ASCII • Specify opening of a specific file type either random or sequential • How to store and how to retrieve is to be clearly defined • Data Redundancy • Employee details stored in personnel dept ,account department, production planning department etc • Data Inconsistency • Data stored at different location / departments and modification / deletion / addition can create data inconsistencies • Eg:-employee leaves organization should be properly updated in all departments
  • 20. Prepared by : M.Nirmala / AP / MCA Concurrent access by multiple users • Concurrent accessed needed for performance • Uncontrolled concurrent accesses can lead to inconsistencies • Example: Two people reading a balance and updating it at the same time Security problems • Hard to provide user access to some, but not all, data
  • 21. Prepared by : M.Nirmala / AP / MCA ADVANTAGE OF DBMS • Data independence • Application programs should be independent as possible from details of data representation & storage. • Efficient data access • Data can be effectively retrieved when data is stored on external storage device • Data integrity & Security • DBMS can enforce integrity constraints on data • DBMS can check, before inserting salary for an employee the dept budget not exceeded. • Security restrictions can be applied • Access controls is enforced • What data is visible to different classes of users • Data administration • When several users share data centralizing the administration of data can offer significant improvements
  • 22. Prepared by : M.Nirmala / AP / MCA • Concurrent access and crash recovery • Concurrent access to data, in such a manner that users can think of the data being accessed by only one user at a time • DBMS protects users from effects of system failures • Reduced application development time • It supports many important functions that are common to much application accessing data stored in the DBMS • Describing and storing data in a DBMS • A data model is a collection of high level data description that hide many low level storage details. • Conflicting requirements can be balanced • Knowing the overall requirements helps database designers in creating a database design that is best for the organization
  • 23. View of Data / Database Architecture
  • 24. Prepared by : M.Nirmala / AP / MCA DBMS Architecture • The DBMS architecture describes how data in the database is viewed by the users. • Database abstraction also hides the implementation details of the data from the user • The major purpose of a database system is to provide users with an abstract view of the data i.e. • The system hides certain details of how the data are stored and maintained.
  • 25. Example to Understand Abstraction • User Level • A layman is walking on the street in front of your college. • When he looks at your college what all does he knows about the college! • He knows the name of the college then he might know that it is a Science or Arts college, it is run by so a so society etc.
  • 26. Logical Level • You are a student of the college • You know more about the person on the street • You know the number of courses held in the college, most of the teachers in the college, the course fees of the course which you are learning, the different buildings of the college etc.
  • 27. Physical Level • Now, consider the administrative staff of the college. • These people know even more information about the college than a student of the college • like how courses are conducted, • how teachers are appointed, • what is the fee structure of the courses • what new courses are to be started etc. • Now, these 3 people are looking at the same thing (i.e. the college) from 3 different levels of abstraction.
  • 28. Prepared by : M.Nirmala / AP / MCA 3 Levels of Abstraction • The architecture is defined at 3 levels. This architecture is proposed by ANSI/SPARC (American National Standards Institute/Standards Planning and Requirements Committee) and hence, is also known as ANSI/SPARC architecture • Physical level / Internal Level • Logical level / Conceptual • View or external level
  • 29. Prepared by : M.Nirmala / AP / MCA DBMS Architecture Diagram
  • 30. Physical level • Lowest level of abstraction. • It describes how the data are actually stored. • It describes data structures in detail. • At this level achieve various aspects are considered for optimal runtime performance and storage space utilization. • These aspects include storage space allocation techniques for data and indexes, access paths such as indexes, data compression and encryption techniques, and record placement
  • 31. Logical level • Next level of abstraction. • It describes what data are stored in database and what relationship exists between those data. • Defines all database entities, their attributes, and their relationships • Security and integrity information
  • 32. Prepared by : M.Nirmala / AP / MCA View or external level • It describes only some part of entire database. • It access only database and not concerned about physical storage location and data structures.
  • 33. Prepared by : M.Nirmala / AP / MCA • Student information it contains • studentid • stuname • stuage • stusex • stugrade • Physical level: • Student information are stored as block of consecutive storage locations. 79 memory locations are needed. • Logical level: • Char stuid [25], stuname [50], stusex, and stugrade; • int age; • External level: • It uses set of application program to retrive data from database • Select * from student where stuid=’1001’;
  • 34. Prepared by : M.Nirmala / AP / MCA Instances and Schemas • Similar to types and variables in programming languages • Schema –Overall design of a database is called as schema • Example: The database consists of information about a set of customers and accounts and the relationship between them) • Analogous to type information of a variable in a program • Physical schema: database design at the physical level • Logical schema: database design at the logical level • Instance • Collection of information stored in the database at a particular moment is called an instance of a database. • Analogous to the value of a variable • Physical Data Independence – the ability to modify the physical schema without changing the logical schema • Applications depend on the logical schema • In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others
  • 35. Prepared by : M.Nirmala / AP / MCA • Eg of instance & schema: • In a program, various variable declarations are referred as schema. • Every variable has a particular value at a given instance of database schema • Database schemas • 3 types of schemas: • 1. Conceptual schema. • 2. Physical schema/ internal. • 3. Logical schema/external.
  • 36. Prepared by : M.Nirmala / AP / MCA Database Schemas External schema 1 External schema 2 External schema 3 Conceptual schema Physical schema disk
  • 37. Prepared by : M.Nirmala / AP / MCA Instances and Schemas • DB Schema Diagram for a Company: Employee: Eno Ename Salary Address
  • 38. Prepared by : M.Nirmala / AP / MCA Instances and Schemas • DB Schema Diagram for a Company: Department: Dno Dname Dlocation Project: Pno Pname Hours
  • 39. Prepared by : M.Nirmala / AP / MCA Instances and Schemas • Instance Example: Eno Ename Salary Address 1 2 3 A B C 10,000 20,000 30,000 First street Second street Third street
  • 40. Prepared by : M.Nirmala / AP / MCA CONCEPTUAL SCHEMA • It describes all relations that are stored in the database • Students(sid :strings,name :strings,age:integer) • Faculty(fid :string,fname:string,sal:real) • Courses(cid:string,cname:string • Rooms(rno:integer,address:string,capacity:integer) • Enrolled(sid:string,cid:string,grade:string) • Teachers(fid:string,cid:string) • Meets-in(cid:string,rno:integer,time:string)
  • 41. Prepared by : M.Nirmala / AP / MCA Physical Schema • It specifies additional storage details. • The relations described in conceptual schema are actually stored in a secondary storage devices such as disks, tapes etc. • What file org is used to store the relations
  • 42. Prepared by : M.Nirmala / AP / MCA External Schema • It allows data access to be customized at the level of individual users. • Only one conceptual schema and one physical schema but may have several external schemas each tailored to particular group of users. • A view is conceptually a relation but the records in a view are not stored in DBMS. • Suited for end user requirement.
  • 43. Prepared by : M.Nirmala / AP / MCA Mapping • Whenever a user specifies a request to generate a new external view, the DBMS must transform the request specified at external level into a request at conceptual level, and then into a request at physical level. • This process of transforming the requests and results between various levels of DBMS architecture is known as mapping.
  • 44. Prepared by : M.Nirmala / AP / MCA Data Independence • The main advantage of three-schema architecture is that it provides data independence. • Data independence is the ability to change the schema at one level of the database system without having to change the schema at the other levels. • Data independence is of two types, namely, logical data independence and physical data independence.
  • 45. Prepared by : M.Nirmala / AP / MCA Three Levels of online Book Database
  • 46. Prepared by : M.Nirmala / AP / MCA Logical Data Independence • It is the ability to change the conceptual schema without affecting the external schemas or application programs. • The conceptual schema may be changed due to change in constraints or addition of new data item or removal of existing data item, etc., from the database. • The separation of the external level from the conceptual level enables the users to make changes at the conceptual level without affecting the external level or the application programs. • For example, if a new data item, say Edition is added to the BOOK file, the two views (view 1 and view 2 shown in the above Figure) are not affected.
  • 47. Prepared by : M.Nirmala / AP / MCA Physical Data Independence • Physical data independence: It is the ability to change the internal schema without affecting the conceptual or external schema. • An internal schema may be changed due to several reasons such as for creating additional access structure, changing the storage structure, etc. • The separation of internal schema from the conceptual schema facilitates physical data independence.
  • 48. Prepared by : M.Nirmala / AP / MCA • Logical data independence is more difficult to achieve than the physical data independence because the application programs are always dependent on the logical structure of the database. • Therefore, the change in the logical structure of the database may require change in the application programs.
  • 49. Prepared by : M.Nirmala / AP / MCA Data Models • A data model is an organizing principle that specifies particular mechanisms for data storage and retrieval • The primary difference between the various database models lie in the method of expressing relationships and constraints among data elements
  • 50. RELATIONAL MODEL • Relational Databases store data in relations i.e. tables. • Each relation must have a name. • Relation can be otherwise called as object or Table
  • 51. Tuple/Row/Record • A single entry in a table is called a Tuple or Record or Row. • A tuple in a table represents a set of related data. For example, the above Employee table has 4 tuples/records/rows. • A record is a set of related data in a table.
  • 52. ATTRIBUTES • A table consists of several records(row), each record can be broken down into several smaller parts of data known as Attributes • An attribute is a named column of a relation. It stores a specific information about an object / Table e.g. salary. • Name the Attributes • The above Employee table consist of five attributes, ID, ENAME, SALARY, BONUS, DEPT Attribute Domain : Attribute is defined to hold only a certain type of values, which is known as Attribute Domain. The attribute Name will hold the name of employee for every tuple. If we save employee's address there, it will be violation of the Relational database model.
  • 54. CARDINALITY OF THE RELATION
  • 55. DEGREE OF THE RELATION
  • 56. NULL ATTRIBUTE • The attribute value that is currently unknown is called Null Attribute
  • 57. Attribute Domain • Attribute is defined to hold only a certain type of values, which is known as Attribute Domain.
  • 59. Data Integrity and Constraints / Keys • Integrity constraints are a set of rules. Integrity constraints ensure that the data insertion, updating, and other processes have to be performed in such a way that data integrity is not affected. • Thus, integrity constraint is used to guard against accidental damage to the database. • Data integrity refers to maintaining and assuring the accuracy and consistency of data over its entire life-cycle. • Database Systems ensure data integrity through constraints which are used to restrict data that can be entered or modified in the database. • Database Systems offer three types of integrity constraints:
  • 61. Integrity Types Definition Enforced Through Entity Integrity Each table must have a column or a set of columns through which we can uniquely identify a row. These column(s) cannot have empty (null) values. PRIMARY KEY Domain Integrity All attributes in a table must have a defined domain i.e. a finite set of values which have to be used. When we assign a data type to a column we limit the values that it can contain. In addition we can also have value restriction as per business rules e.g. Gender must be M or F. DATA TYPES, CHECK CONSTRAINT Referentia l Integrity Every value of a column in a table must exist as a value of another column in a different (or the same) table. FOREIGN KEY
  • 62. Keys • Super Key • Candidate Key • Primary Key
  • 63. Super Key • A super key is a set of one or more attributes (columns), which can uniquely identify a row in a table. • Often DBMS beginners get confused between super key and candidate key • Let’s take an example to understand this: • Table: Employee Emp_SSN Emp_Number Emp_Name 123456789 226227 Steve 999999321 227 Ajeet 999999322 228 Chaitanya 777778888 229 Robert
  • 64. • Super keys: All of the following sets of super key are able to uniquely identify a row of the employee table. • {Emp_SSN} • {Emp_Number} • {Emp_SSN, Emp_Number} • {Emp_SSN, Emp_Name} • {Emp_SSN, Emp_Number, Emp_Name} • {Emp_Number, Emp_Name}
  • 65. • 1. {Emp_Id} – No redundant attribute • 2. {Emp_Number} – No redundant attributes • 3. {Emp_Id, Emp_Number} – No Redundant attribute. Either of those attributes can be a minimal super key as both of these columns have unique values. • 4. {Emp_Id, Emp_Name} – Redundant attribute Emp_Name • 5. {Emp_Id, Emp_Number, Emp_Name} – Redundant attributes. Emp_Id or Emp_Number alone are sufficient enough to uniquely identify a row of Employee table • 6. {Emp_Number, Emp_Name} – Redundant attribute Emp_Name.
  • 66. • A super key with no redundant attribute is known as candidate key. • Candidate keys are selected from the set of super keys, the only thing we take care while selecting candidate key is that the candidate key should not have any redundant attributes. • That’s the reason they are also termed as minimal super key • A Candidate Key is a minimal set of columns/attributes that can be used to uniquely identify a single tuple in a relation.
  • 67. Lets select the candidate keys from the above set of super keys. • The candidate keys we have selected are: {Emp_Id} {Emp_Number}
  • 68. Primary key Mandatory Desired must uniquely identify a tuple should not change with time must not allow NULL values should have short size e.g. numeric data types Primary key is the candidate key that is selected to uniquely identify a tuple in a relation. A primary key is a minimal set of attributes (columns) in a table that uniquely identifies tuples (rows) in that table. A primary key is selected from the set of candidate keys. That means we can either have Emp_Id or Emp_Number as primary key. The decision is made by DBA (Database administrator)
  • 69. Primary Key Example in DBMS • Lets take an example to understand the concept of primary key. In the following table, there are three attributes: Stu_ID, Stu_Name & Stu_Age. Out of these three attributes, one attribute or a set of more than one attributes can be a primary key. • Attribute Stu_Name alone cannot be a primary key as more than one students can have same name. • Attribute Stu_Age alone cannot be a primary key as more than one students can have same age. • Attribute Stu_Id alone is a primary key as each student has a unique id that can identify the student record in the table.
  • 70. • We denote usually denote it by underlining the attribute name (column name). • The value of primary key should be unique for each row of the table. The column(s) that makes the key cannot contain duplicate values. • The attribute(s) that is marked as primary key is not allowed to have null values. • Primary keys are not necessarily to be a single attribute (column). It can be a set of more than one attributes (columns). For example {Stu_Id, Stu_Name} collectively can identify the tuple in the above table, but we do not choose it as primary key because Stu_Id alone is enough to uniquely identifies rows in a table and we always go for minimal set. Having that said, we should choose more than one columns as primary key only when there is no single column that can uniquely identify the tuple in table. Key Points Primary key
  • 71. Composite Primary Key – More than one Key • Consider this table ORDER, this table keeps the daily record of the purchases made by the customer. • This table has three attributes: Customer_ID, Product_ID & Order_Quantity. • Customer_ID alone cannot be a primary key as a single customer can place more than one order thus more than one rows of same Customer_ID value. As we see in the following example that customer id 1011 has placed two orders with product if 9023 and 9111. • Product_ID alone cannot be a primary key as more than one customers can place a order for the same product thus more than one rows with same product id. In the following table, customer id 1011 & 1122 placed an order for the same product (product id 9023).
  • 72. • Order_Quantity alone cannot be a primary key as more more than one customers can place the order for the same quantity. • Since none of the attributes alone were able to become a primary key, lets try to make a set of attributes that plays the role of it. • {Customer_ID, Product_ID} together can identify the rows uniquely in the table so this set is the primary key for this table.
  • 74. Order Customer_ID Product_ID Order_Quantity 1011 9023 10 1122 9023 15 1099 9031 20 1177 9031 18 1011 9111 50
  • 75. Consider the Following Table StudentID StudentName Department DOB 1 Raja Computer Science & Engg 10-12-1990 2 Geetha Computer Science & Engg 12-12-1990 3 Dev Information Technology 5-7-1990 4 Preetha Information Technology 12-5-1998 5 Hari Computer Science & Engg 10-3-1998 6 Vani Computer Science & Engg 10-12-1990 7 Krishna Electrical&Electronics Engg 12-12-1990 8 Ajith Electrical&Electronics Engg 5-7-1990 9 Ram Information Technology 12-5-1998 10 Kumar Computer Science & Engg 10-3-1998 11 Sandhya Automobile Engg 10-12-1990 12 Uma Automobile Engg 12-12-1990
  • 76. • Deptname Gets Repeated more than once as more than one student belong to the same dept. • Also the size of the Department name is also big • Instead of the Department name with an Deptid can be represented as a separate table and the Deptid which is a primary key in the Dept table will be a Foreign key in the Student Table
  • 77. Dept Table DeptID Department 1 Computer Science & Engg 2 Information Technology 3 Electrical & Electronics Engg 4 Automobile Engg 5 Mechanical Engg
  • 78. StudentID StudentName DeptID DOB 1 Raja 1 10-12-1990 2 Geetha 1 12-12-1990 3 Dev 2 5-7-1990 4 Preetha 2 12-5-1998 5 Hari 1 10-3-1998 6 Vani 1 10-12-1990 7 Krishna 3 12-12-1990 8 Ajith 3 5-7-1990 9 Ram 2 12-5-1998 10 Kumar 1 10-3-1998 11 Sandhya 4 10-12-1990 12 Uma 4 12-12-1990 Dept ID Department 1 Computer Science & Engg 2 Information Technology 3 Electrical & Electronics Engg 4 Automobile Engg 5 Mechanical Engg DeptID is Primary Key in the Dept Table is acting as a Foreign key in the Student Table Student Department
  • 79. • Foreign keys are the columns of a table that points to the primary key of another table. They act as a cross-reference between tables. • Note: Practically, the foreign key has nothing to do with the primary key tag of another table, if it points to a unique column (not necessarily a primary key) of another table then too, it would be a foreign key. So, a correct definition of foreign key would be: Foreign keys are the columns of a table that points to the candidate key of another table.
  • 80. Properties of Foreign Key • It states that if a foreign key exists in a relation then either the foreign key value must match a primary key value of some tuple in its home relation or the foreign key value must be null. • The rules are: • You can't delete a record from a primary table if matching records exist in a related table. • You can't change a primary key value in the primary table if that record has related records. • You can't enter a value in the foreign key field of the related table that doesn't exist in the primary key of the primary table. • However, you can enter a Null value in the foreign key, specifying that the records are unrelated.
  • 81. ENTITY RELATIONSHIP DIAGRAM • Jack is part of database team and he needs to present the database design to business users. The business users are non-technical and it's difficult for them to read a verbose design document. • Jack needs to use an Entity Relation (ER) Model. • ER model is a graphical representation of entities and their relationships which helps in understanding data independent of the actual database implementation.
  • 82. Prepared by : M.Nirmala / AP / MCA • The ER model consists of basic objects called entities and a relationship among these objects. • Proposed by Peter Pin-Shan Chen (1976). • Three basic notations in ER model are • Entity sets • Relationship sets • Attributes
  • 83. Prepared by : M.Nirmala / AP / MCA Entity • Objects in the miniworld about which information has to be stored. E.g. persons, books. • It must be possible to distinguish among entities • An entity may be • Concrete -- person /Book. • Abstract -- loan/holiday/Concept
  • 84. Prepared by : M.Nirmala / AP / MCA Entity set • An entity set is set of entities of the same type that share the same properties/ Attributes. • Eg: set of all persons who are customers at a given bank can be defined as entity set customer. • Eg: set of all loans awarded by a bank is referred as entity set loan. • Note: entity sets do not need to be disjoint. • It is possible to define the entity set of all employees of a bank(employee) and the entity set of all customers of the bank (customer). • A person entity may be an employee entity, a customer entity, both or neither
  • 85. Prepared by : M.Nirmala / AP / MCA Entity set loan and Customer LOAN NO AMOUNT L17 1000 L23 3000 L15 1000 L17 1500 NAME SOCIAL SECURITYNO STREET Jones 321-12-3123 Main Smith 019-28-5611 North Hayes 912-51-1239 north Jack 335-57-1298 dupont
  • 86. Prepared by : M.Nirmala / AP / MCA Attributes • An entity is represented by a set of attributes. • Attributes are descriptive properties possessed by each member of entity set • Possible attributes of the customer entity set are • customer-name • social security no • street • Attributes of loan entity • loan no • amount
  • 87. Prepared by : M.Nirmala / AP / MCA Domain • For each attributes there is a set of permitted values called domain or value set • Domain of attribute customer name • set of all text strings of certain length • Domain of attribute loanno • all positive integer in the form “L-n” eg L1, L2 • Attributes of an entity set is a function that maps from the entity set in to a domain • Mapping between entity and domain • Entity is described by (attributes,datavalue)pairs {(name,jones), (social security,312-12-5123)…. ((street,main)}
  • 88. Prepared by : M.Nirmala / AP / MCA Attributes Types • Simple and composite attributes • Single valued and multi valued attributes • Null attributes • Derived attributes Simple attributes • The attributes simple in its structure, which cannot be further sub divided are referred as simple attributes. Eg: sex,etc.
  • 89. Prepared by : M.Nirmala / AP / MCA Hierarchy of Composite Attributes Composite attributes: The attribute may be composed of several components. It can be further sub divided Address(Apt#, House#, Street, City, State, ZipCode, Country) Name(FirstName, MiddleName, LastName).
  • 90. Prepared by : M.Nirmala / AP / MCA Single values attributes • Attributes have single value for particular entity eg: loan number attribute has a single loan number Multivalued attributes • Attributes has a set of values for a specific entity consider an employee entity set dependent-name an employee can have zero, one or more dependents • Eg : In bio data-references-names • Educational qualifications sslc,hsc,ug,pg
  • 91. Prepared by : M.Nirmala / AP / MCA Null attributes • A null value is used when an entity does not have a value for an attributes • Eg: One may have no middle name. Derived attribute • The value for this type of attribute can be derived from the values of other related attributes/entities • Eg 1: entity : customer • Attributes : loans-held which represents how many no of loans a customer had from the bank. To derive the value this attribute, we can count the number of loans entities associated with that customer • Eg 2 entity : employee • Attributes: start-date; employment length • Employment length – total length of time the employee has worked. Value of employment –length can be derived from the value of start- date & current-date. Start-date is referred as base attributes
  • 92. Prepared by : M.Nirmala / AP / MCA Relationship sets • Relationship is an association among several entities. • Relationship set is the relationship of the same type. • E.g. “X teaches course Y”.
  • 93. Let us understand some key terms used in ER Modelling Term Definition Examples Entity Real world objects which have an independent existence and about which we intend to collect data. Employee, Computer Attribute A property that describes an entity. Name, Salary
  • 94. Different Forms ID ENAME SALARY DEPT EMPLOYEE Customer Account cusname SSN Cus-city Cus-street A/cno balance Deposit or
  • 95. More than one relationship between entities • There can be more than one relationship between entities, e.g. an Employee works in a Department while the head of department (also an employee) manages a Department.
  • 96. Prepared by : M.Nirmala / AP / MCA Recursive Relationship • If the same entity participates more than once in a relationship it is known as a recursive relationship. In the below example an employee can be a supervisor and be supervised, so there is a recursive relationship. • Eg: A women can take up the role of a wife, mother, daughter- in-law etc.
  • 97. Prepared by : M.Nirmala / AP / MCA specifying that each employee has zero or one bosses (assuming the CEO doesn‘t have any boss), and that each employee may supervise zero or more other employees would look like
  • 98. Prepared by : M.Nirmala / AP / MCA Degree of Relationships • The number of roles in the relationship • Binary – links two entity sets; set of ordered pairs (most common) • Ternary – links three entity sets; ordered triples (rare). If a relationship exists among the three entities, all three must be present • Ternary relationship set involving the entity sets CUSTOMER, LOAN, BRANCH.. customer jones, loan – L-17 and the perryridge branch. • N-ary – links n entity sets; ordered n-tuples (very rare). If a relationship exists among the entities, then all must be present. Cannot represesnt subsets.
  • 99. Prepared by : M.Nirmala / AP / MCA E-R Diagram with a Ternary Relationship
  • 100. Prepared by : M.Nirmala / AP / MCA One - one • An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A. • Borrower in a particular bank • A loan can belong to only customer. • 1 Department one HOD
  • 101. Prepared by : M.Nirmala / AP / MCA 1-1
  • 102. Prepared by : M.Nirmala / AP / MCA Cardinality Constraints • We express cardinality constraints by drawing either a directed line (), signifying “one,” or an undirected line (—), signifying “many,” between the relationship set and the entity set. • E.g.: One-to-one relationship: • A customer is associated with at most one loan via the relationship borrower • A loan is associated with at most one customer via borrower
  • 103. Prepared by : M.Nirmala / AP / MCA 1-Many • An entity is A is associated with any number of entities in B. an entity in B, however can be associated with almost one entity in A. a1 a2 a3 b1 b2 b3 b4 b5 Depositor 1.customer may have several accounts.
  • 104. Prepared by : M.Nirmala / AP / MCA One-To-Many Relationship • In the one-to-many relationship a loan is associated with at most one customer via borrower, a customer is associated with several (including 0) loans via borrower
  • 105. Prepared by : M.Nirmala / AP / MCA Many - 1 • An entity in A is associated with atmost one entity in B. AN entity in B, however can be associated with any number of entities in A. a1 a2 a4 a3 b1 b2 a5 b3
  • 106. Prepared by : M.Nirmala / AP / MCA Many-To-One Relationships • In a many-to-one relationship a loan is associated with several (including 0) customers via borrower, a customer is associated with at most one loan via borrower
  • 107. Prepared by : M.Nirmala / AP / MCA Many - Many • An entity in A is associated with any number of entities in B, and an entity in B is associated with any no of entities in A. a1 a2 a4 a3 b1 b2 b4 b3 Loan can belong to several customers can loans can be taken jointly by several business partners
  • 108. Prepared by : M.Nirmala / AP / MCA Many-To-Many Relationship • A customer is associated with several (possibly 0) loans via borrower • A loan is associated with several (possibly 0) customers via borrower
  • 109. Prepared by : M.Nirmala / AP / MCA
  • 110. Prepared by : M.Nirmala / AP / MCA
  • 111. Prepared by : M.Nirmala / AP / MCA
  • 112. Prepared by : M.Nirmala / AP / MCA
  • 113. Prepared by : M.Nirmala / AP / MCA
  • 114. Prepared by : M.Nirmala / AP / MCA E-R Diagram With Composite, Multivalued, and Derived Attributes Composite Attribute – name, address , street Multivalued Attribute = phone number depicted by double ellipse Derived attribute = age depicted by dashed ellipse
  • 115. How to Create an Entity Relationship Diagram (ERD)
  • 116. Scenario • In a university, a Student enrolls in Courses. A student must be assigned to at least one or more Courses. Each course is taught by a single Professor. To maintain instruction quality, a Professor can deliver only one course
  • 117. Step 1) Entity Identification • Student • Course • Professor
  • 118. Step 2) Relationship Identification • Identify the association between the entities • The student is assigned a course • Professor delivers a course
  • 119. Step 3) Cardinality Identification • A student can be assigned multiple courses • A Professor can deliver only one course
  • 120. Step 4) Identify Attributes Entity Primary Key Attribute Student Student_ID StudentName Professor Employee_ID ProfessorName Course Course_ID CourseName
  • 122. • ER DIAGRAM •“Patients are treated in a single ward by the doctors assigned to them. •Usually each patient will be assigned a single doctor, but in rare cases they will have two. Healthcare assistants also attend to the patients, a number of these are associated with each ward.
  • 123. Step 1) Entity Identification • Patients • Doctor • healthcareAssistants • ward
  • 124. Step 2) Relationship Identification • Identify the association between the entities • Patients are treated in ward • Patient are assigned to doctor • Healthcare assistants attends to patients
  • 125. Step 3) Cardinality Identification • Patients are treated in single ward (M:1) • Patient are assigned to doctor(1:M) • Healthcare assistants attends to patients (M:M)
  • 128. Why Data Flow Diagram • A picture is worth a thousand words. • A Data Flow Diagram (DFD) is a traditional way to visualize the information flows within a system. • It shows • how information enters the System • How Information leaves the system • what changes the information • where information is stored.
  • 129. • Good DFDs are critical to the process • Building DFDs == understanding the system • Analyzing DFDs == understanding the threats
  • 130. DFD Types of Notation • Yourdon & Coad • Gene & Sarson are the two main methods of notation used in DFDs
  • 132. External Entity • An external agent is a source or destination of data • The sharp cornered rectangles(or simply boxes) in a DFD indicates entities. • An external entity, which are also known as terminators, sources, sinks, or actors, are an outside system or process that sends or receives data to and from the diagrammed system
  • 133. • Entities are people things, organizations etc of data from the system
  • 134. Process • A process is a unit of work that operates on the data • The rounded cornered rectangles or circles in a DFD indicate processes • Process is a procedure that manipulates the data and its flow by taking incoming data, changing it, and producing an output with it
  • 136. Data Store • Opened sided rectangles in DFD indicates data store. • The Data Store symbol represents data that is not moving (delayed data at rest). • A Data Store is a logical repository of data. • Data can be written into the data store. This is depicted by an incoming arrow. • Data can be read from a data store. This is depicted by an outgoing arrow.
  • 137. Data store • Two data stores cannot be connected by a data flow. • External entity cannot read or write to the data store.
  • 139. Data flow • Data flow is the path the system’s information takes from external entities through processes and data stores. With arrows and succinct labels, the DFD can show you the direction of the data flow. • Arrow symbol in DFD indicate data flow • The Data Flow symbol represents movement of data •
  • 141. Rules of Data Flow • Data can flow from • External Entity to Process • Process to External Entity • Process to Data Store and back • Process to Process
  • 142. Rules of Data Flow • Data cannot flow from • External entity to External Entity • External entity to store • Data store to External Entity • Data store to Data Store
  • 143. Four rules of thumb to create a valid DFD. 1. Each process should have at least one input and one output. 2. Each data store should have at least one data flow in and data flow out. 3. A system’s stored data must go through a process. 4. All processes in a DFD must link to another process or data store.
  • 144. Creating Data Flow Diagrams Steps: 1. Create a list of activities 2. Construct Context Level DFD (identifies external entities and processes) 3. Construct Level 0 DFD (identifies manageable sub process ) 4. Construct Level 1- n DFD (identifies actual data flows and data stores ) 5. Check against rules of DFD
  • 145. DFD Naming Guidelines • External Entity  Noun • Data Flow  Names of data • Process  verb phrase • a system name • a subsystem name • Data Store  Noun
  • 146. Creating Data Flow Diagrams Lemonade Stand Example
  • 147. Creating Data Flow Diagrams Steps: 1. Create a list of activities • New way: use Use-Case Diagram 2. Construct Context Level DFD (identifies sources and sink) 3. Construct Level 0 DFD (identifies manageable sub processes ) 4. Construct Level 1- n DFD (identifies actual data flows and data stores ) Example The operations of a simple lemonade stand will be used to demonstrate the creation of dataflow diagrams.
  • 148. Creating Data Flow Diagrams 1. Create a list of activities Example Think through the activities that take place at a lemonade stand. Customer Order Produce Product Serve Product Collect Payment Store Product
  • 149. Creating Data Flow Diagrams Example Also think of the additional activities needed to support the basic activities. Customer Order Produce Product Serve Product Collect Payment Store Product Order Raw Materials Pay for Raw Materials Pay for Labor 1. Create a list of activities
  • 150. Creating Data Flow Diagrams Example Group these activities in some logical fashion, possibly functional areas. Customer Order Serve Product Collect Payment Produce Product Store Product Order Raw Materials Pay for Raw Materials Pay for Labor 1. Create a list of activities
  • 151. Process Customer Order Sale Serve Product Collect Payment Produce Product (product preparation) Production Store Product Order Raw Materials Procurement Pay for Raw Materials Pay for Labor payroll
  • 152. Context Level DFD • Basic form of DFD • Called as Level 0 • Shows how the system works at a Glance • There is only one process(Main Process) in the system and all the data flows either into or out of this process.
  • 153. • Context level DFD’s demonstrates the interactions between the process and external entities. • They do not contain Data Stores.
  • 154. Creation of Context Level DFD • Identify the main system (Lemonade) • Identify the external people who interact with the system • Customer • Employee (Seller) • Vendor (for purchasing Raw Materials)
  • 155. • Decide what data these entities will enter into the system • Determine what these entities expect as output from the system
  • 156. Creating Data Flow Diagrams Context Level DFD Example Create a context level diagram identifying the sources and sinks (users). Customer Order Serve Product Collect Payment Produce Product Store Product Order Raw Materials Pay for Raw Materials Pay for Labor 2. Construct Context Level DFD (identifies sources and sink) 0.0 Lemonade System EMPLOYEE CUSTOMER Pay Payment Order VENDOR Payment Purchase Order Produce Product Received Goods Time Worked Sales Forecast Product Served
  • 157. Creating Data Flow Diagrams Level 0 DFD Example Create a level 0 diagram identifying the logical subsystems that may exist. Customer Order Serve Product Collect Payment Produce Product Store Product Order Raw Materials Pay for Raw Materials Pay for Labor 3. Construct Level 0 DFD (identifies manageable sub processes ) 2.0 Production EMPLOYEE Produce Product 1.0 Sale 3.0 Procure- ment Sales Forecast Product Ordered CUSTOMER Pay Payment Customer Order VENDOR Payment Purchase Order Order Decisions Received Goods Time Worked Inventory Product Served 4.0 Payroll
  • 158. • Create a level 1 decomposing the processes in level 0 and identifying data stores • In the Level 0 with respect to 1.0 Sale
  • 159. Main Process Sub Process Data Stores Sale Customer Order Customer , Order Payment Order, Payment Sales forecast Payment
  • 160. Creating Data Flow Diagrams Level 1 DFD Example Create a level 1 decomposing the processes in level 0 and identifying data stores. 4. Construct Level 1- n DFD (identifies actual data flows and data stores ) 1.3 Produce Sales Forecast Sales Forecast Payment Customer Order Serve Product Collect Payment Produce Product Store Product Order Raw Materials Pay for Raw Materials Pay for Labor 1.1 Record Order Customer Order ORDER 1.2 Receive Payment PAYMENT Severed Order Request for Forecast CUSTOMER
  • 161. Creating Data Flow Diagrams Level 1 DFD Example Create a level 1 decomposing the processes in level 0 and identifying data stores. 4. Construct Level 1 (continued) Customer Order Serve Product Collect Payment Produce Product Store Product Order Raw Materials Pay for Raw Materials Pay for Labor 2.1 Serve Product Product Order ORDER 2.2 Produce Product INVENTORTY Quantity Severed Production Schedule RAW MATERIALS 2.3 Store Product Quantity Produced & Location Stored Quantity Used Production Data
  • 162. Creating Data Flow Diagrams Level 1 DFD Example Create a level 1 decomposing the processes in level 0 and identifying data stores. 4. Construct Level 1 (continued) Customer Order Serve Product Collect Payment Produce Product Store Product Order Raw Materials Pay for Raw Materials Pay for Labor 3.1 Produce Purchase Order Order Decision PURCHASE ORDER 3.2 Receive Items Received Goods RAW MATERIALS 3.3 Pay Vendor Quantity Received Quantity On-Hand RECEIVED ITEMS VENDOR Payment Approval Payment
  • 163. Creating Data Flow Diagrams Level 1 DFD Example Create a level 1 decomposing the processes in level 0 and identifying data stores. 4. Construct Level 1 (continued) Time Worked Customer Order Serve Product Collect Payment Produce Product Store Product Order Raw Materials Pay for Raw Materials Pay for Labor 4.1 Record Time Worked TIME CARDS 4.2 Calculate Payroll Payroll Request EMPLOYEE 4.3 Pay Employe e Employee ID PAYROLL PAYMENTS Payment Approval Payment Unpaid time cards
  • 165. Let us try to identify some candidate keys for this relation Key Rationale EmployeeNo This seems to be a good candidate key as companies usually issue a unique number for each employee. AadharNo This seems to be a good candidate key for a company based in India. However we have assumed that every employee has an Aadhar number which may not be true. In addition for a multinational firm with employees across the globe this will not work at all. Name, DateOfBirth This might work for a small firm of 10 people as a combination of Name and Date of Birth is likely to be unique. Salary This is not a good candidate as salary is generally paid uniformly to people at same level. EmployeeNo, DateOfBirth It is not a candidate key as EmployeeNo alone is unique. By definition only minimal set of attributes can be candidate key. Thus the choice of candidate key depends upon the business context.
  • 166. Key Rationale EmployeeNo Good candidate as it is numeric, cannot be null and does not change with time. AadharNo It will be null for people who do not have Aadhar number. Hence it cannot be considered as primary key. Name, DateOfBirth Both Name and DateOfBirth cannot be null. However even if uniqueness is guaranteed, it is not a good choice due to large size. When two or more columns together identify the unique row then it's referred to as Composite Primary Key. The combination of Name and DateOfBirth if selected as a primary key would be a composite primary key.
  • 167. Database Users and Administrators • Primary goal of the database system is to retrieve information and store new information in the database • 2 Types of Users • Database Users • Database Administrators
  • 168. Database Users • 4 Types of Database Users • Naive users • Application Programmers • Sophisticated Users • Specialized Users
  • 169. Naïve Users • They are unsophisticated users who interact with the system by invoking one of the application program that have been written previously. • Simple User, who operates the applications • Does not write any code
  • 170. Application Programmers • They are computer professionals who write application programs. • Application programmers can choose from many tools to develop user interfaces.
  • 171. Sophisticated Users •They interact with the system without writing programs. •Instead they write their requests in a database query language
  • 172. Specialized Users •They are sophisticated users who write specialized database applications that do not fit in to the traditional data-processing frame work.
  • 173. Database Administrator • DBA is the person or group that is responsible for supervising both the data base and the use of the DBMS • Roles of DBA • Policy Formulation and Implementation • Access privileges • Security • Disaster planning • Archives • Data Dictionary Management • Training • DBMS Support • DBMS evaluation and selection • DBMS responsibility • Data Base Design
  • 174. DBA formulates policies and communicates these policies to users. Among these policies are those covering access • privilege, • security, • disaster planning, • archives.
  • 175. AccessPrivilege • Access to every table and field in the database is not a necessity for every user. • User A Access denide Access permitted
  • 176. Security • Security refers to the prevention of unauthorized access to the database • Once access privilege have been specified and security features are in place, • DBA draws up policies to explain the security privileges and then distributes these policies to authorized users.
  • 177. Planning for Disaster • A database may be harmed by some physical problems or by some natural disasters. • For this the DBA’s responsibility is to establish and implement backup and recovery procedures.
  • 178. Archives • This tells that data need to be kept in the database for only a limited time. • A data archive is a place where a record of certain corporate data is kept • Data Dictionary Management (Data About Data) • Data dictionary contains a wider range of information, including at the very least, information on tables, indexes, and programs. • The creation and distribution of appropriate reports from the data dictionary is another of DBA”s responsibility.
  • 180. Data Base and Database Users  A database is a collection of related data. By data, we mean known facts that can be recorded and that have implicit meaning.  For example, consider the names, telephone numbers, and addresses of the people you know.  Nowadays, this data is typically stored in mobile phones, which have their own simple database software.  This data can also be recorded in an indexed address book or stored on a hard drive, using a personal computer and software such as Microsoft Access or Excel.  This collection of related data with an implicit meaning is a database.
  • 181. Properties  A database represents some aspect of the real world, sometimes called the miniworld or the universe of discourse (UoD). Changes to the miniworld are reflected in the database.  A database is a logically coherent collection of data with some inherent meaning. A random assortment of data cannot correctly be referred to as a database.  A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.
  • 182.  A database management system (DBMS) is a computerized system that enables users to create and maintain a database.  The DBMS is a general-purpose software system that facilitates the processes of defining, constructing, manipulating, and sharing databases among various users and applications.  Defining a database involves specifying the data types, structures, and constraints of the data to be stored in the database.  The database definition or descriptive information is also stored by the DBMS in the form of a database catalog or dictionary; it is called meta-data.
  • 183.  Constructing the database is the process of storing the data on some storage medium that is controlled by the DBMS.  Manipulating a database includes functions such as querying the database to retrieve specific data, updating the database to reflect changes in the mini world, and generating reports from the data.  Sharing a database allows multiple users and programs to access the database simultaneously.
  • 187. Characteristics of the Database Approach  A number of characteristics distinguish the database approach from the much older approach of writing customized programs to access data stored in files.  In traditional file processing, each user defines and implements the files needed for a specific software application as part of programming the application.  For example, one user, the grade reporting office, may keep files on students and their grades.  Programs to print a student’s transcript and to enter new grades are implemented as part of the application.  A second user, the accounting office, may keep track of students’ fees and their payments. Although both users are interested in data about students, each user maintains separate files—and programs to manipulate these files—because each requires some data not available from the other user’s files.
  • 188.  The main characteristics of the database approach versus the file-processing approach are the following:  Self-describing nature of a database system  Insulation between programs and data, and data abstraction  Support of multiple views of the data  Sharing of data and multiuser transaction processing
  • 189. Self-Describing Nature of a Database System  A fundamental characteristic of the database approach is that the database system contains not only the database itself but also a complete definition or description of the database structure and constraints.  This definition is stored in the DBMS catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data.  The information stored in the catalog is called meta-data, and it describes the structure of the primary database
  • 190.  It is important to note that some newer types of database systems, known as NOSQL systems, do not require meta-data.  Rather the data is stored as self-describing data that includes the data item names and data values together in one structure  The catalog is used by the DBMS software and also by database users who need information about the database structure.  A general-purpose DBMS software package is not written for a specific database application.  Therefore, it must refer to the catalog to know the structure of the files in a specific database, such as the type and format of data it will access.
  • 192. Insulation between Programs and Data, and Data Abstraction  In traditional file processing, the structure of data files is embedded in the application programs, so any changes to the structure of a file may require changing all programs that access that file.  By contrast, DBMS access programs do not require such changes in most cases.  The structure of data files is stored in the DBMS catalog separately from the access programs.  We call this property program-data independence.
  • 193.  An operation (also called a function or method) is specified in two parts.  The interface (or signature) of an operation includes the operation name and the data types of its arguments (or parameters).  The implementation (or method) of the operation is specified separately and can be changed without affecting the interface.  User application programs can operate on the data by invoking these operations through their names and arguments, regardless of how the operations are implemented. This may be termed program-operation independence.
  • 194. Support of Multiple Views of the Data  A database typically has many types of users, each of whom may require a different perspective or view of the database.  A view may be a subset of the database or it may contain virtual data that is derived from the database files but is not explicitly stored.  Some users may not need to be aware of whether the data they refer to is stored or derived.  A multiuser DBMS whose users have a variety of distinct applications must provide facilities for defining multiple views.
  • 195. Sharing of Data and Multiuser Transaction Processing  A multiuser DBMS, as its name implies, must allow multiple users to access the database at the same time.  This is essential if data for multiple applications is to be integrated and maintained in a single database.  The DBMS must include concurrency control software to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct.
  • 196. Two views from Database
  • 197. Advantages of Using the DBMS Approach  Controlling Redundancy  Restricting Unauthorized Access  Providing Persistent Storage for Program Objects  Providing Storage Structures and Search Techniques for Efficient Query Processing  Providing Backup and Recovery  Providing Multiple User Interfaces  Representing Complex Relationships among Data  Enforcing Integrity Constraints  Permitting Inferencing and Actions Using Rules and Triggers  Additional Implications of Using the Database Approach
  • 198.  Potential for Enforcing Standards.  The database approach permits the DBA to define and enforce standards among database users in a large organization. This facilitates communication and cooperation among various departments, projects, and users within the organization. Standards can be defined for names and formats of data elements, display formats, report structures, terminology, and so on.  Reduced Application Development Time.  A prime selling feature of the database approach is that developing a new application—such as the retrieval of certain data from the database for printing a new report—takes very little time. Designing and implementing a large multiuser database from scratch may take more time than writing a single specialized file application.
  • 199.  Flexibility.  It may be necessary to change the structure of a database as requirements change. For example, a new user group may emerge that needs information not currently in the database. In response, it may be necessary to add a file to the database or to extend the data elements in an existing file.  Availability of Up-to-Date Information.  A DBMS makes the database available to all users. As soon as one user’s update is applied to the database, all other users can immediately see this update. This availability of up-to-date information is essential for many transaction-processing applications, such as reservation systems or banking databases, and it is made possible by the concurrency control and recovery subsystems of a DBMS.
  • 200.  Economies of Scale.  The DBMS approach permits consolidation of data and applications, thus reducing the amount of wasteful overlap between activities of data-processing personnel in different projects or departments as well as redundancies among applications.  This enables the whole organization to invest in more powerful processors, storage devices, or networking gear, rather than having each department purchase its own (lower performance) equipment.  This reduces overall costs of operation and management.`
  • 201. Data Models, Schemas, and Instances  A data model—a collection of concepts that can be used to describe the structure of a database—provides the necessary means to achieve this abstraction.  By structure of a database we mean the data types, relationships, and constraints that apply to the data.  Most data models also include a set of basic operations for specifying retrievals and updates on the database.
  • 202. Categories of Data Models  Many data models have been proposed, which we can categorize according to the types of concepts they use to describe the database structure.  High-level or conceptual data models provide concepts that are close to the way many users perceive data, whereas low-level or physical data models provide concepts that describe the details of how data is stored on the computer storage media, typically magnetic disks.  Concepts provided by physical data models are generally meant for computer specialists, not for end users.  Between these two extremes is a class of representational (or implementation) data models, which provide concepts that may be easily understood by end users but that are not too far removed from the way data is organized in computer storage.
  • 203.  Conceptual data models use concepts such as entities, attributes, and relationships.  An entity represents a real-world object or concept, such as an employee or a project from the mini-world that is described in the database.  An attribute represents some property of interest that further describes an entity, such as the employee’s name or salary.  A relationship among two or more entities represents an association among the entities, for example, a works-on relationship between an employee and a project.
  • 204.  Representational or implementation data models are the models used most frequently in traditional commercial DBMSs.  These include the widely used relational data model, as well as the so-called legacy data models—the network and hierarchical models—that have been widely used in the past. Part 3 of the text is devoted to the relational data model, and its constraints, operations, and languages.  Representational data models represent data by using record structures and hence are sometimes called record-based data model
  • 205. Schema diagram for the database  The description of a database is called the database schema, which is specified during database design and is not expected to change frequently.  Most data models have certain conventions for displaying schemas as diagrams.  A displayed schema is called a schema diagram
  • 206. Architecture and Data Independence
  • 207.  The internal level has an internal schema, which describes the physical storage structure of the database.  The internal schema uses a physical data model and describes the complete details of data storage and access paths for the database.  The conceptual level has a conceptual schema, which describes the structure of the whole database for a community of users.  The conceptual schema hides the details of physical storage structures and concentrates on describing entities, data types, relationships, user operations, and constraints.  Usually, a representational data model is used to describe the conceptual schema when a database system is implemented.  This implementation conceptual schema is often based on a conceptual schema design in a high-level data model.
  • 208.  The external or view level includes a number of external schemas or user views.  Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group.  As in the previous level, each external schema is typically implemented using a representational data model, possibly based on an external schema design in a high- level conceptual data model.
  • 209.  The three-schema architecture is a convenient tool with which the user can visualize the schema levels in a database system.  Most DBMSs do not separate the three levels completely and explicitly, but they support the three-schema architecture to some extent.  Some older DBMSs may include physical-level details in the conceptual schema.  The three-level ANSI architecture has an important place in database technology development because it clearly separates the users’ external level, the database’s conceptual level, and the internal storage level for designing a database.  It is very much applicable in the design of DBMSs, even today.  In most DBMSs that support user views, external schemas are specified in the same data model that describes the conceptual-level information
  • 210. Data Independence  Logical data independence is the capacity to change the conceptual schema without having to change external schemas or application programs.  We may change the conceptual schema to expand the database (by adding a record type or data item), to change constraints, or to reduce the database (by removing a record type or data item).  In the last case, external schemas that refer only to the remaining data should not be affected.  Only the view definition and the mappings need to be changed in a DBMS that supports logical data independence.  After the conceptual schema undergoes a logical reorganization, application programs that reference the external schema constructs must work as before.  Changes to constraints can be applied to the conceptual schema without affecting the external schemas or application programs.
  • 211.  Physical data independence is the capacity to change the internal schema without having to change the conceptual schema. Hence, the external schemas need not be changed as well.  Changes to the internal schema may be needed because some physical files were reorganized—for example, by creating additional access structures—to improve the performance of retrieval or update.  If the same data as before remains in the database, we should not have to change the conceptual schema.
  • 212. Database Languages and Interfaces  Once the design of a database is completed and a DBMS is chosen to implement the database, the first step is to specify conceptual and internal schemas for the database and any mappings between the two.  In many DBMSs where no strict separation of levels is maintained, one language, called the data definition language (DDL), is used by the DBA and by database designers to define both schemas.  The DBMS will have a DDL compiler whose function is to process DDL statements in order to identify descriptions of the schema constructs and to store the schema description in the DBMS catalog.
  • 213.  In DBMSs where a clear separation is maintained between the conceptual and internal levels, the DDL is used to specify the conceptual schema only.  Another language, the storage definition language (SDL), is used to specify the internal schema.  The mappings between the two schemas may be specified in either one of these languages. In most relational DBMSs today, there is no specific language that performs the role of SDL.  Instead, the internal schema is specified by a combination of functions, parameters, and specifications related to storage of files. These permit the DBA staff to control indexing choices and mapping of data to storage.  For a true three-schema architecture, we would need a third language, the view definition language (VDL), to specify user views and their mappings to the conceptual schema, but in most DBMSs the DDL is used to define both conceptual and external schemas.
  • 214.  There are two main types of DMLs.  A high-level or nonprocedural DML can be used on its own to specify complex database operations concisely.  Many DBMSs allow high-level DML statements either to be entered interactively from a display monitor or terminal or to be embedded in a general-purpose programming language.  A low-level or procedural DML must be embedded in a general-purpose programming language. This type of DML typically retrieves individual records or objects from the database and processes each separately.  Therefore, it needs to use programming language constructs, such as looping, to retrieve and process each record from a set of records.
  • 215.  Low-level DMLs are also called record-at-a-time DMLs because of this property. High- level DMLs, such as SQL, can specify and retrieve many records in a single DML statement; therefore, they are called set-at-a-time or set-oriented DMLs.  A query in a high-level DML often specifies which data to retrieve rather than how to retrieve it; therefore, such languages are also called declarative.  Whenever DML commands, whether high level or low level, are embedded in a general- purpose programming language, that language is called the host language and the DML is called the data sublanguage.  On the other hand, a high-level DML used in a standalone interactive manner is called a query language.
  • 216. DBMS Interfaces  Menu-based Interfaces for Web Clients or Browsing. These interfaces present the user with lists of options (called menus) that lead the user through the formulation of a request.  Menus do away with the need to memorize the specific commands and syntax of a query language; rather, the query is composed step-by-step by picking options from a menu that is displayed by the system.  Pull-down menus are a very popular technique in Web-based user interfaces.  They are also often used in browsing interfaces, which allow a user to look through the contents of a database in an exploratory and unstructured manner.
  • 217. Apps for Mobile Devices.  These interfaces present mobile users with access to their data.  For example, banking, reservations, and insurance companies, among many others, provide apps that allow users to access their data through a mobile phone or mobile device.  The apps have built-in programmed interfaces that typically allow users to login using their account name and password; the apps then provide a limited menu of options for mobile access to the user data, as well as options such as paying bills (for banks) or making reservations (for reservation Web sites).
  • 218.  Forms-based Interfaces.  A forms-based interface displays a form to each user.  Users can fill out all of the form entries to insert new data, or they can fill out only certain entries, in which case the DBMS will retrieve matching data for the remaining entries.  Forms are usually designed and programmed for naive users as interfaces to canned transactions.  Many DBMSs have forms specification languages, which are special languages that help programmers specify such forms.
  • 219.  SQL*Forms is a form-based language that specifies queries using a form designed in conjunction with the relational database schema.  Oracle Forms is a component of the Oracle product suite that provides an extensive set of features to design and build applications using forms.  Some systems have utilities that define a form by letting the end user interactively construct a sample form on the screen.  Graphical User Interfaces. A GUI typically displays a schema to the user in diagrammatic form.  The user then can specify a query by manipulating the diagram. In many cases, GUIs utilize both menus and forms.
  • 220.  Natural Language Interfaces  These interfaces accept requests written in English or some other language and attempt to understand them.  A natural language interface usually has its own schema, which is similar to the database conceptual schema, as well as a dictionary of important words.  The natural language interface refers to the words in its schema, as well as to the set of standard words in its dictionary, that are used to interpret the request.  If the interpretation is successful, the interface generates a high-level query corresponding to the natural language request and submits it to the DBMS for processing; otherwise, a dialogue is started with the user to clarify the request.
  • 221.  Keyword-based Database Search.  These are somewhat similar to Web search engines, which accept strings of natural language (like English or Spanish) words and match them with documents at specific sites (for local search engines) or Web pages on the Web at large (for engines like Google or Ask).  They use predefined indexes on words and use ranking functions to retrieve and present resulting documents in a decreasing degree of match. Such “free form” textual query interfaces are not yet common in structured relational databases, although a research area called keyword-based querying has emerged recently for relational databases.
  • 222.  Speech Input and Output.  Limited use of speech as an input query and speech as an answer to a question or result of a request is becoming commonplace.  Applications with limited vocabularies, such as inquiries for telephone directory, flight arrival/departure, and credit card account information, are allowing speech for input and output to enable customers to access this information.  The speech input is detected using a library of predefined words and used to set up the parameters that are supplied to the queries.  For output, a similar conversion from text or numbers into speech takes place.
  • 223.  Interfaces for Parametric Users.  Parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly. For example, a teller is able to use single function keys to invoke routine and repetitive transactions such as account deposits or withdrawals, or balance inquiries.  Systems analysts and programmers design and implement a special interface for each known class of naive users.  Usually a small set of abbreviated commands is included, with the goal of minimizing the number of keystrokes required for each request.  Interfaces for the DBA. Most database systems contain privileged commands that can be used only by the DBA staff.  These include commands for creating accounts, setting system parameters, granting account authorization, changing a schema, and reorganizing the storage structures of a database.
  • 224. Data Modeling Using the Entity– Relationship (ER) Model
  • 225.  The first step shown is requirements collection and analysis.  During this step, the database designers interview prospective database users to understand and document their data requirements.  The result of this step is a concisely written set of users’ requirements.  These requirements should be specified in as detailed and complete a form as possible. In parallel with specifying the data requirements, it is useful to specify the known functional requirements of the application.  These consist of the user defined operations (or transactions) that will be applied to the database, including both retrievals and updates. In software design, it is common to use data flow diagrams, sequence diagrams, scenarios, and other techniques to specify functional requirements.
  • 226.  Once the requirements have been collected and analyzed, the next step is to create a conceptual schema for the database, using a high-level conceptual data model.  This step is called conceptual design. The conceptual schema is a concise description of the data requirements of the users and includes detailed descriptions of the entity types, relationships, and constraints; these are expressed using the concepts provided by the high-level data model.  Because these concepts do not include implementation details, they are usually easier to understand and can be used to communicate with nontechnical users.  The high-level conceptual schema can also be used as a reference to ensure that all users’ data requirements are met and that the requirements do not conflict.
  • 227.  This approach enables database designers to concentrate on specifying the properties of the data, without being concerned with storage and implementation details, which makes it is easier to create a good conceptual database design.  During or after the conceptual schema design, the basic data model operations can be used to specify the high-level user queries and operations identified during functional analysis.  This also serves to confirm that the conceptual schema meets all the identified functional requirements.  Modifications to the conceptual schema can be introduced if some functional requirements cannot be specified using the initial schema.
  • 228.  The next step in database design is the actual implementation of the database, using a commercial DBMS.  Most current commercial DBMSs use an implementation data model—such as the relational (SQL) model—so the conceptual schema is transformed from the high- level data model into the implementation data model.  This step is called logical design or data model mapping; its result is a database schema in the implementation data model of the DBMS.
  • 229.  Data model mapping is often automated or semi-automated within the database design tools.  The last step is the physical design phase, during which the internal storage structures, file organizations, indexes, access paths, and physical design parameters for the database files are specified.  In parallel with these activities, application programs are designed and implemented as database transactions corresponding to the high-level transaction specifications.
  • 230. Entity Types, Entity Sets, Attributes, and Keys
  • 232. Entities and Attributes  Entities and Their Attributes.  The basic concept that the ER model represents is an entity, which is a thing or object in the real world with an independent existence.  An entity may be an object with a physical existence (for example, a particular person, car, house, or employee) or it may be an object with a conceptual existence (for instance, a company, a job, or a university course).  Each entity has attributes—the particular properties that describe it.
  • 233. Single-Valued versus Multivalued Attributes  Most attributes have a single value for a particular entity; such attributes are called single- valued. For example,  Age is a single-valued attribute of a person. In some cases an attribute can have a set of values for the same entity—for instance, a Colors attribute for a car, or a College_degrees attribute for a person.  Cars with one color have a single value, whereas two-tone cars have two color values. Similarly, one person may not have any college degrees, another person may have one, and a third person may have two or more degrees; therefore, different people can have different numbers of values for the College_degrees attribute. Such attributes are called multivalued.
  • 234.  A multivalued attribute may have lower and upper bounds to constrain the number of values allowed for each individual entity.  For example, the Colors attribute of a car may be restricted to have between one and two values, if we assume that a car can have two colors at most.
  • 235. Stored versus Derived Attributes  In some cases, two (or more) attribute values are related—for example, the Age and Birth_date attributes of a person.  For a particular person entity, the value of Age can be determined from the current (today’s) date and the value of that person’s Birth_date.  The Age attribute is hence called a derived attribute and is said to be derivable from the Birth_date attribute, which is called a stored attribute.  Some attribute values can be derived from related entities; for example, an attribute Number_of_employees of a DEPARTMENT entity can be derived by counting the number of employees related to (working for) that department.
  • 236. NULL Values  In some cases, a particular entity may not have an applicable value for an attribute. For example, the Apartment_number attribute of an address applies only to addresses that are in apartment buildings and not to other types of residences, such as single- family homes. Similarly, a College_degrees attribute applies only to people with college degrees.  For such situations, a special value called NULL is created.  The meaning of the former type of NULL is not applicable, whereas the meaning of the latter is unknown.  The unknown category of NULL can be further classified into two cases. The first case arises when it is known that the attribute value exists but is missing—for instance, if the Height attribute of a person is listed as NULL.
  • 237. Complex Attributes  Notice that, in general, composite and multivalued attributes can be nested arbitrarily.  We can represent arbitrary nesting by grouping components of a composite attribute between parentheses ( ) and separating the components with commas, and by displaying multivalued attributes between braces { }.  Such attributes are called complex attributes. For example, if a person can have more than one residence and each residence can have a single address and multiple phones, an attribute Address_phone for a person can be specified as Both Phone and Address are themselves composite attributes.
  • 238. Entity Types, Entity Sets, Keys, and Value Sets  Entity Types and Entity Sets. A database usually contains groups of entities that are similar.  For example, a company employing hundreds of employees may want to store similar information concerning each of the employees.  These employee entities share the same attributes, but each entity has its own value(s) for each attribute.  An entity type defines a collection (or set) of entities that have the same attributes.  Each entity type in the database is described by its name and attributes.
  • 239.  Database at any point in time is called an entity set or entity collection; the entity set is usually referred to using the same name as the entity type, even though they are two separate concepts.  For example, EMPLOYEE refers to both a type of entity as well as the current collection of all employee entities in the database.  It is now more common to give separate names to the entity type and entity collection; for example in object and object-relational data models
  • 240. Key Attributes of an Entity Type  An important constraint on the entities of an entity type is the key or uniqueness constraint on attributes.  An entity type usually has one or more attributes whose values are distinct for each individual entity in the entity set.  Such an attribute is called a key attribute, and its values can be used to identify each entity uniquely.
  • 241.  Sometimes several attributes together form a key, meaning that the combination of the attribute values must be distinct for each entity.  If a set of attributes possesses this property, the proper way to represent this in the ER model that describe here is to define a composite attribute and designate it as a key attribute of the entity type.  Specifying that an attribute is a key of an entity type means that the preceding uniqueness property must hold for every entity set of the entity type.  Hence, it is a constraint that prohibits any two entities from having the same value for the key attribute at the same time.  It is not the property of a particular entity set; rather, it is a constraint on any entity set of the entity type at any point in time.
  • 243. Value Sets (Domains) of Attributes  Each simple attribute of an entity type is associated with a value set (or domain of values), which specifies the set of values that may be assigned to that attribute for each individual entity.  If the range of ages allowed for employees is between 16 and 70, we can specify the value set of the Age attribute of EMPLOYEE to be the set of integer numbers between 16 and 70.  Similarly, we can specify the value set for the Name attribute to be the set of strings of alphabetic characters separated by blank characters, and so on.  Value sets are not typically displayed in basic ER diagrams and are similar to the basic data types available in most programming languages, such as integer, string, Boolean,
  • 244. Relationship Types, Relationship Sets, Roles, and Structural Constraints  For example, the attribute Manager of DEPARTMENT refers to an employee who manages the department; the attribute Controlling_department of PROJECT refers to the department that controls the project; the attribute  Supervisor of EMPLOYEE refers to another employee (the one who supervises this employee); the attribute Department of EMPLOYEE refers to the department for which the employee works; and so on.  In the ER model, these references should not be represented as attributes but as relationships.
  • 245.  A relationship type R among n entity types E1, E2, . . . , En defines a set of associations— or a relationship set—among entities from these entity types.  Similar to the case of entity types and entity sets, a relationship type and its corresponding relationship set are customarily referred to by the same name, R.  Mathematically, the relationship set R is a set of relationship instances ri, where each ri associates n individual entities (e1, e2, . . . , en), and each entity ej in ri is a member of entity set Ej, 1 ≤ j ≤ n. Hence, a relationship set is a mathematical relation on E1, E2, . . . , En; alternatively, it can be defined as a subset of the Cartesian product of the entity sets  E1 × E2 × . . . × En. Each of the entity types E1, E2, . . . , En is said to participate in the relationship type R; similarly, each of the individual entities e1, e2, . . . , en is said
  • 247.  EMPLOYEE and DEPARTMENT, which associates each employee with the department for which the employee works.  Each relationship instance in the relationship set WORKS_FOR associates one EMPLOYEE entity and one DEPARTMENT entity.  Figure illustrates this example, where each relationship instance ri is shown connected to the EMPLOYEE and DEPARTMENT entities that participate in ri.  In the mini-world represented by Figure, the employees e1, e3, and e6 work for department d1; the employees e2 and e4 work for department d2; and the employees e5 and e7 work for department d3.
  • 248. Relationship Degree, Role Names, and Recursive Relationships  Degree of a Relationship Type  The degree of a relationship type is the number of participating entity types. Hence, the WORKS_FOR relationship is of degree two.  A relationship type of degree two is called binary, and one of degree three is called ternary.  An example of a ternary relationship is SUPPLY, shown in Figure where each relationship instance ri associates three entities—a supplier s, a part p, and a project j—whenever s supplies part p to project j.  Relationships can generally be of any degree, but the ones most common are binary relationships.  Higher-degree relationships are generally more complex than binary relationships
  • 249. Relationships as Attributes  One can think of an attribute called Department of the EMPLOYEE entity type, where the value of Department for each EMPLOYEE entity is (a reference to) the DEPARTMENT entity for which that employee works.  Hence, the value set for this Department attribute is the set of all DEPARTMENT entities, which is the DEPARTMENT entity set.  This is what we did in Figure when we specified the initial design of the entity type EMPLOYEE for the COMPANY database.  However, when we think of a binary relationship as an attribute, we always have two options or two points of view.
  • 250.  In this example, the alternative point of view is to think of a multivalued attribute Employees of the entity type DEPARTMENT whose value for each DEPARTMENT entity is the set of EMPLOYEE entities who work for that department.  The value set of this Employees attribute is the power set of the EMPLOYEE entity set. Either of these two attributes—Department of EMPLOYEE or Employees of DEPARTMENT—can represent the WORKS_FOR relationship type.  Role Names and Recursive Relationships. Each entity type that participates in a relationship type plays a particular role in the relationship.
  • 251.  The role name signifies the role that a participating entity from the entity type plays in each relationship instance, and it helps to explain what the relationship means.  For example, in the WORKS_FOR relationship type, EMPLOYEE plays the role of employee or worker and DEPARTMENT plays the role of department or employer.  In such cases the role name becomes essential for distinguishing the meaning of the role that each participating entity plays. Such relationship types are called recursive relationships or self-referencing relationships.
  • 253. Weak Entity Types  Entity types that do not have key attributes of their own are called weak entity types.  In contrast, regular entity types that do have a key attribute—which include all the examples discussed so far—are called strong entity types.  Entities belonging to a weak entity type are identified by being related to specific entities from another entity type in combination with one of their attribute values.  We call this other entity type the identifying or owner entity type,10 and we call the relationship type that relates a weak entity typeto its owner the identifying relationship of the weak entity type.
  • 254.  A weak entity type always has a total participation constraint (existence dependency) with respect to its identifying relationship because a weak entity cannot be identified without an owner entity.  However, not every existence dependency results in a weak entity type. For example, a DRIVER_LICENSE entity cannot exist unless it is related to a PERSON entity, even though it has its own key (License_number) and hence is not a weak entity.
  • 255.  A weak entity type normally has a partial key, which is the attribute that can uniquely identify weak entities that are related to the same owner entity.  In our example, if we assume that no two dependents of the same employee ever have the same first name, the attribute Name of DEPENDENT is the partial key.  In the worst case, a composite attribute of all the weak entity’s attributes will be the partial key.
  • 260.  The EER model includes all the modeling concepts of the ER model that were presented. In addition, it includes the concepts of subclass and superclass and the related concepts of specialization and generalization  Another concept included in the EER model is that of a category or union type which is used to represent a collection of objects (entities) that is the union of objects of different entity types.  Associated with these concepts is the important mechanism of attribute and relationship inheritance.  Unfortunately, no standard terminology exists for these concepts, so we use the most common terminology.  Alternative terminology is given in footnotes. We also describe a diagrammatic technique for displaying these concepts when they arise in an EER schema. We call the resulting schema diagrams enhanced ER or EER diagrams.
  • 261.  For example, the entity type EMPLOYEE describes the type (that is, the attributes and relationships) of each employee entity, and also refers to the current set of EMPLOYEE entities in the COMPANY database.  In many cases an entity type has numerous subgroupings or subtypes of its entities that are meaningful and need to be represented explicitly because of their significance to the database application.  For example, the entities that are members of the EMPLOYEE entity type may be distinguished further into SECRETARY, ENGINEER, MANAGER, TECHNICIAN, SALARIED_EMPLOYEE, HOURLY_EMPLOYEE, and so on.  The set or collection of entities in each of the latter groupings is a subset of the entities that belong to the EMPLOYEE entity set, meaning that every entity that is a member of one of these subgroupings is also an employee.
  • 262. Specialization and Generalization  Specialization is the process of defining a set of subclasses of an entity type; this entity type is called the superclass of the specialization. The set of subclasses that forms a specialization is defined on the basis of some distinguishing characteristic of the entities in the superclass.  For example, the set of subclasses {SECRETARY, ENGINEER, TECHNICIAN} is a specialization of the superclass EMPLOYEE that distinguishes among employee entities based on the job type of each employee.  We may have several specializations of the same entity type based on different distinguishing characteristics.  For example, another specialization of the EMPLOYEE entity type may yield the set of subclasses {SALARIED_EMPLOYEE, HOURLY_EMPLOYEE}; this specialization distinguishes among employees based on the method of pay.
  • 264.  There are two main reasons for including class/subclass relationships and specializations.  The first is that certain attributes may apply to some but not all entities of which these attributes apply.  The members of the subclass may still share the majority of their attributes with the other members of the superclass.  The SECRETARY subclass has the specific attribute Typing_speed, whereas the ENGINEER subclass has the specific attribute Eng_type, but SECRETARY and ENGINEER share their other inherited attributes from the EMPLOYEE entity type.  The second reason for using subclasses is that some relationship types may be participated in only by entities that are members of the subclass.  For example, if only HOURLY_EMPLOYEES can belong to a trade union, we can represent that fact by creating the subclass HOURLY_EMPLOYEE of EMPLOYEE and relating the subclass to an entity type TRADE_UNION via the BELONGS_TO relationship type, as illustrated in Figure
  • 266.  Notice that the generalization process can be viewed as being functionally the inverse of the specialization process; we can view {CAR, TRUCK} as a specialization of VEHICLE rather than viewing VEHICLE as a generalization of CAR and TRUCK.  A diagrammatic notation to distinguish between generalization and specialization is used in some design methodologies.  An arrow pointing to the generalized superclass represents a generalization process, whereas arrows pointing to the specialized subclasses represent a specialization process.  We will not use this notation because the decision as to which process was followed in a particular situation is often subjective.
  • 267. Constraints and Characteristics of Specialization and Generalization Hierarchies  If all subclasses in a specialization have their membership condition on the same attribute of the superclass, the specialization itself is called an attribute-defined specialization, and the attribute is called the defining attribute of the specialization.  In this case, all the entities with the same value for the attribute belong to the same subclass. We display an attribute-defined specialization by placing the defining attribute name next to the arc from the circle to the superclass
  • 269.  Two other constraints may apply to a specialization. The first is the disjointness constraint, which specifies that the subclasses of the specialization must be disjoint sets.  This means that an entity can be a member of at most one of the subclasses of the specialization.  A specialization that is attribute-defined implies the dis jointness constraint (if the attribute used to define the membership predicate is single valued).  Figure illustrates this case, where the d in the circle stands for disjoint. The d notation also applies to user-defined subclasses of a specialization that must be disjoint, as illustrated by the specialization {HOURLY_EMPLOYEE, SALARIED_EMPLOYEE} in Figure
  • 271.  The second constraint on specialization is called the completeness (or totalness) constraint, which may be total or partial.  A total specialization constraint specifies that every entity in the superclass must be a member of at least one subclass in the specialization.  For example, if every EMPLOYEE must be either an HOURLY_EMPLOYEE or a SALARIED_EMPLOYEE, then the specialization {HOURLY_EMPLOYEE, SALARIED_EMPLOYEE} in Figure 4.1 is a total specialization of EMPLOYEE.  This is shown in EER diagrams by using a double line to connect the superclass to the circle. A single line is used to display a partial specialization, which allows an entity not to belong to any of the subclasses.
  • 272. Utilizing Specialization and Generalization in Refining Conceptual Schemas  In the specialization process, the database designers typically start with an entity type and then define subclasses of the entity type by successive specialization; that is, they repeatedly define more specific groupings of the entity type.  For example, when designing the specialization lattice in Figure, we may first specify an entity type PERSON for a university database.  Then we discover that three types of persons will be represented in the database: university employees, alumni, and students and we create the specialization {EMPLOYEE, ALUMNUS, STUDENT}.  The overlapping constraint is chosen because a person may belong to more than one of the subclasses.
  • 273.  We specialize EMPLOYEE further into {STAFF, FACULTY, STUDENT_ASSISTANT}, and specialize STUDENT into {GRADUATE_STUDENT, UNDERGRADUATE_STUDENT}.  Finally, we specialize STUDENT_ASSISTANT into {RESEARCH_ASSISTANT, TEACHING_ASSISTANT}.  This process is called top-down conceptual refinement.  So far, we have a hierarchy; then we realize that STUDENT_ASSISTANT is a shared subclass, since it is also a subclass of STUDENT, leading to the lattice.
  • 274.  The final design of hierarchies or lattices resulting from either process may be identical;  the only difference relates to the manner or order in which the schema super classes and subclasses were created during the design process.  In practice, it is likely that a combination of the two processes is employed.
  翻译: