RGPV Sem : 5th DBMS key point Notes
Chapter: 1
DBMS
Concepts and architecture Introduction, Database approach v/s Traditional file
accessing approach, Advantages, of database systems, Data models, Schemas and
instances, Data independence, Data Base Language and interfaces, Overall
Database Structure, Functions of DBA and designer, ER data model: Entitles and
attributes, Entity types, Defining the E-R diagram, Concept of Generalization,
Aggregation and Specialization. transforming ER diagram into the tables.
Various other data models object oriented data Model, Network data model, and
Relational data model, Comparison between the three types of models.
Introduction to DBMS
A Database Management System (DBMS) is a software system
designed to manage and store data efficiently.1 It provides a
structured approach to organizing, accessing, and manipulating data within a
computer system.2
Database Approach vs. Traditional File Accessing Approach
Feature |
Database Approach |
Traditional File Accessing Approach |
Data Redundancy |
Minimal |
High |
Data Inconsistency |
Low |
High |
Data Isolation |
Low |
High |
Data Integrity |
High |
Low |
Data Security |
High |
Low |
Data Independence |
High |
Low |
Concurrent Access |
High |
Low |
Advantages of Database Systems
- Reduced
Data Redundancy: Minimizes duplication of data, saving storage space
and improving data consistency.
- Improved
Data Consistency: Ensures that changes made to data are reflected
accurately across the entire database.
- Improved
Data Security: Provides mechanisms to control access to data,
preventing unauthorized use or modification.3
- Increased
Data Integrity: Enforces rules and constraints to maintain the
accuracy and validity of data.4
- Improved
Data Independence: Separates data from the applications that use it,
making it easier to modify the database structure without affecting
applications.
- Enhanced
Productivity: Provides tools and interfaces for efficient data
management, retrieval, and analysis.5
- Increased
Data Sharing: Facilitates the sharing of data among multiple users and
applications.6
Data Models, Schemas, and Instances
- Data
Model: A formal description of how data is organized and structured
within a database. Examples include the relational model, object-oriented
model, and network model.
- Schema:
A logical or conceptual representation of the database structure, defining
the tables, attributes, and relationships.7
- Instance:
A specific set of data values stored in the database at a particular point
in time.
Data Independence
- Physical
Data Independence: Changes in the physical storage of data do not
affect the way applications access the data.8
- Logical
Data Independence: Changes in the logical structure of the database do
not affect the way applications access the data.
Database Language and Interfaces
- Data
Definition Language (DDL): Used to create, modify, and delete database
objects like tables, views, and indexes.
- Data
Manipulation Language (DML): Used to insert, retrieve, update, and
delete data within the database.9
- Query
Language (SQL): A powerful language that combines DDL and DML
capabilities.
Overall Database Structure
A typical database system consists of several components:
- Database:
The collection of organized data.
- DBMS:
The software that manages the database.
- Users:
Individuals or applications that interact with the database.
- Application
Programs: Software that accesses and manipulates the database.
Functions of DBA and Designer
- DBA:
Responsible for overall database administration, including security,
performance tuning, and backup and recovery.10
- Designer:
Responsible for designing the database schema, ensuring data integrity and
efficiency.11
ER Data Model
- Entities:
Real-world objects or concepts that are of interest to the organization.
- Attributes:
Properties or characteristics of entities.
- Entity
Types: A collection of entities with similar attributes.
- Defining
the E-R Diagram: A graphical representation of the entities,
attributes, and relationships within a database.12
- Generalization:
A higher-level entity type that represents a general concept, with more
specific entity types inheriting its properties.
- Aggregation:
A relationship between an entity type and a collection of other entity
types, treating the collection as a single entity.
- Specialization:
The process of dividing a general entity type into more specific subtypes.
Transforming ER Diagram into Tables
The ER diagram is translated into a set of tables, with each
entity type becoming a table and relationships represented by foreign keys.
Other Data Models
- Object-Oriented
Data Model: Represents data as objects with attributes and methods.13
- Network
Data Model: Represents data using a network of interconnected records.14
- Relational
Data Model: Represents data using tables with rows and columns.15
Comparison of Data Models
Feature |
Relational Model |
Network Model |
Object-Oriented Model |
Data Representation |
Tables |
Network of records |
Objects with attributes and methods |
Complexity |
Simpler |
More complex |
More complex |
Flexibility |
Less flexible |
More flexible |
Most flexible |
Implementation |
Easier to implement |
More difficult to implement |
More difficult to implement |
Common Usage |
Most widely used |
Less common |
Used in specific applications |
This comprehensive overview provides a foundational
understanding of DBMS concepts and architecture. Remember that specific
implementations and technologies may vary.
CHAPTER:2
Relational
Data models: Domains, Tuples, Attributes, Relations, Characteristics of
relations, Keys, Key attributes of relation, Relational database, Schemas,
Integrity constraints. Referential integrity, Intension and Extension,
Relational Query languages: SQL-DDL, DML, integrity con straints, Complex
queries, various joins, indexing, triggers, assertions , Relational algebra and
relational calculus, Relational algebra operations like select, Project ,Join,
Division, outer union. Types of relational calculus i.e. Tuple oriented and
domain oriented relational calculus and its operations.
Relational Data Model
- Domains:
A set of permitted values for an attribute. For example, the domain of
'Age' could be integers between 0 and 150.
- Attributes:
Named columns in a relation that represent properties of the entities.
- Tuples:
Rows in a relation, representing individual instances of the entities.
- Relations:
A named, two-dimensional table with rows (tuples) and columns
(attributes).
Characteristics of Relations
- Order
of tuples is irrelevant: The order in which tuples appear in a
relation does not matter.
- Order
of attributes is irrelevant: The order of columns does not affect the
meaning of the relation.
- No
duplicate tuples: A relation cannot contain two identical tuples.
Keys
- Superkey:
A set of attributes that uniquely identifies each tuple in a relation.
- Candidate
Key: A minimal superkey (no subset is also a superkey).
- Primary
Key: One of the candidate keys chosen to uniquely identify tuples.
- Foreign
Key: An attribute (or set of attributes) in one relation that refers
to the primary key of another relation.
Relational Database
A collection of interrelated relations.
Schemas
- Relation
Schema: Defines the structure of a relation, including attribute names
and their data types.
- Database
Schema: A collection of relation schemas that defines the entire
database.
Integrity Constraints
Rules that must be enforced on the data in the database.
- Domain
Constraints: Restrict the values of attributes to their specified
domains.
- Key
Constraints: Ensure that each tuple has a unique primary key value.
- Entity
Integrity: The primary key attribute of each tuple cannot be null.
- Referential
Integrity: Foreign key values must either match a primary key value in
another relation or be null.
Referential Integrity
A constraint that maintains the consistency between two
related relations.
Intension and Extension
- Intension:
The definition of a relation, including its name, attributes, and their
domains.
- Extension:
The current set of tuples in a relation at a given time.
Relational Query Languages
Languages used to manipulate and retrieve data from
relational databases.
- SQL
(Structured Query Language)
- DDL
(Data Definition Language): Used to create, alter, and drop database
objects (e.g., tables, views, indexes).
- DML
(Data Manipulation Language): Used to insert, retrieve, update, and
delete data from tables.
- Integrity
Constraints: Defined using SQL constraints (e.g., CHECK, UNIQUE,
FOREIGN KEY).
- Relational
Algebra
- Select
(σ): Selects tuples that satisfy a given condition.
- Project
(Ï€): Selects specific attributes from a relation.
- Join
(⋈): Combines tuples from two relations based
on a join condition.
- Union
(∪): Combines two relations, removing
duplicates.
- Intersection
(∩): Finds tuples that exist in both relations.
- Difference
(-): Finds tuples that exist in one relation but not in another.
- Cartesian
Product (×): Combines each tuple of one relation with each tuple of
another relation.
- Division
(÷): Finds tuples that satisfy a specific relationship between two
relations.
- Relational
Calculus
- Tuple
Relational Calculus: Expresses queries in terms of tuples that
satisfy a given condition.
- Domain
Relational Calculus: Expresses queries in terms of domains of
attributes.
Complex Queries
Queries that involve multiple relations, joins, subqueries,
and other complex operations.
Joins
- Inner
Join: Returns tuples that match the join condition.
- Left
Outer Join: Returns all tuples from the left relation and matching
tuples from the right relation.
- Right
Outer Join: Returns all tuples from the right relation and matching
tuples from the left relation.
- Full
Outer Join: Returns all tuples from both relations, with null values
for unmatched attributes.
Indexing
A data structure that allows for faster retrieval of data
from a relation.
Triggers
Procedures that are automatically executed in response to
certain events (e.g., insert, update, delete).
Assertions
Declarative rules that specify constraints on the database.
This comprehensive explanation covers the fundamental
concepts of the relational data model. Remember that specific implementations
and features may vary across different database systems.
Chapter : 3
Data Base
Design: Introduction to normalization, Normal forms, Functional dependency,
Decomposition, Dependency preservation and lossless join, problems with null
valued and dangling tuples, multivalued dependencies. Query Optimization:
Introduction, steps of optimization, various algorithms to implement select,
project and join operations of relational algebra, optimization methods:
heuristic based, cost estimation based.
Database Design: Normalization
- Introduction
to Normalization:
- A
process of organizing data within the database.
- Aims
to minimize redundancy and improve data integrity.
- Achieved
by decomposing relations into smaller, well-structured relations.
- Normal
Forms:
- 1NF
(First Normal Form):
- Each
attribute contains only atomic (single) values.
- No
multi-valued attributes or repeating groups.
- 2NF
(Second Normal Form):
- In
1NF and every non-key attribute is fully functionally dependent on the
primary key.
- 3NF
(Third Normal Form):
- In
2NF and no non-key attribute is transitively dependent on the primary
key.
- BCNF
(Boyce-Codd Normal Form):
- A
stricter version of 3NF.
- For
every functional dependency X -> Y, X must be a superkey.
- Higher
Normal Forms (4NF, 5NF):
- Address
more complex dependencies like multivalued dependencies and join
dependencies.
- Functional
Dependency:
- A
relationship between two sets of attributes, X and Y, where the value of
X uniquely determines the value of Y.
- Notation:
X -> Y (X functionally determines Y)
- Decomposition:
- The
process of breaking down a relation into smaller relations.
- Lossless
Join Decomposition: The original relation can be perfectly
reconstructed by joining the decomposed relations.
- Dependency
Preserving Decomposition: All functional dependencies in the original
relation are preserved in the decomposed relations.
- Problems
with Null Values and Dangling Tuples:
- Null
Values: Can lead to ambiguity and inconsistencies.
- Dangling
Tuples: Occur when a foreign key references a primary key that no
longer exists.
- Multivalued
Dependencies:
- A
type of dependency where the presence of one or more tuples in a relation
determines the presence of a particular set of other tuples in that
relation.
Query Optimization
- Introduction:
- The
process of finding the most efficient execution plan for a given query.
- Aims
to minimize resource usage (CPU, memory, I/O) and improve query
performance.
- Steps
of Optimization:
- Parsing:
The query is analyzed for syntax and semantics.
- Semantic
Analysis: The meaning of the query is determined.
- Logical
Optimization: The query is transformed into equivalent but more
efficient logical expressions.
- Physical
Optimization: The chosen logical plan is translated into a physical
execution plan, selecting the best access paths (indexes) and join
algorithms.
- Algorithms
for Relational Algebra Operations:
- Select:
- Index-based
selection (using indexes on the selected attribute)
- File
scan (scanning the entire relation)
- Project:
- Projection
can be performed during other operations (e.g., join) to reduce data
size.
- Join:
- Nested-loop
join
- Merge
join (requires sorted relations)
- Hash
join
- Optimization
Methods:
- Heuristic-Based
Optimization:
- Uses
rules of thumb and heuristics to choose the most promising execution
plan.
- Cost-Based
Optimization:
- Estimates
the cost of different execution plans based on statistical information
about the data (e.g., cardinality, selectivity).
- Chooses
the plan with the lowest estimated cost.
Note: This is a general overview. The specific
techniques and algorithms used in query optimization can vary significantly
across different database systems.
Chapter: 4
Transaction Processing Concepts: -Transaction
System, Testing of Serializability, Serializability of schedules, conflict
& view serializable schedule, recoverability, Recovery from transaction
failures. Log based recovery. Checkpoints deadlock handling. Concurrency
Control Techniques: Concurrency Control, locking Techniques for concurrency
control, time stamping protocols for concurrency control, validation based
protocol, multiple granularity. Multi version schemes, Recovery with concurrent
transaction. Introduction to Distributed databases, data mining, data
warehousing, Object Technology and DBMS, Comparative study of OODBMS Vs DBMS .
Temporal, Deductive, Multimedia, Web & Mobile database
Certainly, let's delve into the concepts of Transaction
Processing, Concurrency Control, and other related topics.
Transaction Processing Concepts
- Transaction
System: A system that manages and executes units of work called
transactions.
- Transaction:
A sequence of operations that access and manipulate data in a database.
- ACID
Properties:
- Atomicity:
All operations within a transaction must be executed as a single unit.
Either all operations succeed, or all operations fail.
- Consistency:
A transaction must leave the database in a consistent state.
- Isolation:
Concurrent execution of transactions should not interfere with each
other.
- Durability:
Once a transaction is committed, its effects must be permanently
recorded and survive system failures.
- Testing
of Serializability:
- Serializability:
A property of a schedule (order of execution of transactions) where the
result is equivalent to a serial schedule (transactions executed one
after another).
- Conflict
Serializability: A schedule is conflict serializable if it is
equivalent to some serial schedule by swapping non-conflicting
operations.
- View
Serializability: A schedule is view serializable if each transaction
"sees" a consistent view of the database, as if it were
executed alone.
- Recoverability:
- A
property that ensures that if a transaction fails, its effects are
undone, and the database is restored to a consistent state.
- Cascadeless
Recovery: A stronger form of recoverability where the failure of one
transaction does not cause the rollback of other transactions.
- Recovery
from Transaction Failures:
- Log-Based
Recovery:
- Log:
A record of all transaction operations (e.g., read, write, commit,
abort).
- Checkpoints:
Periodically recorded states of the database and the log, used for
faster recovery.
- Recovery
Techniques: Undo, Redo, Undo/Redo.
- Deadlock
Handling:
- A
situation where two or more transactions are waiting for resources held
by each other, leading to a standstill.
- Deadlock
Prevention: Techniques to prevent deadlocks from occurring (e.g.,
timeouts, resource ordering).
- Deadlock
Detection: Algorithms to detect deadlocks and resolve them (e.g.,
wait-for graph).
Concurrency Control Techniques
- Concurrency
Control: Mechanisms to ensure that concurrent execution of
transactions does not violate data integrity or serializability.
- Locking
Techniques:
- Two-Phase
Locking (2PL): Transactions acquire locks before accessing data and
release them only after all operations are completed.
- Strict
2PL: All locks are held until the transaction commits.
- Shared
Locks: Allow concurrent reads.
- Exclusive
Locks: Prevent concurrent reads and writes.
- Timestamping
Protocols:
- Assign
timestamps to transactions.
- Use
timestamps to order operations and prevent inconsistencies.
- Validation-Based
Protocols:
- Transactions
execute without acquiring locks.
- At
commit time, the transaction is validated to ensure that no other
transaction has interfered with its operations.
- Multiple
Granularity:
- Data
is organized in a hierarchy (e.g., database, table, page, row).
- Locks
can be acquired at different levels of granularity.
- Multi-Version
Schemes:
- Maintain
multiple versions of data items.
- Allow
concurrent transactions to access different versions of the same data.
- Recovery
with Concurrent Transactions:
- More
complex than single-transaction recovery.
- Requires
careful logging and checkpointing to ensure correct recovery in the
presence of concurrent operations.
Distributed Databases
- Distributed
Database: A database system where the data is physically distributed
across multiple sites.
- Advantages:
Improved availability, performance, and modularity.
- Challenges:
Data replication, concurrency control, transaction management, query
processing.
Data Mining
- The
process of discovering interesting and previously unknown patterns and
knowledge from large datasets.
- Techniques:
Classification, clustering, association rule mining, regression.
Data Warehousing
- A
system for collecting, storing, and analyzing large volumes of data for
business intelligence and decision-making.
Object-Oriented Database Management Systems (OODBMS)
- OODBMS:
Store and manage complex objects, including their attributes, methods, and
relationships.
- Comparison
with Traditional DBMS:
- OODBMS:
Better suited for complex data types, object-oriented programming
languages, and applications requiring complex relationships and behavior.
- Traditional
DBMS: Simpler to implement and maintain, more widely used for
general-purpose applications.
Other Types of Databases
- Temporal
Databases: Store data with time-varying attributes.
- Deductive
Databases: Combine relational databases with logic programming.
- Multimedia
Databases: Store and manage multimedia data (e.g., images, audio,
video).
- Web
Databases: Designed to support web applications and interact with web
browsers.
- Mobile
Databases: Designed for use on mobile devices, often with limited
resources.
This overview provides a general understanding of the key
concepts in transaction processing, concurrency control, and other related
database topics. Remember that specific implementations and techniques can vary
significantly across different database systems.
Chapter
: 5
Study of
Relational Database Management Systems through Oracle/PL SQL QL/MySQL:
Architecture, physical files, memory structures, background process. Concept of
table spaces, segments, extents and block. Dedicated server, multi threaded
server. Distributed database, database links, and snapshot. Data dictionary,
dynamic performance view. Security, role management, privilege management,
profiles, invoker defined security model. SQL queries, Data extraction from
single, multiple tables equi- join, non equi-join, self -join, outer join.
Usage of like, any, all, exists, in Special operators. Hierarchical quires,
inline queries, flashback queries. Introduction of ANSI SQL, anonymous block,
nested anonymous block, branching and looping constructs in ANSI SQL. Cursor
management: nested and parameterized cursors, Oracle exception handling
mechanism. Stored procedures, in, out, in out type parameters, usage of
parameters in procedures. User defined functions their limitations. Triggers,
mutating errors, instead of triggers
Oracle/PL-SQL: A Deep Dive
Oracle Architecture
- Instance:
A single running copy of the Oracle server software.
- Physical
Files: Data files that store the actual data.
- Memory
Structures:
- System
Global Area (SGA): Shared memory area used by the instance.
- Shared
Pool: Caches frequently executed SQL statements, parse trees, and
data dictionary information.
- Buffer
Cache: Stores recently accessed data blocks from data files.
- Redo
Log Buffer: Stores changes made to the database to ensure data
recovery.
- Process
Global Area (PGA): Private memory area for each Oracle session.
- Background
Processes:
- DBWn
(Database Writer): Writes modified data blocks from the buffer cache
to data files.
- LGWR
(Log Writer): Writes redo log entries from the redo log buffer to the
redo log files.
- SMON
(System Monitor): Performs recovery tasks, such as instance recovery
and media recovery.
- PMON
(Process Monitor): Monitors user sessions and cleans up resources
when a session terminates abnormally.
Storage Structures
- Tablespaces:
Logical containers for data objects (tables, indexes, etc.).
- Segments:
Extents allocated to a specific data object within a tablespace.
- Extents:
Contiguous blocks of storage allocated to a segment.
- Blocks:
The smallest unit of storage in Oracle.
Server Types
- Dedicated
Server: Each user session has its own dedicated server process.
- Multithreaded
Server: Multiple user sessions can share a single server process.
Distributed Databases
- Distributed
Database: Data is physically distributed across multiple sites.
- Database
Links: Allow users to access data in remote databases.
- Snapshots:
Read-only copies of data from a remote database.
Data Dictionary
- Stores
metadata about the database, such as table definitions, indexes, users,
and privileges.
Dynamic Performance Views
- Provide
real-time information about the database, such as resource usage,
performance metrics, and session activity.
Security
- Role
Management: Groups of privileges that can be granted to users.
- Privilege
Management: Controls access to specific database objects (tables,
views, etc.).
- Profiles:
Define resource limits and security parameters for users.
- Invoker
Rights Model: Privileges are granted to the invoker of a stored
procedure or function.
SQL Queries
- Single
Table Queries: SELECT, WHERE, ORDER BY, GROUP BY, HAVING.
- Multiple
Tables:
- Equi-Join:
Joins tables based on equality of columns.
- Non-Equi-Join:
Joins tables based on conditions other than equality.
- Self-Join:
Joins a table to itself.
- Outer
Join: Returns all rows from one table and matching rows from the
other table, with null values for unmatched rows.
- Special
Operators: LIKE, ANY, ALL, EXISTS, IN.
- Hierarchical
Queries: Retrieve hierarchical data using CONNECT BY clause.
- Inline
Queries: Subqueries embedded within the WHERE or FROM clause.
- FlashBack
Queries: Query historical versions of data.
ANSI SQL and PL/SQL
- ANSI
SQL: Standard for SQL language.
- PL/SQL:
Procedural extension to SQL.
- Anonymous
Blocks: Named or unnamed blocks of PL/SQL code.
- Nested
Blocks: Blocks within other blocks.
- Branching:
IF-THEN-ELSE, CASE statements.
- Looping:
FOR loops, WHILE loops, LOOP-END LOOP.
Cursor Management
- Cursor:
A pointer that moves through a set of rows returned by a SQL query.
- Nested
Cursors: Cursors within other cursors.
- Parameterized
Cursors: Cursors with parameters that can be passed at runtime.
Exception Handling
- Exceptions:
Errors that occur during program execution.
- Exception
Handling Blocks: Use EXCEPTION keyword to handle exceptions.
Stored Procedures
- Stored
Procedures: Named PL/SQL blocks that can be executed by name.
- Parameters:
IN, OUT, IN OUT parameters for passing data to and from procedures.
User-Defined Functions
- Functions:
Named PL/SQL blocks that return a single value.
- Limitations:
Cannot modify database tables directly.
Triggers
- Triggers:
PL/SQL code that is automatically executed in response to database events
(INSERT, UPDATE, DELETE).
- Mutating
Table Errors: Errors that occur when a trigger tries to modify the
same table that fired the trigger.
- Instead
Of Triggers: Replace DML operations on a view with operations on the
underlying tables.
This comprehensive overview provides a foundational
understanding of Oracle/PL-SQL and its key components. Remember that specific
implementations and features may vary across different Oracle database
versions.
Comments
Post a Comment