Database systems

Diomidis Spinellis
Department of Management Science and Technology
Athens University of Economics and Business
Athens, Greece
dds@aueb.gr

Advantages

Problems and Limitations

Levels of Abstraction

Physical schema
Defines how data is stored
Conceptual schema or logical schema
Defines data in terms of a data model
External schema or view level
Defines a number of simplified domain-specific views

DBMS Levels of Abstraction

DBMS Levels of Abstraction

Data Independence

The three different levels of abstraction allow us to:

Database Design

  1. Requirements analysis
  2. Conceptual database design: develop high level description
  3. Logical database design: map description into the specific DBMS
  4. Schema refinement: identify and solve potential problems
  5. Physical database design: optimize for specific workloads
  6. Security design

Entities and Attributes

Entity
Object in the real world
Attributes
Describe each entity
Entity set
Group of similar entities (share same attributes)
Attribute domain
Values that can be used for the attribute
Key
Minimal set of attributes that uniquely identify an entity
Primary key
Designated among a number of candidate keys

The books entity set

The books entity set

The Entity Relationship Model

Relationship
Association between two entities
Relationship set
Group of similar relationships
Descriptive attributes
can identify a relationsip
A relationship can be identified as:

The Published relationship

The Published relationship

The Relational Model

Relation Schema

Relation Instance

(Table of rows with the same number of fields)
ISBNTitlePrice
0-387-02620-7Beyond Fear29.00
0-201-79940-5Code Reading49.95
0-07-246535-2Database Management Systems40.00
0-19-502402-8The Timeless Way of Building60.00

Redundancy

Storing redundant information in a database results in:

Example

ISBNTitlePriceDiscountReduced Price
0-387-02620-7Beyond Fear30.001027.00
0-201-79940-5Code Reading50.002040.00
0-07-246535-2Database Management Systems40.002038.00
0-19-502402-8The Timeless Way of Building60.001054.00

Dependencies are resolved by decomposing the relations.

The Structured Query Language

Language used for relational DBMSs.

Used:

SQL Table Definition and Data Manipulation

Create the Table

CREATE TABLE Books (
        isbn CHAR(13),
        title CHAR(100),
        price real)

Add a Record

INSERT
INTO Books (isbn, title, price)
VALUES ("0-201-79940-5""Code Reading"49.95)

Modify Records

UPDATE Books B
SET B.price = B.price * 1.1

Delete a Record

DELETE
FROM Books B
WHERE ISBN="0-201-79940-5"

SQL Queries

Display the Details for a Record

SELECT *
FROM Books
WHERE ISBN="0-201-79940-5"

Display Records Based on a Condition

SELECT title, price
FROM Books
WHERE price > 50 AND price > 10

Perform an Aggregate Calculation

SELECT AVG(price)
FROM Books

Join Two Tables

SELECT title, authorname
FROM Books, Authors
WHERE Books.authorid = Authors.authorid

Query By Example

Query by Example in Microsoft Access
SELECT BOOKS.title, BOOKS.price, BOOKS.price, BOOKS.isbn
FROM BOOKS
WHERE (((BOOKS.price)>100)) OR (((BOOKS.price)<10))
ORDER BY BOOKS.title;

Additional SQL Query Constructs

SQL provides many sophisticated capabilities:

Views

By defining appropriate views on a schema we
CREATE VIEW BrowseBooks(isbn, title)
        AS SELECT B.isbn, B.title
        FROM Books B
        WHERE B.price < 100

Indexes

An index is an auxilliary data structure used to speed up operations that are not efficiently supported by a table's record organisation.
CREATE INDEX IndPrice ON Books
WITH STRUCTURE = BTREE
KEY = (price)
To design the index structure take into account:

Security Issues

Our goal in designing a secure database is to achieve: Discretionary access control provides us the capability to give (and revoke) rights to specific users or groups.

Examples

GRANT SELECT
ON BrowseBooks
TO WebUsers

REVOKE INSERTDELETE
ON Books
From Alice

GRANT INSERTDELETE
ON Books
TO InventoryGroup

GRANT UPDATE(price)
ON Books
TO MarketingGroup

GRANT UPDATE(title, isbn)
ON Books
TO MaintenanceGroup

Transaction Management

Four important properties (ACID):
Atomic
Database ensures that all actions are carried out, or none
Consistency
Users ensure that transactions leave the data in a consistent state
Isolation
Users to not have to worry about concurrently executing transactions
Durability
Completed transactions persist after a crash even if the database has not been updated

Crash Recovery

The Database Administrator

Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise's information resources and needs.

The database administrator's duties include:

User Profiles

Users are differentiated by the way they expect to interact with the system
Application programmers
interact with system through DML calls
Sophisticated users
form requests in a database query language
Specialized users
write specialized database applications that do not fit into the traditional data processing framework
Naive users
invoke one of the permanent application programs that have been written previously

RAID Storage

Redundant arrays of independent disks (RAID) are used to overcome two bottlenecks associated with disk storage: By using additional redundant disks we can increase both. The following are some typically identified RAID levels:
0: Nonredundant
Data is distributed across different disks to increase performance
1: Mirrored
A second disk set keeps a copy of the data (50% overhead)
0+1 (or 10): Striping and mirroring
Data is distributed across a second disk set to increase performance
2: Error correcting codes
Additional check disks (4:3, 10:4, 25:5) are used to provide redundancy with a smaller overhead (e.g. 57%, 71%, 83%)
3: Bit-interleaved parity
A single additional check disk is used to recover the data by distributing bits across all disks
4: block-interleaved parity
The check disk contains the parity on a block level: higher read throughput
5: block-interleaved distributed parity
Block parity is distributed across all disks: higher read and write throughput
6: higher redundancy
Like 5 with an additional check disk guarding against a second failure

XML Concepts

XML Syntax

An XML document is structured on the following building blocks:
Elements
Delimited by starting and ending tags
Example:
        <title>The Lord of the Rings</title>
Elements can also be empty; a special shortcut makes them smaller:
Example:
        <dvd></dvd>
is the same as
        <dvd />
Attributes
Name=valua pairs in the starting tag, that can be used for metadata
Example:
        <title lang="english">The Lord of the Rings</title>
Entity references
Are used to refer to special (reserved) symbols
SymbolEntity reference
<&lt;
>&gt;
&&amp;
"&quot;
'&apos;
Comments
Delimited by the <!-- and -> sequence
Example:
        <!-- this is a comment ->

Example

<!-- List of shipped books -->
<booklist>
        <book>
                <isbn>0-387-02620-7</isbn>
                <title>Beyond Fear</title>
                <author>
                        <givenname>Bruce</givenname>
                        <familyname>Schneier</familyname>
                </author>
                <price>30.00</price>
        </book>
        <book>
                <isbn>0-201-79940-5</isbn>
                <title>Code Reading</title>
                <author>
                        <givenname>Diomidis</givenname>
                        <familyname>Spinellis</familyname>
                </author>
                <price>50.00</price>
        </book>
        <book>
                <isbn>0-07-246535-2</isbn>
                <title>Database Management Systems</title>
                <author>
                        <givenname>Raghu</givenname>
                        <familyname>Ramakrishnan</familyname>
                </author>
                <author>
                        <givenname>Johannes</givenname>
                        <familyname>Gehrke</familyname>
                </author>
                <price>40.00</price>
        </book>
        <book>
                <isbn>0-19-502402-8</isbn>
                <title>The Timeless Way of Building</title>
                <author>
                        <givenname>Cristopher</givenname>
                        <familyname>Alexander</familyname>
                </author>
                <price>60.00</price>
        </book>
        <book>
                <isbn>3-8218-0479-3</isbn>
                <title lang="german">Lexikon der Populaeren Irrtuemer</title>
                <author>
                        <givenname>Walter</givenname>
                        <familyname>Kraemer</familyname>
                </author>
                <author>
                        <givenname>Goetz</givenname>
                        <familyname>Trenkler</familyname>
                </author>
                <price>40.00</price>
        </book>
</booklist>

Document Type Definitions

Example:
  <!ELEMENT booklist (book)*>
    <!ELEMENT book (isbn,title,author+,price,edition?)*>
      <!ELEMENT isbn (#PCDATA)>
      <!ELEMENT title (#PCDATA)>
      <!ATTLIST title lang (german|english) "german">
      <!ELEMENT author (givenname,familyname)>
        <!ELEMENT givenname (#PCDATA)*>
        <!ELEMENT familyname (#PCDATA)>
      <!ELEMENT price (#PCDATA)>
      <!ELEMENT edition (#PCDATA)>

Bibliography