NULL Values and Three-Valued Logic
-
NULL: Represents missing values in a database with three possible interpretations:
- Value unknown: Value exists but is not known, or it's unknown whether the value exists
- Value unavailable: Value exists but is purposely withheld
- Value not applicable: The attribute doesn't apply to this tuple
-
Three-Valued Logic: SQL uses TRUE, FALSE, and UNKNOWN (instead of just TRUE/FALSE)
- Comparisons involving NULL produce UNKNOWN results
- Only tuples that evaluate to TRUE (not FALSE or UNKNOWN) are selected in WHERE clauses
- Use
IS NULL
orIS NOT NULL
for NULL comparisons (not=
or<>
)
Nested Queries
- Nested Query: A complete SELECT-FROM-WHERE query within another SQL query
- Outer Query: The query that contains the nested query
-
Comparison Operators:
IN
: Tests if a value is in a set of values= ANY
or= SOME
: Equivalent to IN> ALL
: True if value is greater than all values in the set- Other operators:
<
,<=
,>=
,<>
can be used with ANY, SOME, or ALL
-
Correlated Nested Queries: When a condition in the nested query references an attribute from the outer query
- Evaluated once for each tuple in the outer query
EXISTS and UNIQUE Functions
- EXISTS: Returns TRUE if the nested query returns at least one tuple
-- Find employees who have at least one dependent
SELECT Fname, Lname
FROM EMPLOYEE
WHERE EXISTS (
SELECT *
FROM DEPENDENT
WHERE EMPLOYEE.Ssn = DEPENDENT.Essn
);
-- Find employees who have no dependents
SELECT Fname, Lname
FROM EMPLOYEE
WHERE NOT EXISTS (
SELECT *
FROM DEPENDENT
WHERE EMPLOYEE.Ssn = DEPENDENT.Essn
);
EXISTS
and NOT EXISTS
.
Some database systems may require alternative approaches to achieve the same functionality.
-- Find projects where each assigned employee works the same number of hours
SELECT Pnumber, Pname
FROM PROJECT
WHERE UNIQUE (
SELECT Hours
FROM WORKS_ON
WHERE WORKS_ON.Pno = PROJECT.Pnumber
);
Joined Tables
- Joined Table: Specifies joins in the FROM clause rather than in the WHERE clause
- JOIN Types:
INNER JOIN
(orJOIN
): Returns only matching rows. If onlyJOIN
is specified in an SQL statement without any additional qualifier, anINNER JOIN
is executed. TheJOIN
keyword by itself is shorthand forINNER JOIN
in standard SQL. An inner join returns only the rows where there is a match between the tables being joined based on the specified join condition.
-- these two statements are equivalent SELECT * FROM Employee JOIN Department ON Employee.DeptId = Department.DeptId; SELECT * FROM Employee INNER JOIN Department ON Employee.DeptId = Department.DeptId;
LEFT OUTER JOIN
: All rows from left table, matched with right tableRIGHT OUTER JOIN
: All rows from right table, matched with left tableFULL OUTER JOIN
: All rows from both tablesNATURAL JOIN
: Implicit equi-join (a type of join that combines tables based on a matching column between them, specifically where the values in the specified columns are equal) on columns with same nameCROSS JOIN
: Cartesian product of tables. A CROSS JOIN performs a Cartesian product between two tables, combining each row from the first table with every row from the second table. This results in a table that has the number of rows equal to the product of the number of rows in the two input tables.
-- this query would give all possible combinations of EmployeeId, Name, DepartmentID, and DepartmentName
SELECT e.EmployeeId, e.Name, d.DepartmentId, d.DepartmentName
FROM Employee e
CROSS JOIN Department d;
Aggregate Functions
- COUNT: Returns number of tuples or values
- SUM: Returns sum of numeric values
- MAX: Returns maximum value
- MIN: Returns minimum value
- AVG: Returns average (mean) of values
- NULL Handling: NULLs are removed before calculation (except COUNT(*))
Grouping and HAVING
- GROUP BY: Partitions tuples into groups based on specified attribute values
- HAVING: Filters groups (applied after grouping)
- Key Points:
- WHERE filters individual tuples (before grouping)
- HAVING filters groups (after grouping)
-- this query -- Joins the Employee and Department tables -- Groups the results by department name -- Calculates two aggregate values for each group -- Count of employees in each department -- Average salary in each department -- Uses HAVING to filter the groups, showing only departments that have: -- An average salary greater than $50,000, AND -- At least 3 employees SELECT Department.Dname AS DepartmentName, COUNT(Employee.Ssn) AS EmployeeCount, AVG(Employee.Salary) AS AverageSalary FROM Employee JOIN Department ON Employee.Dno = Department.Dnumber GROUP BY Department.Dname HAVING AVG(Employee.Salary) > 50000 AND COUNT(Employee.Ssn) >= 3;
- SELECT clause with GROUP BY should include only grouping attributes and aggregate functions
WITH and CASE Clauses
- WITH: Defines a temporary table for use in a single query (similar to a temporary view)
- CASE: Allows different values based on specific conditions
- Can be used in SELECT, INSERT, and UPDATE statements
Recursive Queries
- Introduced in SQL:99 for handling hierarchical/recursive relationships
- Uses
WITH RECURSIVE
clause - Example applications: organizational hierarchies, bill of materials
-- this query
-- Defines a recursive Common Table Expression (CTE) named EmployeeHierarchy
-- Starts with a base case that selects the top manager (employee ID 100)
-- Combines this with a recursive part that finds all direct reports to anyone already in the result
-- Tracks the level in the hierarchy with a counter
-- The outer query formats the results to show the reporting structure visually through indentation
WITH RECURSIVE EmployeeHierarchy AS (
-- Base case: start with the top manager (employee ID 100)
SELECT EmployeeID, Name, Title, ManagerID, 1 AS Level
FROM Employees
WHERE EmployeeID = 100
UNION ALL
-- Recursive case: find all employees who report to someone in our current result set
SELECT e.EmployeeID, e.Name, e.Title, e.ManagerID, eh.Level + 1
FROM Employees e
JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT
EmployeeID,
Name,
Title,
Level,
CONCAT(REPEAT(' ', Level-1), Name) AS HierarchyDisplay
FROM EmployeeHierarchy
ORDER BY Level, Name;
-- The query will return all employees in the hierarchy, with each employee's level and a formatted display showing the organizational structure.
Views
- View: A virtual table derived from other tables (base tables or other views)
-
Benefits:
- Simplifies query writing
- Provides security (restricting access to certain columns/rows)
- Offers data independence
-
View Implementation:
- Query modification: Transforms view query into a query on base tables
- View materialization: Physically creates a temporary table
-
View Updates:
- Single-table views may be updatable if they contain the primary key
- Views with joins or aggregate functions are generally not updatable
- Use
WITH CHECK OPTION
to enforce view update constraints
Schema Changes
-
DROP Commands:
DROP SCHEMA
: Removes an entire schemaDROP TABLE
: Removes a table- Options:
CASCADE
(removes dependent objects) orRESTRICT
(fails if dependencies exist)
-
ALTER Commands:
ALTER TABLE ADD COLUMN
: Adds a new columnALTER TABLE DROP COLUMN
: Removes a columnALTER TABLE ALTER COLUMN
: Modifies column propertiesALTER TABLE ADD CONSTRAINT
: Adds a constraintALTER TABLE DROP CONSTRAINT
: Removes a constraint
Constraints and Triggers
-
Assertions: General constraints using CREATE ASSERTION
- Can specify complex integrity rules across multiple tables
-
Triggers: Automatic actions performed when events occur
- Components: Event, Condition, Action (ECA rules)
- Events: INSERT, UPDATE, DELETE operations
- Can execute before or after the triggering event
- Used for maintaining consistency, monitoring updates, and automating actions
Query Execution Order
Conceptual order of query evaluation:
- FROM clause (identify tables and perform joins)
- WHERE clause (filter tuples)
- GROUP BY (create groups)
- HAVING (filter groups)
- SELECT (select columns)
- ORDER BY (sort results)
SQL Relations vs. ER Model Relations
Relations in SQL tables and those in the formal ER model differ in several important ways:
-
Mathematical vs. Implementation: ER relations are based on mathematical set theory where relations are strictly defined as sets of tuples with no duplicates. SQL tables are physical implementations that can contain duplicate rows.
-
Ordering: ER relations have no inherent ordering of tuples or attributes, while SQL tables can have physical ordering and indexed access paths.
-
Null Values: Formal ER relations don't typically accommodate null values, while SQL tables explicitly support nulls.
-
Constraints: ER model constraints are represented conceptually, while in SQL they're implemented through specific declaration mechanisms.
Terminology Differences
ER Model | SQL |
---|---|
Entity | Row or Record |
Attribute | Column |
Relationship | Foreign Key connection |
Entity Set | Table |
Domain | Data Type |
Tuple | Row |
SQL and Duplicate Tuples
SQL allows duplicate tuples (rows) in tables and query results for several reasons:
- Performance: Enforcing uniqueness requires additional processing overhead.
- Practical Applications: Many real-world applications need duplicates (e.g., transaction records).
- Query Operations: Operations like joins and aggregations often produce meaningful duplicates.
- Multiset Semantics: SQL is based on multiset (bag) rather than set semantics.
To eliminate duplicates, SQL provides the DISTINCT
keyword.
SQL Data Types
Common SQL data types include:
- Numeric: INTEGER, SMALLINT, BIGINT, DECIMAL, NUMERIC, FLOAT, REAL, DOUBLE PRECISION
- Character: CHAR, VARCHAR, TEXT
- Boolean: BOOLEAN
- Temporal: DATE, TIME, TIMESTAMP, INTERVAL
- Binary: BLOB, BINARY, VARBINARY
- Specialized: XML, JSON, UUID, ARRAY, ENUM
- Spatial: GEOMETRY, GEOGRAPHY (in some implementations)
SQL Integrity Constraints
Entity Integrity
SQL implements entity integrity through:
PRIMARY KEY
constraintsUNIQUE
constraintsNOT NULL
constraints
Referential Integrity
Implemented through:
FOREIGN KEY
constraints with references to primary or unique keys
Referential Triggered Actions
SQL supports referential triggered actions through:
ON DELETE
: CASCADE, SET NULL, SET DEFAULT, RESTRICT, NO ACTIONON UPDATE
: CASCADE, SET NULL, SET DEFAULT, RESTRICT, NO ACTION
Example:
FOREIGN KEY (dept_id) REFERENCES Department(id)
ON DELETE CASCADE
ON UPDATE SET NULL
SQL Query Syntax
A simple SQL retrieval query consists of four main clauses:
-
SELECT (required)
- Attribute list or expressions
- Aggregate functions (COUNT, SUM, AVG, MIN, MAX)
- DISTINCT keyword to eliminate duplicates
- Arithmetic expressions
- String functions
-
FROM (required)
- Table names
- Table aliases
- Join operations (INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN)
- Subqueries as derived tables
-
WHERE (optional)
- Comparison conditions (=, <>, <, >, <=, >=)
- Logical operators (AND, OR, NOT)
- Pattern matching (LIKE, NOT LIKE)
- Membership tests (IN, NOT IN)
- Existence tests (EXISTS, NOT EXISTS)
- NULL tests (IS NULL, IS NOT NULL)
- Subqueries
-
ORDER BY (optional)
- Column names or positions for sorting
- ASC or DESC for sort direction
- Multiple sort keys
Example:
SELECT e.first_name, e.last_name, d.name AS department
FROM employees e
INNER JOIN departments d ON e.dept_id = d.id
WHERE e.hire_date > '2020-01-01' AND d.location = 'New York'
ORDER BY e.last_name, e.first_name;