Principles of Database Systems (H) Lecture 3 Notes

Zhige Chen Lv4

Retrieving Data from a Table

SELECT

To display the full content of a table, we can use

1
SELECT * FROM table_name
But this should not be used in a program, this is because:

  • If our program depends on some specific columns, altering the database will probably crashes our program.
  • Retrieving the whole table could waste unnecessary hardware resource and bandwidth.

And in most circumstances, we are only interested in a specific part of the database. So we need to filter the data by the WHERE clause. The conditions are usually expressed by a column name followed by a comparison operator and the value to which the content of the column is compared. e.g.

1
SELECT * FROM movies WHERE country = 'us'
Only the rows where the expression is true will be returned.

You can compare to a number, a string constant, another column or the result of a function.

If the original table doesn’t contain duplicates and forms a valid relation. Then the filtering result will also not contain duplicates and form a valid relation. Hence we can treat is as a “virtual table”, e.g.

1
2
3
SELECT * FROM 
(SELECT * FROM movies WHERE country = 'us') AS us_movies
WHERE year_released BETWEEN 1940 AND 1949
Note that reversing the filters will result in a behaviorally equivalent statement (but not necessarily equivalent in performance):
1
2
3
4
SELECT * FROM 
(SELECT * FROM movies WHERE year_released BETWEEN 1940 AND 1949)
AS movies_from_40_to_49
WHERE country = 'us'
Or for a simple query like this, we can just use a AND to combine the conditions:
1
2
SELECT * FROM movies
WHERE country = 'us' AND year_released BETWEEN 1940 AND 1949

Caution

AND has higher precedence than OR. Always precedence where they can help improve readability, even if they are unnecessary.

Relational operators:

  • equal =
  • not equal <> or !=
  • less than <
  • less or equal <=
  • greater than >
  • greater or equal >=

Caution

Beware that most database products implicitly convert one of the sides in a comparison between values of differing types, and the result can be quite unintuitive.

Tip

We can test a expression’s output by using

1
SELECT expression AS VALUE;

The IN operator can simplify chained OR conditions:

1
2
SELECT * FROM movies
WHERE country IN ('us', 'gb') AND year_released BETWEEN 1940 AND 1949

For filtering strings, we can use a regex-like pattern. The LIKE operator will filter all strings that fit the specified pattern. The pattern can have two wildcard:

  • % means “any number of character, including none”, and
  • _ means “one and only one character”.

For example, to select all movies that not have a ‘A’ in their name:

1
2
SELECT * FROM movies
WHERE title NOT LIKE '%A%' and title NOT LIKE '%a%'
Notice that the behaviorally equivalent statement
1
2
SELECT * FROM movies
WHERE upper(title) NOT LIKE '%A%'
is not good from a performance perspective.

Caution

It is not good to use function with columns in conditions.

Dates

Beware that the dates have three common formats: - DD/MM/YYYY - MM/DD/YYYY - YYYY/MM/DD To avoid possible issues, always explicity convert data types before using them.

1
2
3
4
5
6
-- bad
SELECT * FROM forum_posts WHERE post_date>= '2018-03-12';

-- good
SELECT * FROM forum_posts WHERE post_date>=date('2018-03-12');
SELECT * FROM forum_posts WHERE post_date>=date('12 March, 2018');

Another frequent mistake is the datetime value. If we compare the datetime s to dates. The SQL engine will not understand that the date part of the datetime value should be equal to the date value.

Also be aware that the comparisons involving dates and times can be somewhat unintuitive:

1
2
3
4
5
6
7
8
9
10
-- Good
WHERE issued >= <March 12>
WHERE issued >= <March 12 00:00:00>

-- Probably bad
WHERE issued <= <March 16>
WHERE issued <= <March 16 00:00:00>

-- The possible intention
WHERE issued <= <March 16 23:59:59>

Date arithmetic can also be problematic, e.g. adding 30 days and adding one month are different. And the syntax of different DBMS is also different. For example, adding a month to the specified date:

  • Microsoft SQLServer: dateadd(month, 1, date_col)
  • IBM DB2: date_col + 1 month
  • PostgreSQL: date_col + interval'1 month'
  • MySQL: date_add(date_col, interval 1 month)
  • Oracle: add_months(date_col, 1) or date_col + decimal_number
  • SQLite: date(date_col, '1 month')

NULL

Unlike in the most programming languages, in SQL, NULL is not a value. For example, the conditions

1
2
WHERE column_name = NULL
WHERE column_name != NULL
This two conditions will never be true. This can be intuitively understand as we are comparing the columns to a value we don’t know, so it will never be true. To test whether the column contains a value or not, we need to use the IS operator:
1
2
WHERE column_name IS NULL
WHERE column_name IS NOT NULL

Some Functions

We can of course query some specific columns instead all of them from the database:

1
SELECT title, year_released FROM movies WHERE country = 'us'
And we need to know the structure of the database to do this. We can see the description of the database by: - desc movies in Oracle, and MySQL - describe table movies in IBM DB2 - \d movies in PostgreSQL - .schema movies in SQLite

One important feature of DBMS is that we can retrieve transformed data through some operators and functions. For example the string concatenation: - 'hello' || 'world' in IBM DB2, Oracle, PostgreSQL, and SQLite - 'hello' + 'world' in SQLServer - concat('hello', 'world') in MySQL

1
2
3
4
SELECT title || ' was released in ' || cast(year_released AS varchar) 
AS movie_release
FROM movies
WHERE country = 'us'

Caution

For performance reasons that will be explained later, there is no issue applying a function (or transformation) to data that is returned or constants that were input. You should avoid, though, applying them to columns that are used for comparison.

Some useful functions:

  • trunc(num, digits)
  • round(num, digits)
  • floor(num)
  • ceiling(num)
  • upper(string)
  • lower(string)
  • substr(string, start, count)
  • trim(string)
  • replace(string, needle, replace)
  • cast(expr as type)

CASE...END

The CASE construct is similar to the switch in Java or other pattern matching constructs. For example

1
2
3
4
5
CASE upper(color)
WHEN 'Y' THEN 'Color'
WHEN 'N' THEN 'B&W'
ELSE '?'
END AS color
In a CASE statement, the value is tested against various values. If a match is found, the corresponding value is returned, If there is no match and there is no ELSE clause, a NULL is returned.

Same as the WHERE clauses, the NULL values cannot be tested directly in a WHEN clause. We need to use the following instead:

1
2
3
4
CASE
WHEN died IS NULL THEN 'alive and kicking'
ELSE 'passed away'
END AS status

  • Title: Principles of Database Systems (H) Lecture 3 Notes
  • Author: Zhige Chen
  • Created at : 2025-09-26 18:32:39
  • Updated at : 2025-10-10 13:19:14
  • Link: https://nofe1248.github.io/2025/09/26/dbh-03/
  • License: This work is licensed under CC BY-NC-SA 4.0.
Comments
On this page
Principles of Database Systems (H) Lecture 3 Notes