Conceptual information on relational databases and tables is of course entirely moot if you don't have any idea of
how to directly interact with your data. From a general perspective, SQL consists entirely of structured
statements, with which all data in the database is added, modified, and removed. These statements form
the basis for your communication with the PostgreSQL server.
The following sections dissect the anatomy of a SQL statement into its structural pieces, explaining the significance of each, and their relation to one another. The standard PostgreSQL command-line client, psql, provides output to display example PostgreSQL statements.
Our SQL examples commonly take place within an example database called
booktown, the database for our imaginary bookstore, Book Town. The output from
psql is consistently prefixed with a default prompt style, which looks like this:
booktown=#
Some simpler examples may use our generic test database, testdb, if
not specific to the Book Town examples. By default, the psql prompt displays only the name
of the connected database and the =# characters indicating that the system is ready for a
new command (though you will see that the = symbol will change dynamically as
psql tracks the status of SQL input). We display this prompt along with the SQL input and
output in order to help familiarize you with the psql output.
Chapter 4 documents psql in more detail, and it is only mentioned
here to explain the source and style of this book's SQL examples using PostgreSQL.
Note: The schema (with sample data) for the booktown database can be found in the
booktown.sql file, on the CD-ROM. To install this database, type
psql -U postgres template1 -f /mnt/cdrom/booktown.sql from the command line (where
/mnt/cdrom is the path to your mounted CD, and postgres is your
PostgreSQL superuser).
SQL statements always begin with a command (a word, or group of words, that describes
what action the statement will initiate). The command can be called the verb
of the SQL statement, as it always describes an action to be taken. Statements typically contain one or more
clauses, which are formal modifiers that further describe the function of the SQL statement.
Table 3-2 contains a list of some of the most commonly used PostgreSQL commands.
Table 3-2. Fundamental PostgreSQL commands
Command
Description
CREATE DATABASE
Creates a new database
CREATE INDEX
Creates a new index on a table column
CREATE SEQUENCE
Creates a new sequence in an existing database
CREATE TABLE
Creates a new table in an existing database
CREATE TRIGGER
Creates a new trigger definition
CREATE VIEW
Creates a new view on an existing table
SELECT
Retrieves records from a table
INSERT
Adds one or more new records into a table
UPDATE
Modifies the data in existing table records
DELETE
Removes existing records from a table
DROP DATABASE
Destroys an existing database
DROP INDEX
Removes a column index from an existing table
DROP SEQUENCE
Destroys an existing sequence generator
DROP TABLE
Destroys an existing table
DROP TRIGGER
Destroys an existing trigger definition
DROP VIEW
Destroys an existing table view
CREATE USER
Adds a new PostgreSQL user account to the system
ALTER USER
Modifies an existing PostgreSQL user account
DROP USER
Removes an existing PostgreSQL user account
GRANT
Grant rights on a database object to a user
REVOKE
Deny rights on a database object from a user
CREATE FUNCTION
Creates a new SQL function within a database
CREATE LANGUAGE
Creates a new language definition within a database
CREATE OPERATOR
Creates a new SQL operator within a database
CREATE TYPE
Creates a new SQL data type within a database
While obviously code-like in nature, SQL was designed with ease of use and readability in mind. As a result, SQL
statements often bear a strong resemblance to simple, instructional English sentences. A strong feature of SQL is that its
statements are designed to instruct the server what data to find, not literally
how to find it, as you would be forced to do in an ordinary programming language. Reading a
well-designed SQL query should be nearly as easy as reading an ordinary sentence.
Note: In SQL texts, the word query is frequently used interchangeably with
statement. In order to be clear, within this book the term query is used
only to refer to statements which return data (e.g., SELECT statements),
rather than general SQL statements, which may instead create, add, or modify data.
Internally, PostgreSQL interprets structured SQL statements as a sequence of tokens, usually
delimited by whitespace (spaces or newlines, outside of quotes), though some tokens may be placed adjacently if there is no
chance of ambiguity (such as when operators are placed directly next to identifiers). A token in this context is a
word or character that can be identified meaningfully by the server when the SQL statement is parsed,
or interpreted.
Technically, each token can either be considered a
keyword, an identifier, a
quoted identifier, a
constant (also called a
literal ), or one of several special character
symbols. Keywords are words PostgreSQL recognizes as words with pre-defined SQL or PostgreSQL-specific meanings; these include SQL
commands, clauses, function names, and special
noise terms, which are often accompanied
optionally with SQL commands (e.g., the noise term
WORK in the COMMIT command). In contrast, identifiers represent
variable names for tables, columns, and any other database object.
Both keywords and identifiers reference internally defined functions, values, or records, as far as
PostgreSQL is concerned. Constants, on the other hand, describe pieces of data that are interpreted literally, such as a number or character
string.
Finally, a SQL statement contains special character symbols. These are reserved characters (such as parentheses, the semicolon, and square brackets) that logically affect the meaning and arrangement of your keywords,
identifiers, and literals. You can think of these characters as the punctuation for your SQL statements.
Operators fall under the category of special character symbols; they can be used to imply
logical operations or evaluations between data values (either literals, or represented by identifiers), and are generally
between one and four characters in length.
The following sections explain and expand upon the nature of these elementary components of SQL.
As described in the preceding section, each sequential element of a SQL statement is considered a token. What may
not be immediately clear, however, is that tokens may be kept all on the same line, or they may be split across several
lines, as extra whitespace is ignored by PostgreSQL's parser.
Consider the SQL statement in Example 3-1, which is executed first on a single line, and then
executed again, split across two separate lines. Both SELECT statements instruct the
database to display the entire contents of the my_list table:
Example 3-1. Spaces and newlines
testdb=# SELECT * FROM my_list;
todos
------------------------------------------------
Pick up laundry.
Send out bills.
Wrap up Grand Unifying Theory for publication.
(3 rows)
testdb=# SELECT *
testdb-# FROM
testdb-# my_list;
todos
------------------------------------------------
Pick up laundry.
Send out bills.
Wrap up Grand Unifying Theory for publication.
(3 rows)
In Example 3-1 there are several
newlines and spaces between the second statement's tokens. As you
can see by the identical output, PostgreSQL ignores the extra
newlines and spaces, making both statements semantically equivalent.
You can take advantage of this behavior by splitting a long string
of tokens across numerous lines for improved readability of your SQL
statement. This probably isn't necessary for statements as simple
as those in Example 3-1, but it can be quite
helpful when dealing with complex SQL statements with numerous
clauses, expressions, and conditions. Throughout this book we will
periodically split some statements over several lines to help show
what each part of the statement is intended to accomplish.
Keywords are any reserved SQL terms which have a reserved syntactic meaning to the server. Some common keywords are INSERT, UPDATE,
SELECT, and DELETE.
All SQL commands are keywords, though many keywords themselves are not complete commands. For instance, the command
INSERT INTO is a valid SQL command, and the word INTO is
a reserved keyword. As you might guess, however, the word INTO has no particular
significance when used out of context.
Identifiers, as described earlier, are variable names that reference database objects. These names are arbitrarily
designated by the creator of the database object upon creation. The objects which can be referred to by identifiers in
PostgreSQL may be databases, tables, columns, indices, views, sequences, rules, triggers, or functions.
Example 3-2 adds three pieces of information about Oregon into a simple table called
states.
Example 3-2. Keywords and commands
booktown=# INSERT INTO states VALUES (33, 'Oregon', 'OR');
INSERT 3389701 1
In Example 3-2, the INSERT INTO SQL
command makes use of the SQL keywords
INSERT, INTO, and
VALUES.
The INSERT INTO command modifies the table referenced by the states identifier. The modification in this case is the insertion of a new
record.
While not normally required, quotes can be used around identifiers, meaning they should be interpreted literally.
For example, if we want to view each of the columns from a table called states, a simple
statement to achieve this would ordinarily read:
booktown=# SELECT * FROM states;
id | name | abbreviation
----+------------+--------------
33 | Oregon | OR
42 | Washington | WA
(2 rows)
The keywords in this statement are SELECT and
FROM, while the identifiers are the asterisk *
(indicating all columns), and states (the table name). With this command, we are
selecting all columns from a table named states and thereby viewing its contents.
You can accomplish the same thing by putting quotes around the identifier, with the following
statement:
booktown=# SELECT * FROM "states";
id | name | abbreviation
----+------------+--------------
33 | Oregon | OR
42 | Washington | WA
(2 rows)
As you can see, the output is identical when applying quotes to a lowercase identifier.
However, the following statement, which uses quotes around the stAtes
identifier, will fail:
booktown=# SELECT * FROM "stAtEs";
ERROR: Relation 'stAtEs' does not exist
This statement fails because it instructs PostgreSQL to look for a table called, literally,
stAtEs (rather than states). In other words, with the
use of quotes, the statement has explicitly requested that PostgreSQL interpret the identifier name
literally.
All non-quoted identifiers are folded, or converted, to lowercase. When specifying
stAtEs, or STATES (i.e., any combination of
uppercase or lowercase letters) without quotes, PostgreSQL automatically converts the
identifier to lowercase (states) before processing the statement.
Note: The folding of unquoted identifiers to lowercase names is a PostgreSQL-specific convention. The
SQL92 standard specifies that unquoted identifiers always be converted to uppercase. For both legacy and
readability reasons, PostgreSQL does not intend to move to this part of the SQL92 standard.
This should be of special note to database administrators familiar with other SQL products, such as
Oracle, who expect case to automatically change to uppercase. If you are a developer, and you
are interested in writing easily portable applications, be sure to consider this case issue to avoid conflicts over this convention.
Since the parser can still read and understand mixed-case statements (provided that they are formed with the
correct syntax), you should use uppercase and lowercase terminology carefully. Your use of case can both
help and hinder your efficiency when working with a large amount of SQL.
We recommend that, for readability, you try typing identifiers in lowercase and keywords in uppercase, the convention used
throughout this book. By visually separating the fixed, systematic terminology from the user-defined data objects, you make it a great deal easier to quickly read and understand complex SQL statements.
The only instances where you are required to use quotes are either when a database object's
identifier is identical to a keyword, or when the identifier has at least one capitalized letter in its name. In either
of these circumstances, you must remember to quote the identifier both when creating the object, as well as in any
subsequent references to that object (e.g., in SELECT,
DELETE, or UPDATE statements).
If you do not quote an identifier that is spelled identically to an existing keyword, PostgreSQL will
return an error message because it interprets the intended identifier as a keyword. For
instance, if you had a table whose name was literally select, you would get an error message if you tried
querying it with the following statement:
testdb=# SELECT * FROM select;
ERROR: parser: parse error at or near "select"
As you can see, an unquoted query on a table called select produces an
error message. To specify that select is in fact a table, and not a keyword,
it needs to be placed inside of quotes. Therefore, the correct syntax to view a table named select
is as follows.
Remember that any identifiers with at least one capitalized letter must be treated similarly. For example, if
you've for some reason created a table named ProDucts (notice the capitalized "P" and
"D"), and you want to destroy it (as you probably should, with a name like that!), then once again the identifier needs
to be quoted in order to accurately describe its name to PostgreSQL, as follow:
booktown=# DROP TABLE ProDucts;
ERROR: table "products" does not exist
booktown=# DROP TABLE "ProDucts";
DROP
This technique can be extremely useful in some circumstances, even if you never name database objects with
these criteria yourself. For example, importing data through an external ODBC connection (e.g., via Microsoft Access)
can result in table names with all capitalized letters. Without the functionality of quoted identifiers, you would
have no way to accurately reference these tables.
Both keywords and identifier names in PostgreSQL have a maximum length limit of 31 characters. Parsed keywords or
identifiers over that length limit are automatically truncated. Identifiers may begin with any letter (a through z), or with an underscore,
and may then be followed by letters, numbers (0 through 9), or underscores. While keywords are not permitted to start or end
with an underscore, identifier names are permitted to do so. Neither keywords nor identifiers should ever begin with a
number.
In the Section called When quotes are required" we described how quoted identifiers could be used to "overrule" the case
insensitivity of identifiers by placing quotes around them. The same rule-bending can apply to the assertion that an identifier
cannot begin with a number. While PostgreSQL will not allow you to create a table using the name
1st_bent_rule without quotes, the name is acceptable if it is surrounded with quotes.
Example 3-3 first fails in trying to create an illegally named table. It then proceeds to bend the
rules with quotes.
Example 3-3. Bending rules
booktown=# CREATE TABLE 1st_bent_rule (rule_name text);
ERROR: parser: parse error at or near "1"
booktown=# CREATE TABLE "1st_bent_rule" (rule_name text);
CREATE
Furthermore, while quotes themselves are, of course, not allowed within the set of quotes to refer to a table name,
other normally illegal characters are allowed, such as spaces and ampersands. Take note that while
the ANSI/ISO SQL standard forbids using identifiers with the same names as SQL keywords, PostgreSQL (like many other SQL
implementations) has a similarly relaxed view on this, allowing you to force such names with quoted identifiers.
Remember that while the use of quotes can be a useful trick to know for unusual circumstances, if you wish to design portable,
standard SQL statements and relations, it is best to adhere to ANSI/SIO standards whenever possible.
While much of the data in working with a database is stored on the disk and referred to via identifiers (e.g., table
names, column names, and functions), there are obviously times when new data must be introduced to the system. This may be
observed when inserting new records, when forming clauses to specify criteria to delete or modify, or even when
performing calculations on existing records. This data is input through constants, which are sometimes called literals
because they literally represent a value in a SQL statement (rather than referencing an existing value by identifier).
An implicitly typed constant is one whose type is recognized automatically
by PostgreSQL's parser merely by its syntax. PostgreSQL supports five types of implicitly typed constants:
A string constant is an arbitrary sequence of characters bound by single quotes (apostrophes). These are
typically used when inserting character data into a table or passing character data to any other database object. A
practical example of the necessity of string constants is updating the first and last names of
authors in Book Town's authors table:
booktown=# SELECT * FROM authors;
id | last_name | first_name
-------+-----------+------------------
1809 | Geisel | Theodor Seuss
1111 | Denham | Ariel
15990 | Bourgeois | Paulette
25041 | Bianco | Margery Williams
16 | Alcott | Luoisa May
115 | Poe | Edgar Allen
(6 rows)
Looking at this table's contents, it might stand out to you that the first_name with
id 16, Louisa May has been misspelled as
Luoisa May. To correct this, an UPDATE statement can be made with a string
constant, as shown in Example 3-4.
Example 3-4. Using string constants
booktown=# UPDATE authors
booktown-# SET first_name = 'Louisa May'
booktown-# WHERE first_name = 'Luoisa May';
UPDATE 1
booktown=# SELECT * FROM authors;
id | last_name | first_name
-------+-----------+------------------
1809 | Geisel | Theodor Seuss
1111 | Denham | Ariel
15990 | Bourgeois | Paulette
25041 | Bianco | Margery Williams
15 | Poe | Edgar Allen
16 | Alcott | Louisa May
(6 rows)
The UPDATE statement made in Example 3-4 uses the string
constants Louisa May and Luoisa May in conjunction with the
SET and WHERE keywords. This statement updates the
contents of the table referenced by the authors identifier and, as shown, corrects the
misspelling.
The fact that string constants are bound by single quotes presents an obvious semantic problem, however, in that
if the sequence itself contains a single quote, the literal bounds of the constant are made ambiguous. To
escape (make literal) a single quote within the string, you may type two adjacent single
quotes. The parser will interpret the two adjacent single quotes within the string constant as a single, literal quote. PostgreSQL will also allow single quotes to be embedded by using a C-style backslash:
testdb=# SELECT 'PostgreSQL''s great!' AS example;
example
---------------------
PostgreSQL's great!
(1 row)
booktown=# SELECT 'PostgreSQL\'s C-style slashes are great!' AS example;
example
-----------------------------------------
PostgreSQL's C-style slashes are great!
(1 row)
PostgreSQL also supports the C-style "backslash escape" sequences, which are listed in Table 3-3.
ASCII character with the corresponding octal number xxx
Warning
As a result of the backslashes' special meaning described in Table 3-3, in order to include a backslash in the string you
must escape it using a another backslash (e.g., 'A single backslash is: \\'
will transform the pair of backslashes into a single backslash).
When entering two quoted character strings to PostgreSQL that are separated by some amount of whitespace, and where that
whitespace includes at least one newline, the strings are concatenated and viewed as if they had been typed as one constant. This
is illustrated in Example 3-5.
Example 3-5. Multiline string constants
booktown=# SELECT 'book'
booktown-#
booktown-# 'end' AS example;
example
---------
bookend
(1 row)
booktown=# SELECT 'bookend' AS example;
example
---------
bookend
(1 row)
As you can see, the semantics of the two statements is equivalent.
However, at least one newline is required for this interpretation to be possible,
as spaces alone would result in the following error:
booktown=# SELECT 'book' 'end' AS mistake;
ERROR: parser: parse error at or near "'"
This error occurs because without a newline, PostgreSQL will assume that you are referring to two separate constants. If you
wish to concatenate two string constants this way on a single line, PostgreSQL supports the
|| operator for text concatenation (see Chapter 5, for more details on this operator).
booktown=# SELECT 'book' || 'end' AS example;
example
---------
bookend
(1 row)
Bit string constants provide a way to directly represent a binary value with an arbitrary sequence of ones and
zeroes. Similarly to string constants, they are bound by single quotes, but they also must be preceded by a leading
B character (which may be uppercase or lowercase). This character identifies to
PostgreSQL that the forthcoming constant is a bit string, and not a normal string of character data.
Syntactically, the opening single quote must follow immediately after the leading
B, and the bit string may not contain any character other than
0 or 1. While there cannot be whitespace within this
string of bits, it can be continued across multiple lines just like regular string constants, as documented in
the Section called String constants."
Bit string constants are generally only useful when working with tables or functions that require binary values.
Example 3-6 demonstrates the use of a bit string constant upon a simple table containing raw
bytes. A bit string byte is inserted into a list of bytes in the my_bytes table, and
insertion is verified with a simple query.
Example 3-6. Using bit string constants
testdb=# INSERT INTO my_bytes VALUES (B'00000000');
testdb=# SELECT my_byte FROM my_bytes;
my_byte
----------
10000000
10000001
10000101
11111111
00000000
(5 rows)
Integer constants are far more frequently used than bit string constants. PostgreSQL identifies an integer constant
as any token that consists solely of a sequence of numbers (without a decimal point) and that is outside of
single-quotes. Technically, SQL defines integer constants as a sequence of decimal digits with no decimal point.
The range of values available for an integer constant depends largely on the context within which it is used, but PostgreSQL's default for the
integer data type is a 4-byte signed integer, with range from –2147483648 to 2147483647.
Integer constants are used anywhere you wish to represent a literal integer value. They are used frequently
within mathematical operations, as well as in SQL commands that reference a column with an integer data type. Example 3-7 is a simple demonstration of the use of integer constants to update an author's numeric
identifier via an UPDATE command.
Consider once again the authors table used in previous sections, which correlates
a numeric author identifier with two character strings representing the author's first and last name. Suppose that, for
administrative reasons, it has been deemed necessary that any author with an identifier of less than 100 must be modified
to a value of more than 100.
The first step to correct this would be to locate any author with such an id
value. An integer constant can first be used in a SELECT statement's
WHERE clause to perform a less-than comparison to check.
Example 3-7. Using integer constants
booktown=# SELECT * FROM authors WHERE id < 100;
id | last_name | first_name
-------+-----------+------------------
16 | Alcott | Louisa May
(1 row)
booktown=# SELECT * FROM authors WHERE id = 116;
id | last_name | first_name
-------+-----------+------------------
(0 rows)
booktown=# UPDATE authors
booktown-# SET id = 116
booktown-# WHERE id = 16;
UPDATE 1
booktown=# SELECT * FROM authors WHERE id = 116;
id | last_name | first_name
-------+-----------+------------------
116 | Alcott | Louisa May
(1 row)
In Example 3-7, the WHERE clause in the
SELECT statement compares the id column identifier
against an integer constant of 100, returning one row. Once the author with the offending
id is found, a second SELECT statement is issued to
check for an existing author with an id of 116. This is to verify
that the new id is not in use by another author within the authors table, as this column
has been specified as requiring a unique identifier. Finally, an UPDATE statement is
executed, again using integer constants in both the SET and
WHERE clauses.
A floating-point constant is similar to an integer constant, but it is used to represent decimal values as well as
whole integers. These are required whenever such a floating-point value must be represented literally within a SQL
statement.
A floating-point constant can be represented in several forms, as shown in Table 3-4.
Each occurrence of ## represents one or more digits.
Table 3-4. Floating-point representations
Representation
Example
##.##
6.4
##e[+-]##]
8e-8
[##].##[e[+-]##]
.04e8
##.[##][e[+-]##]
4.e5
In the first form, there must be at least one digit before or after the decimal point for PostgreSQL to recognize
the value as a floating-point constant versus an integer constant. The other options involve having at least one digit
before or after an exponent clause, denoted by the e in the list. The presence
of either the decimal point, the exponent clause, or both, distinguishes an integer constant from a floating-point.
Each of these valid formats is represented in Example 3-8 through a simple SQL
SELECT statement illustrating a variety of floating-point conventions.
Example 3-8. Valid floating-point values
booktown=# SELECT .04 AS small_float,
booktown-# -16.63 AS negative_float,
booktown-# 4e3 AS exponential_float,
booktown-# 6.1e-2 AS negative_exponent;
small_float | negative_float | exponential_float | negative_exponent
-------------+----------------+-------------------+-------------------
0.04 | -16.63 | 4000 | 0.061
(1 row)
Boolean constants are much simpler than any other constant values recognized by PostgreSQL, as they may consist
only of two possible values: true and false. When PostgreSQL encounters either of these terms outside of single quotes,
they are implicitly interpreted as Boolean constants, rather than a string constant. Example 3-9 shows this important distinction.
Example 3-9. The difference between true and 'true'
testdb=# SELECT true AS boolean_t,
testdb-# 'true' AS string_t,
testdb-# false AS boolean_f,
testdb-# 'false' AS string_f;
bool_t | string_t | bool_f | string_f
--------+----------+--------+----------
t | true | f | false
(1 row)
When the terms true and false are parsed by PostgreSQL outside of single
quotes, they are implied Boolean values. As shown in Example 3-9, PostgreSQL
displays values which are literally of the type boolean as t
or f, though be careful not to try to use only t
or f as Boolean constant values, as this will not be interpreted correctly by PostgreSQL,
and will cause an error.
Special character symbols are characters with a pre-defined syntactic meaning in PostgreSQL. They are typically
disallowed from being used in identifier names for this reason, though as mentioned in the section on quoted identifiers, this restriction can usually be worked around with quotes if need
be.
Some special character symbols help to make up the "punctuation" of a SQL statement, much like parentheses, periods
and commas do in the English language. Table 3-5 shows some common PostgreSQL-recognized
syntactic symbols.
Table 3-5. Punctuation Symbols
Character
Definition
* (asterisk)
Used with the SELECT command to query all columns in the table, and with the count() aggregate function to count all rows in a table.
() (parentheses)
Used to group expressions, enforce operator precedence, and to make function calls. The use of parentheses is highly subjective to the context in which they are used.
[] (brackets)
Used in the selection of specific elements in an array, or in the declaration of an array type (e.g., with the CREATE TABLE command).
; (semicolon)
Used to terminate a SQL command. The only place it can be used within a statement is within a string constant or quoted identifier.
, (comma)
Some commands use the comma to separate elements within a list.
. (period)
Used in floating-point constants (e.g., 3.1415), as well as to reference column names as children of tables (e.g., table_name.column_name).
: (colon)
Used to select slices from arrays.
$ (dollar sign)
Used in the body of a function definition to represent a positional parameter, or argument.
An operator is another type of special character symbol; it is used to perform operations
on identifiers or constants, returning resultant values. Operators can be used for mathematical operations, such as
addition, as well as to perform comparison and logical operations.
Consider again the books table, and its numeric
author_id field. Recall that the author_id column is
an integer used to identify an author. Now imagine that, due to a system modification, all author identifiers must be incremented by 1,500. This can be
achieved by evaluating the result of an operation (an operator expression) in an
UPDATE statement upon the author_id column. This
requires use of the addition (+) operator. An example of this can be seen in Example 3-10.
Example 3-10. Operators in statements
booktown=# SELECT * FROM books;
id | title | author_id | subject_id
------+-----------------------------+-----------+------------
7808 | The Shining | 4156 | 9
156 | The Tell-Tale Heart | 15 | 9
4513 | Dune | 1866 | 15
4267 | 2001: A Space Odyssey | 2001 | 15
1608 | The Cat in the Hat | 1809 | 2
1590 | Bartholomew and the Oobleck | 1809 | 2
(6 rows)
booktown=# UPDATE books SET author_id = author_id + 1500;
UPDATE 6
booktown=# SELECT * FROM books;
id | title | author_id | subject_id
------+-----------------------------+-----------+------------
7808 | The Shining | 5656 | 9
156 | The Tell-Tale Heart | 1515 | 9
4513 | Dune | 3366 | 15
4267 | 2001: A Space Odyssey | 3501 | 15
1608 | The Cat in the Hat | 3309 | 2
1590 | Bartholomew and the Oobleck | 3309 | 2
(6 rows)
As you can see in Example 3-10, each author_id record is
modified with the results of the + operator's operation upon the previous
author_id value.
Common operators that you are may already familiar with include the basic mathematical operators: the
+ sign for the addition of two numeric values, the
- sign for the subtraction of one numeric value from another,
etc. Some of the more esoteric operators include the bitwise & and
| operators, which modify binary values at the bit level.
In addition to these character symbol operators, it's important to remember the SQL keywords, which
are frequently called operators as well. Most notably, this includes the logical operators
AND, OR, and NOT.
While technically keywords, these terms are grouped with the operators because of their operational effect upon constants
and identifiers.
Table 3-6 lists some fundamental PostgreSQL operators.
Table 3-6. Fundamental PostgreSQL operators
Category
Operator
Definition
Mathematical operators
+ (addition)
Adds two numeric types
- (subtraction)
Subtracts one numeric type from another
/ (division)
Divides one numeric type by another
* (multiplication)
Multiplies one numeric type by another
! (factorial)
Returns an integer's factorial
@ (absolute value)
Returns the absolute value of a numeric value
Comparison operators
= (equivalence)
Compares two values for equivalence
< (less than)
Evaluates whether or not one number is less than another
> (greater than)
Evaluates whether or not one number is larger than another
~ (regular expression)
Performs a regular expression comparison on text
values
Logical operators
AND
Returns true if both Boolean conditions are true
OR
Returns true if at least one of two Boolean conditions is true
NOT
Returns the opposite of a Boolean condition
While many operators have various connotations depending on their context, the
= operator is an especially important one due to its meaning when used with an
UPDATE statement's SET clause.
While in most expressions the = operator is an equivalence operator (used to
compare two values for equivalence), when following the SET clause and an identifier
name in an UPDATE statement, the = is read as an
assignment operator. This means that it is used to assign a new value to an existing identifier, as
the SET term implies.
Comments are blocks of text that, through special character sequences, can embed non-SQL text within SQL
code. These can be used within blocks of code, because PostgreSQL removes the commented areas from the input stream and
treats it as whitespace. There are two styles of comments available: single-line comments, and multiline comments.
Single-line comments are preceded by two dashes (- -) and may either be on a line by
themselves, or they may follow valid SQL tokens. (The comments themselves are not considered tokens to PostgreSQL's parser, as
any character data following the - - sequence, up to the end of the line, is treated as
whitespace.) This is demonstrated in Example 3-11.
Example 3-11. Single-line comments
testdb=# SELECT 'Test' -- This can follow valid SQL tokens,
testdb-# -- or be on a line of it own.
testdb-# AS example;
example
---------
Test
(1 row)
Multiline comments begin with a sequential slash-asterisk
(/*) sequence, and terminate
with a sequential asterisk-slash (*/)
sequence. This style of commenting may already be familiar to C
programmers, but there is one key difference between PostgreSQL's
interpreter and the C language interpreter: PostgreSQL comments may be
nested. Therefore, when you create a multiline
comment within another multiline comment, the */
used to close the inner comment does not also close the outer comment.
Example 3-12 provides a comment explanation.
Example 3-12. Multiline comments
testdb=# SELECT 'Multi' /* This comment extends across
testdb*# * numerous lines, and can be
testdb*# * /* nested safely */ */
testdb-# || '-test' AS example;
example
------------
Multi-test
(1 row)
Nesting comments can be useful if you have a file containing SQL syntax of which you wish to comment a
large portion before sending to PostgreSQL for interpreting and execution. If you have already used multiline comments
within that document and you wish to comment a large section which includes those comments, PostgreSQL is intelligent
enough to recognize that a closing comment sequence (*/) closes only the most recently
opened comment, not the entire commented region.
Note: The asterisk character by itself (without an adjacent slash character) has no special meaning within a comment.
The extra asterisks in Example 3-12 on multiline comments are provided only for aesthetic purposes
and readability.
In summary, a SQL statement is comprised of tokens, where each token can represent either a keyword, identifier, quoted identifier, constant, or special character symbol. Table 3-7 uses a simple
SELECT statement to illustrate a basic, but complete, SQL statement and its
components.
Table 3-7. A simple SQL query
SELECT
id, name
FROM
states
Token Type
Keyword
Identifiers
Keyword
Identifier
Description
Command
Id and name columns
Clause
Table name
As shown in the table, the SELECT statement contains the keywords
SELECT and FROM. Together, the
FROM keyword and states token compose
a clause, as they modify and further describe the SELECT command.
The id, name, and
states tokens are the identifiers of the statement. The
id and name identifiers specify the selected columns, while the states identifier specifies the table name to select from. Therefore,
with the preceding SQL query, you are instructing PostgreSQL to display the columns named
id and name for each row from the
states table. Example 3-13 shows the output
this query generates within the booktown database.
Example 3-13. Example SQL query
booktown=# SELECT id, name FROM states;
id | name
----+------------
42 | Washington
51 | Oregon
(2 rows)
booktown=#
Getting more complicated, Table 3-8 and Table 3-9
break down another example statement. This statement uses the UPDATE command, along with
SET and WHERE clauses, which respectively specify with
what to update the records, and how to find the records to update.
Table 3-8. UPDATE example: the SET clause
UPDATE
states
SET
id
=
51
keyword
identifier
keyword
identifier
operator
integer constant
command
table name
clause
column
assignment
new id value
Table 3-9. UPDATE example: the WHERE clause
WHERE
name
=
'Oregon'
keyword
identifier
operator
string constant
clause
column name
equivalence
string value to match
When executed, this statement examines each record's name column to find matches for
the WHERE clause's stated condition (equivalence to the string constant 'Oregon'). Then,
for each row which matches that condition, it updates the id column with the value
51.
Breaking it down, this UPDATE statement has three keywords, three identifiers, two
operators, and two constants. The keywords are UPDATE (the SQL command),
SET (specifies the updates to make), and WHERE (identifies
the rows to update). The identifiers are the states table name, the id column
name, and the name column name.
The operators are both represented by the = operator. When used with the
SET clause, this operator is used for assignment (to assign a new value to an existing
record's identified column); this is a special use which is unique to the SET clause. In
contrast, when used with the WHERE clause, the = operator
is used to check equivalence between values. In this case, this means that the equivalence operator will check the value of
a record's name column against a string constant with the value of
Oregon.
Finally, the constants in this statement are the integer constant 51 (the new value for the
id column), and the string constant Oregon (compared to the
name column through the WHERE clause).
Example 3-14 therefore updates the states table by setting the
id column to 51 whenever the name
column matches the value Oregon. It then checks the results of that
UPDATE statement with another SELECT statement.
Example 3-14. A SQL update
booktown=# UPDATE states
booktown-# SET id = 51
booktown-# WHERE name = 'Oregon';
UPDATE 1
booktown=# SELECT * FROM states
booktown-# WHERE name = 'Oregon';
id | name | abbreviation
----+--------+--------------
51 | Oregon | OR
(1 row)