If you are already familiar with SQL or other RDBMS packages, you probably already have a solid understanding of many
of the relational database concepts put forth in this chapter. However, each RDBMS handles tables differently at the
system level. This section takes a closer look at tables as they are implemented in PostgreSQL.
PostgreSQL defines a series of system columns in all tables, which are normally invisible to the
user (e.g., they will not be shown by queries unless explicitly requested). These columns contain
meta-data about the content of the table's rows. Many of these contain data that can help to
differentiate between tuples (an individual state of a row) when working with transaction blocks.
(See Chapter 7 for more about transactions.)
As a result of these system-defined columns, in addition to the user-defined columns of a table, any inserted row
will have values in each of the columns described in Table 3-25.
Table 3-25. System columns
Column | Description |
---|
oid (object identifier) | The unique object identifier of a row. PostgreSQL automatically adds this 4-byte number to
all rows. It is never re-used within the same table. |
tableoid (table object identifier) | The oid of the table that contains a row. The name and oid of a table are
related by the pg_class system table. |
xmin (transaction minimum) | The transaction identifier of the inserting transaction of a tuple. |
cmin (command minimum) | The command identifier, starting at 0, associated with the inserting transaction of a tuple. |
xmax (transaction maximum) | The transaction identifier of a tuple's deleting transaction. If a tuple is visible (has not been deleted) this is set to zero. |
cmax (command maximum) | The command identifier associated with the deleting transaction of a tuple. Like xmax, if a tuple is visible, this is set to zero. |
ctid (tuple identifier) | The identifier which describes the physical location of the tuple within the database. A pair of numbers are represented by the ctid: the block number, and tuple index within that block. |
As described in the Section called Understanding Tables," each database consists of tables, and each table consists of at
least one named column. These tables may contain rows, but do not necessarily at any given time.
One table management concern can be how to distinguish between
two rows whose column values are identical. A very useful PostgreSQL
feature is that every row has its own object identifier
number, or OID, which is
unique within that table. In other words, no two rows within the same
table will ever have the same OID. This means that even if a table
were designed in such a way that two rows might be identical, there is
still a programmatic way to discern between them: via the OID. This is
demonstrated in Example 3-31.
Example 3-31. Differentiating rows via the OID
testdb=# SELECT * FROM my_list;
todos
----------------------------------
Correct redundancies in my list.
Correct redundancies in my list.
(2 rows)
testdb=# SELECT *,oid FROM my_list;
todos | oid
----------------------------------+---------
Correct redundancies in my list. | 3391263
Correct redundancies in my list. | 3391264
(2 rows)
testdb=# DELETE FROM my_list
testdb-# WHERE oid = 3391264;
DELETE 1
testdb=# SELECT *,oid FROM my_list;
todos | oid
----------------------------------+---------
Correct redundancies in my list. | 3391263
(1 row)
Before you start creating any tables, we suggest that you take some extra time to plan out your intended database
objects by deciding the names, types, and purposes of all columns within each table. This can help you to be
consistent with table naming structures, which in turn helps you more easily read and construct "legible" queries and
statements.
In addition to taking the somewhat semantic considerations just
described (names, types, and purposes), it is important to be sure
that each table's relationship to each other table is clearly defined.
This can be an important point of table design, as you do not wish to
redundantly represent large amounts of data, nor do you want to end up
omitting important data from one table by misunderstanding the needs
that must be satisfied by your implementation.
As an example, consider again the Book Town books table, from Table 3-1. This table holds an internal Book Town identification number for each book, the title,
author identification number, and a subject identification number. Notice that rather than storing the name of the author,
and rather than storing a text representation of the subject of the book, simple identification integers are stored.
These identification numbers are used to create relationships to two other tables: the
authors, and subjects tables, whose partial
contents are shown in Table 3-26 and Table 3-27.
Table 3-26. The authors table
id | last_name | first_name |
---|
1809 | Geisel | Theodor Seuss |
1111 | Denham | Ariel |
15990 | Bourgeois | Paulette |
2031 | Brown | Margaret Wise |
25041 | Margery Williams | Bianco |
16 | Alcott | Louisa May |
115 | Poe | Edgar Allen |
Table 3-27. The subjects table
id | subject | location |
---|
0 | Arts | Creativity St |
2 | Children's Books | Kids Ct |
3 | Classics | Academic Rd |
4 | Computers | Productivity Ave |
6 | Drama | Main St |
9 | Horror | Black Raven Dr |
15 | Science Fiction | Main St |
By keeping the author and subject-specific data separate from the books table, the
data is stored more efficiently. When multiple books need to be correlated with a particular subject, only the
subject_id needs to be stored, rather than all of the data associated with that subject. This
also makes for simpler maintenance of data associated with book subjects, such as the location in the store. Such data can
be updated in a single, small table, rather than having to update all affected book records with such a
modification. The same general principle applies to the authors table, and its relationship
to the books table via the author_id.
Thoughtful planning can also help to avoid mistakes in choosing appropriate data types. For example, in the
editions table, ISBN numbers are associated with Book Town book identification numbers.
At first glance, it might seem that the ISBN number could be represented with a column of type
integer. The design oversight in this case would be that not only can ISBNs sometimes
contain character data, but a value of type integer would lose any leading zeroes in the
ISBN (e.g., 0451160916 would become 451160916).
For all of these reasons, good table design is not an issue to be overlooked in database administration.