17.1. Overview of Partitioning in MySQL
This section provides a conceptual overview of partitioning in
MySQL 5.1.
For information on partitioning restrictions and feature
limitations, see Section 17.4, “Restrictions and Limitations on Partitioning”.
The SQL standard does not provide much in the way of guidance
regarding the physical aspects of data storage. The SQL language
itself is intended to work independently of any data structures or
media underlying the schemas, tables, rows, or columns with which
it works. Nonetheless, most advanced database management systems
have evolved some means of determining the physical location to be
used for storing specific pieces of data in terms of the
filesystem, hardware or even both. In MySQL, the
InnoDB
storage engine has long supported the
notion of a tablespace, and the MySQL Server, even prior to the
introduction of partitioning, could be configured to employ
different physical directories for storing different databases
(see Section 7.6.1, “Using Symbolic Links”, for an explanation of how
this is done).
Partitioning takes this notion a step
further, by allowing you to distribute portions of individual
tables across a filesystem according to rules which you can set
largely as needed. In effect, different portions of a table are
stored as separate tables in different locations. The
user-selected rule by which the division of data is accomplished
is known as a partitioning function, which
in MySQL can be the modulus, simple matching against a set of
ranges or value lists, an internal hashing function, or a linear
hashing function. The function is selected according to the
partitioning type specified by the user, and takes as its
parameter the value of a user-supplied expression. This expression
can be either an integer column value, or a function acting on one
or more column values and returning an integer. The value of this
expression is passed to the partitioning function, which returns
an integer value representing the number of the partition in which
that particular record should be stored. This function must be
non-constant and non-random. It may not contain any queries, but
may use virtually any SQL expression that is valid in MySQL, so
long as that expression returns a positive integer less than
MAXVALUE
(the greatest possible positive
integer). Examples of partitioning functions can be found in the
discussions of partitioning types later in this chapter (see
Section 17.2, “Partition Types”), as well as in the
partitioning syntax descriptions given in
Section 13.1.5, “CREATE TABLE
Syntax”.
This is known as horizontal partitioning
— that is, different rows of a table may be assigned to
different physical partitions. MySQL 5.1 does not
support vertical partitioning, in which
different columns of a table are assigned to different physical
partitions. There are not at this time any plans to introduce
vertical partitioning into MySQL 5.1.
Partitioning support is included in the -max
releases of MySQL 5.1 (that is, the 5.1
-max
binaries will be built with
--with-partition
). If the MySQL binary is built
with partitioning support, nothing further needs to be done in
order to enable it (for example, no special entries are required
in your my.cnf
file). You can determine
whether your MySQL server supports partitioning by means of a
SHOW VARIABLES
command such as this one:
mysql> SHOW VARIABLES LIKE '%partition%';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| have_partitioning | YES |
+-------------------+-------+
1 row in set (0.00 sec)
If you do not see the have_partitioning
variable with the value YES
listed as shown
above in the output of an appropriate SHOW
VARIABLES
, then your version of MySQL does not support
partitioning.
Prior to MySQL 5.1.6, this variable was named
have_partition_engine
. (Bug #16718)
For creating partitioned tables, you can use any storage engine
that is supported by your MySQL server; the MySQL partitioning
engine runs in a separate layer and can interact with any of
these. In MySQL 5.1, all partitions of the same
partitioned table must use the same storage engine; for
example, you cannot use MyISAM
for one
partition and InnoDB
for another. However,
there is nothing preventing you from using different storage
engines for different partitioned tables on the same MySQL server
or even in the same database.
Note: Prior to MySQL 5.1.6, it
was not feasible to create a partitioned table using the
BLACKHOLE
storage engine. (Bug #14524).
To employ a particular storage engine for a partitioned table, it
is necessary only to use the [STORAGE] ENGINE
option just as you would for a non-partitioned table. However, you
should keep in mind that [STORAGE] ENGINE
(and
other table options) need to be listed before
any partitioning options are used in a CREATE
TABLE
statement. This example shows how to create a
table that is partitioned by hash into 6 partitions and which uses
the InnoDB
storage engine:
CREATE TABLE ti (id INT, amount DECIMAL(7,2), tr_date DATE)
ENGINE=INNODB
PARTITION BY HASH( MONTH(tr_date) )
PARTITIONS 6;
(Note that each PARTITION
clause can include a
[STORAGE] ENGINE
option, but in MySQL
5.1 this has no effect.)
It is possible to create partitioned temporary tables; however,
the lifetime of such tables is only as long as the current MySQL
session. This is the same as for non-partitioned temporary tables.
Note: Partitioning applies to all
data and indexes of a table; you cannot partition only the data
and not the indexes, or vice versa,
nor can you partition only a portion of the table.
Data and indexes for each partition can be assigned to a specific
directory using the DATA DIRECTORY
and
INDEX DIRECTORY
options for the PARTITION
clause of the CREATE TABLE
statement used to
create the partitioned table. In addition,
MAX_ROWS
and MIN_ROWS
can be
used to determine the maximum and minimum numbers of rows,
respectively, that can be stored in each partition. See
Section 17.3, “Partition Management”, for more information on
these options. Note: This
particular feature is currently nonfunctional due to Bug #13250;
we should have this fixed by the time the first 5.1 binaries are
made available.
Some of the advantages of partitioning include:
Being able to store more data in one table than can be held on
a single disk or filesystem partition.
Data that loses its usefulness can often be easily be removed
from the table by dropping the partition containing only that
data. Conversely, the process of adding new data can in some
cases be greatly facilitated by adding a new partition
specifically for that data.
Some queries can be greatly optimized in virtue of the fact
that data satisfying a given WHERE
clause
can be stored only on one or more partitions, thereby
excluding any remaining partitions from the search. Because
partitions can be altered after a partitioned table has been
created, you can reorganize your data to enhance frequent
queries that may not have been so when the partitioning scheme
was first set up. This capability, sometimes referred to as
partition pruning, was implemented in
MySQL 5.1.6.
Other benefits usually associated with partitioning include those
in the following list. These features are not currently
implemented in MySQL Partitioning, but are high on our list of
priorities; we hope to include them in the 5.1 production release.
Queries involving aggregate functions such as
SUM()
and COUNT()
can
easily be parallelized. A simple example of such a query might
be SELECT salesperson_id, COUNT(orders) as
order_total FROM sales GROUP BY salesperson_id;
. By
“parallelized,” we mean that the query can be run
simultaneously on each partition, and the final result
obtained merely by summing the results obtained for all
partitions.
Achieving greater query throughput in virtue of spreading data
seeks over multiple disks.
Be sure to check this page and chapter frequently for updates as
Partitioning development for MySQL 5.1 continues.